cancel
Showing results for 
Search instead for 
Did you mean: 

Performance with columns and lots o'nulls

Former Member
2,801

How is performance impacted when you have a table that, lets say could have over 1M rows, and there are 3-5 columns that are consistently NULL.

Is it bad design to have data structures in place that support this? Should we look at normalizing it a little more?

Thanks! -Calvin

P.S. Love this place.

Accepted Solutions (1)

Accepted Solutions (1)

Breck_Carter
Participant

Glad you like this place! It's only going to get better, when more people show up.

I'm not sure what you are asking in your second sentence, in particular I'm not sure what NULL has to do with normalization.

Tackling the first question...

I don't think NULL values cause performance problems per se, at least not that you'd notice. I know that historically, especially with those icky legacy DBMS products like [expletive deleted] and [bodily harm], folks used to worry a lot about fixed-versus-variable length columns, physical ordering of columns in rows, [excrement] like that.

What CAN hurt performance somewhat is inserting rows with all nulls, all zeroes, all empty character strings, and then later updating rows to fill in the values: instant row splits across multiple pages, instant fragmentation of your physical database storage. At least, that was true historically with SQL Anywhere... I am not absolutely sure V11 hasn't eased that problem, I will let Mr. Paulley et al fill in the blanks here.

I haven't run any INSERT/UPDATE/rowsplit tests lately, and when I did it wasn't easy to demonstrate the problem, you had to work at it. And it wasn't directly caused by NULLs per se, but by increasing value sizes over time.

I guess I'm saying, there are lots of reasons to hate NULLs but performance isn't one of them.

VolkerBarth
Contributor
0 Kudos

I guess the question is: Is is good design to put similar things in only one table when some of them can't have values for a few fields whereas others can, or should one put the addional fields in a separate child table, i.e. the common question of OOP-Types/Subtypes and how to map them to a RDM. - Well, there's a whole ORM industry as an answer:)

Former Member

Breck is correct about causing row splits due to updates of columns that were initially NULL. Some of this can be mitigated through the use of new INLINE column specification modifier introduced in Version 10, but the behaviour remains.

Former Member
0 Kudos

Excellent!

The main reason is for asking is we have had some debate on whether or not to place a column on a table, if say, 75% of the time, it would remain NULL.

Breck_Carter
Participant
0 Kudos

Gee, I didn't know you could upvote comments... cool! I like this SQLA thing! 🙂

Breck_Carter
Participant
0 Kudos

@Volker: Are you saying nulls make ORMs harder? Are you trying to make me love nulls? <grad>

VolkerBarth
Contributor
0 Kudos

@Breck: Well, I can't upvote comments (or don't know how to) though I have noticed that this can be done. May it be bound to the amount of score or the part one plays in the current question/answer?

VolkerBarth
Contributor
0 Kudos

Ah, found it out. Obviously I can upvote comments on my own questions/answers. Nice feature, really:)

VolkerBarth
Contributor
0 Kudos

@Breck: I can't say much about ORMs, as I haven't used one so far. Guess I share your point of view of prefering SQL over some abstraction on that, as long as the data model isn't too complex. W.r.t. nulls, I think they are useful in certain circumstances, and for that, I prefer them. But I agree, one shouldn't use them too frequently...if a column mustn't accept nulls, it should be NOT NULL. - Well, this answer might be more serious than your questions:)

Answers (1)

Answers (1)

VolkerBarth
Contributor

In addition to Breck's and Martin's responses, Nulls are quite ignorable w.r.t. to storage size:

From the SA 11 docs:

Column space utilization for NULL values is 1 bit per column and space is allocated in multiples of 8 bits. The NULL bit usage is fixed based on the number of columns in the table that allow NULL values.