cancel
Showing results for 
Search instead for 
Did you mean: 

Re-Order Columns during Un-load / Re-Load

justin_willey
Participant
1,160

The useful Schema Analysis function in SQL Central gives a useful set of recommendations. One of these identifies sub-optimal column ordering saying:

Ordering columns so that wider columns precede narrow columns can negatively impact response times. Narrow columns should be defined in the table declaration before wider columns, unless they are accessed infrequently. Consider re-ordering table columns so that the wider columns are defined at the end. Wide columns are columns greater than 15 in size, or LONG data types (for example, LONG VARCHAR), or columns defined as XML.

This inevitably happens in a database whose structure evolves over time, but unless I'm missing something there's no easy (or even feasible) way of addressing this in a large schema with hundreds of tables / thousands of columns, foreign keys etc.

Since for other purposes, one really doesn't care about column order, what would seem to be a really useful feature, would be an option on dbunload / UNLOAD to optimize the column order automatically during the unload - presumably the primary key would come first, then columns involved in constraints, and then everything else in ascending size.

While an automatically calculated order might not be entirely optimal, it is likely to be better than a more or less random one!

Better ideas eagerly requested!

Accepted Solutions (1)

Accepted Solutions (1)

Breck_Carter
Participant

IMO this exhortation is an incorrect description of third-order effect on runtime performance. It is not discussing first-order set-level retrieval performance, but the scalar analysis of columns in a single row:

The order of the columns in a table affects performance. Columns in a row are accessed sequentially in the order of their creation. For example, to access columns at the end of a row, the database server traverses the columns that appear earlier in the row. You should order your columns so that narrow and/or frequently accessed columns are placed before seldom accessed and/or wider columns in the table. Wide columns are columns greater than 15 bytes in size, or LONG data types (for example, LONG VARCHAR), or columns defined as XML. Primary key columns are always stored at the beginning of row.

In other words, the engine has to "step over" column values 1 to 10 in each row in order to reach column value 11. If you are really worried about these calculations, you shouldn't worry that 55+1 is going to be faster than 55+100 after you move all the short values to the front... because it isn't... addition performance is not value dependent.

Note that it doesn't have to step-over-byte-by-byte, in case that's what you were thinking 🙂

Instead, you should move all the MOST-FREQUENTLY-REFERENCED columns to the front, so the engine has to "step over" fewer columns (fewer calculations)... but I doubt that you will see any improvement, EVEN IF YOU CAN identify the most-frequently-referenced columns (go ahead, try 🙂

The mention about LONG columns is particularly misleading since most of their data isn't even stored in the row.

Suggestion: Try a benchmark test, betcha won't be able to measure the effect :).

justin_willey
Participant
0 Kudos

Note that it doesn't have to step-over-byte-by-byte, in case that's what you were thinking 🙂

I think that is what I was thinking it could be :}

Answers (0)