Sunday, March 11, 2012

Choosing clustered index

Dear friends:
There is some confusion here about the choice of which index should be
clustered. The choices are generally:
- the surrogate identity column.
- one or more columns that make up the natural key.
My contention has been that the latter is the obvious choice. This is the
order in which the rows are commonly placed on the screen and on reports.
The identity column really places the rows in no particular order at all,
except somewhat "sequentially" with respect to the order they were entered
(assuming an incrementing integer key).
What is the conventional wisdom about this?
Tom EllisonUsually clustered indexes should:
- Be as narrow as possible.
- Be placed on columns that would benefit the most:
-- Columns that have very high cardinality are good candidates.
-- So are columns that are searched for large ranges at a time using
operators like BETWEEN, LIKE 'x%' and >, <, etc.
-- Columns that are searched the most frequently, since they eliminate
bookmark lookups which can occur with narrow nonclustered indexes.
A lot of times the Primary Key will fit a lot of these requirements, but not
always. I usually would only make an IDENTITY column the clustered index on
supporting (lookup-type) tables, to improve JOIN performance. One or more
columns (with high cardinality) that make up the natural key on your main
tables would be good clustered index candidates.
[url]http://blogs.sqlservercentral.com/blogs/michael_coles/archive/2006/05/08/599.aspx[
/url]
"Tom Ellison" <tellison@.jcdoyle.com> wrote in message
news:O0nC9gVdGHA.5016@.TK2MSFTNGP04.phx.gbl...
> Dear friends:
> There is some confusion here about the choice of which index should be
> clustered. The choices are generally:
> - the surrogate identity column.
> - one or more columns that make up the natural key.
> My contention has been that the latter is the obvious choice. This is the
> order in which the rows are commonly placed on the screen and on reports.
> The identity column really places the rows in no particular order at all,
> except somewhat "sequentially" with respect to the order they were entered
> (assuming an incrementing integer key).
> What is the conventional wisdom about this?
> Tom Ellison
>|||Dear Mike:
Thanks for the opinions. The article was quite helpful.
Now, how do you figure that an IDENTITY column makes a good clustered index?
I ask because:
- in my experience, processing rarely proceeds in identity number order.
- user directed searches don't proceed along these lines
- moving through the data in some sequential order, such as generating data
for the screen or a report will follow a natural key order, not an identity
order
My thought is that the identity column is assigned sequentially over time
(if auto-incremented) but does not generally follow any organization of the
data that is likely to be repeated.
The natural key of which I spoke is unique (thus, highly cardinal) and tends
to be the most commonly used sequence in processing (screens and reports).
It is often the column(s) filtered or JOINed in many of the queries used.
Thanks again,
Tom Ellison
"Mike C#" <xxx@.yyy.com> wrote in message news:pBQ8g.184$Ut2.60@.fe09.lga...
> Usually clustered indexes should:
> - Be as narrow as possible.
> - Be placed on columns that would benefit the most:
> -- Columns that have very high cardinality are good candidates.
> -- So are columns that are searched for large ranges at a time using
> operators like BETWEEN, LIKE 'x%' and >, <, etc.
> -- Columns that are searched the most frequently, since they eliminate
> bookmark lookups which can occur with narrow nonclustered indexes.
> A lot of times the Primary Key will fit a lot of these requirements, but
> not always. I usually would only make an IDENTITY column the clustered
> index on supporting (lookup-type) tables, to improve JOIN performance.
> One or more columns (with high cardinality) that make up the natural key
> on your main tables would be good clustered index candidates.
> http://blogs.sqlservercentral.com/b...99.asp
x
> "Tom Ellison" <tellison@.jcdoyle.com> wrote in message
> news:O0nC9gVdGHA.5016@.TK2MSFTNGP04.phx.gbl...
>|||Tom,
are you using non-clustered indexes? If yes, take in account that
bookmark lookups take more time if bookmarks are wider.|||I generally use IDENTITY columns as clustered indexes only when it's in
supporting (lookup/foreign-key) tables. The only reason then is to improve
JOIN performance by potentially eliminating bookmark lookups you might get
with a narrow nonclustered index. Other than that, put your clustered
indexes where they'll do the most good.
"Tom Ellison" wrote:

> Dear Mike:
> Thanks for the opinions. The article was quite helpful.
> Now, how do you figure that an IDENTITY column makes a good clustered inde
x?
> I ask because:
> - in my experience, processing rarely proceeds in identity number order.
> - user directed searches don't proceed along these lines
> - moving through the data in some sequential order, such as generating dat
a
> for the screen or a report will follow a natural key order, not an identit
y
> order
> My thought is that the identity column is assigned sequentially over time
> (if auto-incremented) but does not generally follow any organization of th
e
> data that is likely to be repeated.
> The natural key of which I spoke is unique (thus, highly cardinal) and ten
ds
> to be the most commonly used sequence in processing (screens and reports).
> It is often the column(s) filtered or JOINed in many of the queries used.
> Thanks again,
> Tom Ellison
>
> "Mike C#" <xxx@.yyy.com> wrote in message news:pBQ8g.184$Ut2.60@.fe09.lga...
>
>

No comments:

Post a Comment