Showing posts with label identity. Show all posts
Showing posts with label identity. Show all posts

Tuesday, March 20, 2012

Chunked Delete?

I can't truncate a table because I'd lose the Identity columns that
are referenced in other tables, but I do need to delete all 2 million
rows in there.
Is there a way to recusrively delete rows in say 1,000-row chunks to
keep the transactions small and DB load lighter?
Thanks.See if this thread helps:
http://groups-beta.google.com/group...
a0488274c6e93
--
HTH,
Vyas, MVP (SQL Server)
SQL Server Articles and Code Samples @. http://vyaskn.tripod.com/
"xenophon" <xenophon@.online.nospam> wrote in message
news:8qhga11qephoebnh1afec88usol8rulg44@.
4ax.com...
I can't truncate a table because I'd lose the Identity columns that
are referenced in other tables, but I do need to delete all 2 million
rows in there.
Is there a way to recusrively delete rows in say 1,000-row chunks to
keep the transactions small and DB load lighter?
Thanks.sqlsql

Sunday, March 11, 2012

Choosing clustered index

Dear friends:
There is some confusion here about the choice of which index should be
clustered. The choices are generally:
- the surrogate identity column.
- one or more columns that make up the natural key.
My contention has been that the latter is the obvious choice. This is the
order in which the rows are commonly placed on the screen and on reports.
The identity column really places the rows in no particular order at all,
except somewhat "sequentially" with respect to the order they were entered
(assuming an incrementing integer key).
What is the conventional wisdom about this?
Tom EllisonUsually clustered indexes should:
- Be as narrow as possible.
- Be placed on columns that would benefit the most:
-- Columns that have very high cardinality are good candidates.
-- So are columns that are searched for large ranges at a time using
operators like BETWEEN, LIKE 'x%' and >, <, etc.
-- Columns that are searched the most frequently, since they eliminate
bookmark lookups which can occur with narrow nonclustered indexes.
A lot of times the Primary Key will fit a lot of these requirements, but not
always. I usually would only make an IDENTITY column the clustered index on
supporting (lookup-type) tables, to improve JOIN performance. One or more
columns (with high cardinality) that make up the natural key on your main
tables would be good clustered index candidates.
[url]http://blogs.sqlservercentral.com/blogs/michael_coles/archive/2006/05/08/599.aspx[
/url]
"Tom Ellison" <tellison@.jcdoyle.com> wrote in message
news:O0nC9gVdGHA.5016@.TK2MSFTNGP04.phx.gbl...
> Dear friends:
> There is some confusion here about the choice of which index should be
> clustered. The choices are generally:
> - the surrogate identity column.
> - one or more columns that make up the natural key.
> My contention has been that the latter is the obvious choice. This is the
> order in which the rows are commonly placed on the screen and on reports.
> The identity column really places the rows in no particular order at all,
> except somewhat "sequentially" with respect to the order they were entered
> (assuming an incrementing integer key).
> What is the conventional wisdom about this?
> Tom Ellison
>|||Dear Mike:
Thanks for the opinions. The article was quite helpful.
Now, how do you figure that an IDENTITY column makes a good clustered index?
I ask because:
- in my experience, processing rarely proceeds in identity number order.
- user directed searches don't proceed along these lines
- moving through the data in some sequential order, such as generating data
for the screen or a report will follow a natural key order, not an identity
order
My thought is that the identity column is assigned sequentially over time
(if auto-incremented) but does not generally follow any organization of the
data that is likely to be repeated.
The natural key of which I spoke is unique (thus, highly cardinal) and tends
to be the most commonly used sequence in processing (screens and reports).
It is often the column(s) filtered or JOINed in many of the queries used.
Thanks again,
Tom Ellison
"Mike C#" <xxx@.yyy.com> wrote in message news:pBQ8g.184$Ut2.60@.fe09.lga...
> Usually clustered indexes should:
> - Be as narrow as possible.
> - Be placed on columns that would benefit the most:
> -- Columns that have very high cardinality are good candidates.
> -- So are columns that are searched for large ranges at a time using
> operators like BETWEEN, LIKE 'x%' and >, <, etc.
> -- Columns that are searched the most frequently, since they eliminate
> bookmark lookups which can occur with narrow nonclustered indexes.
> A lot of times the Primary Key will fit a lot of these requirements, but
> not always. I usually would only make an IDENTITY column the clustered
> index on supporting (lookup-type) tables, to improve JOIN performance.
> One or more columns (with high cardinality) that make up the natural key
> on your main tables would be good clustered index candidates.
> http://blogs.sqlservercentral.com/b...99.asp
x
> "Tom Ellison" <tellison@.jcdoyle.com> wrote in message
> news:O0nC9gVdGHA.5016@.TK2MSFTNGP04.phx.gbl...
>|||Tom,
are you using non-clustered indexes? If yes, take in account that
bookmark lookups take more time if bookmarks are wider.|||I generally use IDENTITY columns as clustered indexes only when it's in
supporting (lookup/foreign-key) tables. The only reason then is to improve
JOIN performance by potentially eliminating bookmark lookups you might get
with a narrow nonclustered index. Other than that, put your clustered
indexes where they'll do the most good.
"Tom Ellison" wrote:

> Dear Mike:
> Thanks for the opinions. The article was quite helpful.
> Now, how do you figure that an IDENTITY column makes a good clustered inde
x?
> I ask because:
> - in my experience, processing rarely proceeds in identity number order.
> - user directed searches don't proceed along these lines
> - moving through the data in some sequential order, such as generating dat
a
> for the screen or a report will follow a natural key order, not an identit
y
> order
> My thought is that the identity column is assigned sequentially over time
> (if auto-incremented) but does not generally follow any organization of th
e
> data that is likely to be repeated.
> The natural key of which I spoke is unique (thus, highly cardinal) and ten
ds
> to be the most commonly used sequence in processing (screens and reports).
> It is often the column(s) filtered or JOINed in many of the queries used.
> Thanks again,
> Tom Ellison
>
> "Mike C#" <xxx@.yyy.com> wrote in message news:pBQ8g.184$Ut2.60@.fe09.lga...
>
>

Thursday, March 8, 2012

Child/ Parent relationship within table

Hi everyone, I have a categories table which has the following main
attributes
CategoryName, CategoryID <- Identity , ParentCategoryID
What I need help doing is constructing a Procedure/ SQL query where I can
show the expanded relationships for each record in the table.
e.g. If I have three records in the table (Following the attributes
described above)
ParentCategory, 1, 0
ChildCategory, 2,1
SubChildCategory, 3,2
I want to be able to dynamically return the following information when I
execute the query (Each row looks at the parent id and concatenates itself
to its parents CategoryName).
ParentCategory
ParentCategory/ChildCategory
ParentCategory/ChildCategory/SubChildCategory
Thanks in advance
MarlkEverything you are doing is wrong. Let's get back to the basics of an
RDBMS. Rows are not records; fields are not columns; tables are not
files; there is no sequential access or ordering in an RDBMS, so
"first", "next" and "last" are totally meaningless.
Stop using IDENTITY and learn what a relational key is.
Get a copy of TREES & HIERARCHIES IN SQL You are trying to write
(uughh!) procedural code to build a traversal.that will create a path.
This is not the best way; google a "nested sets model" instead.|||There are many, many resources for describing ways to efficiently model
hierarchies and trees in SQL.
Google for the following topics: Nested Sets, Nested Intervals, Adjacency
List, Materialized Path
Here is a good starting point for finding information about this topic:
http://troels.arvin.dk/db/rdbms/links/#hierarchical
As a first step, you might consider creating a new table to manage the
relationship between categories - right now, you are modeling both the
category and the relationship between categories in the same table.
Normalizing the design can give you added flexibility depending on your
requirements.
What you have here is basically an Adjacency List model. This model has
excellent characteristics with regard to modifying the layout of the
hierarchy; you simply change the ParentCategoryID of a node to a different
value, and you instantly "move" that node and all referencing nodes to a
different location in the hierarchy. However, it does not work very well for
retrieving the structure, as you are seeing. SQL Server Books Online has a
section titled "Expanding Hierarchies" that describes an iterative process
of querying the hierarchy that involves using a temporary table as a stack,
but SQL is really optimized for set-based operations. You will probably find
something at the link above that better meets your needs.
"Mark" <dont@.spam.me> wrote in message
news:%233O9T%23KuFHA.1560@.TK2MSFTNGP09.phx.gbl...
> Hi everyone, I have a categories table which has the following main
> attributes
> CategoryName, CategoryID <- Identity , ParentCategoryID
> What I need help doing is constructing a Procedure/ SQL query where I can
> show the expanded relationships for each record in the table.
> e.g. If I have three records in the table (Following the attributes
> described above)
> ParentCategory, 1, 0
> ChildCategory, 2,1
> SubChildCategory, 3,2
> I want to be able to dynamically return the following information when I
> execute the query (Each row looks at the parent id and concatenates itself
> to its parents CategoryName).
> ParentCategory
> ParentCategory/ChildCategory
> ParentCategory/ChildCategory/SubChildCategory
> Thanks in advance
> Marlk
>
>
>
>|||Whoa! Cool, thanks for the links
Cheers
Mark
"Jeremy Williams" <jeremydwill@.netscape.net> wrote in message
news:ekyuNZLuFHA.2072@.TK2MSFTNGP14.phx.gbl...
> There are many, many resources for describing ways to efficiently model
> hierarchies and trees in SQL.
> Google for the following topics: Nested Sets, Nested Intervals, Adjacency
> List, Materialized Path
> Here is a good starting point for finding information about this topic:
> http://troels.arvin.dk/db/rdbms/links/#hierarchical
> As a first step, you might consider creating a new table to manage the
> relationship between categories - right now, you are modeling both the
> category and the relationship between categories in the same table.
> Normalizing the design can give you added flexibility depending on your
> requirements.
> What you have here is basically an Adjacency List model. This model has
> excellent characteristics with regard to modifying the layout of the
> hierarchy; you simply change the ParentCategoryID of a node to a different
> value, and you instantly "move" that node and all referencing nodes to a
> different location in the hierarchy. However, it does not work very well
for
> retrieving the structure, as you are seeing. SQL Server Books Online has a
> section titled "Expanding Hierarchies" that describes an iterative process
> of querying the hierarchy that involves using a temporary table as a
stack,
> but SQL is really optimized for set-based operations. You will probably
find
> something at the link above that better meets your needs.
> "Mark" <dont@.spam.me> wrote in message
> news:%233O9T%23KuFHA.1560@.TK2MSFTNGP09.phx.gbl...
can
itself
>

Sunday, February 19, 2012

Checking last user Identity on remote server

Is it possible to get the latest inserted Identity value on a remote server?
(The following gives a NULL result:
select ident_current('[my_linked_server].thedatabase.dbo.thetable')
)@.@.IDENTITY (assuming you mean server wide irrespective of the user\ connections etc).

HTH|||Beg your pardon - @.@.Identity is specific to session - sorry - didn't read every last line of BoL before hitting Submit.|||@.@.IDENTITY (assuming you mean server wide irrespective of the user\ connections etc).
HTH

Are you sure? I think that's what I tried first. Doesn't
it give the latest Id value used in the server where
you are (so to speak), not the remote server?|||Finally, I understood this is the way to do it. Just as feedback.

DECLARE @.return_status int
EXEC @.return_status = MyLinkedServer.the_database.dbo.a_stored_procedure
SELECT 'Return Status' = @.return_stat

The SP "a_stored_procedure" is stored on the linked server. It returns
Ident_Current for the appropriate table, using RETURN, simply.

Tuesday, February 14, 2012

Checking / Opinion ?

I am taking these steps in a test environment;
Publishing database
1. Changed identity columns = Not for Replication (Hilary's script)
2. Scripted all Triggers & changed to "Not for Replication" (Is there a method to do this with a script?)
3. Added Primary Keys where not present as follows;
Alter Table x
add pk_col int identity(1,1) NOT FOR REPLICATION
Alter Table x
add constraint x_tbl_repl
primary key(pk_col)
4. Backed up Publishing Database
5. Restored Publishing Database with a new name = Subscribing Database
6. Setup Transactional continuous replication from Publishing Database to Subscribing Database
The reason I did everything in the publishing database is so that it's all set in the Subscribing database.....Make sense?
I am going to have a production oltp database that replicates all transactions to my reporting database. I also want to think of my reporting database as sort of a fall back position. If anything goes wrong in the oltp publishing database, everything resides in the subscribing db.
It is my understanding that I wanted the Identity columns in the subscribing database to just be written to, not incremented, therefore, the necessity of the Not for Replication.
The same goes for triggers in the subscribing database, I don't want them firing when replication occurs, therefore, the Not for Replication.
I don't have any inserts as follows;
Insert into Table A (Table A may have a newly added pk_col )
Select * from Table B (Table B may have a newly added pk_col )
therefore, I should not have any problems with my stored procedures.
Do I have it all covered? Am I on the right track? I really want to get this implemented, and correctly, to get this monkey off my back.
Any comments will certainly be appreciated!
This is not the standard way of deploying subscribers, but it does work. I just tried it.
The problem is when you go to do a resync. When a no-sync subscription needs to be re-initialized you must drop and recreate your subscription. You have the potential of having data consistency problems with your approach.
Also, it sounds like you might be thinking you are implementing bi-directional transactional replication with your method. You aren't. Its not clear from your post if you are or not. But the "fall back position" could be interpreted as this.
If you fail over to your Subscriber you will have to manually resync your Publisher with your Subscriber when it comes back on line.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
|||I am only doing Transactional Replication in one direction. What I want is
to have a production OLTP database that replicates to a reporting database.
The reason I set everything up in the test production database is so that
these databases mirror one another from the start, and hoping I would not
miss anything.
If the schema needs to change in the production database, I intend to make
the same change in the subscriber. We have change management, where the
scripts are sent to me and I execute them over production. I am hoping that
it will just be a matter of running the scripts to create the objects
(views, sp's) in both databases to keep them in sync object wise.
I don't understand what you mean that I may face problems when doing a
resync, what am I doing wrong, or what is wrong with my approach to setting
up a reporting database, how would you approach setting up a reporting
database that is replicated to from the production database?
Am I out in left field making this project more difficult than it needs to
be?
For fall back I was thinking that since all transactions are being
replicated from production, if something went wrong in the production db, we
could just point the users to the subscribing database.
"Hilary Cotter" <hilaryk@.att.net> wrote in message
news:B8A63257-FE63-4673-AD55-9494C453404B@.microsoft.com...
> This is not the standard way of deploying subscribers, but it does work. I
just tried it.
> The problem is when you go to do a resync. When a no-sync subscription
needs to be re-initialized you must drop and recreate your subscription. You
have the potential of having data consistency problems with your approach.
> Also, it sounds like you might be thinking you are implementing
bi-directional transactional replication with your method. You aren't. Its
not clear from your post if you are or not. But the "fall back position"
could be interpreted as this.
> If you fail over to your Subscriber you will have to manually resync your
Publisher with your Subscriber when it comes back on line.
>
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
>