Introduction
Anyone with more than a little exposure to SQL Server will soon come to the conclusion that the use of cursors is best avoided. SQL Server is a Relational Database Management System, and as such is very proficient at processing sets of data. The use of a cursor effectively repeats an operation once for each row being processed, thereby consuming far more server resources than would be necessitated by a set-based operation. In addition, since SQL cursors are processed much more slowly than set-based operations, any locks generated by the use of said cursor are held longer than would otherwise be necessary. In this way, the use of a cursor can affect not only the performance of a given batch of work, but also the performance of any other batches which need to make use of the same data resources.In order to effectively avoid using a SQL cursor, one must know why a cursor is being used in the first place. The list below details the most common reasons why people use cursors, and some possible solutions for each problem.
Problem: The Developer or Designer is new to SQL Server
Problem: Need to update a table with incrementing column values for records already in the table
Problem: Need to use a custom identity function to perform an insert
Problem: Need to compare a row to adjacent rows in a table
Problem: The Developer or Designer is New to SQL Server
By far, this is the most common reason for cursor usage. If a programmer is already well versed in an event-driven programming language such as Visual Basic, they will possess a great wealth of experience in dealing with data items one at a time. As such, their natural inclination when moving into SQL is to do the same. Unfortunately, what is appropriate in one language may be (and in this case is) entirely inappropriate in another.Resolution
The primary weapon against cursor usage in this scenario is to develop a strong understanding of the basics of Transact-SQL. There are countless courses, books, and websites dedicated to the topic. In addition, you can review the examples below, which present set-based solutions to problems which may otherwise necessitate a cursor.If you’re still struggling with finding a set-based solution, post a question in the MSDN Transact-SQL Forum
Problem: Need to update a table with incrementing column values for records already in the table
This problem is most often encountered when importing data from an external data source. The external data source may not have a key field, for instance, and for one of various reasons, an identity column cannot be used while importing.Resolution
A number of methods may be employed to resolve this issue. The most efficient and straightforward method, available in SQL 2005 and greater, is to use theROW_NUMBER() function.Using the AdventureWorks database:
ROW_NUMBER() can also be used to number rows based upon logical groupings, or partitions in data, although this is less applicable to cursor replacement. For instance, the test data contains logins from multiple domains, and you may want to number the rows in the table based upon their position within a grouping of rows from the same domain.
Also worth noting is that the sort order of the OVER clause does not have to match the sort order of the outermost query.
In the event that the numbering has to start at a number other than zero, that number can be added to ROW_NUMBER() in order to obtain the desired row numbers.
For instance:
The same function can be performed, albeit far less efficiently, in SQL 2000. The following solution uses an inline subquery in order to COUNT the number of rows on or before a given LoginID:
This solution is subject to error if duplicate values are present in the columns being used for comparison. If there were two identical LoginIDs, they would both be granted the same RowKey. When dealing with duplicate fields such as this, a local temporary table can be used in order to ensure unique row numbering:
Problem: Need to use a custom identity function to perform an insert
Many databases employ custom identity generation functions. There are VERY FEW good reasons for using custom identity functions in SQL Server, given the presence of IDENTITY columns and the UniqueIdentifier data type. That is not to say that you will never encounter such a situation - just that all possible alternatives should be vetted prior to deciding upon a custom identity solution.A custom identity function commonly resembles the following:
No comments:
Post a Comment