
Surrogate keys: keep ‘em meaningless

Note: this article was originally published to the Wharton Computing Developer Center.

At two recent code reviews, we’ve discussed the use of SQL Server auto-incrementing fields as primary keys. Using system-generated numbers to uniquely identify rows (i.e., implementing surrogate keys) is a good practice* because it reduces the likelihood of errors introduced by bad data. However, these surrogate keys are meaningless and exist only for the purpose of identification.

The problem is that many applications do not treat surrogate keys as meaningless numbers, a practice that reduces portability. Consider a “master table” that lists an application’s user roles. The most basic structure for such a table would be two columns, role_id and role_name:

role_id  role_name
1        student
2        faculty
3        TA
4        admin
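
For illustration, here’s a sketch of how such a table might be declared in SQL Server, with an IDENTITY column generating the surrogate key (the column types are assumptions):

DECLARE @dummy int -- no-op; see table definition below

CREATE TABLE tblRole (
    role_id   int IDENTITY(1,1) NOT NULL PRIMARY KEY,  -- system-generated surrogate key
    role_name varchar(50) NOT NULL
)

-- role_id values are assigned by SQL Server, not by the application
INSERT INTO tblRole (role_name) VALUES ('student')
INSERT INTO tblRole (role_name) VALUES ('faculty')
INSERT INTO tblRole (role_name) VALUES ('TA')
INSERT INTO tblRole (role_name) VALUES ('admin')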

Many applications would do the following to get a list of faculty users:

SELECT user_name FROM tblUser WHERE role_id = 2

However, there’s no guarantee that the faculty role will be assigned to the id of 2 in perpetuity. 2 doesn’t mean faculty, it just means 2; if you reload your data and port the application to another environment, the faculty role id might be 4 or 10 or 33,288.

It’s safer to look up the role_id where role_name = ‘faculty’ and use that value instead of assuming 2. This method, of course, would create problems if the application users decide to change the ‘faculty’ role to ‘instructor’. A solution is to create a more descriptive master table that supports front-end changes while maintaining meaningless surrogate keys:

role_id  role_code  role_name  role_sort_order  role_create_date  role_update_date
1        s          student    3                10/31/2007
2        f          faculty    1                10/31/2007
3        t          TA         2                10/31/2007        11/03/2008
4        a          admin      4                10/31/2007

This table structure allows you to get the faculty surrogate key by querying for role_code = ‘f’. The role_name field is what users see and can be modified without breaking the application. Finally, role_sort_order controls the front-end display order of these items and allows you to change the sort order without touching any code.
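
In code, that lookup might look something like this (a sketch; tblUser and user_name are illustrative stand-ins for the application’s user table):

DECLARE @facultyRoleId int

-- resolve the surrogate key from the stable, meaningless code...
SELECT @facultyRoleId = role_id
FROM tblRole
WHERE role_code = 'f'

-- ...rather than hard-coding the number 2
SELECT user_name
FROM tblUser
WHERE role_id = @facultyRoleId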

*The surrogate key versus natural key debate still rages among database geeks, but it seems that surrogate keys are a standard practice at this shop.


Dynamic SQL: exec vs. sp_executesql

Note: this article was originally published to the Wharton Computing Development Center.

Dynamic SQL came up again at a recent code review. If only for your own maintenance sanity, it’s worth eliminating dynamic SQL where possible, but there are times you can’t avoid it (for example, some implementations of the new SQL 2005 PIVOT function). If you must do the dynamic SQL thing, you should know that there are two ways to execute it:

  • EXEC(@stringofsql)
  • sp_executesql @stringofsql, @someparameters

The first option is old school, but it still works. sp_executesql, however, is newer and allows you to use parameters in conjunction with the dynamically built SQL string. Having the option to use parameters is a great improvement: if the parameters are the only part of the SQL command that changes, the optimizer can reuse the execution plan instead of generating a new one each time the code is run. See SQL Server Books Online for the exact sp_executesql syntax.
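
To make the difference concrete, here’s a sketch of both approaches (tblUser and its columns are just illustrative names):

DECLARE @sql nvarchar(500)
DECLARE @roleId int
SET @roleId = 2

-- Old school: the value is concatenated into the string, so the
-- optimizer sees a different statement for every distinct value.
SET @sql = N'SELECT user_name FROM tblUser WHERE role_id = '
    + CAST(@roleId AS nvarchar(10))
EXEC(@sql)

-- sp_executesql: the value stays a parameter, so the execution
-- plan can be reused across calls.
SET @sql = N'SELECT user_name FROM tblUser WHERE role_id = @rid'
EXEC sp_executesql @sql, N'@rid int', @rid = @roleId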

If you want to know more about dynamic SQL, take a look at this article.


SQL error-handling snafu

Note: this article was originally published to the Wharton Computing Developer’s Center.

Recently I encountered a SQL Server error-handling trap that caused a transaction not to roll back as expected. At the risk of sounding like an amateur, here’s what happened. The code was issuing two insert statements that were grouped into a single transaction:

DECLARE @error int, @rowcount int

BEGIN TRANSACTION

INSERT INTO dbo.Table1 (SOME COLUMNS)
VALUES (SOME VALUES)

SELECT @error = @@error, @rowcount = @@rowcount

IF @error <> 0
BEGIN
    ROLLBACK TRANSACTION
    RAISERROR('error on the first insert!', 16, 1)
    RETURN
END

[DO STUFF]

INSERT INTO dbo.Table2 (SOME COLUMNS)
VALUES (SOME VALUES)

SELECT @error = @@error, @rowcount = @@rowcount

IF @error <> 0
BEGIN
    ROLLBACK TRANSACTION
    RAISERROR('error on the second insert!', 16, 1)
    RETURN
END

COMMIT TRANSACTION


The above error handling works most of the time. However, if there is a run-time T-SQL error (for example, a missing object or a divide-by-zero error), the offending statement fails but the open transaction is not automatically rolled back. In my case, the second insert failed because Table2 had been renamed; the error aborted the batch, so the subsequent @error check was never invoked and the first insert wasn’t rolled back.

This behavior can be overridden by setting XACT_ABORT to ON, which specifies that the current transaction will automatically roll back in the event of a T-SQL run-time error.
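
Here’s the same transaction sketched with XACT_ABORT on; the explicit @error checks are no longer needed just to force the rollback (placeholders as in the snippet above):

-- With XACT_ABORT on, a run-time error in either insert aborts the
-- batch and rolls back the entire transaction automatically.
SET XACT_ABORT ON

BEGIN TRANSACTION

INSERT INTO dbo.Table1 (SOME COLUMNS)
VALUES (SOME VALUES)

-- if dbo.Table2 is missing or this statement hits a run-time error,
-- the first insert is rolled back as well
INSERT INTO dbo.Table2 (SOME COLUMNS)
VALUES (SOME VALUES)

COMMIT TRANSACTION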

External link: SET XACT_ABORT
