
Python, Django, & MySQL on Windows 7, Part 5: Installing MySQL

This is the fifth and final post in a dummies guide to getting started with Python, Django, & MySQL on Windows 7.

By now, you should have Django installed into a virtual environment. These tutorials aren't meant to cover building a Django app, just to point out the quirks involved in getting a project up and running on Windows. They also assume you want to build real applications in a real development environment.

To that end, you'll want a heftier database than SQLite. We use MySQL at the office, so these instructions cover installing it and using it with Django.

Install MySQL

  1. Download and install MySQL.
  2. Once MySQL is installed, proceed through the configuration wizard. Check the Include Bin Directory in Windows PATH box.
  3. When prompted, set a password for the MySQL root account.
  4. Once the installation wizard is done, open a command window and log in to MySQL with the root account: mysql -uroot -p (you’ll be prompted for the password).
  5. After logging in, run the following commands to create a database, create a user for your Django project, and grant the user database access.
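A minimal sketch of those commands, assuming a database named djangodb and a user named djangouser (substitute your own names and a real password):

-- Run these from the mysql> prompt after logging in as root.
-- The database name, user name, and password are placeholders.
CREATE DATABASE djangodb CHARACTER SET utf8;
CREATE USER 'djangouser'@'localhost' IDENTIFIED BY 'some_password';
GRANT ALL PRIVILEGES ON djangodb.* TO 'djangouser'@'localhost';
FLUSH PRIVILEGES;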

Install MySQL-python

You’ll need the MySQL-python package, a Python interface to MySQL.

  1. Download the Windows MySQL-python distribution here. The author has some instructions about the appropriate version; assuming a 32-bit version of Python 2.7, you'd download this package (.exe).
  2. After downloading, do not run the Windows installer. Doing so will install MySQL-python into your root Python, which virtual environments created with --no-site-packages won't be able to see.
  3. Instead, install the downloaded package into your virtual environment using easy_install, which can install from Windows binary installers:
    easy_install file://c:/users/you/downloads/mysql-python-1.2.3.win32-py2.7.exe
    (modify to reflect the location and name of the downloaded installer)

Configure Django

Next, you’ll need to update the database-related settings of your Django project.

  1. From the directory of your Django project, open settings.py using your favorite editor.
  2. Update the default key in the DATABASES dictionary.  Set ENGINE to django.db.backends.mysql and set NAME, USER, and PASSWORD to the database name, username, and password you chose when installing MySQL.  See Part I of the Django tutorial for more information about database settings.
  3. Open a command window, activate your virtual environment, and change to the directory of your Django project.
  4. Type python manage.py syncdb. This command creates the underlying tables required for your Django project.
  5. If the syncdb worked, you have Python, Django, and MySQL communicating in harmony.  Congratulations!  You can now proceed through the Django tutorial and create your first application.

Python, Django, & MySQL on Windows 7, Part 4: Installing Django

This is the fourth post in a dummies guide to getting started with Python, Django, & MySQL on Windows 7.

We’re finally ready to install Django, a popular Web-development framework. Detailed instructions for building out a Django site are beyond the scope of this humble tutorial; try The Definitive Guide to Django or Django’s online Getting started docs for that.

These directions will simply make sure you can get up and running.

Installing Django

  1. Open a command window.
  2. Go to (or create) the virtual environment you'll be using for your Django project. For this example, I created a virtualenv called django-tutorial: virtualenv django-tutorial --no-site-packages
  3. Install Django: pip install django
  4. Start an interactive interpreter by typing python (or ipython, if you've made it virtual-environment-aware).
  5. Test the install by importing the django module and checking its version: https://gist.github.com/1177372
  6. Create a new directory to hold your Django projects and code. Change to it.
  7. Think of a name for your first Django project and create it by running the following command: python -m django-admin startproject [projectname] (thanks JukkaN!).
    Important: most Django docs show django-admin.py startproject [projectname] to start a new project, which can cause import errors and other trouble for Windows users. See this stackoverflow thread for details.
  8. You should now see the project's folder in your Django directory.
  9. Change into the new project folder.
  10. Test the new project by typing python manage.py. manage.py is Django's command-line utility; running it with no arguments should print a list of its available subcommands.
  11. A further test is to start up Django's development server: python manage.py runserver. You should see output confirming that the development server is running.

If you’ve made it this far, you’ve successfully installed Django and created your first project.

Next up is Part 5: Installing MySQL.


Two month milestone

[Photo: flowering tree in Childs Park]

Monday marked two completed months with the National Priorities Project. Though these weeks haven’t produced much writing, they’ve been a whirlwind of learning:

  • Python
  • Django
  • MySQL
  • The joy of setting up a proper Windows dev environment using the above three items
  • Piston, a tool for powering APIs through Django
  • Linux
  • Git/Github
  • The Federal Budget process
  • The Consolidated Federal Funds Report, a huge annual file of government expenditures.
  • Various other indicators about the state of our union: gas emissions by state, average teacher salaries, people in poverty, insurance enrollments, etc.
  • Finally, I’m NPP’s interim Twitterer, a fascinating distraction.

One day soon I’ll write a Dummies Guide to Setting up Python/Django/MySQL on Windows post. In the meantime, it’s great to be back in the hands-on tech saddle.


Beware the change scripts created by SQL Server Management Studio

Part of my job is moonlighting as a SQL Server database admin, so a co-worker recently asked me to run a script against a production table. The task was simple: add a few new columns and create some indexes. He had generated the script by using SQL Server 2008 Management Studio’s (SSMS) Generate Change Script function.

Although the general sequence of steps in the generated script made sense, I have some complaints about the SSMS output. Here’s a very simple recreation of the scenario.

Consider a table with two columns. One is a primary key, and one has a default constraint:

CREATE TABLE phillies (
   phils_id INT IDENTITY(1,1)
,  phils_year CHAR(4) NOT NULL
   CONSTRAINT df_phillies_year DEFAULT (2008) 
,  CONSTRAINT pk_phils_id 
      PRIMARY KEY CLUSTERED (phils_id))

Using the SSMS table design view to add two columns to the phillies table and saving the change script results in the following code:

BEGIN TRANSACTION
GO
ALTER TABLE dbo.phillies
   DROP CONSTRAINT df_phillies_year
GO
CREATE TABLE dbo.Tmp_phillies (
   phils_id INT NOT NULL IDENTITY (1, 1),
   phils_year CHAR(4) NOT NULL,
   division_champ_flag bit NOT NULL,
   national_champ_flag bit NOT NULL
)  ON [PRIMARY]
GO
ALTER TABLE dbo.Tmp_phillies 
   ADD CONSTRAINT
   df_phillies_year DEFAULT ((2008)) FOR phils_year
GO
SET IDENTITY_INSERT dbo.Tmp_phillies ON
GO
IF EXISTS(SELECT * FROM dbo.phillies)
   EXEC('INSERT INTO dbo.Tmp_phillies 
      (phils_id, phils_year)
   SELECT phils_id, phils_year 
   FROM dbo.phillies 
   WITH (HOLDLOCK TABLOCKX)')
GO
SET IDENTITY_INSERT dbo.Tmp_phillies OFF
GO
DROP TABLE dbo.phillies
GO
EXECUTE sp_rename 
   N'dbo.Tmp_phillies', N'phillies', 'OBJECT' 
GO
ALTER TABLE dbo.phillies 
   ADD CONSTRAINT
   pk_phils_id PRIMARY KEY CLUSTERED 
   (phils_id) ON [PRIMARY]
GO
COMMIT

This script contains BEGIN TRANSACTION and COMMIT TRANSACTION statements but doesn’t accompany them with any kind of error handling. So if you run it as-is and encounter an error, nothing gets rolled back.

Any error handling that you might add, however, is thwarted by the fact that the script’s statements are contained in individual batches (i.e., separated by GO statements).

Say you individually check each statement for errors and issue a rollback/return if something goes awry:

BEGIN TRANSACTION
GO
[snip]
CREATE TABLE dbo.Tmp_phillies (
   phils_id INT NOT NULL IDENTITY (1, 1),
   phils_year CHAR(4) NOT NULL,
   division_champ_flag bit NOT NULL,
   national_champ_flag bit NOT NULL
)  ON [PRIMARY]
GO
[snip]
-- throw an error
SELECT 23/0
IF @@ERROR <> 0
BEGIN
   PRINT 'error!'
   ROLLBACK TRANSACTION
   RETURN
END
[snip]
GO
DROP TABLE dbo.phillies

In this scenario, changes that occurred prior to the error will be rolled back. However, although the RETURN statement exits the current batch, subsequent batches (for example, the one that deletes your table) will execute.

A TRY/CATCH block is another potential error-handling method, but TRY/CATCH blocks can’t span multiple batches.

So what’s with the batch-happy SSMS? In older versions of SQL Server it might have been necessary to put a CREATE TABLE and its subsequent ALTER TABLE statements into separate batches (I haven’t tested this). But SQL Server 2005 and 2008 provide statement-level recompilation, so the multitude of batches isn’t necessary.

When I tested a “GO-less” version of the script, it worked swimmingly. It’s understandable that SSMS can’t generate the error handling for you, but if it reduced the number of batches to the minimum required, adding that error handling yourself would be much easier.
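A sketch of that approach (abbreviated, and not the SSMS output): drop the batch separators and wrap the whole change in a single transaction inside TRY/CATCH.

BEGIN TRY
   BEGIN TRANSACTION

   ALTER TABLE dbo.phillies
      DROP CONSTRAINT df_phillies_year

   CREATE TABLE dbo.Tmp_phillies (
      phils_id INT NOT NULL IDENTITY (1, 1),
      phils_year CHAR(4) NOT NULL,
      division_champ_flag bit NOT NULL,
      national_champ_flag bit NOT NULL
   )  ON [PRIMARY]

   -- [snip: the same default constraint, IDENTITY_INSERT, copy, drop, rename,
   --  and primary key steps as the generated script above]

   COMMIT TRANSACTION
END TRY
BEGIN CATCH
   -- any run-time error lands here; undo everything done so far
   IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION
   PRINT 'error: ' + ERROR_MESSAGE()
END CATCH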

Conclusion: don’t rely on these SQL Server Management Studio-generated change scripts for any serious work, and definitely don’t use them as a model for how to write robust T-SQL.


Database Design Presentation

I’m extremely happy to work in an organization with an uber-smart and supportive group of developers. Once a month we meet for a lunchtime “tech talk” presentation on a topic chosen via Google Moderator.

This month, Tim Allen and I collaborated and spoke about database design. Working with Tim to research and organize the material was a blast, and even though we put silly pictures of ourselves on the slides, I’m posting them.

We focused most closely on normalization and indexing, with a few of our other best practices thrown in the mix. The concepts should be applicable to any RDBMS, but the details are specific to MS SQL Server, the database used in most of the organization’s applications.

Slides: wcit-techtalk-database-design


Hypnotic B-Tree Video

While preparing a database design presentation with my co-worker Tim, I discovered this hypnotic video demonstrating inserts into a b-tree data structure. For some reason, the music compels me to leave the cubicle and pay a visit to Rami’s falafel truck.


SQL Server operand type clash!

Note:  This article was originally published to the Wharton Computing Developer Center.

Yesterday a fellow developer hit a strange SQL error and determined that the culprit was a T-SQL CASE statement used in his ORDER BY clause.

ORDER BY
CASE
WHEN UPPER(@orderBy)='GROUPMESSAGEID' THEN groupMessageID
WHEN UPPER(@orderBy)='GAMEID' THEN gameID
WHEN UPPER(@orderBy)='GROUPID' THEN groupID
WHEN UPPER(@orderBy)='MESSAGEID' THEN messageID
WHEN UPPER(@orderBy)='ROUNDSENT' THEN roundSent
WHEN UPPER(@orderBy)='SENTON' THEN sentOn
ELSE groupMessageID
END

This code results in the following message:
Operand type clash: uniqueidentifier is incompatible with datetime

After some digging around, we found the underlying cause of the error: the CASE expression’s branches return values of different data types. Returning different data types sometimes behaves as expected, but mixing types that can’t be implicitly converted to one another (like the uniqueidentifier and datetime here) causes problems.

In the statement above, gameID, groupID, roundSent, and groupMessageID are integers, sentOn is a datetime, and messageID is a uniqueidentifier. Because the data type precedence pecking order in this case is datetime, then int, then uniqueidentifier, SQL Server chose datetime as the return type. Uniqueidentifiers cannot be converted to datetimes, hence the error message.
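One way around it (a sketch, not from the original code) is to give each column its own CASE expression in the ORDER BY, so no single expression has to reconcile incompatible types; the non-matching expressions come back NULL, and a trailing column provides the default sort:

ORDER BY
   CASE WHEN UPPER(@orderBy) = 'GAMEID'    THEN gameID    END,
   CASE WHEN UPPER(@orderBy) = 'GROUPID'   THEN groupID   END,
   CASE WHEN UPPER(@orderBy) = 'MESSAGEID' THEN messageID END,
   CASE WHEN UPPER(@orderBy) = 'ROUNDSENT' THEN roundSent END,
   CASE WHEN UPPER(@orderBy) = 'SENTON'    THEN sentOn    END,
   groupMessageID  -- default sort when @orderBy matches none of the above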

It all became clear after reading this article by George Mastros.  Thank you to George.

Comments are closed

Surrogate keys: keep ‘em meaningless

Note:  this article was originally published to the Wharton Computing Developer Center.

At two recent code reviews, we’ve discussed the use of SQL Server auto-incrementing fields as primary keys. Using system-generated numbers to uniquely identify rows (i.e., implementing surrogate keys) is a good practice* because it reduces the likelihood of errors introduced by bad data. However, these surrogate keys are meaningless and exist only for the purpose of identification.

The problem is that many applications do not treat surrogate keys as meaningless numbers, a practice that reduces portability. Consider a “master table” that lists an application’s user roles. The most basic structure for such a table would be two columns, role_id and role_name:

role_id   role_name
1         student
2         faculty
3         TA
4         admin

Many applications would do the following to get a list of faculty users:

SELECT user_name FROM tblUser WHERE role_id = 2

However, there’s no guarantee that the faculty role will be assigned to the id of 2 in perpetuity. 2 doesn’t mean faculty, it just means 2; if you reload your data and port the application to another environment, the faculty role id might be 4 or 10 or 33,288.

It’s safer to look up the role_id where role_name = ‘faculty’ and use that value instead of assuming 2. This method, of course, would create problems if the application users decide to change the ‘faculty’ role to ‘instructor.’ A solution is to create a more descriptive master table that supports front-end changes while keeping the surrogate keys meaningless:

role_id  role_code  role_name  role_sort_order  role_create_date  role_update_date
1        s          student    3                10/31/2007
2        f          faculty    1                10/31/2007
3        t          TA         2                10/31/2007        11/03/2008
4        a          admin      4                10/31/2007

This table structure allows you to get the faculty surrogate key by querying for role_code = ‘f’. The role_name field is what users see and can be modified without breaking the application. Finally, role_sort_order controls the front-end display order of these items and allows you to change the sort order without touching any code.
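For example (a sketch; tblRole and tblUser are the illustrative table names used above):

DECLARE @faculty_role_id INT

-- Look the surrogate key up by its stable, meaningless code...
SELECT @faculty_role_id = role_id
FROM tblRole
WHERE role_code = 'f'

-- ...and use that value rather than hard-coding 2.
SELECT user_name
FROM tblUser
WHERE role_id = @faculty_role_id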

*The surrogate key versus natural key debate still rages among database geeks, but it seems that surrogate keys are a standard practice at this shop.


Dynamic SQL: exec vs. sp_executesql

Note: this article was originally published to the Wharton Computing Development Center

Dynamic SQL came up again at a recent code review. If only for your own maintenance sanity, it’s worth eliminating dynamic SQL where possible, but there are times you can’t avoid it (for example, some implementations of the new SQL 2005 PIVOT function). If you must do the dynamic SQL thing, you should know that there are two ways to execute it:

  • EXEC(@stringofsql)
  • sp_executesql @stringofsql, @someparameters

The first option is old school, but it still works. sp_executesql, however, is newer and lets you use parameters in conjunction with the dynamically built SQL string. Having the option to use parameters is a great improvement: if the parameters are the only part of the SQL command that changes, the optimizer can reuse the execution plan instead of generating a new one each time the code is run. See SQL Server Books Online for the exact sp_executesql syntax.
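A quick sketch of the difference (the table and parameter names here are made up):

DECLARE @name NVARCHAR(50)
DECLARE @sql  NVARCHAR(200)
SET @name = N'Utley'

-- Old school: the value gets baked into the string, so every distinct value
-- produces different statement text (and, typically, a new execution plan).
SET @sql = N'SELECT * FROM dbo.players WHERE last_name = ''' + @name + N''''
EXEC(@sql)

-- sp_executesql: the value stays a parameter, so the plan can be reused.
EXEC sp_executesql
   N'SELECT * FROM dbo.players WHERE last_name = @last_name',
   N'@last_name NVARCHAR(50)',
   @last_name = @name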

If you want to know more about dynamic SQL, take a look at this article.


SQL error-handling snafu

Note:  this article was originally published to the Wharton Computing Developer’s Center

Recently I encountered a SQL Server error-handling trap that caused a transaction not to roll back as expected. At the risk of sounding like an amateur, here’s what happened. The code was issuing two insert statements that were grouped into a single transaction:

BEGIN TRANSACTION
 
INSERT INTO dbo.Table1 (SOME COLUMNS)
VALUES (SOME VALUES)
 
SELECT @error = @@error, @rowcount = @@rowcount
 
IF @error <> 0
BEGIN
   ROLLBACK TRANSACTION
   RAISERROR('error on the first insert!', 16, 1)
   RETURN
END
 
[DO STUFF]
 
INSERT INTO dbo.Table2 (SOME COLUMNS)
VALUES (SOME VALUES)
 
SELECT @error = @@error, @rowcount = @@rowcount
 
IF @error <> 0
BEGIN
   ROLLBACK TRANSACTION
   RAISERROR('error on the second insert!', 16, 1)
   RETURN
END
 
COMMIT TRANSACTION


The above error-handling works most of the time. However, if there is a run-time T-SQL error (for example, a missing object or a divide by 0 error), the offending statement fails but the rest of the transaction executes. In my case, the second insert failed because Table2 had been renamed. The subsequent @error check was never invoked, so the first insert wasn’t rolled back.

This behavior can be overridden by setting XACT_ABORT to ON, which makes the current transaction roll back automatically in the event of a T-SQL run-time error.
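A minimal sketch of the fix, using the same two-insert pattern as above:

SET XACT_ABORT ON

BEGIN TRANSACTION

INSERT INTO dbo.Table1 (SOME COLUMNS)
VALUES (SOME VALUES)

-- if this (or any other statement) hits a run-time error, the whole
-- transaction rolls back automatically and the batch is aborted
INSERT INTO dbo.Table2 (SOME COLUMNS)
VALUES (SOME VALUES)

COMMIT TRANSACTION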

External link: SET XACT_ABORT
