11.4. The Python Database API (DBAPI) 2.0
As I mentioned earlier, the Python standard library does not come with an RDBMS interface, but there are many free third-party modules that let your Python programs access specific databases. Such modules mostly follow the Python Database API 2.0 standard, also known as the DBAPI. A new version of the DBAPI (possibly to be known as 3.0) is likely to appear in the future, but currently there are no firm plans or schedules for one. Programs written against DBAPI 2.0 should work with minimal or no changes with any future DBAPI 3.0, although 3.0, if and when it comes, will no doubt offer further enhancements that future programs will be able to take advantage of.
If your Python program runs only on Windows, you might prefer to access databases by using Microsoft's ADO package through COM. For more information on using Python on Windows, see Python Programming on Win32, by Mark Hammond and Andy Robinson (O'Reilly). Since ADO and COM are platform-specific, and this book focuses on cross-platform use of Python, I do not cover ADO and COM further in this book. However, at http://adodbapi.sourceforge.net/ you will find a useful Python extension that lets you access ADO indirectly through DBAPI.
After importing a DBAPI-compliant module, call the module's connect function with suitable parameters. connect returns an instance of Connection, which represents a connection to the database. The instance supplies commit and rollback methods to deal with transactions, a close method to call as soon as you're done with the database, and a cursor method to return an instance of Cursor. The cursor supplies the methods and attributes used for database operations. A DBAPI-compliant module also supplies exception classes, descriptive attributes, factory functions, and type-description attributes.
11.4.1. Exception Classes
A DBAPI-compliant module supplies exception classes Warning, Error, and several subclasses of Error. Warning indicates such anomalies as data truncation during insertion. Error's subclasses indicate various kinds of errors that your program can encounter when dealing with the database and the DBAPI-compliant module that interfaces to it. Generally, your code uses a statement of the form:
try: ... except module.Error, err: ...
to trap all database-related errors that you need to handle without terminating.
11.4.2. Thread Safety
When a DBAPI-compliant module has an attribute threadsafety greater than 0, the module is asserting some level of thread safety for database interfacing. Rather than relying on this, it's safer and more portable to ensure that a single thread has exclusive access to any given external resource, such as a database, as outlined in "Threaded Program Architecture" on page 350.
11.4.3. Parameter Style
A DBAPI-compliant module has an attribute paramstyle to identify the style of markers used as placeholders for parameters. Insert such markers in SQL statement strings that you pass to methods of Cursor instances, such as method execute, to use runtime-determined parameter values. Say, for example, that you need to fetch the rows of database table ATABLE where field AFIELD equals the current value of Python variable x. Assuming the cursor instance is named c, you could perform this task by using Python's string-formatting operator %, as follows:
c.execute('SELECT * FROM ATABLE WHERE AFIELD=%r' % x)
However, this is not the recommended approach. This approach generates a different statement string for each value of x, requiring such statements to be parsed and prepared anew each time. With parameter substitution, you pass to execute a single statement string, with a placeholder instead of the parameter value. This lets execute perform parsing and preparation just once, giving potentially better performance. For example, if a module's paramstyle attribute is 'qmark', you can express the above query as:
c.execute('SELECT * FROM ATABLE WHERE AFIELD=?', [x])
The read-only string attribute paramstyle tells your program how it should use parameter substitution with that module. The possible values of paramstyle are:
When paramstyle is neither 'pyformat' nor 'named', the second argument of method execute is a sequence. When parameters are named (i.e., paramstyle is 'pyformat' or 'named'), the second argument of method execute is a dictionary.
11.4.4. Factory Functions
Parameters passed to the database via placeholders must typically be of the right type: this means Python numbers (integers or floating-point values), strings (plain or Unicode), and None to represent SQL NULL. There is no type universally used to represent dates, times, and binary large objects (BLOBs). A DBAPI-compliant module supplies factory functions to build such objects. The types used for this purpose by most DBAPI-compliant modules are those supplied by modules datetime and mxDateTime (covered in Chapter 12), and strings or buffer types for BLOBs. The factory functions specified by the DBAPI are as follows.
11.4.5. Type Description Attributes
A Cursor instance's attribute description describes the types and other characteristics of each column of a query. Each column's type (the second item of the tuple describing the column) equals one of the following attributes of the DBAPI-compliant module:
A cursor's description, and in particular each column's type, is mostly useful for introspection about the database your program is working with. Such introspection can help you write general modules and work with tables using different schemas, including schemas that may not be known at the time you are writing your code.
11.4.6. The connect Function
A DBAPI-compliant module's connect function accepts arguments that vary depending on the kind of database and the specific module involved. The DBAPI standard recommends, but does not mandate, that connect accept named arguments. In particular, connect should at least accept optional arguments with the following names:
11.4.7. Connection Objects
A DBAPI-compliant module's connect function returns an object x that is an instance of class Connection. x supplies the following methods.
11.4.8. Cursor Objects
A Connection instance provides a cursor method that returns an object c that is an instance of class Cursor. A SQL cursor represents the set of results of a query and lets you work with the records in that set, in sequence, one at a time. A cursor as modeled by the DBAPI is a richer concept, since it represents the only way in which your program executes SQL queries in the first place. On the other hand, a DBAPI cursor allows you only to advance in the sequence of results (some relational databases, but not all, also provide richer cursors that are able to go backward as well as forward), and does not support the SQL clause WHERE CURRENT OF CURSOR. These limitations of DBAPI cursors enable DBAPI-compliant modules to provide cursors even on RDBMSes that provide no real SQL cursors at all. An instance of class Cursor c supplies many attributes and methods; the most frequently used ones are documented here.
11.4.9. DBAPI-Compliant Modules
Whatever relational database you want to use, there's at least one (often more than one) Python DBAPI-compliant module downloadable from the Internet. All modules listed in the following sections, except mxODBC (and SAPDB, which uses GPL) have liberal licenses that are similar to Python's: you can use them freely in either open source or closed source programs. mxODBC can be used freely for noncommercial purposes, but you must purchase a license for commercial use. There are so many relational databases that I can't list them all, but here are some of the most popular ones:
Gadfly (http://gadfly.sf.net) is not an interface to some other RDBMS but rather a complete RDBMS engine written in Python. Gadfly supports a large subset of standard SQL. For example, Gadfly lacks NULL, but it does support VIEW. Gadfly can run as a daemon server, to which clients connect with TCP/IP. Alternatively, you can run the Gadfly engine directly in your application's process if you don't need other processes to be able to access the same database concurrently.
The gadfly module has several discrepancies from the DBAPI 2.0 (covered in "The Python Database API (DBAPI) 2.0" on page 292) because Gadfly implements a variant of the older DBAPI 1.0. The concepts are quite close, but several details differ. The main differences are:
The gadfly module supplies the following functions.
SQLite (http://www.sqlite.org) is similar to Gadfly in that it's not an interface to some other RDBMS but rather a complete, self-contained RDBMS engine. However, SQLite is written in C and may offer better performance, and accessibility from multiple programming languages, compared to Gadfly. The most popular Python interface to SQLite is PySQLite, available at http://initd.org/tracker/pysqlite. It's quite compatible with DBAPI 2.0, except that it doesn't support types (all data is actually held by SQLite as strings). PySQLite's paramstyle is 'qmark'. An alternative is APSW, available at http://www.rogerbinns.com/apsw.html. APSW does not even try to meet the DBAPI 2.0's specs but rather faithfully implements SQLite's own preferred API.
In Python 2.5, PySQLite is included in the Python Standard Library as package sqlite3.