A transaction is a set of SQL statements that execute as a unit. Either all the statements execute successfully, or none of them have any effect. This is achieved through the use of commit and rollback capabilities. If all of the statements in the transaction succeed, you commit it to record their effect permanently in the database. If an error occurs during the transaction, you roll it back to cancel it. Any statements executed up to that point within the transaction are undone, leaving the database in the state it was in prior to the point at which the transaction began.
Commit and rollback provide the means for ensuring that halfway-done operations don't make their way into your database and leave it in a partially updated (inconsistent) state. The canonical example of this involves a financial transfer where money from one account is placed into another account. Suppose that Bill writes a check to Bob for $100.00 and Bob cashes the check. Bill's account should be decremented by $100.00 and Bob's account incremented by the same amount:
UPDATE account SET balance = balance - 100 WHERE name = 'Bill';
UPDATE account SET balance = balance + 100 WHERE name = 'Bob';
If a crash occurs between the two statements, the operation is incomplete. Depending on which statement executes first, Bill is $100 short without Bob having been credited, or Bob is given $100 without Bill having been debited. Neither outcome is correct. If transactional capabilities are not available to you, you have to figure out the state of ongoing operations at crash time by examining your logs manually in order to determine how to undo them or complete them. The rollback capabilities of transaction support allow you to handle this situation properly by undoing the effect of the statements that executed before the error occurred. (You may still have to determine which transactions weren't entered and re-issue them, but at least you don't have to worry about half-transactions making your database inconsistent.)
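The all-or-nothing behavior can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for a MySQL server with a transactional engine, so it illustrates the idea rather than MySQL itself. The starting balances, the transfer() helper, and its simulate_crash flag are assumptions made up for the example; the failure is raised artificially between the two UPDATE statements.

```python
import sqlite3


def transfer(conn, sender, recipient, amount, simulate_crash=False):
    """Debit one account and credit another; both changes apply or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?",
            (amount, sender),
        )
        if simulate_crash:
            # Hypothetical failure injected between the two statements.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, recipient),
        )
        conn.execute("COMMIT")
    except Exception:
        # Undo whatever part of the transfer already executed.
        conn.execute("ROLLBACK")
        raise


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


# isolation_level=None keeps the sqlite3 module from opening transactions
# on its own, so BEGIN, COMMIT, and ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("Bill", 150), ("Bob", 20)]
)

transfer(conn, "Bill", "Bob", 100)
after_commit = balances(conn)  # {'Bill': 50, 'Bob': 120}

try:
    transfer(conn, "Bill", "Bob", 25, simulate_crash=True)
except RuntimeError:
    pass
after_rollback = balances(conn)  # unchanged: the debit was undone
```

After the failed second transfer, Bill's debit has been rolled back, so the table never records a half-finished operation.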
Another use for transactions is to make sure that the records involved in an operation are not modified by other clients while you're working with them. MySQL automatically performs locking for single SQL statements to keep clients from interfering with each other, but this is not always sufficient to guarantee that a database operation achieves its intended result, because some operations are performed over the course of several statements. In this case, different clients might interfere with each other. A transaction groups statements into a single execution unit to prevent concurrency problems that could otherwise occur in a multiple-client environment.
Transactional systems typically are characterized as providing ACID properties. ACID is an acronym for Atomic, Consistent, Isolated, and Durable, referring to four properties that transactions should have:
Atomic: The statements that make up a transaction form a logical unit. Either all of them take effect or none of them do.
Consistent: The database is in a consistent state before the transaction executes and after it completes. A transaction does not leave the database partially updated.
Isolated: One transaction does not interfere with another that executes at the same time.
Durable: When a transaction executes successfully to completion, its effects are recorded permanently in the database.
Transactional processing provides stronger guarantees about the outcome of database operations, but also requires more overhead in CPU cycles, memory, and disk space. MySQL offers some storage engines that are transaction-safe (such as InnoDB and BDB), and some that are not transaction-safe (such as MyISAM and MEMORY). Transactional properties are essential for some applications and not for others, and you can choose which ones make the most sense for your applications. Financial operations typically need transactions, and the guarantees of data integrity outweigh the cost of additional overhead. On the other hand, for an application that logs web page accesses to a database table, a loss of a few records if the server host crashes might be tolerable. In this case, you can use a non-transactional storage engine to avoid the overhead required for transactional processing.
Using Transactions to Ensure Safe Statement Execution
To use transactions, you must use a transactional storage engine. This means using either InnoDB or BDB tables. Engines such as MyISAM and MEMORY will not work. If you're not sure whether your MySQL server supports the InnoDB or BDB storage engines, see "Checking Which Storage Engines Are Available" earlier in the chapter.
By default, MySQL runs in autocommit mode, which means that changes made by individual statements are committed to the database immediately to make them permanent. In effect, each statement is its own transaction implicitly. To perform transactions explicitly, disable autocommit mode and then tell MySQL when to commit or roll back changes.
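The difference between implicit and explicit transactions can be observed in a small sketch. It uses Python's sqlite3 module in place of a MySQL server (an assumption made so that the example runs without a server), opened with isolation_level=None so that each statement is committed on its own unless a transaction is started explicitly. The in_transaction attribute of the connection reports whether a transaction is currently open.

```python
import sqlite3

# With isolation_level=None, statements outside an explicit transaction
# take effect immediately, much like autocommit mode.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (name TEXT UNIQUE)")

# Implicit transaction: the row is permanent as soon as the statement ends.
conn.execute("INSERT INTO t VALUES ('William')")
open_after_autocommit_insert = conn.in_transaction  # False

# Explicit transaction: nothing is permanent until COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES ('Wallace')")
open_inside_transaction = conn.in_transaction  # True
conn.execute("ROLLBACK")
open_after_rollback = conn.in_transaction  # False

# Only the autocommitted row survives the rollback.
names = [row[0] for row in conn.execute("SELECT name FROM t ORDER BY name")]
# names == ['William']
```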
One way to perform a transaction is to issue a START TRANSACTION statement to suspend autocommit mode, execute the statements that make up the transaction, and end the transaction with a COMMIT statement to make the changes permanent. If an error occurs during the transaction, cancel it by issuing a ROLLBACK statement instead to undo the changes. START TRANSACTION suspends the current autocommit mode, so after the transaction has been committed or rolled back, the mode reverts to its state prior to the START TRANSACTION. (If autocommit was enabled beforehand, ending the transaction puts you back in autocommit mode. If it was disabled, ending the current transaction causes you to begin the next one.)
The following example illustrates this approach. First, create a table to use:
mysql> CREATE TABLE t (name CHAR(20), UNIQUE (name)) ENGINE = InnoDB;
The statement creates an InnoDB table, but you can use BDB if you like. Next, initiate a transaction with START TRANSACTION, add a couple of rows to the table, commit the transaction, and then see what the table looks like:
mysql> START TRANSACTION;
mysql> INSERT INTO t SET name = 'William';
mysql> INSERT INTO t SET name = 'Wallace';
mysql> COMMIT;
mysql> SELECT * FROM t;
+---------+
| name    |
+---------+
| Wallace |
| William |
+---------+
You can see that the rows have been recorded in the table. If you had started up a second instance of mysql and selected the contents of t after the inserts but before the commit, the rows would not show up. They would not become visible to the second mysql process until the COMMIT statement had been issued by the first one.
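The visibility rule can be checked with two connections to the same database. The sketch below substitutes SQLite (through Python's sqlite3 module) for MySQL, which is an assumption made so that it runs without a server; the temporary file name and the writer and reader connection names are likewise made up for the example.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical database file
    # Two independent connections play the role of two mysql clients.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    writer.execute("CREATE TABLE t (name TEXT UNIQUE)")

    writer.execute("BEGIN")
    writer.execute("INSERT INTO t VALUES ('William')")
    writer.execute("INSERT INTO t VALUES ('Wallace')")

    # The second client looks at the table before the first one commits.
    seen_before_commit = reader.execute(
        "SELECT name FROM t ORDER BY name"
    ).fetchall()  # []

    writer.execute("COMMIT")

    # After the commit, the rows are visible to the second client.
    seen_after_commit = reader.execute(
        "SELECT name FROM t ORDER BY name"
    ).fetchall()  # [('Wallace',), ('William',)]

    writer.close()
    reader.close()
```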
Next, try another transaction to see what happens when an error occurs partway through and the transaction is rolled back:
mysql> START TRANSACTION;
mysql> INSERT INTO t SET name = 'Gromit';
mysql> INSERT INTO t SET name = 'Wallace';
ERROR 1062 (23000): Duplicate entry 'Wallace' for key 1
mysql> ROLLBACK;
mysql> SELECT * FROM t;
+---------+
| name    |
+---------+
| Wallace |
| William |
+---------+
The second INSERT attempts to place a row into the table that duplicates an existing name value. The statement fails because name has a UNIQUE index. After issuing the ROLLBACK, the table has only the two rows that it contained prior to the failed transaction. In particular, the INSERT that was performed just prior to the point of the error has been undone and its effect is not recorded in the table.
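In a program, the decision between COMMIT and ROLLBACK is usually made by an error handler rather than typed by hand. The following is a rough sketch of that pattern, again using Python's sqlite3 module as a stand-in for MySQL. The insert_all() helper is hypothetical; it rolls back as soon as the database rejects one of the rows.

```python
import sqlite3


def insert_all(conn, names):
    """Insert every name in one transaction; return False if it was rolled back."""
    conn.execute("BEGIN")
    try:
        for name in names:
            conn.execute("INSERT INTO t (name) VALUES (?)", (name,))
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # A duplicate violated the UNIQUE index: cancel the whole transaction,
        # including the rows that were inserted before the error.
        conn.execute("ROLLBACK")
        return False


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (name TEXT UNIQUE)")

first_ok = insert_all(conn, ["William", "Wallace"])  # True
second_ok = insert_all(conn, ["Gromit", "Wallace"])  # False: 'Wallace' exists

names = [row[0] for row in conn.execute("SELECT name FROM t ORDER BY name")]
# names == ['Wallace', 'William']; 'Gromit' was undone by the rollback
```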
Note: For older versions of MySQL that do not recognize START TRANSACTION, use BEGIN instead.
Another way to perform transactions is to manipulate the autocommit mode directly using SET statements:
SET autocommit = 0;
SET autocommit = 1;
Setting the autocommit variable to zero disables autocommit mode. The effect of any statements that follow becomes part of the current transaction, which you end by issuing a COMMIT or ROLLBACK statement to commit or cancel it. With this method, autocommit mode remains off until you turn it back on, so ending one transaction also begins the next one. You can also commit a transaction by re-enabling autocommit mode.
To see how this approach works, begin with the same table as for the previous examples:
mysql> DROP TABLE t;
mysql> CREATE TABLE t (name CHAR(20), UNIQUE (name)) ENGINE = InnoDB;
Then disable autocommit mode, insert some records, and commit the transaction:
mysql> SET autocommit = 0;
mysql> INSERT INTO t SET name = 'William';
mysql> INSERT INTO t SET name = 'Wallace';
mysql> COMMIT;
mysql> SELECT * FROM t;
+---------+
| name    |
+---------+
| Wallace |
| William |
+---------+
At this point, the two records have been committed to the table, but autocommit mode remains disabled. If you issue further statements, they become part of a new transaction, which may be committed or rolled back independently of the first transaction. To verify that autocommit is still off and that ROLLBACK will cancel uncommitted statements, issue the following statements:
mysql> INSERT INTO t SET name = 'Gromit';
mysql> INSERT INTO t SET name = 'Wallace';
ERROR 1062 (23000): Duplicate entry 'Wallace' for key 1
mysql> ROLLBACK;
mysql> SELECT * FROM t;
+---------+
| name    |
+---------+
| Wallace |
| William |
+---------+
To re-enable autocommit mode, use this statement:
mysql> SET autocommit = 1;
As just described, a transaction ends when you issue a COMMIT or ROLLBACK statement, or when you re-enable autocommit while it is disabled. Transactions also end under the following circumstances:
If the client connection ends or is broken before the transaction has been committed, the server rolls the transaction back automatically.
If you issue a statement that cannot be part of a transaction, such as ALTER TABLE, CREATE INDEX, DROP TABLE, RENAME TABLE, TRUNCATE TABLE, or LOCK TABLES, the server implicitly commits the current transaction before executing the statement.
If you issue START TRANSACTION (or BEGIN) while a transaction is already in progress, the current transaction is committed implicitly and a new one begins.
Transactions are useful in all kinds of situations. Suppose that you're working with the score table that is part of the grade-keeping project and you discover that the grades for two students have gotten mixed up and need to be switched. The incorrectly entered grades are as follows:
mysql> SELECT * FROM score WHERE event_id = 5 AND student_id IN (8,9);
+------------+----------+-------+
| student_id | event_id | score |
+------------+----------+-------+
|          8 |        5 |    18 |
|          9 |        5 |    13 |
+------------+----------+-------+
To fix this, student 8 should be given a score of 13 and student 9 a score of 18. That can be done easily with two statements:
UPDATE score SET score = 13 WHERE event_id = 5 AND student_id = 8;
UPDATE score SET score = 18 WHERE event_id = 5 AND student_id = 9;
However, it's necessary to ensure that both statements succeed as a unit. This is a problem to which transactional methods may be applied. To use START TRANSACTION, do this:
mysql> START TRANSACTION;
mysql> UPDATE score SET score = 13 WHERE event_id = 5 AND student_id = 8;
mysql> UPDATE score SET score = 18 WHERE event_id = 5 AND student_id = 9;
mysql> COMMIT;
To accomplish the same thing by manipulating the autocommit mode explicitly, do this:
mysql> SET autocommit = 0;
mysql> UPDATE score SET score = 13 WHERE event_id = 5 AND student_id = 8;
mysql> UPDATE score SET score = 18 WHERE event_id = 5 AND student_id = 9;
mysql> COMMIT;
mysql> SET autocommit = 1;
Either way, the result is that the scores are swapped properly:
mysql> SELECT * FROM score WHERE event_id = 5 AND student_id IN (8,9);
+------------+----------+-------+
| student_id | event_id | score |
+------------+----------+-------+
|          8 |        5 |    13 |
|          9 |        5 |    18 |
+------------+----------+-------+
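One caveat when a program issues such a correction: an UPDATE whose WHERE clause matches no row is not an error, so "both statements succeed" may need an explicit check. The sketch below shows one possible way to do that, using Python's sqlite3 module as a stand-in for MySQL. The set_scores() and scores_for() helpers and the simplified score table definition are assumptions for illustration. The rowcount test relies on SQLite counting every matched row; MySQL by default reports only rows whose values actually changed, so the same check would need adjusting there.

```python
import sqlite3


def set_scores(conn, event_id, new_scores):
    """Apply every student_id -> score change for one event, or none of them."""
    conn.execute("BEGIN")
    try:
        for student_id, score in new_scores.items():
            cur = conn.execute(
                "UPDATE score SET score = ? "
                "WHERE event_id = ? AND student_id = ?",
                (score, event_id, student_id),
            )
            # An UPDATE that matches no row raises no error, so check it.
            if cur.rowcount != 1:
                raise LookupError(f"no score row for student {student_id}")
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def scores_for(conn, event_id):
    return dict(conn.execute(
        "SELECT student_id, score FROM score WHERE event_id = ?", (event_id,)
    ))


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE score (student_id INTEGER, event_id INTEGER, "
    "score INTEGER, PRIMARY KEY (event_id, student_id))"
)
conn.executemany("INSERT INTO score VALUES (?, ?, ?)", [(8, 5, 18), (9, 5, 13)])

set_scores(conn, 5, {8: 13, 9: 18})
swapped = scores_for(conn, 5)  # {8: 13, 9: 18}

try:
    # Student 99 has no row for this event, so the whole change is cancelled.
    set_scores(conn, 5, {8: 0, 99: 0})
except LookupError:
    pass
unchanged = scores_for(conn, 5)  # still {8: 13, 9: 18}
```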
Using Transaction Savepoints
As of MySQL 4.1.1, it is possible to perform a partial rollback of a transaction. To do this, issue a SAVEPOINT statement to set a marker in the transaction. To roll back to just that point in the transaction later, use a ROLLBACK statement that names the savepoint. The following statements illustrate how this works:
mysql> CREATE TABLE t (i INT) ENGINE = INNODB;
mysql> START TRANSACTION;
mysql> INSERT INTO t VALUES(1);
mysql> SAVEPOINT my_savepoint;
mysql> INSERT INTO t VALUES(2);
mysql> ROLLBACK TO SAVEPOINT my_savepoint;
mysql> INSERT INTO t VALUES(3);
mysql> COMMIT;
mysql> SELECT * FROM t;
+------+
| i    |
+------+
|    1 |
|    3 |
+------+
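A savepoint is handy when part of a transaction is optional. As a minimal sketch, the hypothetical insert_with_optional() helper below always keeps a set of required rows and tries an optional group on top of them; if the optional group fails, only that group is undone. Python's sqlite3 module stands in for MySQL here (SQLite accepts the same SAVEPOINT and ROLLBACK TO SAVEPOINT statements), and the names table is made up for the example.

```python
import sqlite3

INSERT = "INSERT INTO t (name) VALUES (?)"


def insert_with_optional(conn, required, optional):
    """Insert required rows, plus the optional rows if all of them fit.

    Returns True if the optional rows were kept, False if they were undone.
    """
    conn.execute("BEGIN")
    try:
        for name in required:
            conn.execute(INSERT, (name,))
        conn.execute("SAVEPOINT optional_rows")
        try:
            for name in optional:
                conn.execute(INSERT, (name,))
            kept = True
        except sqlite3.IntegrityError:
            # Undo only the work done since the savepoint was set.
            conn.execute("ROLLBACK TO SAVEPOINT optional_rows")
            kept = False
        conn.execute("COMMIT")
        return kept
    except Exception:
        # A failure in the required rows cancels the whole transaction.
        conn.execute("ROLLBACK")
        raise


def all_names(conn):
    return [row[0] for row in conn.execute("SELECT name FROM t ORDER BY name")]


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (name TEXT UNIQUE)")

# 'Gromit' is inserted, then the duplicate 'William' fails, so the partial
# rollback removes 'Gromit' while the required 'William' row is committed.
first_kept = insert_with_optional(conn, ["William"], ["Gromit", "William"])
after_first = all_names(conn)  # ['William']

second_kept = insert_with_optional(conn, ["Wallace"], ["Gromit"])
after_second = all_names(conn)  # ['Gromit', 'Wallace', 'William']
```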
Transaction Isolation
Because MySQL is a multiple-user database system, different clients can attempt to use any given table at the same time. Storage engines such as MyISAM use table locking to keep clients from modifying a table at the same time, but this does not provide good concurrency performance when there are many updates. The InnoDB storage engine takes a different approach. It uses row-level locking for finer-grained control over table access by clients. One client can modify a row at the same time that another client reads or modifies a different row in the same table. If both clients want to modify a row at the same time, whichever of them acquires a lock on the row gets to modify it first. This provides better concurrency than table locking. However, it raises the question of whether one client's transaction should be able to see the changes made by another client's transaction.
InnoDB implements transaction isolation levels to give clients control over what kind of changes made by other transactions they want to see. Different isolation levels allow or prevent the various problems that can occur when different transactions run simultaneously:
Dirty reads: A dirty read occurs when a transaction sees changes made by another transaction that has not yet been committed. If the other transaction is rolled back, the first one has read data that never became permanent.
Nonrepeatable reads: A nonrepeatable read occurs when the same read operation yields different results at different times within a single transaction, for example because another transaction modified and committed some of the rows in between.
Phantom rows: A phantom is a row that becomes visible to a transaction when it was not visible previously, for example because another transaction inserted and committed a row that matches the conditions of a query the first transaction repeats.
To deal with these problems, InnoDB provides four transaction isolation levels. These levels determine which modifications made by one transaction can be seen by other transactions that execute at the same time:
READ UNCOMMITTED: A transaction can see row modifications made by other transactions even before they have been committed.
READ COMMITTED: A transaction can see row modifications made by other transactions only if they have been committed.
REPEATABLE READ: If a transaction performs a given SELECT twice, the result is the same each time, even if other transactions have changed or inserted rows in the meantime.
SERIALIZABLE: This level is similar to REPEATABLE READ but isolates transactions more completely. Rows examined by one transaction cannot be modified by other transactions until the first transaction completes.
Table 2.3 shows for each isolation level whether it allows dirty reads, nonrepeatable reads, or phantom rows. The table is InnoDB-specific in that REPEATABLE READ does not allow phantom rows to occur. Some database systems do allow phantoms at the REPEATABLE READ isolation level.
Table 2.3
Isolation Level     Dirty Reads   Nonrepeatable Reads   Phantom Rows
READ UNCOMMITTED    Yes           Yes                   Yes
READ COMMITTED      No            Yes                   Yes
REPEATABLE READ     No            No                    No
SERIALIZABLE        No            No                    No
The default InnoDB isolation level is REPEATABLE READ. This can be changed at server startup with the --transaction-isolation option, or at runtime with the SET TRANSACTION statement. The statement has three forms:
SET GLOBAL TRANSACTION ISOLATION LEVEL level
SET SESSION TRANSACTION ISOLATION LEVEL level
SET TRANSACTION ISOLATION LEVEL level
A client that has the SUPER privilege can use the GLOBAL form to change the global isolation level, which then applies to any clients that connect thereafter. In addition, any client can change its own transaction isolation level, either for all subsequent transactions within its session with the server (the SESSION form) or for just its next transaction (the form that names neither GLOBAL nor SESSION). No special privileges are required for the client-specific levels.
Non-Transactional Approaches to Transactional Problems
In a non-transactional environment, some transactional issues can be dealt with and some cannot. The following discussion covers what can and cannot be achieved without using transactions. You can use this information to determine whether an application can employ the techniques here and avoid the overhead of transaction-safe tables.
First, let's consider how concurrency problems can occur when multiple clients attempt to make changes to a database using operations that each require several statements. Suppose that you're in the garment sales business and your cash register software automatically updates your inventory levels whenever one of your salesmen processes a sale. The sequence of events shown here outlines the operations that take place when multiple sales occur. For the example, assume that the initial shirt inventory level is 47.
1. Salesman A sells three shirts, and the register software begins the inventory update by looking up the current shirt level (47).
2. In the meantime, Salesman B sells two shirts, and the register software looks up the current shirt level (still 47).
3. Salesman A's register computes the new level as 47 - 3 = 44 and stores 44 as the shirt count.
4. Salesman B's register computes the new level as 47 - 2 = 45 and stores 45 as the shirt count.
At the end of this sequence of events, you've sold five shirts. That's good. However, the inventory level says 45. That's bad, because it should be 42. The problem is that if you look up the inventory level in one statement and update the value in another statement, you have a multiple-statement operation. The action taken in the second statement is dependent on the value retrieved in the first. If separate multiple-statement operations occur during overlapping time frames, the statements from each operation intertwine and interfere with each other. To solve this problem, it's necessary that the statements for a given operation execute without interference from other operations. A transactional system ensures this by executing each salesman's statements as a unit and isolating them from each other. As a result, Salesman B's statements won't execute until those for Salesman A have completed.
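The interference can be replayed in a few lines. In the sketch below, a single SQLite connection (Python's sqlite3 module, standing in for MySQL) plays both registers so that the interleaving is deterministic; the inventory table, its column names, and the helper functions are hypothetical. For contrast, the second half performs each sale as a single statement that decrements the stored value, so no separately retrieved value can go stale.

```python
import sqlite3


def make_inventory(level):
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE inventory (item TEXT PRIMARY KEY, quantity INTEGER)"
    )
    conn.execute("INSERT INTO inventory VALUES ('shirt', ?)", (level,))
    return conn


def look_up(conn):
    return conn.execute(
        "SELECT quantity FROM inventory WHERE item = 'shirt'"
    ).fetchone()[0]


def store(conn, quantity):
    # Absolute update: writes a value computed from an earlier lookup.
    conn.execute(
        "UPDATE inventory SET quantity = ? WHERE item = 'shirt'", (quantity,)
    )


def sell(conn, sold):
    # Relative update: lookup and change happen in one statement.
    conn.execute(
        "UPDATE inventory SET quantity = quantity - ? WHERE item = 'shirt'",
        (sold,),
    )


# Interleaved lookup-then-store operations from two registers.
conn = make_inventory(47)
seen_by_a = look_up(conn)   # Salesman A sells 3 shirts and reads 47
seen_by_b = look_up(conn)   # Salesman B sells 2 shirts and also reads 47
store(conn, seen_by_a - 3)  # A stores 44
store(conn, seen_by_b - 2)  # B stores 45, overwriting A's update
absolute_result = look_up(conn)  # 45, although five shirts were sold

# The same two sales as single-statement decrements.
conn = make_inventory(47)
sell(conn, 3)
sell(conn, 2)
relative_result = look_up(conn)  # 42
```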
To deal with the concurrency issues inherent in the situation just described, you can take a couple of approaches:
Lock the tables explicitly. Surround the statements with LOCK TABLES and UNLOCK TABLES so that they execute as a group: lock the tables you need, issue the statements, and then release the locks. Other clients cannot change the tables while you hold the locks.
Use relative updates rather than absolute updates. Instead of looking up the inventory level and then storing a newly computed value, issue a single statement that decrements the stored level by the number of shirts sold, so that no separate lookup is involved.
Neither approach gives you rollback if an error occurs partway through a multiple-statement operation, and with explicit locking you must work out for yourself which tables need to be locked.
If any of these issues are significant for your applications, you should use transaction-safe tables instead, because transactional capabilities help you deal with each issue. A transaction handler executes a set of statements as a unit and manages concurrency issues by preventing clients from getting in the way of each other. It also allows rollback in the case of failure to keep half-executed operations from damaging your database, and it determines which locks are necessary and acquires them automatically.