Creating, Dropping, Indexing, and Altering Tables
MySQL allows you to create tables, drop (remove) them, and change their structure with the CREATE TABLE, DROP TABLE, and ALTER TABLE statements. The CREATE INDEX and DROP INDEX statements allow you to add or remove indexes on existing tables. The following sections provide the details for these statements. But first it's necessary to discuss the storage engines that MySQL supports for managing different types of tables.
Storage Engine Characteristics
MySQL supports multiple storage engines, or "table handlers" as they used to be known. Each storage engine implements tables that have a specific set of properties or characteristics. The following table lists the current engines and the versions in which they first became available.
Some of the engine names have synonyms. MRG_MyISAM, BerkeleyDB, and NDBCLUSTER are synonyms for MERGE, BDB, and NDB, respectively. The MEMORY and InnoDB storage engines originally were known as HEAP and Innobase. The latter names still are recognized, but are deprecated.
Checking Which Storage Engines Are Available
It's quite possible that a given MySQL server will not support all available storage engines. The engines actually available to you depend on your version of MySQL, how the server was configured at build time, and the options with which it was started. For details on selecting storage engines, see "Storage Engine Configuration," in Chapter 11.
mysql> SHOW ENGINES; +------------+---------+------------------------------------------------ ... | Engine | Support | Comment ... +------------+---------+------------------------------------------------ ... | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great perf ... | HEAP | YES | Alias for MEMORY ... | MEMORY | YES | Hash based, stored in memory, useful for tempor ... | MERGE | YES | Collection of identical MyISAM tables ... | MRG_MYISAM | YES | Alias for MERGE ... | ISAM | NO | Obsolete storage engine, now replaced by MyISAM ... | MRG_ISAM | NO | Obsolete storage engine, now replaced by MERGE ... | InnoDB | YES | Supports transactions, row-level locking, and f ... | INNOBASE | YES | Alias for INNODB ... | BDB | YES | Supports transactions and page-level locking ... | BERKELEYDB | YES | Alias for BDB ... | NDBCLUSTER | NO | Clustered, fault-tolerant, memory-based tables ... | NDB | NO | Alias for NDBCLUSTER ... | EXAMPLE | NO | Example storage engine ... | ARCHIVE | NO | Archive storage engine ... | CSV | YES | CSV storage engine ... | FEDERATED | YES | Federated MySQL storage engine ... +------------+---------+------------------------------------------------ ...
The value in the Support column is YES or NO to indicate that the engine is or is not available, DISABLED if the engine is present but turned off, or DEFAULT for the storage engine that the server uses by default. The engine designated as DEFAULT should be considered available.
Availability of storage engines also can be determined to some extent by referring to the server's version number or system variables. ISAM is the only storage engine available before MySQL 3.23. From 3.23 on, MyISAM, MERGE, and MEMORY are always available, and availability of the other types can be assessed by means of an appropriate SHOW VARIABLES statement:
SHOW VARIABLES LIKE 'have_isam'; SHOW VARIABLES LIKE 'have_bdb'; SHOW VARIABLES LIKE 'have_innodb';
If the output from SHOW VARIABLES shows that the variable has a value of YES, the corresponding storage engine is enabled. If the value is something else or there is no output, the engine is unavailable.
Table Representation on Disk
Each time you create a table, MySQL creates a disk file that contains the table's format (that is, its definition). This file is stored in the database directory for the database that the table belongs to. This is true no matter which storage engine manages the table. The format file has a basename that is the same as the table name and an .frm extension. That is, for a table named t, the format file is named t.frm. The association of each table with an frm file has a table-naming consequence: Table names must be legal for use in filenames.
Most storage engines also create other files that are unique to the table, to be used for storing the table's content. For any given table, the files specific to it are located in the directory that represents the database that contains the table. Table 2.2 shows the filename extensions for the table-specific files that each storage engine creates.
The following sections describe specific characteristics of MySQL's storage engines.
The ISAM Storage Engine
The ISAM storage engine manages tables that use the indexed sequential access method. ISAM was the original storage engine in MySQL, and was the only one available prior to MySQL 3.23. ISAM has since been superseded by the MyISAM storage engine. MyISAM tables are the preferred general replacement because they have fewer limitations. The ISAM engine currently is still available but considered obsolete and support for it will fade over time. For example, ISAM support has been omitted from the embedded server, and probably will disappear entirely sometime in MySQL 5.0.
Due to the decline in both the level of support for the ISAM engine and its use in the field, ISAM is not covered very much elsewhere in this book. In most cases, it is mentioned only to point out how it embodied a restriction that has been lifted in the MyISAM storage engine.
The MyISAM Storage Engine
The MERGE Storage Engine
MERGE tables are a means for grouping a set of MyISAM tables into a single logical unit. By querying a MERGE table, you in effect query all the constituent tables. One advantage of this is that you can in effect exceed the maximum table size allowed by the filesystem for individual MyISAM tables.
The tables that make up a MERGE table must all have the same structure. This means the columns in each table must be defined with the same names and types in the same order, and the indexes must be defined in the same way and in the same order. It is allowable to mix compressed and uncompressed tables. (Compressed tables are produced with myisampack; see Appendix F, "MySQL Program Reference.")
The MEMORY Storage Engine
The MEMORY storage engine uses tables that are stored in memory and that have fixed-length rows, two characteristics that make them very fast. The MEMORY storage engine originally was called the HEAP engine; you will need to use HEAP if your server is older and does not recognize the MEMORY keyword.
MEMORY tables are temporary in the sense that their contents disappear when the server terminates. That is, MEMORY tables still exist when the server restarts, but will be empty. However, in contrast to temporary tables created with CREATE TEMPORARY TABLE, MEMORY tables are visible to other clients. Several constraints apply to MEMORY tables that allow them to be handled more simply, and thus more quickly:
MEMORY tables can use different character sets for different columns.
The InnoDB Storage Engine
The BDB Storage Engine
The Berkeley DB storage engine was developed by Sleepycat Software. You may have encountered it in other applications. For example, the Subversion revision control system uses BDB to provide backing store for repositories. In MySQL, the BDB storage engine offers these features:
The FEDERATED Storage Engine
The FEDERATED storage engine provides access to tables that are located at other MySQL servers. In other words, the contents of a FEDERATED table really are located remotely. When you create a FEDERATED table, you specify the host where the other server is running and provide the username and password of an account on that server. When you access the FEDERATED table, the local server connects to the remote server using this account. For an example, see "Using FEDERATED Tables."
The NDB Storage Engine
NDB is MySQL's cluster storage engine. It was developed by Ericsson Business Innovation and later acquired by MySQL AB. For this storage engine, the MySQL server actually acts as a client to a cluster of other processes that provide access to the NDB tables. Cluster node processes communicate with each other to manage tables in memory. The tables are replicated among cluster processes for redundancy. Memory storage provides high performance, and the cluster provides high availability because it survives failure of any given node.
NDB is relatively new in MySQL and configuration management is still being worked out. For this reason, NDB is not covered further here. See the MySQL Reference Manual for current details.
Other Storage Engines
There are three other MySQL storage engines that I will group here under the "miscellaneous" category:
Storage Engine Portability Characteristics
Any table managed by a given MySQL server is portable to another server in the sense that you can dump it into a text file with mysqldump, move the file to the machine where the other server runs, and load the file to re-create the table. Another kind of portability is "binary portability," which means that you can directly copy the disk files that represent the table to another machine, install them into a database directory, and expect the MySQL server there to be able to use the table.
A general condition for binary portability is that the destination server must support the storage engine that manages the table. If the server does not have the appropriate engine, it cannot access tables created by that engine.
Some storage engines create tables that are binary portable and some do not. The following list characterizes binary portability for individual engines:
Regardless of a storage engine's general portability characteristics, normally you should not attempt to copy table or tablespace files to another machine unless the server has been shut down cleanly. If you perform a copy after an abnormal shutdown, you cannot assume the integrity of your tables. The tables may be in need of repair or there may be transaction information still stored in a storage engine's log files that needs to be applied or rolled back to bring tables up to date.
It is sometimes possible to tell a running server to leave tables alone while you copy them. However, if the server is running and actively updating the tables, the table contents on disk will be in flux and the associated files will not yield usable table copies. See Chapter 13, "Database Backups, Maintenance, and Repair," for discussion of the conditions under which you can leave the server running while copying tables.
To create a table, use a CREATE TABLE statement. The full syntax for this statement is complex because there are so many optional clauses, but in practice, it's usually fairly simple to use. For example, most of the CREATE TABLE statements that we used in Chapter 1 are reasonably uncomplicated. If you start with the more basic forms and work up, you shouldn't have much trouble.
The CREATE TABLE specifies, at a minimum, the table name and a list of the columns in it. For example:
CREATE TABLE mytbl ( name CHAR(20), age INT NOT NULL, weight INT, sex ENUM('F','M') );
In addition to the column definitions, you can specify how the table should be indexed when you create it. Another option is to leave the table unindexed when you create it and add the indexes later. For MyISAM tables, that's a good strategy if you plan to populate the table with a lot of data before you begin using it for queries. Updating indexes as you insert each row is much slower for those table types than loading the data into an unindexed table and creating the indexes afterward.
We have already covered the basic syntax for the CREATE TABLE statement in Chapter 1. Details on how to write column definitions are given in Chapter 3. Here, we deal more generally with some important extensions to the CREATE TABLE statement that give you a lot of flexibility in how you construct tables:
To modify a table's storage characteristics, you can add table options following the closing parenthesis in the CREATE TABLE statement. For example, you can add an ENGINE = engine_name option to specify which storage engine to use for the table. The engine name is not case sensitive. To create a MEMORY or InnoDB table, write the statement like this:
CREATE TABLE mytbl ( ... ) ENGINE = MEMORY; CREATE TABLE mytbl ( ... ) ENGINE = InnoDB;
TYPE can be used as a synonym for the ENGINE keyword, but you will get a warning in MySQL 4.1 and up. (For older servers that do not understand ENGINE, you must use TYPE.)
With no ENGINE specifier, the server creates the table using the default storage engine. The built-in default is MyISAM, but you can configure the server to use a different default by starting it with the --default-storage-engine option. At runtime, you can change the default storage engine by setting the storage_engine system variable.
If a CREATE TABLE statement names a storage engine that is legal but unavailable, MySQL creates the table using the default engine and generates a warning. For example, if BDB is not available, you would see something like this if you try to create a BDB table:
mysql> CREATE TABLE t (i INT) ENGINE = BDB; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql> SHOW WARNINGS; +---------+------+-------------------------------------------+ | Level | Code | Message | +---------+------+-------------------------------------------+ | Warning | 1266 | Using storage engine MyISAM for table 't' | +---------+------+-------------------------------------------+ 1 row in set (0.00 sec)
If you name an unknown storage engine, an error occurs.
When you want to make sure that a table uses a particular storage engine, be sure to include the ENGINE table option. Because the default engine is configurable, you might not get the type of table that you want if you omit ENGINE.
mysql> SHOW CREATE TABLE t\G *************************** 1. row *************************** Table: t Create Table: CREATE TABLE `t` ( `i` int(11) default NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1
Multiple table options can be given in the CREATE TABLE statement. Many of them apply only to particular storage engines. For example, a MIN_ROWS = n option can be useful for MEMORY tables to allow the MEMORY storage engine to optimize memory usage:
CREATE TABLE mytbl ( ... ) ENGINE = MEMORY MIN_ROWS = 10000;
The MAX_ROWS and AVG_ROW_LENGTH options can help you size a MyISAM table. By default, MyISAM creates tables with an internal row pointer size that allows table files to grow up to 4GB. Specifying the MAX_ROWS and AVG_ROW_LENGTH options gives MyISAM information that it should use a pointer size for a table that can hold at least MAX_ROWS rows.
A complete list of table options is given in the description for CREATE TABLE in Appendix E.
ALTER TABLE mytbl ENGINE = InnoDB;
See "Altering Table Structure" for more information about changing storage engines.
Provisional Table Creation
To create a table only if it doesn't already exist, use CREATE TABLE IF NOT EXISTS. You can use this statement for an application that makes no assumptions about whether a table that it needs has been set up in advance. The application can go ahead and attempt to create the table as a matter of course. The IF NOT EXISTS modifier is particularly useful for scripts that you run as batch jobs with mysql. In this context, a regular CREATE TABLE statement doesn't work very well. The first time the job runs, it creates the table, but the second time an error occurs because the table already exists. If you use IF NOT EXISTS, there is no problem. The first time the job runs, it creates the table, as before. For second and subsequent times, table creation attempts are silently ignored without error. This allows the job to continue processing as if the attempt had succeeded.
If you use IF NOT EXISTS, be aware that MySQL does not compare the table structure in the CREATE TABLE statement with that of the existing table. If a table exists with the given name but has a different structure, the statement does not fail. If that is a risk you do not want to take, it might be better instead to use DROP TABLE IF NOT EXISTS followed by CREATE TABLE without IF EXISTS.
You can use CREATE TEMPORARY TABLE to create temporary tables that disappear automatically when your connection to the server terminates. This is handy because you don't have to bother issuing a DROP TABLE statement to get rid of the table, and the table doesn't hang around if your connection terminates abnormally. For example, if you have a canned query in a batch file that you run with mysql and you decide not to wait for it to finish, you can kill the script in the middle with impunity and the server will remove any TEMPORARY tables that the script creates.
A TEMPORARY table is visible only to the client that creates the table. Different clients each can create a TEMPORARY table with the same name. The tables do not conflict because each client sees only the table that it created.
The name of a TEMPORARY table can be the same as that of an existing permanent table. This is not an error, nor does the existing permanent table get clobbered. Instead, the permanent table becomes hidden (inaccessible) to the client that creates the TEMPORARY table while the TEMPORARY table exists. Suppose that you create a TEMPORARY table named member in the sampdb database. The original member table becomes hidden, and references to member refer to the TEMPORARY table. If you issue a DROP TABLE member statement, the TEMPORARY table is removed and the original member table "reappears." If you simply disconnect from the server without dropping the TEMPORARY table, the server automatically drops it for you. The next time you connect, the original member table is visible again. (The original table also reappears if you rename a TEMPORARY table that hides it to have a different name.)
The server drops a TEMPORARY table automatically when your client session ends, but you drop it explicitly as soon as you're done with it to allow the server to free any resources associated with it. This is a good idea if your session with the server will not end for a while, particularly for temporary MEMORY tables.
Creating Tables from Other Tables or Query Results
It's sometimes useful to create a copy of a table. For example, you might have a data file that you want to load into a table using LOAD DATA, but you're not quite sure about the options for specifying the data format. You can end up with malformed records in the original table if you don't get the options right the first time. Using an empty copy of the original table allows you to experiment with the LOAD DATA options for specifying column and line delimiters until you're satisfied your input records are being interpreted properly. Then you can load the file into the original table by rerunning the LOAD DATA statement with the original table name.
It's also sometimes desirable to save the result of a query into a table rather than watching it scroll off the top of your screen. By saving the result, you can refer to it later without rerunning the original queryperhaps to perform further analysis on it.
MySQL provides two statements for creating new tables from other tables or from query results. These statements have differing advantages and disadvantages:
CREATE TABLE new_tbl_name LIKE tbl_name;
To create an empty copy of a table and then populate it from the original table, use CREATE TABLE … LIKE followed by INSERT INTO … SELECT:
CREATE TABLE new_tbl_name LIKE tbl_name; INSERT INTO new_tbl_name SELECT * FROM tbl_name;
CREATE TEMPORARY TABLE new_tbl_name LIKE tbl_name; INSERT INTO new_tbl_name SELECT * FROM tbl_name;
Using a TEMPORARY table with the same name as the original can be useful when you want to try some statements that modify the contents of the table, but you don't want to change the original table. To use prewritten scripts that use the original table name, you don't need to edit them to refer to a different table; just add the CREATE TEMPORARY TABLE and INSERT statements to the beginning of the script. The script will create a temporary copy and operate on the copy, which the server deletes when the script finishes. (However, bear in mind the auto-reconnect caveat noted earlier in "Temporary Tables.")
To insert into the new table only some of the rows from the original table, add a WHERE clause that identifies which rows to select. The following statements create a new table named student_f that contains only the records for female students in the student table:
CREATE TABLE student_f LIKE student; INSERT INTO student_f SELECT * FROM student WHERE sex = 'f';
If you don't care about retaining the exact column definitions from the original table, CREATE TABLE … SELECT sometimes is easier to use than CREATE TABLE … LIKE because it can create and populate the new table in a single statement:
CREATE TABLE student_f SELECT * FROM student WHERE sex = 'f';
CREATE TABLE … SELECT also can create new tables that don't contain exactly the same set of columns in an existing table. You can use it to cause a new table to spring into existence on-the-fly to hold the result of an arbitrary SELECT query. This makes it exceptionally easy to create a table fully populated with the data in which you're interested, ready to be used in further statements. However, the new table can contain strange column names if you're not careful. To avoid this, you should use aliases as necessary to provide reasonable column names. When you create a table by selecting data into it, the column names are taken from the columns that you are selecting. If a column is calculated as the result of an expression, the name of the column is the text of the expression, which creates a table with an unusual column name:
mysql> CREATE TABLE mytbl SELECT PI() * 2; mysql> SELECT * FROM mytbl; +----------+ | PI() * 2 | +----------+ | 6.283185 | +----------+
That's unfortunate, because the column name can be referred to directly only as a quoted identifier:
mysql> SELECT `PI() * 2` FROM mytbl; +----------+ | PI() * 2 | +----------+ | 6.283185 | +----------+
To provide a column name that is easier to work with, use an alias:
mysql> DROP TABLE mytbl; mysql> CREATE TABLE mytbl SELECT PI() * 2 AS mycol; mysql> SELECT mycol FROM mytbl; +----------+ | mycol | +----------+ | 6.283185 | +----------+
A related snag occurs if you select from different tables columns that have the same name. Suppose that tables t1 and t2 both have a column c and you want to create a table from all combinations of rows in both tables. The following statement fails because it attempts to create a table with two columns named c:
mysql> CREATE TABLE t3 SELECT * FROM t1, t2; ERROR 1060 (42S21): Duplicate column name 'c'
To solve this problem, provide aliases that give each column a unique name in the new table:
mysql> CREATE TABLE t3 SELECT t1.c AS c1, t2.c AS c2 FROM t1, t2;
As mentioned previously, a shortcoming of CREATE TABLE … SELECT is that not all characteristics of the original data are incorporated into the structure of the new table. For example, creating a table by selecting data into it does not copy indexes from the original table, and it can lose column attributes such as the default value. In some cases, you can force specific attributes to be used in the new table by invoking the CAST() function in the SELECT part of the statement. The following CREATE TABLE … SELECT statement forces the columns produced by the SELECT to be treated as INT UNSIGNED, DATE, and BINARY. You can verify that with DESCRIBE:
mysql> CREATE TABLE mytbl SELECT -> CAST(1 AS UNSIGNED) AS i, -> CAST(CURDATE() AS DATE) AS d, -> CAST('Hello, world' AS BINARY) AS c; mysql> DESCRIBE mytbl; +-------+-----------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------------+------+-----+---------+-------+ | i | int(1) unsigned | | | 0 | | | d | date | YES | | NULL | | | c | binary(12) | | | | | +-------+-----------------+------+-----+---------+-------+
The allowable cast types are BINARY (binary string), DATE, DATETIME, TIME, SIGNED, SIGNED INTEGER, UNSIGNED, and UNSIGNED INTEGER.
It is also possible to provide explicit column definitions in the CREATE TABLE part, to be used for the columns retrieved by the SELECT part. Columns in the two parts are matched by name, so provide aliases in the SELECT part as necessary to cause them to match up properly:
mysql> CREATE TABLE mytbl (i INT UNSIGNED, d DATE, c BINARY(20)) -> SELECT -> 1 AS i, -> CURDATE() AS d, -> 'Hello, world' AS c; mysql> DESCRIBE mytbl; +-------+------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+------------------+------+-----+---------+-------+ | i | int(10) unsigned | YES | | NULL | | | d | date | YES | | NULL | | | c | binary(20) | YES | | NULL | | +-------+------------------+------+-----+---------+-------+
The technique of providing explicit definitions allows you to create character columns that have a different width than that of the longest value in the result set. Also note that the Null and Default attributes of some of the columns are different for this example than for the previous one. You can provide explicit definitions for those attributes as well if necessary.
Using MERGE Tables
The MERGE storage engine provides a way to perform queries on a set of MyISAM tables simultaneously by treating them all as a single logical unit. As described earlier in "Storage Engine Characteristics," MERGE can be applied to a collection of MyISAM tables that all have identical structure. The columns in each table must be defined with the same names and types in the same order, and the indexes must be defined in the same way and in the same order.
Suppose that you have a set of individual log tables that contain log entries on a year-by-year basis and that each are defined like this, where CC and YY represent the century and year:
CREATE TABLE log_CCYY ( dt DATETIME NOT NULL, info VARCHAR(100) NOT NULL, INDEX (dt) ) ENGINE = MyISAM;
If the current set of log tables includes log_2001, log_2002, log_2003, log_2004, and log_2005, you can set up a MERGE table that maps onto them like this:
CREATE TABLE log_all ( dt DATETIME NOT NULL, info VARCHAR(100) NOT NULL, INDEX (dt) ) ENGINE = MERGE UNION = (log_2001, log_2002, log_2003, log_2004, log_2005);
The ENGINE value must be MERGE, and the UNION option lists the tables to be included in the MERGE table. After the table has been set up, you query it just like any other table, but the queries will refer to all the constituent tables at once. This query determines the total number of rows in all the log tables:
SELECT COUNT(*) FROM log_all;
This query determines how many log entries there are per year:
SELECT YEAR(dt) AS y, COUNT(*) AS entries FROM log_all GROUP BY y;
MERGE tables also support DELETE and UPDATE operations. INSERT is trickier because MySQL needs to know which table to insert new records into. MERGE table definitions can include an INSERT_METHOD option with a value of NO, FIRST, or LAST to indicate that INSERT is forbidden or that records should be inserted into the first or last table named in the UNION option. For example, the following definition would cause an INSERT into log_all to be treated as an INSERT into log_2005, the last table named in the UNION option:
CREATE TABLE log_all ( dt DATETIME NOT NULL, info VARCHAR(100) NOT NULL, INDEX (dt) ) ENGINE = MERGE UNION = (log_2001, log_2002, log_2003, log_2004, log_2005) INSERT_METHOD = LAST;
Using FEDERATED Tables
The FEDERATED storage engine is available as of MySQL 5.0.3. It enables you to access tables from one MySQL server that actually are managed by another server. This section briefly summarizes how to use this storage engine.
Suppose that there is no sampdb database on your local server, but there is one available from the MySQL server running on the host corn.snake.net and that you have an account for accessing that server. This account also can be used by the local server through the FEDERATED storage engine to make the sampdb tables available on the local server. For each table that you want to access this way, create a FEDERATED table that has the same columns as the remote table, but include a connection string that indicates to the local server how to connect to the remote server. This is done with the COMMENT table option.
For example, the student table on the remote server has this definition:
CREATE TABLE student ( name VARCHAR(20) NOT NULL, sex ENUM('F','M') NOT NULL, student_id INT UNSIGNED NOT NULL AUTO_INCREMENT, PRIMARY KEY (student_id) ) ENGINE = InnoDB;
To create a FEDERATED table, use the same definition except that the ENGINE option should be FEDERATED and a COMMENT option should be given that provides connection information. The following definition creates a table named federated_student that accesses the student table on corn.snake.net:
CREATE TABLE federated_student ( name VARCHAR(20) NOT NULL, sex ENUM('F','M') NOT NULL, student_id INT UNSIGNED NOT NULL AUTO_INCREMENT, PRIMARY KEY (student_id) ) ENGINE = FEDERATED COMMENT = 'mysql://sampadm:firstname.lastname@example.org/sampdb/student';
The connection string in the COMMENT value indicates that the username and password of the MySQL account on the remote server are sampadm and secret. The general connection string syntax is as follows, where square brackets indicate optional information:
After you create the federated_student table, you can select from it to access the remote student table. You can also use INSERT, UPDATE, and DELETE with federated_student to modify the contents of the student table.
The FEDERATED engine is quite new, so some details are likely to change. For example, an alternative to the COMMENT option for storing the connection string might be implemented to prevent the name and password from being visible to anyone who can use SHOW CREATE TABLE for a FEDERATED table.
DROP TABLE tbl_name;
Indexes are the primary means of speeding up access to the contents of your tables, particularly for queries that involve joins on multiple tables. This is an important enough topic that Chapter 4, "Query Optimization," discusses why you use indexes, how they work, and how best to take advantage of them to optimize your queries. This section covers the characteristics of indexes for the various table types and the syntax for creating and dropping indexes.
Storage Engine Indexing Characteristics
MySQL provides quite a bit of flexibility in the way you can construct indexes:
Not all storage engines offer all indexing features. The following table summarizes the indexing properties of the various table types. The table does not include the MERGE storage engine, because MERGE tables are created from MyISAM tables and have similar indexing characteristics. Nor does it include the EXAMPLE, ARCHIVE, or CSV engines, which do not support indexing.
The table illustrates some of the reasons why the MyISAM storage engine generally is to be preferred over the ISAM engine that it succeeds. MyISAM relaxes several of the indexing constraints that apply to ISAM tables. For example, with MyISAM tables, you can index columns that contain NULL values, you can index BLOB and TEXT columns, and you can have a larger number of indexes per table.
One implication of the differences in indexing characteristics for the various storage engines is that if you require an index to have certain properties, you may not be able to use certain types of tables. For example, if you want to use a FULLTEXT index, you must use a MyISAM table. If you want to use foreign keys, you must use an InnoDB table.
If you have an existing table of one type but would like to convert it to another type that has more suitable indexing characteristics, use ALTER TABLE to change the type. Suppose that you were using ISAM tables in an older version of MySQL but now have upgraded to a newer version. To take advantage of MyISAM's superior indexing features, you can easily convert each table to use the MyISAM storage engine with ALTER TABLE:
ALTER TABLE tbl_name ENGINE = MyISAM;
ALTER TABLE tbl_name ENGINE = InnoDB; ALTER TABLE tbl_name ENGINE = BDB;
MySQL can create several types of indexes:
You can create indexes for a new table when you use CREATE TABLE. Examples of this are shown in Chapter 1. To add indexes to existing tables, use CREATE INDEX or ALTER TABLE. MySQL maps CREATE INDEX statements onto ALTER TABLE operations internally.
ALTER TABLE tbl_name ADD INDEX index_name (index_columns); ALTER TABLE tbl_name ADD UNIQUE index_name (index_columns); ALTER TABLE tbl_name ADD PRIMARY KEY (index_columns); ALTER TABLE tbl_name ADD FULLTEXT index_name (index_columns); ALTER TABLE tbl_name ADD SPATIAL index_name (index_columns);
tbl_name is the name of the table to which the index should be added, and index_columns indicates which column or columns to index. If the index consists of more than one column, separate the names by commas. The index name index_name is optional. If you leave it out, MySQL picks a name based on the name of the first indexed column.
Indexed columns can be NULL unless the index is a PRIMARY KEY or SPATIAL index.
A single ALTER TABLE statement can include multiple table alterations if you separate them by commas. This means that you can create several indexes at the same time, which is faster than adding them one at a time with individual ALTER TABLE statements.
CREATE INDEX index_name ON tbl_name (index_columns); CREATE UNIQUE INDEX index_name ON tbl_name (index_columns); CREATE FULLTEXT INDEX index_name ON tbl_name (index_columns); CREATE SPATIAL INDEX index_name ON tbl_name (index_columns);
tbl_name, index_name, and index_columns have the same meaning as for ALTER TABLE. Unlike ALTER TABLE, the index name is not optional with CREATE INDEX, and you cannot create multiple indexes with a single statement.
To create indexes for a new table when you issue a CREATE TABLE statement, the syntax is similar to that used for ALTER TABLE, but you specify the index-creation clauses in addition to the column definitions:
CREATE TABLE tbl_name ( ... column definitions ... INDEX index_name (index_columns), UNIQUE index_name (index_columns), PRIMARY KEY (index_columns), FULLTEXT index_name (index_columns), SPATIAL index_name (index_columns), ... );
As with ALTER TABLE, index_name is optional. MySQL picks an index name if you leave it out.
As a special case, you can create a single-column PRIMARY KEY or UNIQUE index by adding a PRIMARY KEY or UNIQUE clause to the end of a column definition. For example, the following CREATE TABLE statements are equivalent:
CREATE TABLE mytbl ( i INT NOT NULL PRIMARY KEY, j CHAR(10) NOT NULL UNIQUE ); CREATE TABLE mytbl ( i INT NOT NULL, j CHAR(10) NOT NULL, PRIMARY KEY (i), UNIQUE (j) );
The default index type for a MEMORY table is HASH. A hashed index is very fast for exact-value lookups, which is the typical way MEMORY tables are used. However, if you plan to use a MEMORY table for comparisons that can match a range of values (for example, id < 100), hashed indexes do not work well. In this case, you'll be better off creating a BTREE index instead. Do this by adding a USING clause to the index definition:
CREATE TABLE namelist ( id INT NOT NULL, name CHAR(100), INDEX USING BTREE (id) ) ENGINE = MEMORY;
To index a prefix of a string column, the syntax for naming the column in the index definition is col_name(n) rather than simply col_name. The prefix value, n, can be from 1 to 255. (1 to 1000 for MyISAM and InnoDB as of MySQL 4.1.2.) This indicates that the index should include the first n bytes of column values for binary string types, or the first n characters for non-binary string types. For example, the following statement creates a table with a CHAR column and a BINARY column. It indexes the first 10 characters of the CHAR column and the first 15 bytes of the BINARY column:
CREATE TABLE mytbl ( name CHAR(30) NOT NULL, address BINARY(60) NOT NULL, INDEX (name(10)), INDEX (address(15)) );
When you index a prefix of a string column, the prefix length, just like the column length, is specified in the same units as the column data typethat is, bytes for binary strings and characters for non-binary strings. However, the maximum size of index entries are measured internally in bytes. The two measures are the same for single-byte character sets, but not for multi-byte character sets. For non-binary strings that have multi-byte character sets, MySQL stores into index values as many complete characters that fit within the allowed maximum byte length.
In some circumstances, you may find it not only desirable but necessary to index a column prefix rather than the entire column:
To drop an index, use either a DROP INDEX or an ALTER TABLE statement. To use DROP INDEX, you must name the index to be dropped. (To drop a PRIMARY KEY with DROP INDEX, use the quoted identifier `PRIMARY`.) The syntax for DROP INDEX looks like this:
DROP INDEX index_name ON tbl_name; DROP INDEX `PRIMARY` ON tbl_name;
Like the CREATE INDEX statement, DROP INDEX is handled internally as an ALTER TABLE statement. The syntax for ALTER TABLE statements that correspond to the preceding DROP INDEX statements is as follows:
ALTER TABLE tbl_name DROP INDEX index_name; ALTER TABLE tbl_name DROP PRIMARY KEY;
If you don't know the names of a table's indexes, use SHOW CREATE TABLE or SHOW INDEX to find out.
When you drop columns from a table, indexes can be affected implicitly. If you drop a column that is a part of an index, the column is removed from the index as well. If all columns that make up an index are dropped, the entire index is dropped.
Altering Table Structure
ALTER TABLE is a versatile statement in MySQL, and you can use it for many purposes. We've already seen some of its capabilities in this chapter (for changing storage engines and for creating and dropping indexes). You can also use ALTER TABLE to rename tables, add or drop columns, change column data types, and more. In this section, we'll cover some of its features. The complete syntax for ALTER TABLE is described in Appendix E.
ALTER TABLE is useful when you find that the structure of a table no longer reflects what you want to do with it. You might want to use the table to record additional information, or perhaps it contains information that has become superfluous. Maybe existing columns are too small. Perhaps it turns out that you've defined them larger than you need and you'd like to make them smaller to save space and improve query performance. Here are some situations in which ALTER TABLE is valuable:
The syntax for ALTER TABLE looks like this:
ALTER TABLE tbl_name action [, action] ... ;
Each action specifies a modification that you want to make to the table. Some database systems allow only a single action in an ALTER TABLE statement. MySQL allows multiple actions, separated by commas.
Tip: If you need to remind yourself about a table's current definition before using ALTER TABLE, issue a SHOW CREATE TABLE statement. This statement also can be useful after ALTER TABLE to see how the alteration affected the table definition.
The following examples show some of the capabilities of ALTER TABLE.
Changing a column's data type. To change a data type, you can use either a CHANGE or MODIFY clause. Suppose that the column in a table mytbl is SMALLINT UNSIGNED and you want to change it to MEDIUMINT UNSIGNED. Do so using either of the following commands:
ALTER TABLE mytbl MODIFY i MEDIUMINT UNSIGNED; ALTER TABLE mytbl CHANGE i i MEDIUMINT UNSIGNED;
Why is the column named twice in the command that uses CHANGE? Because one thing that CHANGE can do that MODIFY cannot is to rename the column in addition to changing the type. If you had wanted to rename i to j at the same time you changed the type, you'd do so like this:
ALTER TABLE mytbl CHANGE i j MEDIUMINT UNSIGNED;
The important thing with CHANGE is that you name the column you want to change and then specify a complete column definition, which includes the column name. That is, you must include the name in the definition, even if it's the same as the old name.
To rename a column, use CHANGE old_name new_name followed by the column's current definition.
You can assign character sets to individual columns, so it's possible to use the CHARACTER SET attribute in a column's definition to change its character set:
ALTER TABLE t MODIFY c CHAR(20) CHARACTER SET ucs2;
An important reason for changing data types is to improve query efficiency for joins that compare columns from two tables. Indexes often can be used for comparisons in joins between similar column types, but comparisons are quicker when both columns are exactly the same type. Suppose that you're running a query like this:
SELECT ... FROM t1, t2 WHERE t1.name = t2.name;
If t1.name is CHAR(10) and t2.name is CHAR(15), the query won't run as quickly as if they were both CHAR(15). You can make them the same by changing t1.name using either of these commands:
ALTER TABLE t1 MODIFY name CHAR(15); ALTER TABLE t1 CHANGE name name CHAR(15);
ALTER TABLE tbl_name ENGINE = engine_name;
engine_name is a name such as MyISAM, MEMORY, BDB, or InnoDB. Lettercase of the name does not matter.
Changing storage engines can be useful when you upgrade your MySQL installation to a newer version that provides additional table-handling features. For example, if you inherit an old pre-3.23 database, its tables will be in ISAM format. To change them to MyISAM tables, use this statement for each one:
ALTER TABLE tbl_name ENGINE = MyISAM;
Doing this allows you to take advantages of the capabilities that MyISAM offers that ISAM does not. For example, MyISAM tables are binary portable, so you can move them to other machines by copying table files directly, even if the machines have different hardware architectures. In addition, MyISAM tables have better indexing characteristics than ISAM.
Another reason to change a storage engine is to make it transaction-safe. Suppose that you have a MyISAM table and discover that an application that uses it needs to perform transactional operations, including rollback in case failures occur. MyISAM tables do not support transactions, but you can make the table transaction-safe by converting it to an InnoDB or BDB table:
ALTER TABLE tbl_name ENGINE = InnoDB; ALTER TABLE tbl_name ENGINE = BDB;
When you convert a table to use a different engine, the allowable or sensible conversions may depend on the feature compatibility of the old and new types:
ALTER TABLE tbl_name RENAME TO new_tbl_name;
RENAME TABLE old_name TO new_name;
RENAME TABLE t1 TO tmp, t2 TO t1, tmp TO t2;
If you qualify a table name with a database name, you can move a table from one database to another by renaming it. Either of the following statements move the table t from the sampdb database to the test database:
ALTER TABLE sampdb.t RENAME TO test.t; RENAME TABLE sampdb.t TO test.t;
You cannot rename a table to a name that already exists.
If you rename a MyISAM table that is part of a MERGE table, you must redefine the MERGE table accordingly.