Storage Engine Configuration

The MySQL server supports multiple storage engines and is highly configurable in terms of which engines to make available. General storage engine characteristics are discussed in Chapter 2, "MySQL SQL Syntax and Use." This section discusses how to configure which engines your server uses. It also provides specific configuration information for the MyISAM and InnoDB storage engines.

Selecting Storage Engines

The MySQL server provides flexible control over which storage engines to make available:

If you build MySQL from source, you can determine which of the optional storage engines to compile in.
For any server, whether or not you compile it yourself, you can selectively enable or disable at startup time those optional storage engines that are compiled in. You might do this for engines that you don't need so as to reduce server memory requirements. (Keep in mind that if you disable an engine, you cannot access any tables that might already have been created by it.)
You can find out at runtime which engines are compiled into the server, which of those actually are enabled, and which one is the default engine.

The following list describes how to include or exclude storage engines when building MySQL from source. It also indicates how to disable each engine at startup time if it is built in but you do not want to use it. (The terms "included" and "excluded" refer here to configuration choices, whereas "enabled" and "disabled" refer to runtime choices. "Available" means "can be used when the server is running.")

The MyISAM storage engine is always available. It can be neither excluded at configuration time nor disabled at server startup time. (The grant tables are MyISAM tables, so it is necessary that the MyISAM engine be available to read them.)
Like MyISAM, the MERGE and MEMORY engines always are available. Other storage engines are optional.
The InnoDB storage engine is included by default, but can be excluded with the --without-innodb option to configure. If included, InnoDB can be disabled at server startup time with the --skip-innodb option.
The BDB storage engine can be included with the --with-berkeley-db option to configure. If included, it can be disabled at server startup time with the --skip-bdb option.
The ISAM engine is excluded by default but can be included with the --with-isam option to configure. If compiled in, the ISAM engine can be disabled at startup with the --skip-isam option.
For the embedded server, the ISAM engine is excluded, even if you run configure with the --with-isam option. To include ISAM in the embedded server, you must edit the mysql_embed.h file in the source distribution and rebuild the server.
In general, it's better to convert ISAM tables to MyISAM tables and avoid using the ISAM storage engine. ISAM support will be phased out completely in the future, probably sometime in MySQL 5.0.
The EXAMPLE, ARCHIVE, CSV, and FEDERATED storage engines are excluded by default, but can be included with the --with-example-storage-engine, --with-archive-storage-engine, --with-csv-storage-engine, and --with-federated-storage-engine options to configure. If included, these engines are always enabled; there are no runtime options for disabling them.
The NDB storage engine is not covered here because configuration management is still in flux and being worked out. See the MySQL Reference Manual for current details.

As the server runs, it designates one storage engine as the default, which it uses for tables that are created without an explicit ENGINE = engine_name option. MyISAM is compiled in as the default engine, but you can select a different default at startup time or runtime:

To specify a default engine at startup, use a --default-storage-engine=engine_name option. For example, to make InnoDB the default, put the following lines in your server option file:
```
[mysqld]
default-storage-engine = innodb
```
To change the default storage engine at runtime, use one of the following statements:
```
SET GLOBAL storage_engine = engine_name;
SET SESSION storage_engine = engine_name;
```
The first statement requires the SUPER privilege and sets the default engine for all clients that connect thereafter. The second requires no special privilege and affects only the current client session. It can be used by any client to change its own default engine.
To check which storage engine is the global or session default, use this statement:
```
SELECT @@GLOBAL.storage_engine, @@SESSION.storage_engine;
```

To determine which storage engines are available, use the SHOW ENGINES statement. Appendix E describes the output produced by that statement.

Configuring the MyISAM Storage Engine

MyISAM tables have separate data and index files, which are handled differently:

For caching data rows read from or written to a MyISAM table, the server relies on the operating system to use its own filesystem caching mechanism.
For index processing, MyISAM manages its own key cache, which is the most important configurable resource for the MyISAM storage engine. The key cache is used for index-based retrievals and sorts, and for index creation and modification operations.

This section describes the general operation of the key cache and the system variables that you use to configure it.

The MyISAM key cache operates as follows:

Initially, the key cache is empty.
When the server needs to examine index values from a table during statement execution, it checks whether they have already been read into the key cache. If so, it consults the in-memory values. Otherwise, it reads the index file for the table to read the index values from disk into a block buffer in the cache.
If the cache is full when new index values need to be read in, the server must discard values from one of the block buffers. By default, it determines which values to discard on the basis of a least recently used (LRU) algorithm. That is, the server chooses the cache buffer that has been unused for the longest time. Buffers are maintained as a chain ordered by last access time, so the server simply picks the one at the end of the chain.
If the chosen buffer has not been modified, its contents are overwritten with newly read index values. Otherwise, the buffer first must be flushed to the index file before being overwritten.

Values not found in the cache when needed are "misses" and must be read from disk. Values that are found are "hits." The purpose of the key cache is to reduce disk access (that is, to minimize the ratio of misses to hits). It boosts performance greatly because disk access is much slower than memory access.

Index buffers containing frequently used values tend to stay in the key cache, but a larger cache increases the chance of a hit. This decreases the need to discard buffers to make room for new ones and minimizes the number of disk accesses needed for index processing. If the key cache currently is small and you have memory available, making the cache larger generally is one of the single easiest and best configuration changes you can make.

To configure the key cache size, set the key_buffer_size system variable. Its default value is 8MB, but can be increased up to 4GB if you have the memory available. For example, to set the key cache to 512MB, you can put these lines in an option file:

[mysqld]
key_buffer_size = 512M

Be careful not to set the key cache size so large that you use all available memory. That can cause the key cache itself to get paged out, which defeats the purpose of using a cache to hold information in memory. Remember too that other storage engines use their own buffers for which memory must be allocated, and that generally there are other processes running on the server host that require memory.

The preceding discussion is written as though there is a single key cache. That is true up through MySQL 4.1.0, and remains true by default. However, as of MySQL 4.1.1, it is possible to create multiple key caches. The revised caching mechanism also provides more control over cache operation. Here is a summary of its features:

You can create multiple key caches.
You have control over total cache size, cache block size, and the buffer discard algorithm.
You can assign tables to specific caches.
You can preload table indexes into a cache.

Multiple key caches can serve to reduce cache contention. If you have a table or set of tables that are heavily used, you can assign them to a separate key cache so that index caching for them does not have to compete with all the other tables that are processed through the default cache.

Each key cache is associated with a set of system variables. Because these variables are related, they are grouped as components that form a structured system variable. Structured variables are an extension of simple system variables, so they are accessed using a syntax that combines a cache name and a variable name:

cache_name.var_name

Each key cache structured variable has these components:

key_buffer_size
The total size of the key cache, in bytes.
key_cache_block_size
The size of blocks in the key cache, in bytes. By default, blocks are 1024 bytes.
key_cache_division_limit
This variable influences the cache buffer reuse algorithm. If set to its default value of 100, the key cache uses a least recently used strategy for determining which cache buffers to reuse. If set lower than 100, the key cache uses a midpoint insertion strategy that splits the cache into warm and hot sub-chains. The value of key_cache_division_limit is the percentage of the key cache to use for the warm buffer sub-chain. The value should be from 1 to 100.
With the midpoint insertion strategy that uses warm and hot sub-chains, an attempt is made to keep the most frequently accessed block buffers in the hot sub-chain. Buffers can move between the hot or warm sub-chains as access to them increases or decreases. Buffers to reuse and overwrite always are chosen from the warm sub-chain.
key_cache_age_threshold
How long buffers stay unused in the hot sub-chain of the key cache before being moved to the warm sub-chain. Higher values allow blocks to remain in the hot sub-chain longer. The default is 300. The minimum value is 100.

One key cache is the default and has a name of default. If you refer to a key cache component variable without using a cache name, MySQL uses the default cache. Thus, key_buffer_size and default.key_buffer_size refer to the same variable. Key cache names must be legal identifiers and are not case sensitive. They can be quoted like any other identifier using backtick ('`') characters, or, if the ANSI_QUOTES SQL mode is enabled, with double quote ('"') characters.

To create a new key cache, simply assign a value to one of its components. For example, to create a cache named my_cache with a size of 24MB, at server startup, add these lines to the server option file:

[mysqld]
my_cache.key_buffer_size = 24M

To create the cache at runtime, use this statement:

SET GLOBAL my_cache.key_buffer_size = 24*1024*1024;

The GLOBAL keyword is necessary because key caches are global. No special privileges are required to access component values, but you must have the SUPER privilege to set them.

After you create a key cache, you can assign MyISAM tables to it with CACHE INDEX. This statement names a key cache and one or more tables to be assigned to it. The following statement assigns the member and president tables from the sampdb database to the cache named my_cache:

CACHE INDEX member, president IN my_cache;

You can also preload table indexes into their assigned cache with LOAD INDEX INTO CACHE if you choose:

LOAD INDEX INTO CACHE member, president;

It is not necessary to preload the indexes, but the server reads index blocks sequentially if you do. This is more efficient than waiting for them to be fetched as needed.

The CACHE INDEX and LOAD INDEX INTO CACHE statements require that you have the INDEX privilege for the tables that are to be assigned to or preloaded into a cache.

To destroy a key cache, set its size to zero. Any tables assigned to the cache are reassigned to the default cache. If you set the size of the default key cache to zero, indexes of tables assigned to it are processed using filesystem caching the same way as for MyISAM data files.

Key cache assignments last only until the server shuts down. To make assignments each time the server starts, place the appropriate CACHE INDEX and LOAD INDEX INTO CACHE statements in a file and start the server with an --init-file option that names the file.

Configuring the InnoDB Storage Engine

The InnoDB storage engine manages a shared tablespace for storing table contents and its data dictionary. You have the option of configuring InnoDB to use one tablespace per table. InnoDB also has its own log files and memory buffers.

Configuring the InnoDB Tablespace

By default, the InnoDB storage engine does not use separate files for each table the way that other storage engines such as MyISAM and BDB do. Instead, it manages all InnoDB tables within a single shared tablespace, which is a logically unified block of storage that the engine treats as a giant data structure. (In a sense, the tablespace is something like a virtual filesystem.) For an InnoDB table stored in the shared tablespace, the only file uniquely associated with the table is the .frm format file that is stored in the database directory of the database that the table belongs to. The shared tablespace also contains the InnoDB data dictionary that records information about table structure.

It is also possible to configure InnoDB to represent each table using its own tablespace file. In other words, tables are created using individual tablespaces. To use individual tablespaces, start the server with the ---innodb-file-per-table option. The shared tablespace still is needed even in this case because it contains the InnoDB data dictionary, although it need not be as large.

Shared Tablespace Configuration Parameters

The InnoDB shared tablespace, although logically a single storage area, comprises one or more files on disk. Each component can be a regular file or a raw partition. This section describes the configuration options that you use to set up and manage the shared tablespace. It's possible to specify these options on the server command line, but in practice this is rarely done. Instead, you should configure the tablespace using an appropriate server group in an option file (for example, the [mysqld] or [server] group), so that the server uses the same configuration consistently each time it starts. Two options are the most important:

innodb_data_home_dir specifies the parent directory of all the component files that make up the tablespace. If you don't specify this option, its default value is the data directory.
innodb_data_file_path indicates the specifications for the component files of the tablespace under the InnoDB home directory. The value of this option is a list of one or more file specifications, separated by semicolons. Each specification consists of a filename, a size, and possibly other options, separated by colons. The combined size of the tablespace components must be at least 10MB.

If you provide no values for either option, the InnoDB storage engine creates a default tablespace consisting of a single 10MB auto-extending file named ibdata1 in the server's data directory. By using these options, you can explicitly control the number, size, and placement of the files in the shared tablespace.

As a simple example, suppose that you want to create a tablespace consisting of two 10MB files named innodata1 and innodata2 in the data directory. Configure the files as follows:

[mysqld]
innodb_data_file_path = innodata1:10M;innodata2:10M

No innodb_data_home_dir setting is required in this case because its default value is the server's data directory, the desired location for the files.

The following rules describe how the InnoDB storage engine combines the values of innodb_data_home_dir and innodb_data_file_path to determine the pathnames of the tablespace files:

If innodb_data_home_dir is empty, InnoDB treats all file specifications in innodb_data_file_path as absolute pathnames. "Empty" means that you specify the option with no value after the equal sign; it does not mean the option is unspecified.
If innodb_data_home_dir is not empty, it should name the directory under which all the file specifications in innodb_data_file_path are located. In this case, InnoDB interprets those filenames relative to the innodb_data_home_dir value.
If innodb_data_home_dir is not specified, its default value is the pathname to the MySQL data directory, and InnoDB interprets the filenames in innodb_data_file_path relative to the data directory.

Based on the preceding rules, the following three configurations all specify the same set of tablespace files, assuming that the data directory is /var/mysql/data:

[mysqld]
innodb_data_home_dir=
innodb_data_file_path=/var/mysql/data/ibdata1:10M;/var/mysql/data/ibdata2:10M

[mysqld]
innodb_data_home_dir=/var/mysql/data
innodb_data_file_path=ibdata1:10M;ibdata2:10M

[mysqld]
innodb_data_file_path=ibdata1:10M;ibdata2:10M

The innodb_data_file_path value consists of file specifications that are separated by semicolons. The parts of each specification are separated by colons. The simplest file specification syntax consists of a filename and a size, but other syntaxes are legal:

path:size
path:size:autoextend
path:size:autoextend:max:maxsize

The first format specifies a file with a fixed size of size. A size value should be a positive integer followed by M or G to indicate units of megabytes or gigabytes, respectively. The second format specifies an auto-extending file; if the file fills up, InnoDB extends it incrementally. The third format is similar, but includes a value indicating the maximum size to which the auto-extending file is allowed to grow. Only the final component of the tablespace may be listed as auto-extending.

The default auto-extend increment is 8MB. To specify a different increment, set the innodb_autoextend_increment system variable, available as of MySQL 4.1.5.

Initial Configuration of the Shared Tablespace

In the usual case, the shared tablespace consists only of regular files and does not include any raw partitions (device files). To perform the initial setup for a shared tablespace that contains only regular files, use this procedure:

1.	Add the appropriate lines to the option file.
2.	Make sure that the directories exist in which the tablespace component files are to be created. InnoDB creates files, but it will not create directories.
3.	Make sure that none of the component files already exist.
4.	Start the server. InnoDB will notice that the files do not exist, and will create and initialize them.

If you've already started the server without configuring InnoDB explicitly, InnoDB will have created a shared tablespace using the default configuration. To configure the tablespace explicitly, first stop the server and remove the InnoDB-related files (tablespace and log files). Then specify the configuration options you want to use and restart the server. (You should do this before creating any InnoDB tables. Otherwise, dump them with mysqldump before reconfiguring and reload them afterward.)

It is a little more complex to use raw partitions as components of the InnoDB shared tablespace, but there are several reasons to consider doing so:

You can easily create very large tablespaces. A partition component can span the entire extent of the partition, whereas regular file components are limited in size to the maximum file size allowed by your operating system.
Raw partition files are guaranteed to be composed of entirely contiguous space on disk, whereas regular files are subject to filesystem fragmentation. When it initializes the tablespace, InnoDB tries to minimize fragmentation of regular files by writing enough zeros to the files to force space for them to be allocated all at once rather than incrementally. But this can only reduce fragmentation; it cannot guarantee that it will not occur.
Raw partitions reduce overhead by eliminating the filesystem management layer. On some systems, this overhead may not be significant, but on others the difference might be enough to justify using partitions.

A factor that counts against using raw partitions in the InnoDB tablespace is that your system backup software might be oriented toward use with filesystems rather than partitions. In this case, using partitions would make it more difficult to perform system backups.

Including a raw partition in the tablespace is a two-step procedure. Suppose that you want to use a 20GB partition on a Unix system that has a pathname of /dev/rdsk8. In this case, it's necessary to specify a value for innodb_data_home_dir because the partition doesn't lie under the data directory. For example, if you set innodb_data_home_dir to an empty value, you can list the full pathname of the device file in innodb_data_file_path to configure the partition as follows:

Configure the partition initially with a size value that has a newraw suffix. This suffix tells InnoDB that the file is a raw partition that needs to be initialized:

[mysqld]
innodb_data_home_dir =
innodb_data_file_path = /dev/rdsk8:20Gnewraw

After listing these lines in your [mysqld] option group, start the server. InnoDB sees the newraw suffix and initializes the partition. It also treats the tablespace as read-only, because it knows that you have not completed the second step. After the partition has been initialized, stop the server.

Modify the configuration information to change the suffix from newraw to raw:

[mysqld]
innodb_data_home_dir =
innodb_data_file_path = /dev/rdsk8:20Graw

Then start the server again. InnoDB sees that the suffix is raw rather than newraw and assumes that the partition has been initialized and that it can use the tablespace in read/write fashion.

If you specify a raw partition as part of the InnoDB tablespace, make sure its permissions are set so that the server has read/write access to it. Also, make sure the partition is being used for no other purpose. Otherwise you will have competing processes each thinking that they own the partition and can use it as they please, with the result that they'll stomp all over each other's data. For example, if you mistakenly specify a swap partition for use by InnoDB, your system will behave quite erratically!

When configuring the InnoDB shared tablespace on Windows systems, backslashes in pathnames can be specified using either single forward slashes ('/') or doubled backslashes ('\\'). Also, you should still separate the parts of each file specification with colons, even though colons may also appear in filenames (full Windows pathnames begin with a drive letter and a colon). When it encounters a colon, InnoDB resolves this ambiguity by looking at the following character. If it is a digit, the next part of the specification is taken to be a size. Otherwise, it's taken as part of a pathname. For example, the following configuration sets up a tablespace consisting of files on the C and D drives with sizes of 50MB and 60MB:

[mysqld]
innodb_data_home_dir =
innodb_data_file_path = C:/ibdata1:50M;D:/ibdata2:60M

When you're setting up the initial tablespace, if startup fails because InnoDB cannot create some necessary file, check the error log to see what the problem was. Then remove all the files that InnoDB created (excluding any raw partitions you may be using), correct the configuration error, and start the server again.

Reconfiguring the Shared Tablespace

After the shared tablespace has been initialized and you have begun to use it, you cannot change the size of its component files. However, you can add another file at the end of the list of existing files, which may be helpful if the tablespace fills up. One symptom of a full tablespace is that InnoDB transactions consistently fail and roll back when they should succeed. You can determine the amount of free space with the following statement, where tbl_name is the name of any InnoDB table that is located in the shared tablespace:

mysql> SHOW TABLE STATUS LIKE 'tbl_name';

To make the shared tablespace larger by adding another component, use this procedure:

1.	Stop the server if it is running.
2.	If the final component of the tablespace is an auto-extending file, you must change its specification to that of a fixed-size file before adding another file after it. To do this, determine the current actual size of the file. Then round the size down to the nearest multiple of 1 megabyte (measured as 1,048,576 bytes rather than as 1,000,000 bytes) and use that size in the file's specification. Suppose that you have a file currently listed like this: [mysqld] innodb_data_file_path = ibdata1:100M:autoextend If the file's actual size now is 121,634,816 bytes, that is 121,634,816 / 1,048,576 = 116 megabytes. Change the specification as follows: [mysqld] innodb_data_file_path = ibdata1:116M
3.	Add the specification for the new component to the end of the current file list. If the new component is a regular file, make sure that it does not already exist. If the component is a raw partition, add it using the two-step procedure described earlier for specifying a partition as part of the tablespace. (That is, first with `newraw`, and then with `raw` after starting and stopping the server.)
4.	Restart the server.

If you want to reconfigure the shared tablespace in some way other than adding a new file to the end, you should dump it and then reconstruct it using the new configuration:

1.	Use `mysqldump` to dump all your InnoDB tables that are stored in the shared tablespace.
2.	Stop the server and delete your existing InnoDB shared tablespace files (other than raw partitions), the InnoDB log files, and the `.frm` files that correspond to the dumped InnoDB tables. (An alternative to deleting `.frm` files is to use `DROP TABLE` for every InnoDB table while the server is running.)
3.	Reinitialize the tablespace according to the new configuration you want to use.
4.	Reload the dump file into the server to re-create the InnoDB tables.

Using Individual Tablespaces

To use one tablespace per InnoDB table, start the server with the --innodb-file-per-table option. In this case, each InnoDB table has an .frm format file and an .ibd data file, both stored in the database directory for the database that contains the table.

The choice of whether to use individual tablespaces affects only how InnoDB creates new tables. InnoDB can always access tables already created in the shared tablespace or with individual tablespaces, regardless of whether the --innodb-file-per-table option is used.

InnoDB Storage Engine Variables

The preceding section discusses how to configure InnoDB's tablespace. InnoDB also has its own log files and memory buffers, and several other configuration parameters. The following list describes a few parameters that commonly are used to affect the operation of the InnoDB storage engine.

innodb_buffer_pool_size
If you have the memory available, making the InnoDB buffer pool larger can reduce disk usage for accessing table data and indexes.
innodb_log_buffer_size
InnoDB tries to buffer information about each transaction in memory and flush it to disk in a single operation when the transaction finishes. If a transaction is large and exceeds the size of the buffer, more disk activity is required to flush the buffer multiple times before the transaction finishes. Increasing the size of the buffer allows larger transactions to be buffered in memory without early flushing. The default value is 1MB. The maximum useful value is 8MB.
innodb_log_group_home_dir
InnoDB has its own log files, which it creates during server startup if they do not exist. By default, these logs are created in the data directory and have names that begin with ib_. innodb_log_group_home_dir can be used to set the pathname to the directory where InnoDB should write its log files. Note that InnoDB will create only files, not directories, so the log file directory must be created prior to starting the server. If you set innodb_log_group_home_dir, you should also set innodb_log_arch_dir to the same value.
innodb_log_file_size, innodb_log_files_in_group
When its logs fill up, InnoDB checkpoints the buffer pool by flushing it to disk. Using larger InnoDB log files reduces the frequency with which the logs fill up, and thus reduces the number of times this flushing occurs. (The tradeoff is that with larger logs, the time for recovery after a crash increases.) You can modify innodb_log_file_size to change the size of the log files or innodb_log_files_in_group to change the number of files. The important characteristic is the total size of the logs, which is the product of the two values. The total size of the logs must not exceed 4GB.