MySQL Data Types
Each table in a database contains one or more columns. When you create a table using a CREATE TABLE statement, you specify a data type for each column. A data type is more specific than a general category such as "number" or "string." For a column, the data type is the means by which you precisely characterize the kind of values the column may contain, such as SMALLINT or VARCHAR(32). This in turn determines how MySQL treats those values. For example, if you have numeric values, you could store them using either a numeric or string column, but MySQL will treat the values somewhat differently depending on what type you use. Each data type has several characteristics:
The following discussion surveys MySQL's data types briefly, and then describes in more detail the syntax for defining them and the properties that characterize each type, such as their range and storage requirements. The type specifications are shown as you use them in CREATE TABLE statements. Optional information is indicated by square brackets (). For example, the syntax MEDIUMINT[(M)] indicates that the maximum display width, specified as (M), is optional. On the other hand, for VARCHAR(M), the lack of brackets indicates that (M) is required.
Overview of Data Types
MySQL provides data types for values from all the general data value categories except the NULL value. NULL spans all types in the sense that the property of whether a column may contain NULL values is treated as a type attribute.
MySQL has numeric data types for integer, floating-point, and fixed-point values, as shown in Table 3.2. Numeric columns can be signed or unsigned. A special attribute allows sequential integer column values to be generated automatically, which is useful in applications that require a series of unique identification numbers.
Table 3.3 shows the MySQL string data types. Strings can hold anything, even arbitrary binary data such as images or sounds. Strings can be compared according to whether or not they are case sensitive. In addition, you can perform pattern matching on strings. (Actually, in MySQL, you can even perform pattern matching on numeric types, but it's more commonly done with string types.)
Table 3.4 shows the MySQL date and types, where CC, YY, MM, DD hh, mm, and ss represent century, year, month, day, hour, minute, and second, respectively. For temporal values, MySQL provides types for dates and times (either combined or separate) and timestamps (a special type that allows you to track when changes were last made to a record). There is also a type for efficiently representing year values when you don't need an entire date.
Table 3.5 shows the MySQL spatial data types. These represent various kinds of geometrical or geographical values.
Defining Column Types in Table Definitions
CREATE TABLE mytbl ( f FLOAT(10,4), c CHAR(15) NOT NULL DEFAULT 'none', i TINYINT UNSIGNED NULL );
Each column has a name and a type. Various attributes may be associated with the type. The syntax for defining a column is as follows:
col_name col_type [col_attributes] [general_attributes]
The name of the column, col_name, is always first in the definition and must be a legal identifier. The precise rules for identifier syntax are given in "MySQL Naming Rules," in Chapter 2. Briefly summarized, column identifiers may be up to 64 characters long, and may consist of alphanumeric characters from the system character set, as well as the underscore and dollar sign characters ('_' and '$'). Keywords such as SELECT, DELETE, and CREATE normally are reserved and cannot be used. However, you can include other characters within an identifier or use a reserved word as an identifier if you are willing to put up with the bother of quoting it whenever you refer to it. To quote an identifier, enclose it within backtick ('`') characters. If the ANSI_QUOTES SQL mode is enabled, you also can quote identifiers within double quote ('"') characters.
col_type indicates the column data type; that is, the specific kind of values the column can hold. Some type specifiers indicate the maximum length of the values you store in the column. For others, the length is implied by the type name. For example, CHAR(10) specifies an explicit length of 10 characters, whereas TINYTEXT values have an implicit maximum length of 255 characters. Some of the type specifiers allow you to indicate a maximum display width (how many characters to use for displaying values). For floating-point and fixed-point types, you can specify the number of significant digits and number of decimal places.
Following the column's data type, you may specify optional type-specific attributes as well as more general attributes. These attributes function as type modifiers. They cause MySQL to change the way it treats column values in some way:
If multiple column attributes are present, there are some constraints on the order in which they may appear. In general, you should be safe if you specify data type-specific attributes such as UNSIGNED or ZEROFILL before general attributes such as NULL or NOT NULL.
Specifying Column Default Values
For all but BLOB and TEXT types, spatial types, or columns with the AUTO_INCREMENT attribute, you can specify a DEFAULT def_value clause to indicate that a column should be assigned the value def_value when a new row is created that does not explicitly specify the column's value. With some limited exceptions for TIMESTAMP columns, def_value must be a constant. It cannot be an expression or refer to other columns.
Numeric Data Types
MySQL's numeric data types group into three general classifications:
Floating-point values may be assigned to integer columns, but will be rounded to the nearest integer. Conversely, integer values may be assigned to floating-point or fixed-point columns. They are treated as having a fractional part of zero.
When you specify a number, you should not include commas as a separator. For example, 12345678.90 is legal, but 12,345,678.90 is not.
Integer Data Types
MySQL provides five integer types: TINYINT, SMALLINT, MEDIUMINT, INT, and BIGINT. INTEGER is a synonym for INT. These types vary in the range of values they can represent and in the amount of storage space they require. (The larger the range, the more storage is required.) Integer columns can be defined as UNSIGNED to disallow negative values; this shifts the range for the column upward to begin at 0.
When you define an integer column, you can specify an optional display size M. If given, M should be an integer from 1 to 255. It represents the number of characters used to display values for the column. For example, MEDIUMINT(4) specifies a MEDIUMINT column with a display width of 4. If you define an integer column without an explicit width, a default width is assigned. The defaults are the lengths of the "longest" values for each type. Note that displayed values are not chopped to fit within M characters. If the printable representation of a particular value requires more than M characters, MySQL displays the full value.
The display size M for an integer column relates only to the number of characters used to display column values. It has nothing to do with the number of bytes of storage space required. For example, BIGINT values require 8 bytes of storage regardless of the display width. It is not possible to magically cut the required storage space for a BIGINT column in half by defining it as BIGINT(4). Nor does M have anything to do with the range of values allowed. If you define a column as INT(3), that doesn't restrict it to a maximum value of 999.
The following statement creates a table to illustrate the default values of M and D for integer data types:
CREATE TABLE mytbl ( itiny TINYINT, itiny_u TINYINT UNSIGNED, ismall SMALLINT, ismall_u SMALLINT UNSIGNED, imedium MEDIUMINT, imedium_u MEDIUMINT UNSIGNED, ireg INT, ireg_u INT UNSIGNED, ibig BIGINT, ibig_u BIGINT UNSIGNED );
mysql> DESCRIBE mytbl; +-----------+-----------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------+-----------------------+------+-----+---------+-------+ | itiny | tinyint(4) | YES | | NULL | | | itiny_u | tinyint(3) unsigned | YES | | NULL | | | ismall | smallint(6) | YES | | NULL | | | ismall_u | smallint(5) unsigned | YES | | NULL | | | imedium | mediumint(9) | YES | | NULL | | | imedium_u | mediumint(8) unsigned | YES | | NULL | | | ireg | int(11) | YES | | NULL | | | ireg_u | int(10) unsigned | YES | | NULL | | | ibig | bigint(20) | YES | | NULL | | | ibig_u | bigint(20) unsigned | YES | | NULL | | +-----------+-----------------------+------+-----+---------+-------+
Floating-Point and Fixed-Point Data Types
MySQL provides two floating-point types (FLOAT, DOUBLE), and one fixed-point type (DECIMAL). Synonymous types are DOUBLE PRECISION for DOUBLE, and NUMERIC and FIXED for DECIMAL. The REAL type is a synonym for DOUBLE by default. If the REAL_AS_DEFAULT SQL mode is enabled, REAL type is a synonym for FLOAT.
Ranges for these types differ from ranges for integer types in the sense that there is not only a maximum value that a floating-point type can represent, but also a minimum non-zero value. The minimum values provide a measure of how precise the type is, which is often important for recording scientific data. (There are, of course, corresponding negative maximum and minimum values.)
Floating-point and fixed-point types can be defined as UNSIGNED. Unlike the integer types, defining a floating-point or fixed-point type as UNSIGNED doesn't shift the type's range upward, it merely eliminates the negative end.
For each floating-point or fixed-point type, you may specify a maximum number of significant digits M and the number of decimal places D. The value of M should be from 1 to 255. The value of D may be from 0 to 30. If M is not greater than D, it is adjusted up to a value of D+1. M and D correspond to the concepts of "precision" and "scale" with which you may be familiar.
For DECIMAL, M and D are optional. If D is omitted, it defaults to 0. If M is omitted as well, it defaults to 10. In other words, the following equivalences hold:
DECIMAL = DECIMAL(10) = DECIMAL(10,0) DECIMAL(n) = DECIMAL(n,0)
FLOAT(p) syntax also is allowed. However, whereas p stands for the required number of bits of precision in standard SQL, it is treated differently in MySQL. p may range from 0 to 53 and is used only to determine whether the column stores single-precision or double-precision values. For p values from 0 to 24, the column is treated as single precision. For values from 25 to 53, the column is treated as double precision. That is, the column is treated as a FLOAT or DOUBLE with no M or D values.
The DECIMAL type is a fixed-point type. It differs from FLOAT and DOUBLE in that DECIMAL values actually are stored as strings and have a fixed number of decimals. The significance of this fact is that DECIMAL values are not subject to roundoff error the way that FLOAT and DOUBLE columns area property that makes DECIMAL especially applicable for storing currency values. The corresponding tradeoff is that DECIMAL values are not as efficient as floating-point values stored in native format that the processor can operate on directly. Also, be aware that the fixed-point properties of DECIMAL apply only to storage and retrieval. Calculations on DECIMAL values might be done using floating-point operations.
MySQL handles DECIMAL values according to the standard SQL specification, with one extension. Standard SQL requires that a type of DECIMAL(M,D) must be able to represent any value with M digits and D decimal places. For example, DECIMAL(4,2) must be able to represent values from 99.99 to 99.99. Because the sign character and decimal point must still be stored, this requires an extra two bytes, so DECIMAL(M,D) values use M+2 bytes. For DECIMAL(4,2), six bytes are needed for the "widest" value (99.99).
The MySQL extension to standard SQL occurs at the positive end of the range. The sign byte is not needed to hold a sign character, so MySQL uses it to extend the range beyond that required by the SQL standard. In other words, for DECIMAL(4,2), the maximum value that can be stored in the six bytes available is 999.99.
The maximum possible range for DECIMAL is the same as for DOUBLE, but the effective range is determined by the values of M and D. If you vary M and hold D fixed, the range becomes larger as M becomes larger. If you hold M fixed and vary D, the range becomes smaller as D becomes larger, although the precision increases. These properties are shown by Table 3.8 and Table 3.9.
The BIT Data Type
The BIT data type was introduced in MySQL 5.0.3 as a type for holding bit-field values. When you define a BIT column, you can specify an optional maximum width M that indicates the "width" of the column in bits. M should be an integer from 1 to 64. If omitted, M defaults to 1.
Values retrieved from a BIT column are not displayed in printable form by default. To display a printable representation of bit-field values, add zero. The BIN() function also can be useful for display bit-field values or the result of computations on them.
mysql> CREATE TABLE t (b BIT(3)); # holds values from 0 to 7 mysql> INSERT INTO t (b) VALUES(0),(b'11'),(b'101'),(b'111'); mysql> SELECT BIN(b+0), BIN(b & b'101'), BIN(b | b'101') FROM t; +----------+-----------------+-----------------+ | BIN(b+0) | BIN(b & b'101') | BIN(b | b'101') | +----------+-----------------+-----------------+ | 0 | 0 | 101 | | 11 | 1 | 111 | | 101 | 101 | 101 | | 111 | 101 | 111 | +----------+-----------------+-----------------+
Numeric Data Type Attributes
The UNSIGNED attribute disallows negative values. It can be used with all numeric types except BIT, but is most often used with integer types. Making an integer column UNSIGNED doesn't change the "size" of the underlying data type's range; it just shifts the range upward. Consider this table definition:
CREATE TABLE mytbl ( itiny TINYINT, itiny_u TINYINT UNSIGNED );
itiny and itiny_u both are TINYINT columns with a range of 256 values, but differ in the set of allowable values. The range of itiny is 128 to 127, whereas the range of itiny_u is shifted up, resulting in a range of 0 to 255.
UNSIGNED is useful for columns into which you plan to store information that doesn't take on negative values, such as population counts or attendance figures. Were you to use a signed column for such values, you would use only half of the data type's range. By making the column UNSIGNED, you effectively double your usable range. For example, if you use the column for sequence numbers, it will take twice as long to run out of values if you make it UNSIGNED.
You also can specify UNSIGNED for floating-point and fixed-point columns, although the effect is slightly different than for integer columns. The range does not shift upward; instead, the upper end remains unchanged and the lower end becomes zero.
The SIGNED attribute is allowed for all numeric types that allow UNSIGNED. However, it has no effect because such types are signed by default. SIGNED serves simply to indicate explicitly in a column definition that the column allows negative values.
The ZEROFILL attribute may be specified for all numeric types except BIT. It causes displayed values for the column to be padded with leading zeros to the display width. You can use ZEROFILL when you want to make sure column values always display using a given number of digits. Actually, it's more accurate to say "a given minimum number of digits" because values wider than the display width are displayed in full without being chopped. You can see this by issuing the following statements:
mysql> DROP TABLE IF EXISTS mytbl; mysql> CREATE TABLE mytbl (my_zerofill INT(5) ZEROFILL); mysql> INSERT INTO mytbl VALUES(1),(100),(10000),(1000000); mysql> SELECT my_zerofill FROM mytbl; +-------------+ | my_zerofill | +-------------+ | 00001 | | 00100 | | 10000 | | 1000000 | +-------------+
Note that the final value, which is wider than the column's display width, is displayed in full.
If you specify the ZEROFILL attribute for a column, it automatically becomes UNSIGNED as well.
One other attribute, AUTO_INCREMENT, is intended only for use with integer data types. Specify the AUTO_INCREMENT attribute when you want to generate a series of unique identifier values. When you insert NULL into an AUTO_INCREMENT column, MySQL generates the next sequence value and stores it in the column. Normally, unless you take steps to cause otherwise, AUTO_INCREMENT values begin at 1 and increase by 1 for each new row. The sequence may be affected if you delete rows from the table. That is, sequence values might be reused; it is storage engine-dependent whether this occurs.
You can have at most one AUTO_INCREMENT column in a table. The column should have the NOT NULL constraint, and it must be indexed. Generally, an AUTO_INCREMENT column is indexed as a PRIMARY KEY or UNIQUE index. Also, because sequence values always are positive, you normally define the column UNSIGNED as well. For example, you can define an AUTO_INCREMENT column in any of the following ways:
CREATE TABLE ai (i INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY); CREATE TABLE ai (i INT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE); CREATE TABLE ai (i INT UNSIGNED NOT NULL AUTO_INCREMENT, PRIMARY KEY (i)); CREATE TABLE ai (i INT UNSIGNED NOT NULL AUTO_INCREMENT, UNIQUE (i));
The first two forms specify the index information as part of the column definition. The second two specify the index as a separate clause of the CREATE TABLE statement. Using a separate clause is optional if the index includes only the AUTO_INCREMENT column. If you want to create a multiple-column index that includes the AUTO_INCREMENT column, you must use a separate clause. (For an example of this, see "AUTO_INCREMENT for MyISAM Tables for MyISAM Tables.")
It is always allowable to define an AUTO_INCREMENT column explicitly as NOT NULL, but if you omit NOT NULL, MySQL adds it automatically.
"Working with Sequences" discusses the behavior of AUTO_INCREMENT columns further.
Following the attributes just described, which are specific to numeric columns, you may specify NULL or NOT NULL. If you do not specify NULL or NOT NULL for a numeric column, it allows NULL by default.
CREATE TABLE t ( i1 INT DEFAULT -1, i2 INT DEFAULT 1, i3 INT DEFAULT NULL );
The rules that MySQL uses for assigning a default value if you specify no DEFAULT clause are given in "Specifying Column Default Values."
Choosing Numeric Data Types
When you choose a type for a numeric column, consider the range of values that you need to represent and choose the smallest type that will cover the range. Choosing a larger type wastes space, leading to tables that are unnecessarily large and that cannot be processed as efficiently as if you had chosen a smaller type. For integer values, TINYINT is the best if the range of values in your data is small, such as a person's age or number of siblings. MEDIUMINT can represent millions of values and can be used for many more types of values, at some additional cost in storage space. BIGINT has the largest range of all but requires twice as much storage as the next smallest integer type (INT) and should be used only when really necessary. For floating-point values, DOUBLE takes twice as much space as FLOAT. Unless you need exceptionally high precision or an extremely large range of values, you can probably represent your data at half the storage cost by using FLOAT instead of DOUBLE.
Every numeric column's range of values is determined by its type. If you attempt to insert a value that lies outside the column's range, the result depends on whether strict mode is enabled. If it is, an out of range value results in an error. If strict mode is not enabled, truncation occurs: MySQL clips the value to the appropriate endpoint of the range and uses the result.
Value truncation occurs according to the range of the data type, not the display width. For example, a SMALLINT(3) column has a display width of 3 and a range from 32768 to 32767. The value 12345 is wider than the display width but within the range of the column, so it is inserted without clipping and retrieved as 12345. The value 99999 is outside the range, so it is clipped to 32767 when inserted. Subsequent retrievals return the value 32767.
In general, values assigned to a floating-point or fixed-point column are rounded to the number of decimals indicated by the column specification. If you store 1.23456 in a FLOAT(8,1) column, the result is 1.2. If you store the same value in a FLOAT(8,4) column, the result is 1.2346. This means you should define floating-point columns with a sufficient number of decimals to store values as precise as you require. If you need accuracy to thousandths, don't define a type with only two decimal places.
String Data Types
MySQL provides several data types for storing string values. Strings are often used for text values like these:
'N. Bertram, et al.' 'Pencils (no. 2 lead)' '123 Elm St.' 'Monograph Series IX'
But strings are actually "generic" types in a sense because you can use them to represent any value. For example, you can use binary string types to hold binary data, such as images, sounds, or compressed output from gzip.
Table 3.10 shows all the types provided by MySQL for defining string-valued columns, and the maximum size and storage requirements of each type. The BLOB and TEXT types each have several variants that are distinguished by the maximum size of values they can hold.
Some types hold binary strings (byte strings) and others hold non-binary strings (character strings). Thus, size and storage requirements are given in number of bytes per value for binary string types and number of characters for non-binary string types. For example, BINARY(20) holds 20 characters, whereas CHAR(20) holds 20 bytes. The differences between byte and character semantics for binary and non-binary strings are characterized in "String Values." Each of the binary string types for byte strings has a corresponding non-binary type for character strings, as shown in Table 3.11.
Each of the non-binary string types, as well as ENUM and SET, can be assigned a character set and collation. The MyISAM, MEMORY, and InnoDB storage engines include support for using multiple character sets within a single table. Character set assignment is discussed in "String Data Type Attributes."
Some string types are fixed-length. For a given column, each value requires the same amount of storage. Other string types are variable-length. The amount of storage taken by a value varies from row to row and depends on the length of the values actually stored in the column. This length is represented by L in the table for variable-length types. The extra bytes required in addition to L are the number of bytes needed to store the length of the value. MySQL handles variable-length values by storing both the content of the value and its length. These extra bytes are treated as an unsigned integer. There is a correspondence between a variable-length type's maximum length, the number of extra bytes required for that type, and the range of the unsigned integer type that uses the same number of bytes. For example, a MEDIUMBLOB value may be up to 2241 bytes long and requires 3 bytes to record the length. The 3-byte integer type MEDIUMINT has a maximum unsigned value of 2241. That's not a coincidence.
The number of extra bytes for VARBINARY and VARCHAR is always one prior to MySQL 5.0.3 because the maximum column length is 255. As of MySQL 5.0.3, the column length can be up to 65,535, and two extra bytes are required for lengths greater than 255.
Values for all string types except ENUM and SET are stored as a sequence of bytes and interpreted either as bytes or characters depending on whether the type holds binary or non-binary strings. Values that are too long are chopped to fit. (In strict mode, an error occurs instead unless the chopped characters are spaces.) But string types range from very small to very large, with the largest type able to hold nearly 4GB of data, so you should be able to find something long enough to avoid truncation of your information. (The effective maximum column size actually is imposed by the maximum packet size of the client/server communication protocol, which is 1GB.)
For ENUM and SET, the column definition includes a list of legal string values, but ENUM and SET values are stored internally as numbers, as detailed later in "The ENUM and SET Data Types." Attempting to store a value other than those in the list causes the value to be converted to '' (the empty string) unless strict mode is enabled. In strict mode, an error occurs instead.
The CHAR and VARCHAR Data Types
CHAR and VARCHAR are two of the most commonly used string types. They both hold non-binary strings, and thus are associated with a character set and collation.
The primary differences between CHAR and VARCHAR lie in whether they have a fixed or variable length, and in how trailing spaces are treated:
CHAR columns can be defined with a maximum length M from 0 to 255. M is optional for CHAR and defaults to 1 if missing. Note that CHAR(0) is legal. A CHAR(0) column can be used to represent on/off values if you allow it to be NULL. Values in such a column can have one of two values: NULL or the empty string. A CHAR(0) column takes very little storage space in the tableonly a single bit. It can be useful as a placeholder when you want to define a column but don't want to allocate space for it if you're not sure yet how wide to make it. You can use ALTER TABLE to widen the column later.
VARCHAR columns can be defined with a maximum length M from 0 to 255 before MySQL 5.0.3. As of MySQL 5.0.3, the length can be from 0 to 65,535. However, the actual maximum length of a VARCHAR column in practice may be less than 65,535, depending on storage engine internal row-size limits, the column character set, and the number of other columns in the table.
Keep in mind two general principles when choosing between CHAR and VARCHAR data types:
Before MySQL 5.0.3, you cannot mix CHAR and VARCHAR within the same table, with a few limited exceptions. Depending on the circumstances, MySQL will even convert columns from one type to another when you create a table, something that other databases do not do. The principles that govern these conversions are as follows:
CREATE TABLE mytbl ( c1 CHAR(10), c2 VARCHAR(10) );
If you issue a DESCRIBE statement, the output is as follows:
mysql> DESCRIBE mytbl; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | c1 | varchar(10) | YES | | NULL | | | c2 | varchar(10) | YES | | NULL | | +-------+-------------+------+-----+---------+-------+
Notice that the presence of the VARCHAR column causes MySQL to convert c1 to VARCHAR as well. If you try using ALTER TABLE to convert c1 to CHAR, it won't work. The only way to convert a VARCHAR column to CHAR is to convert all VARCHAR columns in the table at the same time:
mysql> ALTER TABLE mytbl MODIFY c1 CHAR(10), MODIFY c2 CHAR(10); mysql> DESCRIBE mytbl; +-------+----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+----------+------+-----+---------+-------+ | c1 | char(10) | YES | | NULL | | | c2 | char(10) | YES | | NULL | | +-------+----------+------+-----+---------+-------+
The BLOB and TEXT data types are variable-length like VARCHAR, but they have no fixed-length equivalent, so you cannot use CHAR columns in the same table as BLOB or TEXT columns. CHAR columns will be converted to VARCHAR. Even using ALTER TABLE to convert all VARCHAR columns to CHAR at the same time will fail. The presence of a BLOB or TEXT column requires that the rows be variable length, so MySQL will not convert the VARCHAR columns.
The exception to the prohibition on mixing fixed-length and variable-length columns is that CHAR columns shorter than four characters are not converted to VARCHAR. For example, MySQL will not change the CHAR column in the following table to VARCHAR:
CREATE TABLE mytbl ( c1 CHAR(2), c2 VARCHAR(10) );
You can see this from the output of DESCRIBE:
mysql> DESCRIBE mytbl; +-------+-------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-------------+------+-----+---------+-------+ | c1 | char(2) | YES | | NULL | | | c2 | varchar(10) | YES | | NULL | | +-------+-------------+------+-----+---------+-------+
There is a reason for not converting columns that are shorter than four characters: On average, any savings you might gain by not storing trailing spaces are offset by the extra byte needed in a VARCHAR column to record the length of each value. In fact, if all your columns are short, MySQL will convert any that you define as VARCHAR to CHAR. MySQL does this because the conversion decreases storage requirements on average and, for MyISAM tables, improves performance by making table rows fixed-length. Suppose that you create a table with the following specification:
CREATE TABLE mytbl ( c0 VARCHAR(0), c1 VARCHAR(1), c2 VARCHAR(2), c3 VARCHAR(3), c4 VARCHAR(4) );
DESCRIBE reveals that MySQL silently changes all the VARCHAR columns shorter than four characters to CHAR:
mysql> DESCRIBE mytbl; +-------+------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+------------+------+-----+---------+-------+ | c0 | char(0) | YES | | NULL | | | c1 | char(1) | YES | | NULL | | | c2 | char(2) | YES | | NULL | | | c3 | char(3) | YES | | NULL | | | c4 | varchar(4) | YES | | NULL | | +-------+------------+------+-----+---------+-------+
The BINARY and VARBINARY Data Types
Trailing space removal is the same for BINARY as for CHAR, and the same for VARBINARY as for VARCHAR. Also, prior to MySQL 5.0.3, BINARY columns may be converted to VARBINARY or vice versa when you create a table. The rules for when this occurs are the same as for conversion between CHAR and VARCHAR, as described in the previous section.
The BLOB and TEXT Data Types
A "BLOB" is a binary large objectbasically, a container that can hold anything you want to toss into it, and that you can make about as big as you want. In MySQL, the BLOB type is really a family of types (TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB). These types are identical except in the maximum amount of information they can hold (see Table 3.10). BLOB columns store binary strings. They are useful for storing data that may grow very large or that may vary widely in size from row to row. Some examples are compressed data, encrypted data, images, and sounds.
MySQL also has a family of TEXT types (TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT). These are similar to the corresponding BLOB types, except that TEXT types store non-binary strings rather than binary strings. That is, they store characters rather than bytes, and are associated with a character set and collation. This results in the general differences between binary and non-binary strings that were described earlier in "String Values." For example, in comparison operations, BLOB values are compared in byte units and TEXT values are compared in character units using the column collation.
BLOB or TEXT columns sometimes can be indexed, depending on the storage engine you're using:
BLOB or TEXT columns may require special care:
The ENUM and SET Data Types
ENUM and SET are special string data types for which values must be chosen from a fixed (predefined) list of allowable strings. The primary difference between them is that ENUM column values must consist of exactly one member of the list of values, whereas SET column values may contain any or all members of the list. In other words, ENUM is used for values that are mutually exclusive, whereas SET allows multiple choices from the list.
The ENUM data type defines an enumeration. ENUM columns may be assigned values consisting of exactly one member chosen from a list of values specified at table-creation time. You can define an enumeration to have up to 65,535 members. Enumerations are commonly used to represent category values. For example, values in a column defined as ENUM('N','Y') can be either 'N' or 'Y'. Or you can use ENUM for such things as available sizes or colors for a product or for answers to multiple-choice questions in a survey or questionnaire where a single response must be selected:
employees ENUM('less than 100','100-500','501-1500','more than 1500') color ENUM('red','green','blue','black') size ENUM('S','M','L','XL','XXL') vote ENUM('Yes','No','Undecided')
If you are processing selections from a Web page that includes mutually exclusive radio buttons, you can use an ENUM to represent the options from which a visitor to your site chooses. For example, if you run an online pizza ordering service, ENUM columns can be used to represent the type of crust and size of pizza a customer orders:
crust ENUM('thin','regular','pan style','deep dish') size ENUM('small','medium','large')
If enumeration categories represent counts, it's important to choose your categories properly when you create the enumeration. For example, when recording white blood cell counts from a laboratory test, you might group the counts into categories like this:
If any given test result is provided as an exact count, you can record the value in the wbc column using the category into which the count falls. But you cannot recover the original count if you decide you want to convert the column from a category-based ENUM to an integer column based on exact count. If you really need the exact count, use an integer column instead. You can group integer values into categories when you retrieve them using the CASE construct. For example, if wbc is defined as an integer column, you can select it as a category like this:
SELECT CASE WHEN wbc <= 100 THEN '0-100' WHEN wbc <= 300 THEN '101-300' ELSE '>300' END AS 'wbc category' FROM ...
The SET type is similar to ENUM in the sense that when you create a SET column, you specify a list of legal set members. But unlike ENUM, each column value may consist of any number of members from the set. The set may have up to 64 members. You can use a SET when you have a fixed set of values that are not mutually exclusive as they are in an ENUM column. For example, you might use a SET to represent options available for an automobile:
SET('luggage rack','cruise control','air conditioning','sun roof')
Then particular SET values would represent those options actually ordered by customers:
'cruise control,sun roof' 'luggage rack,air conditioning' 'luggage rack,cruise control,air conditioning' 'air conditioning' ''
The final value shown (the empty string) means that the customer ordered no options. This is a legal value for any SET column.
SET column definitions are written as a list of individual strings separated by commas to indicate what the set members are. A SET column value, on the other hand, is written as a single string. If the value consists of multiple set members, the members are separated within the string by commas. This means you shouldn't use a string containing a comma as a SET member.
Other uses for SET columns might be for representing information such as patient diagnoses or results from selections on Web pages. For a diagnosis, there may be a standard list of symptoms to ask a patient about, and the patient might exhibit any or all of them:
SET('dizziness','shortness of breath','cough')
For an online pizza service, the Web page for ordering could have a set of check boxes for ingredients that a customer wants as toppings on a pizza, several of which might be chosen:
The way you define the legal value list for an ENUM or SET column is significant in several ways:
ENUM and SET are classified as string types because enumeration and set members are specified as strings when you create columns of these types. However, the ENUM and SET types actually have a split personality: The members are stored internally as numbers and you can work with them as such. This means that ENUM and SET types are more efficient than other string types because they often can be handled using numeric operations rather than string operations. It also means that ENUM and SET values can be used in either string or numeric contexts. Finally, ENUM and SET columns can cause confusion if you use them in string context but expect them to behave as numbers, or vice versa.
MySQL sequentially numbers ENUM members in the column definition beginning with 1. (The value 0 is reserved for the error member, which is represented in string form by the empty string.) The number of enumeration values determines the storage size of an ENUM column. One byte can represent 256 values and two bytes can represent 65,536 values. (Compare this to the ranges of the one-byte and two-byte integer types TINYINT UNSIGNED and SMALLINT UNSIGNED.) Thus, counting the error member, the maximum number of enumeration members is 65,536 and the storage size depends on whether there are more than 256 members. You can specify a maximum of 65,535 (not 65,536) members in the ENUM definition because MySQL reserves a spot for the error member as an implicit member of every enumeration. When you assign an illegal value to an ENUM column, MySQL assigns the error member. (In strict mode, an error occurs instead.)
Here's an example you can try using the mysql client. It demonstrates that you can retrieve ENUM values in either string or numeric form (which shows the numeric ordering of enumeration members and also that the NULL value has no number in the ordering):
mysql> CREATE TABLE e_table (e ENUM('jane','fred','will','marcia')); mysql> INSERT INTO e_table -> VALUES('jane'),('fred'),('will'),('marcia'),(''),(NULL); mysql> SELECT e, e+0, e+1, e*3 FROM e_table; +--------+------+------+------+ | e | e+0 | e+1 | e*3 | +--------+------+------+------+ | jane | 1 | 2 | 3 | | fred | 2 | 3 | 6 | | will | 3 | 4 | 9 | | marcia | 4 | 5 | 12 | | | 0 | 1 | 0 | | NULL | NULL | NULL | NULL | +--------+------+------+------+
You can compare ENUM members either by name or number:
mysql> SELECT e FROM e_table WHERE e='will'; +------+ | e | +------+ | will | +------+ mysql> SELECT e FROM e_table WHERE e=3; +------+ | e | +------+ | will | +------+
It is possible to define the empty string as a legal enumeration member, but this will only cause confusion. The string is assigned a non-zero numeric value, just as any other member listed in the definition. However, an empty string also is used for the error member that has a numeric value of 0, so it would correspond to two internal numeric element values. In the following example, assigning the illegal enumeration value 'x' to the ENUM column causes the error member to be assigned. This is distinguishable from the empty string member listed in the column definition only when retrieved in numeric form:
mysql> CREATE TABLE t (e ENUM('a','','b')); mysql> INSERT INTO t VALUES('a'),(''),('b'),('x'); mysql> SELECT e, e+0 FROM t; +------+------+ | e | e+0 | +------+------+ | a | 1 | | | 2 | | b | 3 | | | 0 | +------+------+
The numeric representation of SET columns is a little different than for ENUM columns. Set members are not numbered sequentially. Instead, members correspond to successive individual bits in the SET value. The first set member corresponds to bit 0, the second member corresponds to bit 1, and so on. In other words, the numeric values of SET members all are powers of two. The empty string corresponds to a numeric SET value of 0.
SET values are stored as bit values. Eight set members per byte can be stored this way, so the storage size for a SET column is determined by the number of set members, up to a maximum of 64 members. SET values take 1, 2, 3, 4, or 8 bytes for set sizes of 1 to 8, 9 to 16, 17 to 24, 25 to 32, and 33 to 64 members.
The representation of a SET as a set of bits is what allows a SET value to consist of multiple set members. Any combination of bits can be turned on in the value, so the value may consist of any combination of the strings in the SET definition that correspond to those bits.
The following example shows the relationship between the string and numeric forms of a SET column. The numeric value is displayed in both decimal and binary form:
mysql> CREATE TABLE s_table (s SET('table','lamp','chair','stool')); mysql> INSERT INTO s_table -> VALUES('table'),('lamp'),('chair'),('stool'),(''),(NULL); mysql> SELECT s, s+0, BIN(s+0) FROM s_table; +-------+------+----------+ | s | s+0 | BIN(s+0) | +-------+------+----------+ | table | 1 | 1 | | lamp | 2 | 10 | | chair | 4 | 100 | | stool | 8 | 1000 | | | 0 | 0 | | NULL | NULL | NULL | +-------+------+----------+
When you assign values to SET columns, the substrings don't need to be listed in the same order that you used when you defined the column. However, when you retrieve the value later, members are displayed within the value in definition order. Also, if you assign to a SET column a value containing substrings that are not listed as set members, those strings drop out and the column is assigned a value consisting of the remaining substrings. When you retrieve the value later, the illegal substrings will not be present.
If you assign a value of 'chair,couch,table' to the column s in s_table, two things happen:
In strict mode, use of an illegal SET member causes an error instead and the value is not stored. In the preceding example, assigning a value containing 'couch' would cause an error and the assignment would fail.
The fact that MySQL reorders members in a SET value means that if you search for values using a string, you must list members in the proper order. If you insert 'chair,table' and then search for 'chair,table' you won't find the record; you must look for it as 'table,chair'.
Sorting and indexing of ENUM and SET columns is done according to the internal (numeric) values of column values. The following example might appear to be incorrect because the values are not displayed in alphanumeric order:
mysql> SELECT e FROM e_table ORDER BY e; +--------+ | e | +--------+ | NULL | | | | jane | | fred | | will | | marcia | +--------+
You can better see what's going on by retrieving both the string and numeric forms of the ENUM values:
mysql> SELECT e, e+0 FROM e_table ORDER BY e; +--------+------+ | e | e+0 | +--------+------+ | NULL | NULL | | | 0 | | jane | 1 | | fred | 2 | | will | 3 | | marcia | 4 | +--------+------+
If you have a fixed set of values and you want them to sort in a particular order, you can exploit the ENUM sorting properties: Represent the values as an ENUM column in a table and list the enumeration values in the column definition in the order that you want them to be sorted. Suppose that you have a table representing personnel for a sports organization, such as a football team, and that you want to sort output by personnel position so that it comes out in a particular order, such as coaches, assistant coaches, quarterbacks, running backs, receivers, linemen, and so on. Define the column as an ENUM and list the enumeration elements in the order that you want to see them. Then column values automatically will come out in that order for sort operations.
mysql> SELECT CAST(e AS CHAR) AS e_str FROM e_table ORDER BY e_str; +--------+ | e_str | +--------+ | NULL | | | | fred | | jane | | marcia | | will | +--------+
CAST() doesn't change the displayed values, but has the effect in this statement of performing an ENUM-to-string conversion that alters their sorting properties so they sort as strings.
String Data Type Attributes
The attributes unique to the string data types are CHARACTER SET and COLLATE for designating a character set and collating order. You can specify these as options for the table itself to set its defaults, or for individual columns to override the table defaults. (Actually, each database also has a default character set and collation, as does the server itself. These defaults sometimes come into play during table creation, as we'll see later.)
The CHARACTER SET and COLLATION attributes apply to the CHAR, VARCHAR, TEXT, ENUM, and SET data types. They do not apply to the binary string data types (BINARY, VARBINARY, and BLOB), because those types contain byte strings, not character strings.
To see how these rules apply, consider the following statement. It creates a table that uses several character sets:
CREATE TABLE mytbl ( c1 CHAR(10), c2 CHAR(40) CHARACTER SET latin2, c3 CHAR(10) COLLATE latin1_german1_ci, c4 BINARY(40) ) CHARACTER SET utf8;
The resulting table has utf8 as its default character set. No COLLATE table option is given, so the default table collation is the default utf8 collation (which happens to be utf8_general_ci). The definition for the c1 column contains no CHARACTER SET or COLLATE attributes of its own, so the table defaults are used for it. On the other hand, the table-level character set and collation are not used for c2, c3, and c4: c2 and c3 have their own character set information, and c4 has a binary string type, so the character set attributes do not apply. For c2, the collation is latin2_general_ci, the default collation for latin2. For c3, the character set is latin1, as can be determined from the collation name latin1_german1_ci.
To see character set information for an existing table, use SHOW CREATE TABLE:
mysql> SHOW CREATE TABLE mytbl\G *************************** 1. row *************************** Table: mytbl Create Table: CREATE TABLE `mytbl` ( `c1` char(10) default NULL, `c2` char(40) character set latin2 default NULL, `c3` char(10) character set latin1 collate latin1_german1_ci default NULL, `c4` binary(40) default NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8
If SHOW CREATE TABLE does not display a column character set, it is the same as the table default character set. If it does not display a column collation, it is the default collation for the character set.
mysql> SHOW FULL COLUMNS FROM mytbl; +-------+------------+-------------------+------+-----+---------+... | Field | Type | Collation | Null | Key | Default |... +-------+------------+-------------------+------+-----+---------+... | c1 | char(10) | utf8_general_ci | YES | | NULL |... | c2 | char(40) | latin2_general_ci | YES | | NULL |... | c3 | char(10) | latin1_german1_ci | YES | | NULL |... | c4 | binary(40) | NULL | YES | | NULL |... +-------+------------+-------------------+------+-----+---------+...
The preceding discussion mentions column and table character set assignments, but character sets actually can be designated at the column, table, database, or server level. When MySQL processes a character column definition, it determines which character set to use for it by trying the following rules in order:
In other words, MySQL searches up through the levels at which character sets may be specified until it finds a character set defined, and then uses that for the column's set. The database always has a default character set, so the search process is guaranteed to terminate at the database level even if no character set is specified explicitly at any of the lower levels.
The character set name binary is special. If you assign the binary character set to a non-binary string column, the result is a type conversion that forces the column to the corresponding binary string type. The following pairs of column definitions each show two equivalent definitions:
c1 CHAR(10) CHARACTER SET binary c1 BINARY(10) c2 VARCHAR(10) CHARACTER SET binary c2 VARBINARY(10) c3 TEXT CHARACTER SET binary c3 BLOB
If you assign the binary character set as a table option, it applies to each string column that does not have any character set information specified in its own definition.
MySQL provides some shortcut attributes for defining character columns:
The general attributes NULL or NOT NULL can be specified for any of the string types. If you don't specify either of them, NULL is the default. However, defining a string column as NOT NULL does not prevent you from storing an empty string (that is, '') in the column. In MySQL, an empty value is different from a missing value, so don't make the mistake of thinking that you can force a string column to contain non-empty values by defining it NOT NULL. If you require string values to be non-empty, that is a constraint you must enforce from within your own applications.
You also can specify a default value using the DEFAULT attribute for all string data types except the BLOB and TEXT types. The rules that MySQL uses for assigning a default value if you specify no DEFAULT clause are given in "Specifying Column Default Values."
Choosing String Data Types
Date and Time Data Types
MySQL provides several data types for storing temporal values: DATE, TIME, DATETIME, TIMESTAMP, and YEAR. Table 3.12 shows these types and the range of legal values for each type. The storage requirements for each type are shown in Table 3.13.
Each date and time type has a "zero" value that is stored when you insert a value that is illegal for the type, as shown in Table 3.14. The "zero" value also is the default value for date and time columns that are defined with the NOT NULL constraint.
MySQL always represents dates with the year first, in accordance with the standard SQL and ISO 8601 specifications. For example, December 3, 2004 is represented as '2004-12-03'. However, MySQL does allow some leeway in how you can specify input dates. For example, it will convert two-digit year values to four digits, and you need not supply a leading zero digit for month and day values that are less than 10. However, you must specify the year first and the day last. Formats that you may be more used to, such as '12/3/99' or '3/12/99', will not be interpreted as you might intend. The date interpretation rules MySQL uses are discussed further in "Working with Date and Time Values."
For combined date and time values, it is also allowable to specify a 'T' character rather than a space between the date and time (for example, '2004-12-31T12:00:00').
Time or combined date and time values can include a microseconds part following the time, consisting of a decimal point and up to six digits. (For example, '12:30:15.5' or '2005-06-15 10:30:12.000045'.) However, current support for microsecond values is only partial. Some temporal functions use them, but you cannot store a temporal value that includes a microseconds part in a table; the microseconds part is discarded.
For retrieval, you can display date and time values in a variety of formats by using the DATE_FORMAT() and TIME_FORMAT() functions.
The DATE, TIME, and DATETIME Data Types
The DATE and TIME types hold date and time values. The DATETIME type holds combined date and time values. The formats for the three types of values are 'CCYY-MM-DD', 'hh:mm:ss', and 'CCYY-MM-DD hh:mm:ss', where CC, YY, MM, DD hh, mm, and ss represent century, year, month, day, hour, minute, and second, respectively.
For the DATETIME type, the date and time parts are both required; if you assign a DATE value to a DATETIME column, MySQL automatically adds a time part of '00:00:00'. Conversions work in the other direction as well. If you assign a DATETIME value to a DATE or TIME column, MySQL discards the part that is irrelevant:
mysql> CREATE TABLE t (dt DATETIME, d DATE, t TIME); mysql> INSERT INTO t (dt,d,t) VALUES(NOW(), NOW(), NOW()); mysql> SELECT * FROM t; +---------------------+------------+----------+ | dt | d | t | +---------------------+------------+----------+ | 2004-07-17 16:30:44 | 2004-07-17 | 16:30:44 | +---------------------+------------+----------+
MySQL treats the time in DATETIME and TIME values slightly differently. For DATETIME, the time part represents a time of day and must be in the range from '00:00:00' to '23:59:59'. A TIME value, on the other hand, represents elapsed timethat's why the range shown in Table 3.12 for TIME columns is so great and why negative values are allowed.
One thing to watch out when inserting TIME values into a table is that if you use a "short" (not fully qualified) value, it may not be interpreted as you expect. For example, you'll probably find that if you insert '30' and '12:30', into a TIME column, one value will be interpreted from right to left and the other from left to right, resulting in stored values of '00:00:30' and '12:30:00'. If you consider '12:30' to represent a value of "12 minutes, 30 seconds," you should specify it in fully qualified form as '00:12:30'.
The TIMESTAMP Data Type
TIMESTAMP is a temporal data type that stores combined date and time values. (The word "timestamp" might appear to connote time only, but that is not the case.) The description of the TIMESTAMP data type in this section is current as of MySQL 4.1.6. Certain aspects of TIMESTAMP properties were in flux during earlier 4.1 releases, so avoid them and use a current release. The end of the section summarizes how the TIMESTAMP properties differ in MySQL 4.0 from the current properties.
The TIMESTAMP data type has several special properties:
Only one TIMESTAMP column in a table can be designated to have automatic properties. You cannot have automatic initialization for one TIMESTAMP column and automatic update for another. Nor can you have automatic initialization for multiple columns, or automatic update for multiple columns.
The syntax for specifying a TIMESTAMP column is as follows, assuming a column name of ts:
ts TIMESTAMP [DEFAULT value] [ON UPDATE CURRENT_TIMESTAMP]
The DEFAULT and ON UPDATE attributes can be given in any order, if both are given. The default value can be CURRENT_TIMESTAMP or a constant value such as 0 or a value in 'CCYY-MM-DD hh:mm:ss' format. Synonyms for CURRENT_TIMESTAMP are CURRENT_TIMESTAMP() and NOW(); they're all interchangeable in a TIMESTAMP definition.
To use automatic initialization or update for a different TIMESTAMP column than the first one, you must explicitly define the first one with a DEFAULT constant_value attribute and no ON UPDATE CURRENT_TIMESTAMP attribute. Then you can use DEFAULT CURRENT_TIMESTAMP or ON UPDATE CURRENT_TIMESTAMP (or both) with any other single TIMESTAMP column.
If you want to defeat automatic initialization or update for a TIMESTAMP column, set it explicitly to the desired value for insert or update operations. For example, you can prevent an update from changing the column by setting the column to its current value.
TIMESTAMP column definitions also can include NULL or NOT NULL. The default is NOT NULL. Its effect is that when you explicitly set the column to NULL, MySQL sets it to the current timestamp. (This is true both for inserts and updates.) If you specify NULL, setting the column to NULL stores NULL rather than the current timestamp.
If you want a table to contain columns for both a time-created value and a last-modified value, use two TIMESTAMP columns:
CREATE TABLE t ( t_created TIMESTAMP DEFAULT 0, t_modified TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ... other columns ... );
When inserting a new record, set both TIMESTAMP columns to NULL to set them to the insertion timestamp. When updating an existing record, leave both columns alone; t_modified will be updated automatically to the modification timestamp.
During table creation, TIMESTAMP columns are subject to the setting of the SQL mode. If the MAXDB mode is enabled, any TIMESTAMP column is created as a DATETIME column instead. This is for compatibility with the MaxDB DBMS.
If you have been using MySQL 4.0, you'll notice that its handling of the TIMESTAMP data type differs in several ways from 4.1. Here's a brief characterization of the important differences for MySQL 4.0:
The YEAR Data Type
YEAR is a one-byte data type intended for efficient representation of year values. A YEAR column definition may include a specification for a display width M, which should be either 4 or 2. If you omit M from a YEAR definition, the default is 4. YEAR(4) has a range of 1901 to 2155. YEAR(2) has a range of 1970 to 2069, but only the last two digits are displayed. You can use the YEAR type when you want to store date information but only need the year part of the date, such as year of birth, year of election to office, and so forth. When you do not require a full date value, YEAR is much more space-efficient than other date types.
TINYINT has the same storage size as YEAR (one byte), but not the same range. To cover the same range of years as YEAR by using an integer type, you would need a SMALLINT, which takes twice as much space. If the range of years you need to represent coincides with the range of the YEAR type, YEAR is more space-efficient than SMALLINT. Another advantage of YEAR over an integer column is that MySQL converts two-digit values into four-digit values for you using MySQL's usual year-guessing rules. For example, 97 and 14 become 1997 and 2014. However, be aware that inserting the numeric value 00 into a four-digit YEAR column results in the value 0000 being stored, not 2000. If you want a value of 00 to convert to 2000, you should specify it in string form as '00'.
Date and Time Data Type Attributes
TIMESTAMP columns are special; the default for the first such column in a table is the current date and time, and the "zero" value for any others. However, the full set of rules governing default values is more complex. See "The TIMESTAMP Data Type" for details.
Working with Date and Time Values
MySQL tries to interpret input values for date and time columns in a variety of formats, including both string and numeric forms. Table 3.16 shows the allowable formats for each of the date and time types.
MySQL interprets formats that have no century part (CC) using the rules described in "Interpretation of Ambiguous Year Values." For string formats that include delimiter characters, you don't have to use '-' for dates and ':' for times. Any punctuation character may be used as the delimiter. Interpretation of values depends on context, not on the delimiter. For example, although times are typically specified using a delimiter of ':', MySQL won't interpret a value containing ':' as a time in a context where a date is expected. In addition, for the string formats that include delimiters, you need not specify two digits for month, day, hour, minute, or second values that are less than 10. The following are all equivalent:
'2012-02-03 05:04:09' '2012-2-03 05:04:09' '2012-2-3 05:04:09' '2012-2-3 5:04:09' '2012-2-3 5:4:09' '2012-2-3 5:4:9'
Note that MySQL may interpret values with leading zeros in different ways depending on whether they are specified as strings or numbers. The string '001231' will be seen as a six-digit value and interpreted as '2000-12-31' for a DATE, and as '2000-12-31 00:00:00' for a DATETIME. On the other hand, the number 001231 will be seen as 1231 after the parser gets done with it and then the interpretation becomes problematic. This is a case where it's best to supply a string value '001231', or else use a fully qualified value if you are using numbers (that is, 20001231 for DATE and 200012310000 for DATETIME).
MySQL provides many functions for working with date and time values. See Appendix C for more information.
Interpretation of Ambiguous Year Values
For all date and time types that include a year part (DATE, DATETIME TIMESTAMP, YEAR), MySQL handles values that contain two-digit years by converting them to four-digit years:
mysql> CREATE TABLE y_table (y YEAR); mysql> INSERT INTO y_table VALUES(68),(69),(99),(00),('00'); mysql> SELECT * FROM y_table; +------+ | y | +------+ | 2068 | | 2069 | | 1999 | | 0000 | | 2000 | +------+
Notice that 00 is converted to 0000, not to 2000. That's because, as a number, 00 is the same as 0, and is a perfectly legal value for the YEAR type. If you insert a numeric zero, that's what you get. To get 2000 using a value that does not contain the century, insert the string '0' or '00'. You can make sure that MySQL sees a string and not a number by inserting YEAR values using CAST(value AS CHAR) to produce a string result uniformly regardless of whether value is a string or a number.
In any case, keep in mind that the rules for converting two-digit to four-digit year values provide only a reasonable guess. There is no way for MySQL to be certain about the meaning of a two-digit year when the century is unspecified. If MySQL's conversion rules don't produce the values that you want, the solution is to provide unambiguous data with four-digit years.
Spatial Data Types
MySQL 4.1 and up supports spatial values. This capability allows representation of values such as points, lines, and polygons. These data types are implemented per the OpenGIS specification, which is available at the Open Geospatial Consortium Web site:
The spatial data types allowed in MySQL are listed in Table 3.17.
Currently, MySQL supports spatial types only for MyISAM tables, and indexed spatial columns do not allow NULL values. These restrictions may or may not matter to you. Probably the most significant implications to consider are these:
MySQL works with spatial values in three formats. Well-Known Text (WKT) and Well-Known Binary (WKB) formats represent spatial values as text strings or in a standard binary format. The syntax for text strings and the binary representation are defined in the OpenGIS specification. For example, the WKT format for a POINT value with coordinates of x and y is written as a string:
Note the absence of a comma between the coordinate values. More complex values have a more complex string representation. The following strings represent a LINESTRING consisting of several points and a POLYGON that has a rectangular outer boundary and a triangular inner boundary:
'LINESTRING(10 20, 0 0, 10 20, 0 0)' 'POLYGON((0 0, 100 0, 100 100, 0 100, 0 0),(30 30, 30 60, 45 60, 30 30))'
The third format is the internal format that MySQL uses for storing spatial values in tables.
Because spatial values can be complex, most operations on them are done by invoking functions. The set of spatial functions is extensive and includes functions for converting from one format to another. For the complete list, see Appendix C.
The following example shows how to use several aspects of spatial support:
mysql> CREATE TABLE pt_tbl (p POINT); mysql> INSERT INTO pt_tbl (p) VALUES -> (POINTFROMTEXT('POINT(0 0)')), -> (POINTFROMTEXT('POINT(0 50)')), -> (POINTFROMTEXT('POINT(100 100)')); mysql> CREATE FUNCTION dist (p1 POINT, p2 POINT) -> RETURNS FLOAT -> RETURN SQRT(POW(X(p2)-X(p1),2) + POW(Y(p2)-Y(p1),2)); mysql> SET @ref_pt = POINTFROMTEXT('POINT(0 0)'); mysql> SELECT ASTEXT(p), dist (p, @ref_pt) AS dist FROM pt_tbl; +----------------+-----------------+ | ASTEXT(p) | dist | +----------------+-----------------+ | POINT(0 0) | 0 | | POINT(0 50) | 50 | | POINT(100 100) | 141.42135620117 | +----------------+-----------------+