Team LiB
Previous Section Next Section

String Types

The MySQL string types are general purpose types and commonly are used to store text or binary data. Types are available to hold values of varying maximum lengths and can be chosen according to whether you want values to be treated as binary or non-binary strings.

BINARY, VARBINARY, and the BLOB types are binary string types. A binary string is a sequences of bytes, and its length is measured in bytes. Binary strings have no character set and values are compared based on their numeric byte values.

CHAR, VARCHAR, and the TEXT types are non-binary string types. A non-binary string is a sequence of characters. It has a character set and collation. The character set defines the allowable characters for the data type and the collation defines the character sort order. Lengths in non-binary string column definitions are specified in number of characters, and values normally are measured in characters. However, the length of a non-binary string can also be measured in bytes. A non-binary string that is n characters long is also n bytes long if it contains single-byte characters, but more than n bytes long if it contains multi-byte characters. This affects the storage requirements for non-binary string columns. For fixed-length columns such as CHAR, CHAR(M) requires enough space to store M instances of the widest character in the character set. For example, characters in the utf8 character set vary from 1 to 3 bytes each, so CHAR(M) requires 3xM bytes. For variable-length columns such as VARCHAR, VARCHAR(M) requires only enough space to store the actual characters in a given value. To obtain the length of a non-binary string in characters or bytes, use the CHAR_LENGTH() or LENGTH() function, respectively.

You can specify a character set and collation for the non-binary string types (CHAR, VARCHAR, TEXT), as well as for the ENUM and SET types:

  • The syntax for specifying a character set is CHARACTER SET charset, where charset is a character set name such as latin1, greek, or utf8.

  • The syntax for specifying a collation is COLLATE collation, where collation is one of the allowable collations for the character set.

  • If no character set or collation are given, they are determined from the table defaults. If a character set is given without a collation, the collation is the default collation for the character set. If a collation is given without a character set, the character set is implied by the collation name. If a character set and collation both are given, the collation must be compatible with the character set. For example, latin1_bin is compatible with latin1 but not with utf8.

  • The binary character set and the BINARY column attribute are treated specially:

    • If you specify CHARACTER SET binary for a non-binary string type, it causes conversion to the corresponding binary string type. That is, CHAR becomes BINARY, VARCHAR becomes VARBINARY, and the TEXT types become BLOB types. ENUM and SET have no corresponding binary types, so CHARACTER SET binary simply becomes a column attribute as is.

    • The BINARY attribute is equivalent to specifying the binary collation for the character set. This is the collation name that ends with _bin. For example, if the default character set is utf8, a column defined as CHAR(10) BINARY becomes CHAR(10) CHARACTER SET utf8 COLLATE utf8_bin.

  • For non-binary string types, the ASCII and UNICODE attributes are shorthand for CHARACTER SET latin1 and CHARACTER SET ucs2, respectively. ASCII was introduced in MySQL 4.1.0 and UNICODE in 4.1.1.

The allowable character sets and collations supported by the server can be determined by issuing the SHOW CHARACTER SET and SHOW COLLATION statements. These statements also show which collation is the default for each character set.

Handling of trailing spaces varies for different string types:

  • For BINARY and CHAR, values are padded if necessary to the column length when stored, but trailing spaces are removed when values are retrieved.

  • For VARBINARY and VARCHAR, trailing spaces are removed when values are stored, but only prior to MySQL 5.0.3; as of MySQL 5.0.3, trailing spaces are retained. Trailing spaces are not removed when values are retrieved.

  • For the BLOB and TEXT types, trailing spaces are retrained; they are not removed for storage or retrieval.

  • For ENUM and SET, any trailing spaces in member values listed in the column definition are ignored. A consequence of this is that any trailing spaces are stripped from values stored in the column. This happens because MySQL converts any value stored in the column to the corresponding internal numeric value of the column member.

Binary String Types

  • BINARY[(M)]

    Meaning. A fixed-length binary string 0 to M bytes long. M should be an integer from 0 to 255. M is optional as of MySQL 4.1.8; if omitted, it defaults to 1. Strings longer than M bytes are chopped to a length of M when stored. Strings shorter than M bytes are right-padded with spaces when stored. Trailing spaces are removed when values are retrieved.

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to M bytes

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. M bytes

    Comparisons. Binary

  • VARBINARY(M)

    Meaning. A variable-length binary string 0 to M bytes long. M should be an integer from 0 to 255 (0 to 65535 as of MySQL 5.0.3). Strings longer than M bytes are chopped to a length of M when stored. Prior to MySQL 5.0.3, trailing spaces are removed from values when they are stored. As of 5.0.3, trailing spaces are not removed from stored values.

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to M bytes

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in bytes), plus 1 byte to record the length (1 or 2 bytes as of MySQL 5.0.3)

    Comparisons. Binary

    Note. In MySQL 5.0.3 and up, the actual maximum length of a VARBINARY column in practice may be less than 65535 bytes, depending on storage engine internal row-size limits and the storage required by other columns in the table.

  • TINYBLOB

    Meaning. A small BLOB (binary string) value

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to 255 (0 to 281) bytes

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in bytes), plus 1 byte to record the length

    Comparisons. Binary

  • BLOB[(M)]

    Meaning. A normal-sized BLOB (binary string) value

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to 65535 (0 to 2161) bytes. If a length M is given, it is used to choose the appropriate data type and then discarded. For lengths of 1 to 65535, the data type becomes BLOB. For lengths of 65536 or greater, the data types becomes whichever of MEDIUMBLOB or LONGBLOB is required to accommodate values of the given length.

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in bytes), plus 2 bytes to record the length

    Comparisons. Binary

  • MEDIUMBLOB

    Meaning. A medium-sized BLOB (binary string) value

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to 16777215 (0 to 2241) bytes

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in bytes), plus 3 bytes to record the length

    Comparisons. Binary

    Synonyms. LONG VARBINARY

  • LONGBLOB

    Meaning. A large BLOB (binary string) value

    Allowable attributes. None, other than the global attributes

    Allowable length. 0 to 4294967295 (0 to 2321) bytes

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in bytes), plus 4 bytes to record the length

    Comparisons. Binary

Non-Binary String Types

  • CHAR[(M)]

    Meaning. A fixed-length non-binary string 0 to M characters long. M should be an integer from 0 to 255. If omitted, M defaults to 1. Strings longer than M characters are chopped to a length of M when stored. Strings shorter than M characters are right-padded with spaces when stored. Trailing spaces are removed when values are retrieved.

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to M characters

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. M characters

    Comparisons. Non-binary, using the data type collation

    Synonyms. NCHAR(M) and NATIONAL CHAR(M) are synonyms for CHAR(M) CHARACTER SET utf8. BINARY(M) is a synonym for CHAR(M) BINARY before MySQL 4.1.2; from 4.1.2 on, BINARY is a distinct data type.

  • VARCHAR(M)

    Meaning. A variable-length non-binary string 0 to M characters long. M should be an integer from 0 to 255 (0 to 65535 as of MySQL 5.0.3). Strings longer than M characters are chopped to a length of M when stored. Prior to MySQL 5.0.3, trailing spaces are removed from values when they are stored. As of 5.0.3, trailing spaces are not removed from stored values.

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to M characters

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in characters), plus 1 byte to record the length (1 or 2 bytes as of MySQL 5.0.3)

    Comparisons. Non-binary, using the data type collation

    Synonyms. CHAR VARYING(M). NCHAR VARYING(M) and NATIONAL CHAR VARYING(M) are synonyms for VARCHAR(M) CHARACTER SET utf8. VARBINARY(M) is a synonym for VARCHAR(M) BINARY before MySQL 4.1.2; from 4.1.2 on, VARBINARY is a distinct data type.

    Note. In MySQL 5.0.3 and up, the actual maximum length of a VARCHAR may be less than 65535 characters, depending on storage engine internal row-size limits, whether the column character set is single-byte or multi-byte, and the storage required by other columns in the table.

  • TINYTEXT

    Meaning. A small TEXT (non-binary string) value

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to 255 (0 to 281) characters

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in characters), plus 1 byte to record the length

    Comparisons. Non-binary, using the data type collation

  • TEXT[(M)]

    Meaning. A normal-sized TEXT (non-binary string) value

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to 65535 (0 to 2161) characters. If a length M is given, it is used to choose the appropriate data type and then discarded. For lengths of 1 to 65535, the data type becomes TEXT. For lengths of 65536 or greater, the data types becomes whichever of MEDIUMTEXT or LONGTEXT is required to accommodate values of the given length.

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in characters), plus 2 bytes to record the length

    Comparisons. Non-binary, using the data type collation

  • MEDIUMTEXT

    Meaning. A medium-sized TEXT (non-binary string) value

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to 16777215 (0 to 2241) characters

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in characters), plus 3 bytes to record the length

    Comparisons. Non-binary, using the data type collation

    Synonyms. LONG VARCHAR

  • LONGTEXT

    Meaning. A large TEXT (non-binary string) value

    Allowable attributes. BINARY, CHARACTER SET

    Allowable length. 0 to 4294967295 (0 to 2321) characters

    Default value. NULL if the column can be NULL, '' (empty string) if NOT NULL

    Storage required. Length of value (in characters), plus 4 bytes to record the length

    Comparisons. Non-binary, using the data type collation

ENUM and SET Types

  • ENUM('value1','value2',...)

    Meaning. An enumeration; column values can be assigned exactly one member of the value list

    Allowable attributes. CHARACTER SET

    Default value. NULL if the column can be NULL, first enumeration value if NOT NULL

    Storage required. 1 byte for enumerations with 1 to 255 members, 2 bytes for enumerations with 256 to 65535 members

    Comparisons. Based on the numeric value of column values

    Note. In the data type definition, any trailing spaces present in member values are ignored.

  • SET('value1','value2',...)

    Meaning. A set; column values can be assigned zero or more members of the value list

    Allowable attributes. CHARACTER SET

    Default value. NULL if the column can be NULL, '' (empty set) if NOT NULL

    Storage required. 1 byte (for sets with 1 to 8 members), 2 bytes (9 to 16 members), 3 bytes (17 to 24 members), 4 bytes (25 to 32 members), or 8 bytes (33 to 64 members)

    Comparisons. Based on the numeric value of column values

    Note. In the data type definition, any trailing spaces present in member values are ignored.

    Team LiB
    Previous Section Next Section