Expression Evaluation and Type Conversion
Expressions contain terms and operators and are evaluated to produce values. Expressions can include constants, function calls, and references to table columns. These values may be combined using different kinds of operators, such as arithmetic or comparison operators, and terms of an expression may be grouped with parentheses. Expressions occur most commonly in the output column list and WHERE clause of SELECT statements. For example, here is a query that is similar to one used for age calculations in Chapter 1:
SELECT CONCAT(last_name, ', ', first_name), (YEAR(death) - YEAR(birth)) - IF(RIGHT(death,5) < RIGHT(birth,5),1,0) FROM president WHERE birth > '1900-1-1' AND DEATH IS NOT NULL;
Each selected value represents an expression, as does the content of the WHERE clause. Expressions also occur in the WHERE clause of DELETE and UPDATE statements, the VALUES() clause of INSERT statements, and so forth.
When MySQL encounters an expression, it evaluates the expression to produce a result. For example, (4*3)/(4-2) evaluates to the value 6. Expression evaluation may involve type conversion, such as when MySQL converts the number 960821 into a date '1996-08-21' if the number is used in a context requiring a DATE value.
This section discusses how you can write expressions in MySQL and the rules that govern the various kinds of type conversions that MySQL performs during the process of expression evaluation. Each of MySQL's operators is listed here, but MySQL has so many functions that only a few are touched on. For more information, see Appendix C.
Expressions can use function calls. Some functions take arguments (values inside the parentheses), and some do not. Multiple arguments should be separated by commas. When you invoke a function, there can be spaces around arguments, but there must be no space between the function name and the opening parenthesis:
If there is a space after the function name, the MySQL parser may interpret the function name as a column name. The usual result is a syntax error. This happens because function names are not reserved words and you can use them for column names if you want. You can tell MySQL to allow spaces after function names by enabling the IGNORE_SPACE SQL mode. However, that also causes function names to be treated as reserved words.
Expressions can include references to table columns. In the simplest case, when the table to which a column belongs is clear from context, a column reference may be given simply as the column name. Only one table is named in each of the following SELECT statements, so the column references are unambiguous, even though the same column names are used in each statement:
SELECT last_name, first_name FROM president; SELECT last_name, first_name FROM member;
If it's not clear which table should be used, a column name can be qualified by preceding it with the proper table name. If it's not even clear which database should be used, the table name can be preceded by the database name. You also can use these more-specific qualified forms in unambiguous contexts if you simply want to be more explicit:
SELECT president.last_name, president.first_name, member.last_name, member.first_name FROM president, member WHERE president.last_name = member.last_name; SELECT sampdb.student.name FROM sampdb.student;
Finally, you can combine all these kinds of values (constants, function calls, and column references) to form more complex expressions.
Terms of expressions can be combined using several kinds of operators. Arithmetic operators, listed in Table 3.18, include the usual addition, subtraction, multiplication, and division operators, as well as the modulo operator. Arithmetic is performed using BIGINT (64-bit) integer values for +, -, and * when both operands are integers, as well as for /, DIV, and % when the operation is performed in a context where the result is expected to be an integer. Otherwise, DOUBLE is used. Be aware that if an integer operation involves large values such that the result exceeds 64-bit range, you will get unpredictable results. (Actually, you should try to avoid exceeding 63-bit values; one bit is needed to represent the sign.)
Logical operators, shown in Table 3.19, evaluate expressions to determine whether they are true (non-zero) or false (zero). It is also possible for a logical expression to evaluate to NULL if its value cannot be ascertained. For example, 1 AND NULL is of indeterminate value.
As alternative forms of AND, OR, and NOT, MySQL allows the &&, ||, and ! operators, respectively, as used in the C programming language. Note in particular the || operator. Standard SQL specifies || as the string concatenation operator, but in MySQL it signifies a logical OR operation. If you use the following expression, expecting it to perform string concatenation, you may be surprised to discover that it returns the number 0:
'abc' || 'def' 0
This happens because 'abc' and 'def' are converted to integers for the operation, and both turn into 0. In MySQL, you must use CONCAT('abc','def') or proximity to perform string concatenation:
CONCAT('abc','def') 'abcdef' 'abc' 'def' 'abcdef'
If you want the standard SQL behavior for ||, enable the PIPES_AS_CONCAT SQL mode.
Bit operators, shown in Table 3.20, perform bitwise intersection, union, and exclusive-OR, where each bit of the result is evaluated as the logical AND, OR, or exclusive-OR of the corresponding bits of the operands. You also can perform bit shifts left or right. Bit operations are performed using BIGINT (64-bit) integer values.
Comparison operators, shown in Table 3.21, include operators for testing relative magnitude or lexical ordering of numbers and strings, as well as operators for performing pattern matching and for testing NULL values. The <=> operator is MySQL-specific.
For a discussion of the comparison properties of string values, see "String Values."
Pattern matching allows you to look for values without having to specify an exact literal value. MySQL provides SQL pattern matching using the LIKE operator and the wildcard characters '%' (match any sequence of characters) and '_' (match any single character). MySQL also provides pattern matching based on the REGEXP operator and regular expressions that are similar to those used in Unix programs such as grep, sed, and vi. You must use one of these pattern-matching operators to perform a pattern match; you cannot use the = operator. To reverse the sense of a pattern match, use NOT LIKE or NOT REGEXP.
The two types of pattern matching differ in important respects besides the use of different operators and pattern characters:
'Franklin' LIKE 'Frank%' 1 'Frankfurter' LIKE 'Frank%' 1
The wildcard character '%' matches any sequence of characters, including the empty sequence, so 'Frank%' matches 'Frank':
'Frank' LIKE 'Frank%' 1
This also means the pattern '%' matches any string, including the empty string. However, '%' will not match NULL. In fact, any pattern match with a NULL operand fails:
'Frank' LIKE NULL NULL NULL LIKE '%' NULL
'Frankly' LIKE 'Frank%' 1 'frankly' LIKE 'Frank%' 1 BINARY 'Frankly' LIKE 'Frank%' 1 BINARY 'frankly' LIKE 'Frank%' 0 'Frankly' COLLATE latin1_general_cs LIKE 'Frank%' 1 'frankly' COLLATE latin1_general_cs LIKE 'Frank%' 0 'Frankly' COLLATE latin1_bin LIKE 'Frank%' 1 'frankly' COLLATE latin1_bin LIKE 'Frank%' 0
This behavior differs from that of the standard SQL LIKE operator, which is case sensitive.
The other wildcard character allowed with LIKE is '_', which matches any single character. The pattern '___' matches any string of exactly three characters. 'c_t' matches 'cat', 'cot', 'cut', and even 'c_t' (because '_' matches itself).
Wildcard characters may be specified anywhere in a pattern. '%bert' matches 'Englebert', 'Bert', and 'Albert'. '%bert%' matches all of those strings, and also strings like 'Berthold', 'Bertram', and 'Alberta'. 'b%t' matches 'Bert', 'bent', and 'burnt'.
'abc' LIKE 'a%c' 1 'abc' LIKE 'a\%c' 0 'a%c' LIKE 'a\%c' 1 'abc' LIKE 'a_c' 1 'abc' LIKE 'a\_c' 0 'a_c' LIKE 'a\_c' 1
The '.' character is a wildcard that matches any single character:
'abc' REGEXP 'a.c' 1
The [...] construction matches any character listed between the square brackets.
'e' REGEXP '[aeiou]' 1 'f' REGEXP '[aeiou]' 0
You can specify a range of characters by listing the endpoints of the range separated by a dash ('-'), or negate the sense of the class (to match any character not listed) by specifying '^' as the first character of the class:
'abc' REGEXP '[a-z]' 1 'abc' REGEXP '[^a-z]' 0
'*' means "match any number of the previous thing," so that, for example, the pattern 'x*' matches any number of 'x' characters:
'abcdef' REGEXP 'a.*f' 1 'abc' REGEXP '[0-9]*abc' 1 'abc' REGEXP '[0-9][0-9]*' 0
'abc' REGEXP 'cd*' 1 'abc' REGEXP 'cd+' 0 'abcd' REGEXP 'cd+' 1
'^pattern' and 'pattern$' anchor a pattern match so that the pattern pattern matches only when it occurs at the beginning or end of a string, and '^pattern$' matches only if pattern matches the entire string:
'abc' REGEXP 'b' 1 'abc' REGEXP '^b' 0 'abc' REGEXP 'b$' 0 'abc' REGEXP '^abc$' 1 'abcd' REGEXP '^abc$' 0
MySQL's regular expression matching has other special pattern elements as well. See Appendix C for more information.
A LIKE or REGEXP pattern can be taken from a table column, although this will be slower than a constant pattern if the column contains several different values. The pattern must be examined and converted to internal form each time the column value changes.
When MySQL evaluates an expression, it looks at the operators to determine the order in which it should group the terms of the expression. Some operators have higher precedence; that is, they are "stronger" than others in the sense that they are evaluated earlier than others. For example, multiplication and division have higher precedence than addition and subtraction. The following two expressions are equivalent because * and / are evaluated before + and -:
1 + 2 * 3 - 4 / 5 6.2 1 + 6 - .8 6.2
Operator precedence is shown in the following list, from highest precedence to lowest. Operators listed on the same line have the same precedence. Operators at a higher precedence level are evaluated before operators at a lower precedence level. Operators at the same precedence level are evaluated left to right.
BINARY COLLATE ! - (unary minus) ~ (unary bit negation) ^ * / DIV % MOD + - << >> & | < <= = <=> <> != >= > IN IS LIKE REGEXP RLIKE BETWEEN CASE WHEN THEN ELSE NOT AND && OR || XOR :=
Some operators have a different precedence depending on the SQL mode or MySQL version. See Appendix C for details.
1 + 2 * 3 - 4 / 5 6.2 (1 + 2) * (3 - 4) / 5 -0.6
NULL Values in Expressions
If you supply NULL as an operand to any arithmetic or bit operator, the result is NULL:
1 + NULL NULL 1 | NULL NULL
With logical operators, the result is NULL unless the result can be determined with certainty:
1 AND NULL NULL 1 OR NULL 1 0 AND NULL 0 0 OR NULL NULL
NULL as an operand to any comparison or pattern-matching operator produces a NULL result, except for the <=>, IS NULL, and IS NOT NULL operators, which are intended specifically for dealing with NULL values:
1 = NULL NULL NULL = NULL NULL 1 <=> NULL 0 NULL LIKE '%' NULL NULL REGEXP '.*' NULL NULL <=> NULL 1 1 IS NULL 0 NULL IS NULL 1
Functions generally return NULL if given NULL arguments, except for those functions designed to deal with NULL arguments. For example, IFNULL() is able to handle NULL arguments and returns true or false appropriately. On the other hand, STRCMP() expects non-NULL arguments; if you pass it a NULL argument, it returns NULL rather than true or false.
In sorting operations, NULL values group together. They appear first in ascending sorts and last in descending sorts.
Whenever a value of one type is used in a context that requires a value of another type, MySQL performs extensive type conversion automatically according to the kind of operation you're performing. Type conversion may occur for any of the following reasons:
You also can perform explicit type conversion using a cast operator or function.
The following expression involves implicit type conversion. It consists of the addition operator + and two operands, 1 and '2':
1 + '2'
The operands are of different types (number and string), so MySQL converts one of them to make them the same type. But which one should it change? In this case, + is a numeric operator, so MySQL wants the operands to be numbers thus and converts the string '2' to the number 2. Then it evaluates the expression to produce the result 3.
1 + '2' 3
Here's another example. The CONCAT() function concatenates strings to produce a longer string as a result. To do this, it interprets its arguments as strings, no matter what type they are. If you pass it a bunch of numbers, CONCAT() converts them to strings, and then returns their concatenation:
If the call to CONCAT() is part of a larger expression, further type conversion may take place. Consider the following expression and its result:
CONCAT(1,2,3) produces the string '123'. The expression '123'/10 is converted to 123/10 because division is an arithmetic operator. The result of this expression would be 12.3 in floating-point context, but REPEAT() expects an integer repeat count, so an integer division is performed to produce 12. Then REPEAT('X',12) produces a string result of 12 'X' characters.
A general principle to keep in mind is that, by default, MySQL attempts to convert values to the type required by an expression rather than generating an error. Depending on the context, it converts values of each of the three general categories (numbers, strings, or dates and times) to values in any of the other categories. However, values can't always be converted from one type to another. If a value to be converted to a given type doesn't look like a legal value for that type, the conversion fails. Conversion to numbers of things like 'abc' that don't look like numbers results in a value of 0. Conversion to date or time types of things that don't look like a date or time result in the "zero" value for the type. For example, converting the string 'abc' to a date results in the "zero" date '0000-00-00'. On the other hand, any value can be treated as a string, so generally it's not a problem to convert a value to a string.
If you want to prevent conversion of illegal values to the closest legal values during data input operations, you can enable strict mode to cause an error to occur instead. See "How MySQL Handles Invalid Data Values."
MySQL also performs more minor type conversions. If you use a floating-point value in an integer context, the value is converted (with rounding). Conversion in the other direction works as well; an integer can be used without problem as a floating-point number.
Hexadecimal constants are treated as binary strings unless the context clearly indicates a number. In string contexts, each pair of hexadecimal digits is converted to a character and the result is used as a string. The following examples illustrate how this works:
0x61 'a' 0x61 + 0 97 X'61' 'a' X'61' + 0 97 CONCAT(0x61) 'a' CONCAT(0x61 + 0) '97' CONCAT(X'61') 'a' CONCAT(X'61' + 0) '97'
For comparisons, context determines whether to treat a hexadecimal constant as a binary string or a number:
0x61 'a' 0x61 = 'A' 0 _latin1 0x61 = 'A' 1 CONVERT(0x61 USING latin1) = 'A' 1
Some operators force conversion of the operands to the type expected by the operator, no matter what the type of the operands is. Arithmetic operators are an example of this. They expect numbers, and the operands are converted accordingly:
3 + 4 7 '3' + 4 7 '3' + '4' 7
In a string-to-number conversion, it's not enough for a string simply to contain a number somewhere. MySQL doesn't look through the entire string hoping to find a number, it looks only at the beginning; if the string has no leading numeric part, the conversion result is 0.
'1973-2-4' + 0 1973 '12:14:01' + 0 12 '23-skidoo' + 0 23 '-23-skidoo' + 0 -23 'carbon-14' + 0 0
MySQL's string-to-number conversion rule converts numeric-looking strings to floating-point values:
'-428.9' + 0 -428.9 '3E-4' + 0 0.0003
This does not work for hexadecimal constants, though:
The logical and bit operators are even stricter than the arithmetic operators. They want the operators to be not just numeric, but integers, and type conversion is performed accordingly. This means that a floating-point number such as 0.3 is not considered true, even though it's non-zero; that's because when it's converted to an integer, the result is 0. In the following expressions, the operands are not considered true until they have a value of at least 1.
0.3 OR .04 0 1.3 OR .04 1 0.3 AND .04 0 1.3 AND .04 0 1.3 AND 1.04 1
This type of conversion also occurs with the IF() function, which expects the first argument to be an integer. This means that values that round to zero will be considered false:
IF(1.3,'non-zero','zero') 'non-zero' IF(0.3,'non-zero','zero') 'zero' IF(-0.3,'non-zero','zero') 'zero' IF(-1.3,'non-zero','zero') 'non-zero'
To test floating-point values properly, it's best to use an explicit comparison:
Pattern matching operators expect to operate on strings. This means you can use MySQL's pattern matching operators on numbers because it will convert them to strings in the attempt to find a match!
12345 LIKE '1%' 1 12345 REGEXP '1.*5' 1
The magnitude comparison operators (<, <=, =, and so on) are context sensitive; that is, they are evaluated according to the types of their operands. The following expression compares the operands numerically because they both are numbers:
2 < 11 1
This expression involves string operands and thus results in a lexical comparison:
'2' < '11' 0
In the following comparisons, the types are mixed, so MySQL compares them as numbers. As a result, both expressions are true:
'2' < 11 1 2 < '11' 1
Date and Time Interpretation Rules
MySQL freely converts strings and numbers to date and time values as demanded by context in an expression, and vice versa. Date and time values are converted to numbers in numeric context; numbers are converted to dates or times in date or time contexts. This conversion to a date or time value happens when you assign a value to a date or time column or when a function requires a date or time value. In comparisons, the general rule is that date and time values are compared as strings.
If the table mytbl contains a DATE column date_col, the following statements are equivalent:
INSERT INTO mytbl SET date_col = '2004-04-13'; INSERT INTO mytbl SET date_col = '20040413'; INSERT INTO mytbl SET date_col = 20040413;
TO_DAYS('2004-04-10') 732046 TO_DAYS('20040410') 732046 TO_DAYS(20040410) 732046
Testing and Forcing Type Conversion
To see how type conversion will be handled in an expression, issue a SELECT query that evaluates the expression so that you can examine the result:
mysql> SELECT 0x41, 0x41 + 0; +------+----------+ | 0x41 | 0x41 + 0 | +------+----------+ | A | 65 | +------+----------+
As you might imagine, I did quite a lot of that sort of testing with different versions of MySQL while writing this chapter.
Testing expression evaluation is especially important for statements such as DELETE or UPDATE that modify records, because you want to be sure you're affecting only the intended rows. One way to check an expression is to run a preliminary SELECT statement with the same WHERE clause that you're going to use with the DELETE or UPDATE statement to verify that the clause selects the proper rows. Suppose that the table mytbl has a CHAR column char_col containing these values:
'abc' 'def' '00' 'ghi' 'jkl' '00' 'mno'
Given these values, what is the effect of the following statement?
DELETE FROM mytbl WHERE char_col = 00;
The intended effect is probably to delete the two rows containing the value '00'. The actual effect would be to delete all the rowsan unpleasant surprise. This happens as a consequence of MySQL's comparison rules. char_col is a string column, but 00 in the statement is not quoted, so it is treated as a number. By MySQL's comparison rules, a comparison involving a string and a number evaluates as a comparison of two numbers. As MySQL executes the DELETE statement, it converts each value of char_col to a number and compares it to 0. Unfortunately, although '00' converts to 0, so do all the strings that don't look like numbers. As a result, the WHERE clause is true for every row, and the DELETE statement empties the table. This is a case where it would have been prudent to test the WHERE clause with a SELECT statement prior to executing the DELETE, because that would have shown you that too many rows are selected by the expression:
mysql> SELECT char_col FROM mytbl WHERE char_col = 00; +----------+ | char_col | +----------+ | abc | | def | | 00 | | ghi | | jkl | | 00 | | mno | +----------+
When you're uncertain about the way a value will be used, you may want to exploit MySQL's type conversion to force an expression to a value of a particular type, or to call a function that performs the desired conversion. The following list demonstrates several useful conversion techniques: