Previous Page
Next Page

Understanding Full-Text Searching


Not All Engines Support Full-Text Searching As will be explained in Chapter 21, "Creating and Manipulating Tables," MySQL supports the use of several underlying database engines. Not all engines support full-text searching as is described in this chapter. The two most commonly used engines are MyISAM and InnoDB; the former supports full-text searching and the latter does not. This is why, although most of the sample tables used in this book were created to use InnoDB, one (the productnotes table) was created to use MyISAM. If you need full-text searching functionality in your applications, keep this in mind.

In Chapter 8, "Using Wildcard Filtering," you were introduced to the LIKE keyword that is used to match text (and partial text) using wildcard operators. Using LIKE it is possible to locate rows that contain specific values or parts of values, regardless of the location of those values within row columns.

In Chapter 9, "Searching Using Regular Expressions," text-based searching was taken one step further with the introduction to using regular expressions to match column values. Using regular expressions, it is possible to write very sophisticated matching patterns to locate the desired rows.

But as useful as these search mechanisms are, they have several very important limitations:

  • Performance Wildcard and regular expression matching usually requires that MySQL try and match each and every row in a table (and table indexes are rarely of use in these searches). As such, these searches can be very time-consuming as the number of rows to be searched grows.

  • Explicit control Using wildcard and regular expression matching, it is very difficult (and not always possible) to explicitly control what is and what is not matched. An example of this is a search specifying a word that must be matched, a word that must not be matched, and a word that may or may not be matched but only if the first word is indeed matched.

  • Intelligent results Although wildcard- and regular expressionbased searching provide for very flexible searching, neither provide an intelligent way to select results. For example, searching for a specific word would return all rows that contained that word, and not distinguish between rows that contain a single match and those that contained multiple matches (ranking them as potentially better matches). Similarly, searches for a specific word would not find rows that did not contain that word but did contain other related words.

All of these limitations and more are addressed by full-text searching. When full-text searching is used, MySQL does not need to look at each row individually, analyzing and processing each word individually. Rather, an index of the words (in specified columns) is created by MySQL, and searches can be made against those words. MySQL can thus quickly and efficiently determine which words match (which rows contain them), which don't, how often they match, and so on.

Previous Page
Next Page