Full-Text Search is a database technique which will retrieve records even if they don’t exactly match your search criteria. This enables rich natural language searching that feels more familiar.

Searching for “database engine” using a full-text search will return results containing the phrases “database,” “engine” or “database engine”. This resolves possible ambiguities in your search criteria, so rows like “I have a database and an engine” still show up.

Simple searches in MySQL can be made using the LIKE operator. This is inefficient and limited in functionality. MySQL usually needs to perform a full-table scan to identify records matching your query.

Full-text queries use a specially created index to improve performance. This also enables MySQL to keep track of the words within your dataset, facilitating the natural language search.

You can only use full-text search with columns that have a full-text index. Use the FULLTEXT clause in CREATE TABLE statements to set this up when you seed your database schema. You can also use it with ALTER TABLE to add a search index to existing columns.

With the index in-place, you’re ready to start querying your database.

Using Full-Text Searches

Full-text searches start with a WHERE clause. You use MATCH AGAINST instead of LIKE. You need to indicate the indexed columns to match, as well as the query to search for.

This will perform a full-text search of the articles using the query database engine. By specifying IN NATURAL LANGUAGE MODE, MySQL is instructed to interpret the query literally, without processing it in any way. This is the default search mode if none is specified.

Ordering By Result Relevance

One of the advantages of full-text search is that it lets you order the returned records by relevance. This is not possible with a regular LIKE query. You can use a MATCH … AGAINST clause as part of a SQL SELECT. The returned virtual column will contain a relevance score, from 0 to 1, indicating how closely the record matched the search query.

This query would return the most relevant results first. This helps your application meet user expectations of how natural language search systems should perform.

MySQL computes search relevant scores by assessing several different factors. These include the number of records which match the query, as well as the number of times the query occurs within each record. A result with several exact matches for the query will rank higher than one which only contains part of the query.

When using MATCH … AGAINST in a SELECT statement, you don’t need to repeat it in the WHERE clause. You could manually filter the results to include only records with a non-zero relevance score.

Query Expansion Mode

Natural language isn’t the only supported full-text search mode. Query expansion mode is an alternative which helps widen the range of the search results. It automatically weights the search query against the most relevant terms.

An expanded search begins by finding the records that contain a match for the query. Those records are then checked to identify the most relevant words. The database then runs another search, this time basing it on the relevant words instead of the original query. This usually results in more records being returned while maintaining an acceptable degree of relevance.

Here’s how you enable query expansion mode:

Boolean Mode

The final full-text search mode is boolean mode. This lets you include boolean modifiers in your query. You can use this mode when you need advanced control over the matching logic.

You can require a word to be present in each result by prefixing it with +. Use - to exclude results which contain the word. Other operators can be used to match parts of words, create sub-expressions and make words reduce the relevance score. The latter can be helpful when masking “noise” terms. If you don’t specify an operator, it’s implied the word will be referenced in the query with OR.

This query would surface articles with at least one word starting with data where sqlite is not mentioned. Each result may or may not include the word engine.

Boolean mode lets you construct powerful searches using your own logic. One caveat is it doesn’t support search relevance scoring. This is the tradeoff in giving users the ability to make boolean searches. The result ranking may not align with human expectations.

MySQL comes with several full-text configuration options which let you finetune how searches are conducted. Here’s a few of the most important.

innodb_ft_min_token_size – Sets the minimum word length for indexed terms. Words with fewer characters than this value won’t be added to the index so you won’t be able to search them. The default value is 3, which excludes extremely common words such as a, an and I. This setting applies to InnoDB tables; use ft_min_word_len for MyISAM. innodb_ft_max_token_size – Similarly to innodb_ft_min_token_size, this sets the maximum length of indexed words. Longer words won’t be searchable. Use ft_max_word_len for MyISAM tables. innodb_ft_enable_stopword – This setting, on by default, lets you control whether MySQL should filter out “stopwords”. Stopwords are very commonly used words which could unduly influence search results. The default stopword list contains 36 frequent phrases. innodb_ft_user_stopword_table – You can set this value to the name of a database table which MySQL should source the stopword list from. This table must have a single VARCHAR column called value. Add your stopwords to exclude as records in this table.

These settings are usually set in your MySQL server configuration file. The location varies by platform; it can often be found at /etc/mysql/my.cnf. You’ll need to restart the MySQL service after changing the settings.

Once the server is back up, rebuild the full-text indexes for your table. You must do this so your data is reindexed using the current configuration. Otherwise, the previously indexed data will continue to be used.

To reindex an InnoDB table, run OPTIMIZE TABLE my_table. For MyISAM tables, use REPAIR TABLE my_table QUICK. The search indexes will then be rebuilt so your configuration changes take effect.

Summary

MySQL Full-Text Search is enabled by adding a FULLTEXT index to your searchable fields. You then use MATCH … AGAINST with one of three searching modes to get your results. Natural language queries return a search relevance score which you can use to rank your results.

Full-text search offers more powerful search behaviour than a LIKE statement. It’s also much more performant, particularly on large datasets, as all the text is indexed in advance.