Model Data to Support Keyword Search
Keyword search is not the same as text search or full text search, and does not provide stemming or other text-processing features. See the Limitations of Keyword Indexes section for more information.
In 2.4, MongoDB provides a text search feature. See Text Indexes for more information.
If your application needs to perform queries on the content of a field that holds text you can perform exact matches on the text or use
$regex to use regular expression pattern matches. However, for many operations on text, these methods do not satisfy application requirements.
This pattern describes one method for supporting keyword search using MongoDB to support application search functionality, that uses keywords stored in an array in the same document as the text field. Combined with a multi-key index, this pattern can support application’s keyword search operations.
To add structures to your document to support keyword-based queries, create an array field in your documents and add the keywords as strings in the array. You can then create a multi-key index on the array and create queries that select values from the array.
Given a collection of library volumes that you want to provide topic-based search. For each volume, you add the array
topics, and you add as many keywords as needed for a given volume.
Moby-Dick volume you might have the following document:
You then create a multi-key index on the
The multi-key index creates separate index entries for each keyword in the
topics array. For example the index contains one entry for
whaling and another for
You then query based on the keywords. For example:
An array with a large number of elements, such as one with several hundreds or thousands of keywords will incur greater indexing costs on insertion.
MongoDB can support keyword searches using specific data models and multi-key indexes; however, these keyword indexes are not sufficient or comparable to full-text products in the following respects:
- Stemming. Keyword queries in MongoDB can not parse keywords for root or related words.
- Synonyms. Keyword-based search features must provide support for synonym or related queries in the application layer.
- Ranking. The keyword look ups described in this document do not provide a way to weight results.
- Asynchronous Indexing. MongoDB builds indexes synchronously, which means that the indexes used for keyword indexes are always current and can operate in real-time. However, asynchronous bulk indexes may be more efficient for some kinds of content and workloads.