Google Cloud Search applies several default expansions, interpretations, and optimizations that affect search results. If you see unexpected results from search queries, refer to this guide before contacting Cloud Search support.
Default expansions
Suppose a user searches for a string such as [Joe’s PDFs], but some returned results highlight words such as "documents" instead of "PDFs." Why do the results highlight words that weren't in the search query?
By default, Google Cloud Search, just like Google Web Search, doesn’t search only for the exact words in a query. Instead, Cloud Search expands the query to include synonyms and word stems (even if you haven't implemented your own synonyms). This expansion retrieves documents that broadly match the idea and intent of the query. After this broad set of documents is selected, the ranking algorithms work to place the best matches at the top of the result set.
When the user searched for [Joe’s PDFs], Cloud Search supplied the following as additional acceptable words:
- For [Joe’s], Cloud Search might also match "joe" (a stem expansion) and "joes" (a synonym based on punctuation).
- For [PDFs], Cloud Search might also match "documents" (a synonym expansion) and "pdf" (a stem expansion).
By default, synonyms are not necessarily bidirectional. For example, if a user searches for the term "phishing," Cloud Search might match "phish" as a synonym expansion. However, if the user searches for the term "phish," Cloud Search might not match "phishing" as an expansion.
Expansions for hyphenated versus non-hyphenated words
Cloud Search treats a query containing hyphenated words differently from its non-hyphenated equivalent. For example, [walk-in closet] and [walk in closet] are handled as different queries.
Additionally, different optimizations are used for hyphenated and underscored words, such as [walk-in] and [walk_in].
Compensate for default expansions
There is no guarantee of any expansion by default. If you want to ensure bidirectionality of synonyms or domain-specific synonym expansions, create your own set of domain-specific synonyms. For further information on implementing synonyms, refer to Define synonyms.
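As a rough illustration of what bidirectionality means for a custom synonym set, the sketch below expands a group of equivalent terms into (term, synonym) pairs covering both directions. The function name `bidirectional_pairs` and the pair format are assumptions made for this example only. They are not the format Cloud Search expects, which is described in Define synonyms.

```python
from itertools import permutations


def bidirectional_pairs(group):
    """Return (term, synonym) pairs in both directions for a group of terms.

    Hypothetical helper: the pair format is illustrative only and is not
    the format Cloud Search uses for synonym definitions.
    """
    unique_terms = sorted(set(group))
    # permutations(..., 2) yields every ordered pair, so each term maps to
    # every other term and vice versa.
    return sorted(permutations(unique_terms, 2))


pairs = bidirectional_pairs(["phish", "phishing"])
for term, synonym in pairs:
    print(f"{term} -> {synonym}")
```

Generating both directions explicitly avoids relying on the default behavior, where a match from "phishing" to "phish" does not imply the reverse.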
Default interpretations
Cloud Search also provides natural-language interpretation, which interprets the objects, properties, and field values used in a query according to the schema uploaded for a particular data source. For further information, refer to Structure your schema for optimal query interpretation.
Disable natural-language interpretations
To disable natural-language interpretations for a specific query, set QueryInterpretationOptions.disableNlInterpretation to true in the search request.
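A minimal sketch of a search request body with this option set might look like the following. It assumes the request is sent as JSON and that the options are carried in a queryInterpretationOptions field of the request body. The helper name `build_search_body` and the sample query are hypothetical, and authentication and sending the request are omitted.

```python
import json


def build_search_body(query, disable_nl=True):
    """Build a JSON-serializable search request body (illustrative only)."""
    return {
        "query": query,
        # Assumed field name for QueryInterpretationOptions in the JSON body.
        "queryInterpretationOptions": {
            "disableNlInterpretation": disable_nl,
        },
    }


body = build_search_body("reports from joe")
print(json.dumps(body, indent=2))
```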
Default optimizations
Cloud Search also provides these default optimizations:
- Blending in results provided by spelling correction. For example, if the query string is [corpoate benefits], Cloud Search matches both "corpoate" and the corrected spelling, "corporate."
- Broadening matches for queries that would yield zero or few results. For these queries, Cloud Search uses a more permissive set of related terms, broader than direct synonyms, when matching results. For further information, refer to Handle supplemental results.
Normalizing documents and queries
Normalizing refers to standardizing on certain words or phrases either prior to or after a query has been made. To ensure more consistent responses to your queries, consider normalizing your documents (prior to or during indexing) and queries (after the user has made the query) in the following ways:
To normalize documents:
- Pick a canonical spelling for critical words used in documents within your repositories.
- Correct the spelling to match the canonical spelling, either in the source repository documents or when indexing content.
To normalize queries:
- Intercept user queries before sending them to Cloud Search.
- Rewrite words in user queries to match the most common spelling in the indexed data source.
- Send the query to Cloud Search.
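The query-normalization steps above can be sketched as a small rewrite function that runs before the query is sent. The `CANONICAL` mapping and its entries are hypothetical. In practice the mapping would be derived from the spellings that actually dominate your indexed data.

```python
import re

# Hypothetical mapping from variant spellings to the spelling used most
# often in the indexed data source.
CANONICAL = {
    "e-mail": "email",
    "on-boarding": "onboarding",
}


def normalize_query(query, canonical=CANONICAL):
    """Rewrite known variant spellings in a user query to canonical forms."""

    def replace(match):
        word = match.group(0)
        # Look up case-insensitively; leave unknown words untouched.
        return canonical.get(word.lower(), word)

    # Treat runs of word characters and hyphens as a single word so that
    # hyphenated variants such as "e-mail" are matched whole.
    return re.sub(r"[\w-]+", replace, query)


normalized = normalize_query("E-mail setup for on-boarding")
print(normalized)
```

The normalized string is what would then be passed to Cloud Search as the query.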
Disable expansions, interpretations, and optimizations for all queries
To disable expansions, interpretations, and optimizations for a specific query, set QueryInterpretationOptions.enableVerbatimMode to true in the search request.
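As a sketch, verbatim mode can be switched on by adding the flag to an existing request body. This assumes the body is a JSON object whose queryInterpretationOptions field carries the options. The helper name `with_verbatim_mode` is hypothetical.

```python
def with_verbatim_mode(body):
    """Return a copy of a search request body with verbatim mode enabled."""
    updated = dict(body)
    # Copy any existing options so the caller's dictionary is not mutated.
    options = dict(updated.get("queryInterpretationOptions", {}))
    options["enableVerbatimMode"] = True
    updated["queryInterpretationOptions"] = options
    return updated


original = {"query": "corpoate benefits"}
request = with_verbatim_mode(original)
print(request)
```

Returning a copy keeps the original body reusable for a follow-up query with the default behavior, for example when the verbatim query returns no results.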