Elasticsearch QueryString Generator

Purpose and Scope

This document covers the QueryString generator, which produces query strings in Elasticsearch's Query String Query format from Galach syntax trees. The QueryString generator is one of three generator implementations in the Query Translator library, alongside the Native generator (see 5.1) and the Solr ExtendedDisMax generator (see 5.2).

For information about shared Lucene visitor components used by both QueryString and ExtendedDisMax generators, see 5.5. For the overall visitor pattern implementation, see 5.4.


Overview

The QueryString generator translates parsed Galach syntax trees into Elasticsearch Query String Query syntax. It is implemented in the QueryString class and shares substantial code with the ExtendedDisMax generator through Lucene common visitors. The primary difference between the two generators lies in character escaping rules: QueryString requires additional escaping for comparison operators (=, >, <).

Generator Structure

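The QueryString generator follows the same architectural pattern as ExtendedDisMax: a thin generator class holds a single Visitor, typically an Aggregate, which dispatches each syntax tree node to the node-specific visitor that handles it.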

Sources: lib/Languages/Galach/Generators/QueryString.php:1-37, tests/Galach/Generators/QueryStringTest.php:1-72


Generator Class

The QueryString class serves as the entry point for generation. It accepts a Visitor implementation (typically an Aggregate visitor) and delegates the traversal to that visitor.

Class Structure

Element        Type         Description
$visitor       Visitor      Aggregate visitor containing all node-specific visitors
__construct()  Constructor  Accepts a Visitor implementation
generate()     Method       Traverses syntax tree and returns Elasticsearch query string

Generation Process

The generate() method accepts a SyntaxTree and an optional options parameter, invokes the visitor on the tree's root node, and returns the resulting string. All node-specific processing is thereby delegated to the composed visitor collection.
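
The delegation can be illustrated with a minimal sketch. It is written in Python rather than the library's PHP, and every name in it (Word, SyntaxTree.root_node, UppercaseWordVisitor) is a hypothetical stand-in, not the library's API; only the shape of generate() follows the description above.

```python
from dataclasses import dataclass
from typing import Any, Optional, Protocol


@dataclass
class Word:
    """Hypothetical leaf node holding a single term."""
    value: str


@dataclass
class SyntaxTree:
    """Stand-in for the library's SyntaxTree; only the root node matters here."""
    root_node: Any


class Visitor(Protocol):
    def visit(self, node: Any, sub_visitor: Optional["Visitor"] = None,
              options: Any = None) -> str: ...


class UppercaseWordVisitor:
    """Toy visitor used only to make the delegation observable."""

    def visit(self, node, sub_visitor=None, options=None):
        return node.value.upper()


class QueryString:
    """Entry point: holds one visitor and hands it the root node."""

    def __init__(self, visitor: Visitor) -> None:
        self._visitor = visitor

    def generate(self, syntax_tree: SyntaxTree, options: Any = None) -> str:
        # No traversal logic lives here; the visitor does all the work.
        return self._visitor.visit(syntax_tree.root_node, None, options)


generator = QueryString(UppercaseWordVisitor())
result = generator.generate(SyntaxTree(Word("one")))
print(result)  # ONE
```

Because the generator contains no node-specific logic, swapping the visitor is enough to change the output format.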

Sources: lib/Languages/Galach/Generators/QueryString.php:8-37


Visitor Configuration

The QueryString generator uses the same visitor composition as ExtendedDisMax, with one critical difference: it uses Lucene\QueryString\Word instead of Lucene\ExtendedDisMax\Word.

Required Visitors

The following visitors must be configured to create a functional QueryString generator:

Visitor Class             Purpose                                           Location
Lucene\Common\Prohibited  Handles -term syntax                              Shared with ExtendedDisMax
Lucene\Common\Group       Handles parentheses grouping and domain prefixes  Shared with ExtendedDisMax
Lucene\Common\Mandatory   Handles +term syntax                              Shared with ExtendedDisMax
Lucene\Common\LogicalAnd  Handles AND operators                             Shared with ExtendedDisMax
Lucene\Common\LogicalNot  Handles NOT operators                             Shared with ExtendedDisMax
Lucene\Common\LogicalOr   Handles OR operators                              Shared with ExtendedDisMax
Lucene\Common\Phrase      Handles quoted phrases with field mapping         Shared with ExtendedDisMax
Lucene\Common\Query       Handles root query node                           Shared with ExtendedDisMax
Lucene\Common\Tag         Translates #tag to field:value                    Shared with ExtendedDisMax
Lucene\Common\User        Translates @user to field:value                   Shared with ExtendedDisMax
Lucene\QueryString\Word   Handles word terms with QueryString escaping      QueryString-specific

Instantiation Example
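
A rough Python sketch of how such a visitor collection might be composed, assuming each visitor exposes an accept() check alongside visit() and that the aggregate passes itself down as the sub-visitor. The node and visitor classes are simplified, hypothetical stand-ins (no escaping or field mapping), not the library's PHP classes.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Word:
    value: str


@dataclass
class LogicalAnd:
    left: Any
    right: Any


@dataclass
class Query:
    nodes: List[Any]


class WordVisitor:
    def accept(self, node):
        return isinstance(node, Word)

    def visit(self, node, sub_visitor=None, options=None):
        return node.value


class LogicalAndVisitor:
    def accept(self, node):
        return isinstance(node, LogicalAnd)

    def visit(self, node, sub_visitor=None, options=None):
        left = sub_visitor.visit(node.left, sub_visitor, options)
        right = sub_visitor.visit(node.right, sub_visitor, options)
        return f"{left} AND {right}"


class QueryVisitor:
    def accept(self, node):
        return isinstance(node, Query)

    def visit(self, node, sub_visitor=None, options=None):
        parts = (sub_visitor.visit(child, sub_visitor, options) for child in node.nodes)
        return " ".join(parts)


class Aggregate:
    """Dispatches each node to the first registered visitor that accepts it."""

    def __init__(self, visitors):
        self._visitors = list(visitors)

    def visit(self, node, sub_visitor=None, options=None):
        for visitor in self._visitors:
            if visitor.accept(node):
                # Pass the aggregate itself so child nodes are dispatched too.
                return visitor.visit(node, self, options)
        raise LookupError(f"no visitor registered for {type(node).__name__}")


aggregate = Aggregate([QueryVisitor(), LogicalAndVisitor(), WordVisitor()])
tree = Query([LogicalAnd(Word("one"), Word("two")), Word("three")])
output = aggregate.visit(tree)
print(output)  # one AND two three
```

In this sketch, leaving a visitor out of the collection makes the aggregate raise as soon as it meets a node of that type.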


Sources: tests/Galach/Generators/QueryStringTest.php:38-71


Character Escaping

The primary difference between QueryString and ExtendedDisMax generators lies in character escaping. QueryString requires escaping for additional characters that represent comparison operators in Elasticsearch.

QueryString Word Escaping

The Lucene\QueryString\Word visitor escapes the following characters:

Character  Reason               Example Input  Example Output
+          Mandatory operator   \+             \+
-          Prohibited operator  \-             \-
=          Comparison operator  \=             \\\=
>          Comparison operator  \>             \\\>
<          Comparison operator  \<             \\\<
&&         AND operator         \&&            \\\&&
||         OR operator
!          NOT operator         \!             \!
(, )       Grouping             \(             \(
{, }       Range queries        \{             \\\{
[, ]       Range queries        \[             \\\[
^          Boost operator       \^             \\\^
"          Phrase delimiter     \"             \"
~          Fuzzy/proximity      \~             \\\~
*          Wildcard             \*             \\\*
?          Wildcard             \?             \\\?
:          Field delimiter      \:             \:
/          Regex delimiter      \/             \\\/
\          Escape character     \\             \\
(space)    Whitespace           \              \ 

The implementation uses a regular expression to find these characters and prefix each match with a backslash. The pattern includes \\=, \\>, and \\<, which are not present in the ExtendedDisMax escaping pattern.
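
The escaping step can be sketched in Python as follows. This is an illustration built from the character table above, not the library's PHP pattern. The galach_unescape() helper is an assumption: it guesses that Galach strips the backslash before the characters whose rows show identical input and output, which would explain why the other rows gain extra backslashes.

```python
import re

# Tokens escaped by both generators, taken from the rows of the table above.
COMMON_TOKENS = ["+", "-", "&&", "||", "!", "(", ")", "{", "}", "[", "]",
                 "^", '"', "~", "*", "?", ":", "/", "\\", " "]
# Comparison operators that only the QueryString variant escapes.
QUERY_STRING_EXTRA = ["=", ">", "<"]


def build_escaper(tokens):
    """Return a function that prefixes each occurrence of a token with a backslash."""
    alternatives = "|".join(re.escape(token) for token in tokens)
    pattern = re.compile(f"({alternatives})")
    return lambda text: pattern.sub(lambda match: "\\" + match.group(1), text)


escape_edismax = build_escaper(COMMON_TOKENS)
escape_query_string = build_escaper(COMMON_TOKENS + QUERY_STRING_EXTRA)


def galach_unescape(lexeme):
    # ASSUMPTION: Galach removes the backslash in front of a backslash,
    # plus, minus, exclamation mark, parenthesis, colon, double quote or space.
    return re.sub(r'\\([\\+\-!():" ])', r"\1", lexeme)


def translate_word(lexeme):
    """Unescape the Galach lexeme, then escape it for Elasticsearch."""
    return escape_query_string(galach_unescape(lexeme))


print(escape_edismax("a=b"))       # a=b
print(escape_query_string("a=b"))  # a\=b
print(translate_word(r"\+"))       # \+
print(translate_word(r"\="))       # \\\=
```

Under that assumption, \= is not a Galach escape, so both the backslash and the equals sign are escaped on output, giving the three backslashes shown in the table.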

Sources: lib/Languages/Galach/Generators/Lucene/QueryString/Word.php:1-27, lib/Languages/Galach/Generators/Lucene/ExtendedDisMax/Word.php:1-27, tests/Galach/Generators/QueryStringTest.php:13-32


Field Mapping

QueryString uses the same field mapping mechanism as ExtendedDisMax. Field mapping translates Galach domain prefixes to Elasticsearch field names.

Field Mapping Configuration

Visitors that support field mapping (Group, Phrase, and Word) accept two constructor parameters:

Parameter      Type    Description
$fieldMap      array   Maps domain names to Elasticsearch field names
$defaultField  string  Field name used for unrecognized or missing domains

Field Mapping Example
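
A minimal Python sketch of the lookup these two parameters imply. The function names resolve_field() and render_term() are hypothetical, and the sketch assumes that a term is left unprefixed when neither a mapping nor a default field applies.

```python
from typing import Dict, Optional


def resolve_field(domain: Optional[str], field_map: Dict[str, str],
                  default_field: Optional[str] = None) -> Optional[str]:
    """Map a Galach domain to a backend field, falling back to the default."""
    if domain is None:
        return default_field
    return field_map.get(domain, default_field)


def render_term(domain: Optional[str], term: str, field_map: Dict[str, str],
                default_field: Optional[str] = None) -> str:
    """Prefix the term with its resolved field, if there is one."""
    field = resolve_field(domain, field_map, default_field)
    return term if field is None else f"{field}:{term}"


field_map = {"domain": "special_text_t"}
mapped = render_term("domain", "one", field_map, "default_text_t")
unmapped = render_term("unexpected", "one", field_map, "default_text_t")
print(mapped)    # special_text_t:one
print(unmapped)  # default_text_t:one
```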


Translation Examples

Galach Input     Configuration                        Elasticsearch Output
domain:one       domain → special_text_t              special_text_t:one
unexpected:one   No mapping, default: default_text_t  default_text_t:one
domain:"phrase"  domain → special_text_t              special_text_t:"phrase"
domain:(group)   domain → special_text_t              special_text_t:(group)
#tag             Tag field: tags_ms                   tags_ms:tag
@user            User field: user_s                   user_s:user

Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:16-20, tests/Galach/Generators/ExtendedDisMaxTest.php:46-116


Translation Examples

The following tables demonstrate complete query translations from Galach to Elasticsearch Query String format.

Basic Terms and Operators

Galach Input  Elasticsearch Output  Description
one           one                   Simple word
'one'         'one'                 Single-quoted word
"one"         "one"                 Phrase
one two       one two               Multiple words
(one two)     (one two)             Grouped terms
one AND two   one AND two           Logical AND
one && two    one AND two           Alternative AND syntax
one OR two    one OR two            Logical OR
one || two    one OR two            Alternative OR syntax
NOT one       NOT one               Logical NOT
!one          NOT one               Alternative NOT syntax
+one          +one                  Mandatory term
-one          -one                  Prohibited term

Special Syntax

Galach Input    Elasticsearch Output  Description
@user           user_s:user           User mention (with user_s configured)
#tag            tags_ms:tag           Tag reference (with tags_ms configured)
domain:one      special_text_t:one    Domain-scoped term (with mapping)
unexpected:one  default_text_t:one    Unmapped domain uses default field

Escape Sequences

Galach Input  Elasticsearch Output  Description
\\            \\                    Escaped backslash
\+            \+                    Escaped mandatory operator
\-            \-                    Escaped prohibited operator
\=            \\\=                  Escaped equals (QueryString-specific)
\>            \\\>                  Escaped greater-than (QueryString-specific)
\<            \\\<                  Escaped less-than (QueryString-specific)
\*            \\\*                  Escaped wildcard
\?            \\\?                  Escaped wildcard
\!            \!                    Escaped NOT operator
\             \                     Escaped space

Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:22-201, tests/Galach/Generators/QueryStringTest.php:13-32


Usage Example

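Complete usage follows three steps: configure the node-specific visitors, wrap them in an Aggregate visitor and pass it to the QueryString constructor, then call generate() with a parsed SyntaxTree to obtain the Elasticsearch query string.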

Sources: tests/Galach/Generators/QueryStringTest.php:38-71, tests/Galach/Generators/ExtendedDisMaxTest.php:210-222


Comparison with ExtendedDisMax

The QueryString and ExtendedDisMax generators share nearly identical implementations, differing only in the escaping logic of the Word visitor.

Similarities

  • Same visitor collection (except Word visitor)
  • Identical field mapping mechanism
  • Same handling of logical operators, grouping, phrases
  • Same translation of special syntax (#tag, @user)

Key Difference

Aspect               ExtendedDisMax              QueryString
Word Visitor         Lucene\ExtendedDisMax\Word  Lucene\QueryString\Word
Escape Pattern       /(\+|-|&&|\\…
Extra Escaped Chars  None                        =, >, <
Documentation Link   Lucene 5.0 QueryParser      Lucene 6.5 QueryParser

This design allows maximum code reuse while accommodating backend-specific escaping requirements.

Sources: lib/Languages/Galach/Generators/Lucene/ExtendedDisMax/Word.php:19-26, lib/Languages/Galach/Generators/Lucene/QueryString/Word.php:19-26