URL: https://deepwiki.com/netgen/query-translator/5-query-generation

Query Generation

Query Generation is the final stage in the QueryTranslator processing pipeline. After a query string has been tokenized and parsed into an Abstract Syntax Tree (AST), the generator component converts this AST into query formats that can be understood by different search backends like Solr and Elasticsearch. This page documents the Query Generation subsystem architecture, components, and functionality.

For information about the tokenization process, see Tokenization, and for details on parsing, see The Galach Language.

Generator Architecture

The Query Generation subsystem is built on the Visitor pattern: a visitor walks the nodes of the syntax tree and produces the query string for the target backend.


Sources: lib/Languages/Galach/Generators/ExtendedDisMax.php:1-37, lib/Languages/Galach/Generators/QueryString.php:1-37

Generator Types

The QueryTranslator provides three main types of generators:

  1. Native Generator - Produces output in the original Galach query language format
  2. Solr ExtendedDisMax Generator - Generates queries compatible with Solr's ExtendedDisMax query parser
  3. Elasticsearch QueryString Generator - Produces queries in Elasticsearch's Query String format

Each generator uses the same fundamental approach but produces different output formats suited to the target search backend.


Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:1-262, tests/Galach/Generators/QueryStringTest.php:1-73

Visitor Pattern Implementation

Each visitor handles one kind of syntax tree node and renders it as part of the query string. This separates the rendering algorithm from the tree structure it operates on, so support for a new node type or backend can be added by writing visitors rather than changing the tree classes.
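To make the dispatch concrete, here is a minimal Python sketch of an aggregate visitor that delegates each node to the first sub-visitor accepting it. It is an illustration of the pattern only, not the library's PHP API: the node classes, the accept()/visit() method names, and the Aggregate class below are all hypothetical stand-ins.

```python
from dataclasses import dataclass


# Hypothetical node types standing in for syntax tree nodes.
@dataclass
class Word:
    text: str


@dataclass
class Mandatory:
    operand: object


@dataclass
class LogicalAnd:
    left: object
    right: object


class WordVisitor:
    def accept(self, node):
        return isinstance(node, Word)

    def visit(self, node, sub_visitor):
        return node.text


class MandatoryVisitor:
    def accept(self, node):
        return isinstance(node, Mandatory)

    def visit(self, node, sub_visitor):
        # Child nodes are rendered through the aggregate, not directly.
        return "+" + sub_visitor.visit(node.operand)


class LogicalAndVisitor:
    def accept(self, node):
        return isinstance(node, LogicalAnd)

    def visit(self, node, sub_visitor):
        return sub_visitor.visit(node.left) + " AND " + sub_visitor.visit(node.right)


class Aggregate:
    """Dispatches each node to the first visitor that accepts it."""

    def __init__(self, visitors):
        self.visitors = list(visitors)

    def visit(self, node):
        for visitor in self.visitors:
            if visitor.accept(node):
                return visitor.visit(node, self)
        raise LookupError(f"no visitor accepts {type(node).__name__}")


tree = LogicalAnd(Mandatory(Word("one")), Word("two"))
aggregate = Aggregate([WordVisitor(), MandatoryVisitor(), LogicalAndVisitor()])
print(aggregate.visit(tree))  # +one AND two
```

Because every visitor recurses through the aggregate, supporting a new node type in this sketch only means adding one more visitor to the list.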


Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:227-261, tests/Galach/Generators/QueryStringTest.php:37-70

Solr ExtendedDisMax Generator

The ExtendedDisMax Generator produces queries compatible with Solr's Extended DisMax query parser. It handles field mappings, logical operators, and special characters according to Solr's syntax requirements.

Key Components

The ExtendedDisMax generator contains:

  1. A visitor object (typically an Aggregate visitor)
  2. A generate() method that accepts a syntax tree and returns a formatted query string
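The shape described by these two components can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the Generator and TupleVisitor classes, and the nested-tuple tree format, are hypothetical and do not reproduce the PHP classes in the repository.

```python
class Generator:
    """Holds one visitor and exposes a single generate() entry point."""

    def __init__(self, visitor):
        self._visitor = visitor

    def generate(self, syntax_tree):
        # The generator adds no logic of its own; the visitor does the rendering.
        return self._visitor.visit(syntax_tree)


class TupleVisitor:
    """Stand-in visitor for trees built from nested tuples (hypothetical)."""

    def __init__(self, and_keyword="AND"):
        self._and = and_keyword

    def visit(self, node):
        kind = node[0]
        if kind == "word":
            return node[1]
        if kind == "and":
            return f"{self.visit(node[1])} {self._and} {self.visit(node[2])}"
        raise ValueError(f"unknown node kind: {kind}")


tree = ("and", ("word", "one"), ("word", "two"))
print(Generator(TupleVisitor()).generate(tree))      # one AND two
print(Generator(TupleVisitor("&&")).generate(tree))  # one && two
```

Keeping the generator this thin means the output format is decided entirely by which visitor is injected.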

Visitor Configuration for Solr

The ExtendedDisMax generator typically uses the following visitors:

Visitor       Purpose
Prohibited    Handles -term syntax
Group         Processes grouped expressions (terms)
Mandatory     Handles +term syntax
LogicalAnd    Processes AND and && operators
LogicalNot    Processes NOT and ! operators
LogicalOr     Processes OR and || operators
Phrase        Handles quoted phrases
Query         Processes the root query node
Tag           Handles #tag syntax
User          Handles @user syntax
Word          Processes simple word terms
Sources: lib/Languages/Galach/Generators/ExtendedDisMax.php:1-37, tests/Galach/Generators/ExtendedDisMaxTest.php:227-261

Elasticsearch QueryString Generator

The QueryString Generator produces queries compatible with Elasticsearch's Query String Query format. It follows the same structure as the ExtendedDisMax generator and adds handling for Elasticsearch-specific syntax.

Key Components

Like the ExtendedDisMax generator, the QueryString generator:

  1. Contains a visitor object
  2. Implements a generate() method

Elasticsearch-Specific Handling

The QueryString generator handles additional special characters specific to Elasticsearch:

  • = (equals sign)
  • > (greater than)
  • < (less than)

These characters must be properly escaped in Elasticsearch queries but not necessarily in Solr queries.
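The difference between the two backends can be sketched as two escaping functions that share a base character set. This is a rough Python illustration, not the library's implementation: the function names are hypothetical, and the character sets are assumptions based on the reserved characters commonly listed for Lucene-style query syntax, with =, > and < added for Elasticsearch.

```python
# Assumed reserved characters for Solr (Lucene-style syntax).
SOLR_RESERVED = set('\\+-!():^[]"{}~*?|&/')

# Assumed Elasticsearch set: the same characters plus =, < and >.
ES_RESERVED = SOLR_RESERVED | set("=<>")


def escape(term, reserved):
    """Prefix every reserved character in term with a backslash."""
    return "".join("\\" + ch if ch in reserved else ch for ch in term)


def escape_solr(term):
    return escape(term, SOLR_RESERVED)


def escape_elasticsearch(term):
    return escape(term, ES_RESERVED)


print(escape_solr("a=b"))           # a=b
print(escape_elasticsearch("a=b"))  # a\=b
print(escape_solr("C:\\temp"))      # C\:\\temp
```

In this sketch the only difference between the backends is the set of characters passed in, so the escaping logic itself is shared.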

Sources: lib/Languages/Galach/Generators/QueryString.php:1-37, tests/Galach/Generators/QueryStringTest.php:1-73

Query Generation Process

Generation starts when the generator's generate() method receives a syntax tree. The generator hands the tree to its visitor, which walks the nodes and assembles the backend-specific query string.


Sources: lib/Languages/Galach/Generators/ExtendedDisMax.php:33-36, lib/Languages/Galach/Generators/QueryString.php:33-36

Field Handling and Mapping

One important function of the generators is mapping domain fields in the query to actual field names in the search backend. This is done through field mapping configuration provided to the appropriate visitors.
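As an illustration of the idea, the following Python sketch maps a domain in the query to a backend field name before rendering a term. The function names, the example field names, and the fallback rules (no prefix for an empty domain, a default field for unmapped domains) are assumptions made for this example, not behavior confirmed from the library.

```python
def resolve_field(domain, field_map, default_field=None):
    """Return the backend field for a query domain, or None for no prefix."""
    if domain == "":
        # Assumption: a term without a domain is rendered without a field.
        return None
    # Assumption: unmapped domains fall back to the default field.
    return field_map.get(domain, default_field)


def render_word(domain, word, field_map, default_field=None):
    field = resolve_field(domain, field_map, default_field)
    return word if field is None else f"{field}:{word}"


# Hypothetical mapping from query domains to backend field names.
field_map = {"title": "title_t", "body": "body_t"}

print(render_word("title", "hello", field_map, "text_t"))   # title_t:hello
print(render_word("", "hello", field_map, "text_t"))        # hello
print(render_word("author", "hello", field_map, "text_t"))  # text_t:hello
```

Keeping the mapping in configuration means user-facing domain names can stay stable while backend field names change.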


Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:227-261, tests/Galach/Generators/QueryStringTest.php:37-70

Special Character Handling

Both ExtendedDisMax and QueryString generators handle special characters by properly escaping them according to the requirements of their respective search backends.


For example, the backslash character (\) in a query is escaped as \\ in Solr's ExtendedDisMax but may require different escaping in Elasticsearch's Query String format.

Sources: tests/Galach/Generators/ExtendedDisMaxTest.php:117-200, tests/Galach/Generators/QueryStringTest.php:17-30

Integration with Query Processing Pipeline

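Query generation is the final step in the QueryTranslator processing pipeline: the query string is tokenized, the tokens are parsed into a syntax tree, and the generator renders that tree as a backend-specific query string.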


Sources: lib/Languages/Galach/Generators/ExtendedDisMax.php:1-37, lib/Languages/Galach/Generators/QueryString.php:1-37

Summary

The Query Generation subsystem in QueryTranslator provides a flexible, extensible framework for converting abstract syntax trees into backend-specific query formats. By leveraging the Visitor pattern and a modular architecture, it allows for customization and extension to support different search backends and query formats.