URL: https://deepwiki.com/netgen/query-translator/5.5-lucene-generators-common-components

Lucene Generators Common Components

Purpose and Scope

This document describes the components shared by the Lucene-based query generators in the QueryTranslator library. Both the Solr ExtendedDisMax generator (covered in Solr ExtendedDisMax Generator) and the Elasticsearch QueryString generator (covered in Elasticsearch QueryString Generator) build on them. The common components are primarily visitor implementations that traverse the syntax tree and generate Lucene-compatible query strings.

Sources: lib/Languages/Galach/Generators/Lucene/Common/Group.php, lib/Languages/Galach/Generators/Lucene/Common/WordBase.php, lib/Languages/Galach/Generators/Lucene/Common/Phrase.php

Common Components Architecture

The Lucene generators use the visitor pattern to traverse the abstract syntax tree (AST) and generate backend-specific query strings. The common components live under Lucene/Common (lib/Languages/Galach/Generators/Lucene/Common) and are reused across the Lucene-based generators.

Figure: Common Components Hierarchy

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:5-20, lib/Languages/Galach/Generators/Lucene/Common/Group.php, lib/Languages/Galach/Generators/Lucene/Common/WordBase.php, lib/Languages/Galach/Generators/Lucene/Common/Phrase.php

Visitor Pattern Implementation

Each visitor handles one specific type of node in the syntax tree and transforms it into its Lucene query string form. The generated string for a whole query is assembled from the strings produced for its individual nodes.

Figure: Visitor Query Generation Flow

Sources: tests/Galach/Generators/AggregateVisitorDispatchTest.php, tests/Galach/Generators/LuceneVisitorDispatchTest.php

The Aggregate Visitor

The Aggregate visitor serves as a coordinator, dispatching nodes to the appropriate visitor implementations. It accepts all node types and delegates the visit operation to the first visitor that accepts the node.

Key characteristics:

  • The Aggregate visitor's accept() method always returns true
  • It maintains a collection of visitors that can handle specific node types
  • If no visitor is found for a node, it throws a RuntimeException

Sources: tests/Galach/Generators/AggregateVisitorDispatchTest.php:15-32
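
The dispatch behaviour described above can be sketched in a few lines. This is a minimal Python illustration rather than the library's PHP code: the node classes, the two leaf visitors and the error message are hypothetical stand-ins, and Python's RuntimeError plays the role of PHP's RuntimeException.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """Hypothetical stand-in for a word term node."""
    text: str


@dataclass
class Phrase:
    """Hypothetical stand-in for a phrase term node."""
    text: str


class WordVisitor:
    def accept(self, node):
        return isinstance(node, Word)

    def visit(self, node, sub_visitor=None):
        return node.text


class PhraseVisitor:
    def accept(self, node):
        return isinstance(node, Phrase)

    def visit(self, node, sub_visitor=None):
        return f'"{node.text}"'


class Aggregate:
    """Dispatches each node to the first registered visitor that accepts it."""

    def __init__(self, visitors):
        self.visitors = list(visitors)

    def accept(self, node):
        # The aggregate claims every node; the real check happens in visit().
        return True

    def visit(self, node, sub_visitor=None):
        for visitor in self.visitors:
            if visitor.accept(node):
                # Pass the aggregate along so nested nodes can be dispatched too.
                return visitor.visit(node, self)
        raise RuntimeError(f"No visitor available for {type(node).__name__}")


aggregate = Aggregate([WordVisitor(), PhraseVisitor()])
print(aggregate.visit(Word("lucene")))          # lucene
print(aggregate.visit(Phrase("query parser")))  # "query parser"
```

Because the aggregate hands itself to the chosen visitor as the sub-visitor, visitors for composite nodes can recurse into their children without knowing which concrete visitors are registered.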

Common Term Visitors

Term visitors handle the transformation of term nodes (words, phrases, tags, and users) in the AST to their Lucene query equivalents.

WordBase

WordBase is an abstract base class for word visitors. It provides common functionality for handling domain mapping and field prefixing but leaves the word escaping logic to be implemented by concrete subclasses.

Key features:

  • Domain to field mapping
  • Default field name handling
  • Abstract escapeWord() method to be implemented by subclasses

Sources: lib/Languages/Galach/Generators/Lucene/Common/WordBase.php:14-98
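
The split between shared prefixing and backend-specific escaping is a template-method arrangement. The Python sketch below illustrates the idea only: the token class, the method names and the set of escaped characters are assumptions, not the library's PHP API.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class WordToken:
    """Hypothetical token: a word with an optional domain (as in title:lucene)."""
    word: str
    domain: str = ""


class WordBase(ABC):
    def __init__(self, domain_field_map=None, default_field_name=None):
        self.domain_field_map = domain_field_map or {}
        self.default_field_name = default_field_name

    @abstractmethod
    def escape_word(self, word):
        """Backend-specific escaping, supplied by each concrete generator."""

    def field_prefix(self, token):
        # No domain in the input means no field prefix in the output.
        if token.domain == "":
            return ""
        name = self.domain_field_map.get(token.domain, self.default_field_name)
        return "" if name is None else f"{name}:"

    def visit(self, token):
        # Shared logic lives here; only the escaping step varies per subclass.
        return f"{self.field_prefix(token)}{self.escape_word(token.word)}"


class BackslashEscapingWord(WordBase):
    """Illustrative subclass; the escaped character set is an assumption."""

    def escape_word(self, word):
        return re.sub(r'([\\+\-!(){}\[\]^"~*?:/])', r"\\\1", word)


visitor = BackslashEscapingWord({"lang": "language_s"}, "text_t")
print(visitor.visit(WordToken("C++", "lang")))  # language_s:C\+\+
print(visitor.visit(WordToken("lucene")))       # lucene
```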

Phrase Visitor

The Phrase visitor handles phrase tokens, generating queries with quoted strings.

Key features:

  • Handles domain to field mapping
  • Escapes special characters in phrases
  • Adds quotation marks around phrases

Sources: lib/Languages/Galach/Generators/Lucene/Common/Phrase.php:14-89
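
A short Python sketch of the escape-then-quote step follows. It assumes that only the backslash and the double quote need escaping inside a quoted phrase; the function names and the optional field_name parameter are hypothetical and stand in for the resolved field prefix.

```python
def escape_phrase(phrase):
    # Assumption: inside quotes only the backslash and the quote itself are escaped.
    # The backslash is handled first so that added escapes are not doubled.
    return phrase.replace("\\", "\\\\").replace('"', '\\"')


def generate_phrase(phrase, field_name=None):
    """Render a phrase as a quoted string, optionally prefixed with a field name."""
    prefix = f"{field_name}:" if field_name else ""
    return f'{prefix}"{escape_phrase(phrase)}"'


print(generate_phrase("query parser"))         # "query parser"
print(generate_phrase('say "hi"', "title_t"))  # title_t:"say \"hi\""
```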

Tag and User Visitors

The Tag and User visitors handle special term types (#tag and @user), following similar patterns to the Word and Phrase visitors.

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:82-90

Common Structural Visitors

Structural visitors handle the organization of query components.

Group Visitor

The Group visitor handles group nodes, which represent parenthesized expressions in the query.

Key features:

  • Processes all child nodes using a sub-visitor
  • Adds parentheses around the generated query string
  • Handles domain-to-field mapping for prefixing groups

Sources: lib/Languages/Galach/Generators/Lucene/Common/Group.php:14-91
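
The way a group delegates to a sub-visitor and then wraps the result can be sketched as follows. This is an illustrative Python outline under stated assumptions: the node classes and the small Dispatcher (standing in for the aggregate visitor) are hypothetical, and joining child clauses with a single space is a choice made for the sketch.

```python
from dataclasses import dataclass


@dataclass
class Term:
    """Hypothetical leaf node holding already-escaped text."""
    text: str


@dataclass
class Group:
    """Hypothetical group node: child nodes plus an optional domain."""
    nodes: list
    domain: str = ""


class Dispatcher:
    """Tiny stand-in for the aggregate visitor, used here as the sub-visitor."""

    def __init__(self, group_visitor):
        self.group_visitor = group_visitor

    def visit(self, node, sub_visitor=None):
        if isinstance(node, Group):
            return self.group_visitor.visit(node, self)
        return node.text


class GroupVisitor:
    def __init__(self, domain_field_map=None, default_field_name=None):
        self.domain_field_map = domain_field_map or {}
        self.default_field_name = default_field_name

    def visit(self, node, sub_visitor):
        # Each child is rendered by the sub-visitor, so nested groups recurse.
        clauses = [sub_visitor.visit(child, sub_visitor) for child in node.nodes]
        prefix = ""
        if node.domain:
            name = self.domain_field_map.get(node.domain, self.default_field_name)
            prefix = f"{name}:" if name else ""
        return f"{prefix}({' '.join(clauses)})"


dispatcher = Dispatcher(GroupVisitor({"title": "title_t"}))
query = Group([Term("one"), Group([Term("two"), Term("three")])], domain="title")
print(dispatcher.visit(query))  # title_t:(one (two three))
```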

Query Visitor

The Query visitor is the entry point for processing the entire query node.

Key features:

  • Processes all child nodes
  • Concatenates results with spaces
  • Serves as the root visitor for a query

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:77-80

Common Logical Operator Visitors

Logical operator visitors transform logical relationships in the AST into Lucene query syntax.

LogicalAnd, LogicalOr, and LogicalNot Visitors

These visitors handle the corresponding logical operations:

  • LogicalAnd: Generates AND or && operator syntax
  • LogicalOr: Generates OR or || operator syntax
  • LogicalNot: Generates NOT or ! operator syntax

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:46-55
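
As a rough illustration, the three logical visitors can be compressed into one recursive Python function with a branch per node type. The node classes are hypothetical, and the sketch assumes the keyword forms AND, OR and NOT are emitted; Lucene query syntax also accepts &&, || and ! as equivalents.

```python
from dataclasses import dataclass


@dataclass
class Term:
    text: str


@dataclass
class LogicalAnd:
    left: object
    right: object


@dataclass
class LogicalOr:
    left: object
    right: object


@dataclass
class LogicalNot:
    operand: object


def generate(node):
    """Render a node recursively; each branch plays the role of one visitor."""
    if isinstance(node, Term):
        return node.text
    if isinstance(node, LogicalAnd):
        return f"{generate(node.left)} AND {generate(node.right)}"
    if isinstance(node, LogicalOr):
        return f"{generate(node.left)} OR {generate(node.right)}"
    if isinstance(node, LogicalNot):
        return f"NOT {generate(node.operand)}"
    raise TypeError(f"Unsupported node: {type(node).__name__}")


print(generate(LogicalAnd(Term("one"), LogicalNot(Term("two")))))  # one AND NOT two
print(generate(LogicalOr(Term("one"), Term("two"))))               # one OR two
```

The sketch adds no parentheses of its own; grouping is expected to come from explicit group nodes in the tree.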

Mandatory and Prohibited Visitors

These visitors handle special unary operators:

  • Mandatory: Generates + prefix for required terms
  • Prohibited: Generates - prefix for excluded terms

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:62-75
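
The prefixing performed by these two visitors is simple enough to show directly. The Python sketch below uses hypothetical node classes and a single function in place of separate visitor classes.

```python
from dataclasses import dataclass


@dataclass
class Term:
    text: str


@dataclass
class Mandatory:
    operand: object


@dataclass
class Prohibited:
    operand: object


def generate(node):
    """Render a term, adding + or - when it is wrapped in a unary operator node."""
    if isinstance(node, Term):
        return node.text
    if isinstance(node, Mandatory):
        return f"+{generate(node.operand)}"
    if isinstance(node, Prohibited):
        return f"-{generate(node.operand)}"
    raise TypeError(f"Unsupported node: {type(node).__name__}")


nodes = [Mandatory(Term("one")), Prohibited(Term("two")), Term("three")]
print(" ".join(generate(node) for node in nodes))  # +one -two three
```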

Domain Field Mapping

Many of the common components support domain field mapping, a mechanism that maps domain names used in the input query to field names in the backend search engine.

How Domain Field Mapping Works


Domain field mapping allows for:

  1. Customization of which fields are searched
  2. Mapping between user-friendly domain names and internal field names
  3. Providing default field names when no mapping exists

Sources: lib/Languages/Galach/Generators/Lucene/Common/Group.php:21-41, lib/Languages/Galach/Generators/Lucene/Common/WordBase.php:21-41, lib/Languages/Galach/Generators/Lucene/Common/Phrase.php:21-41
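
The three cases of domain field mapping (mapped domain, unmapped domain with a default, and no domain at all) can be illustrated with a small Python function. The function name, the map contents and the field names are made up for the example; treating an empty domain as "no prefix" is an assumption of the sketch.

```python
def field_prefix(domain, domain_field_map, default_field_name=None):
    """Return the 'field:' prefix for a domain, or an empty string if none applies."""
    if domain == "":
        # The user did not qualify the term, so nothing is prefixed.
        return ""
    name = domain_field_map.get(domain, default_field_name)
    return f"{name}:" if name else ""


# Hypothetical mapping from user-facing domains to backend field names.
domain_field_map = {"title": "title_t", "author": "author_s"}

print(field_prefix("title", domain_field_map, "text_t") + "lucene")  # title_t:lucene
print(field_prefix("body", domain_field_map, "text_t") + "lucene")   # text_t:lucene
print(field_prefix("", domain_field_map, "text_t") + "lucene")       # lucene
```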

Visitor Dispatch Logic

The visitors implement strict type checking so that they process only the nodes and tokens they are designed for. When a visitor receives an incompatible node, it throws a LogicException with a descriptive message.

Example exceptions:

  • "Implementation accepts instance of Group Node"
  • "Implementation accepts instance of Term Node"
  • "Implementation accepts instance of Word Token"
  • "Implementation requires sub-visitor"

This ensures that type errors are caught early and makes debugging easier when visitors are used incorrectly.

Sources: tests/Galach/Generators/LuceneVisitorDispatchTest.php:36-223
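
A guarded visit method of this kind might look like the Python sketch below. The classes are hypothetical, and a custom LogicError exception stands in for PHP's LogicException; only the message strings are taken from the list above.

```python
from dataclasses import dataclass


class LogicError(Exception):
    """Stand-in for PHP's LogicException in this sketch."""


@dataclass
class WordToken:
    word: str


@dataclass
class PhraseToken:
    phrase: str


@dataclass
class Term:
    token: object


@dataclass
class Group:
    nodes: list


class WordVisitor:
    def accept(self, node):
        return isinstance(node, Term) and isinstance(node.token, WordToken)

    def visit(self, node, sub_visitor=None):
        # visit() re-checks its input even though accept() exists, so that a
        # caller who bypasses the dispatcher still gets a clear error.
        if not isinstance(node, Term):
            raise LogicError("Implementation accepts instance of Term Node")
        if not isinstance(node.token, WordToken):
            raise LogicError("Implementation accepts instance of Word Token")
        return node.token.word


visitor = WordVisitor()
print(visitor.visit(Term(WordToken("lucene"))))  # lucene

for bad_node in (Group([]), Term(PhraseToken("query parser"))):
    try:
        visitor.visit(bad_node)
    except LogicError as error:
        print(error)
```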