URL: https://deepwiki.com/netgen/query-translator/3.1-syntax-and-features

Syntax and Features

This page documents the complete syntax of the Galach query language: term types, operators, grouping constructs, domain prefixes, operator precedence rules, and special character handling. Use it as a reference for which queries are valid Galach.

For information about how this syntax is tokenized into Token objects, see Tokenization. For details on how tokens are parsed into a SyntaxTree, see Parser. For the overall language introduction, see The Galach Language.

Syntax Element to Token Type Mapping

The following diagram maps each syntactic element in Galach to its corresponding token type constant defined in the Tokenizer class:


Sources: lib/Languages/Galach/README.md:260-387, lib/Languages/Galach/SYNTAX.md:1-289

Terms

Terms represent the fundamental search elements in Galach. All term types are recognized as Tokenizer::TOKEN_TERM but are instantiated as different Token subclasses based on their syntax.

Term Types

Term Type  Token Class          Syntax Pattern                                    Example                            Domain Support
---------  -------------------  ------------------------------------------------  ---------------------------------  --------------------
Word       Values/Token/Word    Unquoted sequence without whitespace              word, another\ word                Yes
Phrase     Values/Token/Phrase  Text enclosed in double quotes "..."              "reality exists"                   Yes
Tag        Values/Token/Tag     # followed by alphanumeric/underscore/hyphen/dot  #php, #PHP-7.1, #query_parser      No (implicit domain)
User       Values/Token/User    @ followed by alphanumeric/underscore/hyphen/dot  @joe.watt, @_alice83, @The-Ronald  No (implicit domain)

Sources: lib/Languages/Galach/SYNTAX.md:3-66, lib/Languages/Galach/README.md:263-287

Word Terms

Word terms are defined as strings not containing whitespace. Whitespace can be included if explicitly escaped with a backslash \.

word
another\ word

Sources: lib/Languages/Galach/SYNTAX.md:5-12
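
The whitespace rule for Word terms can be illustrated with a short Python sketch. This is not the library's PHP tokenizer: the name split_words is hypothetical, and the sketch ignores phrases, groups, and operators. It only shows that a backslash-escaped space stays inside a word.

```python
import re
from typing import List

# A chunk is a run of escaped characters (backslash plus any character)
# or characters that are neither whitespace nor a backslash.
WORD_CHUNK = re.compile(r"(?:\\.|[^\s\\])+")


def split_words(query: str) -> List[str]:
    """Split on whitespace unless the whitespace is escaped with a backslash."""
    return WORD_CHUNK.findall(query)


# The escaped space keeps "another\ word" together as one chunk.
print(split_words(r"word another\ word"))
```

Matching escaped pairs explicitly, rather than using a lookbehind for a backslash, keeps a doubled backslash followed by a space from being mistaken for an escaped space.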

Phrase Terms

Phrase terms are formed by enclosing text within double quotation marks ". Phrases can contain multiple words and spaces without escaping.

"reality exists"
"what's not real doesn't exist"

Sources: lib/Languages/Galach/SYNTAX.md:14-21

User Terms

User terms are defined by the leading @ character, followed by at least one alphanumeric or underscore character, then an arbitrary sequence of alphanumeric characters, hyphens, underscores, and dots.

Regular expression pattern:

@[a-zA-Z0-9_][a-zA-Z0-9_\-.]*

Examples:

@joe.watt
@_alice83
@The-Ronald

Sources: lib/Languages/Galach/SYNTAX.md:23-43

Tag Terms

Tag terms are defined by the leading # character, followed by at least one alphanumeric or underscore character, then an arbitrary sequence of alphanumeric characters, hyphens, underscores, and dots.

Regular expression pattern:

\#[a-zA-Z0-9_][a-zA-Z0-9_\-.]*

Examples:

#php
#PHP-7.1
#query_parser

Sources: lib/Languages/Galach/SYNTAX.md:45-65
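
To try the User and Tag patterns against sample input, the two regular expressions can be dropped into a small Python check. The classify function is a hypothetical helper for illustration only; the library itself is written in PHP and embeds these patterns in larger expressions.

```python
import re

# Patterns copied from the User and Tag definitions above.
USER = re.compile(r"@[a-zA-Z0-9_][a-zA-Z0-9_\-.]*")
TAG = re.compile(r"\#[a-zA-Z0-9_][a-zA-Z0-9_\-.]*")


def classify(term: str) -> str:
    """Return 'user', 'tag', or 'other' for a single whitespace-free term."""
    if USER.fullmatch(term):
        return "user"
    if TAG.fullmatch(term):
        return "tag"
    return "other"


for term in ["@joe.watt", "#PHP-7.1", "@-dash", "word"]:
    print(term, "->", classify(term))
```

Note that "@-dash" is not a User term under this pattern, because the first character after the marker must be alphanumeric or an underscore.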

Operators

Operators combine or modify terms to express more complex search logic. Galach supports both binary operators (requiring left and right operands) and unary operators (requiring a single right operand).

Operator Token Types

Operator       Forms    Token Constant       Arity   Whitespace Required  Example
-------------  -------  -------------------  ------  -------------------  ----------------
Logical AND    AND, &&  TOKEN_LOGICAL_AND    Binary  Yes                  coffee AND milk
Logical OR     OR, ||   TOKEN_LOGICAL_OR     Binary  Yes                  potato OR tomato
Logical NOT    NOT      TOKEN_LOGICAL_NOT    Unary   Yes                  NOT important
Shorthand NOT  !        TOKEN_LOGICAL_NOT_2  Unary   No (adjacent)        !important
Mandatory      +        TOKEN_MANDATORY      Unary   No (adjacent)        +coffee
Prohibited     -        TOKEN_PROHIBITED     Unary   No (adjacent)        -cake

Sources: lib/Languages/Galach/SYNTAX.md:67-128, lib/Languages/Galach/README.md:298-354

Logical AND Operator

The Logical and binary operator combines left and right operands so that both must match. It is represented as either AND or && and must be separated from its operands by whitespace.

coffee AND milk
tea && lemon

Sources: lib/Languages/Galach/SYNTAX.md:71-83

Logical OR Operator

The Logical or binary operator combines left and right operands so that at least one must match. It is represented as either OR or || and must be separated from its operands by whitespace.

potato OR tomato
true || false

Sources: lib/Languages/Galach/SYNTAX.md:85-97

Logical NOT Operator

The Logical not unary operator modifies its operand so that it must not match. It comes in two forms:

  1. Long form NOT: Must be separated from its operand by whitespace

    NOT important
    
  2. Shorthand form !: Must be adjacent to its operand (no whitespace)

    !important
    

The shorthand form is recognized as Tokenizer::TOKEN_LOGICAL_NOT_2, while the long form is Tokenizer::TOKEN_LOGICAL_NOT.

Sources: lib/Languages/Galach/SYNTAX.md:99-113, lib/Languages/Galach/README.md:316-336

Mandatory Operator

The Mandatory unary operator modifies its operand so that it must match. It is represented by a plus sign + and must be placed adjacent to its operand (recognized as Tokenizer::TOKEN_MANDATORY).

+coffee

Sources: lib/Languages/Galach/SYNTAX.md:115-120, lib/Languages/Galach/README.md:338-345

Prohibited Operator

The Prohibited unary operator modifies its operand so that it must not match. It is represented by a minus sign - and must be placed adjacent to its operand (recognized as Tokenizer::TOKEN_PROHIBITED).

-cake

Sources: lib/Languages/Galach/SYNTAX.md:122-127, lib/Languages/Galach/README.md:347-354

Operator Precedence

Unary operators are applied first, followed by binary operators. Among binary operators, Logical and has higher precedence than Logical or.

Precedence order (highest to lowest):

  1. Unary operators: Logical not, Mandatory, Prohibited
  2. Binary operator: Logical and
  3. Binary operator: Logical or

For example, because of these precedence rules:

one OR NOT two AND three

is equivalent to:

one OR ((NOT two) AND three)

Sources: lib/Languages/Galach/SYNTAX.md:129-137

Grouping

Terms and expressions can be grouped using round brackets ( and ). Groups are processed as a whole, allowing explicit control over evaluation order that would otherwise follow operator precedence. The left delimiter is recognized as Tokenizer::TOKEN_GROUP_BEGIN and the right as Tokenizer::TOKEN_GROUP_END.

Default Precedence vs Explicit Grouping

These two queries are equivalent (following default precedence):

one OR NOT two AND three
one OR ((NOT two) AND three)

But grouping can change the meaning:

(one OR NOT two) AND three
one OR NOT (two AND three)

Sources: lib/Languages/Galach/SYNTAX.md:138-157, lib/Languages/Galach/README.md:356-378
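
The precedence and grouping rules can be sketched as a tiny recursive-descent parser in Python. This is an illustration under simplifying assumptions, not the library's PHP parser: it handles only plain words, AND/&&, OR/||, the long-form NOT, and parentheses, does no error handling, and the function name parse is hypothetical. It returns the query with every operator application parenthesized, which makes the implied grouping visible.

```python
import re

# Parentheses, symbolic operators, or any run of other non-space characters.
TOKEN = re.compile(r"\(|\)|&&|\|\||[^\s()]+")


def parse(query: str) -> str:
    """Return the query with every operator application parenthesized."""
    tokens = TOKEN.findall(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def parse_or():  # lowest precedence
        left = parse_and()
        while peek() in ("OR", "||"):
            take()
            left = f"({left} OR {parse_and()})"
        return left

    def parse_and():  # binds tighter than OR
        left = parse_not()
        while peek() in ("AND", "&&"):
            take()
            left = f"({left} AND {parse_not()})"
        return left

    def parse_not():  # unary, highest precedence
        if peek() == "NOT":
            take()
            return f"(NOT {parse_not()})"
        return parse_primary()

    def parse_primary():
        tok = take()
        if tok == "(":
            inner = parse_or()
            take()  # consume the closing ")"
            return inner
        return tok

    return parse_or()


print(parse("one OR NOT two AND three"))
print(parse("(one OR NOT two) AND three"))
```

Each precedence level gets its own function, and a lower level calls the next higher one for its operands, so unary NOT is resolved before AND, and AND before OR. A group restarts parsing at the lowest level, which is why brackets override the default order.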

Domains

Domains represent abstract categories or fields on which terms or groups apply. A domain is specified by prefixing a term or group with a domain string followed by a colon :.

Domain String Pattern

Regular expression:

[a-zA-Z_][a-zA-Z0-9_\-.]*

Domain strings must start with a letter or an underscore (not a digit), followed by an arbitrary sequence of alphanumeric characters, hyphens -, underscores _, and dots ..

Domain Restrictions

  • Word terms: Domain prefix supported
  • Phrase terms: Domain prefix supported
  • Tag terms: Domain prefix not supported (Tag defines implicit domain)
  • User terms: Domain prefix not supported (User defines implicit domain)
  • Groups: Domain prefix supported

Domain Examples

type:aeroplane
title:"Language processor"
description:(wings AND propeller)

Invalid domain usage (the # or @ marker is treated as escaped, so these read as domain:\#tag and domain:\@user):

domain:#tag
domain:@user

Sources: lib/Languages/Galach/SYNTAX.md:159-185, lib/Languages/Galach/README.md:264-266
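
As a minimal sketch of how the domain string pattern separates a prefix from the rest of a term, the Python snippet below applies the regular expression from this section. The split_domain helper is hypothetical and only looks at a single term in isolation; the actual PHP tokenizer matches domains as part of larger expressions.

```python
import re
from typing import Optional, Tuple

# Domain string pattern from above, then the colon separator and a
# non-empty remainder (a word, a phrase, or a group).
DOMAIN_PREFIX = re.compile(r"([a-zA-Z_][a-zA-Z0-9_\-.]*):(.+)")


def split_domain(term: str) -> Tuple[Optional[str], str]:
    """Return (domain, remainder), or (None, term) when no domain applies."""
    match = DOMAIN_PREFIX.fullmatch(term)
    if match is None:
        return None, term
    return match.group(1), match.group(2)


for term in ["type:aeroplane", 'title:"Language processor"', "word:", "9lives:cat"]:
    print(term, "->", split_domain(term))
```

Because the domain pattern cannot contain a colon, only the first colon acts as the separator, and a term that ends in a colon or starts with a digit yields no domain in this sketch.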

Token and Node Type Mappings

The following diagram shows how token types are parsed into corresponding Node types in the SyntaxTree:


Sources: lib/Languages/Galach/README.md:260-387

Special Characters and Escaping

Characters that are part of the Galach language syntax must be escaped with a backslash \ to be treated as literal characters rather than syntax elements.

Special Characters Requiring Escaping

Character  Name               Purpose in Syntax
---------  -----------------  ----------------------
(          Left parenthesis   Group begin delimiter
)          Right parenthesis  Group end delimiter
+          Plus sign          Mandatory operator
-          Minus sign         Prohibited operator
!          Exclamation mark   Shorthand NOT operator
"          Double quote       Phrase delimiter
#          Hash               Tag marker
@          At sign            User marker
:          Colon              Domain separator
\          Backslash          Escape character
(space)    Space              Whitespace separator

Sources: lib/Languages/Galach/SYNTAX.md:187-217

Escaping Examples

joined\ word
"escaped \"double quote\""
escaped \+operator domain\:word \@user \#tag \(and so on\)
double backslash \\ is a backslash escaped

Sources: lib/Languages/Galach/SYNTAX.md:205-217
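
When user input has to be embedded in a query as a literal Word, every special character listed in this section can be prefixed with a backslash. The Python sketch below shows one way to do that. The escape function is a hypothetical helper, not something the library is known to ship, and it escapes conservatively, including characters that the tokenizer would already treat as literal in some positions.

```python
import re

# The special characters from this section, including the space character.
SPECIAL = re.compile(r'([()+\-!"#@:\\ ])')


def escape(text: str) -> str:
    """Prefix every special character with a backslash."""
    return SPECIAL.sub(r"\\\1", text)


print(escape("joined word"))
print(escape("C++ (draft): intro"))
```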

Automatic Escaping Rules

The tokenizer automatically treats certain special characters as literal (escaped) in specific contexts, eliminating the need for explicit escaping:

Context                Rule                        Example               Equivalent
---------------------  --------------------------  --------------------  ---------------------
Trailing colon         Colon at end of Word        word:                 word\:
Double colon           Colon after domain colon    domain:domain:domain  domain:domain\:domain
Domain on Tag/User     Domain prefix on Tag/User   domain:#tag           domain:\#tag
Operator after domain  +, -, ! after domain colon  domain:+word          domain:\+word
Mid-word operator      +, -, ! in middle of Word   one+two               one\+two
Trailing operator      +, -, ! at end of Word      one+                  one\+

Sources: lib/Languages/Galach/SYNTAX.md:230-289
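
The operator rows of these rules amount to a positional check: +, -, and ! act as operators only when they lead a term, and are literal elsewhere. A minimal Python sketch of that check follows. The split_leading_operator helper is hypothetical and works on one whitespace-free token at a time; how the real tokenizer treats a lone operator character is not covered by the source, so the sketch simply leaves it untouched.

```python
from typing import Optional, Tuple

OPERATORS = "+-!"


def split_leading_operator(token: str) -> Tuple[Optional[str], str]:
    """Split off +, -, or ! only when it starts the token and an operand follows.

    Anywhere else (mid-word, trailing, after a domain colon) the character
    stays part of the word, matching the automatic escaping rules.
    """
    if len(token) > 1 and token[0] in OPERATORS:
        return token[0], token[1:]
    # Assumption: a lone operator character is returned unchanged.
    return None, token


for token in ["+coffee", "one+two", "one+", "domain:+word"]:
    print(token, "->", split_leading_operator(token))
```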

Phrase Escaping

Inside phrase delimiters "...", escaping is only required for the quotation marks themselves. All other special characters are treated as literal:

"+one -two"
"\+one \-two"

Both queries above are interpreted identically because operators inside phrases are always literal.

Sources: lib/Languages/Galach/SYNTAX.md:219-228

Complete Syntax Reference

Syntax Cheat Sheet

The following represents all supported syntax elements in Galach:

word "phrase" (group) +mandatory -prohibited AND && OR || NOT ! #tag @user domain:term

Complete Query Example

cheese AND (bacon OR eggs) +type:breakfast

This query matches documents that contain "cheese" and either "bacon" or "eggs", with the mandatory requirement that the type domain contains "breakfast".

Sources: lib/Languages/Galach/README.md:22-30

TokenExtractor Implementations

The Galach language provides two standard implementations of the TokenExtractor abstract class, each supporting different subsets of the complete syntax:

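Implementation  Path                     Domain Support  Tag/User Support  Use Case
--------------  -----------------------  --------------  ----------------  ----------------------
Full            TokenExtractor/Full.php  Yes             Yes               Complete Galach syntax
Text            TokenExtractor/Text.php  No              No                Simplified text search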

Both implementations define the syntax through regular expressions returned by the getExpressionTypeMap() method and create term tokens in the createTermToken() method.

Sources: lib/Languages/Galach/README.md:422-426

Query Processing Example

The following diagram shows how a complete query string flows through tokenization to produce specific token types:


Sources: lib/Languages/Galach/README.md:28-30, lib/Languages/Galach/README.md:53-78