URL: https://deepwiki.com/hypervel/translation/7-message-selection-and-pluralization

Message Selection and Pluralization

Purpose and Scope

This document covers the MessageSelector class, which implements pluralization logic for over 150 locales and handles message selection based on numeric values. The system provides two complementary selection mechanisms: explicit inline conditions embedded in translation strings, and automatic locale-specific pluralization rules derived from Unicode CLDR data.

For information about how translations are loaded and cached, see Translation Data Loading. For details on the Translator class that invokes the message selector, see Core Translation System.

System Overview

The message selection system determines which variant of a translation string to use based on a numeric value and locale. Translation strings can contain multiple variants separated by pipe characters (|), and the system selects the appropriate variant using either inline conditions or linguistic pluralization rules.


Sources: src/MessageSelector.php, lines 14-31

The MessageSelector Class

The MessageSelector class is a standalone component with no dependencies beyond the Hypervel Support Collection class. It exposes two public methods, choose() and getPluralIndex(). choose() is the main entry point, and the remaining selection logic lives in private helpers.

Method | Visibility | Purpose
choose() | public | Main entry point: selects the translation variant based on number and locale
extract() | private | Attempts to match inline conditions across all segments
extractFromString() | private | Parses and evaluates a single inline condition
stripConditions() | private | Removes inline condition syntax from segments
getPluralIndex() | public | Returns the plural form index for a given locale and number

Sources: src/MessageSelector.php, lines 9-398

Inline Condition Syntax

Inline conditions allow explicit control over message selection, overriding automatic pluralization rules. Conditions are enclosed in curly braces {} for exact matches or square brackets [] for ranges.

Condition Types

Syntax | Example | Matches
{n} | {0} no items | Exactly n
[n,m] | [2,5] a few items | Between n and m inclusive
[n,*] | [10,*] many items | n or greater
[*,m] | [*,5] up to five | m or less

Parsing Logic

The inline condition parser uses a regular expression to extract the condition and its associated text from each segment.


Sources: src/MessageSelector.php, lines 50-77
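As a rough illustration of this parsing step, the sketch below splits one segment into its condition body and its text. The regular expression and the helper name parseInlineCondition() are assumptions made for the example; they are not copied from src/MessageSelector.php.

```php
<?php

// Hypothetical helper: split "{0} No items" into ['0', ' No items'].
// Returns null when the segment does not start with a {...} or [...] condition.
function parseInlineCondition(string $segment): ?array
{
    // Group 1: condition body without its brackets; group 2: the remaining text.
    // The /s flag lets "." also match newlines in multi-line translations.
    $pattern = '/^[\{\[]([^\[\]\{\}]*)[\}\]](.*)/s';

    if (preg_match($pattern, $segment, $matches) !== 1) {
        return null;
    }

    return [$matches[1], $matches[2]];
}

var_dump(parseInlineCondition('{0} No items'));
var_dump(parseInlineCondition('[2,5] A few items'));
var_dump(parseInlineCondition('One item'));
```

The text keeps its leading space here; a caller would typically trim it before returning the selected message.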

Range Evaluation

Range conditions support three types of comparisons:

  1. Closed Range [n,m]: Number must be between n and m (inclusive)
  2. Open-Ended Upper [n,*]: Number must be n or greater
  3. Open-Ended Lower [*,m]: Number must be m or less

The evaluation logic checks for the presence of the wildcard character * and applies the appropriate comparison:

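if (str_contains($condition, ',')) {
    // Range condition: split "from,to" into its two bounds
    [$from, $to] = explode(',', $condition, 2);

    if ($to === '*' && $number >= $from) {
        return $value;  // [n,*]: n or greater
    }

    if ($from === '*' && $number <= $to) {
        return $value;  // [*,m]: m or less
    }

    if ($number >= $from && $number <= $to) {
        return $value;  // [n,m]: inclusive range
    }
} elseif ($condition == $number) {
    return $value;  // {n}: exact match
}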

Sources: src/MessageSelector.php, lines 62-76
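To make the three range forms and the exact match concrete, here is a small self-contained sketch. conditionMatches() is a hypothetical helper, and it casts the bounds to integers for clarity, which is a simplification rather than a statement about how the library compares values. It requires PHP 8 for str_contains().

```php
<?php

// Hypothetical helper: does $number satisfy a condition body
// such as "0", "2,5", "10,*" or "*,5"?
function conditionMatches(string $condition, int $number): bool
{
    if (str_contains($condition, ',')) {
        [$from, $to] = explode(',', $condition, 2);

        if ($to === '*') {
            return $number >= (int) $from;  // [n,*]: n or greater
        }

        if ($from === '*') {
            return $number <= (int) $to;    // [*,m]: m or less
        }

        // [n,m]: both bounds are inclusive
        return $number >= (int) $from && $number <= (int) $to;
    }

    return (int) $condition === $number;    // {n}: exact match
}

foreach ([['0', 0], ['2,5', 5], ['2,5', 6], ['10,*', 250], ['*,5', 6]] as [$condition, $n]) {
    echo $condition, ' vs ', $n, ': ', conditionMatches($condition, $n) ? 'match' : 'no match', PHP_EOL;
}
```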

Locale-Specific Pluralization Rules

When no inline condition matches, the system falls back to locale-specific pluralization rules implemented in getPluralIndex(). These rules determine which segment to select based on linguistic plurality rules for 150+ locales.

Linguistic Categories

The system implements six major categories of pluralization rules:

Category 1: No Plural Distinction

Languages with no grammatical plural forms always return index 0, regardless of the number.

Locales: az, bo, dz, id, ja, jv, ka, km, kn, ko, ms, th, tr, vi, zh (and variants)

return 0;

Sources: src/MessageSelector.php, lines 98-133

Category 2: Simple Singular/Plural

Many languages, including most Germanic languages and several Romance ones, distinguish only between singular (n=1) and plural (n≠1).

Locales: en, de, es, pt, it, nl, sv, da, no, af, bg, ca, el, et, eu, fi, fo, fur, fy, gl, gu, ha, he, hu, is, ku, lb, ml, mn, mr, nah, nb, ne, nn, om, or, pa, pap, ps, so, sq, sw, ta, te, tk, ur, zu (and variants)

return ($number == 1) ? 0 : 1;

Sources: src/MessageSelector.php, lines 134-296

Category 3: Zero/One vs Others

Languages like French treat zero and one similarly, distinct from other numbers.

Locales: am, bh, fil, fr, gun, hi, hy, ln, mg, nso, ti, wa, xbr (and variants)

return (($number == 0) || ($number == 1)) ? 0 : 1;

Sources: src/MessageSelector.php, lines 297-325

Category 4: Slavic Complex Rules

Slavic languages employ complex modulo-based rules with 3 or more forms.

Example - Russian/Ukrainian/Belarusian:

  • Form 0: Numbers ending in 1 (but not 11): 1, 21, 31, ...
  • Form 1: Numbers ending in 2-4 (but not 12-14): 2, 3, 4, 22, 23, 24, ...
  • Form 2: Everything else: 0, 5-20, 25-30, ...

return (($number % 10 == 1) && ($number % 100 != 11))
    ? 0
    : ((($number % 10 >= 2) && ($number % 10 <= 4)
        && (($number % 100 < 10) || ($number % 100 >= 20)))
        ? 1
        : 2);

Sources: src/MessageSelector.php, lines 326-340
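The Russian rule is easier to follow when unrolled into separate checks. The sketch below restates the expression above for non-negative integers; russianPluralIndex() is a hypothetical name used only for this example.

```php
<?php

// Hypothetical helper restating the Russian/Ukrainian/Belarusian rule.
// Assumes a non-negative integer count.
function russianPluralIndex(int $number): int
{
    $mod10 = $number % 10;
    $mod100 = $number % 100;

    // Form 0: ends in 1, except numbers ending in 11
    if ($mod10 === 1 && $mod100 !== 11) {
        return 0;
    }

    // Form 1: ends in 2-4, except numbers ending in 12-14
    if ($mod10 >= 2 && $mod10 <= 4 && ($mod100 < 10 || $mod100 >= 20)) {
        return 1;
    }

    // Form 2: everything else (0, 5-20, 25-30, ...)
    return 2;
}

foreach ([1, 2, 5, 11, 12, 21, 22, 25, 111, 112] as $n) {
    echo $n, ' => form ', russianPluralIndex($n), PHP_EOL;
}
```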

Other Slavic and Baltic Variants:

Locale | Forms | Rule
cs, sk | 3 | n=1, n=2-4, else
pl | 3 | n=1, n%10=2-4 (excluding 12-14), else
sl | 4 | n%100=1, n%100=2, n%100=3-4, else
lt | 3 | n%10=1 (not 11), n%10=2-9 (not 12-19), else

Sources: src/MessageSelector.php, lines 341-366
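The table rows can be turned into code roughly as follows. The exact expressions in src/MessageSelector.php are not shown here, so the boundaries below are read off the table, with the "not 11" and "not 12-19" exclusions interpreted modulo 100. variantPluralIndex() is a hypothetical helper, and the sketch assumes PHP 8 and non-negative integers.

```php
<?php

// Hypothetical helper covering the four rows of the table above.
function variantPluralIndex(string $locale, int $number): int
{
    $mod10 = $number % 10;
    $mod100 = $number % 100;

    return match ($locale) {
        // Czech, Slovak: 1 / 2-4 / else, compared against the number itself
        'cs', 'sk' => $number === 1 ? 0 : (($number >= 2 && $number <= 4) ? 1 : 2),
        // Polish: 1 / ends in 2-4 but not 12-14 / else
        'pl' => $number === 1
            ? 0
            : (($mod10 >= 2 && $mod10 <= 4 && ($mod100 < 12 || $mod100 > 14)) ? 1 : 2),
        // Slovenian: four forms keyed on the last two digits
        'sl' => $mod100 === 1
            ? 0
            : ($mod100 === 2 ? 1 : (($mod100 === 3 || $mod100 === 4) ? 2 : 3)),
        // Lithuanian: ends in 1 but not 11 / ends in 2-9 but not 12-19 / else
        'lt' => ($mod10 === 1 && $mod100 !== 11)
            ? 0
            : (($mod10 >= 2 && ($mod100 < 10 || $mod100 >= 20)) ? 1 : 2),
        default => throw new InvalidArgumentException("Unsupported locale: {$locale}"),
    };
}

foreach (['cs', 'pl', 'sl', 'lt'] as $locale) {
    echo $locale, ': 22 => form ', variantPluralIndex($locale, 22), PHP_EOL;
}
```

Note the difference between Czech and Polish for 22: the Czech rule compares the number itself, so 22 falls through to the last form, while the Polish rule looks at the final digit.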

Category 5: Special Cases

Several languages have unique pluralization rules:

Irish (ga): Three forms based on exact matches (n=1, n=2, else)

Maltese (mt): Four forms based on complex range rules:

return ($number == 1)
    ? 0
    : ((($number == 0) || (($number % 100 > 1) && ($number % 100 < 11)))
        ? 1
        : ((($number % 100 > 10) && ($number % 100 < 20))
            ? 2
            : 3));

Welsh (cy): Four forms (n=1, n=2, n=8 or n=11, else)

Sources: src/MessageSelector.php, lines 346-372
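The ga and cy rules are described only in words in this section, so the following sketch spells them out under that description. gaPluralIndex() and cyPluralIndex() are hypothetical names, and the comparisons are against the number itself, not a remainder.

```php
<?php

// Hypothetical helper for the ga rule: n=1, n=2, else.
function gaPluralIndex(int $number): int
{
    return $number === 1 ? 0 : ($number === 2 ? 1 : 2);
}

// Hypothetical helper for the cy rule: n=1, n=2, n=8 or n=11, else.
function cyPluralIndex(int $number): int
{
    return $number === 1
        ? 0
        : ($number === 2 ? 1 : (($number === 8 || $number === 11) ? 2 : 3));
}

foreach ([1, 2, 3, 8, 11, 18] as $n) {
    echo $n, ' => ga form ', gaPluralIndex($n), ', cy form ', cyPluralIndex($n), PHP_EOL;
}
```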

Category 6: Arabic

Arabic has six distinct plural forms based on numeric ranges:

Form | Condition | Example
0 | n = 0 | "zero items"
1 | n = 1 | "one item"
2 | n = 2 | "two items"
3 | n % 100 ∈ [3,10] | "three items"
4 | n % 100 ∈ [11,99] | "eleven items"
5 | else | "hundreds of items"

return ($number == 0)
    ? 0
    : (($number == 1)
        ? 1
        : (($number == 2)
            ? 2
            : ((($number % 100 >= 3) && ($number % 100 <= 10))
                ? 3
                : ((($number % 100 >= 11) && ($number % 100 <= 99))
                    ? 4
                    : 5))));

Locales: ar and 21 regional variants (ar_AE, ar_SA, ar_EG, etc.)

Sources: src/MessageSelector.php, lines 373-393
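Because forms 3 and 4 test only the last two digits, the "else" form is reached by fewer numbers than one might expect. The sketch below restates the rule as sequential checks for non-negative integers; arabicPluralIndex() is a hypothetical name for this example.

```php
<?php

// Hypothetical helper restating the six-form Arabic rule.
// Assumes a non-negative integer count.
function arabicPluralIndex(int $number): int
{
    if ($number <= 2) {
        return $number;  // forms 0, 1 and 2 are exact matches
    }

    $mod100 = $number % 100;

    if ($mod100 >= 3 && $mod100 <= 10) {
        return 3;
    }

    if ($mod100 >= 11 && $mod100 <= 99) {
        return 4;
    }

    // Only numbers of 100 or more whose last two digits are 00, 01 or 02
    return 5;
}

foreach ([0, 1, 2, 3, 10, 11, 99, 100, 102, 103, 111, 200] as $n) {
    echo $n, ' => form ', arabicPluralIndex($n), PHP_EOL;
}
```

So 100, 101, 102 and 200 select form 5, while 103 selects form 3 and 111 selects form 4.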

Default Behavior

For any locale not explicitly listed, the system returns index 0, effectively treating the language as having no plural distinction.

Sources: src/MessageSelector.php, lines 394-396

Selection Process Flow

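The complete message selection process follows a prioritized cascade:

  1. Split the translation string into segments on the pipe character (|).
  2. Evaluate the inline conditions; if one matches the number, return its text.
  3. Otherwise, strip the inline condition syntax from the segments.
  4. Compute the plural index for the locale and number with getPluralIndex() and return the segment at that index.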


Sources: src/MessageSelector.php, lines 14-31
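The cascade described in this section can be sketched end to end in a few lines. This is an illustration under stated assumptions, not the library's code: chooseVariant() is a hypothetical function, the plural rule is passed in as a callable standing in for getPluralIndex(), the regular expression is assumed, and falling back to the first segment when the index is out of range is also an assumption.

```php
<?php

// Hypothetical stand-in for the selection cascade.
function chooseVariant(string $line, int $number, callable $pluralIndex): string
{
    $segments = explode('|', $line);
    $pattern = '/^[\{\[]([^\[\]\{\}]*)[\}\]]/';  // assumed condition prefix pattern

    // Step 1: the first segment whose inline condition matches wins.
    foreach ($segments as $segment) {
        if (preg_match($pattern, $segment, $m) !== 1) {
            continue;  // no inline condition on this segment
        }

        $text = trim(substr($segment, strlen($m[0])));
        $parts = explode(',', $m[1], 2);

        if (count($parts) === 2) {
            [$from, $to] = $parts;
            $aboveLower = $from === '*' || $number >= (int) $from;
            $belowUpper = $to === '*' || $number <= (int) $to;

            if ($aboveLower && $belowUpper) {
                return $text;
            }
        } elseif ((int) $m[1] === $number) {
            return $text;
        }
    }

    // Step 2: strip condition prefixes, then index by the locale's plural form.
    $stripped = array_map(
        fn (string $s): string => trim(preg_replace($pattern, '', $s)),
        $segments
    );

    // Assumption: use the first segment when the plural index is out of range.
    return $stripped[$pluralIndex($number)] ?? $stripped[0];
}

$english = fn (int $n): int => $n === 1 ? 0 : 1;

echo chooseVariant('{0} No items|{1} One item|[2,*] Many items', 7, $english), PHP_EOL;
echo chooseVariant('One apple|Many apples', 1, $english), PHP_EOL;
```

Passing the plural rule as a callable keeps the sketch short; the real class resolves the rule from the locale string instead.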

Integration with Translator

The Translator class invokes MessageSelector::choose() through its choice() method. Before calling the selector, the translator reduces the supplied value, which may arrive as one of several input types, to a numeric count.


The translator also automatically injects a :count placeholder into the replacement array, allowing translations to reference the numeric value without manual injection.

Sources: src/Translator.php, lines 128-157 (referenced from context)
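This section does not show which input types the translator accepts, so the sketch below only illustrates the general idea. It assumes that arrays and Countable objects are reduced to their size and that an explicitly supplied count replacement is left untouched. Both helpers, normalizeCount() and withCountPlaceholder(), are hypothetical and are not part of the Translator API.

```php
<?php

// Hypothetical helper: reduce arrays and Countable objects to their size,
// and pass plain numbers through unchanged.
function normalizeCount(int|float|array|Countable $number): int|float
{
    return is_countable($number) ? count($number) : $number;
}

// Hypothetical helper: add a "count" replacement unless the caller supplied one.
// The array union operator keeps keys that already exist on the left side.
function withCountPlaceholder(array $replace, int|float $count): array
{
    return $replace + ['count' => $count];
}

$count = normalizeCount(['apple', 'pear', 'plum']);
print_r(withCountPlaceholder(['name' => 'basket'], $count));
```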

Example Translation Strings

Inline Conditions

Basic Exact Matches:

{0} No items|{1} One item|[2,*] :count items

Complex Range Conditions:

[0,0] Empty cart|{1} Single item|[2,5] A few items|[6,19] Several items|[20,*] Many items

Mixed Inline and Linguistic:

{0} Special case for zero|One item|:count items

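When no inline condition matches, the selector strips the condition prefixes and applies the locale's plural index to the whole segment list. The conditioned segment keeps its position at index 0, so if segments are indexed as the stripConditions() description suggests, an English count of 1 (plural index 0) resolves to the zero-case text rather than "One item". Mixing the two styles in one string is therefore best avoided.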

Locale-Specific Examples

English (2 forms):

One apple|:count apples

French (2 forms; 0 and 1 share the first form):

Aucune pomme ou une pomme|:count pommes

Russian (3 forms):

:count яблоко|:count яблока|:count яблок
  • Form 0: 1, 21, 31, 41, ... (ends in 1, not 11)
  • Form 1: 2-4, 22-24, 32-34, ... (ends in 2-4, not 12-14)
  • Form 2: 0, 5-20, 25-30, ... (everything else)

Arabic (6 forms):

لا تفاح|تفاحة واحدة|تفاحتان|:count تفاحات|:count تفاحة|:count تفاح
  • Form 0: 0 apples
  • Form 1: 1 apple
  • Form 2: 2 apples
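  • Form 3: 3-10 apples (also 103-110, 203-210, ...)
  • Form 4: 11-99 apples (also 111-199, 211-299, ...)
  • Form 5: 100-102, 200-202, ... apples (100 or more, with last two digits 00-02)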

Attribution and License

The pluralization rules in getPluralIndex() are derived from the Zend Framework (2010-09-25) and are subject to the New BSD License. Copyright (c) 2005-2010 - Zend Technologies USA Inc.

Sources: src/MessageSelector.php, lines 91-95