Every "math evaluator in N lines" article either uses shunting-yard (Dijkstra's stack-based RPN converter) or hand-waves over the hard cases (
-2^2, right associativity, error positions). This one uses recursive descent because the grammar and the code map 1-to-1, which makes the whole thing easier to read and write about. ~250 lines of JS, 36 unit tests, supports functions, variables, constants, parentheses, and reports errors with a column number and a caret.
π expression-eval UI: math input
π Demo: https://sen.ltd/portfolio/expression-eval/
π¦ GitHub: https://github.com/sen-ltd/expression-eval
Why recursive descent (not shunting-yard)
Both work. Shunting-yard is more compact (one for loop, two stacks). But for explaining what's happening, recursive descent wins:
- Grammar β code is 1-to-1. You can read the BNF and the implementation in parallel and they look the same.
-
Stack traces show the AST shape. When you're debugging "why is
-2^2returning 4 instead of -4", the call stack literally is the parse tree. -
Error positions are natural. The "current token" is always explicit, so reporting "unexpected
)at column 6" falls out for free.
The grammar:
expression := additive
additive := multiplicative (('+' | '-') multiplicative)*
multiplicative := unary (('*' | '/' | '%') unary)*
unary := ('-' | '+')? power
power := atom ('^' unary)?
atom := number | identifier | identifier '(' args? ')' | '(' expression ')'
args := expression (',' expression)*
Operator precedence is encoded in the nesting depth. That's the whole magic. The deeper a rule is, the tighter it binds.
Trap 1: making ^ right-associative
2^3^2 should be 2^(3^2) = 2^9 = 512, not (2^3)^2 = 64.
For left-associative operators (+ - * / %) the parser builds a left-leaning tree with a while loop:
parseAdditive() {
let left = this.parseMultiplicative();
while (current is '+' or '-') {
const right = this.parseMultiplicative();
left = { type: "binop", op, left, right }; // accumulate on the LEFT
}
return left;
}
For the right-associative ^, switch from loop to recursion:
parsePower() {
const base = this.parseAtom();
if (current is '^') {
consume();
const exponent = this.parseUnary(); // RECURSE
return { type: "binop", op: "^", left: base, right: exponent };
}
return base;
}
Recursion (not iteration) means the AST for 2^3^2 looks like:
binop ^
number 2
binop ^
number 3
number 2
DFS evaluation walks the inner 3^2 = 9 first, then 2^9 = 512.
Pin both sides:
test("^ is right-associative", () => {
assert.equal(evaluateExpression("2^3^2"), 512); // not 64
});
test("parse produces right-associative AST for '2^3^2'", () => {
const ast = parse("2^3^2");
assert.equal(ast.op, "^");
assert.equal(ast.left.value, 2);
assert.equal(ast.right.op, "^");
assert.equal(ast.right.left.value, 3);
assert.equal(ast.right.right.value, 2);
});
Trap 2: -2^2 = -4 (unary minus binds LOOSER than ^)
This is where calculator implementations disagree:
| Tool | -2^2 |
Convention |
|---|---|---|
| WolframAlpha | -4 |
^ > unary -
|
Python (**) |
-4 |
** > unary -
|
| Mathematica | -4 |
^ > unary -
|
| Excel | 4 |
unary - > ^ β |
| Google Sheets | 4 |
unary - > ^ β |
| Mathematical convention | -4 |
^ > unary -
|
The math convention is -2^2 = -(2^2) = -4. Excel is the outlier (probably for backward compatibility with VisiCalc-era spreadsheets).
To get this right, ^ must bind tighter than unary -. That's achieved by the grammar nesting:
unary := ('-' | '+')? power β unary sees '-' first, then drops down to power
power := atom ('^' unary)? β but power's right-hand side can recurse to unary
Walking -2^2 through the parser:
-
parseUnary()sees-, consumes it, callsparsePower() -
parsePower()callsparseAtom(), reads2 - Next token is
^, consume, recurse toparseUnary()for the exponent - The recursion reads
2 -
parsePower()returns{ ^, 2, 2 } -
parseUnary()wraps:{ unary -, { ^, 2, 2 } }β evaluates to-(2^2) = -4β
The reason the right-hand side of ^ recurses to unary (instead of parsePower() or parseAtom()) is to also allow negative exponents: 2^-3 = 1/8. Both rules β -2^2 = -4 and 2^-3 = 0.125 β fall out of the same single grammar choice.
test("^ binds tighter than * but unary minus binds looser than ^", () => {
assert.equal(evaluateExpression("-2^2"), -4); // -(2^2)
assert.equal(evaluateExpression("(-2)^2"), 4); // explicit parens win
assert.equal(evaluateExpression("2 * 3^2"), 18); // 2 * (3^2)
});
test("^ accepts unary minus on the exponent side", () => {
assert.equal(evaluateExpression("2^-3"), 0.125); // 2^(-3)
});
If a user wants 4, they write (-2)^2. The parens are non-optional and that's the right design.
Trap 3: numeric literal tokenisation order matters
Numbers seem easy, but the edge cases bite:
-
42β integer -
3.14β float -
.5β leading-dot float -
1e3β scientific (= 1000) -
1.5e-2β negative exponent (= 0.015) -
1eβ error: exponent has no digits -
1e+β error: sign with no digits
The naΓ―ve "scan while character matches [0-9.eE+-]" approach breaks on 1+2 because the + between numbers gets pulled into the first literal.
The fix is to scan state by state in fixed order:
// 1. integer part (optional if dot follows)
while (i < length && isDigit(source[i])) i++;
// 2. fractional part (optional)
if (source[i] === ".") {
i++;
while (i < length && isDigit(source[i])) i++;
}
// 3. exponent (optional)
if (source[i] === "e" || source[i] === "E") {
i++;
if (source[i] === "+" || source[i] === "-") i++;
const expStart = i;
while (i < length && isDigit(source[i])) i++;
if (i === expStart) throw new ExpressionError("exponent has no digits", start);
}
The exponent can have a sign, but after the sign you need at least one digit β 1e and 1e+ are invalid literals, not "1 followed by an identifier e". Once you've committed to scanning a number, partial exponents are syntax errors:
test("tokenize rejects exponent without digits", () => {
assert.throws(() => tokenize("1e"), ExpressionError);
assert.throws(() => tokenize("1e+"), ExpressionError);
});
Trap 4: error positions with a caret
"Syntax error" is hostile. Give the column and a visual pointer:
export class ExpressionError extends Error {
constructor(message, position) {
super(message);
this.name = "ExpressionError";
this.position = position; // 0-indexed column
}
}
Inside parseAtom(), throw with the token's position:
throw new ExpressionError(`unexpected "${t.value}"`, t.position);
In the UI, render a caret:
const caret = "".repeat(Math.max(0, position)) + "^";
els.error.textContent = `${msg}\n${source}\n${caret}`;
The user sees:
parse: unexpected ")"
1 + + )
^
Pin the position in tests:
test("error positions are reported", () => {
try {
parse("1 + + )");
} catch (e) {
assert.ok(e instanceof ExpressionError);
assert.equal(e.position, 6); // the ")" column
return;
}
assert.fail("expected ExpressionError");
});
Function calls as part of atom
When the parser sees an identifier, it peeks at the next token. If it's (, it's a function call:
parseAtom() {
// ...
if (t.type === "ident") {
this.next();
if (peek is "(") {
this.next();
const args = [];
if (peek is not ")") {
args.push(this.parseAdditive());
while (peek is ",") {
this.next();
args.push(this.parseAdditive());
}
}
consume(")");
return { type: "call", name: t.value, args };
}
return { type: "ident", name: t.value };
}
// ...
}
Variadic functions (min, max, hypot) fall out for free β the args loop already accepts any count. The evaluator just spreads:
const FUNCTIONS = {
min: (...xs) => Math.min(...xs),
max: (...xs) => Math.max(...xs),
hypot: (...xs) => Math.hypot(...xs),
// ...
};
Variables override constants
CONSTANTS = { pi, e, tau, ... } is hard-coded, but user-supplied variables win:
case "ident": {
if (Object.prototype.hasOwnProperty.call(variables, node.name)) {
return Number(variables[node.name]); // variables first
}
if (Object.prototype.hasOwnProperty.call(CONSTANTS, node.name)) {
return CONSTANTS[node.name];
}
throw new ExpressionError(`undefined identifier "${node.name}"`, 0);
}
Two design choices worth calling out:
-
hasOwnProperty.call(obj, key)instead ofkey in objto avoid prototype pollution β withouthasOwnProperty, a user variable named__proto__would be intercepted by the prototype chain. -
User variables override constants because the user's intent is more specific.
e = 10makes sense as an override (e.g., "the elasticity coefficient is named e in my problem").
test("variables override constants if same name", () => {
assert.equal(evaluateExpression("e + 1", { e: 10 }), 11);
});
Edge cases: division by zero, sqrt of negative
I let JavaScript's defaults through:
-
1 / 0βInfinity(no throw) -
sqrt(-1)βNaN -
min(3, 1, 2)β1
The display layer formats them:
export function formatResult(value) {
if (Number.isNaN(value)) return "NaN";
if (!Number.isFinite(value)) return value > 0 ? "β" : "ββ";
if (Number.isInteger(value) && Math.abs(value) < 1e15) return String(value);
return String(Number(value.toPrecision(12)));
}
Number(value.toPrecision(12)) is the trick to make 0.1 + 0.2 = 0.3 display cleanly. toPrecision(12) rounds floats to 12 significant digits then Number() strips trailing zeros β so 0.30000000000000004 shows as 0.3.
TL;DR
- Recursive descent beats shunting-yard for readability. Grammar maps 1-to-1 to code.
-
Right-associative
^is recursion (not loop) on the operator's RHS. -
-2^2 = -4is the math convention; it falls out naturally if^(power) sits a level below unary minus in the grammar. - Numeric literals need state-machine scanning (integer β fractional β exponent), not a regex.
- Errors carry a column and the UI renders a caret. Pin positions in tests.
- Function calls are atom-level; variadic functions work for free.
-
Variables override constants with
hasOwnProperty.callto avoid proto pollution.
Source: https://github.com/sen-ltd/expression-eval β MIT, ~250 lines of JS + 36 unit tests, zero dependencies, no build step.
π Built by SEN LLC as part of an ongoing series of small, focused developer tools. Browse the full portfolio for more.
For further actions, you may consider blocking this person and/or reporting abuse
