Architecture

The compilation pipeline — .sno source to native, WASM, or interpreted execution

Synoema turns a .sno source file into one of three runtime forms: tree-walking interpretation (semantically authoritative), a native binary via Cranelift (median 3× faster than CPython), or a WASM v3 module (browser, IoT, capability container). The same five-stage front-end (lexer, parser, type checker, desugarer, optimizer) feeds all three back-ends. Async, TLS, and the capability container ride on top of the JIT runtime via FFI; they are orthogonal to the pipeline below.

The pipeline

Source (.sno)
    │
    ▼
┌───────────────────────────────────────────────────┐
│  Stage 1: Lexer  (synoema-lexer)                  │
│  scanner.rs + layout.rs                           │
│  Source text → token stream + INDENT/DEDENT       │
└────────────────────────┬──────────────────────────┘
                         ▼
┌───────────────────────────────────────────────────┐
│  Stage 2: Parser  (synoema-parser)                │
│  parser.rs — Pratt parsing, 25 ExprKind variants  │
│  Token stream → typed AST                         │
└────────────────────────┬──────────────────────────┘
                         ▼
┌───────────────────────────────────────────────────┐
│  Stage 3: Type Checker  (synoema-types)           │
│  infer.rs — Algorithm W (Hindley-Milner)          │
│  AST → type-annotated AST                         │
└────────────────────────┬──────────────────────────┘
                         ▼
┌───────────────────────────────────────────────────┐
│  Stage 4: Desugaring → Core IR  (synoema-core)    │
│  desugar.rs + core_ir.rs                          │
│  Typed AST → System F-like Core IR                │
└────────────────────────┬──────────────────────────┘
                         ▼
┌───────────────────────────────────────────────────┐
│  Stage 5: Optimizer  (synoema-core, 4 passes)     │
│  constant fold + DCE → e-graph saturation →       │
│  region annotation → Perceus RC insertion         │
└────────────────┬─────────────────┬────────────────┘
                 ▼                 ▼
┌────────────────────┐  ┌─────────────────────────────────────────┐
│  synoema-eval      │  │  synoema-codegen                         │
│  Tree-walking      │  │  Cranelift JIT → native (default)        │
│  interpreter       │  │  AOT native (--native, x86_64 + arm64)   │
│  (reference impl)  │  │  WASM v3 (sno wasm — IoT, browser)       │
└────────────────────┘  └─────────────────────────────────────────┘

         Async event loop (Phase G)        TLS / HTTPS (Phase 27)
         mio reactor + timer wheel +        rustls 0.23 ring backend +
         bounded file-IO pool +             tls_* builtins + http_*
         async TCP server (H1/H2)           https:// auto-upgrade

         Capability Container (Phase 32)
         sno:cc-base Docker image + cc-mcp pure-Synoema MCP server +
         apply_change 8-step pipeline (typecheck → diff → MAJOR gate
         → write → commit → audit → restart → health → rollback)

Stage walkthrough — factorial.sno

A three-line program shows what each stage does:

fac 0 = 1
fac n = n * fac (n - 1)
main = fac 10

1. Lexing

The lexer (synoema-lexer) does two jobs: character-level scanning and offside-rule indentation processing.

Scanner (scanner.rs) converts raw bytes to tokens. Every operator in Synoema is a single BPE token in cl100k_base — a design invariant verified by lang/tools/bpe-verify/verify_bpe.py. The 33 operators (including ->, |>, ++, **, ?, :) are all 1-token by construction.
Layout (layout.rs) inserts synthetic INDENT/DEDENT tokens whenever indentation increases or decreases, following the offside rule in the same way Python's tokenizer does. This lets the parser work from a context-free grammar while the surface syntax stays whitespace-sensitive.
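The layout step can be illustrated with a minimal Python sketch. This is not the code in layout.rs: the function name `layout`, the token shapes, and the sample source lines are assumptions made for illustration, and only space-based indentation is handled.

```python
def layout(lines):
    """Turn indented source lines into LINE tokens plus INDENT/DEDENT markers."""
    stack = [0]   # indentation widths of the blocks that are currently open
    out = []
    for raw in lines:
        if not raw.strip():
            continue                      # blank lines do not affect layout
        width = len(raw) - len(raw.lstrip(" "))
        if width > stack[-1]:             # deeper than the current block
            stack.append(width)
            out.append("INDENT")
        while width < stack[-1]:          # shallower: close blocks until we match
            stack.pop()
            out.append("DEDENT")
        if width != stack[-1]:
            raise SyntaxError(f"inconsistent dedent at: {raw!r}")
        out.append(("LINE", raw.strip()))
    while len(stack) > 1:                 # close blocks still open at end of input
        stack.pop()
        out.append("DEDENT")
    return out


# Hypothetical source lines; the exact Synoema surface syntax is not assumed here.
tokens = layout(["f x =", "  y", "  where", "    y = x", "main = f 1"])
```

Because the parser only ever sees explicit INDENT and DEDENT markers, it never has to count spaces itself.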

2. Parsing

The parser uses Pratt parsing (top-down operator precedence) with 13 precedence levels. Pratt parsing suits expression-heavy functional languages because it handles operator precedence and associativity through a binding-power table rather than one grammar production per precedence level.
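To make the binding-power idea concrete, here is a small Python sketch of a Pratt parser. The table below is hypothetical: it has three levels rather than Synoema's 13, and the operator set and tuple-based AST are assumptions, not what parser.rs does.

```python
import re

# Hypothetical binding powers as (left, right) pairs.
# left < right gives left associativity; left > right gives right associativity.
BINDING_POWER = {
    "+": (1, 2), "-": (1, 2),
    "*": (3, 4), "/": (3, 4),
    "**": (6, 5),
}

TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|\*\*|[-+*/()])")

def tokenize(src):
    src = src.strip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens, min_bp=0):
    """Parse an expression whose operators all bind at least as tightly as min_bp."""
    tok = tokens.pop(0)
    if tok == "(":
        lhs = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    else:
        lhs = int(tok) if tok.isdigit() else tok
    while tokens and tokens[0] in BINDING_POWER:
        op = tokens[0]
        left_bp, right_bp = BINDING_POWER[op]
        if left_bp < min_bp:
            break                     # operator binds too loosely; let the caller take it
        tokens.pop(0)
        rhs = parse(tokens, right_bp)
        lhs = (op, lhs, rhs)
    return lhs


ast = parse(tokenize("1 + 2 * 3"))    # ("+", 1, ("*", 2, 3))
```

Adding an operator or changing its associativity is a one-line change to the table, which is the property the paragraph above relies on.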

Function definitions with multiple equations (fac 0 = 1, fac n = ...) are parsed as separate clauses and later merged into a single FuncDef with a list of equations. Pattern matching on the LHS of = is parsed during this stage.
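The clause-merging step might look roughly like the following Python sketch. `FuncDef` is named in the text, but its fields here, the `Clause` type, and `merge_clauses` are hypothetical; the sketch also groups clauses by name without requiring them to be adjacent, which the real parser may not allow.

```python
from dataclasses import dataclass, field

@dataclass
class Clause:
    name: str
    patterns: list   # left-hand-side patterns, e.g. [0] or ["n"]
    body: object     # right-hand-side expression, kept opaque here

@dataclass
class FuncDef:
    name: str
    equations: list = field(default_factory=list)   # [(patterns, body), ...]

def merge_clauses(clauses):
    """Group clauses by function name, preserving the order of first appearance."""
    funcs = {}
    for c in clauses:
        func = funcs.setdefault(c.name, FuncDef(c.name))
        if func.equations and len(func.equations[0][0]) != len(c.patterns):
            raise SyntaxError(f"{c.name}: equations disagree on arity")
        func.equations.append((c.patterns, c.body))
    return list(funcs.values())


defs = merge_clauses([
    Clause("fac", [0], 1),
    Clause("fac", ["n"], "n * fac (n - 1)"),
    Clause("main", [], "fac 10"),
])
```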

The AST has 25 ExprKind variants covering literals, application, binary ops, conditionals, lists, list comprehensions, ranges, records, field access, lambdas, pipes, compose, sequencing, where-blocks, case, constructors, string interpolation, record updates, bytes, naturals, rationals, plus the async surface (Await, Async) and post-Phase 27 additions.

3. Type checking

The type checker (synoema-types) runs Algorithm W (Hindley-Milner type inference). Annotations are optional; the checker infers a type for every expression whether or not annotations are present.

For factorial, inference proceeds in three steps:

fac 0 = 1 infers fac : Int → Int (the first equation pins both the argument and the result type).
fac n = n * fac (n - 1) unifies n : Int and confirms fac : Int → Int.
main = fac 10 infers main : Int.
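The unification at the heart of this inference can be sketched in Python. This is only the unifier, with the constraints for factorial written out by hand; it is not a full Algorithm W (no generalization or instantiation), and every name in it (`TVar`, `unify`, `infer_factorial`) is hypothetical rather than taken from infer.rs.

```python
INT = "Int"

class TVar:
    """A type variable; `ref` is filled in once unification binds it."""
    def __init__(self):
        self.ref = None

def fn(arg, res):
    return ("->", arg, res)

def prune(t):
    while isinstance(t, TVar) and t.ref is not None:
        t = t.ref
    return t

def occurs(v, t):
    t = prune(t)
    return t is v or (isinstance(t, tuple) and any(occurs(v, p) for p in t[1:]))

def unify(a, b):
    a, b = prune(a), prune(b)
    if a is b:
        return
    if isinstance(a, TVar):
        if occurs(a, b):
            raise TypeError("infinite type")
        a.ref = b
    elif isinstance(b, TVar):
        unify(b, a)
    elif isinstance(a, tuple) and isinstance(b, tuple) and a[0] == b[0]:
        unify(a[1], b[1])
        unify(a[2], b[2])
    elif a != b:
        raise TypeError(f"cannot unify {a} with {b}")

def resolve(t):
    t = prune(t)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(p) for p in t[1:])
    return t

def infer_factorial():
    arg, res = TVar(), TVar()
    fac = fn(arg, res)                  # start with fac : a -> b
    unify(arg, INT)                     # fac 0 = 1: the pattern 0 is an Int
    unify(res, INT)                     #            the body 1 is an Int
    n, call = TVar(), TVar()
    unify(arg, n)                       # fac n = ...: n is the argument
    unify(n, INT)                       # n - 1 needs n : Int
    unify(fac, fn(INT, call))           # fac (n - 1)
    unify(call, INT)                    # n * fac (...) needs Int operands
    unify(res, INT)                     # and produces the Int result
    main = TVar()
    unify(fac, fn(INT, main))           # main = fac 10
    return resolve(fac), resolve(main)


fac_type, main_type = infer_factorial()
```

Here `fac_type` resolves to `("->", "Int", "Int")` and `main_type` to `"Int"`, matching the steps above.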

The checker also handles row polymorphism for records (a function f r = r.x accepts any record with an x field) and linear types (LinearArrow) for resources used exactly once. Type errors carry source spans and LLM-friendly hints via synoema-diagnostic. See Type System for the deep-dive.

4. Desugaring → Core IR

The desugarer translates the typed AST into a small Core IR based on System F. Core IR has fewer constructs than the surface language, making optimization and code generation simpler. Key transformations:

Multi-equation function definitions → a single Case expression (decision tree)
? cond -> then : else → Case(cond, [Alt(true→then), Alt(false→else)])
|> pipe operator → nested App
>> compose operator → Lam wrapper
List comprehensions → concatMap + filter calls
where blocks → nested Let bindings
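Three of these rewrites (pipe, conditional, where) can be sketched in Python over a tuple-based AST. The node tags `Pipe`, `Cond`, `Where`, and `Alt` and the function `desugar` are assumptions for illustration; `App`, `Case`, and `Let` follow the names used in the list above, but their exact shape in core_ir.rs is not known here.

```python
def desugar(e):
    """Rewrite surface nodes into a smaller core vocabulary (App, Case, Let)."""
    if not isinstance(e, tuple):
        return e                              # literals and variable names
    tag = e[0]
    if tag == "Where":                        # body where x1 = e1, x2 = e2, ...
        _, body, bindings = e
        core = desugar(body)
        for name, rhs in reversed(bindings):  # the first binding ends up outermost
            core = ("Let", name, desugar(rhs), core)
        return core
    args = [desugar(a) for a in e[1:]]
    if tag == "Pipe":                         # x |> f  becomes  App(f, x)
        x, f = args
        return ("App", f, x)
    if tag == "Cond":                         # ? c -> t : f  becomes a two-way Case
        cond, then, other = args
        return ("Case", cond, (("Alt", True, then), ("Alt", False, other)))
    return (tag, *args)


# xs |> f |> g desugars to nested applications: g (f xs)
core = desugar(("Pipe", ("Pipe", "xs", "f"), "g"))
```

After this pass the optimizer and back-ends only need to understand the core node kinds, which is the simplification the section describes.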

5. Optimization (4 passes)

Four passes in sequence. See Optimizer for full details.

Constant folding + DCE. 2 + 3 → 5; dead branches of ? false -> x : y eliminated.
E-graph equality saturation. Algebraic rewrites: x + 0 → x, x * 1 → x, map f (map g xs) → map (f >> g) xs. Up to 10 saturation iterations.
Region annotation. Marks which sub-expressions allocate on the heap, feeding Perceus.
Perceus RC insertion. Inserts inc/dec reference-count operations at ownership-transfer points, enabling memory reclamation without a GC.
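The first pass, together with the two arithmetic identities listed for the second, can be approximated by a simple bottom-up rewriter. The Python sketch below is an assumption-laden illustration: it uses plain recursive rewriting, not an e-graph, and the `BinOp` and `Cond` node shapes and the `simplify` function are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def is_int(e):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(e, int) and not isinstance(e, bool)

def simplify(e):
    """Constant folding, dead-branch elimination, and two algebraic identities."""
    if not isinstance(e, tuple):
        return e
    if e[0] == "BinOp":
        _, op, lhs, rhs = e
        lhs, rhs = simplify(lhs), simplify(rhs)
        if is_int(lhs) and is_int(rhs):
            return OPS[op](lhs, rhs)               # 2 + 3 becomes 5
        if op == "+" and is_int(rhs) and rhs == 0:
            return lhs                             # x + 0 becomes x
        if op == "*" and is_int(rhs) and rhs == 1:
            return lhs                             # x * 1 becomes x
        return ("BinOp", op, lhs, rhs)
    if e[0] == "Cond":
        _, cond, then, other = e
        cond = simplify(cond)
        if cond is True:
            return simplify(then)                  # drop the dead else-branch
        if cond is False:
            return simplify(other)                 # drop the dead then-branch
        return ("Cond", cond, simplify(then), simplify(other))
    return (e[0], *(simplify(a) for a in e[1:]))


folded = simplify(("Cond", False, "x", ("BinOp", "+", 2, 3)))
```

An e-graph differs from this sketch in that it keeps all equivalent forms side by side and picks the cheapest at the end, instead of committing to each rewrite as it fires.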

6a. Interpreter

The tree-walking interpreter (synoema-eval) evaluates Core IR directly using big-step operational semantics. It supports all language features — including those not yet in the JIT (Nat, Rational, Char, Bytes, TCP networking, fd_popen). The interpreter is the reference implementation for language semantics.
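As a rough picture of big-step evaluation, the Python sketch below runs a hand-written core term for factorial. The node shapes (`Var`, `Lam`, `App`, `Prim`, `Case`, `PVar`), the closure representation, and the helper `run_factorial` are all assumptions for illustration and do not mirror synoema-eval.

```python
import operator

PRIMS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def evaluate(e, env):
    """Big-step evaluation: each call returns the final value of e under env."""
    if isinstance(e, int):
        return e
    tag = e[0]
    if tag == "Var":
        return env[e[1]]
    if tag == "Lam":
        return ("Closure", e[1], e[2], env)        # capture the defining environment
    if tag == "App":
        _, param, body, closure_env = evaluate(e[1], env)
        arg = evaluate(e[2], env)
        return evaluate(body, {**closure_env, param: arg})
    if tag == "Prim":
        return PRIMS[e[1]](evaluate(e[2], env), evaluate(e[3], env))
    if tag == "Case":
        value = evaluate(e[1], env)
        for pattern, body in e[2]:
            if isinstance(pattern, int):
                if pattern == value:               # literal pattern
                    return evaluate(body, env)
            else:                                  # ("PVar", name) matches anything
                return evaluate(body, {**env, pattern[1]: value})
        raise ValueError("non-exhaustive case")
    raise ValueError(f"unknown node: {tag}")

# fac 0 = 1; fac n = n * fac (n - 1), after merging equations into one Case
FAC = ("Lam", "n", ("Case", ("Var", "n"), [
    (0, 1),
    (("PVar", "m"),
     ("Prim", "*", ("Var", "m"),
      ("App", ("Var", "fac"), ("Prim", "-", ("Var", "m"), 1)))),
]))

def run_factorial(n):
    env = {}
    env["fac"] = evaluate(FAC, env)    # the closure shares env, so fac can see itself
    return evaluate(("App", ("Var", "fac"), n), env)


result = run_factorial(10)             # 3628800
```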

6b. JIT compiler

The JIT (synoema-codegen/compiler.rs) uses Cranelift to generate native machine code at runtime. Cranelift was chosen over LLVM for fast builds, simple distribution, and JIT-first design (JITModule compiles and links in-process). For factorial, the JIT generates a native function that loops via TCO and multiplies, running ~4.2× faster than CPython on the same benchmark. See JIT & ABI.

Why these design choices?

Why Cranelift and not LLVM?

LLVM produces excellent code but adds 200–400 MB of build dependencies and minutes of compile time. For a research-stage language, iteration speed matters more than the last fraction of performance. Cranelift gives ~80% of LLVM's codegen quality at a fraction of the dependency cost.

Why Hindley-Milner and not bidirectional typing?

HM gives full type inference with no annotations required — critical for LLM-generated code: the model doesn't need to annotate types, reducing token cost and error surface. Bidirectional typing requires more annotations for polymorphism.

Why BPE-aligned operators?

Every operator tokenizes to exactly 1 BPE token in cl100k_base, the tokenizer used by GPT-4 and GPT-3.5 Turbo. Multi-token operators waste model capacity on syntactic overhead. Result: Synoema programs use 15% fewer tokens than equivalent Python on average, and up to 52% fewer on algorithmic tasks (sorting, recursion, tree traversal).

Key benchmarks

Token efficiency (cl100k_base, 16 tasks)

Language	Avg tokens	vs Python
Synoema	baseline	−15% avg
Python	+15%	reference

Algorithmic tasks (quicksort, fibonacci, tree traversal): up to −52%.

JIT runtime performance (vs CPython 3.12, median of 5 runs)

Benchmark	JIT speedup
fibonacci	28.2×
factorial	4.2×
gcd	3.5×
collatz	3.1×
quicksort	2.7×
matrix_mult	2.1×
Median	3.0×

Deep-dives

Type System — Algorithm W, ADTs, traits, contracts, row polymorphism, linear types
JIT & ABI — Cranelift IR, calling convention, async state-machine compilation
Optimizer — constant folding, e-graph saturation, region annotation, Perceus RC

Cross-references

Language Reference — surface syntax, types, stdlib
CLI Reference — sno run, sno jit, sno wasm
LLM Integration — how the architecture supports LLM workflows
Canonical overview on GitHub