About Synoema
The language of shared understanding
What is Synoema?
Synoema [sy-NO-e-ma] — from Greek synoema (συνόημα), "shared understanding". A BPE-aligned programming language purpose-built for LLM code generation.
The core insight: LLMs generate code token by token. Every token costs money, time, and context window. If the programming language itself is designed to align with how LLMs tokenize text, you get cheaper generation, faster inference, and more correct output — by construction, not by accident.
Synoema is not a general-purpose language competing with Python or Rust. It occupies a new niche: the language where AI writes code. It's optimized for the machine that generates it, not the human who reads it (though it's readable too).
Design Philosophy
Token Economy First
Every syntax decision is measured in BPE tokens. ? cond -> x : y costs 3 tokens; if cond then x else y costs 6. We chose the 3-token form. Every operator is verified against the cl100k_base, Llama 3, and Mistral tokenizers.
Correctness by Construction
Hindley-Milner type inference catches errors without annotations. GBNF grammar constrains LLM output to be 100% syntactically valid. Verification contracts (requires/ensures) guard runtime behavior.
Minimal Dependencies
The entire compiler has 2 dependencies: Cranelift (JIT) and pretty_assertions (tests). No tokio, no serde, no async. ~33K LOC of Rust, 9 crates. You can read the entire codebase in a day.
Immutable & Strict
All bindings are immutable. Evaluation is eager, left-to-right. No lazy evaluation surprises. Predictable for both humans and LLMs.
Equations, Not Statements
No def, no return, no semicolons between statements. Functions are equations: f x = body. The last expression is the result. Pattern matching via multiple equations.
Dual Backend
Interpreter for development (full I/O, networking, concurrency). Cranelift JIT for production speed (3x over Python). Same language, same semantics, choose your tradeoff.
Core Strengths
1. Token Efficiency — 15% Fewer Tokens Than Python
Measured with tiktoken (cl100k_base) on 16 benchmark tasks. Synoema consistently uses fewer tokens for the same algorithm:
| Task | Synoema | Python | Saving |
|---|---|---|---|
| json_build | 32 | 67 | 52% |
| pattern_match | 136 | 225 | 40% |
| quicksort | 77 | 124 | 38% |
| mergesort | 117 | 179 | 35% |
| gcd | 26 | 35 | 26% |
| fibonacci | 38 | 49 | 22% |
| factorial | 25 | 32 | 22% |
| fizzbuzz | 59 | 63 | 6% |
| Average (16 tasks) | 15% | ||
Where Synoema wins big: recursive algorithms, pattern matching, data structure operations, JSON building.
Where Python wins: string operations, imperative-style code, matrix math.
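The Saving column is straightforward arithmetic over the raw token counts. A quick check in plain Python (illustrative only, not part of the Synoema toolchain) reproduces it:

```python
def saving(synoema_tokens: int, python_tokens: int) -> int:
    """Percentage of tokens saved relative to the Python baseline."""
    return round(100 * (python_tokens - synoema_tokens) / python_tokens)

# Raw counts from the benchmark table above.
print(saving(32, 67))   # json_build → 52
print(saving(77, 124))  # quicksort  → 38
print(saving(26, 35))   # gcd        → 26
```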
2. Native Speed — 3x Median Over Python
JIT-compiled via Cranelift to native x86-64. Benchmarked against CPython 3.12, median of 5 runs:
| Benchmark | Python | Synoema JIT | Speedup |
|---|---|---|---|
| fibonacci | 144ms | 5.1ms | 28.2x |
| factorial | 24ms | 5.7ms | 4.2x |
| gcd | 17ms | 4.7ms | 3.5x |
| collatz | 18ms | 5.6ms | 3.1x |
| quicksort | 17ms | 6.2ms | 2.7x |
| matrix_mult | 16ms | 7.7ms | 2.1x |
| Median (12 tasks) | 3.0x | ||
Fibonacci shows 28x thanks to tail-call optimization. Typical sustainable speedup: 2–4x.
3. Guaranteed Syntactic Correctness
The GBNF grammar (162 lines, 48 rules) enables constrained decoding: the LLM can only generate tokens that form valid Synoema syntax. Works with llama.cpp, vLLM, SGLang, TensorRT-LLM via XGrammar (100x speedup over naive approaches).
Result: 100% of LLM-generated programs parse successfully. Compare with unconstrained generation where 24% of GitHub Copilot suggestions contain compilation errors (Nguyen & Nadi, 2022).
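Constrained decoding itself lives inside the inference engines listed above, but the core mechanism is simple: before sampling each token, mask out every candidate the grammar forbids. A toy Python sketch of that idea (the vocabulary and validity set here are hypothetical, not the real GBNF machinery):

```python
import math

def mask_logits(logits, vocab, is_valid_next):
    """Set the logit of every grammar-invalid token to -inf so the
    sampler can never pick it — the essence of constrained decoding."""
    return [logit if is_valid_next(token) else -math.inf
            for token, logit in zip(vocab, logits)]

# Toy grammar state: suppose only "x" and "?" may start the next expression.
vocab = ["x", "=", "?", "->"]
valid = {"x", "?"}  # hypothetical valid-next set derived from the grammar
masked = mask_logits([1.0, 2.0, 3.0, 0.5], vocab, valid.__contains__)
print(masked)  # [1.0, -inf, 3.0, -inf]
```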
4. Type-Guided Generation
Hindley-Milner type inference acts as a semantic constraint on LLM output. Research shows type-constrained decoding reduces compilation errors by 74.8% vs only 9.0% for syntax-only constraints (Mundler et al., PLDI 2025). Synoema's type system is not just for correctness — it's a generation quality multiplier.
Scientific Foundation
Synoema's design is grounded in 23 peer-reviewed publications. Key findings that shaped the language:
| Finding | Source | Impact on Design |
|---|---|---|
| LLM inference consumes >90% of total AI energy | TokenPowerBench, 2024 | Token efficiency = direct cost/energy reduction |
| Attention cost is O(n²) — halving tokens = 4x less compute | Vaswani et al., 2017 | 15% fewer tokens → ~28% less attention compute |
| Token efficiency varies 2.6x across languages | Alderson, 2026 | Language design can significantly impact token count |
| Type errors = 33.6% of LLM code failures | Tambon et al., 2025 | Hindley-Milner inference eliminates the dominant error class |
| Type constraints reduce errors by 74.8% | Mundler et al., PLDI 2025 | Type system as generation constraint, not just verification |
| Bridge tokens distort LLM distributions | Domino, ICML 2024 | All operators = 1 BPE token → no bridge token distortion |
| XGrammar: 100x speedup for grammar-constrained decoding | Dong et al., 2024 | GBNF grammar designed for efficient constrained decoding |
| LLM quality degrades with sequence length | Multiple sources | Fewer tokens = less context rot = better output quality |
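The attention-compute figure in the table follows directly from the quadratic cost: shortening the sequence by 15% shrinks attention compute by the square of that ratio. A one-line worked check:

```python
# Self-attention cost scales as O(n^2) in sequence length, so a 15%
# token reduction compounds quadratically.
token_ratio = 1 - 0.15             # 85% of the original sequence length
compute_ratio = token_ratio ** 2   # ~0.7225 of the original attention compute
print(f"{1 - compute_ratio:.0%} less attention compute")  # 28% less
```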
Synoema vs Other Languages
| Feature | Synoema | Python | Haskell | Rust | TypeScript |
|---|---|---|---|---|---|
| Token efficiency | Best (-15%) | Baseline | Similar | Verbose | Verbose |
| Type inference | Full (HM) | None | Full (HM) | Partial | Partial |
| Pattern matching | Full ADTs | Limited | Full | Full | None |
| Constrained decoding | GBNF | No | No | No | No |
| JIT compilation | Cranelift | No (CPython) | GHC | LLVM | V8 |
| LLM toolchain | MCP + GBNF | None | None | None | None |
| Learning curve | Medium | Easy | Hard | Hard | Easy |
| Ecosystem | Small | Huge | Medium | Large | Huge |
| Evaluation | Strict | Strict | Lazy | Strict | Strict |
| Immutability | Default | No | Default | Default | No |
Key insight: Synoema doesn't try to replace Python or Rust for human-written code. It's designed for the specific scenario where an LLM generates code — and in that scenario, token efficiency, type safety, and constrained decoding matter more than ecosystem size.
Maximum Impact Areas
Synoema gives nonlinear advantage where machines generate code and correctness is critical. The key insight: Synoema is not competing with Python for human developers — it's the language where AI writes code that other machines verify and execute.
Impact Matrix
| | Correctness: nice-to-have | Correctness: critical |
|---|---|---|
| Machine generates code | Edge AI microtools, on-device code gen, IoT rules | Verified microservices, financial logic, executable specs, agent orchestration |
| Human writes code | Python/JS are better choices | Executable specifications, formal contracts |
Maximum effect = machine generates, correctness critical. This is where all three verification layers (GBNF syntax + HM types + contracts) work simultaneously.
Six High-Impact Domains
1. LLM-Generated Microservices
User describes business logic in natural language. LLM generates a Synoema service. GBNF guarantees syntax. Types guarantee correctness. Contracts guard business invariants. The service runs immediately via TCP/HTTP builtins. Human never reads the code — code is an artifact between two machines.
2. Formally Verified AI Code
Today: LLM generates Python, human reviews, hopes it works. With Synoema: GBNF ensures syntax, HM types catch type errors (33.6% of all LLM failures), contracts enforce requires/ensures. Three layers of verification, zero human review needed for correctness. Critical for financial calculations, medical algorithms, regulatory compliance.
3. Edge AI / Small Models
Device: Raspberry Pi, phone, IoT. Model: 4B-7B parameters. Context: 2K-4K tokens. Synoema's compact reference (900 tokens) fits the context. GBNF eliminates syntax errors (a 40% failure rate on small models). 15% fewer tokens is critical when context is limited, and the JIT gives 3x speed on constrained hardware.
4. LLM Self-Improvement Loops
LLM generates code → type checker finds errors → structured JSON error with llm_hint → LLM fixes using the hint → repeat. Each iteration uses ~15% fewer tokens than the equivalent Python pipeline. Synoema's --errors json output, with fixability scores and did_you_mean suggestions, is designed for machine consumption, not humans. No other language has error messages optimized for LLMs.
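The repair step of this loop can be sketched in Python. The field names llm_hint, fixability, and did_you_mean come from the description above, but the exact payload shape and the example values are assumed stand-ins, not the real --errors json format:

```python
import json

def build_fix_prompt(source: str, error_json: str) -> str:
    """Turn a structured type-checker error into a repair prompt for the LLM.
    The payload shape here is an assumed example, not the exact format."""
    err = json.loads(error_json)
    prompt = f"The program failed to type-check.\nHint: {err.get('llm_hint', '')}\n"
    if err.get("did_you_mean"):
        prompt += f"Did you mean: {err['did_you_mean']}?\n"
    return prompt + f"Fix this code:\n{source}"

# Hypothetical error payload in the spirit of `synoema --errors json`.
error = json.dumps({"llm_hint": "argument 2 of map must be a List",
                    "did_you_mean": "map f xs",
                    "fixability": 0.9})
print(build_fix_prompt("map f 3", error))
```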
5. Executable Specifications
"Discount 10% for orders over 1000, max 500" becomes: discount : Int -> Int with requires total > 0, ensures result <= 500. The specification IS the code. Contracts are checked at runtime. synoema doc --contracts generates the spec table. Specification never diverges from implementation because they're the same thing.
6. AI Agent Orchestration
Multiple AI agents exchange Synoema programs instead of JSON or natural language. Programs are formally typed, verified by contracts, and executable. MCP server enables eval/typecheck/run in real-time. Synoema as lingua franca between AI agents — a shared language both machines understand with formal guarantees.
When NOT to Use Synoema
Critical thinking requires honesty. Synoema loses wherever a human writes code or the ecosystem matters more than correctness:
| Domain | Why not Synoema | Better choice |
|---|---|---|
| Web frontends | No DOM, no browser API | TypeScript/JavaScript |
| Data science | No numpy/pandas ecosystem | Python |
| Systems programming | No ownership, no unsafe | Rust |
| Enterprise backend | Small ecosystem, no ORM | Java/Go |
| Mobile apps | No SDK | Swift/Kotlin |
| String-heavy tasks | Python is 87% more token-efficient | Python |
| Existing codebase | Migration cost not justified | Whatever's there |
| Human writes code manually | Python is simpler and more familiar | Python |
The pattern: Synoema wins where machines generate code and correctness is critical. Synoema loses where humans write code or ecosystem size matters.
Architecture
~33,000 lines of Rust across 9 workspace crates:
| Crate | Purpose | LOC |
|---|---|---|
| synoema-lexer | Tokenization, offside rule (indentation → INDENT/DEDENT) | ~1,050 |
| synoema-parser | Pratt parser, 15 expression kinds, AST | ~3,260 |
| synoema-types | Hindley-Milner inference, row polymorphism, contracts | ~3,450 |
| synoema-core | Core IR (System F), constant folding, dead code elimination | ~2,320 |
| synoema-eval | Tree-walking interpreter, all builtins, I/O | ~5,300 |
| synoema-codegen | Cranelift JIT compiler, tagged pointer ABI, arena memory | ~5,870 |
| synoema-diagnostic | Structured errors with LLM hints, fixability scores | ~840 |
| synoema-lsp | LSP server (hover, go-to-def, diagnostics, completion) | ~620 |
| synoema-repl | CLI: run, jit, eval, test, doc, watch, init, install | ~2,670 |
Key architectural decisions:
- Tagged pointer ABI — all values fit in i64 with bit-level type tags (bit 0 = list, bit 1 = string). Zero boxing overhead for small values.
- Arena memory — 8MB bump allocator with region stack. Auto-reset in tail-recursive loops. No GC pauses.
- 2 dependencies only — Cranelift for JIT, pretty_assertions for tests. No runtime dependencies beyond std.
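The tagged-pointer layout can be illustrated with a small sketch. The two tag bits follow the description above (bit 0 = list, bit 1 = string); the helper names and the example address are simplified illustrations, not the real ABI:

```python
LIST_BIT = 0b01    # bit 0 set → value is a list pointer
STRING_BIT = 0b10  # bit 1 set → value is a string pointer

def tag_list(ptr: int) -> int:
    # Heap pointers are aligned, so the low bits are free to carry the tag.
    return ptr | LIST_BIT

def is_list(value: int) -> bool:
    return bool(value & LIST_BIT)

def untag(value: int) -> int:
    # Clear both tag bits to recover the raw aligned pointer.
    return value & ~0b11

addr = 0x1000            # a hypothetical 8-byte-aligned address
tagged = tag_list(addr)
print(is_list(tagged), hex(untag(tagged)))  # True 0x1000
```

Because the tag lives in otherwise-unused alignment bits, small values never need a heap box, which is what keeps every value in a single i64.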
Ecosystem
MCP Server
npx synoema-mcp — instant integration with Claude Desktop, Cursor, Zed. Tools: eval, typecheck, run, constrain (token masking).
VS Code Extension
Syntax highlighting, run/JIT keybindings (Cmd+Shift+R/J), eval selection. LSP server for hover types, go-to-def, diagnostics.
GBNF Grammar
162 lines, 48 rules. Works with llama.cpp, vLLM, SGLang, TensorRT-LLM. Ensures 100% syntactically valid LLM output.
Fine-Tuning Corpus
5,037 verified examples (99.9% pass rate) in JSONL format. Covers algorithms, data structures, pattern matching, error handling, I/O, and more.
Benchmark Suite
30 tasks across 5 languages. Automated token counting (tiktoken) and runtime measurement. Reproducible via Python scripts.
1,217 Tests
Unit tests, stress tests (fib(35), 100K tokens, deep nesting), corpus validation, adversarial edge cases. 0 failures, 0 warnings.
Roadmap
| Phase | Status | Description |
|---|---|---|
| Working Language | Done | Lexer, parser, type system, interpreter, REPL |
| Working Compiler | Done | Core IR, Cranelift JIT, tagged pointer ABI, arena memory |
| LLM-Native | Done | GBNF grammar, MCP server, constrained decoding, LLM error feedback |
| Production | 90% | Region inference, contract docs, benchmark suite, small model templates |
| Community | Next | Package manager, WASM playground, expanded corpus, documentation |
Current version: 0.1.0-alpha.3 (alpha — syntax and APIs may change)
Get Involved
Synoema is an open research project. Contributions welcome:
- Try it — playground, getting started guide
- Read the source — ~33K LOC, 2 dependencies, readable in a day
- Report issues — GitHub
- Contribute examples — expand the corpus, write benchmarks
- Research — token efficiency, constrained decoding, type-guided generation