Insights
Research, design decisions, and lessons from building a language for machines
Synoema exists at an unusual intersection: programming language theory, LLM research, and compiler engineering. These articles document what we've learned — including things that didn't work the way we expected. No marketing, no hype. Just the data and the reasoning behind the decisions.
All Articles
Why Build a New Programming Language in the Age of AI?
The paradox at the heart of LLM code generation: AI can write Python fluently, yet most of the code it produces doesn't run. The case for a language designed for machines, not humans.
Explainer · Why AI Writes Broken Code — and How Type Systems Can Fix It
Type errors account for 33.6% of LLM code failures. Hindley-Milner type inference reduces compilation errors by 74.8%. Here's how type-guided generation works in practice.
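To give a flavor of what that article covers, here is a minimal Python sketch of unification-based type inference, the core mechanism behind Hindley-Milner. Everything here is illustrative: the names (`TVar`, `resolve`, `unify`) are invented for this sketch and are not Synoema's actual implementation, and the occurs check and full type terms are omitted.

```python
# Minimal sketch of unification-style type inference: a type checker can
# reject ill-typed code before any of it runs. Illustrative only — not
# Synoema's real inference engine (no occurs check, atomic types only).

class TVar:
    """A type variable that unification may bind to a concrete type."""
    def __init__(self, name):
        self.name = name
        self.bound = None  # filled in by unify()

def resolve(t):
    # Follow chains of bound type variables to the representative type.
    while isinstance(t, TVar) and t.bound is not None:
        t = t.bound
    return t

def unify(a, b):
    """Make two types equal, or raise TypeError if they conflict."""
    a, b = resolve(a), resolve(b)
    if isinstance(a, TVar):
        a.bound = b
    elif isinstance(b, TVar):
        b.bound = a
    elif a != b:
        raise TypeError(f"cannot unify {a} with {b}")

# Applying an Int -> Int function forces its argument's type:
arg = TVar("a")
unify(arg, "Int")               # inference binds a := Int
assert resolve(arg) == "Int"

# Applying the same value where a Str is required is caught statically:
try:
    unify(resolve(arg), "Str")
except TypeError as e:
    print("rejected:", e)       # mismatch found before any code runs
```

Type-guided generation uses exactly this kind of signal: a unification failure tells the model precisely which expression conflicts with which expected type.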
Explainer · Introducing Synoema: A Language Machines Can Verify
A tour of the language: pattern matching, algebraic data types, verification contracts, Hindley-Milner inference, and the Cranelift JIT backend. With code examples throughout.
Results · What We Learned Teaching AI a New Language
We ran 10+ LLMs on 9 standard tasks and a 50-task corpus. H1 was disproved. H2 was confirmed with ρ=1.00 Spearman correlation. Here's what the data actually showed.
Explainer · How a Compiler Catches AI Mistakes Before They Run
Three verification layers: GBNF grammar for syntax, Hindley-Milner types for semantics, contracts for runtime behavior. And error messages designed for machine consumption, not humans.
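The layering idea can be sketched in a few lines of Python, using stand-ins for each layer: `ast.parse` for the syntax check, a toy type check, and an assert-style postcondition for the contract. Synoema's real layers are GBNF grammar, Hindley-Milner types, and verification contracts; this sketch only illustrates the principle of failing as early as possible, and all function names are hypothetical.

```python
# Three-layer verification sketch (Python analogy, not Synoema's pipeline):
# layer 1 rejects malformed text, layer 2 rejects ill-typed values,
# layer 3 checks runtime behavior against a contract.

import ast

def check_syntax(src: str) -> None:
    ast.parse(src)  # raises SyntaxError on malformed input

def check_types(value, expected: type) -> None:
    if not isinstance(value, expected):
        raise TypeError(f"expected {expected.__name__}, "
                        f"got {type(value).__name__}")

def with_contract(f, post):
    # Wrap f so its result must satisfy the postcondition at runtime.
    def wrapped(*args):
        result = f(*args)
        assert post(result), "contract violated"
        return result
    return wrapped

# Layer 1: syntax — caught before anything executes.
try:
    check_syntax("def f(: pass")
except SyntaxError:
    print("layer 1: syntax error caught")

# Layer 2: types — caught before the value is used.
try:
    check_types("not a number", int)
except TypeError:
    print("layer 2: type error caught")

# Layer 3: contracts — checked at runtime, at the call site.
safe_div = with_contract(lambda a, b: a / b, post=lambda r: r >= 0)
print(safe_div(6, 3))
```

Each layer catches a class of mistakes the previous one cannot see, which is why errors that survive all three are so much rarer than errors caught by any single check.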
Results · From Zero to 41%: Building an AI That Writes Working Code
The full journey: language design, corpus generation (5,037 validated programs), QLoRA fine-tuning on AMD hardware, and where 59% of attempts still fail. An honest account.
Explainer · How We Automated Our Entire Dev Workflow with Claude Code Skills
A directory-based skills system gives us autonomous pipelines, parallel execution, and zero-config automation. Here's how it works and what it enables.
Research · The Scientific Method Behind Synoema
12 falsifiable hypotheses. Statistical methodology with Bonferroni correction and Cohen's h effect sizes. A corpus validated by the compiler itself. How we try to do this rigorously.
Results · Intermediate Results: v6 Fine-Tuning
A 90.5% run rate on the 7B model — up from the 41% baseline. An honest look at the constructs regression (44.6% vs 52.7% in v5) and what we're doing about it, plus the ChatML format change: what it improved and what it broke.
About This Series
These articles were written for several audiences simultaneously: researchers evaluating the project's scientific rigor, developers curious about the technical decisions, and anyone who has wondered why AI-generated code fails so often. We've tried to make the technical content accessible without sacrificing precision.
All experimental data referenced in these articles is publicly available in the benchmarks/results/ and docs/research/ directories of the repository. If you find an error or want to replicate a result, the scripts are in benchmarks/scripts/.