Synoema

JIT & ABI

Tagged pointers, heap layouts, Perceus refcounting, and the 198-function runtime FFI

The Synoema JIT (powered by Cranelift) represents every runtime value as a single i64. Type information lives in the lower 3 bits of each word; the upper 61 bits hold either a payload (for unboxed types) or an 8-byte-aligned heap pointer. This page documents the encoding, the heap node layouts that follow from it, the two-layer memory model (bump arena + Perceus RC), and the standard FFI pattern for adding new runtime capabilities.

Why tagged pointers?

Cranelift's native IR works with machine-word integers. Boxing every value into a heap-allocated object is slow; using separate type metadata is complex. Tagged pointers thread the needle: because all heap allocations are 8-byte aligned (guaranteed by the arena allocator), the lower 3 bits of any heap pointer are zero. Synoema steals those bits for tag information. For non-pointer values (integers, booleans, characters), the value is embedded directly in the upper bits.

The tag table

i64 value layout:

 63                              3  2  1  0
 ┌────────────────────────────────┬──┬──┬──┐
 │           payload / pointer    │t2│t1│t0│
 └────────────────────────────────┴──┴──┴──┘
                                   └──────┘
                                   tag = lower 3 bits (v & 7)
Tag               Type               Encoding
0                 Int / Bool / List  Int: unboxed value; Bool: 0=false, 1=true; List: untagged arena pointer (distinguished by arena_contains_ptr)
1 (CON_TAG)       ADT constructor    (v & !7) → ConNode*
2 (STR_TAG)       String             (v & !7) → StrNode*
3 (RATIONAL_TAG)  Rational           (v & !7) → RationalNode*
4 (FLOAT_TAG)     Float              (v & !7) → FloatNode*
5 (RECORD_TAG)    Record             (v & !7) → RecordNode*
6 (CHAR_TAG)      Character          Unboxed: (codepoint << 3) | 6
7 (BYTES_TAG)     Bytes buffer       (v & !7) → StrNode* (same layout, different tag)

Tag-0 disambiguation: integers, booleans, and list pointers all share tag 0 and are distinguished by value. 0 is both the nil list and false, 1 is true, and a list pointer is an address ≥ 0x1000 that falls within the arena bounds (verified by arena_contains_ptr()).
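As a minimal sketch of the encoding above, the helpers below split a word into its tag and its untagged address. The constant values come from the tag table; the helper names (tag_of, untag, with_tag) are hypothetical and are not taken from the Synoema runtime.

```rust
// Tag values as listed in the tag table.
const CON_TAG: i64 = 1;
const STR_TAG: i64 = 2;
const TAG_MASK: i64 = 7;

/// Lower 3 bits of the word (v & 7).
fn tag_of(v: i64) -> i64 {
    v & TAG_MASK
}

/// Clear the tag bits to recover the 8-byte-aligned address (v & !7).
fn untag(v: i64) -> i64 {
    v & !TAG_MASK
}

/// Attach a tag to an 8-byte-aligned address.
/// Hypothetical helper, shown only to illustrate the bit layout.
fn with_tag(addr: i64, tag: i64) -> i64 {
    debug_assert_eq!(addr & TAG_MASK, 0, "address must be 8-byte aligned");
    addr | tag
}
```

Because the low three bits of an aligned address are zero, tagging and untagging are lossless: untag(with_tag(addr, STR_TAG)) returns addr unchanged.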

Char encoding

Characters are unboxed — no heap allocation. 'a' (codepoint 97) is stored as (97 << 3) | 6 = 776 | 6 = 782. Extract with (v >> 3) as u32.
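The arithmetic above can be checked with a small round-trip sketch. The function names encode_char and decode_char are hypothetical; only the shift-and-tag formula comes from this page.

```rust
const CHAR_TAG: i64 = 6;

/// Pack a codepoint into the upper bits and set the char tag.
fn encode_char(c: char) -> i64 {
    ((c as u32 as i64) << 3) | CHAR_TAG
}

/// Recover the character, or None if the word does not carry CHAR_TAG
/// or the payload is not a valid Unicode scalar value.
fn decode_char(v: i64) -> Option<char> {
    if v & 7 != CHAR_TAG {
        return None;
    }
    char::from_u32((v >> 3) as u32)
}
```

With these definitions, encode_char('a') yields 782 and decode_char(782) yields Some('a').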

Heap node layouts

Every heap node starts with rc: i64 at offset 0 (Perceus reference count, initialized to 1). This is a critical ABI invariant — the Perceus synoema_inc/synoema_dec functions read rc at offset 0 without knowing the node type.

ListNode — cons cell

#[repr(C)]
struct ListNode {
    rc:   i64,  // offset 0:  Perceus refcount
    head: i64,  // offset 8:  the element (any i64-encoded value)
    tail: i64,  // offset 16: next ListNode* or 0 (nil)
}
// size: 24 bytes

A list [1 2 3] in memory:

{rc=1, head=1, tail=→} → {rc=1, head=2, tail=→} → {rc=1, head=3, tail=0}

Nil = 0i64 (null pointer with tag=0).
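The cons-cell layout can be exercised with the sketch below, which builds [1 2 3] and walks the tail pointers until it reaches nil. Box allocation stands in for the arena here, and cons and list_heads are hypothetical helpers, not runtime functions.

```rust
#[repr(C)]
struct ListNode {
    rc: i64,   // offset 0
    head: i64, // offset 8
    tail: i64, // offset 16
}

/// Prepend an element. Box is a stand-in for the arena allocator (assumption).
fn cons(head: i64, tail: i64) -> i64 {
    Box::into_raw(Box::new(ListNode { rc: 1, head, tail })) as i64
}

/// Collect the head of every cell, following tail pointers until nil (0).
///
/// Safety: `list` must be 0 or the address of a valid chain of ListNodes.
unsafe fn list_heads(mut list: i64) -> Vec<i64> {
    let mut out = Vec::new();
    while list != 0 {
        let node = unsafe { &*(list as *const ListNode) };
        out.push(node.head);
        list = node.tail;
    }
    out
}
```

Since list pointers carry tag 0, no masking is needed before the dereference; the nil check against 0 is the only test in the loop.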

StrNode — string and bytes

#[repr(C)]
struct StrNode {
    rc:  i64,   // offset 0: Perceus refcount
    len: i64,   // offset 8: byte length
    // UTF-8 bytes follow immediately after (inline, not a pointer)
}
// size: 16 + len bytes (padded to 8-byte alignment)

For Bytes values, the layout is identical but the tag is BYTES_TAG=7 instead of STR_TAG=2.
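To illustrate the inline-bytes layout, the sketch below allocates a header followed directly by the UTF-8 data and reads it back. It uses std::alloc in place of the arena, and str_node_size, alloc_str and str_bytes are hypothetical names introduced for this example.

```rust
use std::alloc::{alloc, Layout};

#[repr(C)]
struct StrNode {
    rc: i64,  // offset 0
    len: i64, // offset 8
}

const HEADER: usize = 16;

/// Header plus payload, rounded up to a multiple of 8 bytes.
fn str_node_size(len: usize) -> usize {
    (HEADER + len + 7) & !7
}

/// Allocate a node and copy the bytes in right after the header.
/// std::alloc stands in for the arena allocator (assumption).
fn alloc_str(s: &str) -> *mut StrNode {
    let layout = Layout::from_size_align(str_node_size(s.len()), 8).unwrap();
    unsafe {
        let p = alloc(layout) as *mut StrNode;
        assert!(!p.is_null(), "allocation failed");
        p.write(StrNode { rc: 1, len: s.len() as i64 });
        std::ptr::copy_nonoverlapping(s.as_ptr(), (p as *mut u8).add(HEADER), s.len());
        p
    }
}

/// View the inline bytes that follow the header.
///
/// Safety: `p` must point to a live StrNode followed by `len` bytes.
unsafe fn str_bytes<'a>(p: *const StrNode) -> &'a [u8] {
    unsafe { std::slice::from_raw_parts((p as *const u8).add(HEADER), (*p).len as usize) }
}
```

A 5-byte string therefore occupies 24 bytes: 16 for the header, 5 for the data, and 3 of padding.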

FloatNode

#[repr(C)]
struct FloatNode {
    rc:   i64,  // offset 0: Perceus refcount
    bits: i64,  // offset 8: f64 bit pattern (via f64::to_bits())
}
// size: 16 bytes
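The bit-pattern storage can be sketched as a round trip through f64::to_bits and f64::from_bits. The helpers box_float and unbox_float are hypothetical, and the node is built on the stack rather than in the arena to keep the example short.

```rust
#[repr(C)]
struct FloatNode {
    rc: i64,   // offset 0
    bits: i64, // offset 8
}

/// Store the raw IEEE 754 bit pattern; no numeric conversion takes place.
fn box_float(x: f64) -> FloatNode {
    FloatNode { rc: 1, bits: x.to_bits() as i64 }
}

/// Reinterpret the stored bits as an f64.
fn unbox_float(n: &FloatNode) -> f64 {
    f64::from_bits(n.bits as u64)
}
```

Storing bits rather than a converted integer preserves every value exactly, including negative zero and NaN payloads.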

ClosureNode

#[repr(C)]
struct ClosureNode {
    rc:      i64,  // offset 0:  Perceus refcount
    fn_ptr:  i64,  // offset 8:  Cranelift-compiled function pointer
    env_ptr: i64,  // offset 16: pointer to array of captured variables
}
// size: 24 bytes

Environment is a separately allocated array of i64 values (one per captured variable). Lambda lifting determines which variables need to be captured.

ConNode — ADT constructor (variable size)

offset 0:  rc:    i64
offset 8:  tag:   i64    (constructor index within the ADT)
offset 16: field_0: i64
offset 24: field_1: i64
...

ConNode is not a fixed Rust struct: it is built dynamically, with an arena allocation sized to the number of constructor fields (16 + 8 × n bytes for n fields, per the offsets above).
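Because the node has no fixed struct, field access is plain offset arithmetic over i64 slots. The sketch below follows the offsets listed above; std::alloc stands in for the arena, and con_node_size, alloc_con and con_field are hypothetical names.

```rust
use std::alloc::{alloc, Layout};

/// rc (8 bytes) + constructor index (8 bytes) + one i64 per field.
fn con_node_size(n_fields: usize) -> usize {
    16 + 8 * n_fields
}

/// Allocate a node and fill in rc, the constructor index, and the fields.
/// std::alloc stands in for the arena allocator (assumption).
fn alloc_con(ctor: i64, fields: &[i64]) -> *mut i64 {
    let layout = Layout::from_size_align(con_node_size(fields.len()), 8).unwrap();
    unsafe {
        let p = alloc(layout) as *mut i64;
        assert!(!p.is_null(), "allocation failed");
        p.write(1); // slot 0: rc
        p.add(1).write(ctor); // slot 1: constructor index
        for (i, f) in fields.iter().enumerate() {
            p.add(2 + i).write(*f); // slot 2 + i: field_i
        }
        p
    }
}

/// Read field `i`, which lives at byte offset 16 + 8 * i.
///
/// Safety: `p` must point to a live node with more than `i` fields.
unsafe fn con_field(p: *const i64, i: usize) -> i64 {
    unsafe { *p.add(2 + i) }
}
```

A two-field constructor therefore takes 32 bytes, with field_1 at byte offset 24.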

Memory management — two layers

Layer 1: Arena Allocator (bump allocation)
  • 8 MB thread-local buffer
  • All JIT heap allocations go here
  • O(1) allocation: bump an offset pointer
  • Freed in bulk by arena_reset() after each program run

Layer 2: Perceus Reference Counting
  • Each heap node has rc: i64 at offset 0
  • synoema_inc(val) increments rc
  • synoema_dec(val) decrements rc; rc=0 → push to reuse_pool
  • Reuse pool: up to 256 dead nodes recycled for future allocs
  • Does NOT free to OS — arena is the backing store

Arena lifecycle

You don't manage individual deallocations in JIT runtime functions — just allocate from the arena and let arena_reset() clean up at run boundaries. Perceus is an optimization layer that enables reuse within a single run.
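A bump arena of this kind can be sketched in a few lines. This is an illustrative model only: the Arena type, its methods, and the small buffer size used in the usage note are assumptions, and contains is a guess at what a bounds check like arena_contains_ptr might do.

```rust
/// Minimal bump arena. A Vec<u64> backing store keeps the base 8-byte aligned.
struct Arena {
    buf: Vec<u64>,
    offset: usize, // bytes handed out so far
}

impl Arena {
    fn new(bytes: usize) -> Self {
        Arena { buf: vec![0u64; bytes / 8], offset: 0 }
    }

    /// O(1) allocation: round the size up to 8 bytes and bump the offset.
    fn alloc(&mut self, size: usize) -> Option<*mut u8> {
        let size = (size + 7) & !7;
        if self.offset + size > self.buf.len() * 8 {
            return None; // arena exhausted
        }
        let p = unsafe { (self.buf.as_mut_ptr() as *mut u8).add(self.offset) };
        self.offset += size;
        Some(p)
    }

    /// True if `addr` lies inside the region handed out so far.
    fn contains(&self, addr: usize) -> bool {
        let base = self.buf.as_ptr() as usize;
        addr >= base && addr < base + self.offset
    }

    /// Bulk free: every earlier allocation becomes invalid at once.
    fn reset(&mut self) {
        self.offset = 0;
    }
}
```

With a 64-byte arena, a 24-byte and a 10-byte request land 24 bytes apart, the second rounded up to 16; after reset, the next allocation reuses the first address.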

synoema_inc and synoema_dec

// Increment: called when a value is shared
pub extern "C" fn synoema_inc(val: i64) -> i64 {
    let raw = heap_ptr_of(val);
    if raw != 0 {
        unsafe { *(raw as *mut i64) += 1; }
    }
    0
}

// Decrement: called when a value is consumed
pub extern "C" fn synoema_dec(val: i64) -> i64 {
    let raw = heap_ptr_of(val);
    if raw != 0 {
        unsafe {
            let rc = *(raw as *mut i64) - 1;
            *(raw as *mut i64) = rc;
            if rc == 0 { reuse_pool_push(raw); }
        }
    }
    0
}

The Perceus pass in synoema-core/perceus.rs inserts these calls automatically based on ownership analysis. See Optimizer for how that pass works.
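The lifecycle of a single node under these two functions can be modelled as follows. This is a simplified sketch, not the runtime's code: ReusePool, inc and dec are hypothetical, the pool is a plain Vec capped at the 256 entries mentioned above, and dropping nodes when the pool is full is an assumption.

```rust
const REUSE_POOL_CAP: usize = 256;

/// Dead nodes waiting to be recycled by a later allocation.
struct ReusePool {
    dead: Vec<*mut i64>,
}

impl ReusePool {
    fn new() -> Self {
        ReusePool { dead: Vec::new() }
    }

    /// Keep at most REUSE_POOL_CAP nodes. Assumption: extras are simply
    /// left in the arena until the next reset.
    fn push(&mut self, node: *mut i64) {
        if self.dead.len() < REUSE_POOL_CAP {
            self.dead.push(node);
        }
    }

    fn pop(&mut self) -> Option<*mut i64> {
        self.dead.pop()
    }
}

/// Safety: `node` must point to a live node whose rc sits at offset 0.
unsafe fn inc(node: *mut i64) {
    unsafe { *node += 1 }
}

/// Safety: same as `inc`. When rc reaches 0 the node goes to the pool.
unsafe fn dec(node: *mut i64, pool: &mut ReusePool) {
    unsafe {
        *node -= 1;
        if *node == 0 {
            pool.push(node);
        }
    }
}
```

A node that is shared once (rc 1 → 2) survives the first dec and enters the pool only on the second.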

Runtime FFI pattern

Adding a new runtime capability to the JIT requires three steps. This is the standard pattern used for all 198 JIT FFI functions.

Step 1: Implement in runtime.rs

// lang/crates/synoema-codegen/src/runtime.rs

/// Return the byte length of a string.
#[unsafe(no_mangle)]
pub extern "C" fn synoema_str_len(val: i64) -> i64 {
    if !is_str(val) { return 0; }
    // Clear all three tag bits, as in the tag table, to recover the node address.
    let ptr = (val & !7) as *const StrNode;
    unsafe { (*ptr).len }
}

Rules: extern "C" + #[unsafe(no_mangle)]; all parameters and return type are i64; access heap nodes via the untagged pointer.

Step 2: Register the symbol in Compiler::new()

builder.symbol("synoema_str_len", runtime::synoema_str_len as *const u8);

Step 3: Declare the function signature

// sig1 = fn(i64) -> i64 (already declared elsewhere; reuse it)
decl(self, "synoema_str_len", "str_len", &sig1)?;

After these three steps, the function is callable from JIT-compiled code using the short name "str_len". Then add a doctest in src/lib.rs, ensure the interpreter has an equivalent in synoema-eval/src/eval.rs, and run cargo test -p synoema-codegen.
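One way to exercise a runtime function from plain Rust, before it is wired into the JIT, is to build a tagged value by hand. The sketch below repeats the Step 1 function alongside stand-ins for the pieces it depends on: is_str is assumed to compare the low three bits with STR_TAG, and tagged_str_header is a hypothetical test-only helper that boxes a header with no payload bytes.

```rust
const STR_TAG: i64 = 2;

#[repr(C)]
struct StrNode {
    rc: i64,
    len: i64,
}

/// Assumed definition: a value is a string if its low 3 bits equal STR_TAG.
fn is_str(val: i64) -> bool {
    val & 7 == STR_TAG
}

/// Return the byte length of a string, or 0 for any non-string value.
#[unsafe(no_mangle)]
pub extern "C" fn synoema_str_len(val: i64) -> i64 {
    if !is_str(val) {
        return 0;
    }
    let ptr = (val & !7) as *const StrNode;
    unsafe { (*ptr).len }
}

/// Test-only helper (hypothetical): box a header and tag its address.
/// No payload bytes are allocated, which is enough for a length query.
fn tagged_str_header(len: i64) -> i64 {
    Box::into_raw(Box::new(StrNode { rc: 1, len })) as i64 | STR_TAG
}
```

Calling synoema_str_len(tagged_str_header(5)) returns 5, while a tag-0 value or a tagged character falls through the is_str guard and returns 0.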

TLS soft-error channel (Phase H3)

Async error propagation in the JIT cannot rely on panic! because Cranelift-generated frames have no Rust unwinding tables — a panic crossing a Cranelift frame is undefined behavior. Phase H3 (scope_result / try_await JIT fix) replaces panic-through-Cranelift with a thread-local soft-error channel:

thread_local! {
    static SCOPE_RESULT_ERROR: RefCell<Option<String>> = RefCell::new(None);
    static TASK_ERROR:         RefCell<Option<String>> = RefCell::new(None);
}

Three rules govern the channel:

  1. synoema_error never panics. It writes the message into SCOPE_RESULT_ERROR (or TASK_ERROR) and returns a sentinel value. Cranelift frames see only normal returns.
  2. synoema_task_complete and the G1 reactor wake-loop intercept TASK_ERROR. When set, the parent task is marked Panicked instead of Done; await_with_timeout / try_await lower it back to Result a Error / Maybe a at the language level.
  3. compile_and_run checks SCOPE_RESULT_ERROR after the JIT main returns. If set, the top-level result is wrapped in Err.

scope_result : (Unit -> a) -> Result a Error and try_await : Task a -> Result a Error are the two builtins that read these channels. Regression test: lang/crates/synoema-codegen/tests/phase_e_errors.rs (E10).
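The first and third rules can be sketched with the standard library alone. This is a model under stated assumptions rather than the runtime's code: soft_error stands in for synoema_error, run_checked for compile_and_run, the JIT main is an ordinary function pointer, and the sentinel value 0 is chosen arbitrarily.

```rust
use std::cell::RefCell;

thread_local! {
    static SCOPE_RESULT_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

/// Hypothetical sentinel; the real value is not specified on this page.
const ERROR_SENTINEL: i64 = 0;

/// Record the message and return normally, so no unwinding ever
/// crosses a JIT-compiled frame.
fn soft_error(msg: &str) -> i64 {
    SCOPE_RESULT_ERROR.with(|slot| *slot.borrow_mut() = Some(msg.to_string()));
    ERROR_SENTINEL
}

/// Run the entry point, then inspect the channel before trusting the result.
fn run_checked(jit_main: fn() -> i64) -> Result<i64, String> {
    // Clear anything left over from an earlier run on this thread.
    SCOPE_RESULT_ERROR.with(|slot| *slot.borrow_mut() = None);
    let value = jit_main();
    match SCOPE_RESULT_ERROR.with(|slot| slot.borrow_mut().take()) {
        Some(msg) => Err(msg),
        None => Ok(value),
    }
}
```

Taking the message out of the slot with take() leaves the channel empty, so one failed run cannot leak its error into the next.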

Async reactor handles (Phase G)

The mio-backed reactor (Phase G) introduces three integer handle namespaces that ride alongside the tagged-pointer ABI as plain i64 (tag=0) values:

  • Task IDs: i64 keys into the Mutex<HashMap<i64, PendingTask>> waker map
  • TCP socket fds: returned by async_tcp_connect / async_tcp_listen / async_tcp_accept; passed back to async_tcp_read / write / close
  • TLS handles (Phase 27): opaque Int values; in the interpreter they live in a thread_local! RefCell<HashMap<i64, TlsConn>>; in the JIT, in a OnceLock<Mutex<HashMap<i64, TlsConn>>> global. Returned by tls_connect / tls_listen / tls_accept.

These handles deliberately do not get heap-tag bits — they are integers from the language's perspective. The runtime side uses the value as a key to look up the actual socket/connection in its global table.
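The handle-as-key pattern can be sketched with a global table of the same shape as the JIT's TLS map. Everything named here is illustrative: Conn stands in for TlsConn, open, peer_of and close are hypothetical, and allocating handles from an atomic counter is an assumption about how IDs might be issued.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::{Mutex, OnceLock};

/// Hypothetical stand-in for the real connection type.
struct Conn {
    peer: String,
}

static TABLE: OnceLock<Mutex<HashMap<i64, Conn>>> = OnceLock::new();
static NEXT_HANDLE: AtomicI64 = AtomicI64::new(1);

fn table() -> &'static Mutex<HashMap<i64, Conn>> {
    TABLE.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Register a connection and return its handle as a plain integer.
fn open(peer: &str) -> i64 {
    let handle = NEXT_HANDLE.fetch_add(1, Ordering::Relaxed);
    table().lock().unwrap().insert(handle, Conn { peer: peer.to_string() });
    handle
}

/// Look the handle up; unknown or closed handles yield None.
fn peer_of(handle: i64) -> Option<String> {
    table().lock().unwrap().get(&handle).map(|c| c.peer.clone())
}

/// Remove the entry; returns false if the handle was not open.
fn close(handle: i64) -> bool {
    table().lock().unwrap().remove(&handle).is_some()
}
```

Since the language only ever sees the integer, a stale handle cannot dangle: a lookup after close simply finds no entry.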

Cross-references