guided · lite edition

Lite Onboarding Walkthrough

Three steps that take you from a raw model to a projected leaf/spine deployment — using only the open-source Lite stack. Each step runs locally; no cluster required.

Auto-quantise an ONNX model

Run a calibration sweep across fp16, int8 (per-channel) and int4 (GPTQ-style). The Lite optimiser rejects any precision whose accuracy drop exceeds your tolerance, then selects the most aggressive precision that remains.

bash — step 1

adaptive-ctl optimise ./resnet50.onnx \
  --auto-quant --tolerance=1.5% --calibset=./imagenet-val-512
# sweeping fp16 → int8 → int4 …

expected output

✓ fp16   acc Δ -0.2%   throughput 1.8×   size 49 MB
✓ int8   acc Δ -0.7%   throughput 3.4×   size 25 MB
✓ int4   acc Δ -2.9%   throughput 6.1×   size 13 MB   (rejected)
→ selected: int8 · ./build/resnet50.signal
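
The selection in the sample output is consistent with a simple rule: discard every precision whose accuracy drop exceeds the tolerance, then keep the fastest survivor. The Python sketch below illustrates that rule on the three rows shown above. It is not the Lite optimiser's code: the `Candidate` class and `select_precision` function are hypothetical names, and breaking ties by throughput is an assumption (the real tool may weigh model size or other factors).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One row of the sweep (hypothetical structure, for illustration only)."""
    precision: str        # "fp16", "int8", "int4"
    acc_delta_pct: float  # accuracy change vs. baseline; negative means worse
    throughput_x: float   # speed-up relative to the unquantised model
    size_mb: float


def select_precision(
    candidates: Sequence[Candidate], tolerance_pct: float
) -> Optional[Candidate]:
    """Return the fastest candidate whose accuracy drop is within tolerance.

    Assumption: "best" means highest throughput among accepted candidates.
    Returns None when every candidate exceeds the tolerance.
    """
    accepted = [c for c in candidates if -c.acc_delta_pct <= tolerance_pct]
    if not accepted:
        return None
    return max(accepted, key=lambda c: c.throughput_x)


# Figures copied from the expected output above.
sweep = [
    Candidate("fp16", -0.2, 1.8, 49),
    Candidate("int8", -0.7, 3.4, 25),
    Candidate("int4", -2.9, 6.1, 13),
]

chosen = select_precision(sweep, tolerance_pct=1.5)
print(chosen.precision)  # int8
```

With a 1.5% tolerance, int4 is filtered out because its 2.9% drop is too large, and int8 beats fp16 on throughput, which matches the `selected: int8` line.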


What's next

→ telemetry

Stream optimiser events live

Watch latency, power, and scheduler decisions in real time and export to Parquet/CSV.

→ enterprise

Promote the projection to a real cluster

Take the leaf/spine plan you just modelled and ship it with distributed sharding and an SLA governor.