guided · lite edition
Lite Onboarding Walkthrough
Three steps take you from a raw model to a projected leaf/spine deployment, using only the open-source Lite stack. Each step runs locally; no cluster is required.
Auto-quantise an ONNX model
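Run a calibration sweep across fp16, int8 (per-channel) and int4 (GPTQ-style). The Lite optimiser then picks a precision whose accuracy drop stays below your tolerance; in the sample run below, int8 is selected and int4 is rejected because its 2.9% drop exceeds the 1.5% limit.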
bash — step 1
adaptive-ctl optimise ./resnet50.onnx \
  --auto-quant --tolerance=1.5% --calibset=./imagenet-val-512
# sweeping fp16 → int8 → int4 …
expected output
✓ fp16 acc Δ -0.2% throughput 1.8× size 49 MB
✓ int8 acc Δ -0.7% throughput 3.4× size 25 MB
✓ int4 acc Δ -2.9% throughput 6.1× size 13 MB (rejected)
→ selected: int8 · ./build/resnet50.signal
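The selection shown in the output above can be reproduced with a small sketch. This is not the Lite optimiser's actual code: `SweepResult` and `select_precision` are hypothetical names, and the rule of preferring the highest throughput among the accepted precisions is an assumption inferred from the sample output, where int8 is chosen over fp16 although both pass.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SweepResult:
    """One row of the calibration sweep (hypothetical structure)."""
    precision: str
    acc_delta_pct: float  # accuracy change vs. baseline; negative means a drop
    speedup: float        # throughput relative to the unquantised model
    size_mb: int


def select_precision(results: List[SweepResult],
                     tolerance_pct: float) -> Optional[SweepResult]:
    """Return the fastest candidate whose accuracy drop is below the tolerance.

    Assumption: ties between accepted candidates are broken by throughput.
    Returns None when every candidate exceeds the tolerance.
    """
    accepted = [r for r in results if -r.acc_delta_pct < tolerance_pct]
    if not accepted:
        return None
    return max(accepted, key=lambda r: r.speedup)


# Values copied from the expected output of step 1.
SWEEP = [
    SweepResult("fp16", -0.2, 1.8, 49),
    SweepResult("int8", -0.7, 3.4, 25),
    SweepResult("int4", -2.9, 6.1, 13),
]

chosen = select_precision(SWEEP, tolerance_pct=1.5)
print(chosen.precision if chosen else "none")  # prints "int8"
```

With a looser tolerance such as 3.0 the same rule would accept int4, which shows how the `--tolerance` value trades accuracy for throughput and size.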