Performance (32k tokens)
PP tok/s: 569.5
TG tok/s: 35.0
TTFT (ms): 57536
Peak mem (GB): 27.5
Hardware
Chip: M4 Pro (20c)
Memory: 64 GB
GPU Cores: 20
Software
oMLX: v0.3.8
macOS: 26.3
Context: 32,768
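
As a rough sanity check on the figures above, the sketch below compares the reported PP rate with the rate implied by the context length and the TTFT. It assumes that PP means prompt processing (prefill) and that nearly all of the time to first token is spent processing the 32,768-token prompt; neither assumption is stated on the card. The function name is illustrative only and is not part of oMLX.

```python
# Figures copied from the card; the interpretation of PP/TTFT is an assumption.
context_tokens = 32_768   # Context
ttft_ms = 57_536          # TTFT (ms)
pp_tok_per_s = 569.5      # PP tok/s as reported


def implied_prefill_rate(tokens: int, ttft_ms: float) -> float:
    """Tokens per second if the whole TTFT were spent on prompt processing."""
    return tokens / (ttft_ms / 1000.0)


implied = implied_prefill_rate(context_tokens, ttft_ms)
relative_gap = abs(implied - pp_tok_per_s) / pp_tok_per_s

print(f"implied PP rate: {implied:.1f} tok/s")
print(f"relative gap vs reported: {relative_gap:.4%}")
```

If the two rates agree closely, the TTFT and PP figures are mutually consistent under that assumption; a large gap would suggest that TTFT includes other overhead or that the prompt was shorter than the full context.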