Qwen2.5-Coder-3B-Instruct
Performance
| Metric | Value |
|---|---|
| Tokens | 1k |
| PP tok/s | 3,887 |
| TG tok/s | 172.8 |
| TTFT (ms) | 264 |
| Peak mem (GB) | 2.5 |
Hardware
| Component | Value |
|---|---|
| Chip | M5 Max (32c) |
| Memory | 36 GB |
| GPU Cores | 32 |
Software
| Component | Value |
|---|---|
| oMLX | v0.3.9.dev2 |
| macOS | 26.4.1 |
| Context | 1,024 |
Batching Results
| Batch Size | TG tok/s | Speedup |
|---|---|---|
| 1× | 172.8 | 1.00× |
| 2× | 266.5 | 1.54× |
| 4× | 350.8 | 2.03× |
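
The Speedup column can be reproduced from the TG figures, and the same numbers give a per-request rate and a scaling efficiency. This is a minimal sketch that assumes the table's TG tok/s is the aggregate throughput across all sequences in the batch (the page does not say so explicitly); `batching_summary` is a hypothetical helper name.

```python
# TG tok/s per batch size, copied from the Batching Results table.
# Assumption: each value is the aggregate rate over the whole batch.
BATCH_TG = {1: 172.8, 2: 266.5, 4: 350.8}


def batching_summary(results: dict[int, float]) -> list[dict[str, float]]:
    """Derive speedup, per-request rate, and scaling efficiency per batch size."""
    baseline = results[1]  # batch size 1 is the reference point
    rows = []
    for batch, total in sorted(results.items()):
        rows.append({
            "batch": batch,
            "speedup": total / baseline,                # matches the Speedup column
            "per_request": total / batch,               # tok/s seen by each sequence
            "efficiency": total / (baseline * batch),   # 1.0 would be perfect scaling
        })
    return rows


if __name__ == "__main__":
    for row in batching_summary(BATCH_TG):
        print(
            f"batch {row['batch']}: speedup {row['speedup']:.2f}x, "
            f"{row['per_request']:.1f} tok/s per request, "
            f"efficiency {row['efficiency']:.0%}"
        )
```

Read this way, a batch of 4 roughly doubles total throughput while each individual request decodes at about half the single-request rate.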