Qwen3.5-122B-A10B (vision, 4.7-bit)
Performance
| Context | PP tok/s | TG tok/s | TTFT (ms) | Peak mem (GB) |
|---|---|---|---|---|
| 32k tokens | 317.5 | 20.8 | 103201 | 70.7 |
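
The reported TTFT is close to what the prompt length and the prefill rate alone would predict. The sketch below checks that arithmetic. It assumes "32k" means 32,768 tokens (the value listed under Context) and that time to first token is dominated by prompt processing at a constant rate; the function name is hypothetical and not part of oMLX.

```python
def estimated_prefill_seconds(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Time to process the prompt if prefill runs at a constant rate."""
    return prompt_tokens / pp_tok_per_s


# Values taken from the 32k result on this page.
prompt_tokens = 32_768      # assumption: "32k" == 32,768 tokens
pp_rate = 317.5             # PP tok/s
reported_ttft_ms = 103_201  # TTFT (ms)

estimate_ms = estimated_prefill_seconds(prompt_tokens, pp_rate) * 1000
relative_gap = abs(estimate_ms - reported_ttft_ms) / reported_ttft_ms

print(f"estimated {estimate_ms:.0f} ms vs reported {reported_ttft_ms} ms")
print(f"relative gap: {relative_gap:.4%}")
```

Under these assumptions the estimate and the reported figure agree to well under 1%, which suggests the TTFT here is almost entirely prefill time.
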
Hardware
| Chip | Memory | GPU Cores |
|---|---|---|
| M1 Ultra (64c) | 128 GB | 64 |
Software
| oMLX | macOS | Context (tokens) |
|---|---|---|
| v0.2.14 | 26.3 | 32,768 |
Performance by Context Length
| Context | PP tok/s | TG tok/s | Peak Mem |
|---|---|---|---|
| 1k | 333.4 | 36.0 | 68.0 GB |
| 4k | 358.9 | 34.5 | 68.3 GB |
| 8k | 359.3 | 31.6 | 68.6 GB |
| 16k | 347.0 | 27.1 | 69.3 GB |
| 32k | 317.5 | 20.8 | 70.7 GB |
| 64k | 248.0 | 13.9 | 73.5 GB |
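
The table can be summarized with two derived figures: how much peak memory grows per additional 1k tokens of context, and how much generation speed is retained at the longest context. The sketch below computes both from the rows above. It treats the context labels as multiples of 1k and averages the growth between the first and last rows; attributing that growth to any specific cause (such as the KV cache) would be an assumption the page does not confirm.

```python
# (context in k tokens, PP tok/s, TG tok/s, peak memory in GB), copied from the table.
rows = [
    (1, 333.4, 36.0, 68.0),
    (4, 358.9, 34.5, 68.3),
    (8, 359.3, 31.6, 68.6),
    (16, 347.0, 27.1, 69.3),
    (32, 317.5, 20.8, 70.7),
    (64, 248.0, 13.9, 73.5),
]

first, last = rows[0], rows[-1]

# Average peak-memory growth per additional 1k tokens of context.
gb_per_k_tokens = (last[3] - first[3]) / (last[0] - first[0])

# Fraction of the 1k generation speed that remains at 64k.
tg_retained = last[2] / first[2]

print(f"~{gb_per_k_tokens * 1024:.0f} MB of peak memory per extra 1k tokens")
print(f"TG at 64k is {tg_retained:.0%} of TG at 1k")
```
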
Batching Results
| Batch Size | TG tok/s | Speedup |
|---|---|---|
| 1× | 36.0 | 1.00× |
| 2× | 54.3 | 1.51× |
| 4× | 74.8 | 2.08× |
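
The Speedup column is the batched TG rate divided by the 1× rate. The sketch below reproduces it and also derives a per-request rate. It assumes the TG tok/s figures in the batching table are aggregate throughput across all requests in the batch, which the page does not state explicitly; the helper name is hypothetical.

```python
def speedup(batched_tg: float, baseline_tg: float) -> float:
    """Throughput gain of a batched run relative to the single-request run."""
    return batched_tg / baseline_tg


# (batch size, TG tok/s), copied from the table above.
results = [(1, 36.0), (2, 54.3), (4, 74.8)]
baseline = results[0][1]

summary = []
for batch_size, tg in results:
    gain = speedup(tg, baseline)
    # Assumption: TG is aggregate, so each request sees roughly tg / batch_size.
    per_request = tg / batch_size
    # Efficiency of 1.0 would mean perfectly linear scaling with batch size.
    efficiency = gain / batch_size
    summary.append((batch_size, round(gain, 2), per_request, efficiency))
    print(f"{batch_size}x: speedup {gain:.2f}x, "
          f"~{per_request:.1f} tok/s per request, "
          f"scaling efficiency {efficiency:.0%}")
```

If the aggregate-throughput assumption holds, batching trades per-request speed for total throughput: total tokens per second rise with batch size while each individual request generates more slowly.
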