Qwen3.5-4B

M2 (8c) · 16 GB · 4bit · 2026-03-10

Performance

Context 1k tokens

PP tok/s 160.2

TG tok/s 31.1

TTFT (ms) 6394

Peak mem (GB) 3.9
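
As a rough sanity check on the figures above, time to first token should be close to the prompt length divided by the prompt-processing rate. The sketch below assumes the prompt is the full 1,024-token context listed under Software and that TTFT is dominated by prompt processing; the variable names are illustrative and not taken from the benchmark tool.

```python
# Figures reported on this page.
prompt_tokens = 1024        # context length (assumed to equal the prompt size)
pp_tok_per_s = 160.2        # prompt-processing throughput
reported_ttft_ms = 6394     # time to first token as reported

# Assumption: TTFT is roughly prompt tokens / prompt-processing rate.
estimated_ttft_ms = prompt_tokens / pp_tok_per_s * 1000

difference_ms = abs(estimated_ttft_ms - reported_ttft_ms)
print(f"estimated TTFT: {estimated_ttft_ms:.0f} ms")
print(f"difference from reported: {difference_ms:.0f} ms")
```

Under these assumptions the estimate lands within a few milliseconds of the reported value, which suggests the PP tok/s and TTFT figures are consistent with each other.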

Hardware

Chip M2 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.6

macOS 15.6.1

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	160.2	31.1	3.9 GB
4k	159.4	29.0	4.2 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	31.1	1.00×
2×	36.8	1.18×
4×	38.0	1.22×
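
The Speedup column appears to be each batch size's TG tok/s divided by the batch-size-1 figure. The following minimal sketch reproduces it from the table values; that formula is an assumption inferred from the numbers, not something the page states, and the dictionary names are illustrative.

```python
# TG tok/s by batch size, copied from the Batching Results table.
tg_tok_per_s = {1: 31.1, 2: 36.8, 4: 38.0}

# Assumption: speedup is relative to the batch-size-1 throughput.
baseline = tg_tok_per_s[1]
speedups = {batch: round(rate / baseline, 2) for batch, rate in tg_tok_per_s.items()}

for batch, speedup in speedups.items():
    print(f"{batch}x\t{tg_tok_per_s[batch]}\t{speedup:.2f}x")
```

The computed ratios match the table's 1.00×, 1.18×, and 1.22×.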