Qwen3-4B

M4 (8c) · 16 GB · 4-bit · 2026-03-09

Performance

Context 1k tokens

PP tok/s 290.2

TG tok/s 36.3

TTFT (ms) 3528

Peak mem (GB) 2.8
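
The reported TTFT is close to what prompt processing alone would take. A minimal sketch of that check, assuming the "1k" prompt is exactly 1,024 tokens and that time to first token is dominated by prefill. The function name estimate_ttft_ms is hypothetical and is not part of oMLX.

```python
def estimate_ttft_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Estimate time to first token (ms) from prompt length and prefill speed.

    Assumption: TTFT is roughly prompt_tokens / PP throughput, ignoring
    scheduling overhead and the cost of sampling the first token.
    """
    return prompt_tokens / pp_tok_per_s * 1000.0


# Values taken from the Performance figures on this page.
ttft_ms = estimate_ttft_ms(prompt_tokens=1024, pp_tok_per_s=290.2)
print(f"estimated TTFT: {ttft_ms:.1f} ms (reported: 3528 ms)")
```

The estimate lands within 1 ms of the reported 3528 ms, which suggests that prefill accounts for nearly all of the TTFT in this run.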

Hardware

Chip M4 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.6

macOS 26.3

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	290.2	36.3	2.8 GB
4k	265.1	30.4	3.2 GB
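
To see how much each metric moves between the two context lengths, the rows can be compared as percentage changes. This is only a sketch over the two data points in the table. The helper pct_change and the dictionary layout are illustrative assumptions, not anything produced by the benchmark tool.

```python
# (PP tok/s, TG tok/s, peak memory in GB) per context length, from the table.
results = {
    "1k": (290.2, 36.3, 2.8),
    "4k": (265.1, 30.4, 3.2),
}


def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


labels = ("PP tok/s", "TG tok/s", "Peak mem")
changes = {
    label: pct_change(base, new)
    for label, base, new in zip(labels, results["1k"], results["4k"])
}

for label, delta in changes.items():
    print(f"{label}: {delta:+.1f}% from 1k to 4k")
```

Going from 1k to 4k, prompt processing slows by about 8.6%, generation slows by about 16.3%, and peak memory grows by about 14.3%.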

Batching Results

Batch Size	TG tok/s	Speedup
1×	36.3	1.00×
2×	65.7	1.81×
4×	69.2	1.91×
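
The Speedup column can be reproduced by dividing each batch's TG tok/s by the batch-size-1 figure. The sketch below also derives two numbers that are not in the table: scaling efficiency (speedup divided by batch size) and per-request throughput. Both assume that TG tok/s is the aggregate across all requests in the batch, which the page does not state explicitly. The variable and function names are hypothetical.

```python
# Batch size -> TG tok/s, from the Batching Results table.
tg_by_batch = {1: 36.3, 2: 65.7, 4: 69.2}

baseline = tg_by_batch[1]


def speedup(tg_tok_per_s: float, base_tok_per_s: float) -> float:
    """Throughput relative to the batch-size-1 baseline."""
    return tg_tok_per_s / base_tok_per_s


for batch, tg in tg_by_batch.items():
    s = speedup(tg, baseline)
    efficiency = s / batch   # 1.0 would mean perfect linear scaling
    per_request = tg / batch  # assumes TG tok/s is the aggregate rate
    print(
        f"{batch}x: speedup {s:.2f}x, "
        f"efficiency {efficiency:.0%}, "
        f"{per_request:.1f} tok/s per request"
    )
```

Under that assumption, batch size 2 keeps about 90% scaling efficiency, while batch size 4 adds little aggregate throughput over batch size 2 and drops to roughly 17.3 tok/s per request.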