Qwen3-VL-4B-Instruct

M4 (8c) · 16 GB · 4bit · 2026-03-10

Performance

Context 4k tokens

PP tok/s 284.0

TG tok/s 30.0

TTFT (ms) 14422

Peak mem (GB) 4.0
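
The reported TTFT looks consistent with prompt length divided by prompt-processing (PP) throughput. The sketch below checks that arithmetic. It assumes TTFT is dominated by prompt processing and that "4k" means the 4,096-token context listed under Software. Neither is a documented oMLX definition, and the function name `estimate_ttft_ms` is hypothetical.

```python
# Assumption: TTFT ~= prompt_tokens / PP throughput (prompt processing dominates).
# This is inferred from the numbers on this page, not from oMLX documentation.

def estimate_ttft_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Estimate time to first token, in milliseconds."""
    return prompt_tokens / pp_tok_per_s * 1000.0


REPORTED_TTFT_MS = 14422  # value from the Performance section

estimate = estimate_ttft_ms(4096, 284.0)
print(f"estimated TTFT: {estimate:.1f} ms, reported: {REPORTED_TTFT_MS} ms")
print(f"difference: {abs(estimate - REPORTED_TTFT_MS):.1f} ms")
```

Under these assumptions the estimate lands within about a millisecond of the reported figure, so nearly all of the wait before the first token is prompt processing.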

Hardware

Chip M4 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.6

macOS 26.3.1

Context 4,096

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	310.8	36.8	3.6 GB
4k	284.0	30.0	4.0 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	36.8	1.00×
2×	64.9	1.76×
4×	69.5	1.89×
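
The Speedup column can be reproduced by dividing each TG rate by the 1× rate. The sketch below does that and adds a per-slot efficiency figure (speedup divided by batch size), which is not on the original page. It assumes the TG tok/s values are aggregate throughput across the whole batch. The 1× rate of 36.8 matches the 1k-context row, which suggests the batching runs used a 1k context, but that is an inference. The helper name `batching_summary` is hypothetical.

```python
# TG tok/s per batch size, copied from the Batching Results table.
# Assumption: these are aggregate rates across all requests in the batch.
BATCH_TG = {1: 36.8, 2: 64.9, 4: 69.5}


def batching_summary(tg_by_batch: dict[int, float]) -> list[tuple[int, float, float]]:
    """Return (batch_size, speedup, efficiency) rows relative to batch size 1."""
    base = tg_by_batch[1]
    rows = []
    for batch, tg in sorted(tg_by_batch.items()):
        speedup = tg / base           # same ratio as the Speedup column
        efficiency = speedup / batch  # 1.0 would mean perfect linear scaling
        rows.append((batch, round(speedup, 2), round(efficiency, 2)))
    return rows


for batch, speedup, efficiency in batching_summary(BATCH_TG):
    print(f"{batch}x  speedup={speedup:.2f}x  efficiency={efficiency:.2f}")
```

Going from 2× to 4× adds little aggregate throughput, so the efficiency figure falls sharply at batch size 4.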