
Qwen3-VL-4B-Instruct

M1 (8c) · 16 GB · 4-bit · 2026-03-11

Performance

Context 1k tokens

PP tok/s 144.5

TG tok/s 20.7

TTFT (ms) 7087

Peak mem (GB) 3.6
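
The reported TTFT is consistent with the prompt-processing rate. A minimal sketch of that arithmetic, assuming the prompt was the full 1,024-token context and that time to first token is dominated by prompt processing (both are assumptions, not statements from the benchmark tool; the variable names are illustrative):

```python
# Values copied from the 1k-context run on this page.
prompt_tokens = 1024    # assumed prompt length (the listed context size)
pp_tok_per_s = 144.5    # prompt processing throughput (PP tok/s)

# Assumption: TTFT is roughly prompt length divided by prompt throughput.
ttft_ms = prompt_tokens / pp_tok_per_s * 1000

print(f"estimated TTFT: {ttft_ms:.0f} ms")
```

Under these assumptions the estimate lands within a millisecond of the reported 7087 ms.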

Hardware

Chip M1 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.7

macOS 26.3.1

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	144.5	20.7	3.6 GB
4k	133.4	18.1	4.0 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	20.7	1.00×
2×	31.3	1.51×
4×	29.9	1.44×
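
The Speedup column can be reproduced by dividing each batch's TG rate by the batch-size-1 rate. A short sketch, assuming the TG figures are aggregate throughput across all concurrent requests (an assumption; the page does not say so). The per-request figure is derived here and is not part of the reported results:

```python
# TG tok/s by batch size, copied from the table above.
tg_by_batch = {1: 20.7, 2: 31.3, 4: 29.9}

baseline = tg_by_batch[1]  # single-request generation rate


def speedup(batch: int) -> float:
    """Throughput at this batch size relative to batch size 1."""
    return tg_by_batch[batch] / baseline


for batch, tg in tg_by_batch.items():
    # Assumption: tg is the total across requests, so each request
    # sees roughly tg / batch tokens per second.
    per_request = tg / batch
    print(f"{batch}x: speedup {speedup(batch):.2f}x, "
          f"about {per_request:.1f} tok/s per request")
```

Read this way, going from batch size 2 to 4 lowers total throughput slightly while roughly halving the rate each request sees.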