Qwen2.5-3B-Instruct

M1 (8c) · 16 GB · 4bit · 2026-03-10

Performance

Context 1k tokens

PP tok/s 303.1

TG tok/s 30.4

TTFT 3378 ms

Peak memory 2.2 GB
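
The TTFT figure above is close to what the prompt-processing rate alone would predict. The sketch below is a sanity check, not how oMLX measures TTFT: it assumes PP tok/s means prompt-processing throughput, that the prompt is exactly 1,024 tokens, and that time to first token is dominated by prompt processing. The function name `estimated_ttft_ms` is hypothetical.

```python
def estimated_ttft_ms(prompt_tokens: int, pp_tok_s: float) -> float:
    """Estimate time to first token as pure prompt-processing time.

    Assumption: no fixed overhead (model load, sampling) is included.
    """
    return prompt_tokens / pp_tok_s * 1000.0


# Values taken from this page: 1,024-token context, 303.1 PP tok/s.
est = estimated_ttft_ms(1024, 303.1)
print(f"estimated TTFT: {est:.0f} ms (reported: 3378 ms)")
```

Under those assumptions the estimate lands within a millisecond of the reported value, which suggests the two numbers describe the same phase of the run.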

Hardware

Chip M1 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.7

macOS 15.3.1

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	303.1	30.4	2.2 GB
4k	288.0	26.6	2.4 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	30.4	1.00×
2×	53.8	1.77×
4×	56.3	1.85×
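
The Speedup column can be reproduced by dividing each batch's TG tok/s by the batch-size-1 figure. The sketch below does that and also derives a per-request rate, assuming the reported TG tok/s is aggregate throughput across all requests in the batch (the page does not state this explicitly). The helper names are hypothetical.

```python
# Token-generation throughput (TG tok/s) per batch size, from the table above.
tg_tok_s = {1: 30.4, 2: 53.8, 4: 56.3}


def speedups(results: dict[int, float]) -> dict[int, float]:
    """Throughput of each batch size relative to batch size 1."""
    base = results[1]
    return {batch: round(tps / base, 2) for batch, tps in results.items()}


def per_request(results: dict[int, float]) -> dict[int, float]:
    """Tokens per second seen by one request.

    Assumption: the reported figure is the aggregate across the batch.
    """
    return {batch: round(tps / batch, 1) for batch, tps in results.items()}


for batch in tg_tok_s:
    print(batch, speedups(tg_tok_s)[batch], per_request(tg_tok_s)[batch])
```

Read this way, going from batch size 2 to 4 adds little total throughput (1.77× to 1.85×) while each individual request slows down noticeably.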