Llama-3.2-3B-Instruct

M4 (8c) · 16 GB · 4-bit · 2026-03-09

Performance

Context 4k tokens

PP tok/s 382.4

TG tok/s 37.8

TTFT (ms) 10715

Peak mem (GB) 2.6
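
The TTFT figure is close to what the prompt-processing rate alone would predict. The sketch below shows that arithmetic, assuming the prompt filled the full 4,096-token context listed under Software and that time to first token is dominated by prompt processing. The function name `estimated_ttft_ms` is illustrative and is not part of oMLX.

```python
def estimated_ttft_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Estimate time to first token in milliseconds.

    Assumption: TTFT is roughly prompt length divided by the
    prompt-processing (PP) rate, with other overheads ignored.
    """
    return prompt_tokens / pp_tok_per_s * 1000.0


# Values taken from this page: 4,096-token context, 382.4 PP tok/s.
estimate_ms = estimated_ttft_ms(4096, 382.4)
reported_ms = 10715.0

print(f"estimated TTFT: {estimate_ms:.0f} ms")
print(f"reported TTFT:  {reported_ms:.0f} ms")
print(f"difference:     {reported_ms - estimate_ms:.1f} ms")
```

Under these assumptions the estimate lands within a few milliseconds of the reported value, which suggests that nearly all of the wait before the first token is spent processing the prompt.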

Hardware

Chip M4 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.6

macOS 15.7.5

Context 4,096

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	419.8	47.9	2.3 GB
4k	382.4	37.8	2.6 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	47.9	1.00×
2×	84.8	1.77×
4×	88.5	1.85×
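
The Speedup column can be reproduced by dividing each batch's TG tok/s by the 1× value. The sketch below does that and also derives a per-request rate, assuming the TG tok/s column reports aggregate throughput summed over all concurrent requests (the page does not state this explicitly). The helper names are illustrative.

```python
# TG tok/s per batch size, copied from the Batching Results table.
batch_tg = {1: 47.9, 2: 84.8, 4: 88.5}


def batch_speedups(tg_by_batch: dict[int, float]) -> dict[int, float]:
    """Speedup of each batch size relative to batch size 1."""
    baseline = tg_by_batch[1]
    return {b: round(tg / baseline, 2) for b, tg in tg_by_batch.items()}


def per_request_rate(tg_by_batch: dict[int, float]) -> dict[int, float]:
    """Tokens per second seen by each request.

    Assumption: the table's TG tok/s is the aggregate across the batch.
    """
    return {b: round(tg / b, 1) for b, tg in tg_by_batch.items()}


speedups = batch_speedups(batch_tg)
per_request = per_request_rate(batch_tg)

for b in batch_tg:
    print(f"{b}x: speedup {speedups[b]:.2f}x, per-request {per_request[b]} tok/s")
```

If the aggregate assumption holds, going from 2× to 4× adds little total throughput while roughly halving the rate each individual request sees.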