Llama-3.2-3B-Instruct

M1 (8c) · 16 GB · 4bit · 2026-03-12

Performance

Context 1k tokens

PP tok/s 312.3

TG tok/s 28.6

TTFT (ms) 3282

Peak mem (GB) 2.3
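As a rough sanity check, the reported TTFT can be compared with the prefill time implied by the PP rate. The sketch below assumes the prompt is exactly 1,024 tokens (the Context value listed under Software) and that TTFT is dominated by prompt processing; `estimated_prefill_ms` is an illustrative helper, not part of oMLX.

```python
def estimated_prefill_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Time to process the prompt, in milliseconds, at a constant PP rate."""
    return prompt_tokens / pp_tok_per_s * 1000.0


# Values taken from the Performance section; 1,024 tokens is an assumption
# based on the Context field.
estimate_ms = estimated_prefill_ms(1024, 312.3)
reported_ttft_ms = 3282

print(f"estimated prefill: {estimate_ms:.0f} ms")
print(f"reported TTFT:     {reported_ttft_ms} ms")
print(f"difference:        {reported_ttft_ms - estimate_ms:.0f} ms")
```

Under these assumptions the estimate lands within a few milliseconds of the reported TTFT, which suggests the two figures describe the same prefill pass.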

Hardware

Chip M1 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.9

macOS 26.3.1

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	312.3	28.6	2.3 GB
4k	292.3	23.8	2.6 GB
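To put the two rows side by side, the relative change from 1k to 4k context can be computed directly from the table. This is a small sketch using only the figures above; the `pct_change` helper and the dictionary layout are illustrative.

```python
def pct_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# (PP tok/s, TG tok/s, peak memory in GB) per context length, from the table.
results = {
    "1k": (312.3, 28.6, 2.3),
    "4k": (292.3, 23.8, 2.6),
}

labels = ("PP tok/s", "TG tok/s", "Peak mem")
changes = {
    label: pct_change(old, new)
    for label, old, new in zip(labels, results["1k"], results["4k"])
}

for label, change in changes.items():
    print(f"{label}: {change:+.1f}% from 1k to 4k")
```

On these numbers, prompt processing slows by roughly 6%, generation by roughly 17%, and peak memory grows by about 13%.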

Batching Results

Batch Size	TG tok/s	Speedup
1×	28.6	1.00×
2×	52.5	1.84×
4×	57.0	1.99×
8×	58.1	2.03×
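The Speedup column is each batch's TG rate divided by the 1× rate. The sketch below reproduces it and also derives a per-request rate, assuming the TG figure is aggregate throughput across all sequences in the batch (the page does not state this explicitly). The names `batch_tg`, `speedup`, and `per_request` are illustrative.

```python
# Batch size -> TG tok/s, from the Batching Results table.
batch_tg = {1: 28.6, 2: 52.5, 4: 57.0, 8: 58.1}
baseline = batch_tg[1]


def speedup(tg: float, base: float) -> float:
    """Throughput relative to the single-request baseline."""
    return tg / base


def per_request(tg: float, batch: int) -> float:
    """Tokens per second seen by each request, assuming tg is an aggregate."""
    return tg / batch


for batch, tg in batch_tg.items():
    s = speedup(tg, baseline)
    print(
        f"{batch}x: speedup {s:.2f}x, "
        f"per-request {per_request(tg, batch):.1f} tok/s, "
        f"scaling efficiency {s / batch:.0%}"
    )
```

If the aggregate assumption holds, throughput nearly doubles at 2× but flattens beyond that, so each individual request at 8× would see only a fraction of the single-request rate.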