Llama-3.2-1B-Instruct

M1 (8c) · 16 GB · 4bit · 2026-03-14

Performance

Context 1k tokens

PP tok/s 800.4

TG tok/s 67.8

TTFT (ms) 1281

Peak mem (GB) 1.3
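The reported TTFT is close to what the prompt-processing rate alone would predict. The sketch below is a rough sanity check, not how the benchmark measures TTFT: it assumes the prompt is exactly 1,024 tokens (the Context value on this page) and that time to first token is dominated by prefill, so TTFT ≈ prompt tokens / PP tok/s. The function and constant names are made up for illustration.

```python
# Values copied from this benchmark card; the relationship itself is an assumption.
PROMPT_TOKENS = 1024      # "Context 1,024"
PP_TOK_PER_S = 800.4      # prompt-processing throughput
REPORTED_TTFT_MS = 1281   # time to first token as reported


def estimate_ttft_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Estimate TTFT as pure prefill time, ignoring any fixed overhead."""
    return prompt_tokens / pp_tok_per_s * 1000.0


estimated_ms = estimate_ttft_ms(PROMPT_TOKENS, PP_TOK_PER_S)
gap_ms = REPORTED_TTFT_MS - estimated_ms
print(f"estimated {estimated_ms:.0f} ms, reported {REPORTED_TTFT_MS} ms, gap {gap_ms:.1f} ms")
```

Under these assumptions the estimate lands within a few milliseconds of the reported figure, which suggests the two numbers are consistent with each other.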

Hardware

Chip M1 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.2.11

macOS 26.4

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	800.4	67.8	1.3 GB
4k	796.7	50.7	1.4 GB
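To make the trend between the two context lengths easier to read, the following sketch computes the relative change of each column when going from 1k to 4k. It only restates the table values as percentages; the helper name `percent_change` is illustrative.

```python
# Rows copied from the table above: context -> (PP tok/s, TG tok/s, peak mem in GB).
rows = {"1k": (800.4, 67.8, 1.3), "4k": (796.7, 50.7, 1.4)}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


labels = ("PP tok/s", "TG tok/s", "Peak mem")
changes = {
    label: percent_change(at_1k, at_4k)
    for label, at_1k, at_4k in zip(labels, rows["1k"], rows["4k"])
}
for label, pct in changes.items():
    print(f"{label}: {pct:+.1f}%")
```

On these numbers, prompt processing is nearly flat, while generation speed drops by roughly a quarter and peak memory rises slightly.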

Batching Results

Batch Size	TG tok/s	Speedup
1×	67.8	1.00×
2×	115.5	1.70×
4×	117.7	1.74×
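The Speedup column can be reproduced by dividing each TG rate by the batch-size-1 rate. The sketch below does that and also derives a per-sequence rate, which assumes (this page does not say so) that TG tok/s is the aggregate throughput across all sequences in the batch. The function name `batching_summary` is hypothetical.

```python
# Decode throughput (TG tok/s) per batch size, copied from the table above.
tg_tok_s = {1: 67.8, 2: 115.5, 4: 117.7}


def batching_summary(results: dict[int, float]) -> dict[int, tuple[float, float]]:
    """Return {batch: (speedup vs. batch 1, tok/s per sequence)}.

    The per-sequence figure assumes the reported rate is the batch total.
    """
    baseline = results[1]
    return {
        batch: (rate / baseline, rate / batch)
        for batch, rate in results.items()
    }


summary = batching_summary(tg_tok_s)
for batch, (speedup, per_seq) in summary.items():
    print(f"{batch}x: speedup {speedup:.2f}x, about {per_seq:.1f} tok/s per sequence")
```

If the aggregate-throughput assumption holds, most of the gain is already captured at batch size 2, and going to 4 roughly halves the per-sequence rate for a small increase in total throughput.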