
Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning

M1 (8c) · 16 GB · 8bit · 2026-05-18

Performance

Context 1k tokens

PP tok/s 78.4

TG tok/s 7.2

TTFT (ms) 13081

Peak mem (GB) 8.5
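As a rough sanity check on these figures, the time to first token can be compared with the prompt-processing rate. The sketch below assumes the 1k run used a 1,024-token prompt (the Context value listed under Software) and that TTFT is dominated by prompt processing; neither assumption is stated on this page, and `estimate_ttft_ms` is a hypothetical helper name.

```python
def estimate_ttft_ms(prompt_tokens: int, pp_tok_per_s: float) -> float:
    """Estimate time to first token in milliseconds.

    Assumes TTFT is approximately prompt length divided by the
    prompt-processing (PP) rate, ignoring any fixed startup overhead.
    """
    return prompt_tokens / pp_tok_per_s * 1000.0


# Values from the Performance section; 1,024 tokens is an assumed prompt length.
estimated = estimate_ttft_ms(1024, 78.4)
reported = 13081.0  # TTFT (ms) as reported above

print(f"estimated TTFT: {estimated:.0f} ms")
print(f"reported TTFT:  {reported:.0f} ms")
print(f"difference:     {reported - estimated:.0f} ms")
```

Under these assumptions the estimate lands within a few tens of milliseconds of the reported TTFT, which suggests the two numbers describe the same prompt-processing phase.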

Hardware

Chip M1 (8c)

Memory 16 GB

GPU Cores 8

Software

oMLX v0.3.8

macOS 26.4.1

Context 1,024

Performance by Context Length

Context	PP tok/s	TG tok/s	Peak Mem
1k	78.4	7.2	8.5 GB
4k	77.3	6.8	9.1 GB

Batching Results

Batch Size	TG tok/s	Speedup
1×	7.2	1.00×
2×	12.9	1.79×
4×	21.7	3.01×
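
The Speedup column is consistent with dividing each batch's TG tok/s by the 1× figure. A minimal sketch of that arithmetic follows, assuming the TG values are aggregate throughput across all sequences in the batch (the page does not say so explicitly); `batching_stats` is a hypothetical helper, and the per-stream and efficiency figures are derived here, not reported by the benchmark.

```python
# TG tok/s per batch size, copied from the Batching Results table.
tg_by_batch = {1: 7.2, 2: 12.9, 4: 21.7}


def batching_stats(tg_by_batch):
    """Return {batch: (speedup, per_stream_tok_s, efficiency)}.

    speedup          = TG at this batch size / TG at batch size 1
    per_stream_tok_s = TG / batch size (assumes TG is aggregate throughput)
    efficiency       = speedup / batch size (1.0 would be perfect scaling)
    """
    base = tg_by_batch[1]
    stats = {}
    for batch, tg in sorted(tg_by_batch.items()):
        speedup = tg / base
        stats[batch] = (speedup, tg / batch, speedup / batch)
    return stats


stats = batching_stats(tg_by_batch)
for batch, (speedup, per_stream, efficiency) in stats.items():
    print(
        f"{batch}x: speedup {speedup:.2f}x, "
        f"per-stream {per_stream:.2f} tok/s, "
        f"efficiency {efficiency:.0%}"
    )
```

Read this way, total throughput rises with batch size while each individual stream generates somewhat slower than it would alone.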