gpt-oss-20b-MXFP4-Q8

M1 Max (32c) · 64 GB · 4bit · 2026-03-23

Performance

4k tokens

PP tok/s 343.0

TG tok/s 51.0

TTFT (ms) 11943

Peak mem (GB) 11.5

Hardware

Chip M1 Max (32c)

Memory 64 GB

GPU Cores 32

Software

oMLX v0.2.14

macOS 26.3.1

Context 4,096
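
As a rough consistency check on the figures above, the reported TTFT can be compared with the time needed to process a 4,096-token prompt at the reported PP rate. This is only a sketch: it assumes that time to first token is dominated by prompt processing and that the prompt filled the whole 4,096-token context, neither of which the page states. The variable names are illustrative.

```python
# Values copied from the benchmark card above.
prompt_tokens = 4096       # Context 4,096 (assumed to equal the prompt length)
pp_tok_per_s = 343.0       # PP tok/s (prompt processing throughput)
reported_ttft_ms = 11943   # TTFT (ms)

# Assumption: TTFT ~= prompt_tokens / PP rate (prefill dominates).
estimated_ttft_ms = prompt_tokens / pp_tok_per_s * 1000.0

# Relative difference between the estimate and the reported value.
relative_gap = abs(estimated_ttft_ms - reported_ttft_ms) / reported_ttft_ms

print(f"estimated TTFT: {estimated_ttft_ms:.0f} ms")
print(f"reported TTFT:  {reported_ttft_ms} ms")
print(f"relative gap:   {relative_gap:.3%}")
```

Under those assumptions the estimate comes out at roughly 11.9 s, within a small fraction of a percent of the reported TTFT, which suggests the PP and TTFT figures were derived from the same prefill measurement.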