Quantized Labs | Hardware Benchmarks

Hardware Compatibility Matrix

Hardware / Chipset	SmolLM-135M (Ultra-Tiny, 150MB RAM)	Gemma-2-2B (IoT / Mobile, 600MB RAM)	Llama-3.1-8B (Reasoning, 2.5GB RAM)	Qwen-2.5-14B (Heavy-Duty, 4GB RAM)
AWS Graviton3 c7g.2xlarge (Cloud Server / ARM Neoverse-V1)	2,800 T/s	620 T/s	145 T/s	72 T/s
Apple M3 Max (MacBook Pro / Desktop ARM)	2,400 T/s	450 T/s	110 T/s	65 T/s
AMD Ryzen 9 7950X (Desktop x86 / AVX-512)	2,100 T/s	380 T/s	98 T/s	54 T/s
Intel Core Ultra 7 155H (Meteor Lake / Modern AI Laptop)	1,950 T/s	360 T/s	92 T/s	50 T/s
Intel Core i7-13700K (Desktop x86 / AVX2)	1,800 T/s	320 T/s	85 T/s	45 T/s
Snapdragon 8 Gen 3 (Galaxy S24 / Flagship Android)	950 T/s	160 T/s	42 T/s	24 T/s
MediaTek Dimensity 9300 (Flagship Android)	880 T/s	150 T/s	38 T/s	20 T/s
Apple A17 Pro (iPhone 15 Pro)	800 T/s	90 T/s	28 T/s	16 T/s
NVIDIA Jetson Orin Nano (Edge Robotics)	700 T/s	120 T/s	30 T/s	15 T/s
Google Pixel 8 Pro (Tensor G3)	650 T/s	115 T/s	28 T/s	14 T/s
Steam Deck (AMD Custom APU)	600 T/s	110 T/s	25 T/s	12 T/s
Snapdragon 6 Gen 3 (Mid-Range Android)	350 T/s	45 T/s	12 T/s	OOM
Rockchip RK3588 (SBC / Orange Pi 5)	220 T/s	35 T/s	8 T/s	OOM
Apple Watch Ultra 2 (S9 SiP)	120 T/s	OOM	OOM	OOM

Legend: Real-Time Inference (>30 T/s) | Usable Inference (10-30 T/s) | Out of Memory (OOM)
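The legend tiers can be applied to any cell of the table programmatically. The following is a minimal sketch, not part of any Quantized Labs tooling: the function name `classify` is hypothetical, and the "Below Usable" label is an assumption, since the legend defines no tier for results under 10 T/s (such as the 8 T/s entry for the Rockchip RK3588).

```python
def classify(cell: str) -> str:
    """Map a table cell such as '2,800 T/s' or 'OOM' to a legend tier."""
    text = cell.strip()
    if text.upper() == "OOM":
        return "Out of Memory"
    # Strip the unit and the thousands separator: '2,800 T/s' -> 2800.0
    rate = float(text.replace("T/s", "").replace(",", "").strip())
    if rate > 30:
        return "Real-Time Inference"
    if rate >= 10:
        return "Usable Inference"
    # Assumption: the legend names no tier below 10 T/s, so this label is ours.
    return "Below Usable"


# Example: the Snapdragon 6 Gen 3 row from the table above.
row = ["350 T/s", "45 T/s", "12 T/s", "OOM"]
print([classify(cell) for cell in row])
```

Because the legend states ">30 T/s" for real-time, a result of exactly 30 T/s is treated here as usable rather than real-time.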

Zero Thermal Throttling

Standard edge runtimes (such as CoreML and NNAPI) drain the battery and overheat the device within minutes, forcing the chip to throttle and causing sharp drops in throughput. The Quantized Labs Symbiotic Runtime is designed to stay below the throttling threshold under sustained load.

[Chart: device temperature under sustained inference, axis 30°C to 45°C]
Standard CoreML: 44.2°C, throttles at 3 mins
Quantized Labs: 33.8°C, 60 mins continuous

Battery Consumption (mAh / 10k Tokens)

Thermals matter, but for mobile and wearables, battery life is the ultimate constraint. Quantized Labs runs on raw integer execution units, consuming up to 80% less power than float-based Neural Engine frameworks.

[Chart: battery consumption per 10k tokens on iPhone 15 Pro, axis 25 mAh to 100 mAh]
Standard CoreML: 88 mAh
Quantized Labs: 17 mAh
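The "up to 80%" figure can be checked against the two chart values (88 mAh and 17 mAh per 10k tokens). The sketch below is illustrative only: the function names are hypothetical, and the 3,000 mAh battery capacity is a placeholder assumption, not a measured iPhone 15 Pro value. Comparing mAh directly also assumes both runs draw at roughly the same voltage.

```python
def power_reduction(baseline_mah: float, optimized_mah: float) -> float:
    """Fractional reduction in charge consumed per 10k tokens."""
    return (baseline_mah - optimized_mah) / baseline_mah


def tokens_per_charge(capacity_mah: float, mah_per_10k_tokens: float) -> float:
    """Tokens generated before a battery of the given capacity is drained."""
    return capacity_mah / mah_per_10k_tokens * 10_000


reduction = power_reduction(88, 17)
print(f"Reduction: {reduction:.1%}")  # Reduction: 80.7%

# Hypothetical capacity, used only to show the scale of the difference.
ASSUMED_CAPACITY_MAH = 3_000
for name, cost in [("Standard CoreML", 88), ("Quantized Labs", 17)]:
    print(f"{name}: {tokens_per_charge(ASSUMED_CAPACITY_MAH, cost):,.0f} tokens")
```

The ratio between the two runtimes (88 / 17, roughly 5.2x more tokens per charge) does not depend on the assumed capacity.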

Real-World Performance Simulator

Watch how the Quantized Labs Symbiotic Engine outperforms standard float-based pipelines on identical hardware (simulated iPhone 15 Pro).

[Simulator: both runtimes answer the prompt "Explain Quantum Entanglement"]
CoreML (Float16): Thermal Throttling Detected, 12 T/s
Quantized Labs (Int2): Nominal Temp (33°C), 38 T/s

Memory Bandwidth Bottlenecks

Edge AI is bound by memory bandwidth, not just compute. The following matrix shows the theoretical minimum bandwidth required to achieve 10 Tokens/Sec for each compressed architecture.

Architecture	Min Bandwidth for 10 T/s	Recommended RAM Type	Compatible Hardware
SmolLM-135M	1.5 GB/s	DDR4 / LPDDR4	IoT / Raspberry Pi 4
Gemma-2-2B	6.0 GB/s	LPDDR4x	Mid-Range Phones
Llama-3.1-8B	25.0 GB/s	LPDDR5 (3200MHz+)	Flagship Phones / Laptops
Qwen-2.5-14B	40.0 GB/s	LPDDR5x	Apple M-Series / High-End Laptops
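The bandwidth figures in this matrix line up with a simple rule of thumb: if generating each token requires streaming the full set of weights from memory once, the minimum bandwidth is the model's memory footprint multiplied by the target token rate. The sketch below applies that rule to the RAM footprints listed in the benchmark table header. It is an approximation under that stated assumption and ignores KV-cache and activation traffic; the function names are hypothetical.

```python
# Memory footprints (GB) as listed in the benchmark table header.
MODEL_RAM_GB = {
    "SmolLM-135M": 0.15,
    "Gemma-2-2B": 0.6,
    "Llama-3.1-8B": 2.5,
    "Qwen-2.5-14B": 4.0,
}


def min_bandwidth_gbps(footprint_gb: float, tokens_per_sec: float = 10.0) -> float:
    """Bandwidth needed if every token reads all weights exactly once."""
    return footprint_gb * tokens_per_sec


def max_tokens_per_sec(bandwidth_gbps: float, footprint_gb: float) -> float:
    """Upper bound on token rate for a given memory bandwidth."""
    return bandwidth_gbps / footprint_gb


for model, footprint in MODEL_RAM_GB.items():
    print(f"{model}: {min_bandwidth_gbps(footprint):.1f} GB/s for 10 T/s")
```

The same relationship can be inverted with `max_tokens_per_sec` to estimate a throughput ceiling for a device whose sustained memory bandwidth is known.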

Sustained Load Profiling

Need to know exactly how Quantized Labs performs on your proprietary hardware? Submit your silicon architecture details and our engineering team will provide a comprehensive Time-to-Throttling analysis.
