The Quantized Labs Model Repository

Discover ultra-compressed foundation models that defy hardware limitations. Powered by Quantized Labs, these models run entirely on CPU, with no GPU or NPU required. Thanks to a massive reduction in RAM footprint, you can now run an 8-billion-parameter model in under 2 GB of RAM on any modern smartphone.
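To see where a figure like "under 2 GB" comes from, here is a rough back-of-the-envelope sketch. It counts weight storage only and assumes decimal gigabytes; the function name `weight_bytes` is purely illustrative, and real footprints also include quantization scales, activations, and the context cache, so treat the numbers as an approximation rather than a specification.

```python
def weight_bytes(num_params: float, bits_per_weight: float) -> float:
    """Bytes needed to store the weights alone at a given bit width."""
    return num_params * bits_per_weight / 8


GB = 1e9  # decimal gigabyte; an assumption about the units used in the listings

for label, params, bits in [
    ("8B at 16-bit", 8e9, 16),
    ("8B at 2-bit", 8e9, 2),
    ("0.36B at 1.5-bit", 0.36e9, 1.5),
]:
    # Weights only: scales, activations, and context cache are not counted.
    print(f"{label}: {weight_bytes(params, bits) / GB:.2f} GB")
```

Under these assumptions, an 8-billion-parameter model needs about 16 GB at 16 bits per weight and about 2 GB at 2 bits per weight, which is in line with the 16 GB and 1.92 GB figures shown for Llama-3.1-8B.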

Pro Vault

Mid-Market Access

$499/mo

Unlimited downloads for models up to 14B parameters. Perfect for indie developers and startups prototyping edge deployment.

Titan Vault

Enterprise Access

$1,999/mo

Unlimited downloads for the entire marketplace, including 70B+ enterprise reasoning models. Includes prioritized hardware heuristics and premium SLAs.

Ultra-Tiny Background Agents (< 0.5B)

Load in milliseconds. Perfect for predictive typing, notification filtering, and background data sorting. 85 MB to 120 MB RAM footprint.

SmolLM-360M

HuggingFace
Apache 2.0 (Commercial OK)
Background, Classification
Format: 1.5-bit Asym
Context: 8k
Original RAM: 720 MB
Quantized Labs RAM: 85 MB
99% MMLU Match
Hardware Heuristics
iPhone 15 Pro: 450 T/s
Capabilities
Tool Use, JSON Mode, Web Search
Free

Mamba-370M

State Space
Apache 2.0 (Commercial OK)
Fixed RAM, Background
Format: 1.5-bit Asym
Context: 64k
Original RAM: 740 MB
Quantized Labs RAM: 88 MB
99% MMLU Match
Hardware Heuristics
iPhone 15 Pro: 480 T/s
Capabilities
Tool Use, JSON Mode, Web Search
Free

Qwen-2.5-0.5B

Alibaba
Qwen License (Commercial OK)
Multilingual, Fast
Format: 2-bit Asym
Context: 32k
Original RAM: 1.0 GB
Quantized Labs RAM: 120 MB
99% MMLU Match
Hardware Heuristics
iPhone 15 Pro: 400 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$99.00

Standard Application Assistants (1B - 3B)

The "daily driver" for consumer apps. 500 MB to 900 MB RAM footprint. Perfect for customer service bots, localized AI, and offline drafting.

Llama-3.2-3B

Meta
Llama 3 License (Commercial OK)
Conversational, Assistant
Format: 2-bit Asym
Context: 128k
Original RAM: 6.0 GB
Quantized Labs RAM: 750 MB
99% HumanEval Match
Hardware Heuristics
iPhone 15 Pro: 120 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$199.00

Gemma-2-2B

Google
Gemma License (Commercial OK)
Coding, Logic
Format: 2-bit Asym
Context: 8k
Original RAM: 4.0 GB
Quantized Labs RAM: 500 MB
100% GSM8K Match
Hardware Heuristics
iPhone 15 Pro: 150 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$149.00

Phi-3-Mini

Microsoft
MIT (Commercial OK)
Factual, Education
Format: 2-bit Asym
Context: 4k
Original RAM: 7.6 GB
Quantized Labs RAM: 820 MB
100% MMLU Match
Hardware Heuristics
iPhone 15 Pro: 95 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$249.00

Heavy-Duty Reasoning Engines (8B - 14B)

Desktop-class AI for flagship smartphones. Roughly 2 GB to 4 GB RAM footprint. Perfect for on-device code generation, deep document analysis, and complex agentic workflows.

Llama-3.1-8B

Meta
Llama 3 License (Commercial OK)
Instruction, Chat
Format: 2-bit Asym
Context: 128k
Original RAM: 16 GB
Quantized Labs RAM: 1.92 GB
98.5% MMLU Match
Hardware Heuristics
iPhone 15 Pro: 19 T/s
Mac M3 Max: 152 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$499.00
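
For a rough sense of what the Llama-3.1-8B throughput figures mean in practice, the sketch below converts tokens per second (T/s) into generation time. The 500-token reply length is a hypothetical example, the function name `generation_seconds` is illustrative, and prompt processing time is ignored, so this is only an estimate of decode time at a steady rate.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decode rate (prompt processing ignored)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


REPLY_TOKENS = 500  # hypothetical reply length, not taken from the listing

# Rates are the T/s values shown in the Llama-3.1-8B listing.
for device, rate in [("iPhone 15 Pro", 19), ("Mac M3 Max", 152)]:
    print(f"{device}: {generation_seconds(REPLY_TOKENS, rate):.1f} s")
```

At the listed rates, a 500-token reply would take about 26 seconds on an iPhone 15 Pro and about 3 seconds on a Mac M3 Max.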

Mistral-Nemo-12B

Mistral AI
Apache 2.0 (Commercial OK)
Long-Context, Documents
Format: 2-bit Asym
Context: 128k
Original RAM: 24 GB
Quantized Labs RAM: 2.8 GB
98% HumanEval Match
Hardware Heuristics
iPhone 15 Pro: 12 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$599.00

Qwen-2.5-14B

Alibaba
Qwen License (Commercial OK)
Premium, Reasoning
Format: 2-bit Asym
Context: 32k
Original RAM: 28 GB
Quantized Labs RAM: 3.3 GB
98% MMLU Match
Hardware Heuristics
Mac M3 Max: 85 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$699.00

Vision-Language Models (VLMs)

Real-time visual processing for mobile AR, translation, and accessibility. Both the text and vision cores are compressed to 2-bit.

Qwen2-VL-2B

Alibaba
Qwen License (Commercial OK)
Vision, Real-time
Format: 2-bit Asym
Context: 32k
Original RAM: 4.0 GB
Quantized Labs RAM: 600 MB
97% VQA Match
Hardware Heuristics
iPhone 15 Pro: 80 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$299.00

Llama-3.2-11B-Vision

Meta
Llama 3 License (Commercial OK)
Vision, High-Fidelity
Format: 2-bit Asym
Context: 128k
Original RAM: 22 GB
Quantized Labs RAM: 2.6 GB
97.5% VQA Match
Hardware Heuristics
Mac M3 Max: 90 T/s
Capabilities
Tool Use, JSON Mode, Web Search
$699.00
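
To compare the listings side by side, the following sketch computes the RAM reduction factor for each model from the Original RAM and Quantized Labs RAM values shown above. It assumes 1 GB = 1000 MB; the names `LISTINGS` and `compression_ratio` are illustrative and not part of any Quantized Labs tooling.

```python
# (original_mb, quantized_mb) copied from the listings; 1 GB is taken as 1000 MB.
LISTINGS = {
    "SmolLM-360M": (720, 85),
    "Mamba-370M": (740, 88),
    "Qwen-2.5-0.5B": (1000, 120),
    "Llama-3.2-3B": (6000, 750),
    "Gemma-2-2B": (4000, 500),
    "Phi-3-Mini": (7600, 820),
    "Llama-3.1-8B": (16000, 1920),
    "Mistral-Nemo-12B": (24000, 2800),
    "Qwen-2.5-14B": (28000, 3300),
    "Qwen2-VL-2B": (4000, 600),
    "Llama-3.2-11B-Vision": (22000, 2600),
}


def compression_ratio(original_mb: float, quantized_mb: float) -> float:
    """How many times smaller the quantized footprint is than the original."""
    return original_mb / quantized_mb


ratios = {name: compression_ratio(o, q) for name, (o, q) in LISTINGS.items()}

for name, ratio in ratios.items():
    print(f"{name:22s} {ratio:5.2f}x")
```

With these inputs the reduction factors range from roughly 6.7x (Qwen2-VL-2B) to roughly 9.3x (Phi-3-Mini). If the original figures correspond to 16-bit weights, a 2-bit format would give 8x on the weights alone, so most listings fall close to that baseline.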