VRAM Calculator
Can my GPU run this model?
Pick a model and quantization. See exactly how much VRAM it needs. Check GPU compatibility instantly.
Step 1
Pick a model
Search by name or scroll the list. Enter a custom parameter count at the bottom of the panel.
Step 2
Quantization
Quantization compresses model weights to use less memory. Lower quant = less VRAM but slightly lower quality. Q4_K_M is the most popular choice for models 14B+.
Recommended for models 14B and above. Best balance of size and quality.
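The weight portion of the estimate follows from parameter count and bits per weight. A minimal sketch, assuming approximate effective bits-per-weight figures for common quant formats (the values in the table are illustrative assumptions, not exact specs):

```python
# Rough weight-memory estimate: params * bits_per_weight / 8 bytes.
# Bits-per-weight values are approximate assumptions for common
# llama.cpp-style quant formats (K-quants store scales, so the
# effective rate is a bit above the nominal bit width).
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.8,
}

def weight_vram_gb(params_billion: float, quant: str) -> float:
    """VRAM (GiB) needed just for the model weights."""
    bits = BITS_PER_WEIGHT[quant]
    return params_billion * 1e9 * bits / 8 / 1024**3

# e.g. a 14B model at Q4_K_M needs roughly 7.8 GiB of weights
print(round(weight_vram_gb(14, "Q4_K_M"), 1))
```

This is why Q4_K_M is the usual pick at 14B and up: at FP16 the same model would need about 26 GiB for weights alone, before any KV cache.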
Step 3
Context length
How many tokens the model can see at once. Larger context = more VRAM for KV cache. 32K is the sweet spot for most coding tasks. 8K is too short for repo-level work.
32K tokens (max 1M)
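The KV-cache term grows linearly with context length. A minimal sketch of the standard formula at FP16 (2 bytes per element), matching the footnote on this page; the layer and head counts below are illustrative assumptions, not any specific model's published architecture:

```python
# KV cache at FP16: 2 tensors (K and V) * layers * kv_heads
# * head_dim * context_tokens * 2 bytes per element.
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int) -> float:
    """VRAM (GiB) needed for the KV cache at a given context length."""
    bytes_total = 2 * layers * kv_heads * head_dim * context_tokens * 2
    return bytes_total / 1024**3

# Illustrative: 40 layers, 8 KV heads (GQA), head_dim 128, 32K context
print(round(kv_cache_gb(40, 8, 128, 32_768), 2))  # -> 5.0
```

Doubling the context to 64K doubles this term to 10 GiB, which is why context length matters as much as quantization once you go past 32K.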
Pick a model or enter a parameter count to see the VRAM estimate.
Every model on this page runs in Bodega One.
Pick a model, connect a provider, start coding. No config files. One-time purchase.
Join the Waitlist
Estimates based on published model architectures and quantization specifications. Actual usage may vary by 1-2 GB depending on runtime, driver version, and system configuration. KV cache calculated at FP16. VRAM figures do not include memory used by other applications. Last verified March 2026.