Updated 2025-02-23
Ollama
Run LLMs locally. Free and open source, with no API bills; the cost is hardware and power.
Entry: Free (hardware cost)
Data: Full control
Best for: Data sovereignty, high volume
Use cases
Drafting and Q&A (example below)
Code assistance
Document analysis
Private/internal use cases
Offline or air-gapped
Cost-predictable high volume
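For example, the drafting and Q&A use case above can run entirely against the local Ollama server, which exposes an HTTP API on localhost:11434 by default. The sketch below is a minimal illustration, assuming Ollama is running and a model such as llama3.2 has already been pulled; the model name and prompt are placeholders, not recommendations.

```python
import json
import urllib.request

# Ollama's local HTTP API; listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_local_model(prompt: str, model: str = "llama3.2") -> str:
    """Send a single prompt to a locally running Ollama model and return its reply."""
    payload = json.dumps({
        "model": model,     # assumes this model has been pulled beforehand
        "prompt": prompt,
        "stream": False,    # ask for one JSON response instead of a token stream
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Draft a two-sentence summary of our leave policy."))
```

No request in this flow leaves the machine, which is what makes the private and air-gapped use cases above viable.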
Pricing model
Free software. No per-token costs. You pay for: GPU/hardware, electricity, and maintenance.
Software (free): Open source, MIT license
Hardware (one-off): GPU required; 8GB+ VRAM for smaller models, 24GB+ for 14B models, 40GB+ for 70B models
Power (ongoing): See our Self-hosted GPU Comparisons for £/1M token estimates
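Since power is the only ongoing line item above, a rough £/1M token figure is easy to estimate yourself. The sketch below uses three assumed inputs (GPU draw under load, sustained generation throughput, and your electricity tariff); the example numbers are illustrative placeholders rather than benchmarks, so substitute figures from the GPU comparisons.

```python
def power_cost_per_million_tokens(gpu_watts: float,
                                  tokens_per_second: float,
                                  price_per_kwh: float) -> float:
    """Estimated electricity cost (£) to generate one million tokens locally."""
    hours = 1_000_000 / tokens_per_second / 3600     # wall-clock hours of generation
    kwh = (gpu_watts / 1000) * hours                 # energy consumed in that time
    return kwh * price_per_kwh

# Illustrative assumptions only: a 300 W card, 30 tokens/s, £0.30/kWh tariff.
print(f"~£{power_cost_per_million_tokens(300, 30, 0.30):.2f} per 1M tokens")
```

With those placeholder numbers the electricity works out to well under £1 per million tokens, which is why the dominant cost is the one-off hardware purchase.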
Business fit
Pros
- No data leaves your premises
- Predictable cost (hardware + power)
- No per-token bills
- Wide model support (Llama, Mistral, Qwen, etc.)
- Simple local setup
Cons
- Upfront hardware cost
- You manage updates and security
- Model size, and therefore quality, is limited by your hardware
- No managed SLA
When it makes sense: Good for data-sensitive workloads, high-volume usage where cloud API costs would exceed the hardware investment, or air-gapped environments. Compare with our GPU cost guide.
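To put "high volume where cloud costs would exceed hardware" into numbers, a back-of-the-envelope break-even sketch is shown below. All three inputs are assumptions for illustration; replace them with your actual cloud pricing and figures from the GPU cost guide.

```python
def break_even_million_tokens(hardware_cost: float,
                              cloud_cost_per_million: float,
                              local_cost_per_million: float) -> float:
    """Millions of tokens at which the one-off hardware spend pays for itself."""
    saving = cloud_cost_per_million - local_cost_per_million   # saving per 1M tokens
    if saving <= 0:
        raise ValueError("Cloud is already cheaper per token; there is no break-even point.")
    return hardware_cost / saving

# Illustrative assumptions only: £2,000 of hardware, £5/1M tokens in the cloud, £1/1M locally.
print(f"Break-even after roughly {break_even_million_tokens(2000, 5.0, 1.0):.0f}M tokens")
```

Below that volume a pay-per-token API is typically cheaper; above it the hardware route wins, before accounting for maintenance effort.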
Data handling
All data stays on your hardware. No data sent to third parties. Full control.
GDPR / compliance: Data never leaves your infrastructure.
Data sovereignty: Complete; everything stays on infrastructure you control.