Updated 2025-02-23
Ollama
Run LLMs locally. Free and open source, with no API bills; the cost is hardware and power.
Entry: Free (hardware cost)
Data: Full control
Best for: Data sovereignty, high volume
Use cases
Drafting and Q&A (example below)
Code assistance
Document analysis
Private/internal use cases
Offline or air-gapped
Cost-predictable high volume
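For example, the drafting and Q&A use case above can run entirely against the local Ollama server, which exposes an HTTP API on localhost:11434 by default. The sketch below is a minimal illustration, assuming Ollama is running and a model such as llama3.2 has already been pulled; the model name and prompt are placeholders, not recommendations.

```python
import json
import urllib.request

# Ollama's local HTTP API; listens on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask_local_model(prompt: str, model: str = "llama3.2") -> str:
    """Send a single prompt to a locally running Ollama model and return its reply."""
    payload = json.dumps({
        "model": model,     # assumes this model has been pulled beforehand
        "prompt": prompt,
        "stream": False,    # ask for one JSON response instead of a token stream
    }).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

if __name__ == "__main__":
    print(ask_local_model("Draft a two-sentence summary of our leave policy."))
```

No request in this flow leaves the machine, which is what makes the private and air-gapped use cases above viable.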
Pricing model
Free software. No per-token costs. You pay for: GPU/hardware, electricity, and maintenance.
Software (free): Open source, MIT license
Hardware (one-off): GPU required; 8GB+ VRAM for smaller models, 24GB+ for 14B models, 40GB+ for 70B models
Power (ongoing): See our Self-hosted GPU Comparisons for £/1M token estimates
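Since power is the only ongoing line item above, a rough £/1M token figure is easy to estimate yourself. The sketch below uses three assumed inputs (GPU draw under load, sustained generation throughput, and your electricity tariff); the example numbers are illustrative placeholders rather than benchmarks, so substitute figures from the GPU comparisons.

```python
def power_cost_per_million_tokens(gpu_watts: float,
                                  tokens_per_second: float,
                                  price_per_kwh: float) -> float:
    """Estimated electricity cost (£) to generate one million tokens locally."""
    hours = 1_000_000 / tokens_per_second / 3600     # wall-clock hours of generation
    kwh = (gpu_watts / 1000) * hours                 # energy consumed in that time
    return kwh * price_per_kwh

# Illustrative assumptions only: a 300 W card, 30 tokens/s, £0.30/kWh tariff.
print(f"~£{power_cost_per_million_tokens(300, 30, 0.30):.2f} per 1M tokens")
```

With those placeholder numbers the electricity works out to well under £1 per million tokens, which is why the dominant cost is the one-off hardware purchase.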
Business fit
Pros
- No data leaves your premises
- Predictable cost (hardware + power)
- No per-token bills
- Wide model support (Llama, Mistral, Qwen, etc.)
- Simple local setup
Cons
- Upfront hardware cost
- You manage updates and security
- Model size, and therefore quality, is limited by your hardware
- No managed SLA
When it makes sense: Good for data-sensitive workloads, high-volume usage where cloud API costs would exceed the hardware investment, or air-gapped environments. Compare with our GPU cost guide.
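To put "high volume where cloud costs would exceed hardware" into numbers, a back-of-the-envelope break-even sketch is shown below. All three inputs are assumptions for illustration; replace them with your actual cloud pricing and figures from the GPU cost guide.

```python
def break_even_million_tokens(hardware_cost: float,
                              cloud_cost_per_million: float,
                              local_cost_per_million: float) -> float:
    """Millions of tokens at which the one-off hardware spend pays for itself."""
    saving = cloud_cost_per_million - local_cost_per_million   # saving per 1M tokens
    if saving <= 0:
        raise ValueError("Cloud is already cheaper per token; there is no break-even point.")
    return hardware_cost / saving

# Illustrative assumptions only: £2,000 of hardware, £5/1M tokens in the cloud, £1/1M locally.
print(f"Break-even after roughly {break_even_million_tokens(2000, 5.0, 1.0):.0f}M tokens")
```

Below that volume a pay-per-token API is typically cheaper; above it the hardware route wins, before accounting for maintenance effort.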
Data handling
All data stays on your hardware. No data sent to third parties. Full control.
GDPR / compliance: Data never leaves your infrastructure.
Data sovereignty: Complete; everything stays on infrastructure you control.