A comprehensive framework for understanding AI implementation maturity. Learn about 5 progressive stages: Discovery, Local Inferencing, Plug-and-Play RAG, Custom RAG & Orchestration, and Complete Local AI Infrastructure.
Outside of work, I’m a proper car enthusiast. It’s addictive. You start with a platform doing something like 320bhp. Next thing you know, you’re chasing close to a thousand.
Each stage teaches you something. You build on what came before. Sometimes you rip out a mod that looked clever at the time and think, what was I thinking? You take a step back to take two steps forward.
AI feels the same to me. Back in 2001, when I started in SEO, everything was new and exciting. Raw potential everywhere. AI right now gives me that same buzz. Fresh, powerful, room to tinker and improve.
So here’s our Bravr journey through five stages of AI adoption, from stock cloud tinkering to a proper self-hosted setup. Before you dive in, you might want to run an AI readiness assessment to see where your business actually stands. Read on. Have a think: where are you at?
Stage 0 – Discovery / Cloud APIs
You sign up for ChatGPT, Claude, Gemini, Perplexity. First time it drafts an email properly or brainstorms decent ideas, you think: bloody hell, this actually helps.
Your workspace? Twenty browser tabs.
At Bravr we jumped in early. ChatGPT at launch, then Grok, Perplexity, Anthropic, Gemini (Google Workspace forced that one). We switch depending on the job:
- Claude: Great for coding
- Grok: Best for creativity
- ChatGPT: Was good, but March ’26 models are lazy and rely on training data
- Gemini: Our least favourite. Too arrogant and makes mistakes frequently
- Opus: Currently king for us, especially hard reasoning tasks
What you can do: draft faster, brainstorm in minutes, ask anything.
But it adds up. Heavy use means bills. You send data to someone else’s servers. No real automation yet.
Still a fun starting point.
Stage 1 – Local Inferencing / Your Machine
Install LM Studio, Ollama, or AnythingLLM. Run a model locally. That “my laptop can do this?” moment hits hard.
Data never leaves your machine. After the hardware cost, inference is free apart from electricity. Test models without API fees.
We used Python scripts to rephrase 250,000 product descriptions locally. Cloud APIs would have cost a fortune. We would have gone bust.
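For a feel of what a batch job like that can look like, here is a minimal sketch. It assumes an Ollama server on its default local port and uses its `/api/generate` endpoint. The model name `llama3`, the prompt wording and the function names are placeholders for illustration, not the scripts we actually ran.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ollama_generate(prompt, model="llama3"):
    """Send one non-streaming request to a local Ollama server (model name is a placeholder)."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"].strip()


def rephrase_all(descriptions, generate=ollama_generate):
    """Yield one rephrased description per input, in order."""
    template = "Rephrase this product description in plain British English:\n\n{text}"
    for text in descriptions:
        yield generate(template.format(text=text))
```

Passing `generate` in as a parameter makes it easy to swap in another local runtime, or a stub while testing, without touching the loop.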
A mate with a Mac was chuffed he could use AI on a flight. No internet needed.
Hardware reality: RTX 4080 handles 13B to 30B quantised comfortably. M-series Macs do similar with unified memory. Older cards struggle above 7B.
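A rough way to sanity-check those figures is to estimate how much memory the weights alone need. This is back-of-envelope arithmetic, not a benchmark: it ignores the KV cache and runtime overhead, and real quantised formats usually spend a little more than their nominal bits per weight.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return params_billions * bits_per_weight / 8


for size in (7, 13, 30):
    print(
        f"{size}B: ~{weight_memory_gb(size, 4):.1f} GB at 4-bit, "
        f"~{weight_memory_gb(size, 16):.1f} GB at 16-bit"
    )
```

By this estimate a 13B model at 4-bit needs about 6.5 GB for weights and a 30B model about 15 GB, which is why quantisation matters so much on consumer cards.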
The GPU shortage back then made buying hardware painful, because we refused to reward scalpers. Still worth it for privacy and control.
Stage 2 – Plug-and-Play RAG / Chat Over Your Docs
Upload your docs to AnythingLLM or OpenWebUI with built-in RAG. It chunks, embeds, and lets you chat over them. No deep config. It just works.
Ask natural questions over PDFs, notes, wikis, product catalogues. Get answers with citations. Hallucinations drop.
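Under the bonnet, the chunk, embed and retrieve loop looks roughly like the toy below. It is a sketch under heavy assumptions: word counts stand in for a real neural embedding model, and the file names and sentences are made up for illustration. It is not how AnythingLLM or OpenWebUI implement it.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. Real tools use a neural embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, top_k=2):
    """chunks is a list of (source, text). Returns top_k (score, source, text), best first."""
    query = embed(question)
    scored = [(cosine(query, embed(text)), source, text) for source, text in chunks]
    return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]


# Hypothetical documents, for illustration only.
chunks = [
    ("pricing.pdf", "The pricing changes raise the Pro plan by ten percent."),
    ("holiday.md", "Office closes for the winter holiday in late December."),
]
best = retrieve("What are the latest pricing changes?", chunks, top_k=1)[0]
print(best[1])  # the source name doubles as the citation
```

Keeping the source alongside each chunk is what makes citations possible: the model answers from the retrieved text and the tool shows where it came from.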
We started here for quick internal wins. Upload support docs or meeting notes. Someone asks, “Summarise the latest pricing changes.” Answer comes back fast and accurate. Team picked it up because setup took minutes.
First proper privacy plus speed win over cloud.
Downsides? Retrieval feels a bit black-box. Why did it pick that chunk? Messy tables or complex layouts trip it up sometimes. Less control over how chunks are made or searched.
Stage 3 – Custom RAG & Orchestration / Real Control
Now you run your own Qdrant instance. You learn embeddings, collections, points, similarity search. You experiment with chunking: recursive, semantic, respecting headings and tables, overlap tweaks.
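To make the chunking knobs concrete, here is a small sketch of heading-aware splitting with word overlap. It assumes Markdown-style `#` headings and leaves out tables and semantic chunking entirely. The function names and default sizes are ours for illustration, not settings from any particular tool.

```python
def split_on_headings(markdown):
    """Split text into (heading, body) sections at lines starting with '#'."""
    sections, heading, body = [], "", []
    for line in markdown.splitlines():
        if line.startswith("#"):
            if body or heading:
                sections.append((heading, " ".join(body)))
            heading, body = line.lstrip("# ").strip(), []
        elif line.strip():
            body.append(line.strip())
    if body or heading:
        sections.append((heading, " ".join(body)))
    return sections


def chunk_words(text, size=50, overlap=10):
    """Fixed-size word windows; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def chunk_document(markdown, size=50, overlap=10):
    """Chunk each section separately so no chunk straddles two headings."""
    return [
        {"heading": heading, "text": piece}
        for heading, body in split_on_headings(markdown)
        for piece in chunk_words(body, size, overlap)
    ]
```

Storing the heading with each chunk gives the retriever extra context, and the overlap stops a sentence that falls on a boundary from being lost to both neighbours.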
Tools like RAGFlow step in. Deep document parsing, hybrid vector plus keyword search, reranking.
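One common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below shows the idea only. We are not claiming this is what RAGFlow does internally, and the document ids are invented. Reranking with a dedicated model would be a further step on top.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of ids; k=60 is the constant commonly used with RRF."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from two retrievers for the same query.
vector_hits = ["clause_eu", "faq", "clause_us"]
keyword_hits = ["clause_us", "clause_eu", "blog"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that both retrievers agree on float to the top, even if neither ranked them first.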
Add orchestration: OpenRouter to route prompts smartly, ComfyUI for local image and video workflows, LoRAs trained on rented GPUs, basic agent flows in OpenWebUI.
What you get: much better answers on tricky queries. “Compare warranty terms EU vs US” pulls exact clauses, not guesses.
At Bravr this is where we spend most time now. Self-hosted Qdrant plus semantic chunking experiments gave big quality jumps on our wikis and client docs. RAGFlow handles messy PDFs with tables properly. Precise extracts, saves hours.
ComfyUI plus local prompts means generating branded images and videos without API costs or sending data anywhere. Route complex bits via OpenRouter if needed. Real, measurable value here.
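The local-first, cloud-if-needed decision can be as simple as a rule like the one sketched here. The rule itself, the word threshold and the model names are hypothetical placeholders. The two URLs are Ollama's OpenAI-compatible endpoint on its default port and OpenRouter's chat completions endpoint.

```python
from dataclasses import dataclass

LOCAL_URL = "http://localhost:11434/v1/chat/completions"     # Ollama, OpenAI-compatible
CLOUD_URL = "https://openrouter.ai/api/v1/chat/completions"  # OpenRouter


@dataclass
class Route:
    url: str
    model: str


def pick_route(prompt, contains_client_data, max_local_words=400):
    """Hypothetical rule: private or short prompts stay local, the rest go to the cloud."""
    if contains_client_data or len(prompt.split()) <= max_local_words:
        return Route(LOCAL_URL, "local-model-name")        # placeholder model id
    return Route(CLOUD_URL, "provider/large-model-name")   # placeholder model id
```

The ordering of the checks is the design choice that matters: the privacy flag is tested first, so anything containing client data never leaves the building, however long the prompt.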
Trade-off: debugging why a query misses takes trial and error. Steeper learning curve. You maintain the DB and update embeddings.
Stage 4 – Local AI Stack / Complete Infrastructure
Ubuntu server. Multiple GPUs. Big and small models running together. Local embeddings, reranking.
Tailscale for secure remote access. Docker running RAGFlow, OpenWebUI, n8n, Postgres, Qdrant, and the rest. Custom dashboard to switch modes (team use, coding focus, etc.). Quantised models, KV cache tweaks. Multi-agent setups.
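Why bother with KV cache tweaks? Because the cache grows with context length and eats memory alongside the weights. A rough estimate, using the standard formula for a transformer with grouped-query attention, is sketched below. The model dimensions are illustrative values, not a measurement from our stack.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_element):
    """Keys plus values (hence the factor 2) for one sequence at full context."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_element


# Illustrative dimensions: 32 layers, 8 KV heads, head size 128, 8192-token context.
fp16_cache = kv_cache_bytes(32, 8, 128, 8192, 2)  # 16-bit cache
int8_cache = kv_cache_bytes(32, 8, 128, 8192, 1)  # 8-bit cache
print(fp16_cache / 2**30, int8_cache / 2**30)     # sizes in GiB
```

With these numbers the 16-bit cache comes to 1 GiB per sequence and the 8-bit cache to half that, and the total multiplies by the number of concurrent users.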
What you can do: complete self-hosted AI that scales for a small team. Integrate with CRM or project tools. Keep data sovereign. Cut running costs: you pay for electricity rather than constant cloud API calls.
This is where Bravr sits today. We deploy client-facing tools on this stack when privacy really matters. The savings and control make it worthwhile. We optimise it daily.
Downsides? You need DevOps chops, or the patience to learn. Hardware gets serious for multi-user. Maintenance never fully goes away (updates, monitoring, backups). Not quite “set and forget”.
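Monitoring does not have to start fancy. A minimal sketch of a port check is below. It assumes the services listen on their usual default ports (Qdrant 6333, Postgres 5432, n8n 5678) on the same host, which may not match your Docker port mappings. It only tells you a port accepts connections, not that the service behind it is healthy.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Default ports; adjust to your own mappings.
SERVICES = {"qdrant": 6333, "postgres": 5432, "n8n": 5678}


def check_services(host="127.0.0.1", services=SERVICES):
    """Map each service name to whether its port is reachable."""
    return {name: port_open(host, port) for name, port in services.items()}
```

Run `check_services()` from cron or an n8n workflow and alert on any `False`, then graduate to proper health endpoints when you have the time.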
Still, it’s our garage now.
Most people bounce between stages. Needs change. Nothing wrong with that.
Where Are You At?
If you’re unsure how to move between these stages, a structured AI implementation roadmap can help you identify the gaps and prioritise the right hardware and software.
The point isn’t racing to stage 4. It’s having the right tools for the job you’re actually doing.
We’re comfortable at stage 4 right now, scaling steadily. That’s honest progress for us.
What stage are you on? Drop us a line at hello@bravr.ai or message via our contact page. We read them all. Love hearing how people are building with AI in the real world.
About the Author
Shah founded Bravr in 2009 after 8 years at Cheapflights. He’s spent over 15 years in search technology, currently running self-hosted LLM stacks across multiple GPU clusters. Personal interest: rebuilding BMW E46s and tuning ECUs on the dyno.