I Self-Hosted an AI Assistant: Lessons from 48 Hours of Debugging
I wanted a local AI assistant. Expected: 2 hours. Reality: 2 days of edge cases, broken dependencies, and the discovery that "local" doesn't mean "free."
The Stack
- OpenClaw (open-source AI assistant framework)
- VPS with limited console access (I had to file support tickets to enable it)
- OpenRouter for model access
- Local Qwen as fallback
What Broke
1. Dependency Hell
The pre-installed OpenClaw shipped with an outdated library. I updated it manually, then had to update it again: the OpenRouter integration only worked after the second update.
2. Certificate Issues
Self-hosted means self-managed certificates. Let's Encrypt, reverse proxy, CORS headers. Each layer adds a new failure mode.
3. "Free" API Credits Aren't
OpenRouter's "free" models have limits. I hit them within hours. The API key then failed silently: no error message, just empty responses.
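One way to make this kind of failure loud is to validate every response before using it. A minimal sketch, assuming an OpenAI-style chat completion payload (`choices[0].message.content`); `extract_reply` and `EmptyCompletionError` are hypothetical names for illustration, not part of OpenClaw or OpenRouter, and the `error` key check is an assumption about how a failure might be reported in the body.

```python
class EmptyCompletionError(RuntimeError):
    """Raised when a chat completion carries no usable text."""


def extract_reply(payload: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion payload.

    Raises EmptyCompletionError instead of returning an empty string, so an
    exhausted key or quota shows up as an error rather than a blank answer.
    """
    # Assumption: a failure may be reported as an "error" key in the body.
    if "error" in payload:
        raise EmptyCompletionError(f"API reported an error: {payload['error']}")

    choices = payload.get("choices") or []
    if not choices:
        raise EmptyCompletionError("response contained no choices")

    message = choices[0].get("message") or {}
    content = (message.get("content") or "").strip()
    # A tool call can legitimately have empty content, so only fail
    # when both the text and the tool calls are missing.
    if not content and not message.get("tool_calls"):
        raise EmptyCompletionError("response contained an empty message")
    return content
```

Calling this on every response turns "just empty responses" into an exception that can be logged or used to trigger a fallback model.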
4. Local Model Reality Check
Qwen promised tool-use support. Reality:
- Absolute paths broke tool calling; only relative paths worked
- The model had "amnesia": it couldn't open .md files it had just created
- Larger models need more RAM and run slower
- 200K context window sounds great until you hit memory limits
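For the path problem, one workaround is to normalize every path to a workspace-relative form before it reaches the tool layer. This is a sketch under assumptions: the workspace directory and the helper name `to_workspace_relative` are hypothetical, and OpenClaw's actual tool interface may differ.

```python
from pathlib import Path


def to_workspace_relative(path: str, workspace: str) -> str:
    """Convert a path to a POSIX-style path relative to the workspace root.

    Raises ValueError if the path points outside the workspace.
    """
    root = Path(workspace).resolve()
    candidate = Path(path)
    if not candidate.is_absolute():
        # Interpret relative input as relative to the workspace root.
        candidate = root / candidate
    # resolve() collapses ".." segments; relative_to() raises ValueError
    # when the result is not located under the workspace root.
    return candidate.resolve().relative_to(root).as_posix()
```

The `ValueError` on paths outside the workspace also works as a cheap guard against a tool call wandering into the rest of the filesystem.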
5. The Debugging Cascade
Fix one thing → break another. I added skills for email and search. DuckDuckGo API rate limits killed the search skill. I switched to an alternative and hit new limits.
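A common way to soften rate limits is to retry with exponential backoff and then fall through to the next provider. The sketch below assumes each search backend is a plain callable that raises a `RateLimited` exception when throttled; that exception and `search_with_fallback` are hypothetical names, not a real DuckDuckGo or OpenClaw API.

```python
import time
from typing import Callable, Sequence


class RateLimited(Exception):
    """Hypothetical signal that a search backend refused the request."""


def search_with_fallback(
    query: str,
    backends: Sequence[Callable[[str], list]],
    retries: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Try each backend in order, backing off exponentially on rate limits."""
    for backend in backends:
        for attempt in range(retries + 1):
            try:
                return backend(query)
            except RateLimited:
                if attempt < retries:
                    # Wait 1x, 2x, 4x ... the base delay before retrying.
                    sleep(base_delay * 2 ** attempt)
        # Retries exhausted for this backend; move on to the next one.
    raise RateLimited("all search backends are rate-limited")
```

The `sleep` parameter is injectable so the backoff schedule can be tested without actually waiting.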
What Worked
Despite everything, the assistant is now running. Key insight:
Boxed solutions (Kimi, GLM native APIs) are more reliable. But self-hosting teaches you how the pieces actually connect — tool calling, memory management, model routing, context windows.
The Real Cost
| Item | Expected | Actual |
|---|---|---|
| Setup time | 2 hours | 2 days |
| API costs | $0 | $20+ before limits |
| Compute | Minimal | 16GB+ RAM for usable local models |
| Maintenance | Zero | Ongoing dependency updates |
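The RAM row is easier to reason about with rough arithmetic. The sketch below estimates memory for model weights and for the key/value cache of a standard transformer. The model dimensions in the usage note are hypothetical placeholders, not the specs of any particular Qwen release, and the estimate ignores activations and runtime overhead.

```python
def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights, in GiB."""
    return n_params * bytes_per_param / 2**30


def kv_cache_gib(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    n_tokens: int,
    bytes_per_value: int = 2,  # 2 bytes per value for fp16
) -> float:
    """Approximate key/value cache size for n_tokens of context, in GiB."""
    # The leading 2 accounts for one key and one value tensor per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * n_tokens / 2**30
```

For a hypothetical model with 32 layers, 8 KV heads and a head dimension of 128, `kv_cache_gib(32, 8, 128, 200_000)` comes to roughly 24 GiB for the cache alone, which is why the 200K context window mentioned earlier is out of reach on a 16GB machine even when the quantized weights fit.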
Should You Self-Host?
Yes if:
- You want to understand LLM infrastructure deeply
- Data privacy is non-negotiable
- You enjoy debugging more than using
No if:
- You need reliability today
- Your time has a cost
- You're not ready to file support tickets for console access
What's Next
I'm keeping the local setup as a learning environment but routing production tasks to managed APIs. The hybrid approach: local for experimentation, cloud for reliability.
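The hybrid split can be expressed as a small router. This is a sketch under assumptions: `cloud` and `local` stand for any callables that take a prompt and return text, and the `production` flag and the fall back to the local model on cloud failure are illustrative choices, not a description of my actual setup.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Router:
    cloud: Callable[[str], str]  # managed API client (hypothetical)
    local: Callable[[str], str]  # local model client (hypothetical)

    def complete(self, prompt: str, production: bool) -> str:
        """Send production prompts to the cloud, everything else to local."""
        if not production:
            return self.local(prompt)
        try:
            return self.cloud(prompt)
        except Exception:
            # Assumption: a degraded local answer beats no answer at all.
            return self.local(prompt)
```

Catching every exception is deliberately blunt. A real router would probably distinguish quota errors from network errors and log which path served each request.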
More self-hosting experiments and production AI infrastructure notes — follow my Telegram channel:
https://t.me/ai_tablet (Russian, technical)