I Self-Hosted an AI Assistant: Lessons from 48 Hours of Debugging

I wanted a local AI assistant. Expected: 2 hours. Reality: 2 days of edge cases, broken dependencies, and discovering that "local" doesn't mean "free."

The Stack

OpenClaw (open-source AI assistant framework)
VPS with limited console access (had to file tickets to enable)
OpenRouter for model access
Local Qwen as fallback

What Broke

1. Dependency Hell

Pre-installed OpenClaw came with an outdated library. Updated manually. Then updated again. OpenRouter integration only worked after the second update.

2. Certificate Issues

Self-hosted means self-managed certificates. Let's Encrypt, reverse proxy, CORS headers. Each layer adds a new failure mode.

3. "Free" API Credits Aren't

OpenRouter's "free" models have limits. Hit them within hours. The API key died silently — no error message, just empty responses.

4. Local Model Reality Check

Qwen promised tool-use support. Reality:

Absolute paths broke tool calling (relative only)
Model experienced "amnesia" — couldn't open .md files it created
Larger models need more RAM but run slower
200K context window sounds great until you hit memory limits

5. The Debugging Cascade

Fix one thing → break another. Add skills for email and search. DuckDuckGo API rate-limits kill the search skill. Switch to alternative. New limits.

What Worked

Despite everything, the assistant is now running. Key insight:

Boxed solutions (Kimi, GLM native APIs) are more reliable. But self-hosting teaches you how the pieces actually connect — tool calling, memory management, model routing, context windows.

The Real Cost

Item	Expected	Actual
Setup time	2 hours	2 days
API costs	$0	$20+ before limits
Compute	Minimal	16GB+ RAM for usable local models
Maintenance	Zero	Ongoing dependency updates

Should You Self-Host?

Yes if:

You want to understand LLM infrastructure deeply
Data privacy is non-negotiable
You enjoy debugging more than using

No if:

You need reliability today
Your time has a cost
You're not ready to file support tickets for console access

What's Next

I'm keeping the local setup as a learning environment but routing production tasks to managed APIs. The hybrid approach: local for experimentation, cloud for reliability.

More self-hosting experiments and production AI infrastructure notes — follow my Telegram channel:

https://t.me/ai_tablet (Russian, technical)

More AI engineering notes, RAG benchmarks, and production insights from inside a bank — follow my Telegram channel:

🚀 https://t.me/ai_tablet (Russian, technical)