Originally published at kunalganglani.com — read it there for inline code, hero image, and live links.
Homelab AI Coding Server: Run OpenCode Across All Devices [2026]
A homelab AI coding server is a self-hosted, persistent AI development environment running on your own hardware — accessible from any device, network-isolated from production services, and gated behind human PR review before any code reaches deployment. It's the setup that lets you use AI agents for daily coding without handing control to a vendor, and it's becoming the default for developers who are sick of paying escalating API costs for cloud-hosted tools.
On June 14, 2026, a post titled "My Homelab AI Dev Platform" by rsgm hit 326 points and 54 comments on Hacker News in under 21 hours. Same day, a parallel Ask HN thread — "Has anyone replaced Claude/GPT with a local model for daily coding?" — hit 1,098 points and 474 comments. Two simultaneous front-page signals pointing at the same hunger. And this all landed the same week Reuters reported that SpaceX was acquiring Anysphere (the company behind Cursor) for $60 billion, cementing the commercial consolidation of AI coding tools.
Developers want persistent, vendor-agnostic AI coding environments they actually own. Here's how to build one.
Why Building a Homelab AI Dev Platform Matters Now
The timing here isn't a coincidence. A few things are happening at once that make self-hosted AI coding infrastructure not just nice-to-have but strategically necessary.
Cost pressure is real and getting worse. rsgm switched from Claude Code specifically because "AI providers have been really squeezing the value out of customers recently through token limits." This isn't an isolated complaint. GaltRanch, an independent developer, documented spending roughly $400 per month across personal and work API usage before building a fully self-hosted alternative. The final straw for GaltRanch came when OpenAI deprecated three model versions in a single quarter, and prompts that had been stable for nine months suddenly produced different outputs.
Vendor consolidation is accelerating. The $60 billion Anysphere acquisition means Cursor, one of the most popular AI coding tools, is now owned by a company whose primary business isn't developer tools. I've watched this play out enough times in my career to know exactly what comes next: pricing changes, feature gates, integration priorities that serve the parent company's roadmap instead of developers'. It happens every single time.
Open-source alternatives have gotten genuinely good. OpenCode, built by Anomaly, has crossed 175,000 GitHub stars, 900+ contributors, 13,000+ commits, and over 7.5 million monthly active developers. It supports 75+ LLM providers, ships with a built-in web server and Web UI, handles git worktrees for parallel coding sessions, and auto-configures LSPs. This isn't a weekend project someone threw on GitHub. It's a production-grade tool that happens to be open source.
I've built and maintained internal developer platforms at scale, and I learned this the hard way: the moment your AI coding tool becomes a work dependency, you need to treat it like infrastructure, not a subscription.
What Is OpenCode and Why Use It for a Persistent Server?
OpenCode is an open-source AI coding agent that runs as a terminal TUI, desktop app, IDE extension, or — and this is what matters for our setup — a web server. It's vendor-agnostic by design, connecting to any of 75+ LLM providers through Models.dev, including local models via Ollama or llama.cpp.
What makes OpenCode a natural fit for a persistent homelab server:
- Built-in Web UI with terminal, file browser, and git diffs — no bolting on a separate frontend
- Git worktree support for running multiple parallel coding sessions at once
- LSP auto-loader that configures language servers for whatever project the agent touches
- MCP server integration for extending the agent with external tools
- Session share links — handy for switching devices or sharing context with teammates
- GitHub Copilot and ChatGPT Plus/Pro login support — use existing subscriptions as providers
- Privacy-first architecture — OpenCode doesn't store your code or context data
The move that makes rsgm's setup click is running OpenCode's web server as a systemd unit on a dedicated VM. Always on. Always accessible from any device on your network. Isolated from everything else. You open a browser on your phone, laptop, or tablet, and your coding sessions are right there. Persistent across devices.
I've shipped enough internal tooling to know that the gap between a tool people actually use and one they abandon is almost always availability. If using the AI coding agent means opening a terminal, SSHing into a box, and starting a process manually, you'll stop using it within a week. Guaranteed. But if it's just a URL you bookmark? It becomes part of your daily workflow without you even noticing.
How to Set Up the VM and Install OpenCode
The original setup uses a TrueNAS host with a simple VM, but this architecture works on anything that can run a Linux VM — Proxmox, ESXi, Hyper-V, even a spare machine running bare-metal Ubuntu. Slimane Bouhadi documented a similar approach using Multipass + k3s on Windows 11 with just 32GB RAM and 8 CPU cores.
Here's the architecture in prose, because the setup is simpler than it sounds:
VM provisioning. Spin up a lightweight Linux VM (Ubuntu Server or Debian are both fine). Give it internet access for pulling packages and talking to LLM APIs, plus access to your Git server. That's it for networking. The VM should NOT have access to your actual services, databases, or production infrastructure. This is the blast radius principle: if the AI does something unexpected, the worst case is a messed-up VM that you can snapshot-restore in seconds.
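The isolation rule is easy to state and easy to get wrong, so it helps to spot-check it from inside the VM. Here's a minimal sketch, assuming you keep a short list of destinations and whether the VM is supposed to reach each one. The host names in the trailing comment are placeholders, not part of rsgm's setup, and a failed TCP connect is a hint rather than proof of isolation.

```python
import socket
from typing import Callable, Iterable, NamedTuple


class Target(NamedTuple):
    host: str
    port: int
    should_reach: bool  # True for the Git server, False for real services


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def audit(targets: Iterable[Target],
          probe: Callable[[str, int], bool] = tcp_reachable) -> list[str]:
    """Return one message per target whose reachability differs from the policy."""
    violations = []
    for t in targets:
        reachable = probe(t.host, t.port)
        if reachable != t.should_reach:
            state = "reachable" if reachable else "unreachable"
            violations.append(f"{t.host}:{t.port} is {state}, policy says otherwise")
    return violations


# Example with hypothetical addresses (run from inside the agent VM):
#   audit([Target("git.lan", 22, True), Target("db.lan", 5432, False)])
```

An empty list means the VM sees what it should and nothing else on your list. The `probe` parameter exists so the policy logic can be exercised without touching the network.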
Install basic dev tooling. Git, Node.js (for OpenCode's npm install path), your language runtimes, and whatever build tools your projects need. The VM is the agent's sandbox. It should have everything the agent needs to build and test, but nothing it doesn't.
Install OpenCode. Fastest path is the curl installer (curl -fsSL https://opencode.ai/install | bash). Or install via npm (npm install -g opencode-ai), Homebrew (brew install anomalyco/tap/opencode), or Docker (ghcr.io/anomalyco/opencode). On Windows, WSL is recommended for full feature compatibility.
Configure your LLM provider. OpenCode connects to any provider via API key. If you want zero marginal cost, point it at a local LLM running on your network. If you want frontier-model quality, configure Claude, GPT, or Gemini API keys. The whole point of vendor-agnostic tooling is you can swap providers without changing how you work.
Set up the systemd unit. Create a service file that launches OpenCode's web server on boot, binds to a specific port, and runs under a dedicated system user. You get a persistent process that survives reboots and can be managed with standard systemctl commands. The web UI becomes accessible at http://your-vm-ip:port from any device on your network.
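For a concrete picture of that service file, here's a small sketch that renders one from a template. Treat the specifics as assumptions: the `opencode` user, the paths, and port 4096 are placeholders, and the `web --hostname --port` command line should be confirmed against `opencode --help` for the version you install.

```python
UNIT_TEMPLATE = """\
[Unit]
Description=OpenCode web server
After=network-online.target
Wants=network-online.target

[Service]
User={user}
WorkingDirectory={workdir}
# Assumed command line; confirm against `opencode --help` for your version.
ExecStart={binary} web --hostname {host} --port {port}
Restart=on-failure

[Install]
WantedBy=multi-user.target
"""


def render_unit(user: str, workdir: str, binary: str, host: str, port: int) -> str:
    """Fill the unit template with values for one VM."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return UNIT_TEMPLATE.format(
        user=user, workdir=workdir, binary=binary, host=host, port=port
    )


if __name__ == "__main__":
    print(render_unit(
        user="opencode",                    # hypothetical dedicated system user
        workdir="/home/opencode/projects",  # hypothetical checkout directory
        binary="/usr/local/bin/opencode",   # adjust to where the installer put it
        host="0.0.0.0",                     # listen on all interfaces of the VM
        port=4096,                          # arbitrary example port
    ))
```

Save the output under /etc/systemd/system/ (for example as opencode-web.service), then run systemctl daemon-reload followed by systemctl enable --now opencode-web.service.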
For those running Kubernetes on their homelab, you could containerize this entire setup. But honestly, for a single-developer homelab, a VM with a systemd unit is the right level of complexity. This is one of those things where the boring answer is actually the right one.
Security: Keeping AI Behind PR Review
This is where the architecture gets interesting, and where most tutorials completely drop the ball. The security model of a homelab AI coding server isn't about preventing the AI from writing bad code — that's what code review is for. It's about limiting the blast radius when (not if) the AI does something you didn't expect.
rsgm's approach is elegant because it's simple:
- Dedicated Git user. OpenCode gets its own user on the Git server with dedicated SSH keys. It can clone projects and push branches. It cannot push directly to the deploy branch.
- PR-gated deployment. All AI-generated changes go through human PR review. OpenCode writes the change, you merge it yourself. As rsgm puts it: "I think it's cute, but more importantly, it keeps unreviewed code from getting deployed."
- Network isolation. The VM has internet access and Git server access, but cannot reach actual services. No database connections, no production APIs, no secrets beyond its own Git credentials.
- Comfortable root access. Because the blast radius is small, you can give OpenCode root on the VM for installing build tools or test dependencies without anxiety. If it breaks the VM, restore from snapshot.
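The post doesn't say how the "cannot push to the deploy branch" rule is enforced, and most Git servers offer branch protection settings that cover it. To make the rule itself concrete, here's a sketch of the same check written as the core of a Git pre-receive hook. The `opencode` user name is hypothetical, and how the hook learns who is pushing depends on your Git server, so it is passed in as a parameter.

```python
import sys
from typing import Iterable

PROTECTED_REFS = {"refs/heads/main"}  # the deploy branch; adjust to your repo
AGENT_USERS = {"opencode"}            # hypothetical name of the agent's Git user


def rejected_refs(pusher: str, updates: Iterable[str]) -> list[str]:
    """Return the protected refs this push would change, if the pusher is the agent.

    Each update line uses the pre-receive format: "<old-sha> <new-sha> <ref>".
    """
    if pusher not in AGENT_USERS:
        return []
    rejected = []
    for line in updates:
        parts = line.split()
        if len(parts) == 3 and parts[2] in PROTECTED_REFS:
            rejected.append(parts[2])
    return rejected


def main(pusher: str) -> int:
    """Exit status for the hook: non-zero makes Git refuse the whole push."""
    blocked = rejected_refs(pusher, sys.stdin)
    for ref in blocked:
        print(f"{pusher} may not push to {ref}; open a PR instead", file=sys.stderr)
    return 1 if blocked else 0
```

The agent can still push any number of feature branches. Only updates to the protected ref are refused, which is what forces the change through a PR.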
This is the right architecture for AI agents in 2026. I've seen the same principle validated in production AI contexts. The pattern: give the agent maximum capability within a minimized blast radius. Don't try to make the AI safe by restricting what it can do — that defeats the purpose. Make the environment safe by restricting what the AI can affect.
Jay Grider raises something important for anyone running local AI models: SHA256 checksums alone aren't enough to verify model artifacts. A maliciously constructed model file can pass a checksum if the checksum was generated against the malicious file. Proper verification means metadata parsing — checking architecture headers and parameter counts against the source release notes — on top of hash checks. This is an AI security concern that most homelab guides skip entirely.
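Here's roughly what "hash check plus metadata check" can look like, as a sketch that assumes the model is a GGUF file (the format llama.cpp loads). It reads only the fixed GGUF header: magic bytes, version, tensor count, and metadata entry count. Checking the architecture string would mean parsing the metadata key-value section as well, which this sketch leaves out. The expected values are things you would copy by hand from the publisher's release notes.

```python
import hashlib
import struct
from typing import NamedTuple

GGUF_MAGIC = b"GGUF"
# Little-endian: 4-byte magic, uint32 version, uint64 tensor count, uint64 KV count.
HEADER = struct.Struct("<4sIQQ")


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so multi-gigabyte models don't need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_gguf_header(path: str) -> GGUFHeader:
    """Parse the fixed-size GGUF header, raising ValueError if it looks wrong."""
    with open(path, "rb") as f:
        raw = f.read(HEADER.size)
    if len(raw) < HEADER.size:
        raise ValueError("file too short to be GGUF")
    magic, version, tensors, kv = HEADER.unpack(raw)
    if magic != GGUF_MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    return GGUFHeader(version, tensors, kv)


def verify(path: str, expected_sha256: str, expected_tensor_count: int) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if sha256_of(path) != expected_sha256.lower():
        problems.append("sha256 mismatch")
    try:
        header = read_gguf_header(path)
    except ValueError as exc:
        problems.append(str(exc))
    else:
        if header.tensor_count != expected_tensor_count:
            problems.append(
                f"tensor count {header.tensor_count}, expected {expected_tensor_count}"
            )
    return problems
```

The two checks catch different failures. The hash tells you the file matches what someone published. The header check tells you that what they published at least has the shape the release notes claim.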
Don't restrict the AI's capabilities. Restrict its blast radius. Give it root on a VM it can't escape, push access to branches it can't deploy, and network access to services it can't reach.
Multi-Device Access: The Part Nobody Talks About
Here's the thing that separates a homelab AI coding server from just running OpenCode on your laptop: persistent sessions accessible from anywhere on your network.
OpenCode's built-in web UI means your coding sessions survive device switches. Start a complex refactoring task on your desktop, check the progress on your phone while making coffee, then merge the PR from your laptop on the couch. The sessions live on the server, not in your browser. Your device is just a viewport.
I didn't expect this to matter as much as it does. After working with AI coding tools for the past year, I've found that my most productive pattern isn't sitting at a desk for hours grinding through tasks. It's kicking off an agent task, checking results asynchronously, and iterating in short bursts throughout the day. A persistent server makes that workflow seamless. Without it, you're constantly re-establishing context every time you switch machines.
For remote access beyond your local network, run a reverse proxy (Caddy or nginx) with TLS and basic auth in front of the OpenCode web UI. If you're already running a VPN like WireGuard or Tailscale on your homelab, even simpler — just access the VM's IP directly over the tunnel. The point is: this should be as frictionless as opening a browser tab.
OpenCode also supports session share links, which opens up a collaboration angle I hadn't considered initially. Working with another developer? Share a link to a specific coding session for debugging or pair review. It's not Google Docs-level real-time collaboration, but it's a massive step up from screen sharing a terminal.
Can You Run This Entirely on Local Models?
Short answer: yes, with tradeoffs. The HN thread from the same day made this pretty clear.
Greenpants, a developer running Qwen3.6 35B (a mixture-of-experts model with only 3B active parameters) on a Mac Studio with 128GB RAM, reported a roughly 5x developer productivity speedup at zero marginal cost. For comparison, Claude Opus delivers approximately 15x. That's a compelling ratio when you factor in the $0 ongoing cost of local LLM inference.
But Greenpants was honest about the limitations: local models require more precise prompting, tend to loop more often, and will take the laziest implementation path unless you steer them. "Comparing agentic Qwen3.6 35B to Claude Opus is like a junior with knowledge across the board, that you really need to guide, versus a senior that thinks with you on architecture."
Another commenter, lambda, runs Pi (a coding harness) in a container talking to llama.cpp in another container on a Strix Halo 128GB unified memory laptop — completely airgapped, no credential access, only the working directory mounted.
The practical recommendation: use OpenCode's provider-agnostic architecture to your advantage. Point it at a local model for routine tasks (container updates, healthcheck additions, boilerplate generation) and switch to a frontier model for complex architectural work. OpenCode supports multiple providers simultaneously, so you can configure this per-session or even per-task.
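To make that routing rule explicit, here's a tiny sketch of the decision itself. It doesn't call OpenCode at all. The provider names, model ids, and keyword list are all made up for illustration, and you would translate the result into whichever provider you select for a session.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    model: str


# Placeholder identifiers; substitute the providers you actually configured.
LOCAL = Provider("local", "local-model")
FRONTIER = Provider("frontier", "frontier-model")

# Task types the article treats as routine.
ROUTINE_KEYWORDS = ("container update", "healthcheck", "boilerplate")


def choose_provider(task: str) -> Provider:
    """Send routine work to the local model and everything else to a frontier model."""
    lowered = task.lower()
    if any(keyword in lowered for keyword in ROUTINE_KEYWORDS):
        return LOCAL
    return FRONTIER


if __name__ == "__main__":
    for task in ("Add healthcheck to the traefik stack",
                 "Redesign the backup architecture"):
        print(f"{task!r} -> {choose_provider(task).name}")
```

Keyword matching is deliberately crude. The point is that the default is the expensive model, and only work you've explicitly labeled as routine goes to the cheap one.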
If you're looking at the hardware side, I've covered the specifics in my local LLM hardware guide and the Apple Silicon vs NVIDIA comparison. The short version: either a Mac Studio with 128GB unified memory or a PC with an RTX 4090 (24GB VRAM) is the sweet spot for a homelab AI coding server in 2026.
The Community Is Already Building This — Independently
What convinced me this is a real pattern and not one person's weekend project is that multiple developers built nearly identical setups without talking to each other.
From the HN comments:
- david-giesberg built a workflow that runs OpenCode inside Forgejo action runners. You invoke it with /oc inside a Forgejo issue, and it comes back with a PR for review.
- t0mas88 built smithy-ai, a separate app that runs Claude Code inside Docker containers integrated with Forgejo.
- chamoda built agent-foundry for GitHub Actions with two components: "daydream" (runs daily, reads a VISION.md file, creates issues for new features and maintenance tasks) and "nightwatch" (runs nightly, picks up the oldest issue, generates a PR). A fully automated idea-to-PR pipeline with human merge gates.
chamoda shared a telling operational detail: they quickly burned through free GitHub Actions hours and had to self-host runners. But self-hosted runners have a security implication that multiple commenters flagged — unlike GitHub-hosted runners, the VM isn't destroyed after each job. If a malicious action runs, it can persist artifacts or credentials on the runner. This is exactly why the VM-isolation-plus-dedicated-Git-user pattern from rsgm's setup matters.
As MisterPea noted in the thread: "Sometimes I feel like a lot of people in tech independently go through the same things right around the same time with few people writing/sharing about it." That's the signal. When multiple experienced developers converge on the same architecture without coordinating, you're looking at an emerging best practice, not a fad.
Git Integration and the PR-Gated Workflow
The git integration deserves its own section because it's the part most people get wrong when they first set up agentic AI coding.
The temptation is to give the AI agent direct commit access to your main branch. You're sitting right there reviewing the output in the web UI, right? Why add the overhead of PRs?
This is a mistake. I've watched teams make it in production AI deployments, and it always ends the same way: one distracted merge, one overlooked change, and suddenly you're debugging AI-generated code in production at 2am on a Tuesday.
The correct workflow, as demonstrated by rsgm:
- Plan out a feature or improvement (rsgm uses a thinking model like o3 or Gemini Flash Thinking for this)
- Open an OpenCode session and describe the task
- OpenCode writes the code, creates a branch, and pushes it
- You review the diff in a PR on your Git server (Gitea, Forgejo, GitHub, GitLab — doesn't matter)
- You merge when satisfied, and GitOps handles deployment
Git worktree support in OpenCode makes this particularly powerful. You can have multiple coding sessions running in parallel, each on its own branch, each working on a different task. One session is updating container versions, another is adding healthchecks, a third is refactoring a config file. All of them push branches independently, all go through PR review.
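OpenCode manages worktrees for you, so you don't need any of this to use the feature. If you want to see what "one branch and one directory per task" means at the Git level, though, here's a sketch that builds the underlying git worktree add commands. The agent/ branch prefix and the directory naming are my own conventions, not OpenCode's.

```python
import re
import subprocess
from pathlib import Path


def branch_name(task: str) -> str:
    """Turn a task description into a branch name like agent/add-healthchecks."""
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    return f"agent/{slug}"


def worktree_command(repo: Path, task: str, base: str = "main") -> list[str]:
    """Build `git worktree add -b <branch> <path> <base>` for one task."""
    branch = branch_name(task)
    # Place each worktree next to the main checkout, e.g. homelab-agent-add-healthchecks.
    path = repo.parent / f"{repo.name}-{branch.replace('/', '-')}"
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]


def create_worktrees(repo: Path, tasks: list[str]) -> None:
    """Create one worktree per task; raises CalledProcessError if git fails."""
    for task in tasks:
        subprocess.run(worktree_command(repo, task), check=True)
```

Each worktree is a full working directory on its own branch, so three sessions can edit, build, and commit at the same time without stepping on each other.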
This maps directly to the GitOps pattern that's already standard for infrastructure management. rsgm uses Arcane for GitOps deployment, but any tool in the space works — ArgoCD, Flux, even a simple webhook that triggers docker compose pull && docker compose up -d on merge to main.
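The "simple webhook" option is less work than it sounds. Here's a sketch of its two halves: deciding whether a push event targets main, and running the two compose commands in order. It assumes a push-event payload with a top-level ref field, which is what GitHub-style and Gitea-style webhooks send. A real receiver would also sit behind an HTTP server and verify the webhook's signature, both omitted here.

```python
import json
import subprocess

DEPLOY_REF = "refs/heads/main"
DEPLOY_COMMANDS = [
    ["docker", "compose", "pull"],
    ["docker", "compose", "up", "-d"],
]


def should_deploy(body: bytes) -> bool:
    """Return True only for a well-formed push event that updates the deploy branch."""
    try:
        payload = json.loads(body)
    except ValueError:  # malformed JSON or undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("ref") == DEPLOY_REF


def deploy(stack_dir: str) -> None:
    """Run the compose commands in order, stopping at the first failure (like &&)."""
    for command in DEPLOY_COMMANDS:
        subprocess.run(command, cwd=stack_dir, check=True)
```

Pushes to agent branches produce events too, but they fail the ref check and nothing is deployed. Only the merge you perform yourself reaches the running stack.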
Here's the key architectural decision: the AI agent is a code contributor, not a deployer. It has the same access level as a junior developer on your team — push branches, open PRs, but no direct path to production. Prompt engineering can only do so much to prevent mistakes. Access control handles the rest.
Practical Use Cases: What This Setup Actually Does Well
After studying rsgm's setup and the various community implementations, here's where a homelab AI coding server genuinely shines and where it's overkill:
Where it excels:
- Container maintenance. Reading release notes, checking for breaking changes, updating versions, adding healthchecks across a dozen docker-compose stacks. rsgm reported going from "a few hours" to "a few minutes" for this task.
- Configuration management. Refactoring docker-compose files, updating environment variables, adding monitoring configs. Tedious-but-important work that benefits most from AI.
- Boilerplate generation. New service configs, CI/CD pipeline definitions, Terraform modules. The agent has full project context via the file browser and can generate consistent, project-aware code.
- Old project revival. chamoda's daydream/nightwatch pattern is brilliant for this — the agent reads your VISION.md, generates ideas for improvements, and creates PRs. It keeps momentum on projects you'd otherwise let rot.
Where it's overkill:
- Greenfield application development. For new projects, you want the tight feedback loop of a local IDE with inline AI coding assistance. The web UI is solid but not as fast as Cursor or VS Code with an extension.
- Complex debugging. The agent is great at writing new code but struggles with the detective work of tracking down subtle bugs across multiple services. That still needs a human at the keyboard with full context.
For my own homelab, the container maintenance use case alone justified the setup. I manage about 15 docker-compose stacks, and keeping them updated used to be the task I'd procrastinate on for weeks. Now it's a vibe coding session that runs while I'm doing something else.
FAQ
Can I run a homelab AI coding server on a Raspberry Pi?
Not practically. The OpenCode web server itself is lightweight, but you need enough RAM and CPU to handle the dev tooling, build processes, and any local model inference. A minimum of 16GB RAM is recommended for the VM. A Raspberry Pi 5 with 8GB could theoretically run the server connected to a cloud API, but the experience will be sluggish.
Do I need a GPU for a homelab AI coding server?
Only if you want to run local models for inference. If you're connecting OpenCode to cloud APIs like Claude, GPT, or Gemini, the server itself needs minimal hardware — just enough to run the web UI and dev tools. The GPU requirement is entirely on the inference side.
Is OpenCode a full replacement for Claude Code or Cursor?
For terminal-based agentic coding, OpenCode is competitive with Claude Code and has the advantage of being vendor-agnostic. It's not a direct replacement for Cursor's IDE experience — they serve different workflows. Many developers use OpenCode for server-side persistent sessions and a traditional IDE for local interactive development.
How do I secure the OpenCode web UI from unauthorized access?
At minimum, bind the web server to your local network only and don't expose it to the internet without authentication. For remote access, use a VPN like WireGuard or Tailscale. If you must expose it publicly, put it behind a reverse proxy with TLS and strong authentication (OAuth, basic auth with a strong password, or client certificates).
Can multiple developers share one homelab AI coding server?
Yes. OpenCode's multi-session and session share link features support this. Each developer can have their own coding sessions running in parallel, each on separate git worktrees. The dedicated Git user model would need to be extended — one Git user per developer, each with their own SSH keys and branch permissions.
What's the advantage over just running OpenCode locally on my laptop?
Persistence and multi-device access. Local sessions end when you close the terminal or switch machines. A server-based setup keeps sessions alive indefinitely, accessible from any device. It also centralizes your LLM API keys and Git credentials on one secured machine rather than spreading them across every device you own.
What Comes Next
Today it's a VM with a systemd unit. Within a year, it'll be ephemeral containers spun up on demand — each with preinstalled tooling, access guardrails, and audit logs. rsgm sees it too: "I could see building this into a production developer platform."
The SpaceX/Anysphere acquisition will accelerate this. As commercial AI coding tools consolidate under companies with their own strategic agendas, the open-source, self-hosted path stops being a philosophical preference and starts being a practical hedge. OpenCode's 7.5 million monthly developers aren't all idealists. A lot of them are pragmatists who've been through enough vendor lock-in cycles to know that owning your toolchain is the only strategy that holds up over time.
My prediction: within 12 months, every serious homelab will have a persistent AI coding server the same way they have a Kubernetes cluster or a NAS. The pattern is too useful, the tools are too mature, and the alternative — paying escalating subscription fees for tools you don't control — is too obviously a bad deal.
If you're managing any kind of homelab infrastructure, set this up this weekend. A VM, a curl command, a systemd unit, and a Git user. Four steps to a coding agent that's always on, always yours, and always behind PR review. The boring architecture is the right architecture.






