I'm building a read-only context engine for Kubernetes and AI agents

Kubernetes gives us an incredibly powerful API.

It also gives us a familiar debugging ritual:

kubectl get pods
kubectl describe pod ...
kubectl get svc ...
kubectl get endpointslices ...
kubectl get deployment ...
kubectl get events ...
kubectl get application.argoproj.io ...

Then we mentally stitch the result together.

Which workload owns this Pod? Which Service routes to it? Are there ready endpoints? Is the namespace unhealthy because of one bad Deployment, a missing backend, warning Events, or something else?

Which facts should I paste into an incident, attach to a CI failure, or give to an AI assistant before asking it to reason?

I started building kctx because I wanted a small tool for that missing middle layer: not raw YAML, not a dashboard, not an auto-remediation system.

Just structured Kubernetes context.

The short version

kctx is a read-only Kubernetes context engine for humans, scripts, and AI agents.

It turns live Kubernetes API state into a compact model of:

entities: Pods, Services, workloads, Nodes, PVCs, ConfigMaps, Secrets, and supported CRDs
relations: ownership, selection, scheduling, service routing, and dependencies
signals: factual observations such as unhealthy Pods, missing endpoints, warning Events, failed readiness, or degraded workloads
graphs: dependency and ownership views around supported resources
dumps: deterministic namespace snapshots for automation, incident review, and agent grounding
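
To make the shape of that model concrete, here is a minimal sketch in Python. It is not kctx's actual schema: the class names, field names, and relation types below are assumptions chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    kind: str
    namespace: str
    name: str


@dataclass(frozen=True)
class Relation:
    source: Entity
    type: str  # hypothetical labels, e.g. "owns", "selects", "scheduled-on"
    target: Entity


@dataclass(frozen=True)
class Signal:
    subject: Entity
    reason: str   # a factual observation, not a diagnosis
    message: str


@dataclass
class Context:
    entities: list[Entity] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)
    signals: list[Signal] = field(default_factory=list)

    def related_to(self, entity):
        """Return every relation that touches the entity on either side."""
        return [r for r in self.relations if entity in (r.source, r.target)]


# Illustrative data: a ReplicaSet that owns one Pod with a failing readiness check.
rs = Entity("ReplicaSet", "payments", "payments-api-5d9c")
pod = Entity("Pod", "payments", "api-xyz")
ctx = Context(
    entities=[rs, pod],
    relations=[Relation(rs, "owns", pod)],
    signals=[Signal(pod, "NotReady", "readiness probe failed")],
)
```

The point of the sketch is the separation: entities and relations describe what exists and how it is connected, while signals only record observations about a subject.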

The design goal is intentionally conservative:

read cluster state, normalize facts, avoid speculative root-cause claims.

That boundary matters.

I do not want kctx to be a tool that confidently invents explanations. I want it to provide the evidence layer that humans, automation, and AI agents can use before reasoning.

Why another Kubernetes tool?

Most Kubernetes tools are optimized for one of a few jobs:

kubectl exposes the raw API very well
dashboards make current state visible
monitoring systems track metrics and alerts over time
logging systems answer "what happened in the process?"
GitOps tools understand desired state and sync status

Those are all useful. I use them too.

But during debugging, there is still a recurring gap between "I can query every object" and "I have a compact operational picture of what is connected to what."

For example, when looking at a Service, I often care less about the complete YAML and more about questions like:

Which EndpointSlices back this Service?
Which endpoints are ready?
Which Pods do those endpoints point to?
Who owns those Pods?
Which Nodes are involved?
Are there obvious factual signals, such as missing endpoints or no ready Pods?
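
Those questions amount to a walk from a Service through its EndpointSlices to Pods, owners, and Nodes. The sketch below shows that walk over plain dictionaries that mirror a few real Kubernetes fields (the `kubernetes.io/service-name` label, `endpoints[].conditions.ready`, `targetRef`, `ownerReferences`). The function name, the output shape, and the signal names are assumptions, not kctx's implementation.

```python
def trace_service(service_name, endpoint_slices, pods):
    """Walk Service -> EndpointSlices -> endpoints -> Pods -> owners and Nodes."""
    backends = []
    for es in endpoint_slices:
        labels = es.get("metadata", {}).get("labels", {})
        if labels.get("kubernetes.io/service-name") != service_name:
            continue
        for ep in es.get("endpoints", []):
            ref = ep.get("targetRef") or {}
            if ref.get("kind") != "Pod":
                continue
            pod = pods.get(ref["name"], {})
            owners = [
                f'{o["kind"]}/{o["name"]}'
                for o in pod.get("metadata", {}).get("ownerReferences", [])
            ]
            ready = (ep.get("conditions") or {}).get("ready")
            backends.append({
                "pod": ref["name"],
                # The EndpointSlice API treats an unset "ready" as ready.
                "ready": True if ready is None else bool(ready),
                "owners": owners,
                "node": ep.get("nodeName"),
            })

    # Signals stay factual: they state what was observed, not why.
    signals = []
    if not backends:
        signals.append("NoEndpoints")
    elif not any(b["ready"] for b in backends):
        signals.append("NoReadyEndpoints")
    return {"service": service_name, "backends": backends, "signals": signals}


# Illustrative input: one EndpointSlice with a single not-ready endpoint.
slices = [{
    "metadata": {"labels": {"kubernetes.io/service-name": "payments-api"}},
    "endpoints": [{
        "conditions": {"ready": False},
        "nodeName": "node-a",
        "targetRef": {"kind": "Pod", "name": "api-xyz"},
    }],
}]
pods = {
    "api-xyz": {
        "metadata": {
            "ownerReferences": [{"kind": "ReplicaSet", "name": "payments-api-5d9c"}]
        }
    }
}
result = trace_service("payments-api", slices, pods)
```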

That is the kind of context kctx tries to assemble.

kctx trace service payments-api --namespace payments

For a namespace-level view:

kctx health namespace payments

For a focused resource view:

kctx explain pod api-xyz --namespace payments

And when you need a deterministic JSON snapshot for automation or incident review:

kctx dump namespace payments > payments-dump.json
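
One reason determinism matters: if the same cluster state always serializes to the same bytes, two snapshots can be compared with a hash or a plain diff. The sketch below illustrates that idea on arbitrary JSON-like data. It is an assumption about how a consumer might use a dump, not a description of how kctx orders its output, and it only normalizes object keys, so it relies on the producer emitting lists in a stable order.

```python
import hashlib
import json


def canonical_bytes(snapshot):
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    return json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")


def fingerprint(snapshot):
    """Return a SHA-256 hex digest of the canonical form."""
    return hashlib.sha256(canonical_bytes(snapshot)).hexdigest()


# Same content, different key order: the fingerprints match.
before = {"kind": "NamespaceDump", "namespace": "payments", "signals": []}
after = {"signals": [], "namespace": "payments", "kind": "NamespaceDump"}
changed = {"kind": "NamespaceDump", "namespace": "payments", "signals": ["NoEndpoints"]}
```

Here `NamespaceDump` is a placeholder kind used only for the example.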

The important constraint: read-only

kctx does not mutate Kubernetes resources.

It does not:

restart workloads
apply manifests
patch resources
delete anything
perform remediation
claim to know the root cause

That is not because remediation is uninteresting. It is because I think the context layer should be boring, auditable, and safe before anything else is built on top of it.

This becomes even more important when AI agents enter the picture.

If an agent needs Kubernetes context, I would rather give it a narrow read-only tool that returns structured facts than hand it broad cluster access and hope the prompt is enough of a safety boundary.

Kubernetes context for AI agents

One of the areas I am experimenting with is exposing kctx through the Model Context Protocol.

Current serve modes include:

kctx serve --mode mcp
kctx serve --mode mcp-sse

The MCP tools currently cover the same core context operations:

get_namespace_health
explain_resource
trace_service
get_pod_graph
dump_namespace

The idea is simple: let an AI assistant ask for Kubernetes context without giving it mutation capabilities.

No remediation.
No raw YAML firehose.
No root-cause guessing dressed up as certainty.

Just compact operational facts that can be used as evidence.
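
For readers who have not used MCP: a client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the tools listed above and rejects anything outside that list. The tool names come from this post; the argument names (`namespace`, `name`) are assumptions, since the real input schemas are defined by the server.

```python
import json

# Tool names as listed in this post.
READ_ONLY_TOOLS = {
    "get_namespace_health",
    "explain_resource",
    "trace_service",
    "get_pod_graph",
    "dump_namespace",
}


def build_tool_call(request_id, tool, arguments):
    """Build an MCP tools/call request, refusing tools outside the known set."""
    if tool not in READ_ONLY_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Hypothetical arguments; check the server's tool schema for the real ones.
request = build_tool_call(1, "trace_service", {"namespace": "payments", "name": "payments-api"})
payload = json.dumps(request)
```

In practice an MCP client library handles this framing; the sketch only shows how small the surface is when every tool is a read.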

There is also an MCP/SSE release test guide for anyone who wants to try this with a local kind cluster, the released Helm chart, Online Boutique, ngrok, Codex, Claude Code, or ChatGPT Developer Mode:

https://github.com/lucasepe/kctx/tree/main/docs/kctx-mcp-sse-release-test-guide.pdf

Important note: the MCP/SSE endpoint is read-only, but built-in AuthN/AuthZ is not production-ready yet. Treat it as local-lab or trusted-network tooling for now, or put it behind an external access-control layer.

CRDs need semantics, not wishful thinking

Another design choice: kctx does not pretend that every custom resource can be understood generically.

Kubernetes discovery can tell you that a CRD exists. It cannot tell you what that CRD means operationally.

So kctx uses explicit adapters for ecosystem-specific resources. An adapter can translate a CRD into the same core model used by the rest of the project: resource identity, compact status, related entities, relations, signals, and graph nodes or edges.

The current adapter set includes:

Argo CD Application
Argo CD AppProject
cert-manager Certificate

That approach is slower than saying "we support every CRD", but I think it is more honest. If a tool is going to describe operational context, it should know what it is describing.
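
The adapter idea can be sketched as a registry keyed by API group and kind, where unknown resources are declined rather than guessed at. This is an illustration, not kctx's adapter interface: the function names and output shape are assumptions. The status fields it reads (`status.health.status` and `status.sync.status` on an Argo CD Application, the `Ready` condition on a cert-manager Certificate) are the ones those projects publish.

```python
def argo_application(obj):
    """Summarize an Argo CD Application from its health and sync status."""
    status = obj.get("status", {})
    health = status.get("health", {}).get("status", "Unknown")
    sync = status.get("sync", {}).get("status", "Unknown")
    signals = []
    if health != "Healthy":
        signals.append(f"health={health}")
    if sync != "Synced":
        signals.append(f"sync={sync}")
    return {"status": {"health": health, "sync": sync}, "signals": signals}


def cert_manager_certificate(obj):
    """Summarize a cert-manager Certificate from its Ready condition."""
    conditions = obj.get("status", {}).get("conditions", [])
    ready = any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )
    return {"status": {"ready": ready}, "signals": [] if ready else ["NotReady"]}


ADAPTERS = {
    ("argoproj.io", "Application"): argo_application,
    ("cert-manager.io", "Certificate"): cert_manager_certificate,
}


def adapt(obj):
    """Return a summary for supported kinds, or None when no adapter exists."""
    group = obj.get("apiVersion", "").split("/")[0]
    adapter = ADAPTERS.get((group, obj.get("kind")))
    return adapter(obj) if adapter else None


app = {
    "apiVersion": "argoproj.io/v1alpha1",
    "kind": "Application",
    "status": {"health": {"status": "Degraded"}, "sync": {"status": "Synced"}},
}
unknown = {"apiVersion": "example.com/v1", "kind": "Widget"}
```

Returning `None` for an unsupported kind is the code-level version of the point above: no adapter, no operational claims.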

JSON first, because scripts and agents need contracts

The CLI and HTTP API emit versioned JSON by default.

Responses include a schema version and kind, for example:

{
  "schemaVersion": "kctx.io/v1alpha1",
  "kind": "NamespaceHealth"
}

The repository includes machine-readable JSON schemas under:

schemas/kctx.io/v1alpha1

That part may sound less exciting than graphs or agents, but it is one of the pieces I care about most.

If humans are the only users, text output can be enough. If scripts, CI systems, incident tooling, and AI agents are also users, the output needs a contract.
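
On the consumer side, a contract means a script can check the envelope before trusting the body. A minimal sketch, assuming only the two fields shown above (`schemaVersion` and `kind`); the function name and error messages are illustrative.

```python
import json

SUPPORTED_VERSIONS = {"kctx.io/v1alpha1"}


def parse_response(raw, expected_kind):
    """Parse a JSON response and fail loudly if the envelope is not what we expect."""
    doc = json.loads(raw)
    version = doc.get("schemaVersion")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported schemaVersion: {version!r}")
    if doc.get("kind") != expected_kind:
        raise ValueError(f"expected kind {expected_kind!r}, got {doc.get('kind')!r}")
    return doc


raw = '{"schemaVersion": "kctx.io/v1alpha1", "kind": "NamespaceHealth"}'
health = parse_response(raw, "NamespaceHealth")
```

Failing on an unknown version is deliberate: a script that silently reads a newer schema can act on fields whose meaning has changed.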

Data safety

kctx is designed to provide operational context, not sensitive data.

It avoids returning raw manifests, Secret data, ConfigMap data, raw environment variables, logs, and workload metrics.

Supported outputs also pass metadata and Kubernetes messages through a small redaction policy for common secret-bearing keys and text patterns.

This is not a magic privacy shield, but it is an intentional boundary in the design.
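
To show what a policy of this kind can look like, here is a small sketch that masks values under secret-looking keys and `key=value` fragments inside free text. The key list and patterns are assumptions made up for the example; they are not kctx's actual rules, and a real policy would need to be broader.

```python
import re

REDACTED = "[REDACTED]"

# Hypothetical patterns, for illustration only.
SENSITIVE_KEY = re.compile(r"(password|passwd|secret|token|api[-_]?key)", re.IGNORECASE)
SENSITIVE_TEXT = re.compile(r"\b(password|token|secret)=\S+", re.IGNORECASE)


def redact(value, key=""):
    """Recursively mask values under sensitive keys and secrets embedded in text."""
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v, key) for v in value]
    if SENSITIVE_KEY.search(key):
        return REDACTED
    if isinstance(value, str):
        return SENSITIVE_TEXT.sub(lambda m: f"{m.group(1)}={REDACTED}", value)
    return value


sample = {
    "annotations": {"db-password": "hunter2", "team": "payments"},
    "message": "login failed token=abc123 retrying",
}
clean = redact(sample)
```

Pattern matching like this catches common cases and misses others, which is why it belongs alongside the stronger rule of not returning Secret or ConfigMap data at all.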

A tiny example workflow

Install:

curl -fsSL https://raw.githubusercontent.com/lucasepe/kctx/main/install.sh | bash

Run against your current Kubernetes context:

kctx health namespace default
kctx explain pod <pod-name> --namespace default
kctx trace service <service-name> --namespace default
kctx graph pod <pod-name> --namespace default --render mermaid
kctx dump namespace default

Or run the read-only HTTP server:

kctx serve
curl http://localhost:8080/health/namespace/default

Or install the in-cluster server with Helm:

VERSION=0.3.0
helm upgrade --install kctx \
  "https://github.com/lucasepe/kctx/releases/download/v${VERSION}/kctx-${VERSION}.tgz" \
  --namespace kctx-system \
  --create-namespace

Then:

kubectl -n kctx-system port-forward svc/kctx 8080:8080
curl http://localhost:8080/health/namespace/default
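
The same request can be made from a script. This is a minimal sketch using only the Python standard library; the `/health/namespace/<namespace>` path is the one from the curl example, while the function names and the timeout are assumptions.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def namespace_health_url(base_url, namespace):
    """Build the namespace health URL used in the curl example."""
    return f"{base_url.rstrip('/')}/health/namespace/{quote(namespace, safe='')}"


def fetch_namespace_health(base_url, namespace, timeout=5.0):
    """GET the namespace health document and decode it as JSON."""
    with urlopen(namespace_health_url(base_url, namespace), timeout=timeout) as resp:
        return json.load(resp)


url = namespace_health_url("http://localhost:8080/", "default")
```

With the server or port-forward running, `fetch_namespace_health("http://localhost:8080", "default")` returns the decoded response as a dictionary.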

What I am looking for

The project is still under active development. It is useful today, but I am still hardening packaging, production deployment guidance, auth boundaries for server mode, and client compatibility around MCP/SSE.

I would love feedback from SREs, platform engineers, Kubernetes operators, and people experimenting with AI-assisted operations.

In particular:

Does the output feel like useful operational context?
Are the signals too noisy, too sparse, or missing obvious facts?
Which CRDs would be most valuable to support next?
Does the JSON contract work for your scripts or internal tools?
Does the MCP interface fit how you want AI agents to inspect infrastructure?
Where does the install, Helm chart, or local test flow feel confusing?
What would make you trust a tool like this in a production debugging workflow?

If you try the MCP/SSE path, I am especially interested in:

OS and Kubernetes/kind version
AI client and version
transport used: localhost, port-forward, ngrok, or internal URL
whether the standalone smoke test passed
whether the AI client discovered and called the tools
any rough edges in the responses

The repo

GitHub:

https://github.com/lucasepe/kctx

If this idea resonates with you, a star would help the project reach more Kubernetes and platform people.

But even more useful: open an issue, tell me where the model feels wrong, or share a debugging scenario where structured context would have saved time.

That is the kind of feedback that can make kctx sharper.