Why I'm building Hyphae: provenance over prediction (and the 3-line baseline that tied it)

A few months ago I set out to build a cognitive substrate without a large language model in the answering path. I had a thesis I liked, a Rust workspace, and a lot of conviction.

Then I wrote a three-line baseline that tied it on every metric I cared about.

This is the story of why that was the best thing that happened to the project — and why I'm still building it, just pointed at a sharper target.

The problem I actually care about

When a language model answers a grounded question, it paraphrases its sources. That paraphrase is fluent, often correct, and — this is the part that bothers me — impossible to bind back to its source byte-for-byte. You can cite a document. You cannot prove, after the fact, that the words in the answer are the words that were stored, unaltered, at a known position.

For a chatbot that doesn't matter. For anything that has to be audited — a compliance trail, a medical or legal memory, an agent acting on your behalf over months — it matters a lot. "Trust me, I read the docs" is not a property you can verify.

So I started building Hyphae: a substrate that answers by emitting byte-identical quotations of stored memory fragments, over a SHA-256 hash-chained journal, with no LLM in the cognition path. Rust, CPU-only, a single binary.

The shape of it is simple:

// Every stored fragment is appended verbatim to a hash chain.
let (seq, head) = journal.append("memory_op", fragment.bytes())?;

// An answer span is a byte-identical quotation of a stored fragment.
// Tamper with any historical entry and the recomputed chain breaks
// at the next link — verify() localises exactly where.
journal.verify()?;
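
To make those two calls concrete, here is a minimal sketch of an append-only hash chain with a verifier that reports where it breaks. It is an illustration, not Hyphae's actual code: `Journal`, `Entry`, and `link` are hypothetical names, the operation label is dropped, and the standard library's `DefaultHasher` stands in for SHA-256 so the block has no dependencies. `DefaultHasher` is not a cryptographic hash; a real journal would use SHA-256 as described above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// One journal entry: the verbatim payload plus its link hash.
pub struct Entry {
    pub payload: Vec<u8>,
    pub hash: u64,
}

/// Stand-in for SHA-256 (NOT cryptographic): hashes the previous
/// link together with the payload, so each entry commits to its past.
fn link(prev: u64, payload: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u64(prev);
    h.write(payload);
    h.finish()
}

#[derive(Default)]
pub struct Journal {
    pub entries: Vec<Entry>,
}

impl Journal {
    /// Append a payload verbatim; returns (sequence number, new head).
    pub fn append(&mut self, payload: &[u8]) -> (usize, u64) {
        let prev = self.entries.last().map_or(0, |e| e.hash);
        let hash = link(prev, payload);
        self.entries.push(Entry { payload: payload.to_vec(), hash });
        (self.entries.len() - 1, hash)
    }

    /// Recompute every link. Err(i) is the index of the first entry
    /// whose stored hash no longer matches its recomputed link.
    pub fn verify(&self) -> Result<(), usize> {
        let mut prev = 0;
        for (i, e) in self.entries.iter().enumerate() {
            if link(prev, &e.payload) != e.hash {
                return Err(i);
            }
            prev = e.hash;
        }
        Ok(())
    }
}
```

Appending three fragments and then overwriting the payload of the second one in place makes `verify()` return the index of that entry, which is the localisation property the comment above refers to.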

Nothing here is cryptographically novel. Hash-chained logs are old and well understood — Haber & Stornetta in 1991, Merkle before that, Certificate Transparency, git. I want to be honest about that up front, because the interesting part isn't the chain.

The day a three-line baseline tied me

I wanted to show Hyphae was better than an LLM+RAG pipeline at grounded answering. So I built the comparison properly: a real retriever, reranking, six models across three retrieval modes, two corpora, twelve metrics.

Then a reviewer asked the obvious question: what does a trivial baseline score?

So I wrote echo — a few lines that just print the retrieved fragment back. It tied Hyphae on every correctness and grounding metric. So did echo + journal.

That stings for about a day. Then it becomes the whole point.

The measured correctness and grounding were never properties of my system. They are properties of verbatim quotation itself. If you emit a stored span unchanged, of course it's "grounded" — it is the source. Hyphae's seventeen subsystems weren't what made the answers auditable. The verbatim-emission-over-a-journal layer was. And that layer is addable to any extractive retrieval system — it isn't Hyphae-specific at all.

So I stopped claiming Hyphae was a better brain and started claiming something narrower and, I think, truer:

Verifiable provenance is a property you can add to grounded retrieval. A paraphrase destroys byte-level bindability to its source; a verbatim quotation preserves it, and a hash chain makes that binding independently auditable.

The contribution isn't the hash chain. It's the observation, and the measurement of it against eighteen LLM configurations and a tamper-detection benchmark.
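
The observation itself fits in a few lines. The sketch below assumes a hypothetical `Span` that records which stored fragment an answer came from and at what byte offset; none of these names come from the Hyphae codebase. A verbatim quotation passes the check, and any paraphrase, however faithful, fails it.

```rust
/// A claimed quotation: which stored fragment it came from and where.
/// (Hypothetical shape, for illustration only.)
pub struct Span {
    pub fragment: usize, // index of the stored fragment
    pub offset: usize,   // byte offset inside that fragment
    pub len: usize,      // length in bytes
}

/// True only if `answer` is byte-identical to the stored bytes the
/// span points at. Out-of-range spans simply fail to bind.
pub fn binds(store: &[Vec<u8>], span: &Span, answer: &[u8]) -> bool {
    let end = match span.offset.checked_add(span.len) {
        Some(e) => e,
        None => return false,
    };
    store
        .get(span.fragment)
        .and_then(|f| f.get(span.offset..end))
        .map_or(false, |stored| stored == answer)
}
```

With a stored fragment `The dose is 5 mg twice daily.` and a span at offset 4 of length 12, the answer `dose is 5 mg` binds, while `5 mg, twice a day` does not. That is the whole difference between quoting and paraphrasing, and it is why a trivial echo scores the same as anything else that quotes.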

Closing the gaps, honestly

Once you claim "tamper-evident," people who know what they're doing immediately ask where it breaks. Good. The threat model is the product.

A bare hash chain catches a store-only attacker who edits a record in place. It does not catch a chain-aware attacker who recomputes every hash forward and rewrites the head, because the head lives in the same store. So I anchor the head with an Ed25519 signature whose signing key is held outside the store, so the attacker can't re-sign a forged head. That closes the chain-aware rewrite.
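
The difference between the two attackers is easy to show in code. This is a sketch under loud assumptions: `DefaultHasher` stands in for SHA-256 (it is not cryptographic), and the Ed25519 signature is modelled as a trusted copy of the head that the auditor holds out of band, which captures the one property that matters here: the attacker cannot produce a valid anchor for a forged head. All function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in for SHA-256 (NOT cryptographic).
fn link(prev: u64, payload: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write_u64(prev);
    h.write(payload);
    h.finish()
}

/// Build a chain of (payload, hash) entries. A chain-aware attacker
/// can run exactly this over rewritten payloads.
pub fn build(payloads: &[&str]) -> Vec<(Vec<u8>, u64)> {
    let mut prev = 0;
    payloads
        .iter()
        .map(|p| {
            prev = link(prev, p.as_bytes());
            (p.as_bytes().to_vec(), prev)
        })
        .collect()
}

/// Internal consistency only: every stored hash matches its link.
pub fn chain_ok(entries: &[(Vec<u8>, u64)]) -> bool {
    let mut prev = 0;
    entries.iter().all(|(p, h)| {
        let ok = link(prev, p) == *h;
        prev = *h;
        ok
    })
}

/// Consistency plus an anchor check. `anchored_head` models a head
/// whose signature verified against a key the store never sees.
pub fn anchored_ok(entries: &[(Vec<u8>, u64)], anchored_head: u64) -> bool {
    chain_ok(entries) && entries.last().map_or(false, |(_, h)| *h == anchored_head)
}
```

A forged chain built from rewritten payloads passes `chain_ok`, because it is internally consistent, but fails `anchored_ok`, because its head is not the one that was anchored.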

But a single signature pins a valid head, not the latest one. Every head the journal ever had was, at its time, legitimately signed. An attacker can roll back to an earlier state and replay its genuine-but-stale anchor — and a lone signature check accepts it. So the heads get published to an append-only, hash-chained ledger, and an auditor checks the current head against the ledger's tail:

// A single signature pins *a* valid head.
// An append-only ledger pins *the latest* one.
verify_fresh_head(&current_head, &ledger, &verifying_key)?; // Err on rollback

That's the pattern from Certificate Transparency and git, applied to memory provenance: the value isn't the chain, it's publishing the head to a monotonic log that third parties can compare.
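
A minimal sketch of the freshness check, assuming the ledger is just an ordered list of published heads. Signature verification and the ledger's own hash chaining are deliberately left out, and `check_freshness` and `Freshness` are hypothetical names, not the `verify_fresh_head` shown above.

```rust
#[derive(Debug, PartialEq)]
pub enum Freshness {
    /// The head is the most recent one published.
    Fresh,
    /// The head was genuinely published once, but has been superseded.
    Stale { published_at: usize, latest: usize },
    /// The head never appeared in the ledger at all.
    Unknown,
}

/// Compare a presented head against an append-only ledger of heads.
/// Only the ledger's tail counts as fresh.
pub fn check_freshness(current_head: u64, ledger: &[u64]) -> Freshness {
    match ledger.iter().rposition(|h| *h == current_head) {
        Some(i) if i + 1 == ledger.len() => Freshness::Fresh,
        Some(i) => Freshness::Stale {
            published_at: i,
            latest: ledger.len() - 1,
        },
        None => Freshness::Unknown,
    }
}
```

A rolled-back store presents a head that sits somewhere in the middle of the ledger. Its old signature still verifies, but the check reports `Stale` instead of accepting it.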

And I keep a column for what I haven't closed: a store that withholds later ledger entries is only caught once an auditor gets the true tail from an external witness (a timestamp authority, a gossiped tree head). That's deployment work, and I'd rather write it down than pretend it's solved.
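
For the open gap, the auditor-side check might look roughly like the sketch below. It assumes a hypothetical witness that attests to a ledger length and the head at that position; how the witness learns those values (a timestamp authority, gossip) is exactly the deployment work that is not solved here.

```rust
/// What an external witness attests to (hypothetical shape):
/// how many heads it has seen, and the last one.
pub struct WitnessView {
    pub len: usize,
    pub tail: u64,
}

#[derive(Debug, PartialEq)]
pub enum TailCheck {
    Consistent,
    /// The store is hiding entries the witness has already seen.
    Withheld { missing: usize },
    /// The store's history disagrees with the witness.
    Diverged,
}

/// Compare the ledger a store presents with what the witness saw.
pub fn check_against_witness(presented: &[u64], witness: &WitnessView) -> TailCheck {
    if presented.len() < witness.len {
        return TailCheck::Withheld {
            missing: witness.len - presented.len(),
        };
    }
    // The witness may lag behind the store; its tail must still sit
    // at the position it attested to.
    match witness.len.checked_sub(1).and_then(|i| presented.get(i)) {
        Some(h) if *h == witness.tail => TailCheck::Consistent,
        Some(_) => TailCheck::Diverged,
        None => TailCheck::Consistent, // the witness has seen nothing yet
    }
}
```

Without a witness there is nothing to compare against, which is why this stays in the "not closed" column.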

Where this is going

The direction got clearer the moment the echo baseline humbled me. I'm not building a better answer engine. I'm building provenance as a first-class, measurable property of grounded AI, in the open. Concretely:

A provenance benchmark. Correctness benchmarks compare RAG systems on answer quality. There's no standard way to compare verifiable-generation systems on whether tampering is detectable and localisable. So I built one: a tampering taxonomy, an adversary-capability matrix, and a scoring protocol any system can plug into. That's the axis I think actually matters for AI you have to trust over time.
Provenance as an addable layer. Realizer-independence, meaning the layer does not depend on which system sits on top of it, is the feature, not a caveat. The goal is for any extractive retriever to be able to adopt the layer.
External witnessing and key rotation to harden the ledger for real deployments.
The bigger picture — this is one piece of Celiums, where the bet is that memory is the foundation for AI agents you can actually audit, not an afterthought bolted onto a model.
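
To give a feel for the benchmark item above, here is a sketch of how detection and localisation could be scored separately. The `Trial` and `Score` shapes are assumptions made up for this illustration; they are not the published scoring protocol, which lives in the repository.

```rust
/// One benchmark trial (hypothetical shape).
pub struct Trial {
    pub tampered_at: Option<usize>, // None = clean control run
    pub reported_at: Option<usize>, // None = system reported nothing
}

#[derive(Debug, PartialEq)]
pub struct Score {
    pub tampered: usize,     // trials where tampering was injected
    pub detected: usize,     // of those, how many raised any alarm
    pub localised: usize,    // of those, how many named the right entry
    pub clean: usize,        // control trials with no tampering
    pub false_alarms: usize, // control trials that raised an alarm
}

/// Tally detection, localisation, and false alarms over all trials.
pub fn score(trials: &[Trial]) -> Score {
    let mut s = Score { tampered: 0, detected: 0, localised: 0, clean: 0, false_alarms: 0 };
    for t in trials {
        match (t.tampered_at, t.reported_at) {
            (Some(at), Some(reported)) => {
                s.tampered += 1;
                s.detected += 1;
                if at == reported {
                    s.localised += 1;
                }
            }
            (Some(_), None) => s.tampered += 1,
            (None, Some(_)) => {
                s.clean += 1;
                s.false_alarms += 1;
            }
            (None, None) => s.clean += 1,
        }
    }
    s
}
```

Keeping detection and localisation as separate counts matters: a system that notices tampering but points at the wrong entry is detected but not localised, and a single pass/fail number would hide that.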

It's all open

The substrate, the LLM+RAG comparator, every result envelope, the tamper-detection experiment, the provenance benchmark, and the full preprint are public. Code is Apache-2.0; the docs, corpora, and preprint are CC-BY-4.0.

Code: https://github.com/terrizoaguimor/hyphae-v2
Preprint (Zenodo DOI): https://doi.org/10.5281/zenodo.20436643

I'm a solo, self-taught founder building this in public, which means the dead ends are public too — the echo baseline being the best example. If you work on retrieval, tamper-evident logs, or grounded generation, I'd genuinely like to hear where you think this breaks. The threat model only gets better when someone smarter than me attacks it.