TIL - Graph Thinking Without a Graph Database

This week I read DDIA Chapter 2 related to data models. Most of it felt familiar. Relational vs document, many-to-many with junction tables, schema-on-read vs schema-on-write. These were things I had opinions about already.

But the graph data model section was a blind spot. I assumed graph databases were for social networks, interesting but not relevant to anything I was doing.

What graph data models actually are

One key aspect that the chapter emphasizes is how different data models handle many-to-many relationships. In a relational data model, we usually have several tables and a junction table that connects them. We also can add some additional columns to that junction table.

For a graph data model, we can think of it having 2 different tables: nodes and edges. Nodes are entities. Edges are relationships between them. Both can have properties.

There is not much difference between a relational and graph data model for a single relationship at a fixed depth. For example, a friendships table with user_a_id, user_b_id, since, is_close_friend is basically an edge with properties. Relational handles that fine.

The difference shows up when we start traversing.

Say we want "friends of friends". With a junction table, that's a self-join. "Friends of friends of friends" is another join. "Anyone reachable from me through any number of friendship hops" is a recursive CTE. It works, but the query complexity has nothing to do with how simple the question sounds.

In a graph query language, traversal is the native operation. Here's friends of friends in Cypher (Neo4j's query language):

MATCH (me:User {id: $userId})-[:FOLLOWS]->(friend)-[:FOLLOWS]->(fof)
RETURN fof

We can read it almost like a sentence: match the pattern where I follow a friend, who follows a friend-of-friend. The arrows are edges; [:FOLLOWS] is the edge type to traverse.

And arbitrary depth is just one more character:

MATCH (me:User {id: $userId})-[:FOLLOWS*]->(reachable)
RETURN reachable

The * means "follow any number of these edges." Same query shape whether it's one hop or ten. In SQL, that jump from fixed depth to arbitrary depth means rewriting our query as a recursive CTE.

The second difference is that a graph treats different relationship types uniformly. In relational, follows, blocks, and memberships are usually separate tables, and traversing across them means a different join per table. In a graph, they're all just edges, and we can traverse across types in a single pattern.

So my take is that the real distinction is the traversal. Especially variable-depth traversal across multiple relationship types. It's a first-class operation in a graph model and an awkward bolt-on in SQL.

The shape that fits

The chapter convinced me that graph models suit problems where:

Relationships are recursive or variable-depth (friends of friends, transitive dependencies, reachability)
Multiple paths can exist between the same two entities
The type of relationship matters as much as the entities themselves
Queries are about traversal and reachability, not just lookup

If our data is mostly "fetch a row by ID" or "join two tables on a foreign key", relational is fine. But the moment we start asking "what's reachable from here through any valid path?", that's a graph question, whether we store it in a graph database or not.

Looking at my own work through this lens

I worked on an ads management system. The schema looked like this:

Some queries this service needed to answer:

Find all keywords in a shop.
Find all keywords in a group.
Find all keywords in an ad. (Keywords directly on the ad, plus keywords on the group that contains it.)

Reasonable schema, queries, and I'd worked with this code. But when I tried drawing the data as a graph, this is what I got:

There are multiple paths from a shop to a keyword: through an ad group, through an ad, through both. When I query "all keywords in a shop", I'm doing a graph traversal: "find every Keyword reachable from this Shop through any path of contains edges". I just hadn't been calling it that.

What actually changed for me

1. I started thinking about reachability instead of joins

Before: every question about the data felt like a question about which tables to join. To get keywords in a shop, I join keyword tables with ad/ad_group tables and filter by shop_id. The query was a sequence of join operations.

After: every question about the data feels like a question about which nodes are reachable from which. Find all keyword nodes reachable from this shop. The traversal is the question, and the join is just one implementation of the traversal.

The shift sounds subtle, but it's what made other approaches (denormalization, recursive CTEs, even just rephrasing the SQL) become visible. Once the question is "what's reachable from here?", the answer doesn't have to be "join these tables." It can be anything that gets us the same set of reachable nodes.

2. I noticed the multiple paths problem

In a tree, every node has exactly one parent. The hierarchy I was working with isn't a tree. A keyword can be reached from a shop through an ad group, or through an ad, or through both. An ad can belong to a shop directly, or be inside a group.

I'd been treating this as a quirk of the schema. The nullable columns, the join table, the two separate keyword tables, these were just "how things are." But the graph lens names it clearly: the data has multiple paths between the same kinds of nodes. That's a structural property, not a quirk.

And it explains why my SQL queries kept needing unions. Each UNION branch is one path. The graph is telling me up front that I'm going to need multiple branches; the schema was hiding that until the query made it visible.

3. I saw edges that weren't in my schema

The graph diagram has a contains edge from ad_group to ad. In my schema, that relationship lives in the ad_group_ad junction table.

But the graph also has implicit relationships that my schema doesn't model directly. The "keyword in shop X" relationship is real (and we query for it constantly), but no column or table represents it directly. It's a derived relationship, computed every time we run the traversal.

That's where the option space opens up. Once I can see shop-to-keyword as a meaningful relationship, I can ask whether to materialize it (denormalize shop_id onto every keyword) or keep deriving it (current approach with traversal). Both are valid; the graph view is what made the choice visible.

What it would look like as a graph query

In Cypher, "all keywords in a shop" is:

MATCH (s:Shop {id: $shopId})-[:CONTAINS*]->(k:Keyword)
RETURN k

In our actual schema, the same query takes two branches:

-- Keywords on groups in this shop
SELECT k.* FROM ad_group_keyword k
JOIN ad_group g ON k.ad_group_id = g.id
WHERE g.shop_id = $shopId

UNION

-- Keywords on ads in this shop
SELECT k.* FROM ad_keyword k
JOIN ad a ON k.ad_id = a.id
WHERE a.shop_id = $shopId AND a.status = 'active'

It works. But this query got slow on us in a way that took us a while to understand.

The issue wasn't the JOIN itself, or the UNION. It was the query planner. When a shop has many keywords, the planner sometimes picks an index path that ends up scanning across the ad tables (soft-deleted rows included), even when the request is for a small page of results.

The behavior was hard to predict because it depended on the shop's data distribution. Small shops were fine. Large shops sometimes triggered the bad path. And because the keyword list is a batch API, a single slow query multiplied across the batch and put real pressure on the database.

The team's fix was to stop letting the planner choose. We took the joins out of SQL and resolved them in application code: query each table separately with WHERE shop_id = X AND status = 'active' (which uses clean indexes predictably), then stitch the results in Go.

It works. But it's a workaround for a query that's conceptually one thing: find all keywords reachable from this shop. The graph traversal is happening, just spread across multiple queries and some application code, with the planner taken out of the loop entirely.

Would I actually use a graph database?

Constraints:

The hierarchy is only 3-4 levels deep. Graph databases shine on deep or unbounded traversal. Mine is bounded.
Nobody on the team has run Neo4j in production. PostgreSQL we know cold.
Our source of truth is already in Postgres. Adding a graph database means syncing two stores or migrating the source of truth, both big commitments.
Our queries are predictable. We're not doing pattern matching or shortest-path.

I'm not sure I'd reach for Neo4j here. The elegance of the Cypher query is real, but the operational cost feels high for the shape of problem I have.

What's more interesting is that the graph lens opened up another option.

A different option

What if every entity in the hierarchy carried its ancestor IDs directly?

ad:               id, shop_id, ad_group_id (nullable), status
ad_group_keyword: id, ad_group_id, tag, shop_id
ad_keyword:       id, ad_id, shop_id, ad_group_id (nullable), status

ad_group_ad still records the actual ad-to-group membership, but ad.ad_group_id is a maintained denormalization. Same idea on the keyword tables: each keyword carries shop_id, and ad_keyword also carries ad_group_id and status. The ID columns are nullable where the relationship doesn't apply.

Now the "find" queries get simpler:

-- All keywords in a shop
SELECT * FROM ad_group_keyword WHERE shop_id = $shopId
UNION
SELECT * FROM ad_keyword WHERE shop_id = $shopId AND status = 'active'

-- All keywords in a group
SELECT * FROM ad_group_keyword WHERE ad_group_id = $groupId
UNION
SELECT * FROM ad_keyword WHERE ad_group_id = $groupId AND status = 'active'

-- All keywords in an ad
-- Step 1: get the ad's group (single PK read)
SELECT ad_group_id FROM ad WHERE id = $adId
-- Step 2: pull keywords from both sources
SELECT * FROM ad_keyword WHERE ad_id = $adId AND status = 'active'
UNION
SELECT * FROM ad_group_keyword WHERE ad_group_id = $adGroupId

These are direct lookups on indexed columns. There's no join for the planner to mis-optimize, no intermediate result set whose size depends on data distribution. The query behavior is the same whether a shop has 50 keywords or 50,000.

The third case is two reads, but step 1 is just a primary-key lookup on the ad row we'd usually be fetching anyway.

The cost is that denormalization is now a system, not a single column. Moving an ad updates ad_group_id across its keywords; soft-deleting an ad updates status across its keywords. Two operations, both fan out to the keyword rows, both have to be transactional or the data drifts.

Drift is the part that worries me. The query patterns get dramatically cleaner, but the write paths get more places they could go wrong. Six months later, someone adds a new way to move ads between groups and forgets to update the keywords. The reads quietly return wrong results. So I'm not sure denormalization is the answer either.

What surprised me is that I'd been treating it as a normalization problem ("where should the foreign keys go?") instead of a modeling problem ("what shape does the data actually have?"). The graph perspective is what reframed it for me.

The takeaway

The data was always graph-shaped. The queries were always graph queries. The schema and the application code were doing graph work without the vocabulary to describe it. And once I could see the shape, alternatives I hadn't been considering became visible, even if I haven't decided which one is right.

I'm not sure I'll ever reach for a graph database. But learning about them is already changing how I think about modeling, even though I'm not using one.