Field Note

Six months of a persistent knowledge graph under operational load.

MEMORY 2026·04·24

Background

CCL is an AI agent. She runs locally on the CEO's machine and acts as his partner across the lab's work. Calendar, code review, planning, writing, brain access. CCL runs on top of a persistent knowledge graph that captures everything she has learned about the operator's world. The graph has been running for six months. At the cutover from v1 to v2, it held 432 nodes and 1,117 edges. This note is the operational record of what we kept, what we threw out, and what almost ended the experiment.

Mechanism

The graph is updated by a sequential heartbeat. An UPDATE pass ingests new conversation activity into entity, event, and decision nodes. A CLEANUP pass refines or fades stale ones. Both run once per tick. The substrate (a graph) is not unusual. The discipline around what each pass is allowed to do is.
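A minimal sketch of one tick, to make the ordering concrete. This is not CCL's actual code. The names run_tick, example_update, and example_cleanup, the dict-based graph, and the guard that CLEANUP may not add or remove nodes are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Graph = Dict[str, dict]  # hypothetical shape: node id -> node fields


def run_tick(
    graph: Graph,
    new_activity: List[dict],
    update: Callable[[Graph, List[dict]], None],
    cleanup: Callable[[Graph], None],
) -> None:
    """Run one heartbeat tick: UPDATE, then CLEANUP, each exactly once."""
    update(graph, new_activity)  # ingest new conversation activity
    ids_after_update = set(graph)
    cleanup(graph)  # refine or fade stale nodes
    # Assumed guard: CLEANUP may change nodes but not add or delete them.
    if set(graph) != ids_after_update:
        raise RuntimeError("CLEANUP changed the set of node ids")


def example_update(graph: Graph, activity: List[dict]) -> None:
    """Toy UPDATE pass: one node per activity item."""
    for item in activity:
        graph[item["id"]] = {"kind": item["kind"], "faded": False}


def example_cleanup(graph: Graph) -> None:
    """Toy CLEANUP pass: fade every event node, leave the rest alone."""
    for node in graph.values():
        if node["kind"] == "event":
            node["faded"] = True
```

The point of the guard is that each pass has a narrow contract that can be checked mechanically, rather than trusted.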

Every node carries two timestamps: when the fact became valid, and when it became invalid. Stale facts are not deleted. They are dated and pushed out of the active set. This is the bi-temporal model. It is what lets CLEANUP be aggressive without erasing history.
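The two-timestamp rule can be sketched as follows. The field names valid_from and invalid_at and the helpers fade and active_set are hypothetical. The source only says that each node carries a valid time and an invalid time and that stale facts are dated, not deleted.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Node:
    node_id: str
    valid_from: datetime                   # when the fact became valid
    invalid_at: Optional[datetime] = None  # None means still valid


def fade(node: Node, when: datetime) -> None:
    """Retire a fact by dating it. The node itself stays in the store."""
    if node.invalid_at is None:
        node.invalid_at = when


def active_set(nodes: List[Node], as_of: datetime) -> List[Node]:
    """Return the nodes whose validity interval contains as_of."""
    return [
        n for n in nodes
        if n.valid_from <= as_of and (n.invalid_at is None or as_of < n.invalid_at)
    ]
```

Because fading only sets a date, a query with an earlier as_of still sees the node. That is the property that makes aggressive cleanup safe.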

Compounding error

The v1 graph had no concept of structural validity at write time. A bad write would land, get used as input to the next pass, and produce subsequent bad writes that referenced it. By the time the operator noticed, the graph carried a small but persistent shadow of itself drifting alongside the truth.

An audit at the time found that 63.8% of edges had their "since" timestamp silently defaulted to session start, destroying the actual relationship history. 47.2% of nodes had a null "verified_at". The graph was carrying load it could not name.
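An audit of this kind can be sketched in a few lines. This is an assumed reconstruction, not the audit that was run. The dict-based records and the session_start argument are hypothetical, and comparing "since" to session start is only a heuristic for spotting a defaulted value.

```python
from typing import Callable, Dict, Iterable


def fraction(items: Iterable[dict], predicate: Callable[[dict], bool]) -> float:
    """Share of items matching predicate; 0.0 for an empty collection."""
    items = list(items)
    if not items:
        return 0.0
    return sum(1 for item in items if predicate(item)) / len(items)


def audit(nodes: Iterable[dict], edges: Iterable[dict], session_start: str) -> Dict[str, float]:
    """Measure the two defects named in the text."""
    return {
        # Heuristic: an edge whose since equals session start was probably defaulted.
        "edges_since_defaulted": fraction(edges, lambda e: e.get("since") == session_start),
        # A node that was never verified has verified_at set to None (or missing).
        "nodes_unverified": fraction(nodes, lambda n: n.get("verified_at") is None),
    }
```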

The v2 cutover introduced two invariants. The writer rejects structurally invalid writes (schema, edge shape, lifecycle class checked at the point of write). The brain agent rejects semantically invalid ones (a write that contradicts the live source it claims to reflect does not land in the active graph; it lands in a drift log with the disagreement explicit).

Combined, the two checks are meant to keep incorrect nodes out of the active graph. The first month of operation under the new invariants surfaced 23 drift entries the v1 graph would have absorbed silently.
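A rough sketch of the two gates, under stated assumptions. The required fields, the Store type, the commit function, and the read_live_source callback are all hypothetical. The edge-shape check is omitted for brevity. Only the behaviour comes from the text: structural failures are rejected at write time, and semantic disagreements go to a drift log instead of the active graph.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

LIFECYCLE_CLASSES = {"operational", "structural", "intelligence", "temporal-bounded"}
REQUIRED_FIELDS = {"id", "kind", "lifecycle", "claim"}  # hypothetical schema


class StructuralError(ValueError):
    """Raised when a write fails the structural check."""


@dataclass
class Store:
    active: Dict[str, dict] = field(default_factory=dict)
    drift_log: List[dict] = field(default_factory=list)


def check_structure(write: dict) -> None:
    """Invariant 1: reject structurally invalid writes at the point of write."""
    missing = REQUIRED_FIELDS - set(write)
    if missing:
        raise StructuralError(f"missing fields: {sorted(missing)}")
    if write["lifecycle"] not in LIFECYCLE_CLASSES:
        raise StructuralError(f"unknown lifecycle class: {write['lifecycle']!r}")


def commit(store: Store, write: dict, read_live_source: Callable[[dict], object]) -> bool:
    """Return True if the write landed in the active graph."""
    check_structure(write)
    live = read_live_source(write)
    if live != write["claim"]:
        # Invariant 2: a contradicted write is logged with the disagreement explicit.
        store.drift_log.append({"write": write, "live": live})
        return False
    store.active[write["id"]] = write
    return True
```

Keeping the drift log separate from the active graph means a disagreement is recorded without ever becoming input to the next pass.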

Self-verification

CLEANUP does not trust the graph against itself. Entity-class flags (what a node says about who, what, where) are verified inline against the operating tools. GitHub for repository state. Slack for channel and message state. The calendar CLI for events. The brain's own conversation log for what was said.

The confidence math is explicit. A node resolves inline only if agree / (agree + disagree) ≥ 0.8 and at least one fresh source confirms. Below that threshold, the node surfaces to the operator's input queue rather than being silently rewritten.
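The rule above, written out as a function. The name resolves_inline and its parameters are hypothetical, and treating a node with no votes at all as unresolved is an assumption the text does not state.

```python
def resolves_inline(agree: int, disagree: int, fresh_confirmations: int,
                    threshold: float = 0.8) -> bool:
    """True if the node may be resolved inline; False sends it to the operator queue.

    agree / disagree: number of sources that agree or disagree with the node.
    fresh_confirmations: number of agreeing sources that count as fresh.
    """
    total = agree + disagree
    if total == 0:
        return False  # assumption: no evidence means no inline resolution
    return agree / total >= threshold and fresh_confirmations >= 1
```

For example, four agreeing sources and one disagreeing source give 4 / 5 = 0.8, which meets the threshold. Three against one gives 0.75, which does not.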

Outage behaviour

On 2026·05·02 the upstream API was unreachable for seven hours. The hourly heartbeat ran on the launchd schedule the whole time. Each cycle returned a ConnectionRefused after roughly 175 seconds and exited cleanly. Eight cycles in a row failed. No data was lost; no manual intervention was required.

The first cycle after the API recovered processed the backlog and the graph returned to current. The Telegram alert pipeline did not fire, because the host had no internet connection. That is the right outcome: with the network down, the pipeline should not emit spurious failure messages.

What this confirmed: the scheduler holds the schedule, the cycle is idempotent on entry, and a failed cycle is a recoverable cycle.
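One way to get this behaviour is a cursor that only advances after a successful fetch. The sketch below is an assumption about the shape of such a cycle, not the actual heartbeat. State, run_cycle, and fetch_since are hypothetical names. ConnectionRefusedError is Python's built-in exception for a refused connection.

```python
from typing import Callable, List


class State:
    """Hypothetical persisted state for the heartbeat."""

    def __init__(self) -> None:
        self.cursor = 0               # id of the last item applied
        self.applied: List[int] = []  # stand-in for writes to the graph


def run_cycle(state: State, fetch_since: Callable[[int], List[int]]) -> bool:
    """Run one cycle. Return False on a refused connection, True otherwise."""
    try:
        batch = fetch_since(state.cursor)
    except ConnectionRefusedError:
        # Exit cleanly. The cursor is untouched, so the next cycle retries
        # the same range and picks up whatever backlog has accumulated.
        return False
    for item in batch:
        state.applied.append(item)
    if batch:
        state.cursor = max(batch)
    return True
```

Under this design a failed cycle changes nothing, so any number of failures in a row leaves the state exactly where it was, and the first successful cycle processes the whole backlog.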

Lifecycle classes

Not every node ages the same way. The graph carries four lifecycle classes.

Operational

Short-lived state. PRs, missions, calls in flight. CLEANUP fades these on completion.

Structural

Companies, projects, agents, roles. CLEANUP almost never touches these without explicit operator direction.

Intelligence

Observations, decisions, hypotheses. CLEANUP refines and re-ranks these.

Temporal-bounded

Facts that are true within a date range. CLEANUP rolls these forward.
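The four classes map naturally onto an enum with one CLEANUP rule per class. This is an illustrative sketch. The Lifecycle enum, the cleanup_action function, the "completed" field, and the action names are hypothetical. The text says CLEANUP "almost never" touches structural nodes without direction; the sketch simplifies that to "never".

```python
from enum import Enum


class Lifecycle(Enum):
    OPERATIONAL = "operational"
    STRUCTURAL = "structural"
    INTELLIGENCE = "intelligence"
    TEMPORAL_BOUNDED = "temporal-bounded"


def cleanup_action(node: dict, operator_directed: bool = False) -> str:
    """Pick what CLEANUP does to a node, based on its lifecycle class."""
    lifecycle = Lifecycle(node["lifecycle"])  # raises ValueError on unknown class
    if lifecycle is Lifecycle.OPERATIONAL:
        # Short-lived state fades once the work is complete.
        return "fade" if node.get("completed") else "keep"
    if lifecycle is Lifecycle.STRUCTURAL:
        # Simplification: touched only on explicit operator direction.
        return "apply_direction" if operator_directed else "keep"
    if lifecycle is Lifecycle.INTELLIGENCE:
        return "refine_and_rerank"
    return "roll_forward"  # temporal-bounded facts
```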

Observation

Persistent memory is not the value. The graph is not the value. The value is what those two enable the agent to do across sessions: pick up a thread from three weeks ago, refuse to repeat a question already answered, surface the relevant context without being asked.

The compounding-error invariant is what kept the experiment alive long enough to learn that. The work that productises this loop for someone else's environment is Grove.

Contact

If something on this page is relevant to work you are running, write to us. The form is on the landing page. We come back within two working days.
