GitHub reconciliation
The thesis says writes are commits. R2 is not a git server. Bridging that gap is the job of one scheduled Worker handler and the GitHub Git Data API.
The setup
Each R2Backend.commit call appends a JSON entry to commits/<timestamp>-<oid>.json in the bucket:
{
  "oid": "r2-mqtqso4w-21f6b941",
  "message": "create Customer 30cf071b-...",
  "paths": ["data/acme/Customer/30cf071b-...agi"],
  "ts": "2026-06-25T16:54:37.616Z"
}
That's all the Worker can do synchronously on a hot path — write a small JSON blob alongside the actual .agi file. It's NOT a real git commit. It's a promise to become one.
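A minimal sketch of what that hot-path write might look like. The `BucketLike` interface is a hypothetical stand-in for the one bucket method used here, and `recordPendingCommit`, `pendingKey`, and the choice of epoch milliseconds for `<timestamp>` are assumptions; the source only fixes the key shape and the four fields of the entry.

```typescript
// Shape of a pending entry, taken from the JSON example above.
interface PendingCommit {
  oid: string;
  message: string;
  paths: string[];
  ts: string; // ISO 8601 timestamp
}

// Hypothetical stand-in for the part of the bucket binding used here.
interface BucketLike {
  put(key: string, value: string): Promise<unknown>;
}

// Assumption: <timestamp> in the key is epoch milliseconds derived from ts.
function pendingKey(entry: PendingCommit): string {
  return `commits/${Date.parse(entry.ts)}-${entry.oid}.json`;
}

// Writes the small JSON blob that stands in for a commit until the next sync.
async function recordPendingCommit(
  bucket: BucketLike,
  oid: string,
  message: string,
  paths: string[],
  now: Date = new Date(),
): Promise<PendingCommit> {
  const entry: PendingCommit = { oid, message, paths, ts: now.toISOString() };
  await bucket.put(pendingKey(entry), JSON.stringify(entry));
  return entry;
}
```

Passing `now` as a parameter keeps the function deterministic under test; on the hot path the default is enough.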
The cron
[triggers]
crons = ["*/15 * * * *"]
Every fifteen minutes, the Worker's scheduled handler fires GitHubSyncer.sync(). That method:
- Lists every pending JSON entry under commits/.
- Collects the unique .agi paths touched by all of them.
- For each path, reads the current content from R2.
- Walks the GitHub Git Data API:
GET /repos/:owner/:repo/git/ref/heads/main → parent SHA
POST /repos/:owner/:repo/git/blobs (× N paths) → blob SHAs
POST /repos/:owner/:repo/git/trees → new tree
(base_tree: parentTree, tree: [{path,mode,sha}])
POST /repos/:owner/:repo/git/commits → commit SHA
(parents: [parentSHA], tree: newTreeSHA, message)
PATCH /repos/:owner/:repo/git/refs/heads/main → fast-forward
- On success, moves the pending entries from commits/<…>.json to synced/<…>.json and tags each with the new GitHub commit SHA.
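The Git Data API walk in the steps above can be sketched as a single function. This is an illustration under stated assumptions, not the actual `GitHubSyncer` code: `GitHubApi` is a hypothetical transport that makes one authenticated REST call and returns the parsed JSON, `commitBatch` and `FileChange` are made-up names, and the extra `GET .../git/commits/:sha` is one plausible way to obtain the parent tree SHA, which the source does not spell out.

```typescript
// Hypothetical transport: performs one authenticated GitHub REST call and
// resolves to the parsed JSON body, rejecting on any non-2xx status.
type GitHubApi = (method: string, path: string, body?: unknown) => Promise<any>;

interface FileChange {
  path: string;    // repo-relative path of the .agi file
  content: string; // current content, as read from R2
}

interface TreeItem {
  path: string;
  mode: "100644";
  type: "blob";
  sha: string;
}

async function commitBatch(
  api: GitHubApi,
  repo: string, // "owner/repo"
  files: FileChange[],
  message: string,
): Promise<string> {
  const git = `/repos/${repo}/git`;

  // Parent commit: the current tip of main.
  const ref = await api("GET", `${git}/ref/heads/main`);
  const parentSha: string = ref.object.sha;

  // Assumption: the parent tree SHA is read from the parent commit object.
  const parent = await api("GET", `${git}/commits/${parentSha}`);
  const parentTree: string = parent.tree.sha;

  // One blob per touched path.
  const tree: TreeItem[] = [];
  for (const file of files) {
    const blob = await api("POST", `${git}/blobs`, {
      content: file.content,
      encoding: "utf-8",
    });
    tree.push({ path: file.path, mode: "100644", type: "blob", sha: blob.sha });
  }

  // New tree layered on the parent tree, then a commit pointing at it.
  const newTree = await api("POST", `${git}/trees`, { base_tree: parentTree, tree });
  const commit = await api("POST", `${git}/commits`, {
    message,
    tree: newTree.sha,
    parents: [parentSha],
  });

  // force=false asks GitHub to reject anything that is not a fast-forward.
  await api("PATCH", `${git}/refs/heads/main`, { sha: commit.sha, force: false });
  return commit.sha as string;
}
```

In a Worker, `api` would typically wrap `fetch` with an `Authorization` header and a `User-Agent` header, which the GitHub REST API requires. Injecting it keeps the sequence testable without network access.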
That's one real GitHub commit per cron tick, regardless of how many .agi writes happened during the window. A single tool call that creates an invoice with three line items will produce one R2 entry with four paths; if nothing else is written in that window, it becomes one GitHub commit with four files changed. That matches the agent's mental model and the operator's git log.
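The collapse from many pending entries to one commit is mostly a deduplication of paths. A small sketch, with hypothetical helper names; the reading that the number in the `batch:` message counts unique paths is an assumption, since the source does not say whether it counts paths or entries.

```typescript
interface PendingEntry {
  paths: string[];
}

// Deduplicate paths across all pending entries, preserving first-seen order.
function uniquePaths(entries: PendingEntry[]): string[] {
  const seen = new Set<string>();
  for (const entry of entries) {
    for (const path of entry.paths) seen.add(path);
  }
  return Array.from(seen);
}

// Assumption: the count in "batch: N .agi writes" is the number of unique paths.
function batchMessage(entries: PendingEntry[]): string {
  const n = uniquePaths(entries).length;
  return `batch: ${n} .agi ${n === 1 ? "write" : "writes"}`;
}
```

Under that assumption, an invoice entry with four paths followed by a later edit to the same invoice file in the same window still yields four files in the commit, because the syncer reads the current content of each path once.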
The failure mode
The whole operation is idempotent and naturally retrying:
- If GitHub returns 500 mid-sequence, the entry stays under commits/. Next tick replays.
- If the Worker is killed between writing the new tree and moving entries to synced/, next tick sees the same pending entries, produces the same blobs (same SHAs, since git is content-addressed), produces the same tree, possibly produces a duplicate commit. Idempotent enough for our purposes; commit-deduplication would be a small extension.
- If two ticks race (shouldn't happen given Cloudflare's cron is single-instance, but in principle): one wins the ref update, the other gets a 422 and retries on the next tick.
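The retry behavior in the cases above follows from one ordering rule: entries leave commits/ only after the push has succeeded. A sketch of a tick built around that rule, assuming a hypothetical `Store` interface (not the real R2 binding API), a `push` callback that resolves to the new GitHub commit SHA, and a made-up `githubSha` field for the tag.

```typescript
// Hypothetical key-value view of the bucket; not the real R2 binding API.
interface Store {
  list(prefix: string): Promise<string[]>;
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}

type TickResult = "idle" | "synced" | "retry-next-tick";

async function syncTick(
  store: Store,
  push: (pendingKeys: string[]) => Promise<string>, // resolves to the GitHub commit SHA
): Promise<TickResult> {
  const keys = await store.list("commits/");
  if (keys.length === 0) return "idle";

  let sha: string;
  try {
    sha = await push(keys);
  } catch {
    // A GitHub 5xx, a 422 on the ref update, or any other failure: leave every
    // entry under commits/ so the next tick replays the whole batch.
    return "retry-next-tick";
  }

  for (const key of keys) {
    const raw = await store.get(key);
    if (raw === null) continue; // entry vanished since listing; nothing to move
    const entry = JSON.parse(raw);
    const syncedKey = "synced/" + key.slice("commits/".length);
    // Write the synced copy before deleting the pending one: a crash in
    // between leaves an entry that gets replayed, never a lost one.
    await store.put(syncedKey, JSON.stringify({ ...entry, githubSha: sha }));
    await store.delete(key);
  }
  return "synced";
}
```

The put-then-delete order trades a possible duplicate commit for the guarantee that no pending entry disappears without a synced counterpart, which is the same trade the failure cases above accept.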
What the audit log looks like after a few syncs
$ git log --oneline main
5dfb26c batch: 1 .agi write
8a4f912 batch: 3 .agi writes
c91d7e5 batch: 12 .agi writes
658127f ui: navy + gold dashboard, served from /
The batch: commits are the syncer's. The others are human commits to the codebase itself (this is the same repo as the source). The syncer commits as the accelerando-syncer author so it's distinguishable in git log --author.
What the activity feed shows
The Worker exposes GET /activity which reads both commits/ (pending) and synced/ (reconciled) from R2, filters by tenant, sorts newest-first. The navy + gold UI renders each entry with a green "synced abc1234" pill or amber "pending" pill — operators see in-flight history and reconciled history in one timeline.
2m ago create Customer 30cf071b... [synced 5dfb26c]
data/acme/Customer/30cf071b-....agi
5m ago create Invoice with 3 lines [pending]
data/acme/Invoice/8f3b2e1a-....agi
data/acme/InvoiceLineItem/...
data/acme/InvoiceLineItem/...
data/acme/InvoiceLineItem/...
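A sketch of how such a feed could be assembled from the two prefixes. The names `LogEntry`, `ActivityItem`, `buildActivity`, and the `githubSha` field are hypothetical, and matching a tenant by the `data/<tenant>/` path prefix is an assumption drawn from the example paths above.

```typescript
interface LogEntry {
  message: string;
  paths: string[];
  ts: string;         // ISO 8601 timestamp
  githubSha?: string; // hypothetical name for the tag added on sync
}

interface ActivityItem {
  message: string;
  paths: string[];
  ts: string;
  pill: string; // "pending" or "synced <short sha>"
}

function toItem(entry: LogEntry, pill: string): ActivityItem {
  return { message: entry.message, paths: entry.paths, ts: entry.ts, pill };
}

function buildActivity(
  pending: LogEntry[],
  synced: LogEntry[],
  tenant: string,
): ActivityItem[] {
  // Assumption: an entry belongs to a tenant if any path starts with data/<tenant>/.
  const prefix = `data/${tenant}/`;
  const mine = (e: LogEntry) => e.paths.some((p) => p.startsWith(prefix));

  const items: ActivityItem[] = [];
  for (const e of pending.filter(mine)) items.push(toItem(e, "pending"));
  for (const e of synced.filter(mine)) {
    const short = e.githubSha ? e.githubSha.slice(0, 7) : "unknown";
    items.push(toItem(e, `synced ${short}`));
  }

  // UTC ISO 8601 strings in one format compare chronologically as plain
  // strings, so newest-first is a descending string sort.
  return items.sort((a, b) => (a.ts < b.ts ? 1 : a.ts > b.ts ? -1 : 0));
}
```

Keeping the pill text in the data returned by the endpoint means the UI only has to choose a color.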
Why this matters
A traditional ERP's audit table is adjacent to the data. Code changes can forget to write it. Operators have to trust that everything that should be there is.
Here the commit log is the only path by which data reaches production. There is no "regular write" code path that bypasses the audit. If the syncer cron is broken, the writes are still pending in R2 and can be reconciled later. If R2 itself is down, the .agi file is never written and the operation did not happen.
You cannot have phantom writes. The audit log is the system.