GitHub reconciliation

The thesis says writes are commits. R2 is not a git server. Bridging that is the job of one scheduled Worker handler and the GitHub Git Data API.

The setup

Each R2Backend.commit call appends a JSON entry to commits/<timestamp>-<oid>.json in the bucket:

{
  "oid": "r2-mqtqso4w-21f6b941",
  "message": "create Customer 30cf071b-...",
  "paths": ["data/acme/Customer/30cf071b-...agi"],
  "ts": "2026-06-25T16:54:37.616Z"
}

That's all the Worker can do synchronously on a hot path — write a small JSON blob alongside the actual .agi file. It's NOT a real git commit. It's a promise to become one.
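As a rough illustration of that pending entry, here is a minimal TypeScript sketch of its shape and key. The field names come from the JSON above. The helper names `pendingKey` and `pendingBody`, and the choice of epoch milliseconds for `<timestamp>`, are assumptions made for the example, not the actual R2Backend code.

```typescript
// Shape of one pending entry, mirroring the JSON example above.
interface PendingCommit {
  oid: string;
  message: string;
  paths: string[];
  ts: string; // ISO-8601, e.g. "2026-06-25T16:54:37.616Z"
}

// Hypothetical helper. The source only fixes the pattern
// commits/<timestamp>-<oid>.json; using epoch milliseconds for
// <timestamp> is an assumption that keeps keys sortable by time.
function pendingKey(entry: PendingCommit): string {
  const millis = Date.parse(entry.ts);
  if (isNaN(millis)) {
    throw new Error(`invalid ts: ${entry.ts}`);
  }
  return `commits/${millis}-${entry.oid}.json`;
}

// Hypothetical helper: the small JSON body stored under that key.
function pendingBody(entry: PendingCommit): string {
  return JSON.stringify(entry, null, 2);
}
```

In a Worker, the key and body would then go to the bucket in a single put, right next to the .agi write itself.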

The cron

[triggers]
crons = ["*/15 * * * *"]

Every fifteen minutes, the Worker's scheduled handler fires GitHubSyncer.sync(). That method:

  1. Lists every pending JSON entry under commits/.
  2. Collects the unique .agi paths touched by all of them.
  3. For each path, reads the current content from R2.
  4. Walks the GitHub Git Data API:
GET    /repos/:owner/:repo/git/ref/heads/main          → parent SHA
POST   /repos/:owner/:repo/git/blobs       (× N paths) → blob SHAs
POST   /repos/:owner/:repo/git/trees                   → new tree
       (base_tree: parentTree, tree: [{path,mode,sha}])
POST   /repos/:owner/:repo/git/commits                 → commit SHA
       (parents: [parentSHA], tree: newTreeSHA, message)
PATCH  /repos/:owner/:repo/git/refs/heads/main         → fast-forward
  5. On success, moves the pending entries from commits/<…>.json to synced/<…>.json and tags each with the new GitHub commit SHA.
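The API walk in step 4 can be sketched as below. This is not the GitHubSyncer source. The injected `GitHubCall` transport, the function names, and the extra GET that looks up the parent commit's tree are assumptions for illustration. The endpoints and request fields are the ones listed above, with the commit request using the `parents` array that the GitHub API expects.

```typescript
// One GitHub REST call, resolved to the parsed JSON response body.
// Hypothetical signature: in a Worker this would wrap fetch() with an
// Authorization header; injecting it keeps the walk testable offline.
type GitHubCall = (method: string, path: string, body?: unknown) => Promise<any>;

interface TreeEntry {
  path: string;
  mode: "100644"; // regular, non-executable file
  type: "blob";
  sha: string;
}

// Body for POST /git/trees: changed paths layered over the parent tree.
function treeRequest(baseTreeSha: string, blobs: Array<[string, string]>) {
  const tree = blobs.map(
    ([path, sha]): TreeEntry => ({ path, mode: "100644", type: "blob", sha }),
  );
  return { base_tree: baseTreeSha, tree };
}

// Body for POST /git/commits. GitHub takes `parents` as an array of SHAs.
function commitRequest(parentSha: string, treeSha: string, message: string) {
  return { message, tree: treeSha, parents: [parentSha] };
}

// files: [path, current content] pairs read from R2; repo: "owner/name".
async function pushBatch(
  call: GitHubCall,
  repo: string,
  files: Array<[string, string]>,
  message: string,
): Promise<string> {
  const base = `/repos/${repo}/git`;
  const ref = await call("GET", `${base}/ref/heads/main`);
  const parentSha: string = ref.object.sha;
  // Assumption: the parent tree SHA is read from the parent commit object.
  const parent = await call("GET", `${base}/commits/${parentSha}`);

  const blobs: Array<[string, string]> = [];
  for (const [path, content] of files) {
    const blob = await call("POST", `${base}/blobs`, { content, encoding: "utf-8" });
    blobs.push([path, blob.sha]);
  }

  const tree = await call("POST", `${base}/trees`, treeRequest(parent.tree.sha, blobs));
  const commit = await call(
    "POST",
    `${base}/commits`,
    commitRequest(parentSha, tree.sha, message),
  );
  // force: false makes GitHub reject anything that is not a fast-forward.
  await call("PATCH", `${base}/refs/heads/main`, { sha: commit.sha, force: false });
  return commit.sha;
}
```

Passing the transport in as a parameter is a design choice of the sketch: the same walk can run against a recording fake in tests and against the real API in the Worker.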

That's one real GitHub commit per cron tick, regardless of how many .agi writes happened during the window. A single tool call that creates an invoice with three line items will produce one R2 entry with four paths, which becomes one GitHub commit with four files changed. That matches the agent's mental model and the operator's git log.
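The batching itself comes down to de-duplicating paths across the pending entries. A small sketch follows, with hypothetical helper names. Whether the number in the batch message counts paths or entries is not stated here, so counting unique paths is an assumption.

```typescript
interface PendingCommit {
  oid: string;
  message: string;
  paths: string[];
  ts: string;
}

// Unique .agi paths across all pending entries, in first-seen order.
function uniquePaths(entries: PendingCommit[]): string[] {
  const seen = new Set<string>();
  for (const entry of entries) {
    for (const path of entry.paths) {
      seen.add(path);
    }
  }
  return Array.from(seen);
}

// Commit message in the "batch: N .agi writes" style.
// Assumption: N is the number of unique paths in the batch.
function batchMessage(pathCount: number): string {
  return `batch: ${pathCount} .agi write${pathCount === 1 ? "" : "s"}`;
}
```

Two entries that touch the same path in one window contribute a single blob, because only the current R2 content of each path is read.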

The failure mode

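The whole operation is idempotent and naturally retrying: pending entries move to synced/ only after the sync succeeds, so a tick that fails partway leaves them under commits/ and the next tick picks them up again.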
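One way to picture the retry behavior is as a pure function over a snapshot of the two prefixes. This is a sketch under assumptions, not the syncer's code: `Ledger`, `PushResult`, `settle`, and the `githubSha` field name are all hypothetical.

```typescript
interface PendingCommit {
  oid: string;
  message: string;
  paths: string[];
  ts: string;
}

// `githubSha` is a hypothetical name for the tag added on success.
interface SyncedCommit extends PendingCommit {
  githubSha: string;
}

type PushResult =
  | { status: "pushed"; sha: string }
  | { status: "failed"; error: string };

// Snapshot of the bucket: R2 key -> entry, for each prefix.
interface Ledger {
  pending: Record<string, PendingCommit>;
  synced: Record<string, SyncedCommit>;
}

// Applies the outcome of one cron tick to the snapshot.
function settle(ledger: Ledger, result: PushResult): Ledger {
  if (result.status === "failed") {
    // Nothing moves, so the next tick sees the same pending entries.
    return ledger;
  }
  const synced: Record<string, SyncedCommit> = { ...ledger.synced };
  for (const key of Object.keys(ledger.pending)) {
    const target = key.replace(/^commits\//, "synced/");
    synced[target] = { ...ledger.pending[key], githubSha: result.sha };
  }
  return { pending: {}, synced };
}
```

A failed tick returns the ledger untouched, and a later successful tick moves everything that accumulated in the meantime.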

What the audit log looks like after a few syncs

$ git log --oneline main

5dfb26c  batch: 1 .agi write
8a4f912  batch: 3 .agi writes
c91d7e5  batch: 12 .agi writes
658127f  ui: navy + gold dashboard, served from /

The batch: commits are the syncer's. The others are human commits to the codebase itself (this is the same repo as the source). The syncer commits as the accelerando-syncer author so it's distinguishable in git log --author.

What the activity feed shows

The Worker exposes GET /activity, which reads both commits/ (pending) and synced/ (reconciled) from R2, filters by tenant, and sorts newest-first. The navy + gold UI renders each entry with a green "synced abc1234" pill or an amber "pending" pill, so operators see in-flight history and reconciled history in one timeline.

2m ago    create Customer 30cf071b...                [synced 5dfb26c]
          data/acme/Customer/30cf071b-....agi
5m ago    create Invoice with 3 lines                [pending]
          data/acme/Invoice/8f3b2e1a-....agi
          data/acme/InvoiceLineItem/...
          data/acme/InvoiceLineItem/...
          data/acme/InvoiceLineItem/...
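The merge behind that timeline might look like the sketch below. The `activityFeed` function, the `githubSha` field, and the tenant check by `data/<tenant>/` path prefix are assumptions inferred from the example paths, not the Worker's actual handler.

```typescript
interface PendingEntry {
  oid: string;
  message: string;
  paths: string[];
  ts: string; // ISO-8601
}

// `githubSha` is a hypothetical name for the tag added at sync time.
interface SyncedEntry extends PendingEntry {
  githubSha: string;
}

interface ActivityItem {
  message: string;
  paths: string[];
  ts: string;
  pill: string; // "pending" or "synced <short sha>"
}

function toItem(entry: PendingEntry, pill: string): ActivityItem {
  return { message: entry.message, paths: entry.paths, ts: entry.ts, pill };
}

// Merges both prefixes, keeps one tenant, sorts newest-first.
function activityFeed(
  pending: PendingEntry[],
  synced: SyncedEntry[],
  tenant: string,
): ActivityItem[] {
  // Assumption: an entry belongs to a tenant if any path is under data/<tenant>/.
  const prefix = `data/${tenant}/`;
  const items = pending
    .map((e) => toItem(e, "pending"))
    .concat(synced.map((e) => toItem(e, `synced ${e.githubSha.slice(0, 7)}`)));
  return items
    .filter((item) => item.paths.some((p) => p.indexOf(prefix) === 0))
    .sort((a, b) => Date.parse(b.ts) - Date.parse(a.ts));
}
```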

Why this matters

A traditional ERP's audit table is adjacent to the data. Code changes can forget to write it. Operators have to trust that everything that should be there is.

Here the commit log is the only path by which data reaches production. There is no "regular write" code path that bypasses the audit. If the syncer cron is broken, the writes are still pending in R2 and can be reconciled later. If R2 itself is broken, the .agi file is never written, so the operation did not happen.

You cannot have phantom writes. The audit log is the system.