Every issue tracker gives you a number. Jira gives you PROJ-42. GitHub gives you #42. Linear gives you ENG-4. These are human-readable, memorable and easy to say out loud. "I'm working on SYN-42" is something you can communicate without checking your notes.
I wanted the same thing for cmdock. The task store already has UUIDs (d4f8a9c2-1e3b-4a7c-b5e1-9c8f2a1d0e3b), which are canonical and stable. But nobody says "can you look at d4f8a9c2" in a standup. So I set out to add a key system. The format would be PREFIX-N, where the prefix is account-specific and N is a monotonically increasing integer.
At first, that seemed simple enough.
The naive solution
The naive approach is to auto-increment a counter. When a task is created, the server can look up the current maximum key for the account, add one and store the result on the task.
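As a sketch of what that looks like, here is the naive counter in Python with sqlite3. The `tasks` table and the `SYN` prefix are illustrative, not cmdock's actual schema; the point is the read-then-write gap, which is exactly where the trouble below comes from.

```python
import sqlite3

# Hypothetical minimal schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (account TEXT, key_num INTEGER, uuid TEXT)")

def create_task_naive(conn, account, task_uuid):
    # Look up the current maximum key for the account and add one.
    # Racy: two concurrent creates can read the same MAX and collide.
    (next_num,) = conn.execute(
        "SELECT COALESCE(MAX(key_num), 0) + 1 FROM tasks WHERE account = ?",
        (account,),
    ).fetchone()
    conn.execute(
        "INSERT INTO tasks (account, key_num, uuid) VALUES (?, ?, ?)",
        (account, next_num, task_uuid),
    )
    return f"SYN-{next_num}"

print(create_task_naive(conn, "acct-1", "uuid-a"))  # SYN-1
print(create_task_naive(conn, "acct-1", "uuid-b"))  # SYN-2
```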
This works fine if you have a single client and no retries. But cmdock is a sync server. Multiple clients can be connected simultaneously: a mobile app creating a task, a desktop client creating another, both within the same second. And if a request fails partway through, the client retries.
So I thought: reserve the number first, then create the task. Allocate SYN-42 upfront, store it in a pending row, proceed with the create. If the create fails, the client retries, finds the pending row and picks up where it left off.
That sounds reasonable, but it breaks down once you account for failures and retries.
What goes wrong
If you allocate the number before the create, you have two things that can fail independently. Say the allocation succeeds but the create fails. The client retries, but the allocation is already done so the retry creates a new one: SYN-43. Retry again and you get SYN-44. Each failed attempt burns a number.
In a single-client world that's annoying but manageable. With concurrent clients it's worse. Two clients can simultaneously request task creation for the same account. Without coordination, both read the same counter value, both attempt to create, and you have a race on the counter.
The fix for both problems is an idempotency key. The client generates a unique key per create request and includes it in the request. The server checks for an existing allocation with that key before creating a new one. If found, return the existing result. If not, proceed.
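A minimal sketch of that gate, assuming a hypothetical `allocations` table (the names and schema are illustrative, not cmdock's real ones):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE allocations "
    "(idem_key TEXT PRIMARY KEY, account TEXT, key_num INTEGER)"
)

def allocate(conn, account, idem_key):
    # A retry carrying the same idempotency key returns the existing
    # allocation instead of burning a fresh number.
    existing = conn.execute(
        "SELECT key_num FROM allocations WHERE idem_key = ?", (idem_key,)
    ).fetchone()
    if existing:
        return existing[0]
    (next_num,) = conn.execute(
        "SELECT COALESCE(MAX(key_num), 0) + 1 FROM allocations WHERE account = ?",
        (account,),
    ).fetchone()
    conn.execute("INSERT INTO allocations VALUES (?, ?, ?)",
                 (idem_key, account, next_num))
    return next_num

assert allocate(conn, "acct-1", "key-a") == 1
assert allocate(conn, "acct-1", "key-a") == 1  # retry: same number, no burn
assert allocate(conn, "acct-1", "key-b") == 2  # new request: next number
```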
This stops duplicate allocations. But it introduced a subtler constraint I hadn't anticipated.
The ordering constraint
Here's the mistake I nearly made.
The version I almost shipped works like this. The client sends a create request with an idempotency key. The server checks for an existing allocation, finds none and allocates SYN-42 as a pending slot, recording the key against it. That is Phase 1. In Phase 2, the server creates the task in TaskChampion, links the UUID to the slot and commits.
Looks fine. The idempotency key gates the allocation so retries find the same row. The number is reserved upfront so you can return it to the client immediately.
But the ordering constraint says you cannot assign the number in Phase 1. The number must be assigned inside Phase 2.
The reason is what happens when Phase 2 fails permanently. The Phase 1 row is in the database. The task doesn't exist in TaskChampion. The original client gave up. Now you have a pending allocation for SYN-42 with no corresponding task. The number is burned. Not catastrophic in itself since the counter keeps incrementing, but in a system where people rely on SYN-42 meaning something real, phantom allocations erode trust over time.
The fix is to assign the number inside Phase 2, not before it. Phase 1 still reserves a slot (idempotency key maps to a slot ID, status: pending, no number yet). Phase 2 creates the task in TaskChampion, then assigns the number to the slot as part of the same atomic operation. If Phase 2 fails before it commits, no number is assigned and the slot stays pending, so the next retry can re-run Phase 2 and try again.
Numbers only get assigned when tasks actually exist, which keeps the key space clean.
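The two phases can be sketched as follows. The `slots` and `tc_tasks` tables are illustrative stand-ins (with `tc_tasks` playing the role of the TaskChampion store, which is really a separate library, not a table); the shape to notice is that the number is computed and written only inside the Phase 2 transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE slots (
    idem_key  TEXT PRIMARY KEY,
    account   TEXT NOT NULL,
    status    TEXT NOT NULL,   -- 'pending' or 'committed'
    key_num   INTEGER,         -- NULL until Phase 2 commits
    task_uuid TEXT
);
CREATE TABLE tc_tasks (uuid TEXT PRIMARY KEY);  -- stand-in for TaskChampion
""")

def phase1_reserve(conn, account, idem_key):
    # Phase 1: reserve a slot keyed by the idempotency key. No number yet,
    # so a permanently failed create burns nothing.
    conn.execute(
        "INSERT OR IGNORE INTO slots (idem_key, account, status) "
        "VALUES (?, ?, 'pending')",
        (idem_key, account),
    )
    conn.commit()

def phase2_commit(conn, account, idem_key, task_uuid):
    # Phase 2: create the task and assign the number in one transaction.
    # If anything fails before commit, the slot stays pending and numberless.
    with conn:  # sqlite: commits on success, rolls back on exception
        conn.execute("INSERT INTO tc_tasks (uuid) VALUES (?)", (task_uuid,))
        (num,) = conn.execute(
            "SELECT COALESCE(MAX(key_num), 0) + 1 FROM slots WHERE account = ?",
            (account,),
        ).fetchone()
        conn.execute(
            "UPDATE slots SET status = 'committed', key_num = ?, task_uuid = ? "
            "WHERE idem_key = ?",
            (num, task_uuid, idem_key),
        )
        return num

phase1_reserve(conn, "acct-1", "key-a")
print(f"SYN-{phase2_commit(conn, 'acct-1', 'key-a', 'uuid-a')}")  # SYN-1
```

A retry after a failed Phase 2 simply calls `phase2_commit` again against the same pending slot; the idempotency key identifies the slot, and the rollback guarantees nothing half-assigned is left behind.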
The reaper
Crashes still happen. A slot can stay pending indefinitely if the server restarts at exactly the wrong moment: Phase 2 completed in TaskChampion but the allocation row commit failed, or the connection dropped before the client could confirm.
So there's a reaper. A background job that runs every few minutes, scanning pending allocation rows and checking each one against the TaskChampion store.
If the task exists in TC but the slot is still pending, the reaper commits the allocation. This handles the crash-mid-commit case where the task got created but the number never got assigned.
If the slot has been pending past the idempotency window and no matching task exists in TC, the reaper marks it as failed and moves on. No number has been consumed.
Without the reaper you're relying on every client to retry reliably forever. That's not a reasonable assumption.
The reaper also needs to be idempotent. If the server crashes mid-scan, the next run needs to produce the same outcome rather than double-committing an already-resolved allocation or incorrectly reaping a slot that had just been confirmed. This falls out naturally from the ordering constraint: the reaper checks TaskChampion before it writes anything, so repeating the same check produces the same result. The less obvious production concern is how long the idempotency window should be. Too short and the reaper will mark slots as permanently failed while a slow client is still legitimately retrying. Too long and stale pending rows accumulate, adding noise to any monitoring watching the allocation table. The right window sits past your p99 client retry duration with a safety margin. An arbitrary constant is a guess; a number derived from observed retry behaviour is an answer.
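Both reaper rules and the idempotence argument fit in a short sketch. The `slots`/`tc_tasks` tables, `WINDOW` constant, and row shapes are all illustrative assumptions, not cmdock's actual code; note that every write is derived from a read of the store's state, which is what makes re-running safe.

```python
import sqlite3

WINDOW = 15 * 60  # assumed window in seconds; the post argues for deriving
                  # this from observed p99 client retry behaviour instead

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE slots (
    idem_key  TEXT PRIMARY KEY,
    account   TEXT, status TEXT, key_num INTEGER,
    task_uuid TEXT, created_at REAL
);
CREATE TABLE tc_tasks (uuid TEXT PRIMARY KEY);  -- stand-in for TaskChampion
""")

def reap(conn, now):
    # Idempotent: checks the task store before writing anything, so a rerun
    # after a crash mid-scan reaches the same terminal state.
    pending = conn.execute(
        "SELECT idem_key, account, task_uuid, created_at "
        "FROM slots WHERE status = 'pending'"
    ).fetchall()
    for idem_key, account, task_uuid, created_at in pending:
        in_tc = task_uuid is not None and conn.execute(
            "SELECT 1 FROM tc_tasks WHERE uuid = ?", (task_uuid,)
        ).fetchone() is not None
        with conn:
            if in_tc:
                # Crash-mid-commit: the task exists, the number never landed.
                (num,) = conn.execute(
                    "SELECT COALESCE(MAX(key_num), 0) + 1 FROM slots "
                    "WHERE account = ?", (account,),
                ).fetchone()
                conn.execute(
                    "UPDATE slots SET status = 'committed', key_num = ? "
                    "WHERE idem_key = ?", (num, idem_key))
            elif now - created_at > WINDOW:
                # Past the window with no task: fail the slot. No number consumed.
                conn.execute(
                    "UPDATE slots SET status = 'failed' WHERE idem_key = ?",
                    (idem_key,))

# One slot crashed mid-commit; one is stale with no task behind it.
conn.execute("INSERT INTO tc_tasks VALUES ('uuid-a')")
conn.execute("INSERT INTO slots VALUES ('k1', 'acct', 'pending', NULL, 'uuid-a', 0)")
conn.execute("INSERT INTO slots VALUES ('k2', 'acct', 'pending', NULL, NULL, 0)")
reap(conn, now=10_000)
print(conn.execute(
    "SELECT idem_key, status, key_num FROM slots ORDER BY idem_key").fetchall())
# [('k1', 'committed', 1), ('k2', 'failed', None)]
```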
UUID or key
Once the key system is working, every endpoint that previously accepted a UUID also needs to accept SYN-42. The resolver is straightforward: try to parse the input as a UUID first. If that fails, look for the PREFIX-N format. If neither form matches, the resolver returns a clean error.
Case-insensitive prefix matching means syn-42 and SYN-42 both work, and unknown prefixes get a proper 404 rather than a cryptic database error. The resolver runs once at the start of each handler, before any business logic touches the input.
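The resolver logic above fits in a few lines. This is a sketch under assumptions: the regex, function name and return shapes are mine, and the unknown-prefix 404 would come from the lookup that consumes the resolved key, not from the resolver itself.

```python
import re
import uuid

KEY_RE = re.compile(r"^([A-Za-z]+)-([0-9]+)$")

def resolve(ident: str):
    # Try UUID first; fall back to PREFIX-N; otherwise raise a clean error.
    try:
        return ("uuid", str(uuid.UUID(ident)))
    except ValueError:
        pass
    m = KEY_RE.match(ident)
    if m:
        # Case-insensitive prefix: syn-42 and SYN-42 resolve identically.
        return ("key", m.group(1).upper(), int(m.group(2)))
    raise ValueError(f"not a task identifier: {ident!r}")

assert resolve("syn-42") == ("key", "SYN", 42)
assert resolve("d4f8a9c2-1e3b-4a7c-b5e1-9c8f2a1d0e3b")[0] == "uuid"
```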
What this tells you
If you've worked with distributed systems before, the two-phase structure here will look familiar. It comes from the same family as classic 2PC in distributed transaction processing, which solves the same underlying problem: two independent systems can't share a transaction boundary, so you need a protocol to ensure both commit or neither does. In 2PC, the coordinator sends a "prepare," waits for everyone to vote YES and then sends "commit." If the coordinator crashes between those two steps, every participant holds locks and waits indefinitely, which is the blocking problem that eventually pushed the field toward Paxos, Raft and other consensus algorithms. cmdock borrows the intuition but doesn't need the full machinery. TaskChampion is local storage, not an independent distributed node, so there's no blocking coordinator, no voting protocol and no cross-network lock contention. The reaper plays the role that 2PC's recovery log plays after a coordinator crash: reconciling state that fell through the cracks, but asynchronously rather than by holding participants hostage until resolution. The old model gives you the vocabulary. The new one drops the machinery you don't need.
The generalised lesson: any time you assign a human-readable identifier to something that hasn't fully committed yet, you need to think about where the assignment happens. The identifier allocation must be atomic with the creation. The idempotency key should gate the reservation, not the number itself. And you need a reconciliation job for anything that falls through the cracks between a failed Phase 2 and a persistent Phase 1.
"Just auto-increment" is the right intuition, but it is not the right implementation.