Likewise
A protocol for decentralized personal knowledge graphs.
A user runs a node on each of their own devices. Nodes share an append-only log of signed operations. The log encodes evidence (photos, calendar events, contacts), the working hypotheses derived from that evidence, the permissions governing who may read or derive from what, and a record of every inference call made against any of it.
This book is the protocol specification.
How to read it
If you are encountering Likewise for the first time, read in this order:
- Motivation — why this protocol exists, and what the status quo gets wrong.
- Overview — the system in five minutes, no code.
- Concepts — the mental model. Evidence, ops, claims, projections, capabilities, mesh.
- Comparison — honest contrast with Solid, AT Protocol, Nostr, Iroh, the local-first manifesto, and UCAN.
If you are implementing a compatible node, the normative specification is organised into three parts:
- Part 1: The substrate — chapters 00 Conventions through 12 State Machines (skipping chapter 09). Sufficient for any conformant node, including organisation peers consuming a scoped slice of a user's graph. If you are building a substrate-only peer, this is everything you need.
- Part 2: The inference pipeline — Mesh Coordination and Inference Audit. Adds the vocabulary by which nodes cooperate on a user's work and the convention by which audited inference calls become recoverable artefacts on the log. Required for nodes participating in distributed work; substrate-only peers MAY ignore it.
- Annex: Application conventions — Episodes, Suggested Actions, Salience. Non-normative. The reference implementation's choices for surfacing the substrate to a user; alternative implementations are free to substitute.
After the three parts: Open Issues catalogues known cross-implementation hazards. The high-level chapters above are non-normative; the spec chapters use RFC 2119 keywords.
If you are looking for an existing implementation, see Implementations.
What this protocol is not
It is not a particular application. It is not a particular AI model. It is not a synchronization library or a database engine. It is the wire-level agreement that lets independently-built nodes interoperate over a single user's knowledge graph.
Status
v0.1 — draft for public review. The wire format was developed alongside an in-progress reference implementation (currently private, working codename Cortex — provisional). It is not yet stable across major versions, and there is no public implementation an interested party can run today. See Implementations for status, and Open Issues for known cross-implementation hazards.
License
Creative Commons Attribution 4.0 International (CC-BY-4.0).
Motivation
Where the value is moving
For most of the consumer software era, the question of "who owns your data" had a clean enough answer to ignore: you owned the photos and the calendar entries and the contact list, and the platform owned the account that hosted them. If you wanted your data back, you exported a zip file. The trade was uncomfortable but legible.
That trade is no longer where the action is. The artefact platforms care most about producing now is not the photo, it's the claim about you the photo helped to derive. That you go to the same coffee shop on Tuesdays. That this is your child. That you're probably training for a marathon. That you and Sarah are close. None of those facts are in the original photo. They are inferred — sometimes by hand-written classifiers, increasingly by language models — and they are the part of the data pipeline with leverage. They are what makes a feed personalised, an ad targeted, a recommendation feel uncanny.
Those derived claims are not, in any practical sense, yours. You cannot read them. You cannot correct them. You cannot port them to another product. When the platform deletes your account they vanish with it, even though the work of building them used your evidence and the model that derived them was trained, in part, on people like you. You are the substrate on which the intelligence is built and the only party in the loop without a copy of the result.
This protocol exists because that arrangement is bad and because it is about to get much worse.
The default the internet was built around
This is not new. It is the consumer internet's defining feature.
For twenty-five years, almost every successful piece of consumer software has agreed on the same arrangement: the party providing the service collects the record of the user. The cookie was an implementation detail. The free account was an implementation detail. The "personalised" feed, the loyalty card, the recommended purchase — all implementation details on top of the same underlying contract. The party doing the work also kept the work's record, and the record was not the user's.
The arrangement persisted because, for most of those twenty-five years, the record was structured enough to be useful to the collecting party but raw enough to be inert in anyone else's hands. Click-streams, locations, search histories — they powered ad targeting and recommendation rankings, but they were not, by themselves, a model of who you were. They were ingredients. The party with the most ingredients had the best recipes.
That is what changed. The same logs that were inert raw material a decade ago are now training data and prompt context for systems that can describe you to yourself with uncomfortable accuracy. The economic value of being the party that holds the record has risen by an order of magnitude. So has the asymmetry between you and the party that holds it.
Likewise does not propose that data collection should stop, and it does not deny that there is real value in the systems being built on top of these records. It proposes something more specific: that the question of who holds the canonical record of the user — the user themselves, or whoever is currently making money from them — is now load-bearing for whether any of this is something done for the user instead of to them. And it proposes that the historical default — the party providing the service is also the party that owns the record — is no longer acceptable.
The next wave makes the asymmetry sharper
Personal AI is on its way. Locally-runnable models that can read your calendar and your photos and your messages and reason about them are already plausible on consumer hardware. The pitch — your phone understands your life, surfaces what's relevant, drafts the message, schedules the call — is real and probably correct.
The default architecture for delivering that pitch will be a single vendor's app, talking to a single vendor's cloud, with the model running wherever the vendor finds cheapest and the derived claims sitting in whatever storage the vendor has chosen. The user will get intelligence. They will not get a copy of what the system believes about them. They will not get a way to refute it. They will not get a way to move to a competitor without starting over. They will not be told which of their evidence the model looked at when it drafted that suggestion.
Calling this "AI on your phone" obscures what is actually happening. The model running on the phone is the visible part. The interesting part is the data substrate underneath it, and that substrate — the graph of evidence and claims and permissions — is what determines whether a personal AI is a product the user owns or a product the user is.
What "owning your knowledge graph" should mean
If a system makes claims about you, owning those claims should mean five concrete things:
- You can read every claim. Not in the form "here are the topics the assistant has noticed about you," but in the form "here are the facts the system is operating on, with the evidence each was derived from." This is the only way to know what you're being judged by.
- You can refute or revise any claim. Inference is fallible. The system thought your sister was your wife; it inferred a job title you don't have. A claim system that can't be corrected by the person it describes is not serving them.
- You can audit every inference call. When the model is asked "what should we surface to this user today," the question, the retrieved context, the model identity, and the answer must all be recoverable. "How did it know?" should have a literal, byte-for-byte answer.
- You can move. Your evidence and your derived claims should travel from one implementation to another without a vendor in the loop. If the implementation you started with stops being maintained, or starts behaving in ways you object to, you should be able to walk away with everything you came in with and everything that has been derived since.
- You can grant and revoke. A system that runs on your devices inevitably sees more than any third party should. Sharing the bits that need sharing — the family calendar with a partner, the work events with a colleague's scheduling assistant — must be a capability, not a flag in someone else's database.
None of those properties are exotic. They are what anyone would expect from a record they had any power over. The reason they are not the default in personal-data systems is not that they are hard; it is that incumbents have no incentive to provide them.
Why a protocol, not a product
You can build a single product that gives one user a private, auditable, portable knowledge graph and call it done. We tried that first. The trouble is that the moment the user wants their data to flow between two devices, or between two pieces of software, or to a trusted second party — a partner, a coach, a therapist's intake form — you need an agreement about how the bits travel. If that agreement is private to the product, the user is back where they started: locked in, only this time the lock has a friendlier name.
A protocol is different. A protocol is the rule that lets a phone running implementation A and a laptop running implementation B synchronise the same user's data without either of them being a trust anchor. It lets a researcher build a node that ingests a new kind of evidence (medical records, fitness data) and federate into an existing graph. It lets the user, ten years from now, run their graph on software no one in this room has heard of yet, because the specification — not the product — is what their data is denominated in.
This document is that specification.
Concrete scenarios this protocol changes
"I want to switch phones"
Today: your assistant's understanding of you is locked to the vendor. You start over.
Under this protocol: your devices are nodes in a small mesh you own. Adding a new phone means enrolling another node. The append-only log syncs to it. Within minutes the new device has the same understanding of you as the old one, derived from the same evidence under the same rules.
"I want to know why it suggested that"
Today: the vendor surfaces a recommendation; the reasoning is opaque, and at best you get a generic "based on your activity."
Under this protocol: every inference call is itself an artefact on the log. The retrieved context, the model identity, and the prompt are recoverable. "Why did it suggest I message Sarah today" has an answer that consists of specific evidence and specific claims, with their provenance. If the answer is wrong, you can refute the claims behind it and watch the recommendation update.
"I want my partner to see family events but not work events"
Today: you either share an entire account or you don't. The granularity is missing.
Under this protocol: capabilities are first-class. You delegate a read capability scoped to a class of evidence (calendar entries with a particular tag, say) and a class of derived claim, and you can revoke that delegation at any time. The receiving node can only synchronise the slice of the log it has been authorised for. There is no privileged "admin" account; there is only a graph of delegations rooted at the user.
"I want to refute a claim the system made"
Today: there is, often, no surface for this. The system knows what the system knows; you live with it.
Under this protocol: a user assertion is itself an op on the log. Refuting a claim flows through the derivation graph: anything that was derived from the refuted claim becomes invalid. The same mechanism that propagates evidence forward propagates corrections backward. The user is the final authority on facts about themselves, mechanically, not just rhetorically.
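As a sketch of that cascade, refutation is a forward walk over the derivation DAG. The class and claim names below are a toy illustration, not the spec's API:

```python
from collections import defaultdict

class DerivationGraph:
    """Toy derivation DAG: parents map to the claims derived from them."""
    def __init__(self):
        self.status = {}                  # claim id -> "valid" | "invalid"
        self.derives = defaultdict(set)   # parent -> children derived from it

    def add_claim(self, claim_id, parents=()):
        self.status[claim_id] = "valid"
        for p in parents:
            self.derives[p].add(claim_id)

    def refute(self, claim_id):
        """Invalidate a claim and everything transitively derived from it."""
        stack = [claim_id]
        while stack:
            c = stack.pop()
            if self.status.get(c) == "invalid":
                continue
            self.status[c] = "invalid"
            stack.extend(self.derives[c])

g = DerivationGraph()
g.add_claim("mike_works_nearby")
g.add_claim("lunch_with_mike_feasible", parents=["mike_works_nearby"])
g.add_claim("suggest_lunch", parents=["lunch_with_mike_feasible"])
g.add_claim("gym_mornings")               # unrelated claim
g.refute("mike_works_nearby")             # the cascade fires
```

Everything downstream of the refuted claim is now invalid; `gym_mornings`, which shares no derivation edge with it, is untouched.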
"I want a server to do the heavy inference"
Today: you either trust a vendor cloud or you don't have a server.
Under this protocol: a server is just another node in the mesh, enrolled by the user, with capabilities the user defined. It runs the inference work the phone can't. The phone keeps the canonical log; the server's outputs are themselves logged operations the phone receives and can audit. The user can revoke the server's capability at any time, at which point its derivations stop being trusted and the affected claims invalidate.
"I want to share my grocery rhythm with a retailer, without sharing my purchases"
Today: you either accept the loyalty-card terms in full (and the retailer collects a fine-grained record of your transactions, app sessions, and adjacent ad-platform signals) or you opt out (and the retailer falls back to coarser inference from third-party data, which is no better for either side).
Under this protocol: you delegate the retailer's node a capability scoped to a single claim — your grocery-visit rhythm — with caveats that prevent any underlying evidence (receipts, photos, location pings, basket details) from crossing the boundary. The retailer gets a precise, accurate answer to a useful question, and no more. You can revoke the delegation in one operation. Both sides know exactly what was shared because the wire format describes it precisely.
Consensual data partnership
The previous scenario points at the protocol's most interesting consequence — one its designers didn't initially set out to deliver. The same machinery that lets a user share data between their own devices also lets them share data, on their own terms, with anyone else.
Today, when a retailer wants to know that you regularly buy apples, they have to guess. They collect transaction logs, loyalty-card swipes, app session data, and ad-platform signals; they segment the behaviour across millions of users until a confident probability emerges that you are an apple-buyer; and the result is, at best, a guess the retailer holds about you that you will never see and cannot correct. The cost of producing the guess is enormous. The accuracy is uneven. The relationship is adversarial — every additional signal the retailer captures is a small extraction.
Now consider the same scenario differently. The user has ground-truth claims about themselves: that they go to a grocery store roughly four times a month, that the visits cluster on Saturdays, that the basket size has been growing. Those claims already exist on the user's personal mesh, because the user's own evidence — calendar, location, photos of receipts — derived them.
Sharing those claims with a retailer is no longer an act of surveillance acceptance. It is an act of delegation. The user issues a UCAN scoped to the retailer's node with caveats:
- only the predicates they care about (grocery_visit_rhythm),
- none of the underlying evidence (no source-typed photos or calendar entries cross the boundary),
- sanitisation rules that strip descriptive content fields,
- a time-range that auto-expires the delegation in twelve months.
The retailer deploys a Likewise node — same wire protocol, same op log, same authority machinery — and that node synchronises only the slice of the user's log this delegation admits. The node materialises a tiny knowledge graph: possibly nothing more than the rhythm claim and its confidence. The underlying photos, locations, and basket details never leave the user's mesh. If the user revokes the delegation, the retailer's node loses its authorisation, and the slice of state it materialised becomes invalid by the same cascade rule that retires any other revoked authority.
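A minimal sketch of that source-side filter, assuming illustrative field names (`kind`, `predicate`, `expires`); none of these are the normative wire format:

```python
from datetime import datetime, timezone

# Hypothetical delegation record; the DID string and caveat keys are
# assumptions for illustration, not the spec's UCAN encoding.
delegation = {
    "audience": "did:key:retailer-node",
    "caveats": {
        "predicates": {"grocery_visit_rhythm"},  # only this claim predicate
        "evidence": False,                       # no evidence ops may cross
        "expires": datetime(2026, 6, 1, tzinfo=timezone.utc),
    },
}

def admits(op, delegation, now):
    """Source-side check: does this delegation admit this op across the wire?"""
    caveats = delegation["caveats"]
    if now >= caveats["expires"]:
        return False                             # delegation has auto-expired
    if op["kind"] == "evidence":
        return caveats["evidence"]               # evidence is blocked here
    if op["kind"] == "claim":
        return op["predicate"] in caveats["predicates"]
    return False                                 # everything else stays home

now = datetime(2025, 1, 1, tzinfo=timezone.utc)
ops = [
    {"kind": "claim", "predicate": "grocery_visit_rhythm"},
    {"kind": "claim", "predicate": "basket_contents"},
    {"kind": "evidence", "predicate": None},
]
shared = [op for op in ops if admits(op, delegation, now)]
```

Only the rhythm claim survives the filter; the basket claim and the raw evidence never leave the user's mesh.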
This is not a hypothesis about a future protocol. The capabilities, caveats, sanitisation rules, and revocation semantics are already specified for the single-user mesh case (see Capabilities and UCAN and Caveats). The same machinery generalises directly: a "node" in this protocol does not have to be a personal device. It can be any party — a retailer, a clinic, an employer's scheduling assistant, a research institution, a public-interest data trust — that the user has chosen to invite in. The materialisation that party holds can be as small as one claim or as large as the user authorises.
The economic shape this enables is different from the status quo. The retailer pays nothing for the bulk-collection infrastructure they no longer need. The user shares specific claims they have chosen to share, on terms they have chosen, and can stop at any time. Both sides know exactly what is being shared because the wire protocol describes it precisely. Compliance with the user's "no" is enforced mechanically, not by lawsuit.
The protocol does not specify how the resulting market gets built. It does not specify pricing, payment rails, negotiation formats, or contract templates. What it specifies is the substrate: a wire format in which "I share these claims, with this masking, until I revoke" is something that can be expressed precisely and verified independently by both parties. The market on top of that substrate is for others to design.
There is a version of the future where every commercial relationship that today depends on third-party tracking is reconstituted as a voluntary, scoped, revocable delegation between the user and the counterparty. There is also a version where it isn't, and the incumbents preserve their bulk-collection model because nothing in the law or the market forces a change. The protocol exists, in part, so that the first version becomes possible.
The non-negotiable rules
The protocol is built around six rules that exist to make the properties above survive contact with reality. They are stated formally in Invariants; the short forms are:
- Only operations mutate canonical truth. Everything else is a projection that can be rebuilt from the log.
- Every user-visible claim has transitive provenance to evidence.
- Derivation forms a directed acyclic graph. Refutations cascade.
- Sync converges operations, not projections. Two nodes that have seen the same operations agree on what is true.
- Every operation is signed by its author. Identity is per-device, bound by capability delegations rooted at the user.
- Inference is auditable — by default on the user's own nodes, and on demand when delegated to others, via a caveat the user attaches.
Anything an implementation does that violates one of those rules breaks the user's ability to own what the system says about them. That is why they are non-negotiable.
What this is not trying to be
It is not trying to be a social network. The graph is private to the user and the parties they have explicitly delegated to. There is no global namespace, no public feed, no follow graph.
It is not trying to be a general-purpose database. The data model is shaped for personal context — evidence, entities, claims, episodes, suggested actions — not for arbitrary tabular workloads.
It is not trying to replace cloud AI for everyone. Some users will prefer the convenience of a vendor offering. The protocol is for the users — and the implementers — who would prefer the alternative to exist.
It is not trying to be a finished system. The reference implementation works end-to-end and is the source of truth for what the wire format actually is today, but the spec has known open questions, listed honestly in Open Issues. The point of publishing now is to make those questions public before the de-facto answers are decided by whoever ships first.
What we want from readers
If you are an implementer: read the spec. Build a compatible node. Tell us where the spec is unclear or where two reasonable readings diverge.
If you are a researcher: the protocol is licensed CC-BY-4.0. Cite it, fork it, write a better version. We would rather lose to a better protocol than win with a worse one.
If you are a user: there is no public implementation you can run today. The protocol's first reference implementation (provisional codename Cortex) is in private development and not yet released. The specification is published before the implementation so that the property it gives you — owning what the system says about you — becomes a standard rather than a luxury feature. That property is the precondition for any of this being something done for you instead of to you. The implementation will follow.
Overview
This chapter describes the protocol in five minutes, no code. If you want the why first, read Motivation — this chapter assumes you already accept that owning your own knowledge graph is worth specifying. If you want the wire-level rules, jump to Conventions.
The picture
A user runs a small mesh of nodes — typically a phone, a laptop, and maybe a server they own. Each node is a self-contained implementation of the protocol: a local database, a sync engine, and (where the hardware allows) an inference engine. The nodes talk to each other directly over the local network or the public internet, never through a central service.
What the nodes share is a single append-only log of signed operations. The log is the canonical state. Everything else — what the user sees on a card, what the model is given as context, what is highlighted as "important today" — is a projection of the log, regenerable from it.
What's on the log
Operations come in a few categories.
Evidence operations record raw inputs the user has chosen to ingest: a photo (referenced by content hash, not embedded), a calendar event, a contact card, a message thread. Evidence is immutable once written. Removing it requires a tombstone op, which cascades through everything derived from it.
Entity operations record the things the user's life is about: people, places, organisations, events, commitments, concepts. Entities are not pre-defined by the protocol; they are derived. The protocol specifies how an implementation may merge two entities ("Sarah" the contact is the same as "Sarah M." extracted from a photo caption), how it may split one back apart, and how it must record the provenance of those decisions.
Claim operations record the working hypotheses the system is operating on: "Sarah is a close contact." "Tuesday mornings are gym mornings." "The next coffee with Mike is overdue." Claims have a status — hint, claim, fact — that reflects how strongly they are believed and whether the user has confirmed them. Claims have explicit confidence and explicit provenance: every claim links back through the operations that derived it to the evidence at the bottom.
User-assertion operations record what the user themselves has said: "yes, that's right," "no, refute that," "merge those two." User assertions take precedence over derived claims. They are the mechanism by which the user is the final authority on facts about themselves.
Artefact operations record machine-produced byproducts of derivation: embeddings, transcripts, OCR text, and the inference-snapshot artefacts that record model calls. Artefacts ride the same op-log machinery as everything else, with a TTL and eviction lifecycle for storage management.
Job and lease operations record work the mesh has scheduled, claimed, completed, or yielded — for example, "this server should synthesise an episode for last week." This is how a phone offloads inference to a laptop without anyone having to be in charge of the whole mesh.
Capability operations record permissions: who may write what, who may read what, who may schedule what kind of work. They use UCAN delegations, rooted at the user.
Coordinator and routing operations record decisions about who does what in the mesh: which node coordinates derivation, which kinds of jobs route to which node.
Two further op categories — Episode operations, which record narrative clusters of evidence and claims, and Suggested-action operations, which record recommendations the system surfaces to a user — are application-layer conventions used by the reference implementation, not part of the substrate proper. They are documented in Annex: Application Conventions. A node that does not surface records to a user — for example, an organisation's node — has no need to emit them.
The full substrate taxonomy is in Operations. For now the important point is: every state change is one of these operations, every operation is signed by its author, every operation is timestamped with a hybrid logical clock so causal order is total across the mesh.
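To make the envelope concrete, here is a minimal hybrid logical clock and operation sketch. The field names, the clock encoding, and the content-hash id are illustrative assumptions; the normative format lives in Operations:

```python
import hashlib
import json

class HLC:
    """Toy hybrid logical clock: (wall, logical, node_id) triples."""
    def __init__(self, node_id):
        self.wall, self.logical, self.node_id = 0, 0, node_id

    def tick(self, physical_now):
        """Advance for a local event."""
        if physical_now > self.wall:
            self.wall, self.logical = physical_now, 0
        else:
            self.logical += 1
        return (self.wall, self.logical, self.node_id)

    def receive(self, remote, physical_now):
        """Merge a remote timestamp so causal order is preserved."""
        r_wall, r_logical, _ = remote
        wall = max(self.wall, r_wall, physical_now)
        if wall == self.wall == r_wall:
            logical = max(self.logical, r_logical) + 1
        elif wall == self.wall:
            logical = self.logical + 1
        elif wall == r_wall:
            logical = r_logical + 1
        else:
            logical = 0
        self.wall, self.logical = wall, logical
        return (self.wall, self.logical, self.node_id)

def make_op(kind, payload, hlc_ts, author):
    """Assemble an op envelope; a real op also carries the author's signature."""
    body = {"kind": kind, "payload": payload, "ts": hlc_ts, "author": author}
    op_id = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {"id": op_id, **body}

phone = HLC("phone")
op = make_op("claim", {"predicate": "close_contact", "entity": "sarah"},
             phone.tick(physical_now=1000), author="phone")
laptop = HLC("laptop")
later = laptop.receive(op["ts"], physical_now=999)
```

Even though the laptop's physical clock is behind, its next timestamp orders after the phone's op, which is the property that makes causal order total across the mesh.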
What you read from
A naive reader of an append-only log would have to fold over the whole thing every time they wanted to know whether Sarah was a close contact. Implementations don't. They maintain a small set of projections — materialized read views — that an op-application function keeps in sync with the log.
The protocol distinguishes three substrate projections by purpose:
- A larger, in-memory, model-prompt-oriented view used to assemble context windows for inference calls.
- A durable, on-disk, lookup-oriented view used by the user interface for "show me everything you know about Sarah."
- A debug-only, full-graph view, used by inspection tooling.
Each projection consumes the log; none of them are canonical. Any of them can be discarded and rebuilt. The protocol specifies what each one must be able to answer; how an implementation builds it is open.
A fourth projection — a small, in-memory, ranking-oriented view used to decide what's salient now — is an application-layer convention used by the reference implementation, not part of the substrate. It is documented in Annex: Application Conventions §A.3. A node that does not surface records to a user has no need for it.
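The "discard and rebuild" property can be sketched as a pure fold over the log. The op shapes here are illustrative, not the normative format:

```python
def apply_op(view, op):
    """Deterministic apply: projection state is a pure function of the log."""
    if op["kind"] == "claim":
        view[op["claim_id"]] = {"predicate": op["predicate"],
                                "status": op["status"]}
    elif op["kind"] == "user_assertion" and op["verb"] == "refute":
        if op["claim_id"] in view:
            view[op["claim_id"]]["status"] = "refuted"
    return view

log = [
    {"kind": "claim", "claim_id": "c1",
     "predicate": "close_contact", "status": "hint"},
    {"kind": "claim", "claim_id": "c2",
     "predicate": "gym_morning", "status": "claim"},
    {"kind": "user_assertion", "verb": "refute", "claim_id": "c1"},
]

# Build the projection once, then throw it away and replay the log.
view = {}
for op in log:
    view = apply_op(view, op)

rebuilt = {}
for op in log:
    rebuilt = apply_op(rebuilt, op)
```

The two folds produce identical state: the projection is a cache, and the log alone is canonical.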
How nodes converge
There is one HTTP endpoint and one cursor. A node asks a peer for "the operations you have that I don't," sending its causal frontier as the cursor. The peer returns the matching slice of its log. Both nodes apply received operations into their local log idempotently. Because operations are timestamped with a hybrid logical clock and the merge rules for any conflicting updates are deterministic, two nodes that have seen the same set of operations agree on the same projected state.
There is no leader. There is no central coordinator. There is no handshake more elaborate than "what's your causal frontier; here is the set difference." Sync is the same operation whether two nodes are catching up after a week apart or staying current minute-by-minute.
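The exchange can be sketched with a per-author high-water mark as the cursor. This is a simplification of whatever frontier encoding the spec mandates, and all shapes are illustrative:

```python
def frontier(log):
    """Highest timestamp seen per author node: the causal frontier cursor."""
    f = {}
    for op in log:
        ts, author = op["ts"], op["author"]
        if author not in f or ts > f[author]:
            f[author] = ts
    return f

def missing_ops(peer_log, cursor):
    """Source side: the ops the requester has not seen, per its cursor."""
    return [op for op in peer_log
            if op["ts"] > cursor.get(op["author"], (-1, -1))]

def apply_idempotent(local_log, ops):
    """Applying the same op twice changes nothing."""
    seen = {op["id"] for op in local_log}
    for op in ops:
        if op["id"] not in seen:
            local_log.append(op)
            seen.add(op["id"])
    return local_log

phone = [{"id": "a1", "author": "phone", "ts": (1, 0)},
         {"id": "a2", "author": "phone", "ts": (2, 0)}]
laptop = [{"id": "a1", "author": "phone", "ts": (1, 0)},
          {"id": "b1", "author": "laptop", "ts": (3, 0)}]

delta = missing_ops(laptop, frontier(phone))  # ops the phone lacks
phone = apply_idempotent(phone, delta)
phone = apply_idempotent(phone, delta)        # re-delivery is a no-op
```

One request, one cursor, one set difference; the same round works for a week's backlog or a minute's.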
Capabilities filter what crosses the wire. A node holding only a read-only delegation for calendar evidence will not be served claims about photos. The filter runs on the source side. Operations that must be sanitised before crossing — strip GPS, redact participants, truncate body text — have their signatures cleared at sanitisation time, which makes the change visible to the recipient as a deliberate intent rather than a corruption.
A node does not have to be one of the user's own devices. The same delegation machinery scopes what an invited third party — an organisation, a clinic, a service the user has chosen to share with — can see. A retailer running a Likewise node receives only the claims the user delegated to them, sanitised per the caveats the user attached. The same wire format that synchronises a phone and a laptop also synchronises a personal mesh with a partner the user has explicitly invited in. See Motivation: Consensual data partnership.
How permissions work
Every node has a key. Every operation is signed by a node key. Every node key is itself the subject of capability delegations issued by the user (or by another node the user has delegated authority to).
A capability is a triple: a resource (operations of a certain class, evidence of a certain class, jobs of a certain kind), an action (read, write, schedule, claim, complete), and a set of caveats that narrow it (only evidence of these source types, only claims with these predicates, only jobs in this time range, only operations that have been sanitised in these specific ways).
Delegations form a graph rooted at the user. Revoking a delegation prunes the subgraph beneath it. The protocol specifies how a node must interpret an incoming op against its capability set, so any two implementations agree on whether a given op was authorised at the moment it was sent.
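A toy version of that agreement, assuming a flat map of delegations and hypothetical DID strings (the real chain verification involves signed UCANs, not a dictionary):

```python
ROOT = "did:key:user-root"   # hypothetical root identifier for the user

# audience -> (issuer, resource scope, permitted actions)
delegations = {
    "did:key:laptop":   (ROOT, "claim:*", {"read", "write"}),
    "did:key:retailer": ("did:key:laptop",
                         "claim:grocery_visit_rhythm", {"read"}),
}

revoked = set()

def authorised(author, resource, action):
    """An op is authorised if an unrevoked chain to the root grants it."""
    while author != ROOT:
        if author in revoked or author not in delegations:
            return False
        issuer, scope, actions = delegations[author]
        if action not in actions:
            return False
        if scope != "claim:*" and scope != resource:
            return False            # each link must cover the resource
        author = issuer             # walk one link up toward the root
    return True

ok = authorised("did:key:retailer", "claim:grocery_visit_rhythm", "read")
revoked.add("did:key:laptop")       # revoke the intermediate delegation
pruned = authorised("did:key:retailer", "claim:grocery_visit_rhythm", "read")
```

Revoking the laptop's delegation prunes the retailer's authority too, which is the subgraph-pruning rule in miniature.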
How inference is audited
When a node operating under audit calls a model — to summarise a window, to draft a recommendation, to extract entities from a photo caption — the call itself becomes an operation. The retrieved context, the prompt, the model identity, the timing, and the output are all recorded as a likewise.inference.snapshot artefact on the log.
Audit is in force in two cases:
- The node is one of the user's own. Any node operating under the user's root delegation — the user's phone, their laptop, a server they run at home — emits snapshots by default. This is the case the reference implementation satisfies and is what makes the user's personal mesh auditable end-to-end.
- The user has required it of a delegated party. A user delegating to an organisation's node MAY attach an audit_inference: true caveat, requiring that delegated node to emit snapshots for inference performed against the delegated data. The snapshots themselves become operations on the log the user receives back.
A delegated node operating without an audit caveat is not required to record its inference. What it does internally with the data the user authorised is governed by the delegation's other caveats, not by the audit invariant. This is a deliberate scope choice: the protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph.
When audit is in force, the snapshot is referenced from any record the call produced. Asking "why did the system suggest I message Sarah today" follows the link from the suggested action to the snapshot to the inputs. No call made under audit produces a user-visible result without leaving this trail.
Snapshots are themselves bounded — they have a TTL, they can be evicted, they can be tombstoned with the rest of an evidence cascade. But while they exist, they are the audit record.
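As an illustration of that trail, a snapshot artefact might look like the sketch below. Field names, the model identifier, and the content-hash id are assumptions, not the normative likewise.inference.snapshot schema:

```python
import hashlib
import json

def snapshot(prompt, context_op_ids, model_id, output):
    """Record one inference call as a recoverable artefact."""
    body = {
        "kind": "likewise.inference.snapshot",
        "prompt": prompt,
        "context": context_op_ids,   # op ids of the retrieved evidence/claims
        "model": model_id,
        "output": output,
        "recorded_at": 1736000000,   # in practice, the node's HLC timestamp
    }
    art_id = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {"id": art_id, **body}    # signed and appended like any other op

snap = snapshot(
    prompt="Draft a suggestion about Mike.",
    context_op_ids=["op:claim:coffee_rhythm", "op:evidence:cal-2025-01-07"],
    model_id="local-llm-v1",         # hypothetical model identifier
    output="Coffee with Mike is overdue.",
)

# Any record the call produced references the snapshot by id.
suggested_action = {"kind": "suggested_action", "snapshot": snap["id"]}
```

"Why did it suggest that?" follows suggested_action's snapshot reference back to the literal prompt, the retrieved context, and the model that answered.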
A day in the life
A user takes a photo. The phone ingests it as an evidence operation (content hash + EXIF + Vision labels), runs the deterministic extraction pass on the labels and any visible text, and emits some candidate claims as hint-status operations. Nothing has been shown to the user yet.
Overnight, the user's laptop — which has more capable hardware — claims the synthesise job for yesterday. It pulls the relevant slice of the log, assembles a model context, makes one inference call, and writes the result back as operations. Because the laptop is operating under the user's root delegation, audit is in force by default; the inference snapshot is also written. The reference implementation materialises the result as an episode and a suggested action — both application-layer conventions, not part of the substrate proper — so a card can be rendered later. A different implementation might materialise the same result a different way; the substrate-level claims and the snapshot are what the protocol guarantees.
The phone receives the new operations on next sync. Its salience projection (also an application-layer convention used by the reference implementation) rebuilds. The next time the user opens the app, a card appears: "Coffee with Mike — last seen at the same shop two weeks ago, your usual rhythm is monthly." The user taps "show why." The app follows the suggested-action's link to its snapshot, which lists the evidence, the claims, the model used, and the literal prompt.
The next day the user refutes one of the claims — the system assumed Mike worked nearby, but he doesn't. That refutation is a user-assertion op. The derivation cascade fires: claims that depended on Mike's location are invalidated. The next salience projection no longer surfaces the suggestion that depended on it.
None of this required a service. None of it required a vendor. None of it could have happened in a way the user couldn't audit or undo.
Where to go next
- Concepts — the mental model in more depth, with diagrams.
- Comparison — how this protocol relates to Solid, AT Protocol, Nostr, Iroh, the local-first manifesto, and UCAN.
- Conventions — the start of the normative specification.
Concepts
This chapter is the mental model in depth. It is non-normative — the specification chapters are where the must-haves live — but a reader who finishes this chapter should be able to predict how a Likewise node would behave in most situations, and should be able to read the spec without surprise.
The shape
┌──────────────┐                      ┌──────────────┐
│   Evidence   │  immutable inputs;   │   Evidence   │
│ (immutable)  │  hashed, not stored  │ (immutable)  │
└──────┬───────┘  in-band             └──────┬───────┘
       │                                     │
       ▼                                     ▼
┌─────────────────────────────────────────────────────────┐
│                      Operation Log                      │
│    append-only, signed, hybrid-logical-clock ordered    │
│   ─ evidence ops ─ entity ops ─ claim ops ─ episode ─   │
│    ─ action ops ─ user assertions ─ job ops ─ ucan ─    │
└─────────────────────────────────────────────────────────┘
                            │
                            │ deterministic apply_op
                            ▼
┌─────────────────────────────────────────────────────────┐
│                       Projections                       │
│     ─ salience ─ inference ─ details ─ debug graph ─    │
│           each rebuildable from the log alone           │
└─────────────────────────────────────────────────────────┘
                            │
                            │ surfaced through
                            ▼
                          Surface
             (cards, suggested actions, UI)
The diagram is the system. The log is canonical. The projections are caches. The surface is the part the user touches. Everything below that line — every claim, every recommendation, every episode — exists because of an op that produced it, and that op is on the log.
Evidence
Evidence is the raw material the system reasons over. A photo, a calendar event, a contact card, a message thread, a location ping. Evidence is immutable: once an evidence op lands on the log, the content it points at does not change. Removing evidence is its own op (TombstoneEvidence) which causes a derivation cascade — see below — but the historical record of what was once known survives, because removing operations would break the rebuild-from-log invariant.
Evidence is referenced by content hash (BLAKE3). The hash is on the log; the content itself need not be. An implementation may store the photo bytes in a local blob store, in a peer's blob store, or not at all. The protocol cares that the hash is there and is verifiable; it does not mandate where the bytes live.
Each piece of evidence has a source anchor: a stable identifier from the upstream system (calendar event UID, photo asset id) that lets multiple nodes agree they are talking about the same external fact, even if they extracted it slightly differently.
Operations
An operation is a typed, signed message that mutates the log. It carries:
- A typed payload (one of ~31 variants — evidence, entity, claim, episode, action, user assertion, job, capability, coordinator, routing).
- A timestamp in hybrid logical clock form: (wall_ms, logical, node).
- An author node id.
- A signature by the author's key (RFC 7515 detached JWS, Ed25519, over the canonical encoding of the op with the signature field cleared).
- For sanitised ops: the signature is cleared in transit, signalling that the op has been intentionally redacted under an authorised caveat. Recipients can thus distinguish "altered in transit" (corruption) from "deliberately sanitised by an authorised filter."
Why typed ops instead of a generic CRDT? Because the protocol's content domain is narrow and well-understood. A typed vocabulary — "create entity," "supersede claim," "tombstone evidence and cascade" — gives an implementation the information it needs to maintain projections and derivations without a generic merge engine. The trade is expressivity: the protocol does not try to be a general collaborative-document substrate. It tries to be a precise model of a single user's knowledge graph.
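The envelope described above can be sketched as follows. The field and variant names are illustrative, not the normative wire format, and a debug string stands in for the canonical encoding; the point is that the signing input is the op with its signature field cleared.

```rust
// Illustrative subset of the ~31 payload variants.
#[derive(Clone, Debug, PartialEq)]
enum Payload {
    Evidence { content_hash: String, source_anchor: String },
    Claim { subject: u64, predicate: String, object: String },
    // ... remaining variants elided
}

#[derive(Clone, Debug, PartialEq)]
struct Op {
    payload: Payload,
    hlc: (u64, u32, u16),       // (wall_ms, logical, node)
    author: u16,                // author node id
    signature: Option<Vec<u8>>, // detached JWS in the real protocol
}

// The signing input is the canonical encoding of the op with the
// signature field cleared, so a sanitiser can drop the signature
// without changing what was originally signed over.
fn signing_input(op: &Op) -> String {
    let mut unsigned = op.clone();
    unsigned.signature = None;
    format!("{:?}", unsigned) // stand-in for the canonical encoding
}
```

Clearing the signature before encoding is what makes "signature over the op with the signature field cleared" well-defined: signing and verification compute the same byte string.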
Time: the hybrid logical clock
Two devices that disagree about the wall clock should still agree about what happened first. The protocol uses a hybrid logical clock (HLC) for that: every op timestamp is (wall_ms, logical, node), and ordering is lexicographic over the triple. The wall component keeps timestamps roughly aligned with human time; the logical component handles bursts within a millisecond; the node id breaks ties when two devices emit at the same (wall, logical).
The clock has two disciplines, both normative:
- Tick on emit. Before a node writes a local op, it ticks its HLC, ensuring the new timestamp strictly dominates every prior timestamp the node knows about.
- Recv on receive. When a node receives a remote op, it advances its own HLC past the received timestamp.
If either discipline is violated, two nodes can disagree about the order of operations they have both received. This is the kind of quiet bug that is undetectable in test fixtures and devastating in production.
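The two disciplines can be sketched as follows, assuming the (wall_ms, logical, node) triple from above with illustrative integer widths. Deriving Ord on the struct in field order gives exactly the lexicographic ordering the protocol describes.

```rust
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Hlc {
    wall_ms: u64,
    logical: u32,
    node: u16, // tie-breaker; derived ordering is lexicographic over fields
}

impl Hlc {
    // Tick on emit: the new timestamp strictly dominates everything this
    // node has seen (recv keeps the clock past remote timestamps too).
    fn tick(&mut self, now_ms: u64) -> Hlc {
        if now_ms > self.wall_ms {
            self.wall_ms = now_ms;
            self.logical = 0;
        } else {
            self.logical += 1;
        }
        *self
    }

    // Recv on receive: advance this clock past the received timestamp.
    fn recv(&mut self, remote: Hlc, now_ms: u64) {
        let wall = now_ms.max(self.wall_ms).max(remote.wall_ms);
        self.logical = if wall == self.wall_ms && wall == remote.wall_ms {
            self.logical.max(remote.logical) + 1
        } else if wall == self.wall_ms {
            self.logical + 1
        } else if wall == remote.wall_ms {
            remote.logical + 1
        } else {
            0
        };
        self.wall_ms = wall;
    }
}
```

Note that a stalled wall clock is harmless: tick falls back to incrementing the logical component, so emitted timestamps still strictly increase.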
The causal frontier
A frontier is the per-author maximum-timestamp summary of what a node has seen. When two nodes synchronise, they exchange frontiers and ship each other the operations the other doesn't have. Because the HLC induces a total order per author, "what you don't have" is a clean set difference rather than a merkle-tree dance.
Frontiers are also the cursor for incremental sync. A node tells a peer "send me everything past this frontier"; the peer streams the matching ops and returns the resulting frontier as the cursor for the next exchange.
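Because the order is total per author, the frontier exchange reduces to a filter. A minimal sketch, with a simplified (wall_ms, logical) timestamp and string author ids standing in for real node identities:

```rust
use std::collections::HashMap;

type Author = &'static str;
type Ts = (u64, u32); // simplified (wall_ms, logical) timestamp

// A frontier: the per-author maximum timestamp a node has seen.
type Frontier = HashMap<Author, Ts>;

// "Send me everything past this frontier": the ops to ship are exactly
// those whose timestamp exceeds the peer's per-author maximum.
fn missing<'a>(log: &'a [(Author, Ts)], peer: &Frontier) -> Vec<&'a (Author, Ts)> {
    log.iter()
        .filter(|(author, ts)| peer.get(author).map_or(true, |max| ts > max))
        .collect()
}
```

An author absent from the peer's frontier means the peer has seen nothing from that author, so everything of theirs ships.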
Entities
An entity is a stable identity for a thing in the user's life: a person, a place, an organisation, a recurring commitment, a project, a concept. Entities are not pre-defined by the protocol; they are derived from evidence by the implementation's resolution pass. The protocol specifies how an implementation merges or splits entities and what provenance it must record when it does.
Entity identity is per-mesh, ULID-based, and survives across nodes — once two nodes have synchronised the operation that created an entity, they refer to it by the same id. Entity labels (the human-readable name) are claims like any other and can change; the id is what holds the cluster of claims together.
Claims
A claim is a working hypothesis: a subject (often an entity), a predicate (drawn from a centralised vocabulary), an object, a confidence vector, and a set of supporting operations. "Sarah is a close contact" is a claim. "Tuesdays are gym mornings" is a claim.
Claims have a status that reflects how strongly the system believes them and whether the user has had a say:
- Hint — the system has noticed something but is not surfacing it yet.
- Claim — the system is operating on this as a working belief.
- Fact — the user has confirmed it; subsequent derivations may not silently override it.
Claims can be superseded (a newer op replaces an older claim about the same subject and predicate) and rejected (user assertion or downstream evidence invalidates them). Both transitions are themselves operations on the log, so the history of what the system used to believe is recoverable.
Confidence is a vector, not a scalar — the protocol carries multiple components (e.g. evidential, derivational, temporal) so an implementation can decide its own composition rule without losing the underlying signal.
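A claim record under these definitions might be sketched as below. All field names are illustrative, and the supersede rule shows only the Fact guard described above; confidence components are assumed to be floats, which the protocol does not mandate.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Hint,  // noticed, not yet surfaced
    Claim, // operating belief
    Fact,  // user-confirmed
}

// A vector, not a scalar: the implementation picks its own composition rule.
struct Confidence {
    evidential: f32,
    derivational: f32,
    temporal: f32,
}

struct ClaimRecord {
    subject: u64,             // usually an entity id
    predicate: &'static str,  // drawn from the closed vocabulary
    object: &'static str,
    status: Status,
    confidence: Confidence,
    supporting_ops: Vec<u64>, // provenance links back to the log
}

// Facts are user-confirmed: a derivation pass may not silently override
// them; only a user assertion may.
fn may_supersede(current: Status, by_user_assertion: bool) -> bool {
    current != Status::Fact || by_user_assertion
}
```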
The derivation DAG and provenance
Every claim links to the evidence and other claims it was derived from — its supporting operations. Following those links forms a directed acyclic graph rooted at evidence. The protocol enforces that the derivation graph is a DAG (the entity-resolution graph may have cycles; derivation may not), because cycles in derivation would make invalidation undecidable.
When a piece of evidence is tombstoned, or a user rejects a claim, the cascade walks the DAG forward and invalidates everything that transitively depended on the source. This is what makes the "refute" gesture in the surface mean something. The user is not just hiding a card; they are marking a node in the graph dead, and the system has to honour the consequences.
This is also what makes auditability mechanical. Asking "why does the system believe X" is following the DAG backwards from the claim to its supporting ops to the evidence at the leaves. There is no narrative to consult — the trail is the trail.
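The cascade itself is a breadth-first walk over the forward derivation edges. A minimal sketch with integer op ids standing in for real operation identifiers:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Forward derivation edges: op id -> the ops derived from it.
// Marking a source dead invalidates everything transitively downstream.
fn cascade(dependents: &HashMap<u32, Vec<u32>>, root: u32) -> HashSet<u32> {
    let mut dead = HashSet::new();
    let mut queue = VecDeque::from([root]);
    while let Some(id) = queue.pop_front() {
        if dead.insert(id) {
            // only expand ids we haven't already marked dead
            for &child in dependents.get(&id).into_iter().flatten() {
                queue.push_back(child);
            }
        }
    }
    dead
}
```

The DAG requirement is what makes this terminate: because derivation has no cycles, the walk visits each downstream op at most once and the invalidation set is well-defined.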
Episodes and suggested actions (application-layer)
The reference implementation also defines two op types that exist purely to surface the substrate to a user. They are application-layer conventions, not part of the substrate proper, and live in Annex: Application Conventions. A node that does not surface records to a user — for example, an organisation peer — has no need to emit them.
An episode is a cluster of related evidence and claims with temporal bounds: a trip, a project, a relationship, a day worth remembering. Episodes are how the reference implementation presents narrative instead of list.
A suggested action is a recommendation the system makes to the user: send this message, review this calendar, reconsider this goal. Suggested actions have their own lifecycle (proposed, shown, acted, dismissed) and their own provenance link to the inference call that produced them. They exist to make recommendations refutable — a user's "stop suggesting this" is itself an op that the inference pipeline must respect.
Both are documented because the reference implementation emits them and other implementations may want to interoperate with applications that consume them. They are not, however, what makes a node Likewise-conformant; the substrate is.
Projections
Reading the log directly to answer "what does the system know about Sarah" would require a fold over millions of ops. Implementations maintain projections: in-memory or on-disk views that an op-application function keeps in sync with the log on every write.
The protocol distinguishes three substrate projections by purpose, and the distinction is load-bearing:
- An inference projection is shaped for assembling a model context window. It is not a UI store; it is prompt furniture.
- A detail projection is durable, on-disk, and shaped for the user-facing reads ("show me everything you know about Sarah"). It carries titles, labels, claim text, provenance links.
- A debug-graph projection exists for inspection tooling. It contains the full graph of entities and claims and is generally not maintained at production load.
A fourth projection — a salience projection used to rank what is important now — is an application-layer convention, not part of the substrate. The reference implementation provides one; alternatives are free to substitute or omit.
The reason these are separate is that collapsing them produces a single fat object that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementations are free to optimise within each projection; they are not free to fold them into one.
The detail projection rebuilds from the log when missing or corrupted. This is the mechanism that closes the loop on the "projections are disposable" rule: an implementation can lose every cache and recover from the log alone.
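The disposable-cache rule reduces to a fold over the log. A sketch with a single illustrative op variant; the real vocabulary has around thirty-one.

```rust
use std::collections::HashMap;

// One illustrative op variant standing in for the full vocabulary.
enum Op {
    SetLabel { entity: u32, label: &'static str },
}

#[derive(Default, PartialEq, Debug)]
struct Detail {
    labels: HashMap<u32, &'static str>,
}

// Deterministic apply: the projection is a pure function of the ops seen.
fn apply_op(mut p: Detail, op: &Op) -> Detail {
    match op {
        Op::SetLabel { entity, label } => {
            p.labels.insert(*entity, *label);
        }
    }
    p
}

// Losing the cache costs nothing but a replay from the log.
fn rebuild(log: &[Op]) -> Detail {
    log.iter().fold(Detail::default(), apply_op)
}
```

The invariant is that incremental application and a cold rebuild agree; any divergence means apply_op is not deterministic over the log.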
Capabilities
A capability is a triple (resource, action, caveats):
- Resource — a class of operation or content (operations of a kind, evidence of a source type, claims with a predicate, jobs of a kind, episodes, artefacts, suggested actions, mesh coordination, registration).
- Action — what may be done (read, write, schedule, claim, complete).
- Caveats — narrowing constraints on the resource and action:
  - source_types — only evidence from these source types.
  - predicates — only claims with these predicates.
  - kind_prefix — only jobs whose kind starts with this prefix.
  - time_range — only ops with timestamps in this window.
  - sanitize — operations crossing this delegation must be passed through these field-level redactions: strip GPS, redact participants, truncate content bodies, strip custom metadata.
Capabilities are carried by UCAN delegations rooted at the user. A capability is delegated by a parent, may be re-delegated by the recipient if and only if the new delegation is attenuated (its caveats are at least as restrictive as the parent's), and may be revoked at any time. Revocation prunes the subgraph of delegations beneath the revoked one and invalidates any already-applied operations whose authority depended on the revoked capability.
Sanitisation is the most subtle caveat. When an op crosses a delegation that requires sanitisation, the relevant fields are redacted and the signature is cleared. The recipient sees an unsigned op tagged as sanitised, which is treated as a deliberate filter and applied. An op without a signature that is not tagged as sanitised is rejected — the missing signature would otherwise be indistinguishable from corruption.
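The attenuation rule can be sketched as a per-caveat subset check. Only two of the caveats are shown, and the representation (an absent set meaning "unrestricted") is an assumption of this sketch, not the normative encoding:

```rust
use std::collections::HashSet;

struct Caveats {
    source_types: Option<HashSet<&'static str>>, // None = unrestricted
    predicates: Option<HashSet<&'static str>>,
}

// A re-delegation is valid iff every caveat is at least as restrictive
// as the parent's: a child that widens (or drops) a restriction fails.
fn attenuates(child: &Caveats, parent: &Caveats) -> bool {
    fn narrower(c: &Option<HashSet<&str>>, p: &Option<HashSet<&str>>) -> bool {
        match (c, p) {
            (_, None) => true,        // parent unrestricted: anything narrows
            (None, Some(_)) => false, // child drops the restriction: widens
            (Some(c), Some(p)) => c.is_subset(p),
        }
    }
    narrower(&child.source_types, &parent.source_types)
        && narrower(&child.predicates, &parent.predicates)
}
```

Because the check is purely structural, it can be enforced offline at every hop of the chain, which is what makes revocation a prune of a verifiable subgraph rather than a call to an authorisation server.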
The capability machinery is symmetric: a delegation chain that admits a personal device (the user's laptop, an inference server the user runs at home) is structurally identical to one that admits an organisation the user has chosen to invite in. A retailer deploying a Likewise node, a clinic running a scheduling assistant against a user-authorised slice of their calendar, an employer's scheduling helper that sees only the predicates the employee has consented to share — all of these are the same kind of peer to the protocol. They differ only in the scope of the delegation the user has signed. This is what makes consensual commercial data partnership a use case the protocol enables out of the box rather than a separate machinery.
Mesh coordination
Multiple nodes can do work for the same user. The protocol provides a vocabulary for who claims what:
- ScheduleJob — declare that a unit of work needs to happen (e.g., "synthesise an episode for last week").
- ClaimWork — a node takes responsibility for a scheduled job, with a hybrid-logical-clock-relative lease.
- CompleteJob — the work is done; the result is written as follow-on ops the rest of the mesh receives on next sync.
- YieldWork — the claiming node is releasing the lease voluntarily.
- ExpireWork — the lease passed without a completion; another node may now claim.
Two further ops shape who does what:
- DesignateCoordinator — the user (or a delegate) designates the node responsible for the deterministic derivation pass. This is not an election; it is a declaration. The coordinator's output is what the mesh agrees to derive from a given log prefix.
- RouteKind — the user routes a class of jobs to a specific node. Once routed, only the target node may claim jobs of that kind. Used to direct heavy inference to a server while keeping the phone in charge of the log.
These ops use the same UCAN-shape capabilities as everything else: scheduling a job requires Schedule on Job; claiming requires Claim on Job with a kind_prefix caveat that admits the kind.
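One plausible sketch of the job lifecycle as a transition function. The policy of ignoring invalid transitions (rather than rejecting the op) is an assumption of this sketch, not the normative state machine, and a plain millisecond count stands in for the HLC-relative lease:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Job {
    Scheduled,                                  // ScheduleJob landed
    Claimed { node: u16, lease_until_ms: u64 }, // ClaimWork with a lease
    Completed,                                  // CompleteJob landed
}

enum JobOp {
    Claim { node: u16, lease_until_ms: u64 },
    Complete { node: u16 },
    Yield { node: u16 },
    Expire { now_ms: u64 },
}

fn step(state: Job, op: JobOp) -> Job {
    match (state, op) {
        (Job::Scheduled, JobOp::Claim { node, lease_until_ms }) => {
            Job::Claimed { node, lease_until_ms }
        }
        // Only the lease holder may complete or yield.
        (Job::Claimed { node, .. }, JobOp::Complete { node: n }) if n == node => Job::Completed,
        (Job::Claimed { node, .. }, JobOp::Yield { node: n }) if n == node => Job::Scheduled,
        // An expired lease returns the job to the pool for another claimant.
        (Job::Claimed { lease_until_ms, .. }, JobOp::Expire { now_ms })
            if now_ms > lease_until_ms => Job::Scheduled,
        (s, _) => s, // invalid transitions leave the state unchanged
    }
}
```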
Inference auditing
When a node operating under audit calls a model, the call itself becomes an op. Specifically, a likewise.inference.snapshot artefact is written to the log recording the retrieved context (the evidence and claims fed into the prompt), the model identity, telemetry (latency, token counts, backend), and the output. Any claim or suggested action the call produced links back to the snapshot.
Audit is in force in two cases: when the node is operating under the user's root delegation (a node the user runs themselves), and when the node is operating under a delegation whose audit_inference caveat the user has set to true. A delegated node operating without an audit caveat is not required by the protocol to record its inference; what it does internally is governed by the delegation's other caveats. The user retains the choice; the protocol enforces it when chosen.
Snapshots are first-class artefacts and follow the same lifecycle: they have TTLs, they can be evicted, they can be tombstoned with their underlying evidence. While they exist, they are the authoritative answer to "why did the system say that."
The full mechanism is specified in Inference Audit.
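Condensed as a predicate, with an illustrative (non-normative) snapshot shape whose field names paraphrase the audit record described above:

```rust
// Illustrative only: a paraphrase of the audit record, not the
// normative likewise.inference.snapshot schema.
struct InferenceSnapshot {
    context_ops: Vec<u64>, // evidence and claim ops fed into the prompt
    model: String,         // model identity
    latency_ms: u64,       // telemetry
    output: String,
}

// Audit is in force under the user's root delegation, or when the
// delegation carries an audit_inference caveat set to true.
fn audit_in_force(under_root_delegation: bool, audit_caveat: Option<bool>) -> bool {
    under_root_delegation || audit_caveat == Some(true)
}
```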
The six rules in plain English
These are the non-negotiable rules from the Invariants chapter, restated without normative language so the intent is clear:
- Only operations change the truth. Caches and projections do not. Anything you can't reproduce by replaying the log is not real.
- Every claim has provenance. No fact about the user appears without a chain back to evidence the user provided.
- Derivation is a DAG, and refutations cascade. Marking something wrong has consequences the system has to honour.
- Sync converges operations, not projections. Two nodes that have seen the same ops agree on truth, regardless of how each chose to materialise it.
- Every op is signed. Identity is per-device, anchored at the user's root delegation. There are no anonymous writes.
- Inference is auditable. On the user's own nodes, every model call is recorded as a referenceable op by default. On nodes the user has delegated to, audit is opt-in via a caveat the user attaches to the delegation.
A system that violates any of these breaks the user's ability to own what it says about them. The rest of the spec exists to make those rules precise enough to implement.
Three layers, one specification
The specification is organised into three explicit layers that match the architecture above:
- Part 1: The substrate. Evidence, claims, entities, sync, signatures, capabilities, the substrate projections. This is what every conformant node implements. It is sufficient on its own to express and synchronise a user-owned knowledge graph across an arbitrary set of authorised peers, including organisational peers.
- Part 2: The inference pipeline. Job scheduling and claiming, routing kinds to specific nodes, and the inference-snapshot artefact convention that gives the system its audit trail. An implementation that wants to participate in distributed audited inference implements Part 2 on top of Part 1; an implementation that doesn't (a substrate-only peer) ignores it entirely.
- Annex: Application conventions. Episodes, suggested actions, and the salience-ranking projection — the reference implementation's choices for surfacing the substrate to a user. These are not normative; alternative implementations are free to substitute their own application layer.
This split is load-bearing for the org-as-peer scenario: a retailer's node implementing Part 1 (and optionally Part 2) is fully conformant without ever touching the application conventions. A user-facing node in the spirit of the reference implementation will typically implement all three layers.
Where to go next
- Comparison — how this protocol relates to other decentralized-data work.
- Conventions — the start of the normative specification.
Comparison with adjacent work
This chapter is an honest contrast between Likewise and other public work in the decentralized-data and personal-AI space. The aim is not to persuade. It is to give a reader who already knows one of these projects the shortest possible path to understanding what Likewise does differently — and, importantly, where another project does something better and Likewise should not be chosen.
The protocol is young. Several of the projects below are not. None of what follows is meant to disparage them; they are the reason Likewise could be designed at all.
Solid
Solid (Tim Berners-Lee's project, ongoing at MIT and Inrupt) returns control of personal data to users by storing it in user-owned Pods that any application can read or write with permission. The goal is to break data silos so multiple apps can interoperate over the same RDF graph the user owns.
A Pod is an HTTP server exposing Linked Data Platform containers and RDF resources (Turtle, JSON-LD). Identity is WebID (an HTTP URI that dereferences to a profile document) authenticated via Solid-OIDC. Authorization is Web Access Control or Access Control Policy — ACL documents attached to resources. Mutation is plain HTTP CRUD. The Solid Notifications Protocol pushes resource updates over WebSocket / WebHook.
Where it overlaps with Likewise. Both treat "your data, your server" as the foundational stance. Both decentralize identity. Both have capability-flavoured access control. Both expect external applications to operate on a graph the user owns.
Where it diverges. Solid is CRUD-on-RDF-resources; Likewise is an append-only signed-op log with deterministic projections. Solid has no concept of evidence-claim-episode lineage, no causal ordering (no HLC or vector clock), no per-op signatures, no inference auditing, and no work routing. Pods assume an always-online HTTP origin; Likewise expects a small mesh of user-owned devices with intermittent connectivity. Solid leans on the open-world semantics of RDF; Likewise's predicate vocabulary is centralised and lint-enforced for the same reasons it has typed ops in the first place.
Sources: Solid Project, Solid Specification, Solid Protocol.
AT Protocol
AT Protocol (Bluesky) decentralizes social networking by giving each user a portable, content-addressed repository that can move between hosting providers (Personal Data Servers, or PDSes). Relays aggregate a public firehose so anyone can build a feed, index, or app over the network without a central gatekeeper.
Each user has a DID resolving to a signing key and service endpoint. Their PDS holds a repo: a Merkle Search Tree of records (DAG-CBOR / IPLD), each commit signed by the account key. Records conform to Lexicons — typed JSON schemas named with NSIDs. Sync is via the firehose (a WebSocket stream of commits) and CAR-file repo export for migration.
Where it overlaps with Likewise. This is the closest cousin in the list. Both: per-user signed log, content-addressed records, account/key portability, schema-typed records (Lexicons play the role Likewise's typed op variants play), single-author repos with cryptographic verification independent of the host. The shape "signed append-only repo plus sync from a frontier" is the same pattern.
Where it diverges. AT Protocol is public-by-default broadcast designed for global indexing — relays slurp everyone's firehose so anyone can build a search engine over the network. Likewise is private-by-default mesh, gated by UCAN delegations with sanitisation caveats, where every op crossing the wire passes through a capability filter. AT has no UCAN-style delegation, no work scheduling or routing, no inference-snapshot artefacts, no multi-projection materialisation, no evidence→claim→episode derivation DAG, and no derived-data invalidation. AT records are user-authored social objects; Likewise ops include machine-derived hypotheses with provenance back to evidence and a mechanism for the user to refute them.
If you want public discoverability and a thriving third-party indexing ecosystem, AT Protocol is the right tool. If you want a private mesh of a single user's own devices, the goals diverge enough that they are not really competitors.
Sources: AT Protocol, AT Protocol Specification, Data Repositories.
Nostr
Nostr ("Notes and Other Stuff Transmitted by Relays") is a censorship-resistant publish-subscribe substrate. Users sign events with a keypair and broadcast them to multiple relays; readers subscribe to relays and verify signatures locally, so no single relay can silence a user.
Identity is a secp256k1 keypair. The wire unit is an event with a signed (id, pubkey, created_at, kind, tags, content, sig) envelope. Kinds are integers (0 = profile, 1 = text note, 3 = follows, 30000+ = addressable). Relays speak a small WebSocket protocol. There is no causal ordering, no consensus, and no required durability.
Where it overlaps with Likewise. Per-event signing with a user-owned key. Multi-host distribution. Client-side verification. The "everything is a signed event with a kind" mental model rhymes with Likewise's signed-op log with typed payload variants.
Where it diverges. Nostr has no causal frontier — events are effectively a flat set ordered by created_at, which is whatever the user picks. There is no delegation-with-attenuation that has seen serious adoption (NIP-26 was largely abandoned). There is no derived state, no projections, no evidence-or-claim model, no work routing. Nostr is intentionally public broadcast; encrypted DMs exist but are a thin add-on. Nostr's tag system is freeform and emergent; Likewise's predicate vocabulary is centralised and small by design, because predictable derivation requires a closed vocabulary.
Iroh
Iroh ("dial keys, not IPs") is a modular Rust networking stack that gives any two devices an end-to-end-encrypted QUIC connection identified by public key, traversing NATs via relay servers when direct holepunching fails. Higher-level protocols (blobs, docs, gossip) sit on top of the transport.
NodeId is an Ed25519 public key. iroh-blobs handles BLAKE3 content-addressed transfer with resumable verified streaming. iroh-docs is a multi-writer key-value replica. iroh-gossip does epidemic broadcast. iroh-willow is in development as a next-gen replacement using the Willow data model.
Where it overlaps with Likewise. Both target multi-device sync over hostile networks. Both are Rust-first. Both use Ed25519 keys as the identity primitive. Both content-address payloads (BLAKE3 in Iroh, hash-referenced evidence in Likewise). iroh-docs replicas with a per-author key look superficially like Likewise's signed-op log with a per-node identity.
Where it diverges. Iroh is transport plus sync primitives, not a domain model. It has no claims, episodes, inference snapshots, UCAN delegation graph, projection model, or scheduled-work vocabulary. Iroh's authorization story beyond namespace write-keys is intentionally underspecified.
This is largely a non-overlap. Likewise could plausibly be implemented over Iroh's transport — replacing today's HTTP + reqwest layer with iroh-net QUIC connections — and the result would be additive rather than competitive. Today's reference implementation uses HTTP because it is sufficient for a LAN mesh.
Sources: Iroh, Iroh Docs, iroh-willow.
Local-first software (Ink & Switch)
The local-first manifesto is not a protocol; it is the essay that named seven ideals modern cloud apps fail at: no spinners, multi-device, offline, seamless collaboration, longevity, privacy/security, and user ownership. The essay surveys CRDTs (and Automerge in particular) as candidate plumbing, but the seven ideals are values, not specifications.
Where it overlaps with Likewise. Likewise is squarely a local-first system by these criteria — every ideal is a design goal. The "rebuild projections from the op log" stance directly serves longevity (#5: works in 10 years) and ownership (#7: you own your data). The single-user multi-device mesh addresses multi-device (#2) and offline (#3). The capability-gated sharing model serves privacy (#6). And so on.
Where it diverges. The manifesto leans on CRDT auto-merge of arbitrary structured documents as the canonical answer to multi-device sync. Likewise uses an append-only signed log with deterministic projections and last-write-wins-by-OpId for entity merges, not a generic CRDT. This is a deliberate choice: the data domain is narrow enough that a typed op vocabulary is more precise than a generic mergeable document model, and easier to reason about for derivation. The price is that Likewise is not the right tool for collaborative document editing across multiple users — it is single-user-mesh, not multi-user-collab.
The manifesto is also silent on something Likewise has a strong opinion about: derived intelligence with auditable provenance. Local-first thinking informed Likewise; Likewise commits to a stance the manifesto does not take.
Source: Local-First Software.
UCAN — a building block, not a competitor
UCAN (User-Controlled Authorization Network) is offline-verifiable, decentralized authorization. Instead of an OAuth server issuing tokens, the resource owner signs a delegation directly to a delegate, who can re-delegate (attenuated) further. Verification is purely cryptographic: walk the chain, check signatures and attenuation.
A token is a signed envelope over {iss, aud, sub, cmd, policy, exp, nbf, …}. UCAN v1.0 (DAG-CBOR + Varsig + CIDv1 envelopes) is the current direction of the working group; v0.10 was the last JWT-shaped revision.
How Likewise uses UCAN. Every DelegateUcan op carries a v0.10 token; an implementation's UCAN view materialises the delegation graph and enforces strict attenuation per hop. Likewise extends UCAN's policy/caveat slot with a domain-specific caveat set: source_types, predicates, kind_prefix, time_range, and a sanitize directive (StripGeo, RedactParticipants, TruncateContent, StripCustomMetadata). These plug into a capability policy engine that runs authorization plus transitive-cascade plus field-level sanitisation on every outbound op stream. Likewise also extends the Resource and Action enums with Job and Schedule so work routing rides the same delegation graph.
Migration cost. Likewise is currently on UCAN v0.10. The v0.10 → v1.0 migration is non-trivial (envelope format and canonicalisation differ) and is tracked as an open issue.
Sources: UCAN Specification, ucan.xyz.
Automerge
Automerge is a CRDT library and sync engine for collaborative document editing. Documents are JSON-shaped CRDTs with full op history; the sync protocol exchanges Bloom-filtered have/need summaries until peers converge. Works over any byte transport.
Where it overlaps with Likewise. Both are append-history-based and target offline-first multi-device. Likewise's "rebuild projections from op log" is structurally similar to Automerge's "materialise document state from op history."
Where it diverges. Automerge is content-agnostic — it merges generic JSON. Likewise's ops are typed and domain-specific. Automerge has no built-in authorization model; Likewise has UCAN end-to-end. Conflict resolution: Automerge uses CRDT merge semantics per field; Likewise uses last-write-wins-by-OpId for entities (with deterministic cycle resolution). Likewise's approach is simpler and less expressive, but it is better suited to derived data, where "the latest user assertion wins" is the right rule.
For collaborative editing across multiple users, Automerge wins. Likewise is not trying to play that game.
Source: Automerge.
Willow Protocol
Willow (2023+) is an authenticated-sync protocol designed for partial replication of large keyed datasets with capability-based access control and confidential sync — peers only learn about data they are authorised to see, including not learning what they are missing.
Data lives in namespaces, subspaces, paths, and entries. An entry is (namespace_id, subspace_id, path, timestamp, payload_length, payload_digest). Subspaces typically map one-per-author. Prefix pruning gives "destructive editing": writing at blog/idea with a newer timestamp deletes all blog/idea/* descendants. Authorization is Meadowcap, a capability system supporting both owned (top-down) and communal (bottom-up) namespaces. Confidential sync uses private-set-intersection-style techniques.
Where it overlaps with Likewise. This is the closest architectural cousin. Capability-based auth (Meadowcap rhymes with UCAN-plus-caveats), per-author signed entries (analogous to Likewise's signed ops), partial sync (Willow's range-based "area of interest" rhymes with Likewise's frontier-plus-filter), timestamp ordering. iroh-willow brings these capabilities into the same Rust ecosystem Likewise's reference implementation inhabits.
Where it diverges. Willow is a storage and sync substrate, not a knowledge model. It has no claims, episodes, inference snapshots, derivation DAG, or work routing. It is the layer underneath what Likewise does. Conversely, Willow's confidential sync is stronger than what Likewise does today — Likewise relies on the sender honestly applying its capability filter server-side, where Willow's design prevents peers from probing for unauthorised data at all. This is a real gap, and one we expect to close some day; it is tracked as an open issue. Willow's destructive editing via prefix-pruning is also more aggressive than Likewise's tombstone-cascade (which preserves the log and only invalidates derivations).
If Willow had existed when Likewise started, this specification might be a knowledge-graph model defined over Willow rather than alongside it. The right relationship may yet turn out to be that one.
Sources: Willow Protocol, Willow Data Model.
Honest synthesis
What Likewise contributes that the projects above don't
- A typed knowledge-graph vocabulary (evidence → claim → entity → episode → action) baked into the op log, not modeled on top of a generic store. Lexicons (AT) and predicates (Solid/RDF) get close, but they are schema systems, not lifecycle models with derivation DAGs and tombstone-cascade semantics.
- Inference auditability as a separable layer. The protocol defines a likewise.inference.snapshot artefact type and a conditional invariant that requires snapshots from any node operating under the user's root delegation, or under a delegation whose audit_inference caveat the user has set. Every audited model call lands as a snapshot recording retrieved context, model identity, telemetry, and output; derived records link back. None of the surveyed protocols treat machine-derived state as a thing that needs provenance back to evidence. The audit pipeline is a separable layer (Part 2 of the specification), so a substrate-only peer — for example, an organisation node receiving a scoped slice of the user's graph — is conformant without participating in audit unless the user required it via caveat.
- Domain-extended UCAN caveats, including audit_inference. The v0.1 caveat vocabulary (source_types, predicates, kind_prefix, time_range, sanitize, audit_inference) covers both data scoping and behavioural requirements. UCAN itself is the building block; the caveat vocabulary is Likewise's contribution.
- Work routing in the same op log (ScheduleJob, ClaimWork, RouteKind) so heterogeneous nodes — phone without inference, server with GPU — cooperate via the same delegation graph that gates data access. AT, Solid, Nostr have no equivalent; Iroh has a separate task system at a different layer.
- An opinionated read-path projection split (salience, inference, detail, debug-graph) tuned for on-device LLM prompting, UI reads, and ranking from a single log.
- A substrate for consensual commercial data sharing. The capabilities, caveats, and sanitisation rules that secure the user's own mesh generalise directly to delegations to organisations the user invites in. A retailer's node, a clinic's node, an employer's scheduling assistant — each can run a conformant peer with a scope-restricted view of the user's graph, receiving only the claims the user authorised, with sanitisation enforced at the wire boundary. None of the projects above target this user-org-consent shape: they are either personal-only (Iroh, local-first, Automerge) or public-broadcast (AT, Nostr), with Solid the closest in spirit but lacking the caveat + sanitisation vocabulary that makes scoped commercial sharing tractable in practice. See Motivation: Consensual data partnership.
What Likewise doesn't do that one of these does well
- Confidential sync. Willow's design prevents peers from probing for data they are not authorised to see. Likewise relies on the sender honestly applying its capability filter server-side. Closing this gap is an open issue.
- Generic structural merging. For collaborative text or list editing across users, Automerge is better. Likewise's last-write-wins-by-OpId is deliberately coarse, because the domain doesn't need finer.
- Public discoverability and third-party indexing. AT Protocol's firehose model is the right tool for "anyone can build an app over the public stream." Likewise is private-by-default and would have to add new machinery to do this; we have no current plans to.
- Mature ecosystem of clients and apps. Solid has Inrupt and the Community Solid Server. AT has Bluesky and the wider ATmosphere. Nostr has dozens of clients. Likewise has one pre-1.0 reference implementation. The ecosystem cost is real.
- Account portability across hosts. AT Protocol's DID + CAR-export migration is more developed than Likewise's story, which assumes the user owns all participating nodes rather than migrating between hosting providers.
- NAT traversal and transport. Iroh's holepunching plus relay stack is what you would want for cross-network device sync. Likewise's HTTP loopback transport is sufficient for a LAN mesh; an Iroh-backed transport is plausible future work.
- UCAN v1.0. Likewise is on v0.10 (JWT shape); the ecosystem is migrating to v1.0 envelopes (DAG-CBOR plus Varsig). This is technical debt, not a design choice.
The honest one-line summary
If you want public-network social, choose AT Protocol. If you want collaborative editing, choose Automerge. If you want a capability-based confidential-sync substrate, watch Willow closely. If you want a knowledge graph of yourself, owned by you, with auditable inference and a private mesh of your own devices — that's what this protocol is for, and we don't currently know of another public specification that targets the same brief.
Conventions
This chapter defines the conventions used throughout the specification. Subsequent chapters are normative; this one tells you how to read them.
Status of this document
This is Likewise, version 0.1 — draft for public review.
The wire format described in this specification is exercised by an end-to-end reference implementation. It is not yet stable across major versions. Backwards-incompatible changes between v0.1 and v1.0 are expected. Known cross-implementation hazards are catalogued in Open Issues.
Conformance language
The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.
In short:
- MUST / MUST NOT — absolute requirement / prohibition.
- SHOULD / SHOULD NOT — strong recommendation; deviation requires understanding consequences.
- MAY — truly optional.
Normative versus informative material
Each chapter below is divided into normative sections (which use RFC 2119 keywords) and informative sections (which do not). An informative section may explain rationale, give examples, or sketch how an implementation might satisfy the normative requirements. Informative material does not impose requirements. Where the two appear to conflict, the normative material wins.
Examples in code blocks, diagrams, and prose anecdotes are informative.
Versioning
Likewise follows a semantic-versioning shape:
- Major version changes are backwards-incompatible. An implementation MUST NOT silently interoperate across major versions. A change to the wire format, the canonical signing rules, the operation payload encoding, or the meaning of a capability caveat is a major-version change.
- Minor version changes are backwards-compatible additions: new operation variants, new caveats, new sanitisation rules, new reserved fields with safe defaults. An implementation that does not understand a minor-version addition MUST treat it as unknown-but-tolerated where the spec allows, and reject the op otherwise. The specification chapter that introduces an addition states which.
- Patch version changes are editorial only — they do not change observable behaviour.
Two implementations on the same major version SHOULD interoperate
without negotiation. Two implementations on different major
versions MAY refuse to interoperate; the X-Likewise-Mesh-Rules-Hash
sync header is the v0.1 mechanism by which a mismatched pair
detects this and pauses sync rather than corrupting each other
(see Sync).
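As an informative sketch, the pause-rather-than-corrupt behaviour of that handshake can be modelled as follows. The function name, its signature, and the choice to treat an absent header as a mismatch are illustrative assumptions, not normative requirements.

```rust
// Illustrative guard: compare the peer's advertised mesh-rules hash
// (the X-Likewise-Mesh-Rules-Hash sync header) against the local value.
// On any mismatch — or an absent header — sync is paused rather than
// risking the two nodes corrupting each other's logs.
fn sync_permitted(local_rules_hash: &str, peer_header: Option<&str>) -> bool {
    peer_header == Some(local_rules_hash)
}

fn main() {
    assert!(sync_permitted("abc123", Some("abc123"))); // matching pair: proceed
    assert!(!sync_permitted("abc123", Some("def456"))); // mismatch: pause sync
    assert!(!sync_permitted("abc123", None));           // absent header: pause
}
```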
Defined terms
The following terms are used with precise meanings throughout the specification.
- Node — a process running an implementation of this protocol. A node has a long-lived NodeId and a corresponding signing key. A node is the unit of authorship for operations.
- User — the human (or organisation) at whose authority all delegations in a mesh are rooted. Identified by a DID.
- Mesh — the set of nodes belonging to one user. Mesh membership is governed by capability delegations rooted at the user.
- Operation (or op) — the typed, signed unit of state change. Defined in Operations.
- Op log (or just log) — a node's append-only sequence of operations.
- Projection — a materialised read view derived from the op log. Defined in Projections.
- Evidence — an immutable raw input the user has chosen to ingest, referenced by content hash. Defined in Data Model.
- Claim — a working hypothesis about the user, derived from evidence and other claims. Defined in Data Model.
- Capability — a triple (Resource, Action, Caveats) authorising a node to perform a class of operation. Defined in Capabilities.
- HLC — hybrid logical clock. The timestamp scheme defined in Clocks.
- Causal frontier — the per-author maximum-timestamp summary a node uses as its sync cursor. Defined in Sync.
- Owner — the node holding the user's root delegation. Owner is a per-mesh role, not a separate identity. Some operations (notably RouteKind and DesignateCoordinator) are owner-only.
- Coordinator — the node designated to run the deterministic derivation pass for the mesh. There is exactly one coordinator per mesh at a given log prefix; the user designates it explicitly (see Mesh Coordination).
Authoritative sources
When this specification is silent or ambiguous, fall back in this order:
- The relevant RFC for any externally-defined primitive (RFC 7515 for JWS, the UCAN specification for tokens, etc.).
- The maintainers' issue tracker, which is where ambiguities are clarified in subsequent revisions of the specification.
The protocol was developed alongside an in-progress reference implementation (working codename Cortex — provisional; see Implementations). The implementation is not yet publicly available; once it is, its observed behaviour will become the practical fall-back authority for v0.1 ambiguities. Until then, file an issue.
How to cite
When citing this specification, use the form:
Likewise, version 0.1. https://getlikewise.ai/spec/
The protocol is licensed CC-BY-4.0 (see LICENSE at the repository
root). Attribution is required.
Data Model
This chapter defines the structural elements an implementation manipulates: evidence, operations, projections, and the identifier types that link them. It does not define wire encodings (see Wire Format) or the full operation taxonomy (see Operations).
1. Layers
A conforming implementation MUST distinguish three layers:
- Evidence — immutable, content-addressed inputs the user has ingested.
- Operations — typed, signed, totally-ordered records of state change. The operation log is the canonical store of truth.
- Projections — materialised read views derived from the operation log. Projections are disposable and rebuildable.
Higher layers in this list depend only on lower layers. Operations reference evidence by hash. Projections are computed from the operation log. No projection's state may be mutated except by applying operations.
Evidence content (the photo bytes, the calendar payload) MAY be stored separately from the op log in any way an implementation chooses, including not at all on a given node, so long as the content hash recorded in the operation referencing it remains verifiable when the bytes are present.
2. Identifier types
The protocol uses several typed identifier categories. Where this specification refers to an "Id" of a particular kind, the identifier MUST belong to that category. Implementations SHOULD prevent cross-category confusion at the type level where the implementation language permits it.
2.1 NodeId
A NodeId is a long-lived identifier for a node. A NodeId MUST
correspond, one-to-one, to an Ed25519 public key used for op
signing. The mapping is established when a node first authors a
DelegateUcan op announcing its presence and is fixed for the
lifetime of the node.
A NodeId is assigned by the implementation at node initialisation
and MUST be unique within a mesh. The protocol does not specify
the encoding of NodeId beyond requiring that it be a stable byte
string suitable for use as a map key and a JWS kid value.
2.2 ULID-shaped record identifiers
The following identifiers are time-sortable ULID-shaped values: OpId, EvidenceId, EntityId, ClaimId, JobId, ArtifactId, EpisodeId, ActionId. Implementations MUST treat each as opaque outside the operations that produce them, except that ULID-derived total ordering MAY be relied upon for tie-breaking where the specification calls for it (e.g. last-write-wins entity merge in Mesh Coordination).
2.3 ContentHash
A ContentHash is a 32-byte BLAKE3 hash of canonical bytes. It is
used to reference:
- Evidence payloads (the photo, the calendar event content).
- UCAN tokens (for proof-chain references).
- Mesh-rules documents (for the sync handshake).
ContentHash values MUST be encoded as 32 raw bytes on the wire
and MAY be encoded as 64-character lowercase hex in human-readable
contexts.
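Informative sketch of the human-readable encoding. The helper name is an assumption; only the 64-character lowercase-hex shape comes from the text above.

```rust
// Illustrative helper: render a 32-byte ContentHash in the 64-character
// lowercase hex form permitted for human-readable contexts.
fn content_hash_hex(hash: &[u8; 32]) -> String {
    hash.iter().map(|b| format!("{:02x}", b)).collect()
}

fn main() {
    let hash = [0x2a_u8; 32]; // stand-in for a real BLAKE3 digest
    let hex = content_hash_hex(&hash);
    assert_eq!(hex.len(), 64); // 32 bytes -> 64 hex characters
    assert!(hex.chars().all(|c| c.is_ascii_digit() || c.is_ascii_lowercase()));
}
```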
2.4 DID
The user is identified by a Decentralized Identifier (DID). The
protocol does not constrain the DID method; did:key and
did:plc are both acceptable. The user's DID is the issuer of
the root delegation in a mesh.
A node's signing key is bound to its NodeId rather than to a DID
directly; the binding from NodeId to a DID is established by
the chain of UCAN delegations rooted at the user.
3. Evidence
3.1 Identity
Each piece of evidence has:
- An EvidenceId (assigned by the ingesting node).
- A ContentHash of the canonical content.
- A SourceAnchor — a stable identifier from the upstream system the content was ingested from (calendar event UID, photo asset identifier, message identifier). Multiple nodes that ingest the same upstream item MUST agree on the SourceAnchor.
- A source_type — a short string identifying the kind of upstream system ("calendar", "photo", "contact", "location", ...). This is the value matched by the source_types capability caveat (see UCAN and Caveats).
- Optional custom metadata.
3.2 Immutability
Evidence is immutable. Once an evidence-ingest operation lands on
the log, the content referenced by that op's ContentHash MUST
NOT change. To remove evidence, the user (or an authorised node)
emits a TombstoneEvidence op, which triggers a derivation
cascade.
Tombstoning preserves the evidence-ingest op on the log. An implementation MUST NOT delete the original op; the historical record of what was once known survives, even if the implementation has discarded the underlying content bytes.
3.3 Cascade
When an evidence record is tombstoned, every claim, entity merge,
episode, suggested action, and inference snapshot that
transitively depended on it MUST be invalidated. The mechanism is
specified in State Machines. The operation that
tombstones evidence is itself the cascade: a CascadeTombstone
op carries the set of dependent records it invalidates, atomically.
4. Operations
4.1 Common shape
Every operation, regardless of payload variant, carries the following fields:
- op_id: an OpId assigned by the authoring node.
- author: the NodeId of the authoring node.
- timestamp: a hybrid logical clock value (wall_ms, logical, node).
- payload: a typed payload; see Operations for the variants.
- signature: an Ed25519 signature over the canonical encoding of the op with the signature field cleared. See Signatures and Wire Format.
An operation with a missing signature is valid only if it has been intentionally sanitised by an authorised filter (see UCAN and Caveats). All other unsigned ops MUST be rejected.
4.2 Total order
The triple (timestamp.wall_ms, timestamp.logical, timestamp.node) induces a total order over all operations in a
mesh. Where this specification requires ops to be applied "in
order," it means in this total order.
Two operations with the same (wall_ms, logical, node) MUST NOT
exist; that case is a violation of the HLC tick discipline (see
Clocks) and the receiving node MUST treat it as
an authoritative integrity failure.
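An informative sketch: in Rust, deriving Ord on a struct whose fields are declared in (wall_ms, logical, node) order yields exactly this lexicographic total order. The concrete field types and the string NodeId here are illustrative assumptions.

```rust
// Sketch of the HLC timestamp triple and its total order. Derived Ord
// compares fields in declaration order: wall_ms first, then the logical
// counter, then the NodeId as the final tie-breaker.
#[derive(PartialEq, Eq, PartialOrd, Ord, Debug, Clone)]
struct HlcTimestamp {
    wall_ms: u64, // compared first
    logical: u32, // then the logical counter
    node: String, // NodeId breaks remaining ties
}

fn main() {
    let a = HlcTimestamp { wall_ms: 100, logical: 0, node: "phone".into() };
    let b = HlcTimestamp { wall_ms: 100, logical: 1, node: "laptop".into() };
    // Same wall_ms: the logical counter decides, regardless of node.
    assert!(a < b);
}
```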
4.3 Idempotence
Application of an operation to a projection MUST be idempotent. A node that receives the same op twice MUST produce the same projected state as if it had received it once.
Implementations typically achieve this by deduplicating on
op_id at the log layer, but the requirement is on the projected
state, not on the deduplication mechanism.
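An informative sketch of the typical op_id-dedup strategy. The names and the counter standing in for projected state are illustrative.

```rust
use std::collections::HashSet;

// Sketch: idempotent op application by deduplicating on op_id at the
// log layer. `projected_count` is a stand-in for real projection state.
struct Log {
    seen: HashSet<String>,  // op_ids already applied
    projected_count: usize, // stand-in for the projection
}

impl Log {
    fn apply(&mut self, op_id: &str) {
        // `insert` returns false for an already-seen op_id, so a
        // duplicate delivery leaves the projected state unchanged.
        if self.seen.insert(op_id.to_string()) {
            self.projected_count += 1;
        }
    }
}

fn main() {
    let mut log = Log { seen: HashSet::new(), projected_count: 0 };
    log.apply("op-a");
    log.apply("op-a"); // duplicate delivery: no effect
    assert_eq!(log.projected_count, 1);
}
```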
5. Projections
A projection is a materialised view computed from the operation log. Projections are derived state — they MUST be fully reconstructable from the op log alone.
A conforming implementation MUST provide projections sufficient to answer the queries described in Projections. The protocol distinguishes four projections by purpose, not by implementation strategy:
- Salience projection — for ranking what is currently important.
- Inference projection — for assembling a model context window.
- Detail projection — for per-id user-interface lookups; the on-disk read layer.
- Debug-graph projection — for inspection and verification tooling. Optional in production.
Implementations MAY combine the underlying storage of multiple projections; they MUST NOT collapse the read interfaces such that the query semantics of one projection contaminate another. The load-bearing distinction is between what each projection answers, not between where its bytes live. See Projections for the contract each projection must honour.
6. The provenance graph
The relationships between evidence, claims, and other derived records form a directed acyclic graph (DAG). The vertices are records; the edges are "derived from" links carried in the operation that produced the derived record.
A conforming implementation MUST:
- Record the supporting operations of every derived claim, episode, and suggested action. The set of supporting operations MUST be recoverable from the op log.
- Treat the derivation graph as a DAG. An operation that would introduce a cycle into the derivation graph MUST be rejected.
- Implement transitive invalidation: when a vertex is tombstoned or rejected, every vertex transitively reachable along outgoing edges MUST be invalidated. The state-machine consequences of invalidation are specified in State Machines.
The entity-resolution graph (which entity merges into which) is not required to be acyclic; the merge-conflict resolution rules in Mesh Coordination handle the cyclic case deterministically.
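Transitive invalidation over the derivation DAG can be sketched as a breadth-first walk over forward "derived from" edges. The adjacency representation and all names here are illustrative assumptions.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Sketch: given edges stored forward (record -> records derived from
// it), tombstoning a vertex invalidates everything transitively
// reachable from it.
fn invalidate(derived_by: &HashMap<&str, Vec<&str>>, tombstoned: &str) -> HashSet<String> {
    let mut invalid = HashSet::new();
    let mut queue = VecDeque::from([tombstoned]);
    while let Some(v) = queue.pop_front() {
        for &child in derived_by.get(v).into_iter().flatten() {
            if invalid.insert(child.to_string()) {
                queue.push_back(child); // visit each vertex once
            }
        }
    }
    invalid
}

fn main() {
    // evidence e1 -> claim c1 -> episode ep1; claim c2 is independent.
    let mut g = HashMap::new();
    g.insert("e1", vec!["c1"]);
    g.insert("c1", vec!["ep1"]);
    let hit = invalidate(&g, "e1");
    assert!(hit.contains("c1") && hit.contains("ep1"));
    assert!(!hit.contains("c2")); // untouched by the cascade
}
```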
7. Authoring authority
An operation is authorised if and only if its author held a
capability admitting both the operation's Action (the kind of
write it performs) and its Resource (the data class it touches),
with caveats satisfied, at the operation's timestamp.
Authorisation is verified by walking the chain of UCAN delegations from the op's author to the mesh root. The mechanism is specified in UCAN and Caveats and Capabilities. A receiving node MUST reject operations it cannot authorise.
8. Informative: a worked example
An informative section. Does not impose requirements.
The user's phone ingests a calendar event. The phone:
- Computes the BLAKE3 hash of the canonical calendar event bytes.
- Allocates an
EvidenceId. - Builds an
IngestEvidenceop withsource_type = "calendar", theContentHash, the upstream UID as theSourceAnchor, and any extracted metadata in the custom-metadata field. - Ticks its HLC, writes the op's
timestamp. - Signs the op with the phone's NodeId key (detached JWS over canonical encoding, signature field cleared during signing).
- Appends the signed op to the local log.
A subsequent deterministic-extraction pass on this evidence
produces candidate claim ops with Hint status, each carrying
this evidence's EvidenceId in its supporting-operations field.
A later inference call may produce an episode op whose supporting
operations include both the evidence and the claims. A still-later
user assertion may confirm one of those claims, transitioning it
to Fact.
Every step of that pipeline is on the log. Every record above evidence has a path back to evidence. The user can walk that path in either direction.
Operations
This chapter enumerates the operation variants defined for v0.1. Every state change in a Likewise mesh is one of these variants. The wire encoding of an operation is specified in Wire Format; this chapter describes payloads and their semantics.
1. The operation envelope
Every operation, regardless of payload variant, MUST carry the fields described below. Implementations MAY use any in-memory representation; the wire-format chapter specifies the canonical serialisation that signatures are computed over.
| Field | Required | Purpose |
|---|---|---|
id | yes | An OpId. ULID-shaped, time-sortable, globally unique within the mesh. |
schema_version | yes | The op-payload schema version. Future revisions of this specification MAY introduce new payload-format versions; recipients MUST migrate on read. |
timestamp | yes | A hybrid logical clock value. See Clocks. |
node_id | yes | The originating node's NodeId. |
causal_deps | yes | A possibly-empty set of OpId predecessors the author wishes to mark as explicit causal dependencies. May be empty when the author is willing to rely solely on the HLC ordering. |
payload | yes | One of the typed variants enumerated below. |
signature | conditional | Detached JWS over the canonical encoding of the op with the signature field cleared. Required for all ops except those that have been intentionally sanitised by an authorised filter; see UCAN and Caveats. |
A receiving node MUST reject any operation whose envelope is
malformed, whose id collides with an op already in the log under
the same node_id and timestamp, or whose signature is invalid
in a context where one was required.
2. Payload categories
The v0.1 substrate vocabulary partitions operations into seven categories:
- Evidence operations record raw inputs.
- Entity operations create, alias, merge, and split entities.
- Claim operations create and evolve claims.
- Job operations schedule, claim, and complete units of work.
- Artifact operations create and evict generic byproducts of derivation, including the inference-snapshot artefacts used by Part 2.
- User-assertion operations carry the user's overrides on derived state.
- Mesh operations govern delegation, revocation, coordination, and routing.
Two further op types — CreateEpisode/UpdateEpisode and
CreateSuggestedAction/UpdateActionStatus — are
application-layer conventions used by the reference implementation
to surface the substrate to a user. They are documented in
Annex: Application Conventions, not here.
A node that does not surface the graph to a user — for example,
an organisation's node consuming a scoped slice — has no need to
implement them.
Subsequent sections describe each variant. Field types use informal names; their precise wire encodings are in Wire Format.
3. Evidence operations
3.1 IngestEvidence
Creates an immutable evidence record.
| Field | Purpose |
|---|---|
evidence_id | An EvidenceId. |
content_hash | BLAKE3 hash of the canonical content bytes. |
source_type | Short identifier for the upstream system (e.g. "calendar", "photo", "contact"). |
source_anchor | Stable upstream identifier (calendar UID, photo asset id, message id). |
metadata_snapshot | Optional structured metadata extracted at ingest time (timestamp, location, participants). |
Receiving nodes MUST treat the (evidence_id, content_hash) pair
as fixed for the lifetime of the mesh. The bytes referenced by
content_hash MAY be absent on a given node.
3.2 TombstoneEvidence
Removes an evidence record from active circulation. The original ingest op is preserved on the log; only the application of new operations against the tombstoned record changes.
| Field | Purpose |
|---|---|
evidence_id | The evidence being tombstoned. |
reason | One of UserRequest, Privacy, DataExpiry, or another well-known string introduced in a future minor version. |
A TombstoneEvidence op MUST trigger the derivation cascade
defined in State Machines: every claim, episode,
suggested action, and inference snapshot that transitively
depends on the tombstoned evidence is invalidated atomically.
4. Entity operations
4.1 CreateEntity
Introduces a new entity into the mesh.
| Field | Purpose |
|---|---|
entity_id | An EntityId. |
entity_type | Person, Place, Organisation, Device, Account, Document, Concept, Commitment, Event. Future minor versions MAY add types. |
initial_label | Human-readable name. |
source_claims | The claims that motivated the creation, if any. |
4.2 AddEntityAlias
Adds an alternative label for an existing entity.
| Field | Purpose |
|---|---|
entity_id | The target entity. |
alias | Alternative label. |
4.3 MergeEntities
Resolves two or more entities to a single survivor. The survivor absorbs the consumed entities' claims, with redirection so that references to the consumed entities continue to resolve.
| Field | Purpose |
|---|---|
survivor | The EntityId that persists. |
consumed | The EntityIds being absorbed. |
rationale | Free-form prose explaining why they are the same. |
When two MergeEntities ops conflict (each consumes an entity the
other survives), the receiving node MUST resolve the conflict
deterministically by OpId ordering. The full rule is given in
Mesh Coordination.
A MergeEntities op authored by a user-assertion authority MUST
take precedence over machine-derived merges, regardless of OpId
order.
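An informative sketch of the precedence rule, assuming (as one possible deterministic choice) that the greater OpId wins among machine-derived merges; the normative rule is given in Mesh Coordination, and all names here are illustrative.

```rust
// Sketch of MergeEntities conflict resolution: a user-assertion merge
// outranks a machine merge regardless of OpId order; otherwise the
// tie-break is deterministic OpId ordering (ULID-shaped ids are
// time-sortable, so lexicographic comparison suffices).
struct MergeOp {
    op_id: String,       // ULID-shaped
    user_asserted: bool, // authored by a user-assertion authority
}

fn winner<'a>(a: &'a MergeOp, b: &'a MergeOp) -> &'a MergeOp {
    match (a.user_asserted, b.user_asserted) {
        (true, false) => a, // user assertions outrank machine merges
        (false, true) => b,
        // otherwise: deterministic tie-break by OpId ordering
        _ => if a.op_id > b.op_id { a } else { b },
    }
}

fn main() {
    let machine = MergeOp { op_id: "01B".into(), user_asserted: false };
    let user = MergeOp { op_id: "01A".into(), user_asserted: true };
    // The older user-asserted merge still beats the newer machine merge.
    assert!(winner(&machine, &user).user_asserted);
}
```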
4.4 SplitEntity
Reverses a prior merge.
| Field | Purpose |
|---|---|
original | The entity to split. |
new_entities | The set of entities the split produces. |
rationale | Free-form prose explaining why they are different. |
5. Claim operations
A claim is the protocol's unit of asserted belief about an entity. Claims have a status (Hint, Claim, Fact, Disputed, Rejected, Superseded, Stale) whose transitions are specified in State Machines.
5.1 CreateClaim
| Field | Purpose |
|---|---|
claim_id | A ClaimId. |
claim_type | Attribute, Relationship, Membership, Temporal, Spatial, Behavioral, Derived. |
subject | The EntityId the claim is about. |
predicate | A predicate from the centralised vocabulary. The vocabulary is part of the specification; future minor versions MAY add predicates. |
object | One of: an EntityId, text, a number, a boolean, a timestamp, or a structured object. |
initial_status | Typically Hint or Claim. |
confidence | A confidence vector with multiple components. |
provenance | The supporting evidence, claims, and jobs. |
5.2 UpdateClaimStatus
| Field | Purpose |
|---|---|
claim_id | Target claim. |
new_status | New status from the lifecycle. |
rationale | Optional free-form prose. |
5.3 UpdateClaimConfidence
| Field | Purpose |
|---|---|
claim_id | Target claim. |
new_confidence | Updated confidence vector. |
5.4 SupersedeClaim
| Field | Purpose |
|---|---|
old_claim_id | The claim being replaced. |
new_claim_id | The replacement, which MUST already exist on the log. |
rationale | Free-form prose. |
A claim with status Fact (i.e. user-confirmed) is frozen
and MUST NOT be superseded by a non-user-assertion-authored
op. User assertions MAY override frozen claims.
6. Job operations
The job vocabulary lets multiple nodes cooperate on the same unit of work without external coordination. The full state-machine semantics are in Mesh Coordination; the table below specifies the payload shape only.
6.1 ScheduleJob
Declares that a job exists and may be claimed.
| Field | Purpose |
|---|---|
job_id | A JobId. |
kind | A typed work-kind string (e.g. cortex.synthesize.window). The protocol does not constrain the namespace, but implementations SHOULD use a reverse-DNS-style prefix for portability. |
payload | Opaque bytes the eventual handler interprets. |
policy_envelope | Policy and capability constraints attached at scheduling time. |
6.2 ClaimWork
A node takes responsibility for executing a scheduled job.
| Field | Purpose |
|---|---|
job_id | The job being claimed. |
claimer | The NodeId of the claiming node. |
lease_duration_ms | How long the lease lasts. |
The lease's effective expiry is computed against the HLC wall component, not against any node's local wall clock. See Mesh Coordination.
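An informative sketch of the HLC-relative expiry check; all names are illustrative.

```rust
// Sketch: a lease expires when the HLC wall component observed on
// subsequent ops reaches the claim's wall_ms plus the lease duration.
// No node's local wall clock is consulted.
fn lease_expired(claim_wall_ms: u64, lease_duration_ms: u64, observed_wall_ms: u64) -> bool {
    observed_wall_ms >= claim_wall_ms + lease_duration_ms
}

fn main() {
    // A 5-minute lease claimed at HLC wall 1_000 expires at 301_000.
    assert!(!lease_expired(1_000, 300_000, 300_999));
    assert!(lease_expired(1_000, 300_000, 301_000));
}
```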
6.3 CompleteJob
Records that a job finished and its outputs are on the log.
| Field | Purpose |
|---|---|
job_id | The job. |
output_claims | Claims produced. |
output_artifacts | Artefacts produced. |
telemetry | Duration, token counts, model latency. |
6.4 YieldWork
A claimer voluntarily releases a job before completion.
| Field | Purpose |
|---|---|
job_id | The job. |
claimer | MUST match the current claimer's NodeId. |
reason | Free-form prose. |
6.5 ExpireWork
Any node MAY emit an ExpireWork op once a lease's HLC-relative
deadline has passed. The op moves the job back to the unclaimed
state.
| Field | Purpose |
|---|---|
job_id | The job. |
expired_claimer | MUST match the current claimer's NodeId. |
reason | Conventionally "deadline_passed". |
7. User-assertion operations
The user is the final authority on facts about themselves. A user assertion takes precedence over machine-derived state and MUST be respected by the receiving node's projection logic.
7.1 UserAssert
| Field | Purpose |
|---|---|
assertion_type | Confirm, Reject, Edit, Pin, Hide, LaneRule. |
target | A Claim, Entity, or semantic-lane reference. |
semantic_lane | Optional lane qualifier. |
Effects by assertion type:
- Confirm — promotes the target claim to Fact. The claim becomes frozen against subsequent automated invalidation.
- Reject — sets the target claim to Rejected and triggers the derivation cascade.
- Edit — creates a versioned replacement claim that the receiving node MUST treat as superseding the original.
- Pin — freezes the target without altering its current status.
- Hide — display-layer directive; the claim persists on the log but is excluded from user-facing surfaces.
- LaneRule — blocks or requires confirmation for derivations in a named semantic lane. The set of lane-rule effects is specified in State Machines.
User-assertion ops are authored by a node that holds a
write capability on the relevant resource with no caveats
restricting UserAssertion. Implementations MAY in addition
require that the authoring node corresponds to a "user-bearing"
role established by mesh policy; the v0.1 specification does not
mandate this.
8. Artifact operations
Artefacts are generic machine-produced byproducts of derivation:
embeddings, transcripts, OCR text, and the inference snapshots
that record model calls. The artefact mechanism is substrate;
specific artefact types layered on top of it (notably
likewise.inference.snapshot, used by Part 2)
inherit lifecycle and storage from this section.
8.1 CreateArtifact
| Field | Purpose |
|---|---|
artifact_id | An ArtifactId. |
artifact_type | Short identifier ("image_embedding", "ocr_text", "transcript", "likewise.inference.snapshot", ...). |
source_job | Optional link to the producing job. |
inputs_used | Evidence inputs. |
content_hash | BLAKE3 of the artefact content. |
content_inline | Optional inline bytes for small artefacts. |
model_id, model_version | Optional. Required for inference-snapshot artefacts. |
size_bytes | Content size. |
ttl_ms | Optional time-to-live, after which the artefact is eligible for eviction. |
The likewise.inference.snapshot artifact type is specified in
detail in Inference Audit.
8.2 EvictArtifact
Drops the content of an artefact (the metadata is retained on the log).
| Field | Purpose |
|---|---|
artifact_id | Target artefact. |
8a. Application-layer ops (informative pointer)
The reference implementation also emits CreateEpisode,
UpdateEpisode, CreateSuggestedAction, and
UpdateActionStatus as part of its user-facing surface. These
are documented in
Annex: Application Conventions. They are
not part of the substrate vocabulary; a substrate-only
implementation that receives them on the wire MAY accept and
store them on the log without maintaining any projection state
for them.
9. Mesh operations
9.1 DesignateCoordinator
Owner-only. Names the node responsible for the deterministic derivation pass. There is no automatic election; coordinator selection is an explicit user act.
| Field | Purpose |
|---|---|
coordinator | The NodeId that should run the deterministic pipeline. |
A DesignateCoordinator op authored by any node other than the
mesh owner MUST be rejected.
9.2 DelegateUcan
Carries a UCAN delegation in the op log.
| Field | Purpose |
|---|---|
ucan_cid | The ContentHash of the token bytes. Acts as the delegation's identity. |
ucan_bytes | The detached-JWS UCAN token. |
A DelegateUcan op authored by a node that has not yet been
seen on the log MAY be accepted even though the authoring key is
not yet known, on the condition that the embedded UCAN binds the
authoring NodeId to the issuer's DID and the op then verifies
against the freshly-bound key. This is the bootstrap path by
which a new node's key first becomes known to the mesh; see
UCAN and Caveats.
9.3 RevokeUcan
| Field | Purpose |
|---|---|
ucan_cid | The content hash of the delegation being revoked. |
A RevokeUcan op MUST be authored by the issuer of the
delegation it revokes (or by a node with write authority over
that DID's delegations under a still-valid parent). Receiving
nodes MUST prune the subgraph of delegations beneath the
revoked one and MUST re-evaluate the authorisation of any ops
whose authority depended on it.
9.4 RouteKind
Owner-only. Routes a class of jobs to a specific node.
| Field | Purpose |
|---|---|
kind | The work-kind string. |
route | An Option<NodeId>. Setting to None clears the directive. |
While a route is set, only the named node MAY successfully
emit a ClaimWork op for that kind. Other nodes' claim ops
MUST be rejected. Routes follow last-write-wins semantics by
op timestamp.
A RouteKind op authored by any node other than the mesh
owner MUST be rejected.
10. Operation indexing
Implementations MUST be able to retrieve operations from the log
by OpId, by (node_id, timestamp), and by author-frontier
(see Sync). They MAY provide additional indices
for efficient projection rebuilds.
11. Reserved variants
Future minor versions of this specification MAY introduce new op variants. An implementation that encounters an unknown variant on the wire MUST reject the op, log the rejection, and continue processing subsequent ops. It MUST NOT silently drop unknown variants or guess at their semantics.
The reserved-prefix convention for namespacing third-party extensions is described in Open Issues; a stable extension mechanism is anticipated but not normative in v0.1.
Wire Format
This chapter specifies the byte-level encoding of operations and related structures as they cross between nodes. It defines:
- the canonical encoding used for signature computation,
- the encoding of operation identifiers, hashes, and clocks,
- the framing for collections of operations on the wire,
- the encoding of cursors and frontiers used by the sync endpoint.
The transport-layer protocol that carries these encoded bytes is specified in Sync. The signature algorithm and detached-JWS envelope are specified in Signatures.
1. Encoding format
The canonical encoding is postcard, a compact deterministic binary serialisation defined at https://postcard.jamesmunns.com/. Implementations MUST use postcard's deterministic ordering and varint conventions.
Postcard was chosen for v0.1 because it is compact, deterministic, and has independent implementations in multiple languages. The choice is not load-bearing in the long run; an implementation MAY expose alternative encodings (JSON, CBOR, MessagePack) for debugging or for application-layer interop, but operations authored or accepted on the wire MUST be the postcard encoding. The signature is computed over the postcard bytes.
1.1 Determinism requirements
Two implementations encoding the same operation values MUST produce byte-identical postcard output. Implementations MUST:
- Encode struct fields in the order this specification declares them (subsequent chapters declare order alongside payloads).
- Encode option-typed fields as 0x00 for None, and 0x01 followed by the value bytes for Some.
- Encode collections as varint(len) followed by the elements in their authored order.
- Encode booleans as a single byte: 0x00 false, 0x01 true.
- Encode integers as varints unless this specification specifies fixed-width.
1.2 Versioning of the encoding
The wire encoding does not carry an explicit version tag at the op level. Schema evolution within a minor version is constrained to backwards-compatible additions only — see Conventions. The absence of an explicit version tag is one of the known cross-implementation hazards and is expected to be addressed in a subsequent major version.
2. Identifier encodings
2.1 NodeId
A NodeId is encoded as an unsigned 64-bit varint. The mapping
from NodeId value to the corresponding Ed25519 public key is
established by DelegateUcan ops on the log; see
Signatures.
2.2 ULID-shaped record identifiers
OpId, EvidenceId, EntityId, ClaimId, JobId, ArtifactId,
EpisodeId, and ActionId are encoded as 16 raw bytes (the
canonical ULID byte form, big-endian: 48-bit timestamp + 80-bit
randomness).
2.3 ContentHash
A ContentHash is encoded as 32 raw bytes (the BLAKE3 output).
Hex encoding MAY be used in human-readable contexts (debugging,
log lines, headers) but MUST NOT be used on the canonical wire.
2.4 DID
A DID is encoded as a length-prefixed UTF-8 string, with the full
URI form (did:method:identifier).
3. Hybrid logical clock encoding
A Timestamp is encoded as a struct in the following order:
- wall_ms: 64-bit unsigned integer (varint).
- logical: 32-bit unsigned integer (varint).
- node: a NodeId (varint as above).
The clock value's semantics and tick rules are specified in Clocks.
4. Operation envelope encoding
Every operation is encoded as a struct in the order this section declares.
| Field | Type | Encoding |
|---|---|---|
id | OpId | 16 bytes. |
schema_version | varint | Currently 1. |
timestamp | Timestamp | as above. |
node_id | NodeId | varint. |
causal_deps | Vec<OpId> | varint(len) + 16 × len bytes. |
payload | tagged union | varint discriminant + variant body; see Operations. |
signature | Option<Vec<u8>> | option byte + length-prefixed bytes when Some. |
The variant discriminants for the payload union are assigned by this specification and MUST be stable across implementations of the same major version.
5. Canonical signing form
The signature is computed over the operation's canonical
encoding with the signature field cleared to None.
Procedure:
- Set signature = None on the operation.
- Encode the operation per Section 4.
- Compute the Ed25519 signature over the resulting bytes using the authoring node's private key.
- Set signature = Some(<signature bytes wrapped in a detached JWS envelope>) per Signatures.
The reverse-verification procedure for receivers is specified in Signatures.
This rule — that the signature is cleared during signing — is the single most error-prone aspect of v0.1 implementation. Implementers SHOULD test it explicitly in cross-language interoperability fixtures.
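One way to exercise the round-trip in a fixture is to isolate the clear-then-encode step behind a single function. The sketch below is illustrative: a JSON dict stands in for the postcard encoding, and `sign`/`verify` stand in for the Ed25519 primitives; only the clearing discipline is the point.

```python
import json

def canonical_bytes(op: dict) -> bytes:
    # Stand-in for the canonical postcard encoding. What matters is that
    # the signature field is cleared to None before encoding, on BOTH the
    # signing and the verifying side.
    cleared = dict(op, signature=None)
    return json.dumps(cleared, sort_keys=True).encode()

def sign_op(op: dict, sign) -> dict:
    signed = dict(op)
    signed["signature"] = sign(canonical_bytes(op))
    return signed

def verify_op(op: dict, verify) -> bool:
    # Re-derive the canonical form with the signature cleared; verifying
    # the received bytes as-is is the classic mistake.
    if op.get("signature") is None:
        return False
    return verify(canonical_bytes(op), op["signature"])
```

A fixture can then assert that a signed op verifies and that any payload mutation breaks verification, without involving real keys.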
6. Sanitised operations
When an operation is sanitised (a caveat strips fields before
crossing a delegation; see UCAN and Caveats),
the sender MUST clear the signature field. The recipient MUST
NOT attempt to verify a signature on a sanitised op.
Sanitisation happens at the sender. The recipient distinguishes sanitised ops from corrupted ops by the presence of a caveat-derived sanitisation marker on the op envelope (described in UCAN and Caveats). An op that arrives without a signature and without the sanitisation marker MUST be rejected.
7. Operation collections on the wire
The sync endpoint exchanges sequences of operations. The
on-the-wire encoding of Vec<Operation> is the postcard encoding
of the sequence: varint(len) followed by each operation in
order.
The order in the sequence is significant only as a hint: recipients MUST apply received ops by their HLC total order, not by sequence position.
8. Causal frontier encoding
A CausalFrontier is a per-author summary of the maximum
operation seen from each node. It is encoded as a map with the
following structure:
varint(num_authors)
for each author:
NodeId (varint)
Timestamp (struct)
The order of map entries on the wire is by ascending NodeId.
For use as a sync cursor in HTTP query parameters, the frontier
is base64url-encoded (RFC 4648, no padding). The cursor is
opaque to clients beyond this format; clients MUST NOT attempt
to construct cursor values other than by echoing back values
received from a server, except for the empty frontier (encoded as
varint(0), base64url AA), which means "from the beginning."
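The frontier and cursor encodings above can be sketched as follows; this is an illustrative Python sketch (the varint helper and function names are assumptions, and the Timestamp fields are modelled as a plain tuple).

```python
import base64

def varint(n: int) -> bytes:
    # Unsigned LEB128, as used throughout the wire format.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def encode_frontier(frontier: dict[int, tuple[int, int, int]]) -> bytes:
    # varint(num_authors), then entries in ascending NodeId order; each
    # timestamp is (wall_ms, logical, node), varint-encoded per the
    # Timestamp struct.
    out = bytearray(varint(len(frontier)))
    for node_id in sorted(frontier):
        wall, logical, node = frontier[node_id]
        out += varint(node_id) + varint(wall) + varint(logical) + varint(node)
    return bytes(out)

def cursor(frontier: dict[int, tuple[int, int, int]]) -> str:
    # base64url (RFC 4648), no padding.
    return base64.urlsafe_b64encode(encode_frontier(frontier)).rstrip(b"=").decode()
```

Note that the empty frontier encodes to the single byte 0x00, whose unpadded base64url form is the "AA" cursor named above.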
9. UCAN token wire format
A UCAN delegation referenced by DelegateUcan is carried as
opaque bytes (Vec<u8>) — specifically, the detached-JWS form of
a UCAN v0.10 token over a JSON payload. The UCAN content hash
(ucan_cid) is the BLAKE3 of these bytes.
The UCAN token format is specified externally; see UCAN and Caveats for the v0.10 details and the v1.0 migration plan.
10. Mesh-rules hash
The mesh-rules document is a small structured value carrying the non-negotiable parameters of a mesh (protocol version, agreed caveat vocabulary, agreed sanitisation rules). It is encoded canonically per Section 1, and its hash is the BLAKE3 of those bytes.
The mesh-rules hash is exchanged on every sync exchange via the
X-Likewise-Mesh-Rules-Hash HTTP header (see Sync)
to detect rule drift between peers.
11. Header conventions
When operations are exchanged over HTTP, the following headers have normative meaning:
- Content-Type: application/octet-stream for postcard bodies.
- X-Likewise-Next-Frontier: <base64url> — set by a server on successful pull responses; tells the client what frontier to send next.
- X-Likewise-Mesh-Rules-Hash: <hex> — set by both sides on every request and response; mismatch triggers the handshake-pause behaviour specified in Sync.
Implementations MAY define additional headers for diagnostics,
provided they do not begin with X-Likewise- (which is reserved
for protocol-defined headers).
Sync
This chapter specifies how nodes exchange operations. The protocol defines exactly one endpoint, two HTTP methods, and one cursor. The simplicity is intentional: synchronisation is the most load-bearing operation in a decentralised system, and richer protocols are harder to implement compatibly.
1. Transport
Nodes communicate over HTTP/1.1 or later with TLS recommended on any non-loopback transport. The v0.1 specification does not require any particular HTTP feature beyond:
- Request and response bodies up to a server-advertised limit (default: 8 MiB; see Section 7).
- Custom request and response headers.
- Standard status codes.
WebSockets, gRPC, QUIC, or peer-to-peer transports MAY be used by implementations as alternatives, but two implementations claiming v0.1 conformance MUST both support the HTTP profile defined in this chapter.
2. The single endpoint: /ops
A v0.1 node MUST expose GET /ops and POST /ops. A node MAY
expose additional administrative endpoints; they are not part of
this specification.
The path /ops is mounted at the root of the node's HTTP origin.
A node MAY operate behind a reverse proxy that adds path prefixes,
in which case the proxy is responsible for mapping back to /ops
for compliant peers.
3. Pulling operations: GET /ops
Pulls operations the requester does not already have.
3.1 Request
GET /ops?since=<base64url-frontier>&limit=<n>
X-Likewise-Mesh-Rules-Hash: <hex>
Authorization: Bearer <node-bearer-token>
Query parameters:
- since (required) — a base64url-encoded CausalFrontier representing the requester's high-water mark per author. The empty frontier (base64url AA) means "from the beginning of the log." See Wire Format.
- limit (optional) — an upper bound on the number of operations the server returns. The server MAY return fewer than limit even if more are available; clients MUST be prepared to issue follow-up requests using the returned next-frontier cursor. Servers MAY enforce an upper bound on limit and clamp values exceeding it.
The Authorization header carries a node-bearer token that
authenticates the requesting node. Token issuance and refresh are
specified in Signatures.
3.2 Response
200 OK
Content-Type: application/octet-stream
X-Likewise-Next-Frontier: <base64url>
X-Likewise-Mesh-Rules-Hash: <hex>
<postcard-encoded Vec<Operation>>
Body: the postcard encoding of the sequence of operations the server is willing to send, filtered by the requester's capability set (Section 5).
Headers:
- X-Likewise-Next-Frontier — the cursor the requester should send on its next pull. This frontier MUST encompass every operation in the response and MAY encompass operations the server chose to filter.
- X-Likewise-Mesh-Rules-Hash — the server's current mesh-rules hash. The requester MUST compare it to its own; on mismatch the pause-on-drift behaviour in Section 6 applies.
3.3 Idempotence and safety
GET /ops is safe and idempotent. Repeated calls with the same
since cursor MUST return the same operations modulo log growth
on the server in the interim.
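A client's paginated pull loop follows directly from §3.1 and §3.2. The sketch below abstracts the HTTP call behind a `fetch` callable (an assumption for testability): `fetch(cursor)` stands in for GET /ops?since=<cursor> and returns the decoded operations plus the X-Likewise-Next-Frontier value.

```python
def pull_all(fetch, cursor: str = "AA"):
    """Pull until caught up. Starts from the empty frontier ("AA")
    unless a saved cursor is supplied."""
    ops = []
    while True:
        batch, next_cursor = fetch(cursor)
        ops.extend(batch)
        if not batch:
            # An empty batch means we are caught up; keep the server's
            # cursor for the next polling round.
            return ops, next_cursor
        cursor = next_cursor
```

A real client would also persist the cursor between runs and apply each batch in HLC order as specified in Section 8, rather than accumulating ops in memory.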
4. Pushing operations: POST /ops
Submits operations the sender wants the recipient to apply.
4.1 Request
POST /ops
Content-Type: application/octet-stream
X-Likewise-Mesh-Rules-Hash: <hex>
Authorization: Bearer <node-bearer-token>
<postcard-encoded Vec<Operation>>
Body: a postcard-encoded sequence of operations.
4.2 Response
200 OK
Content-Type: application/json
X-Likewise-Mesh-Rules-Hash: <hex>
{ "appended": N, "duplicated": M, "rejected": K }
Where:
- appended is the number of operations newly added to the recipient's log.
- duplicated is the number that the recipient already had on its log (deduplicated by OpId).
- rejected is the number that failed authorisation, signature verification, or schema validation.
The recipient MUST verify each incoming operation per
Signatures and authorise it per
UCAN and Caveats. Operations that fail
either check MUST be excluded from appended and counted toward
rejected. Implementations SHOULD log rejections with enough
detail for an operator to diagnose, but the wire response SHOULD
NOT leak per-op rejection reasons across capability boundaries.
4.3 Idempotence
Application of POST /ops MUST be idempotent: re-submitting the
same operations MUST result in the same recipient state, with
duplicates counted toward duplicated rather than appended a
second time.
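The counting and idempotence contract can be sketched as follows. This is illustrative: `verify` stands in for the combined signature and authorisation checks, `log_ids` for the set of OpIds already on the log, and the dict-shaped ops are an assumption.

```python
def apply_push(log_ids: set, ops: list, verify) -> dict:
    """Receiver-side accounting for POST /ops (§4.2)."""
    counts = {"appended": 0, "duplicated": 0, "rejected": 0}
    for op in ops:
        if not verify(op):
            counts["rejected"] += 1        # failed checks never append
        elif op["id"] in log_ids:
            counts["duplicated"] += 1      # already on the log
        else:
            log_ids.add(op["id"])
            counts["appended"] += 1
    return counts
```

Re-submitting the same batch leaves the log unchanged: every previously appended op now counts toward duplicated, which is the idempotence property §4.3 requires.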
5. Source-side filtering
A server MUST filter outbound operations by the requester's capability set before responding. The filter:
- Authorises each candidate operation against the requester's delegation chain. Operations the requester is not authorised to read are excluded.
- Applies any sanitize caveats that govern the requester's delegation. Sanitised operations have signatures cleared per Wire Format.
The full filter pipeline is specified in UCAN and Caveats. The contract here is that the wire never carries operations the requester is not authorised to see.
6. The mesh-rules-hash handshake
Both sides include X-Likewise-Mesh-Rules-Hash on every request
and response. On mismatch:
- The receiving side MUST treat the request as a "drift" condition. It MAY return a 409 Conflict response and abort the exchange, or it MAY continue the exchange while logging the drift; this is a deployment policy choice.
- The sending side, on receiving a 409 Conflict for a mesh-rules-hash mismatch, MUST pause its sync loop with that peer and surface the condition to the operator. It MUST NOT re-attempt the same exchange before resolving the drift.
The rationale is that two nodes operating under different mesh-rules documents may both believe a given op is authorised but disagree about what its caveats mean. Continuing to sync in that condition silently corrupts the shared interpretation of the log.
The v0.1 protocol does not include an automatic mesh-rules negotiation. Resolving drift requires operator action — typically adopting a newer common rules document. A future revision is expected to add a negotiation pre-handshake; this is an open issue.
7. Limits
A v0.1 server MUST support requests and responses up to 8 MiB total body size. It MAY support larger sizes; clients MUST be prepared to receive 413 Payload Too Large responses on push and MUST batch their submissions accordingly.
A v0.1 server SHOULD enforce a per-peer rate limit. The
specification does not mandate a particular rate; servers MAY
return 429 Too Many Requests and clients MUST honour Retry-After.
8. Order of application on the receiver
A receiver applying operations from a POST /ops body MUST:
- Decode the postcard payload to a sequence of operations.
- Sort by HLC total order: (timestamp.wall_ms, timestamp.logical, timestamp.node) ascending.
- Apply each operation in order, deduplicating by OpId.
- Update its causal frontier accordingly.
- Tick its own HLC past the maximum received timestamp.
The fifth step is part of the HLC discipline specified in Clocks.
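The first four steps can be sketched as below; this is an illustrative Python sketch in which each op is a dict with 'id', 'node_id', and 'timestamp' = (wall_ms, logical, node) — field shapes are assumptions, and the final HLC tick is left to the recv discipline in Clocks.

```python
def apply_batch(ops: list, log: dict, frontier: dict) -> None:
    """Decode is assumed done; sort, dedupe, apply, update frontier."""
    # Python tuples compare lexicographically, which matches the HLC
    # total order (wall_ms, logical, node).
    for op in sorted(ops, key=lambda o: o["timestamp"]):
        if op["id"] in log:
            continue  # deduplicate by OpId
        log[op["id"]] = op
        author = op["node_id"]
        ts = op["timestamp"]
        if author not in frontier or ts > frontier[author]:
            frontier[author] = ts
    # Step 5 (tick the local HLC past the maximum received timestamp)
    # belongs to the clock discipline, not shown here.
```

Sorting before application means the sequence position on the wire is irrelevant, as Section 7 of Wire Format requires.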
9. Liveness
A successful GET /ops exchange doubles as a liveness signal:
the requester learns that the responder is reachable and has not
revoked the requester's bearer. There is no separate heartbeat in
v0.1.
10. Polling cadence
The protocol does not specify how often a node should pull. A plausible v0.1 default is 30 seconds for a node on a stable local network and 5 minutes for a mobile node on metered connectivity. Implementations MAY back off on transport errors and SHOULD jitter their cadence to avoid thundering herds in a large mesh.
A future revision is expected to add server-initiated push hints (WebSocket or webhook) for lower-latency convergence. This is an open issue; v0.1 conformant nodes use polling.
11. Informative: why one endpoint
Informative section. Does not impose requirements.
A reader familiar with replicated-log systems will recognise the shape: a frontier-based pull plus an idempotent push is a standard pattern. v0.1 deliberately resists adding more — batched merkle-trees, differential range queries, sparse-index exchanges — because every additional sync mode is a place where two implementations can disagree without either being wrong.
The cost is that catching up a long-disconnected node from genesis is a sequence of paginated pulls rather than a bulk transfer. For the meshes this protocol targets — small, mostly warm, mostly online — that cost is negligible. Future revisions MAY add bulk-transfer modes for first-synchronisation and very- large-mesh scenarios; v0.1 does not.
Clocks
This chapter specifies the Hybrid Logical Clock (HLC) used to timestamp operations. The HLC is the mechanism by which two operations can be totally ordered across a mesh whose nodes disagree about wall-clock time.
This chapter makes explicit what early implementations handled
only implicitly: the discipline by which a node updates its HLC.
The clock value alone is not enough; a clock without a discipline
will, eventually, produce two operations from different authors
with the same (wall_ms, logical, node) triple, and a mesh that
permits that has no way to converge.
1. The HLC value
An HLC value is a triple:
| Field | Type | Notes |
|---|---|---|
wall_ms | unsigned 64-bit | Milliseconds since the Unix epoch (1970-01-01T00:00:00Z), as best the node can estimate. |
logical | unsigned 32-bit | A counter that advances within a single wall_ms. |
node | NodeId | The authoring node. |
The wire encoding is specified in Wire Format.
2. Total order
For any two HLC values a and b:
a < b if and only if (a.wall_ms, a.logical, a.node) < (b.wall_ms, b.logical, b.node) in lexicographic order.
This induces a strict total order on operations within a mesh. Where the rest of this specification refers to the order of operations, it means this order.
A receiver MUST apply received operations in this order regardless of the sequence position they arrived in (see Sync).
3. Per-node state
Each node maintains a single HLC value, called its local clock.
The local clock has the same fields as an HLC value above; its
node field is the node's own NodeId.
The local clock advances under two disciplines: the tick discipline on emit, and the recv discipline on receive.
4. The tick discipline
Before authoring a new operation, a node MUST advance its local
clock by the following procedure. Let prior be the local clock
value before tick, and wall_now be the node's current
wall-clock reading (in milliseconds since Unix epoch).
tick(prior, wall_now) -> next:
if wall_now > prior.wall_ms:
next.wall_ms = wall_now
next.logical = 0
else:
next.wall_ms = prior.wall_ms
next.logical = prior.logical + 1
next.node = prior.node
return next
The newly authored operation MUST carry next as its timestamp.
After authoring, the node's local clock MUST equal next.
Two requirements follow from this procedure:
- Strict monotonicity. next > prior for any prior. A node MUST NOT author two operations with the same timestamp.
- Wall-clock dominance. next.wall_ms >= wall_now if wall_now > prior.wall_ms. The HLC tracks reality forward when it can.
If prior.logical is at the maximum representable value, the
node MUST treat the case as a clock overflow and refuse to
author further operations until the wall component advances. In
practice this is unreachable at the millisecond resolution and
32-bit counter v0.1 specifies, but conformant implementations
MUST still handle it.
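The tick procedure, including the overflow case, transcribes directly to code. A minimal sketch, modelling an HLC value as a (wall_ms, logical, node) tuple:

```python
MAX_LOGICAL = 2**32 - 1  # the 32-bit logical counter v0.1 specifies

def tick(prior: tuple, wall_now: int) -> tuple:
    """Advance the local clock before authoring an op (§4)."""
    wall, logical, node = prior
    if wall_now > wall:
        # Wall clock moved forward: adopt it and reset the counter.
        return (wall_now, 0, node)
    if logical == MAX_LOGICAL:
        # Clock overflow: refuse to author until the wall component advances.
        raise OverflowError("HLC logical counter exhausted")
    # Wall clock stalled or behind: bump the logical counter.
    return (wall, logical + 1, node)
```

Both required properties fall out: the result is strictly greater than prior in the lexicographic order, and the wall component never moves backwards.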
5. The recv discipline
When a node receives a remote operation with timestamp remote,
it MUST update its local clock by the following procedure. Let
prior be the local clock and wall_now be the current wall
reading.
recv(prior, remote, wall_now) -> next:
let max_wall = max(prior.wall_ms, remote.wall_ms, wall_now)
if max_wall == prior.wall_ms and max_wall == remote.wall_ms:
next.logical = max(prior.logical, remote.logical) + 1
elif max_wall == prior.wall_ms:
next.logical = prior.logical + 1
elif max_wall == remote.wall_ms:
next.logical = remote.logical + 1
else:
next.logical = 0
next.wall_ms = max_wall
next.node = prior.node
return next
After applying recv, the node's local clock MUST equal next.
The next op the node authors will then dominate remote,
preserving the invariant that any op authored by this node after
seeing remote is later in the total order than remote.
The recv discipline MUST be applied for every remote operation, including operations that the receiver chooses to discard for authorisation reasons. (Failing to advance the clock for filtered-out ops produces an observable hole in causal ordering that breaks the frontier invariant.)
6. Wall-clock skew
The HLC is robust to bounded wall-clock skew between nodes —
that is, two nodes whose clocks are within some bound of each
other will produce timestamps whose order tracks the real order
of authoring. A node whose clock is far ahead of its peers will
"pull" the mesh's timestamps forward (other nodes will adopt the
larger wall_ms on receive). A node whose clock is far behind
will not.
The protocol does not specify a skew bound. Implementations SHOULD:
- Synchronise their wall clocks against an external time source when one is available (NTP, a peer's clock).
- Treat as suspicious any received op whose wall_ms is more than one hour ahead of wall_now.
Skew tolerance is a known open issue: the v0.1 specification does not give an implementation tools to reject a peer producing wildly future-dated timestamps. A future revision is expected to add an out-of-band skew limit negotiated as part of mesh-rules.
7. Lease expiry uses HLC, not wall clock
Lease-based work claims (ClaimWork) carry a lease_duration_ms
that is interpreted against the HLC wall component, not against
the local wall clock of any single node:
expired_at(claim_op) -> hlc_threshold
let claimed_wall = claim_op.timestamp.wall_ms
return claimed_wall + claim_op.payload.lease_duration_ms
is_expired(claim_op, current_hlc) -> bool
return current_hlc.wall_ms > expired_at(claim_op)
This makes lease expiry robust to clock skew across the mesh in the same way the rest of the protocol is. See Mesh Coordination.
8. Informative: why HLC instead of vector clocks
Informative section. Does not impose requirements.
A vector clock would carry a logical counter per author and let
two ops be partially ordered. The HLC is strictly less
expressive: it produces a total order, breaking concurrency ties
arbitrarily by node. This is acceptable for Likewise
because:
- The protocol's merge semantics are last-write-wins by OpId for the cases where two ops conflict; partial order would not give an implementation more information than total order already provides.
- The total order plus a per-author causal frontier gives sync a clean cursor: "everything past this frontier" is unambiguous.
- A vector clock requires a per-author entry that grows with mesh size; the HLC is fixed-size.
The cost is that the protocol is not a CRDT in the strict sense — two nodes with the same set of operations agree on order regardless of how they observed them, but they don't have richer concurrency information to inspect. For Likewise's domain — a single user's mesh — that cost is the right trade.
Signatures
This chapter specifies how operations are signed and verified, and
how nodes authenticate to one another over HTTP. It is the
specification of the JWS envelope used by the signature field
on every operation, the canonical signing form referenced from
Wire Format, and the bearer-token issuance
used by Sync.
1. Algorithm
All signatures defined by v0.1 of this specification are Ed25519 (RFC 8032). Signature size is fixed at 64 bytes; verification keys are 32 bytes.
A future revision may add additional algorithms. v0.1 conformant implementations MUST support Ed25519 and MAY accept any other algorithm if and only if a future minor version explicitly introduces it.
2. Per-node keys
Each node holds exactly one Ed25519 signing key for the duration
of its lifetime. Key rotation in v0.1 is performed by issuing a
new node identity (a new NodeId and key pair) and delegating
authority to it via DelegateUcan; the previous identity may
then be revoked.
The mapping NodeId -> Ed25519 public key is established by the
node's first DelegateUcan op observed on the log. This op MUST
embed a UCAN whose iss field is the issuer's DID and whose
sub (or equivalent) names the NodeId and carries the public
key. After this binding op is observed, every subsequent op
authored by that NodeId MUST be verified against the bound
key.
3. Detached JWS envelope for op signatures
The wire-level value of an operation's signature field is a
detached JWS as defined by RFC 7515 §A.5: the header and
signature segments are present, the payload segment is empty.
The detached JWS is encoded as the UTF-8 bytes of the string:
BASE64URL(header) "." "." BASE64URL(signature)
Where:
- header is the JSON Object Serialisation of {"alg":"EdDSA","kid":"node-<node_id>"}, with field order as written, no insignificant whitespace, ASCII encoding. The kid value is the literal prefix "node-" followed by the node's NodeId rendered as decimal digits (since NodeId is a 64-bit integer in v0.1; see Wire Format).
- signature is the 64-byte Ed25519 signature over the canonical signing form defined in Section 4.
The Vec<u8> carried in the operation's signature field is
the UTF-8 byte sequence of the above string. Implementations
MUST NOT include line breaks or trailing whitespace.
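Constructing the envelope is mechanical once the raw signature exists. A minimal sketch (the signing itself happens elsewhere; function names are the author's own):

```python
import base64

def b64url(data: bytes) -> str:
    # base64url without padding, as JWS requires.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def detached_jws(node_id: int, raw_sig: bytes) -> bytes:
    """Build the detached form: header, empty payload segment, signature."""
    header = ('{"alg":"EdDSA","kid":"node-%d"}' % node_id).encode("ascii")
    return (b64url(header) + ".." + b64url(raw_sig)).encode("utf-8")
```

The empty middle segment (the consecutive dots) is what makes the JWS detached: the payload bytes are the canonical op encoding, carried separately rather than inside the token.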
4. Canonical signing form
To sign or verify an operation:
- Construct the operation per the operation envelope.
- Set
signature = None. - Encode the operation per Wire Format.
Call the result
op_bytes. - Sign or verify
op_bytesusing the Ed25519 key bound to the operation'snode_id.
When signing, the resulting 64-byte signature is wrapped per
Section 3 and stored as Some(...) in the signature field
before transmission.
When verifying, the receiver unwraps the detached JWS, recovers
the 64-byte raw signature, and verifies it against op_bytes
constructed from the received op (with its signature cleared
to None) using the public key bound to the op's node_id.
A receiver MUST reject any operation whose verification fails, unless the op is a sanitised op admitted by Section 6.
5. Implementation note: round-tripping the signature field
The most common implementation error in this area is mishandling
the round-trip: implementations sign an op with the field set
to None, transmit it with the field set to Some(jws), and
then attempt to verify by re-signing the received op as-is —
yielding a signature over different bytes. Implementations MUST
explicitly clear the signature field before computing the
canonical encoding for verification, and MUST treat that step
as the canonical procedure regardless of how the op is held in
memory.
A reference test vector for cross-implementation interop is expected to ship with v0.1.1, alongside the public release of the reference implementation. Until both exist, implementers cannot fully validate signature canonicalisation against an authoritative source; the procedure in this section and the field-ordering rules in Wire Format are what to follow in the meantime. See Implementations for status.
6. Sanitised operations
When an outbound op crosses a delegation that requires
sanitisation (per the sanitize caveat described in
UCAN and Caveats), the sanitiser
modifies the op's payload by stripping or redacting the affected
fields. Because the resulting op no longer matches the bytes
the original signature was computed over, the signature would
no longer verify. Therefore the sanitiser MUST clear the
signature field on the sanitised op (set it to None) and
record the sanitisation in a marker field (specified in
UCAN and Caveats).
The receiver MUST NOT attempt signature verification on a sanitised op. It MUST verify that the sanitisation marker is consistent with a delegation that authorised the sender to apply it; this is the receive-side procedure specified in UCAN and Caveats.
An op that arrives without a signature and without a sanitisation marker MUST be rejected. The two conditions are the only legitimate paths to an unsigned op on the wire (and even the bootstrap path described in Section 7 produces a signed op).
7. Bootstrap: the first op a node authors
A node that has not yet been seen on the log presents a chicken-and-egg problem: the receiver does not know the node's public key, so cannot verify the op that establishes the binding.
The protocol resolves this by requiring that a node's first
authored op be a DelegateUcan carrying a UCAN that:
- Is signed by the issuer's DID key (not the node's).
- Embeds the node's public key in the UCAN's subject claim.
- Is itself well-formed and verifiable against the issuer's DID.
The receiver:
- Recognises that the authoring NodeId is unknown.
- Decodes the embedded UCAN.
- Verifies the UCAN's signature against the issuer's DID.
- If valid, extracts the embedded public key and binds it to the authoring NodeId.
- Verifies the op's own signature using the freshly-bound key.
If any step fails, the op is rejected. After this op is applied, the node's identity is known and subsequent ops authored by it follow the standard signing rules.
8. Bearer tokens for HTTP authentication
The Authorization: Bearer header on GET /ops and POST /ops
identifies the requesting node to the server. A bearer token is
a short-lived signed assertion of node identity, structured as
follows:
BASE64URL(header) "." BASE64URL(payload) "." BASE64URL(signature)
(The standard JWS Compact Serialisation, this time not detached.)
Header:
{"alg":"EdDSA","kid":"node-<node_id>"}
Payload (JSON):
{
"iss": "<NodeId>",
"aud": "<recipient NodeId or origin>",
"iat": <unix-seconds>,
"exp": <unix-seconds>,
"nonce": "<random>"
}
Tokens MUST have an expiry (exp) no more than one hour in
the future. Recipients MUST reject tokens that are expired,
that present an iss not bound to a valid public key on the
log, or that reuse a nonce already seen for the same iss
within the validity window.
Token issuance is per-request: a node generates a fresh token for each peer, signs it, and presents it. There is no central issuer. A future revision may add a refresh-token mechanism; v0.1 implementations issue one-shot tokens.
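The payload rules — bounded expiry and nonce replay rejection — can be sketched as below. This is illustrative only: the token signing and the iss-to-key binding check are elided, and the helper names and default TTL are assumptions.

```python
import time
import secrets

MAX_TTL_S = 3600  # tokens MUST expire within one hour

def make_payload(iss: int, aud: str, ttl_s: int = 300) -> dict:
    """Fresh one-shot token payload per §8."""
    now = int(time.time())
    return {"iss": str(iss), "aud": aud, "iat": now,
            "exp": now + min(ttl_s, MAX_TTL_S),
            "nonce": secrets.token_urlsafe(16)}

def accept(payload: dict, now: int, seen: set) -> bool:
    """Recipient-side checks: expiry, expiry bound, nonce replay.
    (Signature and iss-to-key binding checks are elided.)"""
    if payload["exp"] <= now:
        return False
    if payload["exp"] - payload["iat"] > MAX_TTL_S:
        return False
    key = (payload["iss"], payload["nonce"])
    if key in seen:
        return False  # nonce reuse for the same iss within the window
    seen.add(key)
    return True
```

Tracking nonces per (iss, nonce) pair scopes replay rejection to the issuing node, matching the "already seen for the same iss" rule above.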
9. Verifying authority
A signature establishes that the authoring NodeId produced the
op. It does not establish that the NodeId was authorised to
produce it. Authorisation is a separate check performed against
the UCAN delegation graph — see
UCAN and Caveats and
Capabilities. Both checks are required;
either failure rejects the op.
UCAN and Caveats
This chapter specifies how authority is delegated, attenuated, and revoked, and the caveat vocabulary that narrows a delegation. The protocol uses User-Controlled Authorization Networks (UCAN) as the underlying delegation primitive, and extends UCAN's policy slot with a domain-specific caveat set.
1. UCAN version
v0.1 of this specification builds on UCAN v0.10, the last JWT-shaped revision of the UCAN format. A UCAN token is a detached JWS over a JSON payload with the standard fields:
| Field | Purpose |
|---|---|
| `iss` | Issuer DID. The party delegating authority. |
| `aud` | Audience DID. The party receiving authority. |
| `att` | Attestation array. Each entry is a (resource, action, caveats) capability. |
| `nbf` | Not-before time (Unix seconds). |
| `exp` | Expiry time (Unix seconds). |
| `prf` | Proof chain. Array of parent UCAN content hashes. |
The full UCAN v0.10 format is specified externally; the canonical reference is the UCAN working group repository.
1.1 v1.0 migration
The UCAN working group has moved on to v1.0, which uses a DAG-CBOR plus Varsig envelope and CIDv1 references. Likewise's v0.1 specifies v0.10 because that is what the reference implementation uses. The v0.10 → v1.0 migration is a known open issue and is expected to land as part of the next major version.
2. Capabilities in v0.10's att field
Every entry in a UCAN's att array is a Likewise
capability. The protocol places its capability schema directly
in the UCAN policy slot:
{
"resource": "<Resource enum value>",
"action": "<Action enum value>",
"caveats": { ... }
}
The set of legal resource and action values, and the legal
caveats schema, are specified in
Capabilities. This chapter covers how
delegations are linked, attenuated, and revoked; the next
chapter covers what they can authorise.
3. The delegation graph
A capability flows through the mesh as a chain of UCAN delegations rooted at the user. The user issues their root delegation to one or more nodes (typically the phone), authorising those nodes to author further delegations.
When a delegation D_b cites a parent D_a in its prf
array, the receiving node MUST:
- Resolve `D_a` from the op log (or refuse the delegation if it cannot).
- Verify `D_a` was issued by the DID that `D_b`'s issuer holds delegation under, transitively up to the user.
- Verify `D_b`'s capabilities are an attenuation of `D_a`'s (Section 4).
- Verify the time bounds on `D_b` are within `D_a`'s (`D_b.nbf >= D_a.nbf`, `D_b.exp <= D_a.exp`).
A delegation that fails any of these checks MUST be rejected.
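The time-bound check in the last step is a simple interval containment. A minimal sketch, using a hypothetical `Bounds` struct in place of real UCAN payload fields:

```rust
/// Hypothetical time bounds extracted from a UCAN's nbf/exp fields.
#[derive(Clone, Copy)]
struct Bounds {
    nbf: u64,
    exp: u64,
}

/// §3 check: a child delegation's [nbf, exp] interval must sit
/// inside its parent's.
fn time_bounds_ok(parent: Bounds, child: Bounds) -> bool {
    child.nbf >= parent.nbf && child.exp <= parent.exp
}

fn main() {
    let parent = Bounds { nbf: 100, exp: 1000 };
    assert!(time_bounds_ok(parent, Bounds { nbf: 100, exp: 900 }));
    assert!(!time_bounds_ok(parent, Bounds { nbf: 50, exp: 900 })); // starts too early
}
```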
4. Strict attenuation
A child delegation's capability set MUST be a subset of its
parent's. Attenuation is checked per-(resource, action) pair:
the child MAY include any pair the parent includes (or any pair
strictly narrowed by additional caveats), and MUST NOT include
pairs the parent does not.
For each capability in the child:
- The `(resource, action)` pair MUST appear in the parent (possibly with broader caveats).
- The child's `caveats` MUST be at least as restrictive as the parent's (Section 5).
A delegation that broadens any caveat compared to its parent MUST be rejected by every receiving node, regardless of whether the broadened delegation was signed correctly.
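The pair-level subset check can be sketched as follows. This illustration treats resources and actions as plain strings and ignores the `Ops` wildcard described in Capabilities, which a real check must also account for; caveat narrowing (Section 5) is a separate step.

```rust
use std::collections::HashSet;

/// §4 sketch: every (resource, action) pair in the child must
/// appear in the parent's capability set.
fn is_attenuation(
    parent: &HashSet<(&str, &str)>,
    child: &HashSet<(&str, &str)>,
) -> bool {
    child.is_subset(parent)
}

fn main() {
    let parent = HashSet::from([("Evidence", "Write"), ("Claim", "Write")]);
    let narrower = HashSet::from([("Claim", "Write")]);
    let broader = HashSet::from([("Job", "Claim")]);
    assert!(is_attenuation(&parent, &narrower)); // subset: admitted
    assert!(!is_attenuation(&parent, &broader)); // new pair: rejected
}
```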
5. Caveats
Every caveat is optional, meaning "no restriction along this
axis." A delegation with no caveats authorises the full scope
of the (resource, action) pair (subject to any restrictions
inherited from its parent).
Caveat narrowing rules: a child caveat is at least as restrictive as a parent caveat if and only if every operation that satisfies the child's caveat would also satisfy the parent's.
The v0.1 caveat vocabulary comprises six fields. Future minor versions MAY add caveats; an unknown caveat field MUST be treated as an absolute restriction (a delegation carrying an unknown caveat is admitted, but no operation can satisfy the unknown caveat — effectively granting the empty capability).
5.1 source_types
Restricts the capability to evidence whose source_type
matches one of the listed values.
| Form | Meaning |
|---|---|
| absent | No restriction. |
| `["calendar"]` | Only operations on calendar-source evidence. |
| `["calendar", "contact"]` | Either calendar or contact. |
Narrowing: child's set MUST be a subset of parent's.
5.2 predicates
Restricts the capability to claim operations whose predicate matches one of the listed values.
| Form | Meaning |
|---|---|
| absent | No restriction. |
| `["located_at"]` | Only claim ops with predicate "located_at". |
Narrowing: child's set MUST be a subset of parent's.
5.3 kind_prefix
Restricts the capability to job operations whose kind field
starts with one of the listed prefixes.
| Form | Meaning |
|---|---|
| absent | No restriction. |
| `["cortex.synthesize."]` | Only synthesize-class jobs. |
| `["cortex."]` | Any reverse-DNS-prefixed cortex job kind. |
Narrowing: each child prefix MUST be a prefix of (or equal to) some parent prefix.
5.4 time_range
Restricts the capability to operations whose timestamp.wall_ms
falls within the given range.
| Form | Meaning |
|---|---|
| absent | No restriction. |
| `[start_ms, end_ms]` | Inclusive lower bound, exclusive upper bound. |
Narrowing: child's range MUST be contained within parent's.
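The narrowing rules for §§5.1–5.4 reduce to three small predicates: subset for value sets, prefix extension for kind prefixes, and interval containment for time ranges. A sketch, with strings and tuples standing in for the real caveat types:

```rust
/// Set-valued caveats (source_types, predicates): the child's set
/// must be a subset of the parent's.
fn set_narrows(parent: &[&str], child: &[&str]) -> bool {
    child.iter().all(|c| parent.contains(c))
}

/// kind_prefix: each child prefix must extend (or equal) some
/// parent prefix.
fn prefix_narrows(parent: &[&str], child: &[&str]) -> bool {
    child.iter().all(|c| parent.iter().any(|p| c.starts_with(*p)))
}

/// time_range: the child's [start, end) must be contained in
/// the parent's.
fn range_narrows(parent: (u64, u64), child: (u64, u64)) -> bool {
    child.0 >= parent.0 && child.1 <= parent.1
}

fn main() {
    assert!(set_narrows(&["calendar", "contact"], &["calendar"]));
    assert!(prefix_narrows(&["cortex."], &["cortex.synthesize."]));
    assert!(!range_narrows((0, 100), (0, 200))); // child extends past parent
}
```

An absent caveat in the parent means "no restriction", so any child value narrows it; an absent caveat in the child inherits the parent's restriction unchanged.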
5.5 sanitize
Specifies field-level redactions that MUST be applied to operations crossing this delegation. Sanitisation is unique among caveats in that it does not block an op; it modifies it in flight.
The v0.1 sanitisation rules are:
| Rule | Effect |
|---|---|
| `StripGeo` | Remove latitude, longitude, altitude, and any other geographic coordinates from evidence metadata, claim objects, and artefact bodies. |
| `RedactParticipants` | Replace participant identifiers with anonymised placeholders consistent within the operation but not linkable to the original entities. |
| `TruncateContent(N)` | Truncate any content body to at most N bytes. |
| `StripCustomMetadata` | Remove any custom-metadata fields not specified by the protocol. |
A delegation MAY specify multiple sanitise rules; they are applied in the order listed.
Narrowing: a child delegation's sanitize rule list MUST be a
superset of its parent's (sanitisation strengthens at each hop).
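Because sanitisation strengthens rather than restricts, its narrowing predicate is inverted relative to the other caveats: the child must list everything the parent lists. A sketch, with rule names as plain strings:

```rust
/// §5.5 narrowing: a child's sanitize list must contain every rule
/// the parent lists (sanitisation can only be added, never dropped).
fn sanitize_narrows(parent: &[&str], child: &[&str]) -> bool {
    parent.iter().all(|rule| child.contains(rule))
}

fn main() {
    // child adds a rule: admitted
    assert!(sanitize_narrows(&["StripGeo"], &["StripGeo", "RedactParticipants"]));
    // child drops a rule: rejected
    assert!(!sanitize_narrows(&["StripGeo", "TruncateContent(256)"], &["StripGeo"]));
}
```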
5.5.1 The sanitisation marker
When an op is sanitised on the wire, the sanitiser MUST set the
op's signature field to None (per
Signatures) AND attach a sanitisation
marker. The marker is a payload-internal field whose presence
both:
- tells the receiver the op was deliberately filtered, not corrupted, and
- records the chain of sanitise rules applied (so the receiver can audit that the rules match a delegation the sender held).
The exact wire shape of the marker is specified in Wire Format; the contract here is that the marker is a structurally-required part of any unsigned, deliberately-modified op.
A receiver MUST verify that the marker's claimed sanitise chain is admitted by some delegation the sender holds reaching back to the user. A marker that does not match an authorised chain MUST cause the op to be rejected.
5.6 audit_inference
Requires the delegated node to emit
likewise.inference.snapshot artefacts (see
Inference Audit)
for every model call performed against data covered by this
delegation.
| Form | Meaning |
|---|---|
| absent or `false` | No requirement. The delegated node MAY emit snapshots for its own bookkeeping but is not obliged to. |
| `true` | The delegated node MUST emit a `likewise.inference.snapshot` artefact for every inference call performed against ops admitted by this delegation. |
Narrowing: a child caveat with audit_inference: true is
admissible under any parent (the parent did not require audit;
the child voluntarily promises it). A child caveat with
audit_inference: false is admissible only if the parent also
permits audit-free operation. In other words, audit
requirements strengthen down the chain; they cannot be
relaxed.
This caveat is the mechanism by which the user requires
auditable inference from a delegated party — for example, an
organisation running a Likewise node under a scoped
delegation. When a delegation carries audit_inference: true,
the snapshots that the delegated node emits themselves become
operations on the user's log (subject to the user's read
capability on the artefact ops the delegated node produces),
completing the audit loop across the delegation boundary.
The corresponding invariant is specified in Invariants §I-9.
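The "strengthen, never relax" rule for `audit_inference` can be stated as a two-line predicate. In this sketch `None` models an absent caveat (equivalent to `false`):

```rust
/// §5.6 narrowing: audit requirements strengthen down the chain.
/// A child may volunteer auditing; it may omit it only if the
/// parent also permits audit-free operation.
fn audit_narrows(parent: Option<bool>, child: Option<bool>) -> bool {
    let p = parent.unwrap_or(false);
    let c = child.unwrap_or(false);
    c || !p
}

fn main() {
    assert!(audit_narrows(None, Some(true))); // voluntary promise: admitted
    assert!(!audit_narrows(Some(true), Some(false))); // relaxation: rejected
}
```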
6. Revocation
A RevokeUcan op authored by a delegation's issuer (or by a
node holding write authority over the issuer's DID under a
still-valid parent) retires the delegation. Receiving nodes
MUST:
- Mark the delegation's content hash as revoked in the local UCAN view.
- Recursively mark any delegations whose `prf` cites the revoked one as revoked (transitive cascade).
- Re-evaluate the authorisation of every operation whose authority chain depended on a now-revoked delegation. Such operations are NOT removed from the log, but they MUST NOT be applied to projections.
The on-revoke rebuild is one of the more expensive operations in the protocol; implementations SHOULD batch revocations and defer the rebuild to the next idle window when latency permits.
A revoked delegation cannot be un-revoked. To restore the authority, the issuer issues a new delegation.
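The transitive cascade can be sketched as a fixpoint over the `prf` edges. This illustration uses short strings in place of content hashes and a single-parent map where real UCANs may cite several proofs:

```rust
use std::collections::{HashMap, HashSet};

/// §6 sketch: given prf edges (child -> parent), revoking one
/// delegation marks it and every transitive descendant revoked.
fn revoke_cascade(prf: &HashMap<&str, &str>, root: &str) -> HashSet<String> {
    let mut revoked: HashSet<String> = HashSet::from([root.to_string()]);
    loop {
        let before = revoked.len();
        for (child, parent) in prf {
            if revoked.contains(*parent) {
                revoked.insert(child.to_string());
            }
        }
        if revoked.len() == before {
            break; // fixpoint reached: no new revocations
        }
    }
    revoked
}

fn main() {
    // b cites a, c cites b, d cites an unrelated delegation x
    let prf = HashMap::from([("b", "a"), ("c", "b"), ("d", "x")]);
    let revoked = revoke_cascade(&prf, "a");
    assert!(revoked.contains("c")); // grandchild revoked transitively
    assert!(!revoked.contains("d")); // unrelated chain untouched
}
```

The fixpoint loop is quadratic in the worst case; an implementation batching revocations per the SHOULD above would instead walk a child index from the revoked node outward.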
7. The authorise-and-filter pipeline
When a node receives operations (whether from its own scheduler authoring them locally or from a remote peer), it MUST run the following pipeline before applying them to projections:
- Verify signatures per Signatures.
- Authorise each op against the authoring node's effective capability set — the union of capabilities derived from delegations rooted at the user, restricted by all caveats in the chain. An op is authorised iff its `Action` and `Resource` are admitted and its caveats are satisfied.
- Apply transitive cascades: re-evaluate ops whose authority depended on now-revoked delegations.
- Sanitise outbound ops crossing delegations with `sanitize` caveats.
Steps 1-3 run on receive; step 4 runs on send. The pipeline is specified in detail in Capabilities.
8. The user's root delegation
The mesh is bootstrapped by the user issuing a root UCAN to the first node (typically the user's phone). The root delegation:
- Has the user's DID as `iss`.
- Has the first node's `NodeId`-bound DID as `aud`.
- Carries the maximal capability set (`(*, *)` with no caveats).
- Has no `prf`; it is the chain root.
Subsequent delegations cite the root (or a descendant of it) as their proof. The user holds the keypair backing their DID; an implementation MUST provide the user with a mechanism to authorise root re-issuance and to revoke the existing root.
The protocol does not specify the user-interface for this authorisation; that is implementation-defined. The protocol specifies only the wire format of the resulting UCANs.
Capabilities
This chapter specifies the capability vocabulary used in UCAN
delegations: the set of legal Resource values, the set of
legal Action values, the legal combinations, and the
authorise-and-filter pipeline that uses them.
1. Resources
A Resource names a class of protocol-defined entity that a
capability authorises an action on. The v0.1 resource vocabulary:
| Resource | Covers |
|---|---|
| `Ops` | Universal — any operation, regardless of category. |
| `Evidence` | Evidence operations (IngestEvidence, TombstoneEvidence). |
| `Entity` | Entity operations (CreateEntity, AddEntityAlias, MergeEntities, SplitEntity). |
| `Claim` | Claim operations (CreateClaim, UpdateClaimStatus, UpdateClaimConfidence, SupersedeClaim). |
| `Job` | Job operations (ScheduleJob, ClaimWork, CompleteJob, YieldWork, ExpireWork). |
| `Episode` | Episode operations (CreateEpisode, UpdateEpisode). |
| `Artifact` | Artefact operations (CreateArtifact, EvictArtifact). |
| `Action` | Suggested-action operations (CreateSuggestedAction, UpdateActionStatus). |
| `Mesh` | Mesh-coordination operations (DesignateCoordinator, RouteKind). |
| `UserAssertion` | User-assertion operations (UserAssert). |
| `Registration` | Identity and delegation operations (DelegateUcan, RevokeUcan). |
`Ops` is the universal resource: a capability granted on `Ops` applies to any operation, equivalent to a union of all the specific resources. Implementations MUST honour this equivalence when checking attenuation — a child capability on a specific resource MAY appear under a parent capability on `Ops`.
Future minor versions MAY add resources. Implementations MUST reject capabilities naming an unknown resource.
2. Actions
An Action names what may be done with a resource. The v0.1
action vocabulary:
| Action | Meaning |
|---|---|
| `Read` | The holder may receive operations of the resource class on inbound sync. |
| `Write` | The holder may author operations of the resource class. |
| `Schedule` | The holder may emit ScheduleJob ops (only meaningful with `Resource::Job`). |
| `Claim` | The holder may emit ClaimWork ops. |
| `Complete` | The holder may emit CompleteJob, YieldWork, and ExpireWork ops. |
Read is the gating action for outbound sync filtering: a
peer's GET /ops response MUST only include ops the peer holds
Read for. Write is the gating action for op authoring: a
node MUST NOT successfully apply an op it authored without
holding Write on the relevant resource.
The job-specific actions (Schedule, Claim, Complete) split
job authority into discrete capabilities so that, for example, a
phone can schedule synthesis jobs while only a trusted server
may claim them.
3. Resource × Action matrix
Not every (Resource, Action) combination is meaningful. The
table below summarises which combinations the v0.1 specification
defines. Cells marked — indicate combinations that have no
defined effect (a delegation may include them but they will
authorise nothing useful; an implementation MAY warn but MUST
NOT reject).
| Resource ↓ / Action → | Read | Write | Schedule | Claim | Complete |
|---|---|---|---|---|---|
| `Ops` | ✓ | ✓ | — | — | — |
| `Evidence` | ✓ | ✓ | — | — | — |
| `Entity` | ✓ | ✓ | — | — | — |
| `Claim` | ✓ | ✓ | — | — | — |
| `Job` | ✓ | — | ✓ | ✓ | ✓ |
| `Episode` | ✓ | ✓ | — | — | — |
| `Artifact` | ✓ | ✓ | — | — | — |
| `Action` | ✓ | ✓ | — | — | — |
| `Mesh` | ✓ | ✓ | — | — | — |
| `UserAssertion` | ✓ | ✓ | — | — | — |
| `Registration` | ✓ | ✓ | — | — | — |
The `Mesh` resource grants authority over RouteKind and DesignateCoordinator. These are owner-only ops: the protocol requires that the authoring node hold `Mesh.Write` AND be the mesh owner (i.e. the holder of the root delegation chain). Receiving nodes MUST reject any RouteKind or DesignateCoordinator op authored by a node that holds `Mesh.Write` only via a non-root delegation.
4. Caveat applicability
The six caveats specified in UCAN and Caveats apply to capabilities as follows:
| Caveat | Applies to | Effect |
|---|---|---|
| `source_types` | Capabilities on `Evidence` and on `Ops` | Restricts which evidence's `source_type` the holder may read or write. |
| `predicates` | Capabilities on `Claim` and on `Ops` | Restricts which claim predicates the holder may read or write. |
| `kind_prefix` | Capabilities on `Job` | Restricts which job kinds the holder may schedule, claim, or complete. |
| `time_range` | Any capability | Restricts the timestamp range of operations the capability admits. |
| `sanitize` | Any capability with `Read` | Specifies sanitisation applied to operations crossing the delegation outbound. |
| `audit_inference` | Capabilities on `Job` (Claim or Complete), `Claim` (Write), `Artifact` (Write), or `Ops` | When `true`, requires the delegated node to emit `likewise.inference.snapshot` artefacts for every model call performed against data covered by this delegation. |
A caveat applied to a resource it does not narrow has no
effect: a kind_prefix on a capability over Evidence does
not restrict anything, because Evidence ops do not have a
kind field. Such caveats MAY be present (they do not invalidate
the delegation) but they do not authorise additional behaviour.
5. The authorise-and-filter pipeline
This section specifies the procedure a node runs when ingesting an operation, whether locally authored or received over the wire. It MUST be applied in the order specified.
5.1 On receive (inbound)
For each incoming op:
1. Reject malformed. If the op fails wire-format validation (per Wire Format), reject it.
2. Verify signature (or skip for sanitised ops; see step 6). If the op carries a `signature`, verify it per Signatures. If verification fails, reject it.
3. Resolve authority. Identify the authoring `NodeId` and walk the chain of UCAN delegations from that `NodeId`'s bound DID to the user's root. If no such chain exists, reject the op.
4. Check active validity. Reject the op if any delegation in its authority chain is revoked, not yet active (`nbf` in the future), or expired (`exp` in the past). The check uses the op's `timestamp.wall_ms` for `nbf`/`exp` comparisons, not the receiver's local wall clock.
5. Authorise. The op's `(Resource, Action)` MUST appear in the effective capability set derived from the chain (the intersection of caveats along the chain). The op's payload MUST satisfy every caveat: source-type checks for evidence ops, predicate checks for claim ops, kind-prefix checks for job ops, time-range checks against the op's timestamp.
6. Verify sanitisation marker (for unsigned ops only). The marker MUST identify a sanitise rule chain admitted by some delegation the authoring node holds reaching to the user. If verification fails, reject the op.
7. Apply. The op is authorised and authentic; the implementation may now apply it to projections.
A rejected op is dropped from the apply pipeline. Implementations SHOULD log rejections; they MUST NOT silently apply rejected ops or partially apply them.
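The order of the checks is normative, so a useful shape for an implementation is a single early-return function. A minimal sketch, with stub booleans standing in for the real wire-format, signature, authority, and caveat machinery:

```rust
/// Stub op: each flag stands in for the outcome of one real check.
struct Op {
    signed: bool,
    well_formed: bool,
    chain_ok: bool,
    active: bool,
    authorised: bool,
    marker_ok: bool,
}

/// §5.1 ordering sketch. Returns true iff the op reaches step 7.
fn admit(op: &Op) -> bool {
    if !op.well_formed {
        return false; // 1. malformed
    }
    if op.signed { /* 2. verify the signature here */ }
    if !op.chain_ok {
        return false; // 3. no authority chain to the user
    }
    if !op.active {
        return false; // 4. revoked / nbf / exp
    }
    if !op.authorised {
        return false; // 5. (Resource, Action) or caveat failure
    }
    if !op.signed && !op.marker_ok {
        return false; // 6. unsigned op without a valid marker
    }
    true // 7. apply to projections
}

fn main() {
    let ok = Op {
        signed: true,
        well_formed: true,
        chain_ok: true,
        active: true,
        authorised: true,
        marker_ok: false,
    };
    assert!(admit(&ok));
    assert!(!admit(&Op { signed: false, ..ok })); // unsigned, no marker
}
```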
5.2 On send (outbound)
When a node responds to a GET /ops request, it MUST filter the
candidate ops by the requester's effective capability set
before serialising them onto the wire:
1. Authorise. For each candidate op, evaluate whether the requester is authorised to read it (the equivalent of the inbound check, against the requester's chain). If not, exclude the op from the response.
2. Sanitise. For each remaining op, if the requester's delegation chain carries `sanitize` rules, apply the rule chain to a clone of the op:
   - Apply each rule's redactions in order.
   - Set the cloned op's `signature` to `None`.
   - Attach the sanitisation marker recording the rule chain.
   - Use the cloned, sanitised op as the response value.
The sanitisation step happens server-side; the requester receives only the sanitised op and cannot recover the redacted fields. This is the only authorised way for an unsigned op to appear on the wire.
5.3 On transitive revocation
When a RevokeUcan op is applied, every previously-applied op
whose authority chain depended on the revoked delegation MUST
be re-evaluated:
- Walk the projection's index of applied ops by chain.
- For each affected op, re-run the authorise pipeline as if the op had just been received.
- Ops that no longer authorise MUST be removed from the projections (the underlying op log entry is preserved).
This is the operation that gives revocation real teeth: an op that was admitted under a delegation no longer trusted is no longer trusted, retroactively.
6. Capability composition
The user's root delegation is (Ops, *) with no caveats —
maximal authority. Subsequent delegations narrow this. A node
in practice typically holds:
- `(Ops, Read)` with sanitisation caveats — to receive most ops with privacy filtering.
- `(Evidence, Write)` with `source_types` caveat — to ingest evidence from a specific connector.
- `(Job, Schedule)` with `kind_prefix` caveat — to schedule inference work in a specific class.
- `(Job, Claim)` and `(Job, Complete)` with the matching `kind_prefix` — to actually do the work.
- `(UserAssertion, Write)` — to forward user feedback.
A device-specific delegation typically composes several of these into a single UCAN; the implementation builds a node's effective capability set by unioning the granted capabilities across that node's delegations.
7. Reserved combinations
The protocol reserves the following capability behaviours for future minor versions; v0.1 implementations MUST NOT issue or accept delegations using them:
- Capabilities on a resource type introduced in a future version that the receiving node does not understand.
- Caveat fields not in the v0.1 vocabulary.
A delegation containing a reserved combination MUST be rejected by a v0.1 conformant node.
Projections
A projection is a materialised read view derived from the operation log. This chapter specifies the projection contract: what each projection MUST be able to answer, what relationships between projections are load-bearing, and what implementations are free to optimise.
This chapter makes explicit what v0.1 implementations had previously handled implicitly: which projection-related behaviours an implementation MUST provide and which it MAY choose internally.
1. The disposability invariant
A conformant implementation's projections MUST be fully reconstructable from the operation log as it stands at any point in time.
Concretely:
- An implementation MUST be able to rebuild every projection from the log alone. There MUST NOT exist any state in a projection that has no derivation rule from the log.
- An implementation MUST NOT modify projections by any means other than applying operations. UI actions, schedulers, caches, and external systems MUST go through the op-log layer.
- A projection MAY be discarded at any time and rebuilt on demand. An implementation MAY persist projections for performance, but persisting them MUST NOT change the log's authority over their content.
This invariant is the load-bearing reason that the protocol can guarantee the user owns their derived data: nothing the system believes about the user lives outside the log.
2. The three substrate projections
The protocol defines three projections by the queries they answer, not by their storage strategy. An implementation MUST provide each of the three query surfaces; it MAY combine the underlying storage as it sees fit, provided each surface remains queryable as specified.
A fourth projection, salience, is used by the reference implementation to surface records to a user. It is an application-layer convention rather than substrate, and is specified in Annex: Application Conventions §A.3. A node that does not surface records to a user — for example, an organisation's node consuming a scoped slice — has no need to implement it.
2.1 Inference projection
Purpose. Assemble a model context window for an inference call.
Required queries.
- Window context. Given a time window, return the evidence, claims, entities, and episodes a model should receive as context for synthesising over that window.
- Per-entity context. Given an entity, return the relevant claim stack and supporting evidence for an inference call centred on that entity.
Constraints.
- The inference projection MUST be designed for assembly into prompt-shaped data structures, not for UI rendering.
- It MUST track which claims and evidence are framing tags versus narrative content. Implementations MUST be able to produce both shapes.
- It SHOULD be cheap to update incrementally as new ops arrive, because it is consulted on every inference call.
The inference projection's content is not the prompt itself — the prompt is constructed by the implementation's inference pipeline. The projection's responsibility is to provide correct context; the pipeline's responsibility is to assemble it.
2.2 Detail projection
Purpose. Answer per-id user-interface lookups.
Required queries.
- Get by id. Given an `EntityId`, `EpisodeId`, `ClaimId`, `ActionId`, or `EvidenceId`, return all user-visible fields for that record: title, label, claim text, status, confidence, provenance links, supporting evidence summary.
- List by predicate. Given an entity and a predicate, return the current claims of that predicate on that entity (with effective status applied).
- Provenance trace. Given any derived record, return the chain of supporting operations transitively to evidence.
Constraints.
- The detail projection MUST be durable (typically on-disk). Every conformant node carries it, regardless of whether the node is rendering a UI.
- It MUST rebuild from the log when missing or corrupted.
- It MUST honour user assertions: a `Reject` user-assertion MUST cause subsequent reads of the affected claim to return the rejected status, regardless of underlying derivation state.
- Reads MUST be keyed direct lookups. Implementations MAY add secondary indices for richer queries; the v0.1 spec does not require them.
The detail projection is the only projection a v0.1 conformant node MUST persist across restarts.
2.3 Debug-graph projection
Purpose. Full-graph inspection for tooling and verification.
Required queries.
- Full graph dump. Return all entities, claims, evidence references, and edges between them as of the current log prefix.
- Cycle detection. Identify cycles in the derivation graph (which MUST NOT exist).
Constraints.
- The debug-graph projection is OPTIONAL in production deployments.
- An implementation that includes it SHOULD make it available via inspection tooling (a CLI, an admin endpoint).
The protocol provides this projection because being able to ask "show me the entire graph" is a load-bearing debugging capability for a system the user is asked to trust.
3. The non-collapse rule
An implementation MAY combine the underlying storage of multiple projections — for example, keeping a single SQLite file with separate tables for the detail and inference projections — but it MUST NOT fold the read interfaces such that one projection's query semantics contaminate another.
In particular:
- An inference-context query MUST NOT return UI-shaped detail records.
- A detail-by-id query MUST NOT carry inference-window framing tags as if they were claim content.
- An implementation that adds an application-layer projection (such as the salience projection in Annex §A.3) MUST NOT fold its read interface into a substrate projection.
The reason for the rule is observable: collapsing produces a single fat object that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementations that have tried it have re-discovered why the distinction exists; the spec encodes the lesson normatively.
4. Rebuild from log
Every conformant implementation MUST provide a rebuild operation that:
- Drops or otherwise invalidates the current state of all projections.
- Replays the entire op log in HLC total order, applying each op to all three substrate projections (and to any application-layer projections the implementation maintains).
- Reaches a steady state in which subsequent op application continues normally.
Rebuild is the recovery mechanism for projection corruption and the verification mechanism for new implementations. Rebuilding from a known log and comparing the result to a trusted reference output is intended to be the strongest test of projection correctness; once the reference implementation (see Implementations) is public, its rebuilds will serve as that reference. Rebuild SHOULD be deterministic up to algorithm-internal choices: the same implementation must produce the same projection from the same log every time, and two implementations that satisfy this chapter's contract must agree on every fact derivable from the log even if they internally choose different ranking or scoring strategies in their application-layer projections.
5. Per-projection authority
When two projections produce conflicting answers about the same fact, the detail projection wins for user-visible display, and the inference projection wins for model context. The debug-graph projection — and any application-layer projection an implementation chooses to maintain — MUST NOT supply authoritative answers about the user's data.
If the detail and inference projections disagree about a user-visible field, the implementation has a bug: the spec obliges them to agree on every fact derivable from the log.
6. The frontier and projection state
A node's causal frontier (per Sync) is itself a
projection — it summarises the maximum HLC seen per author,
which is computable from the op log alone. Implementations MUST
maintain the frontier consistently with the log: after applying
op O to projections, the frontier MUST reflect O's
timestamp.
A node SHOULD persist the frontier to avoid re-scanning the entire log on startup, but a node that does not persist it MUST recompute it correctly on the first sync exchange.
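Recomputing the frontier is a single scan of the log, taking the maximum HLC per author. A sketch, with a simplified two-field `Hlc` (the protocol's HLC also carries a node component) and `(author, hlc)` tuples standing in for full ops:

```rust
use std::collections::HashMap;

/// Simplified HLC: derived lexicographic ordering on
/// (wall_ms, logical) matches HLC comparison semantics.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Hlc {
    wall_ms: u64,
    logical: u32,
}

/// §6 sketch: recompute the causal frontier (max HLC per author)
/// from the log alone.
fn frontier(log: &[(&str, Hlc)]) -> HashMap<String, Hlc> {
    let mut f: HashMap<String, Hlc> = HashMap::new();
    for (author, hlc) in log {
        f.entry(author.to_string())
            .and_modify(|cur| {
                if *hlc > *cur {
                    *cur = *hlc; // later op from this author: advance
                }
            })
            .or_insert(*hlc);
    }
    f
}

fn main() {
    let log = [
        ("phone", Hlc { wall_ms: 10, logical: 0 }),
        ("laptop", Hlc { wall_ms: 12, logical: 1 }),
        ("phone", Hlc { wall_ms: 12, logical: 0 }),
    ];
    let f = frontier(&log);
    assert_eq!(f["phone"], Hlc { wall_ms: 12, logical: 0 });
}
```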
7. Capability filtering and projection
Operations rejected by the authorise-and-filter pipeline (per Capabilities) MUST NOT enter projections. The integrity of "everything in projections is authorised" is load-bearing for any reasoning about projections under capability evolution: when a delegation is revoked, the re-evaluation step (per Capabilities) MUST remove from projections any record whose authority chain no longer authorises it, even though the underlying log entry is preserved.
8. Informative: storage strategies
Informative section. Does not impose requirements.
A reference-implementation deployment uses:
- An in-memory window-segmented structure for the inference projection.
- A SQLite database with a per-id table for the detail projection.
- A separate `petgraph::StableGraph` for the debug-graph projection (rebuilt on demand rather than maintained).
- An in-memory hash map for the application-layer salience projection's scores (see Annex §A.3).
Other deployments might combine them differently. The contract above is the only thing the protocol requires.
Invariants
The invariants in this chapter are the non-negotiable rules of the protocol. Every other normative statement in the specification exists to make one of these rules implementable; an implementation that violates any of them is non-conformant regardless of which other sections it satisfies.
This chapter is the canonical, formal version of the rules introduced informally in Concepts and Motivation.
I-1. Log canonicality
Only operations on the log mutate canonical state. No other mutation source is admissible. Any state held by an implementation that is not derivable from the op log is, by definition, not part of the user's knowledge graph.
Concretely:
- A projection MUST NOT carry a field that has no derivation rule from the log.
- An external integration (a UI, a scheduler, a third-party bridge) MUST mediate every mutation through the op-log layer.
- Any apparent state — a notification badge, a cached thumbnail — that lives outside the log is a presentation artefact, not a fact.
I-2. Projection disposability
All projections are reconstructable from the op log. No projection's content may be load-bearing in a way that prevents rebuild.
Implementations MUST be able to drop any projection and rebuild it from the log alone. The rebuild is specified in Projections.
I-3. Transitive provenance to evidence
Every user-visible claim, episode, and suggested action has a chain back to evidence.
Concretely:
- A claim's `provenance` field references the supporting evidence and supporting claims.
- An episode's `evidence_ids`, `claim_ids`, and `entity_ids` fields are populated.
- A suggested action's `supporting_claims`, `supporting_evidence`, and `derivation_job` fields are populated.
For each link in the chain, the referenced record MUST be present on the log (or be a tombstoned record whose absence is itself an op).
A derived record without provenance to evidence is malformed and MUST be rejected.
I-4. Derivation is a DAG
The derivation graph is a directed acyclic graph. A claim MUST NOT (transitively) cite itself. An operation that would introduce a cycle into the derivation graph MUST be rejected.
The entity-resolution graph (which entity merges into which) is not required to be acyclic; cycles in entity resolution are resolved deterministically per Mesh Coordination.
The DAG-ness of derivation is what makes invalidation decidable: a tombstone or rejection cascades forward along outgoing edges in finite steps.
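The acyclicity check that makes rejecting cycle-introducing ops implementable can be sketched with a standard three-colour depth-first search. This is an illustrative Rust sketch using integer record ids, not the reference implementation's code:

```rust
use std::collections::HashMap;

/// Derivation edges: each record id maps to the ids its provenance cites.
/// Returns true if the derivation graph contains a cycle, which per I-4
/// means the op that introduced the closing edge must be rejected.
fn has_derivation_cycle(edges: &HashMap<u32, Vec<u32>>) -> bool {
    // 0 = unvisited, 1 = on the current DFS path, 2 = fully explored
    fn visit(n: u32, edges: &HashMap<u32, Vec<u32>>, state: &mut HashMap<u32, u8>) -> bool {
        match state.get(&n).copied().unwrap_or(0) {
            1 => return true,  // back edge: a claim transitively cites itself
            2 => return false, // already cleared
            _ => {}
        }
        state.insert(n, 1);
        for &m in edges.get(&n).into_iter().flatten() {
            if visit(m, edges, state) {
                return true;
            }
        }
        state.insert(n, 2);
        false
    }
    let mut state = HashMap::new();
    edges.keys().any(|&n| visit(n, edges, &mut state))
}
```

Because the log is append-only, a freshly created claim that cites only records already on the log can never close a cycle; the check matters wherever an op adds an edge between records that already exist.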
I-5. Sync converges operations, not projections
Two nodes that have applied the same set of operations agree on canonical state. Differences in projection materialisation are permitted — implementations may differ in salience algorithms or indexing strategies — but the underlying truth they project from MUST be the same.
A v0.1 conformance test consists in part of:
- Send the same op log to two implementations.
- Verify their detail projections answer the same get-by-id queries identically.
Implementations whose detail projections disagree on facts derivable from a shared log are non-conformant.
I-6. Per-author HLC monotonicity
No author produces two operations with the same HLC value. Within a single author's stream of authored ops, HLC values MUST be strictly monotonically increasing.
This is enforced by the tick discipline in Clocks. A receiving node that observes two ops with identical `(wall_ms, logical, node)` from the same author MUST treat the condition as an integrity failure — the authoring node violated the protocol — and reject both ops.
The frontier-based sync cursor depends on this invariant.
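A receiver-side check of this invariant can be sketched as follows. The field widths are assumptions for illustration (the Clocks chapter defines the real layout); deriving `Ord` on the struct gives the lexicographic `(wall_ms, logical, node)` comparison the check needs:

```rust
/// Illustrative HLC tuple; field widths are assumptions, not normative.
/// Derived Ord compares fields in declaration order: wall_ms, then
/// logical, then node.
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Hlc {
    wall_ms: u64,
    logical: u32,
    node: u16,
}

/// Per-author receive check: each authored op's HLC must be strictly
/// greater than the previous one from the same author. A duplicate or
/// regression is an integrity failure; per I-6 the offending ops are
/// rejected.
fn check_author_stream(hlcs: &[Hlc]) -> Result<(), usize> {
    for i in 1..hlcs.len() {
        if hlcs[i] <= hlcs[i - 1] {
            return Err(i); // index of the offending op
        }
    }
    Ok(())
}
```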
I-7. Authentic authorship
Every authored operation is signed by its author, or is a deliberately sanitised op admitted by an authority chain.
There are exactly two ways for an op to appear unsigned on the wire:
- The op has been sanitised under a `sanitize` caveat per UCAN and Caveats. Such an op carries a sanitisation marker.
- The op is the bootstrap `DelegateUcan` that establishes a new node's binding, where authentication is provided by the embedded UCAN's own signature rather than by the op envelope's.
An op that arrives unsigned without satisfying one of these conditions MUST be rejected.
I-8. Atomic tombstone cascade
Removing evidence cascades atomically through derived data.
When TombstoneEvidence (or CascadeTombstone) is applied,
every claim, entity merge, episode, suggested action, and
inference snapshot that transitively depended on the tombstoned
evidence MUST be invalidated as part of the same logical apply.
A receiver MUST NOT observe a state in which the evidence is
gone but its dependents remain "live" in projections.
Implementation strategies (single transaction, op-batched apply, idempotent retry) are at the implementation's discretion; the observable atomicity is what the spec requires.
I-9. Inference is recorded (when audit is in force)
A node performing inference under audit MUST emit an
InferenceSnapshot artefact for every model call.
Audit is in force in two cases:
- The node is operating under the user's root delegation. The reference implementation, and any implementation a user runs on their own devices, falls into this category. For such nodes, audit is the default and is normative for v0.1 conformance — an inference call without a corresponding snapshot is a violation regardless of what other invariants the implementation satisfies. This is what makes the user's own personal mesh auditable end-to-end.
- The node is operating under a delegation whose caveats require audit. A user delegating to an organisation's node MAY attach an `audit_inference` caveat (specified in UCAN and Caveats) requiring the delegated node to emit snapshots for inference performed against the delegated data. In this case, the snapshots are themselves visible on the log the user receives back from the delegated node, completing the audit loop across organisational boundaries.
A delegated node operating without an audit caveat is not required by this invariant to record its internal inference. Whatever the delegated node does with the data it received — training, summarisation, classification, recommendation — is governed by the delegation's other caveats and by whatever out-of-band agreements the user and the delegated party have. This is a deliberate scope choice: the protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph.
When audit is in force, every derived claim, every materialised record (including episodes and suggested actions where they exist as application-layer conventions), and every other inference output MUST link to its producing snapshot via the record's provenance fields or via `causal_deps`. The "how did it know?" question then has a literal answer: the chain from the output to its snapshot, and from the snapshot to the evidence and claims in its retrieved context.
The snapshot artefact's required content (model identity, retrieved context, prompt, output, telemetry) is specified in Inference Audit §2.
I-10. Authority is verified per op
An operation is admitted to projections only if its authoring node held the necessary capability at the operation's timestamp.
Authority is verified by walking the chain of UCAN delegations from the authoring node to the user's root, at the operation's HLC timestamp. Operations whose chain is incomplete, expired, not yet active, or revoked MUST NOT enter projections.
When a delegation is revoked retroactively (per Capabilities), ops that no longer authorise MUST be removed from projections, even though their op-log entries are preserved.
A note on enforcement
These ten invariants are not aspirational. An implementation that violates any of them produces a system in which the user cannot trust what the system says about them — which is the condition the protocol exists to prevent.
A v0.1 conformance test consists of demonstrating that each invariant holds under a battery of concrete operations and sequences. The seven scenarios planned to ship with the reference implementation (see Implementations) are intended to collectively cover I-1 through I-10. Until those scenarios are public, the path to "behaviourally conformant for v0.1" is to construct equivalent coverage from this chapter's invariants directly.
State Machines
This chapter specifies the lifecycle state machines for the substrate record types whose transitions are non-trivial: claims, jobs, and node registrations. The job FSM was already specified in Mesh Coordination; this chapter restates it for completeness.
The Episode and Suggested-action FSMs used by the reference implementation are application-layer conventions and live in Annex: Application Conventions.
For each FSM:
- States are listed with a brief description.
- Transitions are listed with the operation that causes them and any normative constraints.
- Cascading effects on other records are specified.
1. Claim FSM
A claim's lifecycle is the most consequential FSM in the protocol because user-visible recommendations depend on whether the underlying claims are believed.
1.1 States
| State | Meaning |
|---|---|
| `Hint` | Initial low-confidence guess. Not surfaced to the user. |
| `Claim` | The system is operating on this as a working belief. |
| `Fact` | User-confirmed. Frozen against subsequent automatic invalidation. |
| `Disputed` | The system has conflicting claims about the same subject and predicate; surfacing requires resolution. |
| `Rejected` | The user (or downstream evidence) has invalidated this claim. |
| `Superseded` | Replaced by a newer claim. |
| `Stale` | A supporting source was invalidated; the claim's evidential basis no longer holds. |
1.2 Transitions
| From | To | Cause | Constraint |
|---|---|---|---|
| (creation) | `Hint` or `Claim` | `CreateClaim` op | Initial status. |
| `Hint` | `Claim` | `UpdateClaimStatus` | Confidence threshold passed. |
| `Claim` | `Fact` | `UserAssert(Confirm)` | User-authored. The claim is frozen — subsequent non-user `UpdateClaimStatus` ops MUST NOT change its status. |
| `Claim` | `Rejected` | `UserAssert(Reject)` | Triggers the cascade in Section 1.3. |
| `Claim` | `Disputed` | `UpdateClaimStatus` | Used when a conflicting claim of the same subject and predicate exists. |
| any non-`Fact` | `Superseded` | `SupersedeClaim` | The replacement claim's `claim_id` MUST already be on the log. |
| any non-`Fact` | `Stale` | derivation cascade | Triggered by `TombstoneEvidence` of a supporting evidence record or `Reject` of a supporting claim. |
1.3 Cascade on rejection
When a claim transitions to Rejected via a user assertion, the
implementation MUST traverse the derivation DAG forward and
invalidate every record that transitively depended on the
rejected claim:
- Dependent claims transition to `Stale`.
- Dependent episodes transition to `Stale` (Section 2).
- Dependent suggested actions transition to `Rejected` (Section 3).
- Dependent inference snapshots are tagged stale; their artefacts MAY be evicted on the next eviction pass.
The cascade is part of the same logical op apply (per I-8 in Invariants).
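The forward invalidation walk can be sketched as a breadth-first traversal over a reverse-dependency index. This is an illustrative Rust sketch with integer record ids; maintaining the index itself is the projection layer's job:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// `dependents` maps each record id to the ids of records that cite it
/// (the reverse of the derivation DAG's edges). Returns every record
/// transitively downstream of `rejected`: the set that must be
/// invalidated within the same logical apply, per I-8.
fn cascade_targets(rejected: u32, dependents: &HashMap<u32, Vec<u32>>) -> HashSet<u32> {
    let mut out = HashSet::new();
    let mut queue = VecDeque::from([rejected]);
    while let Some(n) = queue.pop_front() {
        for &d in dependents.get(&n).into_iter().flatten() {
            if out.insert(d) {
                queue.push_back(d); // first visit: recurse into its dependents
            }
        }
    }
    out
}
```

Termination is guaranteed by I-4: the derivation graph is acyclic, so the walk completes in finitely many steps.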
1.4 Frozen-fact immunity
A claim with status Fact is frozen. The following
constraints apply:
- A non-user-authored op (`UpdateClaimStatus`, `UpdateClaimConfidence`, `SupersedeClaim`) targeting a frozen claim MUST be rejected.
- A `TombstoneEvidence` op MAY cascade into a frozen claim (Section 1.3) only if the op is itself authored by a node with user-assertion authority. Otherwise the cascade stops at the frozen claim's boundary.
- A `UserAssert(Reject)` op MAY override frozen state — the user retains the authority to change their mind.
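The three constraints above reduce to an admission predicate. A hedged sketch in Rust; the enum names are illustrative classes of op, not the wire-format identifiers:

```rust
/// Op classes relevant to a claim whose status is Fact.
/// Names are illustrative, not the protocol's op identifiers.
enum OpKind {
    UpdateClaimStatus,
    UpdateClaimConfidence,
    SupersedeClaim,
    UserAssertReject,
    TombstoneCascade,
}

/// Returns whether an op targeting a frozen claim is admitted.
fn admits_against_frozen(op: OpKind, author_has_user_assertion: bool) -> bool {
    match op {
        // The user retains the authority to change their mind.
        OpKind::UserAssertReject => true,
        // A tombstone cascade crosses the frozen boundary only when its
        // author holds user-assertion authority.
        OpKind::TombstoneCascade => author_has_user_assertion,
        // All other non-user-authored mutations are rejected.
        OpKind::UpdateClaimStatus
        | OpKind::UpdateClaimConfidence
        | OpKind::SupersedeClaim => false,
    }
}
```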
2. Job FSM
Restated from Mesh Coordination for completeness.
| State | Meaning |
|---|---|
| `Pending` | Scheduled but not claimed. Eligible for claim. |
| `Claimed` | A worker holds the lease. |
| `Completed` | A `CompleteJob` op terminated the job. |
Transitions:
- (creation) → `Pending`: `ScheduleJob`.
- `Pending` → `Claimed`: `ClaimWork`.
- `Claimed` → `Pending`: `YieldWork` or `ExpireWork`.
- `Claimed` → `Completed`: `CompleteJob`.

`Completed` is terminal.
3. Node-registration FSM
A node's lifecycle in a mesh is governed by the UCAN delegation graph rather than by an explicit state field, but the observable states are useful to name.
3.1 States
| State | Meaning |
|---|---|
| `Pending` | The node has authored its bootstrap `DelegateUcan` but the receiving nodes have not yet observed it. |
| `Active` | The bootstrap delegation has been observed; the node may author and receive ops per its capability set. |
| `Suspended` | A delegation in the node's chain is no longer in force (typically due to a parent's `nbf`/`exp` window) but is not revoked. |
| `Revoked` | The node's authority has been retired by `RevokeUcan` of a parent in its chain. |
3.2 Transitions
| From | To | Cause |
|---|---|---|
| (creation) | `Pending` | First op authored by an unknown node |
| `Pending` | `Active` | Bootstrap `DelegateUcan` observed and verified |
| `Active` | `Suspended` | Time-bound delegation expired, but parent still valid |
| `Suspended` | `Active` | Renewed delegation issued |
| `Active` or `Suspended` | `Revoked` | `RevokeUcan` of a parent in the chain |
Revoked is not terminal in the sense that the same node
identity can later be re-admitted by a new delegation chain;
v0.1 implementations MAY treat Revoked as recoverable
provided they re-evaluate the entire authority chain at the
time of re-admission.
A node in Suspended MUST NOT have its newly-authored ops
applied to projections; the ops remain on the log but are
treated as if their authority chain were broken until the
suspension lifts.
A node in Revoked MUST have its previously-applied ops
re-evaluated per
Capabilities.
4. Status precedence under user assertions
Across all FSMs in this chapter, user assertions take precedence over machine-derived state. Concretely:
- A `UserAssert(Confirm)` is a one-way trip toward stronger belief; the affected record is frozen against demotion.
- A `UserAssert(Reject)` is final; the affected record is invalidated and stays invalidated until a subsequent `UserAssert` overrides it.
- A user-authored `UpdateActionStatus` (or analogous op for other record types) takes precedence over any system-authored op of the same shape.
The mechanism by which the receiving node distinguishes
user-authored from system-authored ops is the authoring
node's capability set: a node holding UserAssertion.Write
without restriction (and bound to the user's own root) is
"user-bearing" in the sense the spec needs. The exact
implementation is in
Capabilities.
Mesh Coordination
Part 2 of the specification, first chapter. This chapter is the work-distribution layer of Likewise. It depends on the substrate (Part 1) for op log, sync, signatures, capabilities, and projections; it adds the vocabulary by which multiple nodes cooperate on a single user's work.
The companion chapter Inference Audit covers the second concern of Part 2: how inference calls performed by audited nodes become recoverable artefacts on the log.
An implementation that wants to be a substrate peer — for example, an organisation's node consuming a scoped slice of a user's graph for its own internal purposes — does not need to implement this chapter. It can sync, verify, authorise, and read the log without participating in the work-routing machinery. An implementation that wants to participate in distributed work on a user's behalf — the reference implementation, a server the user runs at home, a delegated organisation node the user has asked to handle inference jobs — does need this chapter.
This chapter specifies how multiple nodes cooperate on a single user's mesh: how work is scheduled and claimed, how the designated coordinator's role differs from a peer's, how the owner routes specific job kinds to specific nodes, and how conflicts between concurrent claims are resolved.
The relevant operations were enumerated in Operations; this chapter specifies their semantics, state-machine effects, and authority requirements. The inference-snapshot artefact format that audited nodes emit when they execute work is specified in Inference Audit.
1. Roles in a mesh
A mesh has the following roles. A single node MAY hold more than one role.
- Owner. The node that holds the user's root UCAN delegation. The owner is the only node authorised to author `DesignateCoordinator` and `RouteKind` ops (per Capabilities). In a typical deployment the user's phone is the owner.
- Coordinator. The node currently designated to run the deterministic derivation pass. There is exactly one coordinator per mesh at a given log prefix. The owner designates the coordinator explicitly; there is no automatic election.
- Worker. Any node with `(Job, Claim)` authority. Workers claim and execute scheduled jobs.
- Peer. A node that is none of the above; it receives the log under whatever caveats apply to its delegation.
These roles are protocol-level. An implementation MAY add finer-grained internal roles (an "ingestion" role, a "surfacing" role); they are out of scope for v0.1.
2. The job state machine
A job is created by a ScheduleJob op and proceeds through the
following states:
Pending ──ClaimWork──► Claimed ──CompleteJob──► Completed
   ▲                      │
   │                      ├──YieldWork──► Pending
   │                      │
   └──────ExpireWork──────┘   (when lease HLC-deadline passes)
Transitions:
- `ScheduleJob`: creates a job in `Pending`.
- `ClaimWork`: a `Pending` job becomes `Claimed`. Authored by the worker; carries `lease_duration_ms`.
- `CompleteJob`: a `Claimed` job becomes `Completed`. Authored by the current claimer.
- `YieldWork`: a `Claimed` job returns to `Pending`. Authored by the current claimer.
- `ExpireWork`: a `Claimed` job whose HLC-relative deadline has passed returns to `Pending`. May be authored by any node, not only the original claimer.
Completed is terminal. A job once completed is not
re-claimed; subsequent ClaimWork ops naming a completed job
MUST be rejected.
3. Lease expiry
A ClaimWork op carries lease_duration_ms and is authored at
HLC timestamp claim_op.timestamp. The lease's effective
deadline is:
deadline_wall_ms = claim_op.timestamp.wall_ms + lease_duration_ms
A job is considered expired at any point where some node's
HLC has wall_ms > deadline_wall_ms. Expiry is measured
against the HLC, not against any node's local wall clock; this
makes expiry robust to clock skew across the mesh (see
Clocks).
Any node MAY emit an ExpireWork op once it observes expiry.
Multiple nodes MAY emit concurrent ExpireWork ops for the
same job; the receiving nodes apply them idempotently.
A claimer that wishes to extend its lease MUST do so by emitting
a fresh ClaimWork op (with a new op_id and current
timestamp) before the previous deadline. There is no separate
lease-renewal op in v0.1.
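The deadline arithmetic above is small enough to state directly. A sketch in Rust, operating only on HLC `wall_ms` components, never on a node's raw wall clock:

```rust
/// Lease deadline per the formula above:
/// deadline_wall_ms = claim_op.timestamp.wall_ms + lease_duration_ms
fn lease_deadline_wall_ms(claim_wall_ms: u64, lease_duration_ms: u64) -> u64 {
    claim_wall_ms + lease_duration_ms
}

/// A job is expired once some node's HLC wall_ms exceeds the deadline.
/// Expiry is strict: wall_ms exactly equal to the deadline is not yet
/// expired, matching the `>` in the definition above.
fn is_expired(observer_hlc_wall_ms: u64, claim_wall_ms: u64, lease_duration_ms: u64) -> bool {
    observer_hlc_wall_ms > lease_deadline_wall_ms(claim_wall_ms, lease_duration_ms)
}
```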
4. Conflicting claims
Two workers may emit ClaimWork ops for the same Pending job
concurrently. The receiving node resolves the conflict by HLC
total order: the ClaimWork with the smaller HLC value is the
winner, and subsequent ClaimWork ops on the same job
while it is Claimed MUST be rejected.
If both ops have an indistinguishable HLC value (which is only possible if the HLC tick discipline is violated; see Clocks), the receiver MUST reject both ops as an integrity failure rather than picking arbitrarily.
A worker whose ClaimWork was rejected SHOULD NOT immediately
re-attempt; it SHOULD wait for the current lease to expire
(after which the job is Pending again) or for a YieldWork
op from the current claimer.
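The receive-side resolution can be sketched as follows. The HLC tuple is illustrative (field widths are assumptions); note that because the node id is part of the HLC, two distinct well-behaved authors can never produce identical values:

```rust
/// Illustrative HLC tuple; derived Ord gives the total order
/// (wall_ms, then logical, then node).
#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
struct Hlc {
    wall_ms: u64,
    logical: u32,
    node: u16,
}

/// Resolve two concurrent ClaimWork ops per §4: the smaller HLC wins.
/// An identical HLC is only possible if the tick discipline was
/// violated, so the receiver rejects both rather than picking.
fn resolve_claim_conflict(a: Hlc, b: Hlc) -> Result<Hlc, &'static str> {
    if a == b {
        Err("integrity failure: identical HLCs, reject both")
    } else {
        Ok(a.min(b))
    }
}
```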
5. RouteKind
A RouteKind op directs all jobs of a given kind to a single
node. While a route is set:
- The owner-authored `RouteKind` is the authoritative target.
- A `ClaimWork` op naming a job whose `kind` is routed MUST be rejected unless the claimer matches the routed target.
- A worker whose `(Job, Claim)` capability admits the kind but who is not the routed target MUST NOT successfully claim routed jobs.
RouteKind ops follow last-write-wins semantics by HLC total
order: the most recent RouteKind for a given kind is in
force. Setting route to None clears the directive; the kind
returns to the default "any eligible worker may claim."
RouteKind is owner-only; per
Capabilities, an op carrying a
RouteKind payload from a non-owner MUST be rejected.
A useful pattern enabled by RouteKind: a phone with no GPU
schedules synthesis jobs and routes them to a server with a
GPU. The phone never claims those jobs because the route
restricts claiming to the server. The same delegation graph
that authorises the server to do the work also authorises it to
read the prompt context.
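The admission rule a receiver applies to an incoming ClaimWork can be sketched as a predicate over an already-LWW-resolved route table. A hedged sketch; the string-keyed table and node-id strings are illustrative:

```rust
use std::collections::HashMap;

/// `routes` maps a job kind to the currently routed node id, with
/// last-write-wins by HLC already resolved and cleared (None) routes
/// removed from the map.
fn claim_admitted(
    routes: &HashMap<String, String>,
    kind: &str,
    claimer: &str,
    has_claim_capability: bool,
) -> bool {
    if !has_claim_capability {
        return false; // (Job, Claim) authority is a precondition
    }
    match routes.get(kind) {
        Some(target) => target.as_str() == claimer, // routed: only the target
        None => true, // unrouted: any eligible worker may claim
    }
}
```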
6. DesignateCoordinator
The coordinator is the node responsible for the deterministic derivation pass: the part of the inference pipeline that two nodes seeing the same op log MUST agree about. This typically includes auto-observation of evidence, entity resolution, and the rhythm pass (see the reference implementation's pipeline documentation for examples).
DesignateCoordinator ops:
- MUST be authored by the mesh owner.
- Take effect at their HLC timestamp.
- May be re-issued at any time to change coordinator. The most recent `DesignateCoordinator` op (by HLC total order) is in force.
There is exactly one coordinator at any HLC timestamp. A node that is not the coordinator MUST NOT author derivation ops that the coordinator would normally author. An op that violates this rule MUST be rejected on receive.
An implementation MAY track coordinator drift — the case where the coordinator has been quiet for an unusually long time — and surface it to the owner so the owner can re-designate. v0.1 does not specify automatic re-designation; the owner remains in control.
7. Causal dependencies between jobs
The causal_deps field on every operation (introduced in
Operations) carries a possibly-empty set of
predecessor OpIds. For job ops, causal_deps is the
mechanism for DAG chaining: a synthesis job can depend on
the completion of a tool-use job by including the tool-use
job's CompleteJob op id in its causal_deps.
A node receiving a job op with non-empty causal_deps MUST:
- Verify each dependency is present on the local log (the op was either authored locally or received from a peer).
- Defer applying the dependent op to its projections until all dependencies are present and applied.
Job dependencies form a DAG. A cycle in the dependency graph is a protocol violation; an implementation that detects one MUST reject the offending op.
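A minimal apply loop honouring the two receive rules might look like the following sketch (integer op ids; a real implementation would index pending ops by missing dependency rather than re-scanning):

```rust
use std::collections::HashSet;

/// Apply ops whose causal_deps are all satisfied, defer the rest, and
/// repeat until no further progress. Ops are (op_id, causal_deps);
/// returns the apply order. Ops whose deps never arrive stay deferred.
fn apply_with_deferral(ops: Vec<(u64, Vec<u64>)>, applied: &mut HashSet<u64>) -> Vec<u64> {
    let mut pending = ops;
    let mut order = Vec::new();
    loop {
        let before = pending.len();
        pending.retain(|(id, deps)| {
            if deps.iter().all(|d| applied.contains(d)) {
                applied.insert(*id);
                order.push(*id);
                false // applied: drop from pending
            } else {
                true // defer until the missing deps are applied
            }
        });
        if pending.len() == before {
            break; // fixpoint: remaining ops await deps not yet received
        }
    }
    order
}
```

Because dependencies form a DAG, this loop reaches a fixpoint; a cycle would leave its members permanently deferred, which is one reason cycle-introducing ops are rejected outright.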
8. The work roster
Implementations maintain a work roster projection that tracks each job's current state, claimer, and lease deadline. The roster is derived from the op log per the rules in this chapter. The protocol does not specify the roster's storage shape; it specifies only the queries the roster must answer:
- "What is the current state of job X?"
- "Which jobs of kind K are currently `Pending` and admitted by any active route?"
- "Which `Claimed` jobs are past their deadline?"
These queries are sufficient to drive the worker loop:
periodically check the roster for Pending jobs the local
node may claim, attempt to claim them, execute, and emit
CompleteJob (or YieldWork).
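One possible roster shape answering exactly these three queries, as a sketch (the protocol deliberately does not fix the storage layout, and the linear scans here stand in for whatever indexes an implementation prefers):

```rust
#[derive(PartialEq)]
enum JobState {
    Pending,
    Claimed { claimer: String, deadline_wall_ms: u64 },
    Completed,
}

/// Jobs as (job_id, kind, state); derived from the op log.
struct Roster {
    jobs: Vec<(u64, String, JobState)>,
}

impl Roster {
    /// "What is the current state of job X?"
    fn state_of(&self, job_id: u64) -> Option<&JobState> {
        self.jobs.iter().find(|(id, _, _)| *id == job_id).map(|(_, _, s)| s)
    }

    /// "Which jobs of kind K are Pending and admitted by any active route?"
    /// `route` is the routed target for this kind, if one is in force.
    fn claimable(&self, kind: &str, local_node: &str, route: Option<&str>) -> Vec<u64> {
        self.jobs
            .iter()
            .filter(|(_, k, s)| k.as_str() == kind && *s == JobState::Pending)
            .filter(|_| route.map_or(true, |t| t == local_node))
            .map(|(id, _, _)| *id)
            .collect()
    }

    /// "Which Claimed jobs are past their deadline?" (HLC wall_ms)
    fn past_deadline(&self, now_hlc_wall_ms: u64) -> Vec<u64> {
        self.jobs
            .iter()
            .filter_map(|(id, _, s)| match s {
                JobState::Claimed { deadline_wall_ms, .. }
                    if now_hlc_wall_ms > *deadline_wall_ms => Some(*id),
                _ => None,
            })
            .collect()
    }
}
```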
9. Worker etiquette
These are SHOULD-level recommendations for worker implementations to keep the mesh healthy:
- A worker SHOULD NOT claim more jobs than it can complete within the lease duration.
- A worker SHOULD emit `YieldWork` if it knows it cannot complete a job (low battery, going to sleep, handler error).
- A worker SHOULD NOT race other workers for `Pending` jobs. After a `ClaimWork` is accepted by some node, other workers SHOULD back off until expiry or yield.
- A worker SHOULD jitter its claim cadence to avoid thundering-herd effects in a mesh with many concurrent workers.
10. Job kinds
The kind field on a job is a typed work-kind string. The
protocol does not constrain the namespace, but recommends
reverse-DNS-prefix style. Examples used by the reference
implementation:
- `cortex.extract.tier1` — per-evidence deterministic extraction.
- `cortex.enrich.tier2` — per-entity interpretive enrichment.
- `cortex.synthesize.window` — per-window episode synthesis.
- `cortex.action.<verb>` — action-execution handlers.
- `cortex.tool.<name>` — tool-use handlers feeding inference.
A job kind is application-defined (the kind tells a handler what to do); the protocol's interest is solely in routing claims by kind. An implementation MAY register handlers for the kinds it recognises and ignore the rest — a node simply never claims jobs whose kind it does not recognise.
11. Inference produced by jobs
When a job's handler invokes a model, the resulting inference call is governed by Inference Audit — specifically, whether the executing node is operating under audit-in-force conditions and, if so, what artefact and link records the call must leave behind.
The relevant interactions with this chapter are:
- The `source_job` field of a `likewise.inference.snapshot` artefact links the snapshot back to the job whose `CompleteJob` op closed it. Implementations correlate the two via the substrate's `causal_deps` mechanism.
- The job's `output_artifacts` field on `CompleteJob` SHOULD include the snapshot's `artifact_id` when the job's handler performed audited inference.
- Job kinds in the reverse-DNS namespace `cortex.synthesize.*`, `cortex.extract.*`, and `cortex.tool.*` are conventionally used for inference work; they are application-defined and not normative for v0.1.
See Inference Audit for the snapshot artefact's full content format, the linking rules from inference outputs back to snapshots, and the snapshot lifecycle.
Inference Audit
Part 2 of the specification, second chapter. This chapter depends on the substrate (Part 1) for op log, capabilities, and the generic artefact mechanism, and on the previous chapter (Mesh Coordination) for job and lease semantics. It specifies the convention by which inference calls become recoverable artefacts on the log.
The Likewise substrate (Part 1) lets a user own the canonical record of facts derived about them. The mesh coordination layer (the previous chapter in Part 2) lets multiple nodes cooperate on the work of producing those derived records. This chapter specifies the third concern of Part 2: how the inference itself — the model calls that produce many of the protocol's derived records — becomes recoverable and auditable.
Audit is what closes the "how did it know?" loop. A
recommendation, a derived claim, a synthesised episode — each
can be traced back, mechanically, to the model call that
produced it, the prompt and context fed to that call, and the
model's literal output. The mechanism is the
likewise.inference.snapshot artefact: a typed artefact emitted
alongside any audited inference call, riding the substrate's
generic artefact machinery.
The chapter is short because the mechanism is small. Audit requires three things:
- a rule about when a node must emit snapshots,
- a content format for the snapshot artefact, and
- a linking convention that ties produced records back to the snapshot.
Each is specified in turn below.
1. When a snapshot must be emitted
A node MUST emit a likewise.inference.snapshot artefact for every
model call it performs in either of the following cases:
- The node is operating under the user's root delegation. The reference implementation, and any node a user runs on their own devices, falls into this category. For such nodes audit is the default; an inference call without a corresponding snapshot is a violation of v0.1 conformance, regardless of what other invariants the implementation satisfies. This is what makes the user's own personal mesh auditable end-to-end.
- The node is operating under a delegation whose `caveats` include `audit_inference: true`. A user delegating to an organisation's node MAY attach this caveat (specified in UCAN and Caveats §5.6) to require the delegated node to emit snapshots for inference performed against the delegated data. The snapshots themselves become operations on the user's log, completing the audit loop across the delegation boundary.
In all other cases — a delegated node operating without an audit caveat — snapshot emission is optional. The node MAY emit snapshots for its own bookkeeping but is not required to. What the node does internally with the data it received is governed by the delegation's other caveats and by whatever out-of-band agreements the user and the delegated party have, not by this chapter.
This split is deliberate. The protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph. Mandating audit universally would be unenforceable across organisational boundaries; making it caveat-controlled gives the user the lever they need without overreaching.
2. The likewise.inference.snapshot artefact
A likewise.inference.snapshot artefact is a CreateArtifact op
(see Operations §8.1) whose
artifact_type is the literal string
"likewise.inference.snapshot". The artefact's content (the
bytes referenced by content_hash, optionally inlined via
content_inline) MUST be a postcard-encoded record with the
following fields, in this order:
| Field | Purpose |
|---|---|
| `model_id` | The identifier of the model used (e.g. `"gemma-4-E2B-Q4_K_M"`). |
| `model_version` | The model-specific version or revision tag. |
| `backend` | The inference backend (`"llama-cpp"`, `"litert-lm"`, ...). |
| `retrieved_context` | The structured set of evidence ids, claim ids, and entity ids that were assembled into the prompt. |
| `prompt` | The literal prompt sent to the model, including system message and user turns. |
| `output` | The model's response, including any structured fields the handler parsed out. |
| `telemetry` | Wall-clock duration, token counts (prompt + completion), latency components if available. |
| `started_at`, `completed_at` | HLC values bracketing the call. |
The artefact's envelope source_job field MUST be set to the
job_id of the job whose handler made the call. The
inputs_used field MUST list every evidence id in
retrieved_context (the substrate's generic-artefact contract
already requires this for any artefact produced from evidence).
Additional implementation-specific fields MAY be present in the encoded record. Future minor versions of this specification MAY add reserved fields; an implementation that does not understand a future field MUST preserve it during round-trip rather than discarding it.
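As a sketch, the record might be declared as follows in Rust. postcard encodes struct fields in declaration order, so the order mirrors the table above; the concrete field types, id representations, and the HLC tuple shape are assumptions for illustration, not normative:

```rust
/// Illustrative snapshot record. Field order matters for postcard;
/// types are assumptions, not the spec's wire types.
struct InferenceSnapshot {
    model_id: String,
    model_version: String,
    backend: String,
    retrieved_context: RetrievedContext,
    prompt: String,
    output: String,
    telemetry: Telemetry,
    started_at: (u64, u32, u16), // assumed HLC (wall_ms, logical, node)
    completed_at: (u64, u32, u16),
}

/// Ids assembled into the prompt; every evidence id here must also
/// appear in the envelope's inputs_used.
struct RetrievedContext {
    evidence_ids: Vec<String>,
    claim_ids: Vec<String>,
    entity_ids: Vec<String>,
}

struct Telemetry {
    duration_ms: u64,
    prompt_tokens: u32,
    completion_tokens: u32,
}
```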
3. Linking from outputs to snapshots
Any record produced by a snapshot-emitting inference call MUST link back to the snapshot. Specifically:
- A derived `CreateClaim` op produced by inference MUST include the snapshot's `artifact_id` in its `provenance` field.
- A `CreateArtifact` op for any non-snapshot artefact produced by the same job (an embedding, a transcript) MUST set its `source_job` to the same job whose snapshot also references it, and SHOULD additionally include the snapshot's `artifact_id` in `causal_deps`.
- For nodes that implement the application-layer conventions in the annex, a `CreateEpisode` op produced by inference MUST include the snapshot's `artifact_id` in its `causal_deps`. A `CreateSuggestedAction` op MUST set its `derivation_job` to the producing job and MUST additionally include the snapshot's `artifact_id` in `causal_deps`.
This is the chain that makes the "how did it know?" question
mechanically answerable. Walking from any audited output to its
snapshot, then from the snapshot to its retrieved_context, is
the literal audit path.
A receiver MUST reject an audited output op whose link to a snapshot is missing or unresolvable on the local log. "Audited output" means: an op whose authoring node is operating under audit-in-force per Section 1, that the spec or a convention identifies as a class for which audit linking is required.
4. Snapshot lifecycle
Snapshot artefacts inherit the substrate's generic artefact
lifecycle (eviction, tombstone-cascade) from
Operations §8.1–§8.2. A
node MAY set ttl_ms on snapshots it emits; once the TTL
elapses, the snapshot is eligible for eviction under storage
pressure.
Eviction is irreversible: once evicted, the snapshot's content
is gone and the audit chain is broken from that point forward
for any output that depended on the evicted snapshot. The
CreateArtifact op remains on the log, so the existence of
the inference call is still recoverable; only the contents
(retrieved context, prompt, output) are lost.
Implementations operating under the user's root delegation SHOULD retain snapshots for at least the lifetime of the records that link to them, treating eviction as a last-resort under storage pressure. The user's own audit trail is among the most load-bearing data in the mesh; evicting it freely defeats the point.
Implementations operating under an audit_inference caveat
SHOULD respect any retention window the delegating user has
expressed in the delegation or in mesh-rules. v0.1 does not
specify a wire-level retention-window field; the user
communicates retention expectations through other means.
Strengthening this is an open issue.
5. Relationship to the audit invariant
The normative consequence of this chapter is captured in Invariants §I-9. That invariant is the binding requirement; this chapter specifies the mechanism by which the invariant is satisfied.
A reader who wants the short version reads I-9. A reader who is implementing the audit layer reads this chapter.
6. What audit does not cover
The audit mechanism specified here is deliberately narrow. It covers:
- Inference performed by a Likewise node, where "inference" means a call to a model that produces user-visible derived records.
It does not cover:
-
Inference performed off-protocol. A delegated party that receives a slice of the user's data and trains an internal model on it has not made a "Likewise inference call" for the purposes of this chapter, regardless of whether the user might wish they had. The protocol's lever for that scenario is the delegation's caveats — the user can refuse to delegate the data — not the audit invariant.
-
Statistical or aggregate inference. A retailer that receives many users' grocery rhythms and computes population statistics has not, under v0.1, performed a per-user inference call. Their internal pipeline is theirs to govern.
-
Auditing the model itself. The snapshot records the model identifier, but the protocol does not specify how to verify that the named model was actually the model that produced the output. Model attestation is a separate concern.
These exclusions are real and worth being explicit about. Audit is an important property the protocol provides; it is not a property the protocol can extend beyond its own boundaries.
Annex: Application Conventions
This annex describes conventions the reference implementation uses to surface the substrate to a user. The material here is non-normative. A v0.1 conformant node MAY implement these conventions, MAY substitute alternatives, or MAY omit them entirely.
The substrate (Part 1) and the inference pipeline (Part 2) are deliberately separable from the question of "how is this surfaced to a user." Different applications will make different choices about that question. The conventions in this annex are one set of choices that ship with the reference implementation; recording them here lets readers understand what the reference implementation is doing without conflating those choices with the protocol's load-bearing parts.
If you are implementing an interoperable node — for example, an organisation's node that synchronises a scoped slice of a user's graph — you can ignore this annex entirely. The substrate and the inference pipeline are sufficient to participate in a Likewise mesh. Applications that depend on the conventions in this annex will simply not find the records they expect, and that is allowed.
A.1 Episode operations
Episodes are temporally-bounded clusters of related evidence, entities, and claims. The reference implementation surfaces them to a user as narrative units: a trip, a project, a relationship arc, a meaningful day. They are not substrate primitives — nothing in the data model or sync protocol requires them.
A.1.1 CreateEpisode
| Field | Purpose |
|---|---|
| episode_id | An EpisodeId. |
| title | Short title. |
| summary | Optional longer description. |
| temporal_start | The episode's start time. |
| temporal_end | Optional end time; absence indicates ongoing. |
| evidence_ids, claim_ids, entity_ids | Supporting records. |
| confidence | Episode-quality score. |
Per-run inference provenance for an episode is carried by an
InferenceSnapshot artefact emitted alongside the
CreateEpisode op (when audit is in force per Part 2).
Implementations correlate episode and snapshot via causal_deps.
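As a concrete sketch, the fields above might map onto a Rust payload struct like the following. The types are illustrative assumptions; the spec's EpisodeId newtype and timestamp encoding are not reproduced here.

```rust
// Non-normative sketch of a CreateEpisode op payload. String ids and
// millisecond timestamps are assumptions for illustration only.
#[derive(Debug)]
struct CreateEpisode {
    episode_id: String,        // EpisodeId
    title: String,             // short title
    summary: Option<String>,   // optional longer description
    temporal_start: u64,       // episode start time (ms)
    temporal_end: Option<u64>, // None indicates an ongoing episode
    evidence_ids: Vec<String>, // supporting records
    claim_ids: Vec<String>,
    entity_ids: Vec<String>,
    confidence: f32,           // episode-quality score
}
```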
A.1.2 UpdateEpisode
| Field | Purpose |
|---|---|
| episode_id | Target episode. |
| title, summary, confidence | Optional updates. |
| status | Optional transition (Active, Stale, Archived). |
| claim_ids_add, evidence_ids_add | Supporting records to add. |
A.1.3 Episode FSM
| State | Meaning |
|---|---|
| Active | The episode is current; surfaceable. |
| Stale | A supporting record was invalidated; episode no longer reflects reality. |
| Archived | The user has set the episode aside. |
Transitions:
| From | To | Cause |
|---|---|---|
| (creation) | Active | CreateEpisode |
| Active | Stale | derivation cascade or UpdateEpisode { status: Stale } |
| Active or Stale | Archived | UpdateEpisode { status: Archived } |
| Active or Stale | (deleted) | TombstoneEvidence cascading to all supporting evidence |
A Reject user assertion targeting an episode transitions it to
Stale and triggers the substrate's derivation cascade (see
Invariants).
A Confirm user assertion targeting an episode MAY freeze it
analogously to claim freezing; the convention does not specify
this further for v0.1.
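The transitions above amount to a small validity check. A non-normative Rust sketch (tombstone-driven deletion happens outside the status enum and is omitted):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EpisodeStatus {
    Active,
    Stale,
    Archived,
}

// Returns whether a status transition is allowed by the convention's FSM.
// Creation lands directly in Active and is not represented here.
fn episode_transition_allowed(from: EpisodeStatus, to: EpisodeStatus) -> bool {
    matches!(
        (from, to),
        (EpisodeStatus::Active, EpisodeStatus::Stale)
            | (EpisodeStatus::Active, EpisodeStatus::Archived)
            | (EpisodeStatus::Stale, EpisodeStatus::Archived)
    )
}
```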
A.2 Suggested-action operations
Suggested actions are recommendations the system surfaces to a user — "send this message," "review this calendar," "reconsider this goal." They are pure UX: the system's outputs as visible to the user, in a refutable, lifecycle-tracked form.
A.2.1 CreateSuggestedAction
| Field | Purpose |
|---|---|
| action_id | An ActionId. |
| title, description | User-facing content. |
| action_type | Short identifier ("set_reminder", "create_album", "draft_email", ...). |
| source_episode | The episode that motivated the action. |
| supporting_claims, supporting_evidence | Provenance. |
| derivation_job | The job that produced the action. Required when audit is in force; suggested actions then trace to inference. |
| confidence | Action-quality score. |
A.2.2 UpdateActionStatus
| Field | Purpose |
|---|---|
| action_id | Target action. |
| new_status | Proposed, Approved, Executing, Completed, Rejected, Dismissed, Failed, Expired. |
| execution_result | Optional details. |
A.2.3 Action FSM
| State | Meaning |
|---|---|
| Proposed | The system has surfaced the suggestion; the user has not yet acted. |
| Approved | The user accepted the suggestion. |
| Executing | A handler is performing the action. |
| Completed | The action finished successfully. |
| Failed | A handler reported failure. |
| Rejected | The user explicitly rejected the suggestion. |
| Dismissed | The user dismissed the suggestion (without rejecting it; it MAY resurface). |
| Expired | A time-window for relevance passed without user action. |
Transitions:
| From | To | Cause |
|---|---|---|
| (creation) | Proposed | CreateSuggestedAction |
| Proposed | Approved | UpdateActionStatus(Approved) (user-authored) |
| Proposed | Rejected | UpdateActionStatus(Rejected) (user-authored) |
| Proposed | Dismissed | UpdateActionStatus(Dismissed) (user-authored) |
| Proposed | Expired | UpdateActionStatus(Expired) (system-authored, when relevance window passes) |
| Approved | Executing | UpdateActionStatus(Executing) |
| Executing | Completed | UpdateActionStatus(Completed) with execution_result |
| Executing | Failed | UpdateActionStatus(Failed) with execution_result |
| Dismissed | Proposed | UpdateActionStatus(Proposed) (the system may resurface a dismissed action with new evidence) |
A Reject user assertion on a suggested action SHOULD prevent
the system from re-proposing the same action shape; the
convention does not specify the exact mechanism for v0.1.
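As with episodes, the transition table reduces to a validity predicate. A non-normative Rust sketch:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActionStatus {
    Proposed,
    Approved,
    Executing,
    Completed,
    Rejected,
    Dismissed,
    Failed,
    Expired,
}

// Returns whether a status transition is allowed by the convention's FSM.
// Authorship constraints (user- vs system-authored) are enforced elsewhere.
fn action_transition_allowed(from: ActionStatus, to: ActionStatus) -> bool {
    matches!(
        (from, to),
        (ActionStatus::Proposed, ActionStatus::Approved)
            | (ActionStatus::Proposed, ActionStatus::Rejected)
            | (ActionStatus::Proposed, ActionStatus::Dismissed)
            | (ActionStatus::Proposed, ActionStatus::Expired)
            | (ActionStatus::Approved, ActionStatus::Executing)
            | (ActionStatus::Executing, ActionStatus::Completed)
            | (ActionStatus::Executing, ActionStatus::Failed)
            // A dismissed action may be re-proposed with new evidence.
            | (ActionStatus::Dismissed, ActionStatus::Proposed)
    )
}
```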
A.3 Salience projection
The salience projection is a ranking-for-display surface used by the reference implementation to decide which entities, episodes, and suggested actions to show the user now.
It is not a substrate primitive: a node that is not surfacing records to a user — for example, an organisation's node consuming a scoped slice of the graph — has no use for it. Implementations that do surface records to users will need some such projection; this section describes the shape the reference implementation adopted, which alternative implementations may use as a starting point.
A.3.1 Required queries
For an implementation choosing to support the convention:
- Top-N by salience. Given a salience cap N and a time window, return the top N entities, episodes, or suggested actions ranked by a salience score.
- Salience for an id. Given an entity, episode, or suggested action, return its current salience score.
A.3.2 Constraints
- The salience projection SHOULD be in-memory (or fast enough that, from the user's perspective, it functionally is).
- It SHOULD be small enough that an implementation can rebuild it from the log within seconds at the scale of a single user's data.
- It MUST NOT be used as a UI store: queries over salience return rankings, not display payloads. Display payloads come from the detail projection (Projections, §2.3).
A.3.3 Score composition
The reference implementation composes salience as a weighted sum of components:
| Component | Weight | Meaning |
|---|---|---|
| Recency | 0.20 | How recently the underlying evidence arrived. |
| Corroboration | 0.20 | How many independent claims support the record. |
| Upcoming | 0.25 | Proximity to a user-visible time horizon (next event, deadline). |
| Open loops | 0.25 | Whether the record represents an unresolved commitment. |
| Affinity | 0.10 | A user-tunable weighting toward certain entity types. |
These weights are not part of the convention. They are recorded here because they are the values the reference implementation ships with; implementations adopting a salience projection are free to choose their own.
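With component scores normalised to [0, 1], the composition is a plain weighted sum. A non-normative sketch using the shipped weights (the normalisation assumption is ours; the spec does not fix component ranges):

```rust
// Component scores for one entity, episode, or suggested action,
// each assumed normalised to the range [0, 1].
struct SalienceComponents {
    recency: f64,
    corroboration: f64,
    upcoming: f64,
    open_loops: f64,
    affinity: f64,
}

// Weighted sum with the reference implementation's shipped weights.
// The weights sum to 1.0, so the result stays in [0, 1].
fn salience(c: &SalienceComponents) -> f64 {
    0.20 * c.recency
        + 0.20 * c.corroboration
        + 0.25 * c.upcoming
        + 0.25 * c.open_loops
        + 0.10 * c.affinity
}
```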
A.4 Why these are conventions, not substrate
The protocol's substrate is sufficient to express the user's knowledge graph and synchronise it across nodes. The inference pipeline (Part 2) is sufficient to perform distributed model calls with auditable provenance. Together those two layers are what makes it possible for the user to own what the system says about them — the load-bearing claim of the protocol.
What episodes, suggested actions, and salience scores add is a particular shape of user-facing application: narratives, recommendations, and a ranking-for-display surface. Those shapes are useful and the reference implementation provides them, but they are not constitutive of the protocol. An implementation without them is still a Likewise implementation; it is just one that has chosen to surface the substrate differently.
The org-as-peer scenario is the cleanest example. A retailer running a Likewise node holds a scoped delegation to a user's grocery-rhythm claim. The retailer's node does not surface anything to the user — it consumes a slice of state to inform its own systems. Episodes, suggested actions, and salience scoring are nonsense in that context. The substrate plus the inference pipeline (if the retailer chooses to participate in distributed inference) are sufficient.
A.5 Compatibility expectations
If your implementation chooses to support these conventions, it SHOULD do so in a way that interoperates with other implementations that also support them. Specifically:
- Episode and SuggestedAction op variants SHOULD use the field shapes documented above so the reference implementation can consume them.
- The Episode and Action FSMs SHOULD follow the transitions documented above.
- Salience score composition is implementation-defined; there is no compatibility requirement.
If your implementation chooses not to support a convention, ops of the unsupported types arriving on the wire from a supporting peer SHOULD be silently ignored at the projection layer. The substrate-level handling — signature verification, authority check, application to the op log — proceeds as for any other op; the implementation simply does not maintain the projection state that the unsupported convention defines.
Open Issues
This chapter catalogues known cross-implementation hazards in v0.1 of the specification. Each entry describes the issue, the concrete risk it presents, and the direction of the eventual resolution. Some of these will be addressed in a future minor version; some will require a major version. Where this chapter makes commitments to future versions, those commitments are non-normative.
The chapter exists for two reasons. First, hiding known hazards from implementers is a worse outcome than acknowledging them up front. Second, an open public list of known issues is how the specification gets corrected — it invites the discussion that produces v0.2 and v1.0.
If you discover a hazard not listed here, please open an issue against the specification repository.
OI-1. Wire format has no version tag
The postcard encoding of an operation does not carry an explicit format-version field. Schema evolution within a minor version is constrained to additions only, but a binary mismatch between two implementations using incompatible schema versions will fail at decode time without a graceful error.
Risk. A future schema migration that is not strictly additive — for example, removing or repurposing a field — will silently corrupt logs read by an implementation expecting the old shape.
Direction. The next major version is expected to introduce
a leading version byte (or a varint) on every op envelope,
allowing recipients to dispatch on schema version explicitly.
Workaround in v0.1. Implementations SHOULD include the
specification version they implement in their bearer token
metadata or in a discovery endpoint, so peers can refuse to
sync with mismatched versions before the wire format issue
manifests. The X-Likewise-Mesh-Rules-Hash header (see
Sync) provides a partial signal.
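For illustration, the dispatch a leading version byte would enable might look like this. The envelope is hypothetical; nothing in v0.1 defines it.

```rust
// Hypothetical future envelope: one leading version byte, then the
// postcard-encoded op body. Lets a recipient dispatch on schema
// version (or reject) before attempting to decode the body.
fn split_envelope(bytes: &[u8]) -> Result<(u8, &[u8]), &'static str> {
    match bytes.split_first() {
        Some((&version, body)) => Ok((version, body)),
        None => Err("empty envelope"),
    }
}
```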
OI-2. Causal frontier cursor is opaque
The since cursor passed to GET /ops is the base64url
encoding of a postcard CausalFrontier value. This is
opaque from the client's perspective beyond the empty-frontier
special case, and there is no negotiation about its format.
Risk. An implementation that changes the underlying
CausalFrontier representation in a non-additive way will
break clients that hold persisted cursors from an earlier
version.
Direction. Future versions are expected to specify a
versioned cursor envelope or to standardise the
CausalFrontier shape explicitly so changes are detectable.
Workaround in v0.1. Implementations MUST treat cursors as
write-once-then-echo: a client sends back exactly what the
server returned in X-Likewise-Next-Frontier. Implementations
SHOULD discard cached cursors on protocol-version upgrades.
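The write-once-then-echo discipline needs only a trivial cursor store. A sketch (the peer keying and in-memory storage are assumptions, not protocol requirements):

```rust
use std::collections::HashMap;

// Stores the opaque frontier cursor per peer, exactly as returned.
// The client never inspects or edits the cursor value.
#[derive(Default)]
struct CursorStore {
    cursors: HashMap<String, String>, // peer id -> opaque cursor
}

impl CursorStore {
    // Record X-Likewise-Next-Frontier verbatim after a sync round.
    fn record(&mut self, peer: &str, next_frontier: &str) {
        self.cursors.insert(peer.to_string(), next_frontier.to_string());
    }

    // The `since` value for the next request: echo what the server
    // returned, or the empty frontier for a first sync.
    fn since(&self, peer: &str) -> &str {
        self.cursors.get(peer).map(String::as_str).unwrap_or("")
    }

    // On a protocol-version upgrade, discard all cached cursors.
    fn discard_all(&mut self) {
        self.cursors.clear();
    }
}
```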
OI-3. Sanitised ops carry no signature, by design
A sanitised op has its signature cleared to None and is
distinguished from corruption by a sanitisation marker (see
Wire Format).
Risk. A receiver that does not implement marker checking will either reject all sanitised ops as corrupted (denying service to legitimate filtered traffic) or accept all unsigned ops as sanitised (admitting forged traffic). The marker check is the only thing distinguishing the two cases.
Status. This is a design decision, not a defect. The specification's contract is that sanitisation is intentional and the marker is verifiable against the sender's delegation chain.
Direction. A future revision MAY introduce a hash-chain mechanism that lets a receiver verify a sanitised op's provenance to its pre-sanitisation form, addressing the "delegated trust" concern at the cost of additional bytes on the wire. v0.1 does not include this.
OI-4. Mesh-rules drift has no negotiation
Two peers with different X-Likewise-Mesh-Rules-Hash values
pause sync (per Sync).
There is no automatic protocol for resolving the divergence.
Risk. A long-running mesh whose rules document has incrementally drifted on one node (typically because the operator updated it) will lock that node out of sync until the divergence is resolved manually.
Direction. A future revision is expected to define a mesh-rules-negotiation pre-handshake: peers exchange rule documents and either adopt the newer common version or explicitly refuse to interoperate. The exact mechanism is open.
Workaround in v0.1. Operators MUST manage rules versioning out of band (e.g., by deploying rule updates to all nodes in lockstep). Implementations SHOULD log mesh-rules-hash mismatches loudly enough to catch operator errors early.
OI-5. HLC skew tolerance is implicit
The protocol does not specify a maximum allowable wall-clock
skew between a node and the operations it accepts (per
Clocks). A node whose clock
is far in the future can effectively rewrite the order of the
mesh's history by emitting future-dated timestamps; receiving
nodes will adopt the larger wall_ms on receive.
Risk. A compromised or malfunctioning node can dominate the HLC ordering for the rest of the mesh, distorting the meaning of "before" and "after" for as long as it does so.
Direction. A future revision is expected to specify a
negotiated skew bound as part of the mesh-rules document:
operations whose wall_ms exceeds the recipient's local time
by more than the bound are rejected.
Workaround in v0.1. Implementations SHOULD warn on operations whose timestamp is more than one hour ahead of the local wall clock. They MAY refuse to accept such ops as a local policy choice, but doing so is not specified by v0.1 and may cause sync to lag.
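The SHOULD-level warning reduces to a one-line comparison. A sketch assuming millisecond wall-clock timestamps:

```rust
// One hour, per the v0.1 workaround. A future mesh-rules document is
// expected to carry a negotiated bound instead of this constant.
const MAX_AHEAD_MS: u64 = 60 * 60 * 1000;

// True when an incoming op's wall_ms is far enough ahead of the local
// clock that the implementation SHOULD warn.
fn should_warn_on_skew(op_wall_ms: u64, local_wall_ms: u64) -> bool {
    op_wall_ms > local_wall_ms.saturating_add(MAX_AHEAD_MS)
}
```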
OI-6. UCAN v0.10 is the wire format
v0.1 implementations carry UCAN v0.10 (JWT-shaped) tokens. The UCAN working group has moved to v1.0 (DAG-CBOR + Varsig + CIDv1 envelopes). v0.10 is no longer the upstream's preferred format.
Risk. Tooling and ecosystem support for v0.10 will atrophy over time. New external libraries will target v1.0 and inter-protocol interop (with other UCAN-using systems) will be harder.
Direction. The next major version is expected to migrate to UCAN v1.0. The migration is non-trivial: token canonical form, signature shape, and the proof-chain reference encoding all change. The migration MAY be staged (v0.10 and v1.0 co-existing during transition) or atomic; the working group will decide.
Workaround in v0.1. Implementations are stuck on v0.10. They SHOULD isolate the UCAN implementation behind a narrow interface so the migration is a contained change.
OI-7. No bulk-transfer mode for first sync
Catching up a long-disconnected node from genesis requires
paginated GET /ops calls (per Sync).
For a mesh with millions of ops this can be slow.
Risk. Onboarding a new node, or recovering a node that has been offline for an extended period, takes longer than it needs to.
Direction. A future minor version is expected to add a
bulk-transfer mode (likely a streaming response with a
specific Accept header on GET /ops) that ships a snapshot
plus a delta from a known checkpoint.
Workaround in v0.1. Implementations MAY ship the underlying storage offline (USB drive, file copy) for first-time onboarding, then resume incremental sync. This is operator choice, not a protocol mechanism.
OI-8. No server-initiated push hints
The sync protocol is pull-based. A node learns of new operations only when it polls. There is no server-initiated push of "you have new operations to fetch."
Risk. Propagation latency is bounded below by the polling cadence, which is in tension with battery and bandwidth considerations on mobile nodes.
Direction. A future minor version is expected to add an optional WebSocket or webhook endpoint a server can use to hint a peer that fresh operations are available. Hints are advisory; the actual op exchange remains pull-based for authoritative correctness.
Workaround in v0.1. Implementations choose polling cadences that balance latency and resource use (typical defaults: 30 seconds on stable connections, 5 minutes on metered).
OI-9. No confidential sync
A peer can probe a node for the existence of operations it is
not authorised to receive by sending crafted since cursors
and observing the response shape. The capability filter
prevents the operations themselves from being returned, but
it does not prevent a peer from learning that the
unreachable operations exist.
Risk. An attacker with read access to part of the log can infer the existence and approximate timing of operations they are not authorised to see.
Direction. A future revision is expected to adopt a confidential-sync mechanism (likely modelled on Willow Protocol's private-set-intersection-style approach), where peers cannot probe for unauthorised operations at all. This is a significant protocol redesign and is unlikely to land before a major version bump.
Workaround in v0.1. Implementations SHOULD NOT distinguish "unauthorised op" from "no op" in their response shape (return the filtered op set without any "filtered N" indicator). Implementations MUST NOT return per-op "you are not authorised" errors, which would themselves leak the existence of the filtered ops.
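A sketch of a response filter honouring the workaround: return the authorised subset with no count of withheld ops and no per-op errors. The Op shape here is a stand-in, not the wire format.

```rust
// Stand-in op record; the real capability check consults the
// requester's delegation chain rather than a boolean flag.
struct Op {
    id: u64,
    readable_by_requester: bool,
}

// The response is simply the authorised subset. Nothing in the shape
// reveals how many ops were withheld, or that any were.
fn filtered_response(ops: &[Op]) -> Vec<u64> {
    ops.iter()
        .filter(|op| op.readable_by_requester)
        .map(|op| op.id)
        .collect()
}
```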
OI-10. Predicate vocabulary is not yet standardised externally
The set of claim predicates is centralised in this specification but is not yet structured for external extension. An application wishing to add a new predicate (for a new domain — health data, financial data, professional context) must either propose it for inclusion in the specification or hijack a generic predicate.
Risk. Without an external-extension mechanism, the predicate vocabulary either grows to encompass every imaginable domain (unwieldy) or fragments across non-standard predicate strings (non-interoperable).
Direction. A future minor version is expected to introduce
namespaced predicate prefixes (e.g., org.cortex.location_at
versus com.example.medical.diagnosed_with) and a registry
mechanism for third-party namespaces.
Workaround in v0.1. Stay within the existing vocabulary where possible. For application-specific extensions, use the custom-metadata field on evidence and let consumers interpret it; do not author claims with non-vocabulary predicates.
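For illustration, a check of the kind the namespaced scheme would enable (hypothetical; v0.1 defines no such rule): treat a predicate as namespaced when it has at least three non-empty dot-separated labels, matching the reverse-domain examples above.

```rust
// Hypothetical namespacing check for a future predicate registry.
// "org.cortex.location_at" passes; a bare "location_at" does not.
fn is_namespaced(predicate: &str) -> bool {
    let labels: Vec<&str> = predicate.split('.').collect();
    labels.len() >= 3 && labels.iter().all(|l| !l.is_empty())
}
```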
How to propose changes
Each open issue here is a candidate for revision. To propose a direction, open an issue on the specification repository (see Contributing) and reference the OI number above. Substantive changes are expected to land in v0.2 (additive minor) or v1.0 (backwards-incompatible major) depending on scope.
Implementations
This page lists known implementations of Likewise and explains what conformance means.
Status
There is no public Likewise implementation at the time this specification was first published. The protocol was developed alongside an in-progress Rust implementation that the authors have been developing under the codename "Cortex". The codename is not a committed product name — the implementation may eventually ship as the baseline Likewise app itself, or under a different name entirely; that decision has not been made. Where this specification refers to "Cortex," read it as "the in-development reference implementation."
Cortex is currently in private development on macOS and iOS. It is not yet released, and this page makes no commitments about its release timing. When it does become public, this page will be updated with repository links, the final name, and conformance notes.
The text below describes the intended shape of the reference implementation and the intended behavioural-conformance suite. Both should be read as forward-looking; neither is currently available for download.
The reference implementation (codename Cortex)
The reference implementation is a Rust implementation of Likewise that runs on macOS and iOS as a small mesh of nodes communicating over HTTP. The user runs a node on each of their devices. It was the implementation against which this specification was written, so where the specification is silent or ambiguous, its intended behaviour is the strongest signal about what was meant — practically, this matters less than it would for a published specification, because the implementation is not yet available for an implementer to compare against.
Intended reference behavioural tests
When the reference implementation is published, it will ship seven end-to-end scenarios that exercise the wire surface against a real engine, real SQLite storage, and real HTTP loopback transport. The intent is that these scenarios constitute the reference suite for behavioural conformance:
- `solo` — single-node ingest, derivation, projection rebuild.
- `warm-restart` — node restart recovers state from the log alone.
- `enrollment` — the UCAN delegation handshake that admits a new node to a mesh.
- `scoped-enrollment` — the same handshake under caveat restrictions, including sanitisation rules and revocation.
- `claim-lifecycle` — claim FSM transitions, derivation DAG cascade on user assertion, and frozen-fact immunity.
- `tool-use-agent-loop` — non-inference job handlers chained with `depends_on`, inference-snapshot artefacts, and suggested-action approval, on a single node.
- `mesh-agent-loop` — the same loop distributed across three specialist nodes (phone, inference, tools) cooperating via `RouteKind` and cross-node `depends_on`.
A second implementation that passes equivalents of these seven scenarios — wired into its own engine and transport, against its own storage — is what "behaviourally conformant for v0.1" is intended to mean. The scenarios are not the spec; the spec is the spec. The scenarios are how we plan to operationalise it once the reference implementation is public.
Compatible implementations
There are no public implementations of any kind at the time of writing. When implementations exist, this page will list them. To submit one, see Contributing.
(Or — open an issue, paste a link to your implementation and a brief description of what it covers, and we will add it.)
What conformance means
The specification distinguishes four levels of conformance:
Level 1 — wire-format conformance. The implementation can read and write operations that an existing v0.1 implementation will accept and apply correctly. It honours the postcard encoding, the canonical signing rules, and the HTTP sync endpoint shape.
Level 2 — semantic conformance. In addition to Level 1, the implementation respects the projection contract — it answers queries about an op log identically (modulo intentional optimisations) to the reference implementation, given the same op log as input.
Level 3 — capability conformance. In addition to Level 2, the implementation honours UCAN delegations and caveats correctly — including sanitisation, transitive revocation, and the attenuation-only re-delegation rule.
Level 4 — full behavioural conformance. In addition to Level 3, the implementation passes equivalents of the seven reference scenarios listed above.
An implementation may claim a level publicly. We strongly recommend publishing the claimed conformance level together with the test artefacts that demonstrate it, so users can assess trustworthiness without reading the source.
Compatibility expectations across versions
The specification is versioned (see
Conventions for the current version).
Two implementations on the same major version SHOULD interoperate
without negotiation. Two implementations on different major
versions MAY refuse to interoperate; the X-Likewise-Mesh-Rules-Hash
header on the sync endpoint is the v0.1 mechanism by which a
mismatched pair detects this and pauses sync rather than corrupting
each other.
A future revision will clarify the negotiation protocol for mesh-rules drift; this is tracked as an open issue.
Implementation notes for new ports
A handful of practical observations from building the reference implementation that may save another implementer time:
- The HLC tick discipline is the single most common source of divergence bugs. Treat it as load-bearing from day one. See Clocks.
- The signature canonicalisation rule (clear the signature field on the op, encode, then sign and put the signature back) is easy to get subtly wrong. The detached-JWS output is what crosses the wire; the in-storage representation contains the signature.
- The projection split exists because collapsing it into one fat state object produces a system that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementers porting from a single-store substrate should resist the urge to fold them.
- Sanitisation clears signatures intentionally; an implementation that treats signature absence as corruption will reject legitimately filtered ops. Distinguish the two cases up front.
- Job and lease ops use the HLC for lease expiry, not a wall clock. Implementations that read the wall clock to decide whether a lease is expired will misbehave when nodes have skewed clocks.
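The canonicalisation rule in the second note above can be sketched as follows. The encode and sign functions are placeholders for postcard encoding and the real signature primitive; only the clear-encode-sign-restore ordering is being illustrated.

```rust
// Minimal op shape for the sketch; not the wire format.
struct Op {
    payload: Vec<u8>,
    signature: Option<Vec<u8>>,
}

// Placeholder for the canonical (postcard) encoding of an op.
fn encode(op: &Op) -> Vec<u8> {
    let mut out = op.payload.clone();
    if let Some(sig) = &op.signature {
        out.extend_from_slice(sig);
    }
    out
}

// Placeholder for the real signature primitive.
fn sign(bytes: &[u8]) -> Vec<u8> {
    vec![bytes.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))]
}

// The load-bearing ordering: clear, encode, sign, restore. Signing
// without clearing first would make the signature cover itself.
fn sign_op(op: &mut Op) {
    op.signature = None;                   // 1. clear the signature field
    let canonical = encode(op);            // 2. encode the cleared op
    op.signature = Some(sign(&canonical)); // 3. sign, then restore
}
```

The test of the rule is that signing is insensitive to any signature already present: two copies of the same payload must end up with identical signatures regardless of their prior signature state.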
Calling the project
The protocol is "Likewise." When citing it, please use that name and a link to this specification.
The reference implementation is currently working under the codename Cortex. The codename is provisional. Its eventual public name is not fixed — it may ship as the baseline Likewise app itself, or under another name. Treat any "Cortex" references in this specification as shorthand for "the in-development reference implementation"; if and when the implementation is released under its final name, this page will be updated.
What is committed: the protocol is Likewise, the standard is this document, and the implementation — whatever its final name — is one realisation of it.