Likewise

A protocol for decentralized personal knowledge graphs.

A user runs a node on each of their own devices. Nodes share an append-only log of signed operations. The log encodes evidence (photos, calendar events, contacts), the working hypotheses derived from that evidence, the permissions governing who may read or derive from what, and a record of every inference call made against any of it.

This book is the protocol specification.

How to read it

If you are encountering Likewise for the first time, read in this order:

  1. Motivation — why this protocol exists, and what the status quo gets wrong.
  2. Overview — the system in five minutes, no code.
  3. Concepts — the mental model. Evidence, ops, claims, projections, capabilities, mesh.
  4. Comparison — honest contrast with Solid, AT Protocol, Nostr, Iroh, the local-first manifesto, and UCAN.

If you are implementing a compatible node, the normative specification is organised into three parts:

  • Part 1: The substrate — chapters 00 Conventions through 12 State Machines (skipping chapter 09). Sufficient for any conformant node, including organisation peers consuming a scoped slice of a user's graph. If you are building a substrate-only peer, this is everything you need.
  • Part 2: The inference pipeline — Mesh Coordination and Inference Audit. Adds the vocabulary by which nodes cooperate on a user's work and the convention by which audited inference calls become recoverable artefacts on the log. Required for nodes participating in distributed work; substrate-only peers MAY ignore it.
  • Annex: Application conventions — Episodes, Suggested Actions, Salience. Non-normative. The reference implementation's choices for surfacing the substrate to a user; alternative implementations are free to substitute their own.

After the three parts: Open Issues catalogues known cross-implementation hazards. The high-level chapters above are non-normative; the spec chapters use RFC 2119 keywords.

If you are looking for an existing implementation, see Implementations.

What this protocol is not

It is not a particular application. It is not a particular AI model. It is not a synchronization library or a database engine. It is the wire-level agreement that lets independently-built nodes interoperate over a single user's knowledge graph.

Status

v0.1 — draft for public review. The wire format was developed alongside an in-progress reference implementation (currently private, under the provisional codename Cortex). It is not yet stable across major versions, and there is no public implementation an interested party can run today. See Implementations for status, and Open Issues for known cross-implementation hazards.

License

Creative Commons Attribution 4.0 International (CC-BY-4.0).

Motivation

Where the value is moving

For most of the consumer software era, the question of "who owns your data" had a clean enough answer to ignore: you owned the photos and the calendar entries and the contact list, and the platform owned the account that hosted them. If you wanted your data back, you exported a zip file. The trade was uncomfortable but legible.

That trade is no longer where the action is. The artefact platforms care most about producing now is not the photo, it's the claim about you the photo helped to derive. That you go to the same coffee shop on Tuesdays. That this is your child. That you're probably training for a marathon. That you and Sarah are close. None of those facts are in the original photo. They are inferred — sometimes by hand-written classifiers, increasingly by language models — and they are the part of the data pipeline with leverage. They are what makes a feed personalised, an ad targeted, a recommendation feel uncanny.

Those derived claims are not, in any practical sense, yours. You cannot read them. You cannot correct them. You cannot port them to another product. When the platform deletes your account they vanish with it, even though the work of building them used your evidence and the model that derived them was trained, in part, on people like you. You are the substrate on which the intelligence is built and the only party in the loop without a copy of the result.

This protocol exists because that arrangement is bad and because it is about to get much worse.

The default the internet was built around

This is not new. It is the consumer internet's defining feature.

For twenty-five years, almost every successful piece of consumer software has agreed on the same arrangement: the party providing the service collects the record of the user. The cookie was an implementation detail. The free account was an implementation detail. The "personalised" feed, the loyalty card, the recommended purchase — all implementation details on top of the same underlying contract. The party doing the work also kept the work's record, and the record was not the user's.

The arrangement persisted because, for most of those twenty-five years, the record was structured enough to be useful to the collecting party but raw enough to be inert in anyone else's hands. Click-streams, locations, search histories — they powered ad targeting and recommendation rankings, but they were not, by themselves, a model of who you were. They were ingredients. The party with the most ingredients had the best recipes.

That is what changed. The same logs that were inert raw material a decade ago are now training data and prompt context for systems that can describe you to yourself with uncomfortable accuracy. The economic value of being the party that holds the record has risen by an order of magnitude. So has the asymmetry between you and the party that holds it.

Likewise does not propose that data collection should stop, and it does not deny that there is real value in the systems being built on top of these records. It proposes something more specific: that the question of who holds the canonical record of the user — the user themselves, or whoever is currently making money from them — is now load-bearing for whether any of this is something done for the user instead of to them. And it proposes that the historical default — the party providing the service is also the party that owns the record — is no longer acceptable.

The next wave makes the asymmetry sharper

Personal AI is on its way. Locally-runnable models that can read your calendar and your photos and your messages and reason about them are already plausible on consumer hardware. The pitch — your phone understands your life, surfaces what's relevant, drafts the message, schedules the call — is real and probably correct.

The default architecture for delivering that pitch will be a single vendor's app, talking to a single vendor's cloud, with the model running wherever the vendor finds cheapest and the derived claims sitting in whatever storage the vendor has chosen. The user will get intelligence. They will not get a copy of what the system believes about them. They will not get a way to refute it. They will not get a way to move to a competitor without starting over. They will not be told which of their evidence the model looked at when it drafted that suggestion.

Calling this "AI on your phone" obscures what is actually happening. The model running on the phone is the visible part. The interesting part is the data substrate underneath it, and that substrate — the graph of evidence and claims and permissions — is what determines whether a personal AI is a product the user owns or a product the user is.

What "owning your knowledge graph" should mean

If a system makes claims about you, owning those claims should mean five concrete things:

  • You can read every claim. Not in the form "here are the topics the assistant has noticed about you," but in the form "here are the facts the system is operating on, with the evidence each was derived from." This is the only way to know what you're being judged by.

  • You can refute or revise any claim. Inference is fallible. The system thought your sister was your wife; it inferred a job title you don't have. A claim system that can't be corrected by the person it describes is not serving them.

  • You can audit every inference call. When the model is asked "what should we surface to this user today," the question, the retrieved context, the model identity, and the answer must all be recoverable. "How did it know?" should have a literal, byte-for-byte answer.

  • You can move. Your evidence and your derived claims should travel from one implementation to another without a vendor in the loop. If the implementation you started with stops being maintained, or starts behaving in ways you object to, you should be able to walk away with everything you came in with and everything that has been derived since.

  • You can grant and revoke. A system that runs on your devices inevitably sees more than any third party should. Sharing the bits that need sharing — the family calendar with a partner, the work events with a colleague's scheduling assistant — must be a capability, not a flag in someone else's database.

None of those properties are exotic. They are what anyone would expect from a record they had any power over. The reason they are not the default in personal-data systems is not that they are hard; it is that incumbents have no incentive to provide them.

Why a protocol, not a product

You can build a single product that gives one user a private, auditable, portable knowledge graph and call it done. We tried that first. The trouble is that the moment the user wants their data to flow between two devices, or between two pieces of software, or to a trusted second party — a partner, a coach, a therapist's intake form — you need an agreement about how the bits travel. If that agreement is private to the product, the user is back where they started: locked in, only this time the lock has a friendlier name.

A protocol is different. A protocol is the rule that lets a phone running implementation A and a laptop running implementation B synchronise the same user's data without either of them being a trust anchor. It lets a researcher build a node that ingests a new kind of evidence (medical records, fitness data) and federate into an existing graph. It lets the user, ten years from now, run their graph on software no one in this room has heard of yet, because the specification — not the product — is what their data is denominated in.

This document is that specification.

Concrete scenarios this protocol changes

"I want to switch phones"

Today: your assistant's understanding of you is locked to the vendor. You start over.

Under this protocol: your devices are nodes in a small mesh you own. Adding a new phone means enrolling another node. The append-only log syncs to it. Within minutes the new device has the same understanding of you as the old one, derived from the same evidence under the same rules.

"I want to know why it suggested that"

Today: the vendor surfaces a recommendation; the reasoning is opaque, and at best you get a generic "based on your activity."

Under this protocol: every inference call is itself an artefact on the log. The retrieved context, the model identity, and the prompt are recoverable. "Why did it suggest I message Sarah today" has an answer that consists of specific evidence and specific claims, with their provenance. If the answer is wrong, you can refute the claims behind it and watch the recommendation update.

"I want my partner to see family events but not work events"

Today: you either share an entire account or you don't. The granularity is missing.

Under this protocol: capabilities are first-class. You delegate a read capability scoped to a class of evidence (calendar entries with a particular tag, say) and a class of derived claim, and you can revoke that delegation at any time. The receiving node can only synchronise the slice of the log it has been authorised for. There is no privileged "admin" account; there is only a graph of delegations rooted at the user.

"I want to refute a claim the system made"

Today: there is, often, no surface for this. The system knows what the system knows; you live with it.

Under this protocol: a user assertion is itself an op on the log. Refuting a claim flows through the derivation graph: anything that was derived from the refuted claim becomes invalid. The same mechanism that propagates evidence forward propagates corrections backward. The user is the final authority on facts about themselves, mechanically, not just rhetorically.

"I want a server to do the heavy inference"

Today: you either trust a vendor cloud or you don't have a server.

Under this protocol: a server is just another node in the mesh, enrolled by the user, with capabilities the user defined. It runs the inference work the phone can't. The phone keeps the canonical log; the server's outputs are themselves logged operations the phone receives and can audit. The user can revoke the server's capability at any time, at which point its derivations stop being trusted and the affected claims invalidate.

"I want to share my grocery rhythm with a retailer, without sharing my purchases"

Today: you either accept the loyalty-card terms in full (and the retailer collects a fine-grained record of your transactions, app sessions, and adjacent ad-platform signals) or you opt out (and the retailer falls back to coarser inference from third-party data, which is no better for either side).

Under this protocol: you delegate the retailer's node a capability scoped to a single claim — your grocery-visit rhythm — with caveats that prevent any underlying evidence (receipts, photos, location pings, basket details) from crossing the boundary. The retailer gets a precise, accurate answer to a useful question, and no more. You can revoke the delegation in one operation. Both sides know exactly what was shared because the wire format describes it precisely.

Consensual data partnership

The previous scenario points at the protocol's most interesting consequence — one its designers didn't initially set out to deliver. The same machinery that lets a user share data between their own devices also lets them share data, on their own terms, with anyone else.

Today, when a retailer wants to know that you regularly buy apples, they have to guess. They collect transaction logs, loyalty-card swipes, app session data, and ad-platform signals; they segment the behaviour across millions of users until a confident probability emerges that you are an apple-buyer; and the result is, at best, a guess the retailer holds about you that you will never see and cannot correct. The cost of producing the guess is enormous. The accuracy is uneven. The relationship is adversarial — every additional signal the retailer captures is a small extraction.

Now consider the same scenario differently. The user has ground-truth claims about themselves: that they go to a grocery store roughly four times a month, that the visits cluster on Saturdays, that the basket size has been growing. Those claims already exist on the user's personal mesh, because the user's own evidence — calendar, location, photos of receipts — derived them.

Sharing those claims with a retailer is no longer an act of surveillance acceptance. It is an act of delegation. The user issues a UCAN scoped to the retailer's node with caveats:

  • only the predicates they care about (grocery_visit_rhythm),
  • none of the underlying evidence (no source-typed photos or calendar entries cross the boundary),
  • sanitisation rules that strip descriptive content fields,
  • a time-range that auto-expires the delegation in twelve months.

The retailer deploys a Likewise node — same wire protocol, same op log, same authority machinery — and that node synchronises only the slice of the user's log this delegation admits. The node materialises a tiny knowledge graph: possibly nothing more than the rhythm claim and its confidence. The underlying photos, locations, and basket details never leave the user's mesh. If the user revokes the delegation, the retailer's node loses its authorisation, and the slice of state it materialised becomes invalid by the same cascade rule that retires any other revoked authority.
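
In sketch form, such a delegation might be modelled as below. The type, field, and value names are assumptions for illustration; the normative encoding lives in the capability chapters, and nothing here is wire-exact.

```rust
// Illustrative only: struct, field, and value names are assumptions,
// not the normative wire encoding defined in the capability chapters.
struct Delegation {
    audience: String,   // identity of the retailer's node
    resource: Resource, // what class of content the delegation covers
    action: Action,     // what may be done with it
    caveats: Caveats,   // narrowing constraints
}

enum Resource { Claims, Evidence, Jobs }
enum Action { Read, Write, Schedule }

struct Caveats {
    predicates: Vec<String>,   // only claims with these predicates
    source_types: Vec<String>, // empty: no underlying evidence crosses at all
    sanitize: Vec<String>,     // field-level redactions applied on the way out
    time_range_days: u32,      // window after which the delegation auto-expires
}

fn grocery_rhythm_delegation() -> Delegation {
    Delegation {
        audience: "retailer-node".to_string(),
        resource: Resource::Claims,
        action: Action::Read,
        caveats: Caveats {
            predicates: vec!["grocery_visit_rhythm".to_string()],
            source_types: vec![], // no photos, receipts, or location pings
            sanitize: vec!["strip_content_fields".to_string()],
            time_range_days: 365, // roughly twelve months
        },
    }
}
```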

This is not a hypothesis about a future protocol. The capabilities, caveats, sanitisation rules, and revocation semantics are already specified for the single-user mesh case (see Capabilities and UCAN and Caveats). The same machinery generalises directly: a "node" in this protocol does not have to be a personal device. It can be any party — a retailer, a clinic, an employer's scheduling assistant, a research institution, a public-interest data trust — that the user has chosen to invite in. The materialisation that party holds can be as small as one claim or as large as the user authorises.

The economic shape this enables is different from the status quo. The retailer pays nothing for the bulk-collection infrastructure they no longer need. The user shares specific claims they have chosen to share, on terms they have chosen, and can stop at any time. Both sides know exactly what is being shared because the wire protocol describes it precisely. Compliance with the user's "no" is enforced mechanically, not by lawsuit.

The protocol does not specify how the resulting market gets built. It does not specify pricing, payment rails, negotiation formats, or contract templates. What it specifies is the substrate: a wire format in which "I share these claims, with this masking, until I revoke" is something that can be expressed precisely and verified independently by both parties. The market on top of that substrate is for others to design.

There is a version of the future where every commercial relationship that today depends on third-party tracking is reconstituted as a voluntary, scoped, revocable delegation between the user and the counterparty. There is also a version where it isn't, and the incumbents preserve their bulk-collection model because nothing in the law or the market forces a change. The protocol exists, in part, so that the first version becomes possible.

The non-negotiable rules

The protocol is built around six rules that exist to make the properties above survive contact with reality. They are stated formally in Invariants; the short forms are:

  1. Only operations mutate canonical truth. Everything else is a projection that can be rebuilt from the log.
  2. Every user-visible claim has transitive provenance to evidence.
  3. Derivation forms a directed acyclic graph. Refutations cascade.
  4. Sync converges operations, not projections. Two nodes that have seen the same operations agree on what is true.
  5. Every operation is signed by its author. Identity is per-device, bound by capability delegations rooted at the user.
  6. Inference is auditable — by default on the user's own nodes, and on demand when delegated to others, via a caveat the user attaches.

Anything an implementation does that violates one of those rules breaks the user's ability to own what the system says about them. That is why they are non-negotiable.

What this is not trying to be

It is not trying to be a social network. The graph is private to the user and the parties they have explicitly delegated to. There is no global namespace, no public feed, no follow graph.

It is not trying to be a general-purpose database. The data model is shaped for personal context — evidence, entities, claims, episodes, suggested actions — not for arbitrary tabular workloads.

It is not trying to replace cloud AI for everyone. Some users will prefer the convenience of a vendor offering. The protocol is for the users — and the implementers — who would prefer the alternative to exist.

It is not trying to be a finished system. The reference implementation works end-to-end and is the source of truth for what the wire format actually is today, but the spec has known open questions, listed honestly in Open Issues. The point of publishing now is to make those questions public before the de-facto answers are decided by whoever ships first.

What we want from readers

If you are an implementer: read the spec. Build a compatible node. Tell us where the spec is unclear or where two reasonable readings diverge.

If you are a researcher: the protocol is licensed CC-BY-4.0. Cite it, fork it, write a better version. We would rather lose to a better protocol than win with a worse one.

If you are a user: there is no public implementation you can run today. The protocol's first reference implementation (provisional codename Cortex) is in private development and not yet released. The point of publishing the specification before the implementation is to ensure the standard — and the property it gives you, of owning what the system says about you — is not a luxury feature. It is the precondition for any of this being something done for you instead of to you. The implementation will follow.

Overview

This chapter describes the protocol in five minutes, no code. If you want the why first, read Motivation — this chapter assumes you already accept that owning your own knowledge graph is worth specifying. If you want the wire-level rules, jump to Conventions.

The picture

A user runs a small mesh of nodes — typically a phone, a laptop, and maybe a server they own. Each node is a self-contained implementation of the protocol: a local database, a sync engine, and (where the hardware allows) an inference engine. The nodes talk to each other directly over the local network or the public internet, never through a central service.

What the nodes share is a single append-only log of signed operations. The log is the canonical state. Everything else — what the user sees on a card, what the model is given as context, what is highlighted as "important today" — is a projection of the log, regenerable from it.

What's on the log

Operations come in a few categories.

Evidence operations record raw inputs the user has chosen to ingest: a photo (referenced by content hash, not embedded), a calendar event, a contact card, a message thread. Evidence is immutable once written. Removing it requires a tombstone op, which cascades through everything derived from it.

Entity operations record the things the user's life is about: people, places, organisations, events, commitments, concepts. Entities are not pre-defined by the protocol; they are derived. The protocol specifies how an implementation may merge two entities ("Sarah" the contact is the same as "Sarah M." extracted from a photo caption), how it may split one back apart, and how it must record the provenance of those decisions.

Claim operations record the working hypotheses the system is operating on: "Sarah is a close contact." "Tuesday mornings are gym mornings." "The next coffee with Mike is overdue." Claims have a status — hint, claim, fact — that reflects how strongly they are believed and whether the user has confirmed them. Claims have explicit confidence and explicit provenance: every claim links back through the operations that derived it to the evidence at the bottom.

User-assertion operations record what the user themselves has said: "yes, that's right," "no, refute that," "merge those two." User assertions take precedence over derived claims. They are the mechanism by which the user is the final authority on facts about themselves.

Artefact operations record machine-produced byproducts of derivation: embeddings, transcripts, OCR text, and the inference-snapshot artefacts that record model calls. Artefacts ride the same op-log machinery as everything else, with a TTL and eviction lifecycle for storage management.

Job and lease operations record work the mesh has scheduled, claimed, completed, or yielded — for example, "this server should synthesise an episode for last week." This is how a phone offloads inference to a laptop without anyone having to be in charge of the whole mesh.

Capability operations record permissions: who may write what, who may read what, who may schedule what kind of work. They use UCAN delegations, rooted at the user.

Coordinator and routing operations record decisions about who does what in the mesh: which node coordinates derivation, which kinds of jobs route to which node.

Two further op categories — Episode operations, which record narrative clusters of evidence and claims, and Suggested-action operations, which record recommendations the system surfaces to a user — are application-layer conventions used by the reference implementation, not part of the substrate proper. They are documented in Annex: Application Conventions. A node that does not surface records to a user — for example, an organisation's node — has no need to emit them.

The full substrate taxonomy is in Operations. For now the important point is: every state change is one of these operations, every operation is signed by its author, every operation is timestamped with a hybrid logical clock so causal order is total across the mesh.
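
As a rough sketch, the categories above can be pictured as one payload enum. The grouping and variant names here are illustrative, not the normative vocabulary:

```rust
// Illustrative grouping of the categories described above; the normative
// vocabulary of roughly 31 typed variants is defined in Operations.
enum OpPayload {
    Evidence,        // raw inputs: content hash, source anchor, source type
    Entity,          // create, merge, split, with provenance for each decision
    Claim,           // subject, predicate, object, confidence, status
    UserAssertion,   // confirm, refute, merge; takes precedence over derived claims
    Artefact,        // embeddings, transcripts, OCR text, inference snapshots
    JobOrLease,      // schedule, claim, complete, yield, expire
    Capability,      // UCAN delegations and revocations
    Coordination,    // designate a coordinator, route kinds of work
    // Application-layer conventions (Annex), not part of the substrate:
    Episode,         // narrative clusters of evidence and claims
    SuggestedAction, // recommendations surfaced to the user
}
```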

What you read from

A naive reader of an append-only log would have to fold over the whole thing every time they wanted to know whether Sarah was a close contact. Implementations don't. They maintain a small set of projections — materialized read views — that an op-application function keeps in sync with the log.

The protocol distinguishes three substrate projections by purpose:

  • A larger, in-memory, model-prompt-oriented view used to assemble context windows for inference calls.
  • A durable, on-disk, lookup-oriented view used by the user interface for "show me everything you know about Sarah."
  • A debug-only, full-graph view, used by inspection tooling.

Each projection consumes the log; none of them are canonical. Any of them can be discarded and rebuilt. The protocol specifies what each one must be able to answer; how an implementation builds it is open.

A fourth projection — a small, in-memory, ranking-oriented view used to decide what's salient now — is an application-layer convention used by the reference implementation, not part of the substrate. It is documented in Annex: Application Conventions §A.3. A node that does not surface records to a user has no need for it.

How nodes converge

There is one HTTP endpoint and one cursor. A node asks a peer for "the operations you have that I don't," sending its causal frontier as the cursor. The peer returns the matching slice of its log. Both nodes apply received operations into their local log idempotently. Because operations are timestamped with a hybrid logical clock and the merge rules for any conflicting updates are deterministic, two nodes that have seen the same set of operations agree on the same projected state.

There is no leader. There is no central coordinator. There is no handshake more elaborate than "what's your causal frontier; here is the set difference." Sync is the same operation whether two nodes are catching up after a week apart or staying current minute-by-minute.
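
A sketch of the exchange's shape, with assumed type and field names; the actual endpoint, fields, and encoding are defined by the normative sync chapter in Part 1:

```rust
// Assumed shapes for illustration; the normative request and response
// formats are specified in Part 1, not here.
use std::collections::BTreeMap;

type NodeId = String;
type Hlc = (u64, u32, NodeId); // (wall_ms, logical, node)

/// "Send me the operations you have that I don't": the causal frontier is the cursor.
struct SyncRequest {
    frontier: BTreeMap<NodeId, Hlc>, // per-author maximum timestamp this node has seen
}

/// The matching slice of the peer's log, filtered on the source side by capability.
struct SyncResponse {
    ops: Vec<SignedOp>,
    frontier: BTreeMap<NodeId, Hlc>, // the cursor to use for the next exchange
}

struct SignedOp; // the op envelope, sketched in the Concepts chapter
```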

Capabilities filter what crosses the wire. A node holding only a read-only delegation for calendar evidence will not be served claims about photos. The filter runs on the source side. Operations that must be sanitised before crossing — strip GPS, redact participants, truncate body text — have their signatures cleared at sanitisation time, which makes the change visible to the recipient as a deliberate intent rather than a corruption.

A node does not have to be one of the user's own devices. The same delegation machinery scopes what an invited third party — an organisation, a clinic, a service the user has chosen to share with — can see. A retailer running a Likewise node receives only the claims the user delegated to them, sanitised per the caveats the user attached. The same wire format that synchronises a phone and a laptop also synchronises a personal mesh and a party the user has explicitly invited in. See Motivation: Consensual data partnership.

How permissions work

Every node has a key. Every operation is signed by a node key. Every node key is itself the subject of capability delegations issued by the user (or by another node the user has delegated authority to).

A capability is a triple: a resource (operations of a certain class, evidence of a certain class, jobs of a certain kind), an action (read, write, schedule, claim, complete), and a set of caveats that narrow it (only evidence of these source types, only claims with these predicates, only jobs in this time range, only operations that have been sanitised in these specific ways).

Delegations form a graph rooted at the user. Revoking a delegation prunes the subgraph beneath it. The protocol specifies how a node must interpret an incoming op against its capability set, so any two implementations agree on whether a given op was authorised at the moment it was sent.

How inference is audited

When a node operating under audit calls a model — to summarise a window, to draft a recommendation, to extract entities from a photo caption — the call itself becomes an operation. The retrieved context, the prompt, the model identity, the timing, and the output are all recorded as a likewise.inference.snapshot artefact on the log.

Audit is in force in two cases:

  • The node is one of the user's own. Any node operating under the user's root delegation — the user's phone, their laptop, a server they run at home — emits snapshots by default. This is the case the reference implementation satisfies and is what makes the user's personal mesh auditable end-to-end.
  • The user has required it of a delegated party. A user delegating to an organisation's node MAY attach an audit_inference: true caveat, requiring that delegated node to emit snapshots for inference performed against the delegated data. The snapshots themselves become operations on the log that the user receives back.

A delegated node operating without an audit caveat is not required to record its inference. What it does internally with the data the user authorised is governed by the delegation's other caveats, not by the audit invariant. This is a deliberate scope choice: the protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph.

When audit is in force, the snapshot is referenced from any record the call produced. Asking "why did the system suggest I message Sarah today" follows the link from the suggested action to the snapshot to the inputs. No call made under audit produces a user-visible result without leaving this trail.

Snapshots are themselves bounded — they have a TTL, they can be evicted, they can be tombstoned with the rest of an evidence cascade. But while they exist, they are the audit record.

A day in the life

A user takes a photo. The phone ingests it as an evidence operation (content hash + EXIF + Vision labels), runs the deterministic extraction pass on the labels and any visible text, and emits some candidate claims as hint-status operations. Nothing has been shown to the user yet.

Overnight, the user's laptop — which has more capable hardware — claims the synthesise job for yesterday. It pulls the relevant slice of the log, assembles a model context, makes one inference call, and writes the result back as operations. Because the laptop is operating under the user's root delegation, audit is in force by default; the inference snapshot is also written. The reference implementation materialises the result as an episode and a suggested action — both application-layer conventions, not part of the substrate proper — so a card can be rendered later. A different implementation might materialise the same result a different way; the substrate-level claims and the snapshot are what the protocol guarantees.

The phone receives the new operations on next sync. Its salience projection (also an application-layer convention used by the reference implementation) rebuilds. The next time the user opens the app, a card appears: "Coffee with Mike — last seen at the same shop two weeks ago, your usual rhythm is monthly." The user taps "show why." The app follows the suggested-action's link to its snapshot, which lists the evidence, the claims, the model used, and the literal prompt.

The next day the user refutes one of the claims — the system assumed Mike worked nearby, but he doesn't. That refutation is a user-assertion op. The derivation cascade fires: claims that depended on Mike's location are invalidated. The next salience projection no longer surfaces the suggestion that depended on it.

None of this required a service. None of it required a vendor. None of it could have happened in a way the user couldn't audit or undo.

Where to go next

  • Concepts — the mental model in more depth, with diagrams.
  • Comparison — how this protocol relates to Solid, AT Protocol, Nostr, Iroh, the local-first manifesto, and UCAN.
  • Conventions — the start of the normative specification.

Concepts

This chapter is the mental model in depth. It is non-normative — the specification chapters are where the must-haves live — but a reader who finishes this chapter should be able to predict how a Likewise node would behave in most situations, and should be able to read the spec without surprise.

The shape

┌──────────────┐                       ┌──────────────┐
│   Evidence   │  immutable inputs;    │   Evidence   │
│  (immutable) │  hashed, not stored   │  (immutable) │
└──────┬───────┘  in-band              └──────┬───────┘
       │                                       │
       ▼                                       ▼
┌─────────────────────────────────────────────────────────┐
│                    Operation Log                        │
│   append-only, signed, hybrid-logical-clock ordered     │
│  ─ evidence ops ─ entity ops ─ claim ops ─ episode ─    │
│  ─ action ops ─ user assertions ─ job ops ─ ucan ─      │
└─────────────────────────────────────────────────────────┘
                          │
                          │ deterministic apply_op
                          ▼
┌─────────────────────────────────────────────────────────┐
│                     Projections                         │
│  ─ salience  ─ inference  ─ details  ─ debug graph ─    │
│         each rebuildable from the log alone             │
└─────────────────────────────────────────────────────────┘
                          │
                          │ surfaced through
                          ▼
                       Surface
              (cards, suggested actions, UI)

The diagram is the system. The log is canonical. The projections are caches. The surface is the part the user touches. Everything beneath that surface — every claim, every recommendation, every episode — exists because of an op that produced it, and that op is on the log.

Evidence

Evidence is the raw material the system reasons over. A photo, a calendar event, a contact card, a message thread, a location ping. Evidence is immutable: once an evidence op lands on the log, the content it points at does not change. Removing evidence is its own op (TombstoneEvidence) which causes a derivation cascade — see below — but the historical record of what was once known survives, because removing operations would break the rebuild-from-log invariant.

Evidence is referenced by content hash (BLAKE3). The hash is on the log; the content itself need not be. An implementation may store the photo bytes in a local blob store, in a peer's blob store, or not at all. The protocol cares that the hash is there and is verifiable; it does not mandate where the bytes live.

Each piece of evidence has a source anchor: a stable identifier from the upstream system (calendar event UID, photo asset id) that lets multiple nodes agree they are talking about the same external fact, even if they extracted it slightly differently.
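
In sketch form, with assumed field names (the normative layout is in Operations):

```rust
// Field names are assumptions; the normative evidence op is in Operations.
struct Evidence {
    content_hash: [u8; 32], // BLAKE3 hash; the bytes themselves may live anywhere
    source_type: String,    // e.g. "photo", "calendar_event", "contact"
    source_anchor: String,  // stable upstream id, e.g. a calendar event UID
}
```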

Operations

An operation is a typed, signed message that mutates the log. It carries:

  • A typed payload (one of ~31 variants — evidence, entity, claim, episode, action, user assertion, job, capability, coordinator, routing).
  • A timestamp in hybrid logical clock form: (wall_ms, logical, node).
  • An author node id.
  • A signature by the author's key (RFC 7515 detached JWS, Ed25519, over the canonical encoding of the op with the signature field cleared).
  • For sanitised ops: the signature is cleared at sanitisation time, signalling that the op has been intentionally redacted under an authorised caveat. Recipients distinguish "altered in transit" (corruption) from "deliberately sanitised by an authorised filter."

Why typed ops instead of a generic CRDT? Because the protocol's content domain is narrow and well-understood. A typed vocabulary — "create entity," "supersede claim," "tombstone evidence and cascade" — gives an implementation the information it needs to maintain projections and derivations without a generic merge engine. The trade is expressivity: the protocol does not try to be a general collaborative-document substrate. It tries to be a precise model of a single user's knowledge graph.
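
A minimal sketch of that envelope, with assumed names; the canonical encoding, the detached-JWS signature construction, and the full set of payload variants are defined in the normative chapters:

```rust
// Shape only; nothing here is the normative encoding.
type NodeId = String;
type Hlc = (u64, u32, NodeId);  // (wall_ms, logical, node)
struct OpPayload;               // stands in for the ~31 typed variants

struct Op {
    payload: OpPayload,
    timestamp: Hlc,             // hybrid logical clock, totally ordered per author
    author: NodeId,             // the emitting node's identity
    signature: Option<Vec<u8>>, // Ed25519 over the canonical encoding with this
                                // field cleared; None only for sanitised ops
}
```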

Time: the hybrid logical clock

Two devices that disagree about the wall clock should still agree about what happened first. The protocol uses a hybrid logical clock (HLC) for that: every op timestamp is (wall_ms, logical, node), and ordering is lexicographic over the triple. The wall component keeps timestamps roughly aligned with human time; the logical component handles bursts within a millisecond; the node id breaks ties when two devices emit at the same (wall, logical).

The clock has two disciplines, both normative:

  • Tick on emit. Before a node writes a local op, it ticks its HLC, ensuring the new timestamp strictly dominates every prior timestamp the node knows about.
  • Recv on receive. When a node receives a remote op, it advances its own HLC past the received timestamp.

If either discipline is violated, two nodes can disagree about the order of operations they have both received. This is the kind of quiet bug that is undetectable in test fixtures and devastating in production.
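
Both disciplines fit in a few lines. The sketch below is one standard HLC formulation with assumed names; the spec's exact rules, not this sketch, are normative.

```rust
/// One standard hybrid-logical-clock formulation; ordering is lexicographic
/// over (wall_ms, logical, node), so the field order of the derive matters.
#[derive(Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Hlc {
    wall_ms: u64,
    logical: u32,
    node: String,
}

impl Hlc {
    /// Tick on emit: the new timestamp strictly dominates everything seen so far.
    fn tick(&mut self, now_ms: u64) {
        if now_ms > self.wall_ms {
            self.wall_ms = now_ms;
            self.logical = 0;
        } else {
            self.logical += 1; // burst within the same millisecond
        }
    }

    /// Recv on receive: advance past the received timestamp as well as local time.
    fn recv(&mut self, remote: &Hlc, now_ms: u64) {
        let max_wall = now_ms.max(self.wall_ms).max(remote.wall_ms);
        self.logical = if max_wall == self.wall_ms && max_wall == remote.wall_ms {
            self.logical.max(remote.logical) + 1
        } else if max_wall == self.wall_ms {
            self.logical + 1
        } else if max_wall == remote.wall_ms {
            remote.logical + 1
        } else {
            0
        };
        self.wall_ms = max_wall;
    }
}
```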

The causal frontier

A frontier is the per-author maximum-timestamp summary of what a node has seen. When two nodes synchronise, they exchange frontiers and ship each other the operations the other doesn't have. Because the HLC induces a total order per author, "what you don't have" is a clean set difference rather than a merkle-tree dance.

Frontiers are also the cursor for incremental sync. A node tells a peer "send me everything past this frontier"; the peer streams the matching ops and returns the resulting frontier as the cursor for the next exchange.
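
In sketch form, with assumed names, a frontier and the set difference it induces:

```rust
// Assumed names; the normative frontier and sync semantics are in Part 1.
use std::collections::BTreeMap;

type NodeId = String;
type Hlc = (u64, u32, NodeId); // orders lexicographically: (wall_ms, logical, node)

/// Per-author maximum timestamp a node has seen.
type Frontier = BTreeMap<NodeId, Hlc>;

/// The ops a peer is missing: everything authored past their frontier.
fn missing_ops<'a>(
    log: &'a [(NodeId, Hlc)], // (author, timestamp) per op, standing in for full ops
    their_frontier: &Frontier,
) -> Vec<&'a (NodeId, Hlc)> {
    log.iter()
        .filter(|(author, ts)| match their_frontier.get(author) {
            Some(seen) => ts > seen, // tuple Ord gives the per-author total order
            None => true,            // author unknown to the peer: send everything
        })
        .collect()
}
```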

Entities

An entity is a stable identity for a thing in the user's life: a person, a place, an organisation, a recurring commitment, a project, a concept. Entities are not pre-defined by the protocol; they are derived from evidence by the implementation's resolution pass. The protocol specifies how an implementation merges or splits entities and what provenance it must record when it does.

Entity identity is per-mesh, ULID-based, and survives across nodes — once two nodes have synchronised the operation that created an entity, they refer to it by the same id. Entity labels (the human-readable name) are claims like any other and can change; the id is what holds the cluster of claims together.

Claims

A claim is a working hypothesis: a subject (often an entity), a predicate (drawn from a centralised vocabulary), an object, a confidence vector, and a set of supporting operations. "Sarah is a close contact" is a claim. "Tuesdays are gym mornings" is a claim.

Claims have a status that reflects how strongly the system believes them and whether the user has had a say:

  • Hint — the system has noticed something but is not surfacing it yet.
  • Claim — the system is operating on this as a working belief.
  • Fact — the user has confirmed it; subsequent derivations may not silently override it.

Claims can be superseded (a newer op replaces an older claim about the same subject and predicate) and rejected (user assertion or downstream evidence invalidates them). Both transitions are themselves operations on the log, so the history of what the system used to believe is recoverable.

Confidence is a vector, not a scalar — the protocol carries multiple components (e.g. evidential, derivational, temporal) so an implementation can decide its own composition rule without losing the underlying signal.
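
A sketch of the shape described above, with assumed field and variant names; the normative claim op is defined in Operations:

```rust
// Field and variant names are assumptions, not the normative claim op.
struct Claim {
    subject: String,             // usually an entity id (ULID)
    predicate: String,           // drawn from the centralised vocabulary
    object: String,              // another entity id or a literal value
    status: ClaimStatus,
    confidence: Confidence,      // a vector, not a scalar
    supporting_ops: Vec<String>, // provenance: op ids this claim was derived from
}

enum ClaimStatus {
    Hint,  // noticed, not yet surfaced
    Claim, // working belief
    Fact,  // user-confirmed; derivations may not silently override it
}

struct Confidence {
    evidential: f32,
    derivational: f32,
    temporal: f32,
}
```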

The derivation DAG and provenance

Every claim links to the evidence and other claims it was derived from — its supporting operations. Following those links forms a directed acyclic graph rooted at evidence. The protocol enforces that the derivation graph is a DAG (the entity-resolution graph may have cycles; derivation may not), because cycles in derivation would make invalidation undecidable.

When a piece of evidence is tombstoned, or a user rejects a claim, the cascade walks the DAG forward and invalidates everything that transitively depended on the source. This is what makes the "refute" gesture in the surface mean something. The user is not just hiding a card; they are marking a node in the graph dead, and the system has to honour the consequences.

This is also what makes auditability mechanical. Asking "why does the system believe X" is following the DAG backwards from the claim to its supporting ops to the evidence at the leaves. There is no narrative to consult — the trail is the trail.
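
The forward cascade is a plain graph walk. A sketch, assuming an adjacency map from each op to the ops derived from it:

```rust
use std::collections::{HashMap, HashSet};

type OpId = String;

/// Walk the derivation DAG forward from a refuted or tombstoned source and
/// collect everything that transitively depended on it.
fn cascade_invalidate(
    derived_from: &HashMap<OpId, Vec<OpId>>, // source op -> ops derived from it
    source: &OpId,
) -> HashSet<OpId> {
    let mut invalid = HashSet::new();
    let mut stack = vec![source.clone()];
    while let Some(op) = stack.pop() {
        if let Some(children) = derived_from.get(&op) {
            for child in children {
                if invalid.insert(child.clone()) {
                    stack.push(child.clone()); // not yet seen: keep walking forward
                }
            }
        }
    }
    invalid
}
```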

Episodes and suggested actions (application-layer)

The reference implementation also defines two op types that exist purely to surface the substrate to a user. They are application-layer conventions, not part of the substrate proper, and live in Annex: Application Conventions. A node that does not surface records to a user — for example, an organisation peer — has no need to emit them.

An episode is a cluster of related evidence and claims with temporal bounds: a trip, a project, a relationship, a day worth remembering. Episodes are how the reference implementation presents narrative instead of list.

A suggested action is a recommendation the system makes to the user: send this message, review this calendar, reconsider this goal. Suggested actions have their own lifecycle (proposed, shown, acted, dismissed) and their own provenance link to the inference call that produced them. They exist to make recommendations refutable — a user's "stop suggesting this" is itself an op that the inference pipeline must respect.

Both are documented because the reference implementation emits them and other implementations may want to interoperate with applications that consume them. They are not, however, what makes a node Likewise-conformant; the substrate is.

Projections

Reading the log directly to answer "what does the system know about Sarah" would require a fold over millions of ops. Implementations maintain projections: in-memory or on-disk views that an op-application function keeps in sync with the log on every write.

The protocol distinguishes three substrate projections by purpose, and the distinction is load-bearing:

  • An inference projection is shaped for assembling a model context window. It is not a UI store; it is prompt furniture.
  • A detail projection is durable, on-disk, and shaped for the user-facing reads ("show me everything you know about Sarah"). It carries titles, labels, claim text, provenance links.
  • A debug-graph projection exists for inspection tooling. It contains the full graph of entities and claims and is generally not maintained at production load.

A fourth projection — a salience projection used to rank what is important now — is an application-layer convention, not part of the substrate. The reference implementation provides one; alternatives are free to substitute or omit.

The reason these are separate is that collapsing them produces a single fat object that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementations are free to optimise within each projection; they are not free to fold them into one.

The detail projection rebuilds from the log when missing or corrupted. This is the mechanism that closes the loop on the "projections are disposable" rule: an implementation can lose every cache and recover from the log alone.
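
In sketch form, the contract every projection satisfies, with assumed trait and type names; the protocol specifies what each projection must answer, not this interface:

```rust
// Assumed names; an implementation may structure this however it likes.
struct Op; // the signed operation envelope

trait Projection {
    /// Discard the materialised view entirely.
    fn reset(&mut self);

    /// Keep the view in sync with the log on every write.
    fn apply_op(&mut self, op: &Op);

    /// Disposability: any projection can be rebuilt from the log alone.
    fn rebuild(&mut self, log: &[Op]) {
        self.reset();
        for op in log {
            self.apply_op(op);
        }
    }
}
```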

Capabilities

A capability is a triple (resource, action, caveats):

  • Resource — a class of operation or content (operations of a kind, evidence of a source type, claims with a predicate, jobs of a kind, episodes, artefacts, suggested actions, mesh coordination, registration).
  • Action — what may be done (read, write, schedule, claim, complete).
  • Caveats — narrowing constraints on the resource and action:
    • source_types — only evidence from these source types.
    • predicates — only claims with these predicates.
    • kind_prefix — only jobs whose kind starts with this prefix.
    • time_range — only ops with timestamps in this window.
    • sanitize — operations crossing this delegation must be passed through these field-level redactions: strip GPS, redact participants, truncate content bodies, strip custom metadata.

Capabilities are carried by UCAN delegations rooted at the user. A capability is delegated by a parent, may be re-delegated by the recipient if and only if the new delegation is attenuated (its caveats are at least as restrictive as the parent's), and may be revoked at any time. Revocation prunes the subgraph of delegations beneath the revoked one and invalidates any already-applied operations whose authority depended on the revoked capability.
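
The attenuation rule is a subset check per caveat. A sketch with assumed names, treating "at least as restrictive" as set containment:

```rust
use std::collections::HashSet;

// Assumed representation; the normative caveat semantics are in Caveats.
struct Caveats {
    predicates: Option<HashSet<String>>, // None means "not narrowed"
    source_types: Option<HashSet<String>>,
}

/// A re-delegation is valid only if it is attenuated: every caveat is at
/// least as restrictive as the parent's.
fn is_attenuation(parent: &Caveats, child: &Caveats) -> bool {
    narrower(&parent.predicates, &child.predicates)
        && narrower(&parent.source_types, &child.source_types)
}

fn narrower(parent: &Option<HashSet<String>>, child: &Option<HashSet<String>>) -> bool {
    match (parent, child) {
        (None, _) => true,                    // parent unrestricted: anything is narrower
        (Some(_), None) => false,             // child may not widen a restriction
        (Some(p), Some(c)) => c.is_subset(p), // child's set must fall within the parent's
    }
}
```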

Sanitisation is the most subtle caveat. When an op crosses a delegation that requires sanitisation, the relevant fields are redacted and the signature is cleared. The recipient sees an unsigned op tagged as sanitised, which is treated as a deliberate filter and applied. An op without a signature that is not tagged as sanitised is rejected — the missing signature would otherwise be indistinguishable from corruption.

The capability machinery is symmetric: a delegation chain that admits a personal device (the user's laptop, an inference server the user runs at home) is structurally identical to one that admits an organisation the user has chosen to invite in. A retailer deploying a Likewise node, a clinic running a scheduling assistant against a user-authorised slice of their calendar, an employer's scheduling helper that sees only the predicates the employee has consented to share — all of these are the same kind of peer to the protocol. They differ only in the scope of the delegation the user has signed. This is what makes consensual commercial data partnership a use case the protocol enables out of the box rather than a separate machinery.

Mesh coordination

Multiple nodes can do work for the same user. The protocol provides a vocabulary for who claims what:

  • ScheduleJob — declare that a unit of work needs to happen (e.g., "synthesise an episode for last week").
  • ClaimWork — a node takes responsibility for a scheduled job, with a hybrid-logical-clock-relative lease.
  • CompleteJob — the work is done; the result is written as follow-on ops the rest of the mesh receives on next sync.
  • YieldWork — the claiming node is releasing the lease voluntarily.
  • ExpireWork — the lease passed without a completion; another node may now claim.

Two further ops shape who does what:

  • DesignateCoordinator — the user (or a delegate) designates the node responsible for the deterministic derivation pass. This is not an election; it is a declaration. The coordinator's output is what the mesh agrees to derive from a given log prefix.
  • RouteKind — the user routes a class of jobs to a specific node. Once routed, only the target node may claim jobs of that kind. Used to direct heavy inference to a server while keeping the phone in charge of the log.

These ops use the same UCAN-shape capabilities as everything else: scheduling a job requires Schedule on Job; claiming requires Claim on Job with a kind_prefix caveat that admits the kind.
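
In sketch form, the coordination vocabulary and the lease check that gates a claim, with assumed names; the normative ops are in Mesh Coordination:

```rust
// Variant and field names are assumptions, not the normative op definitions.
type Hlc = (u64, u32, String); // (wall_ms, logical, node)

enum JobOp {
    ScheduleJob { kind: String },                   // "this work needs to happen"
    ClaimWork { job_id: String, lease_until: Hlc }, // a node takes responsibility
    CompleteJob { job_id: String },                 // results follow as ordinary ops
    YieldWork { job_id: String },                   // voluntary release of the lease
    ExpireWork { job_id: String },                  // lease lapsed; claimable again
}

/// A new claim is admissible only if no live lease covers the job.
fn claimable(existing_lease: Option<&Hlc>, now: &Hlc) -> bool {
    match existing_lease {
        Some(lease_until) => now > lease_until, // the previous lease has expired
        None => true,
    }
}
```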

Inference auditing

When a node operating under audit calls a model, the call itself becomes an op. Specifically, a likewise.inference.snapshot artefact is written to the log recording the retrieved context (the evidence and claims fed into the prompt), the model identity, telemetry (latency, token counts, backend), and the output. Any claim or suggested action the call produced links back to the snapshot.

Audit is in force in two cases: when the node is operating under the user's root delegation (a node the user runs themselves), and when the node is operating under a delegation whose audit_inference caveat the user has set to true. A delegated node operating without an audit caveat is not required by the protocol to record its inference; what it does internally is governed by the delegation's other caveats. The user retains the choice; the protocol enforces it when chosen.

Snapshots are first-class artefacts and follow the same lifecycle: they have TTLs, they can be evicted, they can be tombstoned with their underlying evidence. While they exist, they are the authoritative answer to "why did the system say that."
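
A sketch of what a likewise.inference.snapshot artefact records, with assumed field names; the normative layout is in Inference Audit:

```rust
// Field names are assumptions, not the normative artefact layout.
struct InferenceSnapshot {
    retrieved_context: Vec<String>, // op ids of the evidence and claims fed to the prompt
    prompt: String,                 // the literal prompt sent to the model
    model: String,                  // model identity, e.g. name and version
    telemetry: Telemetry,
    output: String,                 // the raw model output
    ttl_secs: u64,                  // snapshots are bounded artefacts, not permanent
}

struct Telemetry {
    latency_ms: u64,
    input_tokens: u32,
    output_tokens: u32,
    backend: String,                // which inference backend served the call
}
```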

The full mechanism is specified in Inference Audit.

The six rules in plain English

These are the non-negotiable rules from the Invariants chapter, restated without normative language so the intent is clear:

  1. Only operations change the truth. Caches and projections do not. Anything you can't reproduce by replaying the log is not real.
  2. Every claim has provenance. No fact about the user appears without a chain back to evidence the user provided.
  3. Derivation is a DAG, and refutations cascade. Marking something wrong has consequences the system has to honour.
  4. Sync converges operations, not projections. Two nodes that have seen the same ops agree on truth, regardless of how each chose to materialise it.
  5. Every op is signed. Identity is per-device, anchored at the user's root delegation. There are no anonymous writes.
  6. Inference is auditable. On the user's own nodes, every model call is recorded as a referenceable op by default. On nodes the user has delegated to, audit is opt-in via a caveat the user attaches to the delegation.

A system that violates any of these breaks the user's ability to own what it says about them. The rest of the spec exists to make those rules precise enough to implement.

Three layers, one specification

The specification is organised into three explicit layers that match the architecture above:

  • Part 1: The substrate. Evidence, claims, entities, sync, signatures, capabilities, the substrate projections. This is what every conformant node implements. It is sufficient on its own to express and synchronise a user-owned knowledge graph across an arbitrary set of authorised peers, including organisational peers.
  • Part 2: The inference pipeline. Job scheduling and claiming, routing kinds to specific nodes, and the inference-snapshot artefact convention that gives the system its audit trail. An implementation that wants to participate in distributed audited inference implements Part 2 on top of Part 1; an implementation that doesn't (a substrate-only peer) ignores it entirely.
  • Annex: Application conventions. Episodes, suggested actions, and the salience-ranking projection — the reference implementation's choices for surfacing the substrate to a user. These are not normative; alternative implementations are free to substitute their own application layer.

This split is load-bearing for the org-as-peer scenario: a retailer's node implementing Part 1 (and optionally Part 2) is fully conformant without ever touching the application conventions. A user-facing node in the spirit of the reference implementation will typically implement all three layers.

Where to go next

  • Comparison — how this protocol relates to other decentralized-data work.
  • Conventions — the start of the normative specification.

Comparison with adjacent work

This chapter is an honest contrast between Likewise and other public work in the decentralized-data and personal-AI space. The aim is not to persuade. It is to give a reader who already knows one of these projects the shortest possible path to understanding what Likewise does differently — and, importantly, where another project does something better and Likewise should not be chosen.

The protocol is young. Several of the projects below are not. None of what follows is meant to disparage them; they are the reason Likewise could be designed at all.

Solid

Solid (Tim Berners-Lee's project, ongoing at MIT and Inrupt) returns control of personal data to users by storing it in user-owned Pods that any application can read or write with permission. The goal is to break data silos so multiple apps can interoperate over the same RDF graph the user owns.

A Pod is an HTTP server exposing Linked Data Platform containers and RDF resources (Turtle, JSON-LD). Identity is WebID (an HTTP URI that dereferences to a profile document) authenticated via Solid-OIDC. Authorization is Web Access Control or Access Control Policy — ACL documents attached to resources. Mutation is plain HTTP CRUD. The Solid Notifications Protocol pushes resource updates over WebSocket / WebHook.

Where it overlaps with Likewise. Both treat "your data, your server" as the foundational stance. Both decentralize identity. Both have capability-flavoured access control. Both expect external applications to operate on a graph the user owns.

Where it diverges. Solid is CRUD-on-RDF-resources; Likewise is an append-only signed-op log with deterministic projections. Solid has no concept of evidence-claim-episode lineage, no causal ordering (no HLC or vector clock), no per-op signatures, no inference auditing, and no work routing. Pods assume an always-online HTTP origin; Likewise expects a small mesh of user-owned devices with intermittent connectivity. Solid leans on the open-world semantics of RDF; Likewise's predicate vocabulary is centralised and lint-enforced for the same reasons it has typed ops in the first place.

Sources: Solid Project, Solid Specification, Solid Protocol.

AT Protocol

AT Protocol (Bluesky) decentralizes social networking by giving each user a portable, content-addressed repository that can move between hosting providers (Personal Data Servers, or PDSes). Relays aggregate a public firehose so anyone can build a feed, index, or app over the network without a central gatekeeper.

Each user has a DID resolving to a signing key and service endpoint. Their PDS holds a repo: a Merkle Search Tree of records (DAG-CBOR / IPLD), each commit signed by the account key. Records conform to Lexicons — typed JSON schemas named with NSIDs. Sync is via the firehose (a WebSocket stream of commits) and CAR-file repo export for migration.

Where it overlaps with Likewise. This is the closest cousin in the list. Both: per-user signed log, content-addressed records, account/key portability, schema-typed records (Lexicons play the role Likewise's typed op variants play), single-author repos with cryptographic verification independent of the host. The shape "signed append-only repo plus sync from a frontier" is the same pattern.

Where it diverges. AT Protocol is public-by-default broadcast designed for global indexing — relays slurp everyone's firehose so anyone can build a search engine over the network. Likewise is private-by-default mesh, gated by UCAN delegations with sanitisation caveats, where every op crossing the wire passes through a capability filter. AT has no UCAN-style delegation, no work scheduling or routing, no inference-snapshot artefacts, no multi-projection materialisation, no evidence→claim→episode derivation DAG, and no derived-data invalidation. AT records are user-authored social objects; Likewise ops include machine-derived hypotheses with provenance back to evidence and a mechanism for the user to refute them.

If you want public discoverability and a thriving third-party indexing ecosystem, AT Protocol is the right tool. If you want a private mesh of a single user's own devices, the goals diverge enough that they are not really competitors.

Sources: AT Protocol, AT Protocol Specification, Data Repositories.

Nostr

Nostr ("Notes and Other Stuff Transmitted by Relays") is a censorship-resistant publish-subscribe substrate. Users sign events with a keypair and broadcast them to multiple relays; readers subscribe to relays and verify signatures locally, so no single relay can silence a user.

Identity is a secp256k1 keypair. The wire unit is an event with a signed (id, pubkey, created_at, kind, tags, content, sig) envelope. Kinds are integers (0 = profile, 1 = text note, 3 = follows, 30000+ = addressable). Relays speak a small WebSocket protocol. There is no causal ordering, no consensus, and no required durability.

Where it overlaps with Likewise. Per-event signing with a user-owned key. Multi-host distribution. Client-side verification. The "everything is a signed event with a kind" mental model rhymes with Likewise's signed-op log with typed payload variants.

Where it diverges. Nostr has no causal frontier — events are effectively a flat set ordered by created_at, which is whatever the user picks. There is no delegation-with-attenuation that has seen serious adoption (NIP-26 was largely abandoned). There is no derived state, no projections, no evidence-or-claim model, no work routing. Nostr is intentionally public broadcast; encrypted DMs exist but are a thin add-on. Nostr's tag system is freeform and emergent; Likewise's predicate vocabulary is centralised and small by design, because predictable derivation requires a closed vocabulary.

Sources: Nostr, NIPs.

Iroh

Iroh ("dial keys, not IPs") is a modular Rust networking stack that gives any two devices an end-to-end-encrypted QUIC connection identified by public key, traversing NATs via relay servers when direct holepunching fails. Higher-level protocols (blobs, docs, gossip) sit on top of the transport.

NodeId is an Ed25519 public key. iroh-blobs handles BLAKE3 content-addressed transfer with resumable verified streaming. iroh-docs is a multi-writer key-value replica. iroh-gossip does epidemic broadcast. iroh-willow is in development as a next-gen replacement using the Willow data model.

Where it overlaps with Likewise. Both target multi-device sync over hostile networks. Both are Rust-first. Both use Ed25519 keys as the identity primitive. Both content-address payloads (BLAKE3 in Iroh, hash-referenced evidence in Likewise). iroh-docs replicas with a per-author key look superficially like Likewise's signed-op log with a per-node identity.

Where it diverges. Iroh is transport plus sync primitives, not a domain model. It has no claims, episodes, inference snapshots, UCAN delegation graph, projection model, or scheduled-work vocabulary. Iroh's authorization story beyond namespace write-keys is intentionally underspecified.

This is largely a non-overlap. Likewise could plausibly be implemented over Iroh's transport — replacing today's HTTP + reqwest layer with iroh-net QUIC connections — and the result would be additive rather than competitive. Today's reference implementation uses HTTP because it is sufficient for a LAN mesh.

Sources: Iroh, Iroh Docs, iroh-willow.

Local-first software (Ink & Switch)

The local-first manifesto is not a protocol; it is the essay that named seven ideals modern cloud apps fail at: no spinners, multi-device, offline, seamless collaboration, longevity, privacy/security, and user ownership. The essay surveys CRDTs (and Automerge in particular) as candidate plumbing, but the seven ideals are values, not specifications.

Where it overlaps with Likewise. Likewise is squarely a local-first system by these criteria — every ideal is a design goal. The "rebuild projections from the op log" stance directly serves longevity (#5: works in 10 years) and ownership (#7: you own your data). The single-user multi-device mesh addresses multi-device (#2) and offline (#3). The capability-gated sharing model serves privacy (#6). And so on.

Where it diverges. The manifesto leans on CRDT auto-merge of arbitrary structured documents as the canonical answer to multi-device sync. Likewise uses an append-only signed log with deterministic projections and last-write-wins-by-OpId for entity merges, not a generic CRDT. This is a deliberate choice: the data domain is narrow enough that a typed op vocabulary is more precise than a generic mergeable document model, and easier to reason about for derivation. The price is that Likewise is not the right tool for collaborative document editing across multiple users — it is single-user-mesh, not multi-user-collab.

The manifesto is also silent on something Likewise has a strong opinion about: derived intelligence with auditable provenance. Local-first thinking informed Likewise; Likewise commits to a stance the manifesto does not take.

Source: Local-First Software.

UCAN — a building block, not a competitor

UCAN (User-Controlled Authorization Network) is offline-verifiable, decentralized authorization. Instead of an OAuth server issuing tokens, the resource owner signs a delegation directly to a delegate, who can re-delegate (attenuated) further. Verification is purely cryptographic: walk the chain, check signatures and attenuation.

A token is a signed envelope over {iss, aud, sub, cmd, policy, exp, nbf, …}. UCAN v1.0 (DAG-CBOR + Varsig + CIDv1 envelopes) is the current direction of the working group; v0.10 was the last JWT-shaped revision.

How Likewise uses UCAN. Every DelegateUcan op carries a v0.10 token; an implementation's UCAN view materialises the delegation graph and enforces strict attenuation per hop. Likewise extends UCAN's policy/caveat slot with a domain-specific caveat set: source_types, predicates, kind_prefix, time_range, and a sanitize directive (StripGeo, RedactParticipants, TruncateContent, StripCustomMetadata). These plug into a capability policy engine that runs authorization plus transitive-cascade plus field-level sanitisation on every outbound op stream. Likewise also extends the Resource and Action enums with Job and Schedule so work routing rides the same delegation graph.

Migration cost. Likewise is currently on UCAN v0.10. The v0.10 → v1.0 migration is non-trivial (envelope format and canonicalisation differ) and is tracked as an open issue.

Sources: UCAN Specification, ucan.xyz.

Automerge

Automerge is a CRDT library and sync engine for collaborative document editing. Documents are JSON-shaped CRDTs with full op history; the sync protocol exchanges Bloom-filtered have/need summaries until peers converge. Works over any byte transport.

Where it overlaps with Likewise. Both are append-history-based and target offline-first multi-device. Likewise's "rebuild projections from op log" is structurally similar to Automerge's "materialise document state from op history."

Where it diverges. Automerge is content-agnostic — it merges generic JSON. Likewise's ops are typed and domain-specific. Automerge has no built-in authorization model; Likewise has UCAN end-to-end. Conflict resolution: Automerge uses CRDT merge semantics per field; Likewise uses last-write-wins-by-OpId for entities (with deterministic cycle resolution). Likewise's approach is simpler and less expressive, but it is better suited to derived data, where "the latest user assertion wins" is the right rule.

For collaborative editing across multiple users, Automerge wins. Likewise is not trying to play that game.

Source: Automerge.

Willow Protocol

Willow (2023+) is an authenticated-sync protocol designed for partial replication of large keyed datasets with capability-based access control and confidential sync — peers only learn about data they are authorised to see, including not learning what they are missing.

Data lives in namespaces, subspaces, paths, and entries. An entry is (namespace_id, subspace_id, path, timestamp, payload_length, payload_digest). Subspaces typically map one-per-author. Prefix pruning gives "destructive editing": writing at blog/idea with a newer timestamp deletes all blog/idea/* descendants. Authorization is Meadowcap, a capability system supporting both owned (top-down) and communal (bottom-up) namespaces. Confidential sync uses private-set-intersection-style techniques.

Where it overlaps with Likewise. This is the closest architectural cousin. Capability-based auth (Meadowcap rhymes with UCAN-plus-caveats), per-author signed entries (analogous to Likewise's signed ops), partial sync (Willow's range-based "area of interest" rhymes with Likewise's frontier-plus-filter), timestamp ordering. iroh-willow brings these capabilities into the same Rust ecosystem Likewise's reference implementation inhabits.

Where it diverges. Willow is a storage and sync substrate, not a knowledge model. It has no claims, episodes, inference snapshots, derivation DAG, or work routing. It is the layer underneath what Likewise does. Conversely, Willow's confidential sync is stronger than what Likewise does today — Likewise relies on the sender honestly applying its capability filter server-side, where Willow's design prevents peers from probing for unauthorised data at all. This is a real gap, and one we expect to close some day; it is tracked as an open issue. Willow's destructive editing via prefix-pruning is also more aggressive than Likewise's tombstone-cascade (which preserves the log and only invalidates derivations).

If Willow had existed when Likewise started, this specification might be a knowledge-graph model defined over Willow rather than alongside it. The right relationship may yet turn out to be that one.

Sources: Willow Protocol, Willow Data Model.

Honest synthesis

What Likewise contributes that the projects above don't

  • A typed knowledge-graph vocabulary (evidence → claim → entity → episode → action) baked into the op log, not modeled on top of a generic store. Lexicons (AT) and predicates (Solid/RDF) get close, but they are schema systems, not lifecycle models with derivation DAGs and tombstone-cascade semantics.
  • Inference auditability as a separable layer. The protocol defines a likewise.inference.snapshot artefact type and a conditional invariant that requires snapshots from any node operating under the user's root delegation, or under a delegation whose audit_inference caveat the user has set. Every audited model call lands as a snapshot recording retrieved context, model identity, telemetry, and output; derived records link back. None of the surveyed protocols treat machine-derived state as a thing that needs provenance back to evidence. The audit pipeline is a separable layer (Part 2 of the specification), so a substrate-only peer — for example, an organisation node receiving a scoped slice of the user's graph — is conformant without participating in audit unless the user required it via caveat.
  • Domain-extended UCAN caveats including audit_inference — the v0.1 caveat vocabulary (source_types, predicates, kind_prefix, time_range, sanitize, audit_inference) covers both data scoping and behavioural requirements. UCAN itself is the building block; the caveat vocabulary is Likewise's contribution.
  • Work routing in the same op log (ScheduleJob, ClaimWork, RouteKind) so heterogeneous nodes — phone without inference, server with GPU — cooperate via the same delegation graph that gates data access. AT, Solid, Nostr have no equivalent; Iroh has a separate task system at a different layer.
  • An opinionated read-path projection split (salience, inference, detail, debug-graph) tuned for on-device LLM prompting, UI reads, and ranking from a single log.
  • A substrate for consensual commercial data sharing. The capabilities, caveats, and sanitisation rules that secure the user's own mesh generalise directly to delegations to organisations the user invites in. A retailer's node, a clinic's node, an employer's scheduling assistant — each can run a conformant peer with a scope-restricted view of the user's graph, receiving only the claims the user authorised, with sanitisation enforced at the wire boundary. None of the projects above target this user-org-consent shape: they are either personal-only (Iroh, local-first, Automerge) or public-broadcast (AT, Nostr), with Solid the closest in spirit but lacking the caveat + sanitisation vocabulary that makes scoped commercial sharing tractable in practice. See Motivation: Consensual data partnership.

What Likewise doesn't do that one of these does well

  • Confidential sync. Willow's design prevents peers from probing for data they are not authorised to see. Likewise relies on the sender honestly applying its capability filter server-side. Closing this gap is an open issue.
  • Generic structural merging. For collaborative text or list editing across users, Automerge is better. Likewise's last-write-wins-by-OpId is deliberately coarse, because the domain doesn't need finer.
  • Public discoverability and third-party indexing. AT Protocol's firehose model is the right tool for "anyone can build an app over the public stream." Likewise is private-by-default and would have to add new machinery to do this; we have no current plans to.
  • Mature ecosystem of clients and apps. Solid has Inrupt and the Community Solid Server. AT has Bluesky and the wider ATmosphere. Nostr has dozens of clients. Likewise has one pre-1.0 reference implementation. The ecosystem cost is real.
  • Account portability across hosts. AT Protocol's DID + CAR-export migration is more developed than Likewise's story, which assumes the user owns all participating nodes rather than migrating between hosting providers.
  • NAT traversal and transport. Iroh's holepunching plus relay stack is what you would want for cross-network device sync. Likewise's HTTP loopback transport is sufficient for a LAN mesh; an Iroh-backed transport is plausible future work.
  • UCAN v1.0. Likewise is on v0.10 (JWT shape); the ecosystem is migrating to v1.0 envelopes (DAG-CBOR plus Varsig). This is technical debt, not a design choice.

The honest one-line summary

If you want public-network social, choose AT Protocol. If you want collaborative editing, choose Automerge. If you want a capability-based confidential-sync substrate, watch Willow closely. If you want a knowledge graph of yourself, owned by you, with auditable inference and a private mesh of your own devices — that's what this protocol is for, and we don't currently know of another public specification that targets the same brief.

Conventions

This chapter defines the conventions used throughout the specification. Subsequent chapters are normative; this one tells you how to read them.

Status of this document

This is Likewise, version 0.1 — draft for public review.

The wire format described in this specification is exercised by an end-to-end reference implementation. It is not yet stable across major versions. Backwards-incompatible changes between v0.1 and v1.0 are expected. Known cross-implementation hazards are catalogued in Open Issues.

Conformance language

The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119 and RFC 8174 when, and only when, they appear in all capitals.

In short:

  • MUST / MUST NOT — absolute requirement / prohibition.
  • SHOULD / SHOULD NOT — strong recommendation; deviation requires understanding consequences.
  • MAY — truly optional.

Normative versus informative material

Each chapter below is divided into normative sections (which use RFC 2119 keywords) and informative sections (which do not). An informative section may explain rationale, give examples, or sketch how an implementation might satisfy the normative requirements. Informative material does not impose requirements. Where the two appear to conflict, the normative material wins.

Examples in code blocks, diagrams, and prose anecdotes are informative.

Versioning

Likewise follows a semantic-versioning shape:

  • Major version changes are backwards-incompatible. An implementation MUST NOT silently interoperate across major versions. A change to the wire format, the canonical signing rules, the operation payload encoding, or the meaning of a capability caveat is a major-version change.
  • Minor version changes are backwards-compatible additions: new operation variants, new caveats, new sanitisation rules, new reserved fields with safe defaults. An implementation that does not understand a minor-version addition MUST treat it as unknown-but-tolerated where the spec allows, and reject the op otherwise. The specification chapter that introduces an addition states which.
  • Patch version changes are editorial only — they do not change observable behaviour.

Two implementations on the same major version SHOULD interoperate without negotiation. Two implementations on different major versions MAY refuse to interoperate; the X-Likewise-Mesh-Rules-Hash sync header is the v0.1 mechanism by which a mismatched pair detects this and pauses sync rather than corrupting each other (see Sync).

Defined terms

The following terms are used with precise meanings throughout the specification.

  • Node — a process running an implementation of this protocol. A node has a long-lived NodeId and a corresponding signing key. A node is the unit of authorship for operations.
  • User — the human (or organisation) at whose authority all delegations in a mesh are rooted. Identified by a DID.
  • Mesh — the set of nodes belonging to one user. Mesh membership is governed by capability delegations rooted at the user.
  • Operation (or op) — the typed, signed unit of state change. Defined in Operations.
  • Op log (or just log) — a node's append-only sequence of operations.
  • Projection — a materialised read view derived from the op log. Defined in Projections.
  • Evidence — an immutable raw input the user has chosen to ingest, referenced by content hash. Defined in Data Model.
  • Claim — a working hypothesis about the user, derived from evidence and other claims. Defined in Data Model.
  • Capability — a triple (Resource, Action, Caveats) authorising a node to perform a class of operation. Defined in Capabilities.
  • HLC — hybrid logical clock. The timestamp scheme defined in Clocks.
  • Causal frontier — the per-author maximum-timestamp summary a node uses as its sync cursor. Defined in Sync.
  • Owner — the node holding the user's root delegation. Owner is a per-mesh role, not a separate identity. Some operations (notably RouteKind and DesignateCoordinator) are owner-only.
  • Coordinator — the node designated to run the deterministic derivation pass for the mesh. There is exactly one coordinator per mesh at a given log prefix; the user designates it explicitly (see Mesh Coordination).

Authoritative sources

When this specification is silent or ambiguous, fall back in this order:

  1. The relevant RFC for any externally-defined primitive (RFC 7515 for JWS, the UCAN specification for tokens, etc.).
  2. The maintainers' issue tracker, which is where ambiguities are clarified in subsequent revisions of the specification.

The protocol was developed alongside an in-progress reference implementation (working codename Cortex — provisional; see Implementations). The implementation is not yet publicly available; once it is, its observed behaviour will become the practical fall-back authority for v0.1 ambiguities. Until then, file an issue.

How to cite

When citing this specification, use the form:

Likewise, version 0.1. https://getlikewise.ai/spec/

The protocol is licensed CC-BY-4.0 (see LICENSE at the repository root). Attribution is required.

Data Model

This chapter defines the structural elements an implementation manipulates: evidence, operations, projections, and the identifier types that link them. It does not define wire encodings (see Wire Format) or the full operation taxonomy (see Operations).

1. Layers

A conforming implementation MUST distinguish three layers:

  1. Evidence — immutable, content-addressed inputs the user has ingested.
  2. Operations — typed, signed, totally-ordered records of state change. The operation log is the canonical store of truth.
  3. Projections — materialised read views derived from the operation log. Projections are disposable and rebuildable.

Higher layers in this list depend only on lower layers. Operations reference evidence by hash. Projections are computed from the operation log. A projection's state MUST NOT be mutated except by applying operations.

Evidence content (the photo bytes, the calendar payload) MAY be stored separately from the op log in any way an implementation chooses, including not at all on a given node, so long as the content hash recorded in the operation referencing it remains verifiable when the bytes are present.

2. Identifier types

The protocol uses several typed identifier categories. Where this specification refers to an "Id" of a particular kind, the identifier MUST belong to that category. Implementations SHOULD prevent cross-category confusion at the type level where the implementation language permits it.

2.1 NodeId

A NodeId is a long-lived identifier for a node. A NodeId MUST correspond, one-to-one, to an Ed25519 public key used for op signing. The mapping is established when a node first authors a DelegateUcan op announcing its presence and is fixed for the lifetime of the node.

A NodeId is assigned by the implementation at node initialisation and MUST be unique within a mesh. The protocol does not specify the encoding of NodeId beyond requiring that it be a stable byte string suitable for use as a map key and a JWS kid value.

2.2 ULID-shaped record identifiers

The following identifiers are time-sortable ULID-shaped values: OpId, EvidenceId, EntityId, ClaimId, JobId, ArtifactId, EpisodeId, ActionId. Implementations MUST treat each as opaque outside the operations that produce them, except that ULID-derived total ordering MAY be relied upon for tie-breaking where the specification calls for it (e.g. last-write-wins entity merge in Mesh Coordination).

2.3 ContentHash

A ContentHash is a 32-byte BLAKE3 hash of canonical bytes. It is used to reference:

  • Evidence payloads (the photo, the calendar event content).
  • UCAN tokens (for proof-chain references).
  • Mesh-rules documents (for the sync handshake).

ContentHash values MUST be encoded as 32 raw bytes on the wire and MAY be encoded as 64-character lowercase hex in human-readable contexts.
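
The sketch below is informative: one way to compute a ContentHash in Rust, assuming the blake3 and hex crates. The function names are illustrative, not part of the specification.

// Informative sketch, assuming the blake3 and hex crates; names are illustrative.
fn content_hash(canonical_bytes: &[u8]) -> [u8; 32] {
    // 32 raw bytes, as required on the wire.
    *blake3::hash(canonical_bytes).as_bytes()
}

fn content_hash_hex(canonical_bytes: &[u8]) -> String {
    // 64-character lowercase hex, for human-readable contexts only.
    hex::encode(content_hash(canonical_bytes))
}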

2.4 DID

The user is identified by a Decentralized Identifier (DID). The protocol does not constrain the DID method; did:key and did:plc are both acceptable. The user's DID is the issuer of the root delegation in a mesh.

A node's signing key is bound to its NodeId rather than to a DID directly; the binding from NodeId to a DID is established by the chain of UCAN delegations rooted at the user.

3. Evidence

3.1 Identity

Each piece of evidence has:

  • An EvidenceId (assigned by the ingesting node).
  • A ContentHash of the canonical content.
  • A SourceAnchor — a stable identifier from the upstream system the content was ingested from (calendar event UID, photo asset identifier, message identifier). Multiple nodes that ingest the same upstream item MUST agree on the SourceAnchor.
  • A source_type — a short string identifying the kind of upstream system ("calendar", "photo", "contact", "location", ...). This is the value matched by the source_types capability caveat (see UCAN and Caveats).
  • Optional custom metadata.

3.2 Immutability

Evidence is immutable. Once an evidence-ingest operation lands on the log, the content referenced by that op's ContentHash MUST NOT change. To remove evidence, the user (or an authorised node) emits a TombstoneEvidence op, which triggers a derivation cascade.

Tombstoning preserves the evidence-ingest op on the log. An implementation MUST NOT delete the original op; the historical record of what was once known survives, even if the implementation has discarded the underlying content bytes.

3.3 Cascade

When an evidence record is tombstoned, every claim, entity merge, episode, suggested action, and inference snapshot that transitively depended on it MUST be invalidated. The mechanism is specified in State Machines. The operation that tombstones evidence is itself the cascade: a CascadeTombstone op carries the set of dependent records it invalidates, atomically.

4. Operations

4.1 Common shape

Every operation, regardless of payload variant, carries the following fields:

  • op_id: an OpId assigned by the authoring node.
  • author: the NodeId of the authoring node.
  • timestamp: a hybrid logical clock value (wall_ms, logical, node).
  • payload: a typed payload; see Operations for the variants.
  • signature: an Ed25519 signature over the canonical encoding of the op with the signature field cleared. See Signatures and Wire Format.

An operation with a missing signature is valid only if it has been intentionally sanitised by an authorised filter (see UCAN and Caveats). All other unsigned ops MUST be rejected.

4.2 Total order

The triple (timestamp.wall_ms, timestamp.logical, timestamp.node) induces a total order over all operations in a mesh. Where this specification requires ops to be applied "in order," it means in this total order.

Two operations with the same (wall_ms, logical, node) MUST NOT exist; that case is a violation of the HLC tick discipline (see Clocks) and the receiving node MUST treat it as an authoritative integrity failure.

4.3 Idempotence

Application of an operation to a projection MUST be idempotent. A node that receives the same op twice MUST produce the same projected state as if it had received it once.

Implementations typically achieve this by deduplicating on op_id at the log layer, but the requirement is on the projected state, not on the deduplication mechanism.

5. Projections

A projection is a materialised view computed from the operation log. Projections are derived state — they MUST be fully reconstructable from the op log alone.

A conforming implementation MUST provide projections sufficient to answer the queries described in Projections. The protocol distinguishes four projections by purpose, not by implementation strategy:

  • Salience projection — for ranking what is currently important.
  • Inference projection — for assembling a model context window.
  • Detail projection — for per-id user-interface lookups; the on-disk read layer.
  • Debug-graph projection — for inspection and verification tooling. Optional in production.

Implementations MAY combine the underlying storage of multiple projections; they MUST NOT collapse the read interfaces such that the query semantics of one projection contaminate another. The load-bearing distinction is between what each projection answers, not between where its bytes live. See Projections for the contract each projection must honour.

6. The provenance graph

The relationships between evidence, claims, and other derived records form a directed acyclic graph (DAG). The vertices are records; the edges are "derived from" links carried in the operation that produced the derived record.

A conforming implementation MUST:

  • Record the supporting operations of every derived claim, episode, and suggested action. The set of supporting operations MUST be recoverable from the op log.
  • Treat the derivation graph as a DAG. An operation that would introduce a cycle into the derivation graph MUST be rejected.
  • Implement transitive invalidation: when a vertex is tombstoned or rejected, every vertex that transitively derives from it (every record with a derivation path back to the tombstoned or rejected vertex) MUST be invalidated; an informative sketch appears at the end of this section. The state-machine consequences of invalidation are specified in State Machines.

The entity-resolution graph (which entity merges into which) is not required to be acyclic; the merge-conflict resolution rules in Mesh Coordination handle the cyclic case deterministically.
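
The following informative sketch illustrates transitive invalidation as a breadth-first walk over a dependents map (record id to the records derived from it). The map, the string identifiers, and the function name are illustrative; the normative state transitions are in State Machines.

use std::collections::{HashMap, HashSet, VecDeque};

// Informative sketch. `dependents` maps a record id to the records derived
// from it (the direction in which invalidation flows). Identifiers are
// illustrative stand-ins for the typed ids defined earlier in this chapter.
fn invalidation_set(
    start: &str,
    dependents: &HashMap<String, Vec<String>>,
) -> HashSet<String> {
    let mut invalid: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::from([start.to_string()]);
    while let Some(id) = queue.pop_front() {
        if !invalid.insert(id.clone()) {
            continue; // already visited
        }
        if let Some(children) = dependents.get(&id) {
            queue.extend(children.iter().cloned());
        }
    }
    invalid.remove(start); // the tombstoned or rejected record itself is handled separately
    invalid
}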

7. Authoring authority

An operation is authorised if and only if its author held a capability admitting both the operation's Action (the kind of write it performs) and its Resource (the data class it touches), with caveats satisfied, at the operation's timestamp.

Authorisation is verified by walking the chain of UCAN delegations from the op's author to the mesh root. The mechanism is specified in UCAN and Caveats and Capabilities. A receiving node MUST reject operations it cannot authorise.

8. Informative: a worked example

An informative section. Does not impose requirements.

The user's phone ingests a calendar event. The phone:

  1. Computes the BLAKE3 hash of the canonical calendar event bytes.
  2. Allocates an EvidenceId.
  3. Builds an IngestEvidence op with source_type = "calendar", the ContentHash, the upstream UID as the SourceAnchor, and any extracted metadata in the custom-metadata field.
  4. Ticks its HLC, writes the op's timestamp.
  5. Signs the op with the phone's NodeId key (detached JWS over canonical encoding, signature field cleared during signing).
  6. Appends the signed op to the local log.

A subsequent deterministic-extraction pass on this evidence produces candidate claim ops with Hint status, each carrying this evidence's EvidenceId in its supporting-operations field. A later inference call may produce an episode op whose supporting operations include both the evidence and the claims. A still-later user assertion may confirm one of those claims, transitioning it to Fact.

Every step of that pipeline is on the log. Every record above evidence has a path back to evidence. The user can walk that path in either direction.
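
As an informative sketch, the first three steps might look roughly like the following in Rust, assuming the blake3 crate. The struct mirrors the IngestEvidence payload defined in Operations; new_evidence_ulid is a placeholder for a real ULID generator, and the HLC tick, signing, and append steps are omitted because they are specified in Clocks, Signatures, and Wire Format.

// Informative sketch; types and helper names are illustrative, not normative.
struct IngestEvidence {
    evidence_id: [u8; 16],           // EvidenceId, ULID-shaped
    content_hash: [u8; 32],          // BLAKE3 of the canonical event bytes
    source_type: String,             // "calendar"
    source_anchor: String,           // the upstream calendar UID
    metadata_snapshot: Option<Vec<u8>>,
}

fn build_ingest_payload(canonical_bytes: &[u8], upstream_uid: &str) -> IngestEvidence {
    IngestEvidence {
        content_hash: *blake3::hash(canonical_bytes).as_bytes(), // step 1
        evidence_id: new_evidence_ulid(),                        // step 2
        source_type: "calendar".to_string(),                     // step 3
        source_anchor: upstream_uid.to_string(),
        metadata_snapshot: None,
    }
}

fn new_evidence_ulid() -> [u8; 16] {
    [0u8; 16] // placeholder: a 48-bit timestamp plus 80-bit randomness in practice
}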

Operations

This chapter enumerates the operation variants defined for v0.1. Every state change in a Likewise mesh is one of these variants. The wire encoding of an operation is specified in Wire Format; this chapter describes payloads and their semantics.

1. The operation envelope

Every operation, regardless of payload variant, MUST carry the fields described below. Implementations MAY use any in-memory representation; the wire-format chapter specifies the canonical serialisation that signatures are computed over.

  • id (required) — An OpId. ULID-shaped, time-sortable, globally unique within the mesh.
  • schema_version (required) — The op-payload schema version. Future revisions of this specification MAY introduce new payload-format versions; recipients MUST migrate on read.
  • timestamp (required) — A hybrid logical clock value. See Clocks.
  • node_id (required) — The originating node's NodeId.
  • causal_deps (required) — A possibly-empty set of OpId predecessors the author wishes to mark as explicit causal dependencies. May be empty when the author is willing to rely solely on the HLC ordering.
  • payload (required) — One of the typed variants enumerated below.
  • signature (conditional) — Detached JWS over the canonical encoding of the op with the signature field cleared. Required for all ops except those that have been intentionally sanitised by an authorised filter; see UCAN and Caveats.

A receiving node MUST reject any operation whose envelope is malformed, whose id collides with an op already in the log under the same node_id and timestamp, or whose signature is invalid in a context where one was required.

2. Payload categories

The v0.1 substrate vocabulary partitions operations into seven categories:

  • Evidence operations record raw inputs.
  • Entity operations create, alias, merge, and split entities.
  • Claim operations create and evolve claims.
  • Job operations schedule, claim, and complete units of work.
  • Artifact operations create and evict generic byproducts of derivation, including the inference-snapshot artefacts used by Part 2.
  • User-assertion operations carry the user's overrides on derived state.
  • Mesh operations govern delegation, revocation, coordination, and routing.

Two further op types — CreateEpisode/UpdateEpisode and CreateSuggestedAction/UpdateActionStatus — are application-layer conventions used by the reference implementation to surface the substrate to a user. They are documented in Annex: Application Conventions, not here. A node that does not surface the graph to a user — for example, an organisation's node consuming a scoped slice — has no need to implement them.

Subsequent sections describe each variant. Field types use informal names; their precise wire encodings are in Wire Format.

3. Evidence operations

3.1 IngestEvidence

Creates an immutable evidence record.

  • evidence_id — An EvidenceId.
  • content_hash — BLAKE3 hash of the canonical content bytes.
  • source_type — Short identifier for the upstream system (e.g. "calendar", "photo", "contact").
  • source_anchor — Stable upstream identifier (calendar UID, photo asset id, message id).
  • metadata_snapshot — Optional structured metadata extracted at ingest time (timestamp, location, participants).

Receiving nodes MUST treat the (evidence_id, content_hash) pair as fixed for the lifetime of the mesh. The bytes referenced by content_hash MAY be absent on a given node.

3.2 TombstoneEvidence

Removes an evidence record from active circulation. The original ingest op is preserved on the log; only the application of new operations against the tombstoned record changes.

  • evidence_id — The evidence being tombstoned.
  • reason — One of UserRequest, Privacy, DataExpiry, or another well-known string introduced in a future minor version.

A TombstoneEvidence op MUST trigger the derivation cascade defined in State Machines: every claim, episode, suggested action, and inference snapshot that transitively depends on the tombstoned evidence is invalidated atomically.

4. Entity operations

4.1 CreateEntity

Introduces a new entity into the mesh.

  • entity_id — An EntityId.
  • entity_type — Person, Place, Organisation, Device, Account, Document, Concept, Commitment, Event. Future minor versions MAY add types.
  • initial_label — Human-readable name.
  • source_claims — The claims that motivated the creation, if any.

4.2 AddEntityAlias

Adds an alternative label for an existing entity.

  • entity_id — The target entity.
  • alias — Alternative label.

4.3 MergeEntities

Resolves two or more entities to a single survivor. The survivor absorbs the consumed entities' claims, with redirection so that references to the consumed entities continue to resolve.

  • survivor — The EntityId that persists.
  • consumed — The EntityIds being absorbed.
  • rationale — Free-form prose explaining why they are the same.

When two MergeEntities ops conflict (each consumes an entity the other survives), the receiving node MUST resolve the conflict deterministically by OpId ordering. The full rule is given in Mesh Coordination.

A MergeEntities op authored by a user-assertion authority MUST take precedence over machine-derived merges, regardless of OpId order.

4.4 SplitEntity

Reverses a prior merge.

  • original — The entity to split.
  • new_entities — The set of entities the split produces.
  • rationale — Free-form prose explaining why they are different.

5. Claim operations

A claim is the protocol's unit of asserted belief about an entity. Claims have a status (Hint, Claim, Fact, Disputed, Rejected, Superseded, Stale) whose transitions are specified in State Machines.

5.1 CreateClaim

  • claim_id — A ClaimId.
  • claim_type — Attribute, Relationship, Membership, Temporal, Spatial, Behavioral, Derived.
  • subject — The EntityId the claim is about.
  • predicate — A predicate from the centralised vocabulary. The vocabulary is part of the specification; future minor versions MAY add predicates.
  • object — One of: an EntityId, text, a number, a boolean, a timestamp, or a structured object.
  • initial_status — Typically Hint or Claim.
  • confidence — A confidence vector with multiple components.
  • provenance — The supporting evidence, claims, and jobs.

5.2 UpdateClaimStatus

  • claim_id — Target claim.
  • new_status — New status from the lifecycle.
  • rationale — Optional free-form prose.

5.3 UpdateClaimConfidence

  • claim_id — Target claim.
  • new_confidence — Updated confidence vector.

5.4 SupersedeClaim

  • old_claim_id — The claim being replaced.
  • new_claim_id — The replacement, which MUST already exist on the log.
  • rationale — Free-form prose.

A claim with status Fact (i.e. user-confirmed) is frozen and MUST NOT be superseded by a non-user-assertion-authored op. User assertions MAY override frozen claims.

6. Job operations

The job vocabulary lets multiple nodes cooperate on the same unit of work without external coordination. The full state-machine semantics are in Mesh Coordination; the table below specifies the payload shape only.

6.1 ScheduleJob

Declares that a job exists and may be claimed.

  • job_id — A JobId.
  • kind — A typed work-kind string (e.g. cortex.synthesize.window). The protocol does not constrain the namespace, but implementations SHOULD use a reverse-DNS-style prefix for portability.
  • payload — Opaque bytes the eventual handler interprets.
  • policy_envelope — Policy and capability constraints attached at scheduling time.

6.2 ClaimWork

A node takes responsibility for executing a scheduled job.

  • job_id — The job being claimed.
  • claimer — The NodeId of the claiming node.
  • lease_duration_ms — How long the lease lasts.

The lease's effective expiry is computed against the HLC wall component, not against any node's local wall clock. See Mesh Coordination.
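
As an informative sketch, and assuming the deadline is simply the claiming op's wall_ms plus lease_duration_ms (Mesh Coordination gives the normative rule), the expiry check reduces to:

// Informative sketch; the normative expiry rule is in Mesh Coordination.
fn lease_expired(claim_wall_ms: u64, lease_duration_ms: u64, current_hlc_wall_ms: u64) -> bool {
    current_hlc_wall_ms >= claim_wall_ms.saturating_add(lease_duration_ms)
}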

6.3 CompleteJob

Records that a job finished and its outputs are on the log.

  • job_id — The job.
  • output_claims — Claims produced.
  • output_artifacts — Artefacts produced.
  • telemetry — Duration, token counts, model latency.

6.4 YieldWork

A claimer voluntarily releases a job before completion.

  • job_id — The job.
  • claimer — MUST match the current claimer's NodeId.
  • reason — Free-form prose.

6.5 ExpireWork

Any node MAY emit an ExpireWork op once a lease's HLC-relative deadline has passed. The op moves the job back to the unclaimed state.

  • job_id — The job.
  • expired_claimer — MUST match the current claimer's NodeId.
  • reason — Conventionally "deadline_passed".

7. User-assertion operations

The user is the final authority on facts about themselves. A user assertion takes precedence over machine-derived state and MUST be respected by the receiving node's projection logic.

7.1 UserAssert

  • assertion_type — Confirm, Reject, Edit, Pin, Hide, LaneRule.
  • target — A Claim, Entity, or semantic-lane reference.
  • semantic_lane — Optional lane qualifier.

Effects by assertion type:

  • Confirm — promotes the target claim to Fact. The claim becomes frozen against subsequent automated invalidation.
  • Reject — sets the target claim to Rejected and triggers the derivation cascade.
  • Edit — creates a versioned replacement claim that the receiving node MUST treat as superseding the original.
  • Pin — freezes the target without altering its current status.
  • Hide — display-layer directive; the claim persists on the log but is excluded from user-facing surfaces.
  • LaneRule — blocks or requires confirmation for derivations in a named semantic lane. The set of lane-rule effects is specified in State Machines.

User-assertion ops are authored by a node that holds a write capability on the relevant resource with no caveats restricting UserAssertion. Implementations MAY in addition require that the authoring node corresponds to a "user-bearing" role established by mesh policy; the v0.1 specification does not mandate this.

8. Artifact operations

Artefacts are generic machine-produced byproducts of derivation: embeddings, transcripts, OCR text, and the inference snapshots that record model calls. The artefact mechanism is substrate; specific artefact types layered on top of it (notably likewise.inference.snapshot, used by Part 2) inherit lifecycle and storage from this section.

8.1 CreateArtifact

  • artifact_id — An ArtifactId.
  • artifact_type — Short identifier ("image_embedding", "ocr_text", "transcript", "likewise.inference.snapshot", ...).
  • source_job — Optional link to the producing job.
  • inputs_used — Evidence inputs.
  • content_hash — BLAKE3 of the artefact content.
  • content_inline — Optional inline bytes for small artefacts.
  • model_id, model_version — Optional. Required for inference-snapshot artefacts.
  • size_bytes — Content size.
  • ttl_ms — Optional time-to-live, after which the artefact is eligible for eviction.

The likewise.inference.snapshot artifact type is specified in detail in Inference Audit.

8.2 EvictArtifact

Drops the content of an artefact (the metadata is retained on the log).

  • artifact_id — Target artefact.

8a. Application-layer ops (informative pointer)

The reference implementation also emits CreateEpisode, UpdateEpisode, CreateSuggestedAction, and UpdateActionStatus as part of its user-facing surface. These are documented in Annex: Application Conventions. They are not part of the substrate vocabulary; a substrate-only implementation that receives them on the wire MAY accept and store them on the log without maintaining any projection state for them.

9. Mesh operations

9.1 DesignateCoordinator

Owner-only. Names the node responsible for the deterministic derivation pass. There is no automatic election; coordinator selection is an explicit user act.

  • coordinator — The NodeId that should run the deterministic pipeline.

A DesignateCoordinator op authored by any node other than the mesh owner MUST be rejected.

9.2 DelegateUcan

Carries a UCAN delegation in the op log.

  • ucan_cid — The ContentHash of the token bytes. Acts as the delegation's identity.
  • ucan_bytes — The detached-JWS UCAN token.

A DelegateUcan op authored by a node that has not yet been seen on the log MAY be accepted unsigned, on the condition that the embedded UCAN binds the authoring NodeId to the issuer's DID. This is the bootstrap path by which a new node's key first becomes known to the mesh; see UCAN and Caveats.

9.3 RevokeUcan

  • ucan_cid — The content hash of the delegation being revoked.

A RevokeUcan op MUST be authored by the issuer of the delegation it revokes (or by a node with write authority over that DID's delegations under a still-valid parent). Receiving nodes MUST prune the subgraph of delegations beneath the revoked one and MUST re-evaluate the authorisation of any ops whose authority depended on it.

9.4 RouteKind

Owner-only. Routes a class of jobs to a specific node.

  • kind — The work-kind string.
  • route — An Option<NodeId>. Setting to None clears the directive.

While a route is set, only the named node MAY successfully emit a ClaimWork op for that kind. Other nodes' claim ops MUST be rejected. Routes follow last-write-wins semantics by op timestamp.

A RouteKind op authored by any node other than the mesh owner MUST be rejected.

10. Operation indexing

Implementations MUST be able to retrieve operations from the log by OpId, by (node_id, timestamp), and by author-frontier (see Sync). They MAY provide additional indices for efficient projection rebuilds.

11. Reserved variants

Future minor versions of this specification MAY introduce new op variants. An implementation that encounters an unknown variant on the wire MUST reject the op, log the rejection, and continue processing subsequent ops. It MUST NOT corrupt its log by dropping unknown variants silently or by guessing at their semantics.

The reserved-prefix convention for namespacing third-party extensions is described in Open Issues; a stable extension mechanism is anticipated but not normative in v0.1.

Wire Format

This chapter specifies the byte-level encoding of operations and related structures as they cross between nodes. It defines:

  • the canonical encoding used for signature computation,
  • the encoding of operation identifiers, hashes, and clocks,
  • the framing for collections of operations on the wire,
  • the encoding of cursors and frontiers used by the sync endpoint.

The transport-layer protocol that carries these encoded bytes is specified in Sync. The signature algorithm and detached-JWS envelope are specified in Signatures.

1. Encoding format

The canonical encoding is postcard, a compact deterministic binary serialisation defined at https://postcard.jamesmunns.com/. Implementations MUST use postcard's deterministic ordering and varint conventions.

Postcard was chosen for v0.1 because it is compact, deterministic, and has independent implementations in multiple languages. The choice is not load-bearing in the long run; an implementation MAY expose alternative encodings (JSON, CBOR, MessagePack) for debugging or for application-layer interop, but operations sent or accepted on the wire MUST use the postcard encoding. The signature is computed over the postcard bytes.

1.1 Determinism requirements

Two implementations encoding the same operation values MUST produce byte-identical postcard output. Implementations MUST:

  • Encode struct fields in the order this specification declares them (subsequent chapters declare order alongside payloads).
  • Encode option-typed fields as 0x00 for None and 0x01 followed by the value bytes for Some.
  • Encode collections as varint(len) followed by the elements in their authored order.
  • Encode booleans as a single byte: 0x00 false, 0x01 true.
  • Encode integers as varints unless this specification specifies fixed-width.

1.2 Versioning of the encoding

The wire encoding does not carry an explicit version tag at the op level. Schema evolution within a minor version is constrained to backwards-compatible additions only — see Conventions. The absence of an explicit version tag is one of the known cross-implementation hazards and is expected to be addressed in a subsequent major version.

2. Identifier encodings

2.1 NodeId

A NodeId is encoded as an unsigned 64-bit varint. The mapping from NodeId value to the corresponding Ed25519 public key is established by DelegateUcan ops on the log; see Signatures.

2.2 ULID-shaped record identifiers

OpId, EvidenceId, EntityId, ClaimId, JobId, ArtifactId, EpisodeId, and ActionId are encoded as 16 raw bytes (the canonical ULID byte form, big-endian: 48-bit timestamp + 80-bit randomness).

2.3 ContentHash

A ContentHash is encoded as 32 raw bytes (the BLAKE3 output). Hex encoding MAY be used in human-readable contexts (debugging, log lines, headers) but MUST NOT be used on the canonical wire.

2.4 DID

A DID is encoded as a length-prefixed UTF-8 string, with the full URI form (did:method:identifier).

3. Hybrid logical clock encoding

A Timestamp is encoded as a struct in the following order:

  1. wall_ms: 64-bit unsigned integer (varint).
  2. logical: 32-bit unsigned integer (varint).
  3. node: a NodeId (varint as above).

The clock value's semantics and tick rules are specified in Clocks.

4. Operation envelope encoding

Every operation is encoded as a struct in the order this section declares.

  • id — OpId; 16 bytes.
  • schema_version — varint; currently 1.
  • timestamp — Timestamp; as above.
  • node_id — NodeId; varint.
  • causal_deps — Vec<OpId>; varint(len) + 16 × len bytes.
  • payload — tagged union; varint discriminant + variant body; see Operations.
  • signature — Option<Vec<u8>>; option byte + length-prefixed bytes when Some.

The variant discriminants for the payload union are assigned by this specification and MUST be stable across implementations of the same major version.
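
An informative sketch of an envelope type whose derived serde/postcard encoding follows this layout is shown below. The field and variant declaration order matches the order above, and postcard::to_allocvec produces the canonical bytes; the type names and the elided payload variants are illustrative, not normative definitions.

use serde::{Deserialize, Serialize};

// Informative sketch; names are illustrative.
#[derive(Serialize, Deserialize)]
struct Timestamp {
    wall_ms: u64, // varint
    logical: u32, // varint
    node: u64,    // NodeId, varint
}

#[derive(Serialize, Deserialize)]
enum Payload {
    IngestEvidence { /* fields per Operations */ },
    TombstoneEvidence { /* ... */ },
    // remaining variants elided; discriminants follow declaration order
}

#[derive(Serialize, Deserialize)]
struct Operation {
    id: [u8; 16],                // OpId, 16 raw bytes
    schema_version: u32,         // varint, currently 1
    timestamp: Timestamp,
    node_id: u64,                // NodeId, varint
    causal_deps: Vec<[u8; 16]>,  // varint(len) + 16 bytes per OpId
    payload: Payload,            // varint discriminant + variant body
    signature: Option<Vec<u8>>,  // option byte + length-prefixed bytes when Some
}

fn canonical_bytes(op: &Operation) -> Vec<u8> {
    postcard::to_allocvec(op).expect("encoding an in-memory op should not fail")
}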

5. Canonical signing form

The signature is computed over the operation's canonical encoding with the signature field cleared to None.

Procedure:

  1. Set signature = None on the operation.
  2. Encode the operation per Section 4.
  3. Compute the Ed25519 signature over the resulting bytes using the authoring node's private key.
  4. Set signature = Some(<signature bytes wrapped in a detached JWS envelope>) per Signatures.

The reverse-verification procedure for receivers is specified in Signatures.

This rule — that the signature is cleared during signing — is the single most error-prone aspect of v0.1 implementation. Implementers SHOULD test it explicitly in cross-language interoperability fixtures.
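
An informative sketch of the procedure, reusing the Operation sketch from Section 4 and the ed25519-dalek crate; the detached-JWS wrapping is reduced to a placeholder because its format is specified in Signatures.

use ed25519_dalek::{Signer, SigningKey};

// Informative sketch; wrap_detached_jws is a placeholder for the envelope in Signatures.
fn sign_op(op: &mut Operation, key: &SigningKey) {
    op.signature = None;                                      // 1. clear the field
    let bytes = postcard::to_allocvec(op).unwrap();           // 2. canonical encoding
    let sig = key.sign(&bytes);                               // 3. Ed25519 over those bytes
    op.signature = Some(wrap_detached_jws(sig.to_bytes()));   // 4. re-attach, wrapped
}

fn wrap_detached_jws(raw_signature: [u8; 64]) -> Vec<u8> {
    raw_signature.to_vec() // placeholder for the detached-JWS envelope
}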

6. Sanitised operations

When an operation is sanitised (a caveat strips fields before crossing a delegation; see UCAN and Caveats), the sender MUST clear the signature field. The recipient MUST NOT attempt to verify a signature on a sanitised op.

Sanitisation happens at the sender. The recipient distinguishes sanitised ops from corrupted ops by the presence of a caveat-derived sanitisation marker on the op envelope (described in UCAN and Caveats). An op that arrives without a signature and without the sanitisation marker MUST be rejected.

7. Operation collections on the wire

The sync endpoint exchanges sequences of operations. The on-the-wire encoding of Vec<Operation> is the postcard encoding of the sequence: varint(len) followed by each operation in order.

The order in the sequence is significant only as a hint: recipients MUST apply received ops by their HLC total order, not by sequence position.

8. Causal frontier encoding

A CausalFrontier is a per-author summary of the maximum operation seen from each node. It is encoded as a map with the following structure:

varint(num_authors)
for each author:
  NodeId         (varint)
  Timestamp      (struct)

The order of map entries on the wire is by ascending NodeId.

For use as a sync cursor in HTTP query parameters, the frontier is base64url-encoded (RFC 4648, no padding). The cursor is opaque to clients beyond this format; clients MUST NOT attempt to construct cursor values other than by echoing back values received from a server, except for the empty frontier (encoded as varint(0), base64url AA), which means "from the beginning."
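
As an informative sketch, a sender holding its frontier as a map keyed by NodeId (a BTreeMap iterates in ascending key order, matching the wire rule) can produce the cursor as follows; the Timestamp type is the one sketched in Section 4, and the base64 crate's URL_SAFE_NO_PAD engine provides the unpadded encoding.

use base64::{engine::general_purpose::URL_SAFE_NO_PAD, Engine};
use std::collections::BTreeMap;

// Informative sketch; the empty map encodes to varint(0), i.e. base64url "AA".
fn encode_cursor(frontier: &BTreeMap<u64, Timestamp>) -> String {
    let bytes = postcard::to_allocvec(frontier).expect("frontier encoding should not fail");
    URL_SAFE_NO_PAD.encode(bytes)
}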

9. UCAN token wire format

A UCAN delegation referenced by DelegateUcan is carried as opaque bytes (Vec<u8>) — specifically, the detached-JWS form of a UCAN v0.10 token over a JSON payload. The UCAN content hash (ucan_cid) is the BLAKE3 of these bytes.

The UCAN token format is specified externally; see UCAN and Caveats for the v0.10 details and the v1.0 migration plan.

10. Mesh-rules hash

The mesh-rules document is a small structured value carrying the non-negotiable parameters of a mesh (protocol version, agreed caveat vocabulary, agreed sanitisation rules). It is encoded canonically per Section 1, and its hash is the BLAKE3 of those bytes.

The mesh-rules hash is exchanged on every sync exchange via the X-Likewise-Mesh-Rules-Hash HTTP header (see Sync) to detect rule drift between peers.

11. Header conventions

When operations are exchanged over HTTP, the following headers have normative meaning:

  • Content-Type: application/octet-stream for postcard bodies.
  • X-Likewise-Next-Frontier: <base64url> — set by a server on successful pull responses; tells the client what frontier to send next.
  • X-Likewise-Mesh-Rules-Hash: <hex> — set by both sides on every request and response; mismatch triggers the handshake-pause behaviour specified in Sync.

Implementations MAY define additional headers for diagnostics, provided they do not begin with X-Likewise- (which is reserved for protocol-defined headers).

Sync

This chapter specifies how nodes exchange operations. The protocol defines exactly one endpoint, two HTTP methods, and one cursor. The simplicity is intentional: synchronisation is the most load-bearing operation in a decentralised system, and richer protocols are harder to implement compatibly.

1. Transport

Nodes communicate over HTTP/1.1 or later with TLS recommended on any non-loopback transport. The v0.1 specification does not require any particular HTTP feature beyond:

  • Request and response bodies up to a server-advertised limit (default: 8 MiB; see Section 7).
  • Custom request and response headers.
  • Standard status codes.

WebSockets, gRPC, QUIC, or peer-to-peer transports MAY be used by implementations as alternatives, but two implementations claiming v0.1 conformance MUST both support the HTTP profile defined in this chapter.

2. The single endpoint: /ops

A v0.1 node MUST expose GET /ops and POST /ops. A node MAY expose additional administrative endpoints; they are not part of this specification.

The path /ops is mounted at the root of the node's HTTP origin. A node MAY operate behind a reverse proxy that adds path prefixes, in which case the proxy is responsible for mapping back to /ops for compliant peers.

3. Pulling operations: GET /ops

Pulls operations the requester does not already have.

3.1 Request

GET /ops?since=<base64url-frontier>&limit=<n>
X-Likewise-Mesh-Rules-Hash: <hex>
Authorization: Bearer <node-bearer-token>

Query parameters:

  • since (required) — a base64url-encoded CausalFrontier representing the requester's high-water mark per author. The empty frontier (base64url AA) means "from the beginning of the log." See Wire Format.
  • limit (optional) — an upper bound on the number of operations the server returns. The server MAY return fewer than limit even if more are available; clients MUST be prepared to issue follow-up requests using the returned next-frontier cursor. Servers MAY enforce an upper bound on limit and clamp values exceeding it.

The Authorization header carries a node-bearer token that authenticates the requesting node. Token issuance and refresh are specified in Signatures.

3.2 Response

200 OK
Content-Type: application/octet-stream
X-Likewise-Next-Frontier: <base64url>
X-Likewise-Mesh-Rules-Hash: <hex>

<postcard-encoded Vec<Operation>>

Body: the postcard encoding of the sequence of operations the server is willing to send, filtered by the requester's capability set (Section 5).

Headers:

  • X-Likewise-Next-Frontier — the cursor the requester should send on its next pull. This frontier MUST encompass every operation in the response and MAY encompass operations the server chose to filter.
  • X-Likewise-Mesh-Rules-Hash — the server's current mesh-rules hash. The requester MUST compare to its own; on mismatch the pause-on-drift behaviour in Section 6 applies.

3.3 Idempotence and safety

GET /ops is safe and idempotent. Repeated calls with the same since cursor MUST return the same operations modulo log growth on the server in the interim.
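
An informative sketch of one pull, using the blocking reqwest client. The header names and query parameters are the ones this chapter defines; the base URL handling, bearer-token plumbing, and apply_ops stub are illustrative.

// Informative sketch; not a normative client.
fn pull_once(
    client: &reqwest::blocking::Client,
    base: &str,
    cursor: &str,
    bearer: &str,
    mesh_rules_hash: &str,
) -> Result<String, Box<dyn std::error::Error>> {
    let resp = client
        .get(format!("{base}/ops"))
        .query(&[("since", cursor), ("limit", "500")])
        .header("X-Likewise-Mesh-Rules-Hash", mesh_rules_hash)
        .bearer_auth(bearer)
        .send()?
        .error_for_status()?;

    // Compare mesh-rules hashes; on mismatch, pause sync with this peer (Section 6).
    let their_rules = resp
        .headers()
        .get("X-Likewise-Mesh-Rules-Hash")
        .and_then(|v| v.to_str().ok());
    if their_rules != Some(mesh_rules_hash) {
        return Err("mesh-rules drift: pausing sync with this peer".into());
    }

    // The cursor to send as `since` on the next pull.
    let next_cursor = resp
        .headers()
        .get("X-Likewise-Next-Frontier")
        .and_then(|v| v.to_str().ok())
        .unwrap_or(cursor)
        .to_string();

    let body = resp.bytes()?; // postcard-encoded Vec<Operation>
    apply_ops(&body)?;        // decode, sort by HLC order, dedupe, apply (Section 8)
    Ok(next_cursor)
}

fn apply_ops(_postcard_bytes: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    Ok(()) // placeholder: see "Order of application on the receiver"
}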

4. Pushing operations: POST /ops

Submits operations the sender wants the recipient to apply.

4.1 Request

POST /ops
Content-Type: application/octet-stream
X-Likewise-Mesh-Rules-Hash: <hex>
Authorization: Bearer <node-bearer-token>

<postcard-encoded Vec<Operation>>

Body: a postcard-encoded sequence of operations.

4.2 Response

200 OK
Content-Type: application/json
X-Likewise-Mesh-Rules-Hash: <hex>

{ "appended": N, "duplicated": M, "rejected": K }

Where:

  • appended is the number of operations newly added to the recipient's log.
  • duplicated is the number that the recipient already had on its log (deduplicated by OpId).
  • rejected is the number that failed authorisation, signature verification, or schema validation.

The recipient MUST verify each incoming operation per Signatures and authorise it per UCAN and Caveats. Operations that fail either check MUST be excluded from appended and counted toward rejected. Implementations SHOULD log rejections with enough detail for an operator to diagnose, but the wire response SHOULD NOT leak per-op rejection reasons across capability boundaries.

4.3 Idempotence

Application of POST /ops MUST be idempotent: re-submitting the same operations MUST result in the same recipient state, with duplicates counted toward duplicated rather than appended a second time.

5. Source-side filtering

A server MUST filter outbound operations by the requester's capability set before responding. The filter:

  1. Authorises each candidate operation against the requester's delegation chain. Operations the requester is not authorised to read are excluded.
  2. Applies any sanitize caveats that govern the requester's delegation. Sanitised operations have signatures cleared per Wire Format.

The full filter pipeline is specified in UCAN and Caveats. The contract here is that the wire never carries operations the requester is not authorised to see.

6. The mesh-rules-hash handshake

Both sides include X-Likewise-Mesh-Rules-Hash on every request and response. On mismatch:

  • The receiving side MUST treat the request as a "drift" condition. It MAY return a 409 Conflict response and abort the exchange, or it MAY continue the exchange while logging the drift; this is a deployment policy choice.
  • The sending side, on receiving a 409 Conflict for a mesh-rules-hash mismatch, MUST pause its sync loop with that peer and surface the condition to the operator. It MUST NOT re-attempt the same exchange before resolving the drift.

The rationale is that two nodes operating under different mesh-rules documents may both believe a given op is authorised but disagree about what its caveats mean. Continuing to sync in that condition silently corrupts the shared interpretation of the log.

The v0.1 protocol does not include an automatic mesh-rules negotiation. Resolving drift requires operator action — typically adopting a newer common rules document. A future revision is expected to add a negotiation pre-handshake; this is an open issue.

7. Limits

A v0.1 server MUST support requests and responses up to 8 MiB total body size. It MAY support larger sizes; clients MUST be prepared to receive 413 (Payload Too Large) responses on push and MUST batch their submissions accordingly.

A v0.1 server SHOULD enforce a per-peer rate limit. The specification does not mandate a particular rate; servers MAY return 429 Too Many Requests and clients MUST honour Retry-After.

8. Order of application on the receiver

A receiver applying operations from a POST /ops body MUST:

  1. Decode the postcard payload to a sequence of operations.
  2. Sort by HLC total order: (timestamp.wall_ms, timestamp.logical, timestamp.node) ascending.
  3. Apply each operation in order, deduplicating by OpId.
  4. Update its causal frontier accordingly.
  5. Tick its own HLC past the maximum received timestamp.

The fifth step is part of the HLC discipline specified in Clocks.
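
A sketch of these five steps in Rust, with a reduced Operation carrying only the envelope fields the steps touch; apply_to_projections and recv are stand-ins for the implementation's own apply path and the recv discipline specified in Clocks:

use std::collections::{BTreeMap, HashSet};

#[derive(Clone, Copy, PartialEq, Eq, serde::Deserialize)]
struct Hlc { wall_ms: u64, logical: u32, node: u64 }   // node is the 64-bit NodeId of v0.1

fn key(h: &Hlc) -> (u64, u32, u64) { (h.wall_ms, h.logical, h.node) }   // HLC total order

type OpId = u128;   // stand-in; Wire Format defines the real OpId encoding

#[derive(serde::Deserialize)]
struct Operation { op_id: OpId, timestamp: Hlc /* , payload, signature, ... */ }

fn receive_batch(
    body: &[u8],
    seen: &mut HashSet<OpId>,              // OpIds already on the log
    frontier: &mut BTreeMap<u64, Hlc>,     // per-author causal frontier
    local_clock: &mut Hlc,
    wall_now: u64,
) -> Result<(), postcard::Error> {
    let mut ops: Vec<Operation> = postcard::from_bytes(body)?;       // 1. decode
    ops.sort_by_key(|op| key(&op.timestamp));                        // 2. HLC total order
    for op in ops {
        if seen.insert(op.op_id) {                                   // 3. dedupe by OpId, then apply
            apply_to_projections(&op);
        }
        frontier
            .entry(op.timestamp.node)                                // 4. advance the frontier
            .and_modify(|h| if key(&op.timestamp) > key(h) { *h = op.timestamp })
            .or_insert(op.timestamp);
        *local_clock = recv(*local_clock, op.timestamp, wall_now);   // 5. recv discipline (see Clocks)
    }
    Ok(())
}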

9. Liveness

A successful GET /ops exchange doubles as a liveness signal: the requester learns that the responder is reachable and still honours the requester's authority. There is no separate heartbeat in v0.1.

10. Polling cadence

The protocol does not specify how often a node should pull. A plausible v0.1 default is 30 seconds for a node on a stable local network and 5 minutes for a mobile node on metered connectivity. Implementations MAY back off on transport errors and SHOULD jitter their cadence to avoid thundering herds in a large mesh.

A future revision is expected to add server-initiated push hints (WebSocket or webhook) for lower-latency convergence. This is an open issue; v0.1 conformant nodes use polling.

11. Informative: why one endpoint

Informative section. Does not impose requirements.

A reader familiar with replicated-log systems will recognise the shape: a frontier-based pull plus an idempotent push is a standard pattern. v0.1 deliberately resists adding more — batched Merkle trees, differential range queries, sparse-index exchanges — because every additional sync mode is a place where two implementations can disagree without either being wrong.

The cost is that catching up a long-disconnected node from genesis is a sequence of paginated pulls rather than a bulk transfer. For the meshes this protocol targets — small, mostly warm, mostly online — that cost is negligible. Future revisions MAY add bulk-transfer modes for first synchronisation and very-large-mesh scenarios; v0.1 does not.

Clocks

This chapter specifies the Hybrid Logical Clock (HLC) used to timestamp operations. The HLC is the mechanism by which two operations can be totally ordered across a mesh whose nodes disagree about wall-clock time.

The chapter makes explicit the discipline that v0.1 implementations had handled implicitly: how a node updates its HLC. The clock value alone is not enough; a clock without a discipline will, eventually, produce two operations from the same author with the same (wall_ms, logical, node) triple, and a mesh that permits that has no way to converge.

1. The HLC value

An HLC value is a triple:

  • wall_ms (unsigned 64-bit): milliseconds since the Unix epoch (1970-01-01T00:00:00Z), as best the node can estimate.
  • logical (unsigned 32-bit): a counter that advances within a single wall_ms.
  • node (NodeId): the authoring node.

The wire encoding is specified in Wire Format.

2. Total order

For any two HLC values a and b:

a < b if and only if (a.wall_ms, a.logical, a.node) < (b.wall_ms, b.logical, b.node) in lexicographic order.

This induces a strict total order on operations within a mesh. Where the rest of this specification refers to the order of operations, it means this order.

A receiver MUST apply received operations in this order regardless of the sequence position they arrived in (see Sync).
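
In Rust this order falls out of a derived Ord when the fields are declared in comparison order; a minimal sketch (the struct name is illustrative):

// Derived Ord/PartialOrd compare fields top to bottom, which is exactly the
// (wall_ms, logical, node) lexicographic order required here.
#[derive(PartialEq, Eq, PartialOrd, Ord, Clone, Copy, Debug)]
struct Hlc {
    wall_ms: u64,   // milliseconds since the Unix epoch
    logical: u32,   // counter within a single wall_ms
    node: u64,      // the authoring NodeId (a 64-bit integer in v0.1)
}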

3. Per-node state

Each node maintains a single HLC value, called its local clock. The local clock has the same fields as an HLC value above; its node field is the node's own NodeId.

The local clock advances under two disciplines: the tick discipline on emit, and the recv discipline on receive.

4. The tick discipline

Before authoring a new operation, a node MUST advance its local clock by the following procedure. Let prior be the local clock value before tick, and wall_now be the node's current wall-clock reading (in milliseconds since Unix epoch).

tick(prior, wall_now) -> next:
    if wall_now > prior.wall_ms:
        next.wall_ms = wall_now
        next.logical = 0
    else:
        next.wall_ms = prior.wall_ms
        next.logical = prior.logical + 1
    next.node = prior.node
    return next

The newly authored operation MUST carry next as its timestamp. After authoring, the node's local clock MUST equal next.

Two requirements follow from this procedure:

  1. Strict monotonicity. next > prior for any prior. A node MUST NOT author two operations with the same timestamp.
  2. Wall-clock dominance. next.wall_ms >= wall_now in every case; when the wall clock has advanced past prior.wall_ms, the HLC adopts the new wall reading. The HLC tracks reality forward when it can.

If prior.logical is at the maximum representable value, the node MUST treat the case as a clock overflow and refuse to author further operations until the wall component advances. In practice this is unreachable at the millisecond resolution and 32-bit counter v0.1 specifies, but conformant implementations MUST still handle it.

5. The recv discipline

When a node receives a remote operation with timestamp remote, it MUST update its local clock by the following procedure. Let prior be the local clock and wall_now be the current wall reading.

recv(prior, remote, wall_now) -> next:
    let max_wall = max(prior.wall_ms, remote.wall_ms, wall_now)

    if max_wall == prior.wall_ms and max_wall == remote.wall_ms:
        next.logical = max(prior.logical, remote.logical) + 1
    elif max_wall == prior.wall_ms:
        next.logical = prior.logical + 1
    elif max_wall == remote.wall_ms:
        next.logical = remote.logical + 1
    else:
        next.logical = 0

    next.wall_ms = max_wall
    next.node = prior.node
    return next

After applying recv, the node's local clock MUST equal next. The next op the node authors will then dominate remote, preserving the invariant that any op authored by this node after seeing remote is later in the total order than remote.

The recv discipline MUST be applied for every remote operation, including operations that the receiver chooses to discard for authorisation reasons. (Failing to advance the clock for filtered-out ops produces an observable hole in causal ordering that breaks the frontier invariant.)

6. Wall-clock skew

The HLC is robust to bounded wall-clock skew between nodes: causally related operations order correctly regardless of skew, and for unrelated operations the timestamp order tracks the real order of authoring to within the skew bound. A node whose clock is far ahead of its peers will "pull" the mesh's timestamps forward (other nodes will adopt the larger wall_ms on receive). A node whose clock is far behind will not pull them backward; its own ops simply advance the logical counter until its wall clock catches up.

The protocol does not specify a skew bound. Implementations SHOULD:

  • Synchronise their wall clocks against an external time source when one is available (NTP, a peer's clock).
  • Treat as suspicious any received op whose wall_ms is more than one hour ahead of wall_now.
  • Continue to apply such ops in the total order, while logging the anomaly for operator inspection.

Skew tolerance is a known open issue: the v0.1 specification does not give an implementation tools to reject a peer producing wildly future-dated timestamps. A future revision is expected to add an out-of-band skew limit negotiated as part of mesh-rules.

7. Lease expiry uses HLC, not wall clock

Lease-based work claims (ClaimWork) carry a lease_duration_ms that is interpreted against the HLC wall component, not against the local wall clock of any single node:

expired_at(claim_op) -> hlc_threshold
    let claimed_wall = claim_op.timestamp.wall_ms
    return claimed_wall + claim_op.payload.lease_duration_ms

is_expired(claim_op, current_hlc) -> bool
    return current_hlc.wall_ms > expired_at(claim_op)

This makes lease expiry robust to clock skew across the mesh in the same way the rest of the protocol is. See Mesh Coordination.

8. Informative: why HLC instead of vector clocks

Informative section. Does not impose requirements.

A vector clock would carry a logical counter per author and let two ops be partially ordered. The HLC is strictly less expressive: it produces a total order, breaking concurrency ties arbitrarily by node. This is acceptable for Likewise because:

  • The protocol's merge semantics are last-write-wins by OpId for the cases where two ops conflict; partial order would not give an implementation more information than total order already provides.
  • The total order plus a per-author causal frontier gives sync a clean cursor: "everything past this frontier" is unambiguous.
  • A vector clock requires a per-author entry that grows with mesh size; the HLC is fixed-size.

The cost is that the protocol is not a CRDT in the strict sense — two nodes with the same set of operations agree on order regardless of how they observed them, but they don't have richer concurrency information to inspect. For Likewise's domain — a single user's mesh — that cost is the right trade.

Signatures

This chapter specifies how operations are signed and verified, and how nodes authenticate to one another over HTTP. It is the specification of the JWS envelope used by the signature field on every operation, the canonical signing form referenced from Wire Format, and the bearer-token issuance used by Sync.

1. Algorithm

All signatures defined by v0.1 of this specification are Ed25519 (RFC 8032). Signature size is fixed at 64 bytes; verification keys are 32 bytes.

A future revision may add additional algorithms. v0.1 conformant implementations MUST support Ed25519 and MAY accept any other algorithm if and only if a future minor version explicitly introduces it.

2. Per-node keys

Each node holds exactly one Ed25519 signing key for the duration of its lifetime. Key rotation in v0.1 is performed by issuing a new node identity (a new NodeId and key pair) and delegating authority to it via DelegateUcan; the previous identity may then be revoked.

The mapping NodeId -> Ed25519 public key is established by the node's first DelegateUcan op observed on the log. This op MUST embed a UCAN whose iss field is the issuer's DID and whose sub (or equivalent) names the NodeId and carries the public key. After this binding op is observed, every subsequent op authored by that NodeId MUST be verified against the bound key.

3. Detached JWS envelope for op signatures

The wire-level value of an operation's signature field is a detached JWS in the sense of RFC 7515 (Appendix F): the header and signature segments are present, the payload segment is empty.

The detached JWS is encoded as the UTF-8 bytes of the string:

BASE64URL(header) "." "." BASE64URL(signature)

Where:

  • header is the JSON Object Serialisation of:
    {"alg":"EdDSA","kid":"node-<node_id>"}
    
    with field order as written, no insignificant whitespace, ASCII encoding. The kid value is the literal prefix "node-" followed by the node's NodeId rendered as decimal digits (since NodeId is a 64-bit integer in v0.1; see Wire Format).
  • signature is the 64-byte Ed25519 signature over the canonical signing form defined in Section 4.

The Vec<u8> carried in the operation's signature field is the UTF-8 byte sequence of the above string. Implementations MUST NOT include line breaks or trailing whitespace.
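
A construction sketch, assuming the base64 crate's URL-safe unpadded engine and a raw 64-byte signature already computed per Section 4 (the function name is illustrative):

use base64::Engine as _;
use base64::engine::general_purpose::URL_SAFE_NO_PAD;

// Wrap a raw Ed25519 signature into the detached-JWS string whose UTF-8 bytes
// become the operation's signature field.
fn detached_jws(node_id: u64, raw_sig: &[u8; 64]) -> String {
    // Field order and absence of whitespace in the header are significant.
    let header = format!(r#"{{"alg":"EdDSA","kid":"node-{node_id}"}}"#);
    format!(
        "{}..{}",                                   // empty payload segment between the dots
        URL_SAFE_NO_PAD.encode(header.as_bytes()),
        URL_SAFE_NO_PAD.encode(raw_sig),
    )
}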

4. Canonical signing form

To sign or verify an operation:

  1. Construct the operation per the operation envelope.
  2. Set signature = None.
  3. Encode the operation per Wire Format. Call the result op_bytes.
  4. Sign or verify op_bytes using the Ed25519 key bound to the operation's node_id.

When signing, the resulting 64-byte signature is wrapped per Section 3 and stored as Some(...) in the signature field before transmission.

When verifying, the receiver unwraps the detached JWS, recovers the 64-byte raw signature, and verifies it against op_bytes constructed from the received op (with its signature cleared to None) using the public key bound to the op's node_id.

A receiver MUST reject any operation whose verification fails, unless the op is a sanitised op admitted by Section 6.

5. Implementation note: round-tripping the signature field

The most common implementation error in this area is mishandling the round-trip: implementations sign an op with the field set to None, transmit it with the field set to Some(jws), and then verify over the bytes of the received op as-is (signature still present), which yields different bytes from the ones that were signed. Implementations MUST explicitly clear the signature field before computing the canonical encoding for verification, and MUST treat that step as the canonical procedure regardless of how the op is held in memory.
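
A verification sketch under the assumptions this chapter already makes (Ed25519 via ed25519-dalek, postcard for the canonical encoding); the Operation shape is reduced to the one field the procedure touches, and unwrap_detached_jws is a hypothetical helper that base64url-decodes the signature segment of the detached JWS:

use ed25519_dalek::{Signature, VerifyingKey};

#[derive(serde::Serialize)]
struct Operation {
    // ...envelope and payload fields per Wire Format...
    signature: Option<Vec<u8>>,
}

fn verify_op(mut op: Operation, key: &VerifyingKey) -> bool {
    // take() both reads the detached JWS and clears the field to None,
    // so the re-encoded op below is the canonical signing form.
    let Some(jws_bytes) = op.signature.take() else { return false };
    let Some(raw_sig) = unwrap_detached_jws(&jws_bytes) else { return false };
    let Ok(op_bytes) = postcard::to_allocvec(&op) else { return false };
    let Ok(sig) = Signature::from_slice(&raw_sig) else { return false };
    key.verify_strict(&op_bytes, &sig).is_ok()
}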

A reference test vector for cross-implementation interop is expected to ship with v0.1.1, alongside the public release of the reference implementation. Until both exist, implementers cannot fully validate signature canonicalisation against an authoritative source; the procedure in this section and the field-ordering rules in Wire Format are what to follow in the meantime. See Implementations for status.

6. Sanitised operations

When an outbound op crosses a delegation that requires sanitisation (per the sanitize caveat described in UCAN and Caveats), the sanitiser modifies the op's payload by stripping or redacting the affected fields. Because the resulting op no longer matches the bytes the original signature was computed over, the signature would no longer verify. Therefore the sanitiser MUST clear the signature field on the sanitised op (set it to None) and record the sanitisation in a marker field (specified in UCAN and Caveats).

The receiver MUST NOT attempt signature verification on a sanitised op. It MUST verify that the sanitisation marker is consistent with a delegation that authorised the sender to apply it; this is the receive-side procedure specified in UCAN and Caveats.

An op that arrives without a signature and without a sanitisation marker MUST be rejected. Sanitisation is the only legitimate path to an unsigned op on the wire (even the bootstrap path described in Section 7 produces a signed op).

7. Bootstrap: the first op a node authors

A node that has not yet been seen on the log presents a chicken-and-egg problem: the receiver does not know the node's public key, so cannot verify the op that establishes the binding.

The protocol resolves this by requiring that a node's first authored op be a DelegateUcan carrying a UCAN that:

  1. Is signed by the issuer's DID key (not the node's).
  2. Embeds the node's public key in the UCAN's subject claim.
  3. Is itself well-formed and verifiable against the issuer's DID.

The receiver:

  1. Recognises that the authoring NodeId is unknown.
  2. Decodes the embedded UCAN.
  3. Verifies the UCAN's signature against the issuer's DID.
  4. If valid, extracts the embedded public key and binds it to the authoring NodeId.
  5. Verifies the op's own signature using the freshly-bound key.

If any step fails, the op is rejected. After this op is applied, the node's identity is known and subsequent ops authored by it follow the standard signing rules.

8. Bearer tokens for HTTP authentication

The Authorization: Bearer header on GET /ops and POST /ops identifies the requesting node to the server. A bearer token is a short-lived signed assertion of node identity, structured as follows:

BASE64URL(header) "." BASE64URL(payload) "." BASE64URL(signature)

(The standard JWS Compact Serialisation, this time not detached.)

Header:

{"alg":"EdDSA","kid":"node-<node_id>"}

Payload (JSON):

{
  "iss": "<NodeId>",
  "aud": "<recipient NodeId or origin>",
  "iat": <unix-seconds>,
  "exp": <unix-seconds>,
  "nonce": "<random>"
}

Tokens MUST have an expiry (exp) no more than one hour in the future. Recipients MUST reject tokens that are expired, that present an iss not bound to a valid public key on the log, or that reuse a nonce already seen for the same iss within the validity window.

Token issuance is per-request: a node generates a fresh token for each peer, signs it, and presents it. There is no central issuer. A future revision may add a refresh-token mechanism; v0.1 implementations issue one-shot tokens.
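
A token-construction sketch, assuming ed25519-dalek for the key, serde_json for the claims, and the same base64 engine as in Section 3; the expiry and nonce handling shown are the caller's choices, not requirements:

use base64::Engine as _;
use base64::engine::general_purpose::URL_SAFE_NO_PAD;
use ed25519_dalek::{Signer, SigningKey};

// Build a one-shot bearer token for a single request to `aud`.
fn bearer_token(key: &SigningKey, node_id: u64, aud: &str, now_secs: u64, nonce: &str) -> String {
    let header = format!(r#"{{"alg":"EdDSA","kid":"node-{node_id}"}}"#);
    let payload = serde_json::json!({
        "iss": node_id.to_string(),
        "aud": aud,
        "iat": now_secs,
        "exp": now_secs + 300,      // well under the one-hour ceiling
        "nonce": nonce,
    })
    .to_string();
    let signing_input = format!(
        "{}.{}",
        URL_SAFE_NO_PAD.encode(header.as_bytes()),
        URL_SAFE_NO_PAD.encode(payload.as_bytes()),
    );
    let sig = key.sign(signing_input.as_bytes());    // standard JWS compact signing input
    format!("{signing_input}.{}", URL_SAFE_NO_PAD.encode(sig.to_bytes()))
}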

9. Verifying authority

A signature establishes that the authoring NodeId produced the op. It does not establish that the NodeId was authorised to produce it. Authorisation is a separate check performed against the UCAN delegation graph — see UCAN and Caveats and Capabilities. Both checks are required; either failure rejects the op.

UCAN and Caveats

This chapter specifies how authority is delegated, attenuated, and revoked, and the caveat vocabulary that narrows a delegation. The protocol uses User-Controlled Authorization Networks (UCAN) as the underlying delegation primitive, and extends UCAN's policy slot with a domain-specific caveat set.

1. UCAN version

v0.1 of this specification builds on UCAN v0.10, the last JWT-shaped revision of the UCAN format. A UCAN token is a compact-serialised JWS (a JWT) over a JSON payload with the standard fields:

  • iss: issuer DID. The party delegating authority.
  • aud: audience DID. The party receiving authority.
  • att: attenuation array. Each entry is a (resource, action, caveats) capability.
  • nbf: not-before time (Unix seconds).
  • exp: expiry time (Unix seconds).
  • prf: proof chain. Array of parent UCAN content hashes.

The full UCAN v0.10 format is specified externally; the canonical reference is the UCAN working group repository.

1.1 v1.0 migration

The UCAN working group has moved on to v1.0, which uses a DAG-CBOR plus Varsig envelope and CIDv1 references. Likewise's v0.1 specifies v0.10 because that is what the reference implementation uses. The v0.10 → v1.0 migration is a known open issue and is expected to land as part of the next major version.

2. Capabilities in v0.10's att field

Every entry in a UCAN's att array is a Likewise capability. The protocol places its capability schema directly in the UCAN policy slot:

{
  "resource": "<Resource enum value>",
  "action":   "<Action enum value>",
  "caveats":  { ... }
}

The set of legal resource and action values, and the legal caveats schema, are specified in Capabilities. This chapter covers how delegations are linked, attenuated, and revoked; the next chapter covers what they can authorise.

3. The delegation graph

A capability flows through the mesh as a chain of UCAN delegations rooted at the user. The user issues their root delegation to one or more nodes (typically the phone), authorising those nodes to author further delegations.

When a delegation D_b cites a parent D_a in its prf array, the receiving node MUST:

  1. Resolve D_a from the op log (or refuse the delegation if it cannot).
  2. Verify that D_b's issuer is D_a's audience (that is, the authority D_b exercises was delegated to its issuer by D_a), transitively up to the user.
  3. Verify D_b's capabilities are an attenuation of D_a's (Section 4).
  4. Verify the time bounds on D_b are within D_a's (D_b.nbf >= D_a.nbf, D_b.exp <= D_a.exp).

A delegation that fails any of these checks MUST be rejected.

4. Strict attenuation

A child delegation's capability set MUST be a subset of its parent's. Attenuation is checked per-(resource, action) pair: the child MAY include any pair the parent includes (or any pair strictly narrowed by additional caveats), and MUST NOT include pairs the parent does not.

For each capability in the child:

  • The (resource, action) pair MUST appear in the parent (possibly with broader caveats).
  • The child's caveats MUST be at least as restrictive as the parent's (Section 5).

A delegation that broadens any caveat compared to its parent MUST be rejected by every receiving node, regardless of whether the broadened delegation was signed correctly.

5. Caveats

Every caveat is optional; an absent caveat means "no restriction along this axis." A delegation with no caveats authorises the full scope of the (resource, action) pair (subject to any restrictions inherited from its parent).

Caveat narrowing rules: a child caveat is at least as restrictive as a parent caveat if and only if every operation that satisfies the child's caveat would also satisfy the parent's.

The v0.1 caveat vocabulary comprises six fields. Future minor versions MAY add caveats; an unknown caveat field MUST be treated as an absolute restriction (a delegation carrying an unknown caveat is admitted, but no operation can satisfy the unknown caveat — effectively granting the empty capability).

5.1 source_types

Restricts the capability to evidence whose source_type matches one of the listed values.

  • absent: no restriction.
  • ["calendar"]: only operations on calendar-source evidence.
  • ["calendar", "contact"]: either calendar or contact.

Narrowing: child's set MUST be a subset of parent's.

5.2 predicates

Restricts the capability to claim operations whose predicate matches one of the listed values.

  • absent: no restriction.
  • ["located_at"]: only claim ops with predicate "located_at".

Narrowing: child's set MUST be a subset of parent's.

5.3 kind_prefix

Restricts the capability to job operations whose kind field starts with one of the listed prefixes.

  • absent: no restriction.
  • ["cortex.synthesize."]: only synthesize-class jobs.
  • ["cortex."]: any reverse-DNS-prefixed cortex job kind.

Narrowing: each child prefix MUST be a prefix of (or equal to) some parent prefix.

5.4 time_range

Restricts the capability to operations whose timestamp.wall_ms falls within the given range.

  • absent: no restriction.
  • [start_ms, end_ms]: inclusive lower bound, exclusive upper bound.

Narrowing: child's range MUST be contained within parent's.
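
The narrowing rules for these first four caveats reduce to three checks: set subset, prefix containment, and range containment. A sketch, treating an absent caveat as None and using illustrative types:

// Returns true if the child caveat is at least as restrictive as the parent's.
// None means the caveat is absent, i.e. "no restriction along this axis".

fn narrows_set(child: &Option<Vec<String>>, parent: &Option<Vec<String>>) -> bool {
    match (child, parent) {
        (_, None) => true,                        // parent unrestricted: anything narrows it
        (None, Some(_)) => false,                 // child drops a restriction the parent had
        (Some(c), Some(p)) => c.iter().all(|v| p.contains(v)),   // subset
    }
}

fn narrows_prefixes(child: &Option<Vec<String>>, parent: &Option<Vec<String>>) -> bool {
    match (child, parent) {
        (_, None) => true,
        (None, Some(_)) => false,
        // every child prefix must extend (or equal) some parent prefix
        (Some(c), Some(p)) => c.iter().all(|cp| p.iter().any(|pp| cp.starts_with(pp.as_str()))),
    }
}

fn narrows_range(child: &Option<(u64, u64)>, parent: &Option<(u64, u64)>) -> bool {
    match (child, parent) {
        (_, None) => true,
        (None, Some(_)) => false,
        (Some((cs, ce)), Some((ps, pe))) => cs >= ps && ce <= pe,   // containment
    }
}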

5.5 sanitize

Specifies field-level redactions that MUST be applied to operations crossing this delegation. Sanitisation is unique among caveats in that it does not block an op; it modifies it in flight.

The v0.1 sanitisation rules are:

  • StripGeo: remove latitude, longitude, altitude, and any other geographic coordinates from evidence metadata, claim objects, and artefact bodies.
  • RedactParticipants: replace participant identifiers with anonymised placeholders, consistent within the operation but not linkable to the original entities.
  • TruncateContent(N): truncate any content body to at most N bytes.
  • StripCustomMetadata: remove any custom-metadata fields not specified by the protocol.

A delegation MAY specify multiple sanitise rules; they are applied in the order listed.

Narrowing: a child delegation's sanitize rule list MUST be a superset of its parent's (sanitisation strengthens at each hop).

5.5.1 The sanitisation marker

When an op is sanitised on the wire, the sanitiser MUST set the op's signature field to None (per Signatures) AND attach a sanitisation marker. The marker is a payload-internal field whose presence both:

  • tells the receiver the op was deliberately filtered, not corrupted, and
  • records the chain of sanitise rules applied (so the receiver can audit that the rules match a delegation the sender held).

The exact wire shape of the marker is specified in Wire Format; the contract here is that the marker is a structurally-required part of any unsigned, deliberately-modified op.

A receiver MUST verify that the marker's claimed sanitise chain is admitted by some delegation the sender holds reaching back to the user. A marker that does not match an authorised chain MUST cause the op to be rejected.

5.6 audit_inference

Requires the delegated node to emit likewise.inference.snapshot artefacts (see Inference Audit) for every model call performed against data covered by this delegation.

  • absent or false: no requirement. The delegated node MAY emit snapshots for its own bookkeeping but is not obliged to.
  • true: the delegated node MUST emit a likewise.inference.snapshot artefact for every inference call performed against ops admitted by this delegation.

Narrowing: a child caveat with audit_inference: true is admissible under any parent (the parent did not require audit; the child voluntarily promises it). A child caveat with audit_inference: false is admissible only if the parent also permits audit-free operation. In other words, audit requirements strengthen down the chain; they cannot be relaxed.

This caveat is the mechanism by which the user requires auditable inference from a delegated party — for example, an organisation running a Likewise node under a scoped delegation. When a delegation carries audit_inference: true, the snapshots that the delegated node emits become themselves operations on the user's log (subject to the user's read capability on the artifact ops the delegated node produces), completing the audit loop across the delegation boundary.

The corresponding invariant is specified in Invariants §I-9.

6. Revocation

A RevokeUcan op authored by a delegation's issuer (or by a node holding write authority over the issuer's DID under a still-valid parent) retires the delegation. Receiving nodes MUST:

  1. Mark the delegation's content hash as revoked in the local UCAN view.
  2. Recursively mark any delegations whose prf cites the revoked one as revoked (transitive cascade).
  3. Re-evaluate the authorisation of every operation whose authority chain depended on a now-revoked delegation. Such operations are NOT removed from the log, but they MUST NOT be applied to projections.

The on-revoke rebuild is one of the more expensive operations in the protocol; implementations SHOULD batch revocations and defer the rebuild to the next idle window when latency permits.

A revoked delegation cannot be un-revoked. To restore the authority, the issuer issues a new delegation.

7. The authorise-and-filter pipeline

When a node receives operations (whether from its own scheduler authoring them locally or from a remote peer), it MUST run the following pipeline before applying them to projections:

  1. Verify signatures per Signatures.
  2. Authorise each op against the authoring node's effective capability set — the union of capabilities derived from delegations rooted at the user, restricted by all caveats in the chain. An op is authorised iff its Action and Resource are admitted and its caveats are satisfied.
  3. Apply transitive cascades: re-evaluate ops whose authority depended on now-revoked delegations.
  4. Sanitise outbound ops crossing delegations with sanitize caveats.

Steps 1-3 run on receive; step 4 runs on send. The pipeline is specified in detail in Capabilities.

8. The user's root delegation

The mesh is bootstrapped by the user issuing a root UCAN to the first node (typically the user's phone). The root delegation:

  • Has the user's DID as iss.
  • Has the first node's NodeId-bound DID as aud.
  • Carries the maximal capability set ((*, *) with no caveats).
  • Has no prf; it is the chain root.

Subsequent delegations cite the root (or a descendant of it) as their proof. The user holds the keypair backing their DID; an implementation MUST provide the user with a mechanism to authorise root re-issuance and to revoke the existing root.

The protocol does not specify the user-interface for this authorisation; that is implementation-defined. The protocol specifies only the wire format of the resulting UCANs.

Capabilities

This chapter specifies the capability vocabulary used in UCAN delegations: the set of legal Resource values, the set of legal Action values, the legal combinations, and the authorise-and-filter pipeline that uses them.

1. Resources

A Resource names a class of protocol-defined entity that a capability authorises an action on. The v0.1 resource vocabulary:

  • Ops: universal; any operation, regardless of category.
  • Evidence: evidence operations (IngestEvidence, TombstoneEvidence).
  • Entity: entity operations (CreateEntity, AddEntityAlias, MergeEntities, SplitEntity).
  • Claim: claim operations (CreateClaim, UpdateClaimStatus, UpdateClaimConfidence, SupersedeClaim).
  • Job: job operations (ScheduleJob, ClaimWork, CompleteJob, YieldWork, ExpireWork).
  • Episode: episode operations (CreateEpisode, UpdateEpisode).
  • Artifact: artefact operations (CreateArtifact, EvictArtifact).
  • Action: suggested-action operations (CreateSuggestedAction, UpdateActionStatus).
  • Mesh: mesh-coordination operations (DesignateCoordinator, RouteKind).
  • UserAssertion: user-assertion operations (UserAssert).
  • Registration: identity and delegation operations (DelegateUcan, RevokeUcan).

Ops is the universal resource: a capability granted on Ops applies to any operation, equivalent to a union of all the specific resources. Implementations MUST treat Ops as that union when checking attenuation: a child capability on a specific resource MAY appear under a parent capability on Ops.

Future minor versions MAY add resources. Implementations MUST reject capabilities naming an unknown resource.

2. Actions

An Action names what may be done with a resource. The v0.1 action vocabulary:

  • Read: the holder may receive operations of the resource class on inbound sync.
  • Write: the holder may author operations of the resource class.
  • Schedule: the holder may emit ScheduleJob ops (only meaningful with Resource::Job).
  • Claim: the holder may emit ClaimWork ops.
  • Complete: the holder may emit CompleteJob, YieldWork, and ExpireWork ops.

Read is the gating action for outbound sync filtering: a peer's GET /ops response MUST only include ops the peer holds Read for. Write is the gating action for op authoring: a node MUST NOT successfully apply an op it authored without holding Write on the relevant resource.

The job-specific actions (Schedule, Claim, Complete) split job authority into discrete capabilities so that, for example, a phone can schedule synthesis jobs while only a trusted server may claim them.

3. Resource × Action matrix

Not every (Resource, Action) combination is meaningful. The table below summarises which combinations the v0.1 specification defines. Cells marked "–" indicate combinations that have no defined effect (a delegation may include them but they will authorise nothing useful; an implementation MAY warn but MUST NOT reject).

Resource      | Read | Write | Schedule | Claim | Complete
Ops           |  ✓   |   ✓   |    ✓     |   ✓   |    ✓
Evidence      |  ✓   |   ✓   |    –     |   –   |    –
Entity        |  ✓   |   ✓   |    –     |   –   |    –
Claim         |  ✓   |   ✓   |    –     |   –   |    –
Job           |  ✓   |   ✓   |    ✓     |   ✓   |    ✓
Episode       |  ✓   |   ✓   |    –     |   –   |    –
Artifact      |  ✓   |   ✓   |    –     |   –   |    –
Action        |  ✓   |   ✓   |    –     |   –   |    –
Mesh          |  ✓   |   ✓   |    –     |   –   |    –
UserAssertion |  ✓   |   ✓   |    –     |   –   |    –
Registration  |  ✓   |   ✓   |    –     |   –   |    –

The Mesh resource grants authority over RouteKind and DesignateCoordinator. These are owner-only ops: the protocol requires that the authoring node hold Mesh.Write AND that the authoring node be the mesh owner (i.e. the holder of the root delegation chain). RouteKind or DesignateCoordinator ops authored by a node holding Mesh.Write via a non-root delegation MUST be rejected.

4. Caveat applicability

The six caveats specified in UCAN and Caveats apply to capabilities as follows:

  • source_types (capabilities on Evidence and on Ops): restricts which evidence source_type values the holder may read or write.
  • predicates (capabilities on Claim and on Ops): restricts which claim predicates the holder may read or write.
  • kind_prefix (capabilities on Job): restricts which job kinds the holder may schedule, claim, or complete.
  • time_range (any capability): restricts the timestamp range of operations the capability admits.
  • sanitize (any capability with Read): specifies sanitisation applied to operations crossing the delegation outbound.
  • audit_inference (capabilities on Job (Claim or Complete), Claim (Write), Artifact (Write), or Ops): when true, requires the delegated node to emit likewise.inference.snapshot artefacts for every model call performed against data covered by this delegation.

A caveat applied to a resource it does not narrow has no effect: a kind_prefix on a capability over Evidence does not restrict anything, because Evidence ops do not have a kind field. Such caveats MAY be present (they do not invalidate the delegation) but they do not authorise additional behaviour.

5. The authorise-and-filter pipeline

This section specifies the procedure a node runs when ingesting an operation, whether locally authored or received over the wire. It MUST be applied in the order specified.

5.1 On receive (inbound)

For each incoming op:

  1. Reject malformed. If the op fails wire-format validation (per Wire Format), reject it.

  2. Verify signature (or skip for sanitised ops; see step 6). If the op carries a signature, verify it per Signatures. If verification fails, reject it.

  3. Resolve authority. Identify the authoring NodeId and walk the chain of UCAN delegations from that NodeId's bound DID to the user's root. If no such chain exists, reject the op.

  4. Check active validity. Reject the op if any delegation in its authority chain is revoked, not yet active (nbf in the future), or expired (exp in the past). The check uses the op's timestamp.wall_ms for nbf / exp comparisons, not the receiver's local wall clock.

  5. Authorise. The op's (Resource, Action) MUST appear in the effective capability set derived from the chain (the intersection of caveats along the chain). The op's payload MUST satisfy every caveat: source-type checks for evidence ops, predicate checks for claim ops, kind-prefix checks for job ops, time-range checks against the op's timestamp.

  6. Verify sanitisation marker (for unsigned ops only). The marker MUST identify a sanitise rule chain admitted by some delegation the authoring node holds reaching to the user. If verification fails, reject the op.

  7. Apply. The op is authorised and authentic; the implementation may now apply it to projections.

A rejected op is dropped from the apply pipeline. Implementations SHOULD log rejections; they MUST NOT silently apply rejected ops or partially apply them.
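
A skeleton of the seven steps, with every helper a hypothetical stand-in for the checks specified elsewhere in this document; only the ordering and the reject-versus-apply outcome are meant to be read off it:

enum Verdict { Apply, Reject(&'static str) }

fn ingest(op: &Operation, ctx: &NodeContext) -> Verdict {
    if !wire_format_valid(op) { return Verdict::Reject("malformed"); }                        // 1
    if let Some(sig) = &op.signature {                                                        // 2
        if !signature_valid(op, sig, ctx) { return Verdict::Reject("signature"); }
    }
    let Some(chain) = resolve_authority_chain(op.node_id, ctx) else {                         // 3
        return Verdict::Reject("no authority chain");
    };
    if !chain_active_at(&chain, op.timestamp.wall_ms) { return Verdict::Reject("inactive"); } // 4
    if !capability_admits(op, &chain) { return Verdict::Reject("unauthorised"); }             // 5
    if op.signature.is_none() && !sanitisation_marker_valid(op, &chain) {                     // 6
        return Verdict::Reject("sanitisation");
    }
    Verdict::Apply                                                                            // 7
}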

5.2 On send (outbound)

When a node responds to a GET /ops request, it MUST filter the candidate ops by the requester's effective capability set before serialising them onto the wire:

  1. Authorise. For each candidate op, evaluate whether the requester is authorised to read it (the equivalent of the inbound check, against the requester's chain). If not, exclude the op from the response.

  2. Sanitise. For each remaining op, if the requester's delegation chain carries sanitize rules, apply the rule chain to a clone of the op:

    • Apply each rule's redactions in order.
    • Set the cloned op's signature to None.
    • Attach the sanitisation marker recording the rule chain.
    • Use the cloned, sanitised op as the response value.

The sanitisation step happens server-side; the requester receives only the sanitised op and cannot recover the redacted fields. This is the only authorised way for an unsigned op to appear on the wire.

5.3 On transitive revocation

When a RevokeUcan op is applied, every previously-applied op whose authority chain depended on the revoked delegation MUST be re-evaluated:

  1. Walk the projection's index of applied ops by chain.
  2. For each affected op, re-run the authorise pipeline as if the op had just been received.
  3. Ops that no longer authorise MUST be removed from the projections (the underlying op log entry is preserved).

This is the operation that gives revocation real teeth: an op that was admitted under a delegation no longer trusted is no longer trusted, retroactively.

6. Capability composition

The user's root delegation is (Ops, *) with no caveats — maximal authority. Subsequent delegations narrow this. A node in practice typically holds:

  • (Ops, Read) with sanitisation caveats — to receive most ops with privacy filtering.
  • (Evidence, Write) with source_types caveat — to ingest evidence from a specific connector.
  • (Job, Schedule) with kind_prefix caveat — to schedule inference work in a specific class.
  • (Job, Claim) and (Job, Complete) with the matching kind_prefix — to actually do the work.
  • (UserAssertion, Write) — to forward user feedback.

A device-specific delegation typically composes several of these into a single UCAN; the implementation builds a node's effective capability set by unioning the granted capabilities across that node's delegations.

7. Reserved combinations

The protocol reserves the following capability behaviours for future minor versions; v0.1 implementations MUST NOT issue or accept delegations using them:

  • Capabilities on a resource type introduced in a future version that the receiving node does not understand.
  • Caveat fields not in the v0.1 vocabulary.

A delegation containing a reserved combination MUST be rejected by a v0.1 conformant node.

Projections

A projection is a materialised read view derived from the operation log. This chapter specifies the projection contract: what each projection MUST be able to answer, what relationships between projections are load-bearing, and what implementations are free to optimise.

The chapter makes explicit what v0.1 implementations had handled implicitly: which projection-related behaviours an implementation MUST provide and which it MAY choose internally.

1. The disposability invariant

A conformant implementation's projections MUST be fully reconstructable from the operation log as it stands at any point in time.

Concretely:

  • An implementation MUST be able to rebuild every projection from the log alone. There MUST NOT exist any state in a projection that has no derivation rule from the log.
  • An implementation MUST NOT modify projections by any means other than applying operations. UI actions, schedulers, caches, and external systems MUST go through the op-log layer.
  • A projection MAY be discarded at any time and rebuilt on demand. An implementation MAY persist projections for performance, but persisting them MUST NOT change the log's authority over their content.

This invariant is the load-bearing reason that the protocol can guarantee the user owns their derived data: nothing the system believes about the user lives outside the log.

2. The three substrate projections

The protocol defines three projections by the queries they answer, not by their storage strategy. An implementation MUST provide each of the three query surfaces; it MAY combine the underlying storage as it sees fit, provided each surface remains queryable as specified.

A fourth projection, salience, is used by the reference implementation to surface records to a user. It is an application-layer convention rather than substrate, and is specified in Annex: Application Conventions §A.3. A node that does not surface records to a user — for example, an organisation's node consuming a scoped slice — has no need to implement it.

2.1 Inference projection

Purpose. Assemble a model context window for an inference call.

Required queries.

  • Window context. Given a time window, return the evidence, claims, entities, and episodes a model should receive as context for synthesising over that window.
  • Per-entity context. Given an entity, return the relevant claim stack and supporting evidence for an inference call centred on that entity.

Constraints.

  • The inference projection MUST be designed for assembly into prompt-shaped data structures, not for UI rendering.
  • It MUST track which claims and evidence are framing tags versus narrative content. Implementations MUST be able to produce both shapes.
  • It SHOULD be cheap to update incrementally as new ops arrive, because it is consulted on every inference call.

The inference projection's content is not the prompt itself — the prompt is constructed by the implementation's inference pipeline. The projection's responsibility is to provide correct context; the pipeline's responsibility is to assemble it.

2.2 Detail projection

Purpose. Answer per-id user-interface lookups.

Required queries.

  • Get by id. Given an EntityId, EpisodeId, ClaimId, ActionId, or EvidenceId, return all user-visible fields for that record: title, label, claim text, status, confidence, provenance links, supporting evidence summary.
  • List by predicate. Given an entity and a predicate, return the current claims of that predicate on that entity (with effective status applied).
  • Provenance trace. Given any derived record, return the chain of supporting operations transitively to evidence.

Constraints.

  • The detail projection MUST be durable (typically on-disk). Every conformant node carries it, regardless of whether the node is rendering a UI.
  • It MUST rebuild from the log when missing or corrupted.
  • It MUST honour user assertions: a Reject user-assertion MUST cause subsequent reads of the affected claim to return the rejected status, regardless of underlying derivation state.
  • Reads MUST be keyed direct lookups. Implementations MAY add secondary indices for richer queries; the v0.1 spec does not require them.

The detail projection is the only projection a v0.1 conformant node MUST persist across restarts.

2.3 Debug-graph projection

Purpose. Full-graph inspection for tooling and verification.

Required queries.

  • Full graph dump. Return all entities, claims, evidence references, and edges between them as of the current log prefix.
  • Cycle detection. Identify cycles in the derivation graph (which MUST NOT exist).

Constraints.

  • The debug-graph projection is OPTIONAL in production deployments.
  • An implementation that includes it SHOULD make it available via inspection tooling (a CLI, an admin endpoint).

The protocol provides this projection because being able to ask "show me the entire graph" is a load-bearing debugging capability for a system the user is asked to trust.

3. The non-collapse rule

An implementation MAY combine the underlying storage of multiple projections — for example, keeping a single SQLite file with separate tables for the detail and inference projections — but it MUST NOT fold the read interfaces such that one projection's query semantics contaminate another.

In particular:

  • An inference-context query MUST NOT return UI-shaped detail records.
  • A detail-by-id query MUST NOT carry inference-window framing tags as if they were claim content.
  • An implementation that adds an application-layer projection (such as the salience projection in Annex §A.3) MUST NOT fold its read interface into a substrate projection.

The reason for the rule is practical: collapsing the interfaces produces a single fat object that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementations that have tried it have re-discovered why the distinction exists; the spec encodes the lesson normatively.

4. Rebuild from log

Every conformant implementation MUST provide a rebuild operation that:

  1. Drops or otherwise invalidates the current state of all projections.
  2. Replays the entire op log in HLC total order, applying each op to every substrate projection the implementation provides (and to any application-layer projections it maintains).
  3. Reaches a steady state in which subsequent op application continues normally.

Rebuild is the recovery mechanism for projection corruption and the verification mechanism for new implementations. Rebuilding from a known log and comparing the result to a trusted reference output is intended to be the strongest test of projection correctness; once the reference implementation (see Implementations) is public, its rebuilds will serve as that reference. Rebuild SHOULD be deterministic up to algorithm-internal choices: the same implementation must produce the same projection from the same log every time, and two implementations that satisfy this chapter's contract must agree on every fact derivable from the log even if they internally choose different ranking or scoring strategies in their application-layer projections.
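
A drop-and-replay sketch, reusing the illustrative Hlc and Operation shapes from earlier sketches; the Projections handle and its clear/apply methods are stand-ins for whatever the implementation maintains:

fn rebuild(log: &[Operation], projections: &mut Projections) {
    projections.clear();                                           // 1. invalidate current state
    let mut ordered: Vec<&Operation> = log.iter().collect();
    ordered.sort_by_key(|op| (op.timestamp.wall_ms, op.timestamp.logical, op.timestamp.node));
    for op in ordered {
        projections.apply(op);                                     // 2. replay in HLC total order
    }
    // 3. steady state: live ops continue through the same apply path
}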

5. Per-projection authority

When two projections produce conflicting answers about the same fact, the detail projection wins for user-visible display, and the inference projection wins for model context. The debug-graph projection — and any application-layer projection an implementation chooses to maintain — MUST NOT supply authoritative answers about the user's data.

If the detail and inference projections disagree about a user-visible field, the implementation has a bug. The spec obligates them to agree on every fact derivable from the log.

6. The frontier and projection state

A node's causal frontier (per Sync) is itself a projection — it summarises the maximum HLC seen per author, which is computable from the op log alone. Implementations MUST maintain the frontier consistently with the log: after applying op O to projections, the frontier MUST reflect O's timestamp.

A node SHOULD persist the frontier to avoid re-scanning the entire log on startup, but a node that does not persist it MUST recompute it correctly on the first sync exchange.
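
Because the frontier is just the per-author maximum HLC, recomputation is a single pass over the log; a sketch reusing the illustrative Hlc with derived Ord from Clocks §2:

use std::collections::HashMap;

// NodeId -> greatest HLC timestamp observed from that author.
fn recompute_frontier(log: &[Operation]) -> HashMap<u64, Hlc> {
    let mut frontier = HashMap::new();
    for op in log {
        frontier
            .entry(op.timestamp.node)
            .and_modify(|h| if op.timestamp > *h { *h = op.timestamp })
            .or_insert(op.timestamp);
    }
    frontier
}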

7. Capability filtering and projection

Operations rejected by the authorise-and-filter pipeline (per Capabilities) MUST NOT enter projections. The integrity of "everything in projections is authorised" is load-bearing for any reasoning about projections under capability evolution: when a delegation is revoked, the re-evaluation step (per Capabilities) MUST remove from projections any record whose authority chain no longer authorises it, even though the underlying log entry is preserved.

8. Informative: storage strategies

Informative section. Does not impose requirements.

A reference-implementation deployment uses:

  • An in-memory window-segmented structure for the inference projection.
  • A SQLite database with a per-id table for the detail projection.
  • A separate petgraph::StableGraph for the debug-graph projection (rebuilt on demand rather than maintained).
  • An in-memory hash map for the application-layer salience projection's scores (see Annex §A.3).

Other deployments might combine them differently. The contract above is the only thing the protocol requires.

Invariants

The invariants in this chapter are the non-negotiable rules of the protocol. Every other normative statement in the specification exists to make one of these rules implementable; an implementation that violates any of them is non-conformant regardless of which other sections it satisfies.

This chapter is the canonical, formal version of the rules introduced informally in Concepts and Motivation.

I-1. Log canonicality

Only operations on the log mutate canonical state. No other mutation source is admissible. Any state held by an implementation that is not derivable from the op log is, by definition, not part of the user's knowledge graph.

Concretely:

  • A projection MUST NOT carry a field that has no derivation rule from the log.
  • An external integration (a UI, a scheduler, a third-party bridge) MUST mediate every mutation through the op-log layer.
  • Any apparent state — a notification badge, a cached thumbnail — that lives outside the log is a presentation artefact, not a fact.

I-2. Projection disposability

All projections are reconstructable from the op log. No projection's content may be load-bearing in a way that prevents rebuild.

Implementations MUST be able to drop any projection and rebuild it from the log alone. The rebuild is specified in Projections.

I-3. Transitive provenance to evidence

Every user-visible claim, episode, and suggested action has a chain back to evidence.

Concretely:

  • A claim's provenance field references the supporting evidence and supporting claims.
  • An episode's evidence_ids, claim_ids, and entity_ids fields are populated.
  • A suggested action's supporting_claims, supporting_evidence, and derivation_job fields are populated.

For each link in the chain, the referenced record MUST be present on the log (or be a tombstoned record whose absence is itself an op).

A derived record without provenance to evidence is malformed and MUST be rejected.

I-4. Derivation is a DAG

The derivation graph is a directed acyclic graph. A claim MUST NOT (transitively) cite itself. An operation that would introduce a cycle into the derivation graph MUST be rejected.

The entity-resolution graph (which entity merges into which) is not required to be acyclic; cycles in entity resolution are resolved deterministically per Mesh Coordination.

The DAG-ness of derivation is what makes invalidation decidable: a tombstone or rejection cascades forward along outgoing edges in finite steps.

I-5. Sync converges operations, not projections

Two nodes that have applied the same set of operations agree on canonical state. Differences in projection materialisation are permitted — implementations may differ in salience algorithms or indexing strategies — but the underlying truth they project from MUST be the same.

A v0.1 conformance test consists in part of:

  1. Send the same op log to two implementations.
  2. Verify their detail projections answer the same get-by-id queries identically.

Implementations whose detail projections disagree on facts derivable from a shared log are non-conformant.

I-6. Per-author HLC monotonicity

No author produces two operations with the same HLC value. Within a single author's stream of authored ops, HLC values MUST be strictly monotonically increasing.

This is enforced by the tick discipline in Clocks. A receiving node that observes two ops with identical (wall_ms, logical, node) from the same author MUST treat the condition as an integrity failure — the authoring node violated the protocol — and reject both ops.

The frontier-based sync cursor depends on this invariant.

I-7. Authentic authorship

Every authored operation is signed by its author, or is a deliberately sanitised op admitted by an authority chain.

There are exactly two ways for an op to appear unsigned on the wire:

  1. The op has been sanitised under a sanitize caveat per UCAN and Caveats. Such an op carries a sanitisation marker.
  2. The op is the bootstrap DelegateUcan that establishes a new node's binding, where authentication is provided by the embedded UCAN's own signature rather than by the op envelope's.

An op that arrives unsigned without satisfying one of these conditions MUST be rejected.

I-8. Atomic tombstone cascade

Removing evidence cascades atomically through derived data.

When TombstoneEvidence (or CascadeTombstone) is applied, every claim, entity merge, episode, suggested action, and inference snapshot that transitively depended on the tombstoned evidence MUST be invalidated as part of the same logical apply. A receiver MUST NOT observe a state in which the evidence is gone but its dependents remain "live" in projections.

Implementation strategies (single transaction, op-batched apply, idempotent retry) are at the implementation's discretion; the observable atomicity is what the spec requires.

I-9. Inference is recorded (when audit is in force)

A node performing inference under audit MUST emit an InferenceSnapshot artefact for every model call.

Audit is in force in two cases:

  1. The node is operating under the user's root delegation. The reference implementation, and any implementation a user runs on their own devices, falls into this category. For such nodes, audit is the default and is normative for v0.1 conformance — an inference call without a corresponding snapshot is a violation regardless of what other invariants the implementation satisfies. This is what makes the user's own personal mesh auditable end-to-end.

  2. The node is operating under a delegation whose caveats require audit. A user delegating to an organisation's node MAY attach an audit_inference caveat (specified in UCAN and Caveats) requiring the delegated node to emit snapshots for inference performed against the delegated data. In this case, the snapshots are themselves visible on the log the user receives back from the delegated node, completing the audit loop across organisational boundaries.

A delegated node operating without an audit caveat is not required by this invariant to record its internal inference. Whatever the delegated node does with the data it received — training, summarisation, classification, recommendation — is governed by the delegation's other caveats and by whatever out-of-band agreements the user and the delegated party have. This is a deliberate scope choice: the protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph.

When audit is in force, every derived claim, every materialised record (including episodes and suggested actions where they exist as application-layer conventions), and every other inference output MUST link to its producing snapshot via the record's provenance fields or via causal_deps. The "how did it know?" question has, when audit is in force, a literal answer consisting of evidence and claims.

The snapshot artefact's required content (model identity, retrieved context, prompt, output, telemetry) is specified in Inference Audit §2.

I-10. Authority is verified per op

An operation is admitted to projections only if its authoring node held the necessary capability at the operation's timestamp.

Authority is verified by walking the chain of UCAN delegations from the authoring node to the user's root, at the operation's HLC timestamp. Operations whose chain is incomplete, expired, not yet active, or revoked MUST NOT enter projections.

When a delegation is revoked retroactively (per Capabilities), ops that no longer authorise MUST be removed from projections, even though their op-log entries are preserved.

A note on enforcement

These ten invariants are not aspirational. An implementation that violates any of them produces a system in which the user cannot trust what the system says about them — which is the condition the protocol exists to prevent.

A v0.1 conformance test consists of demonstrating that each invariant holds under a battery of concrete operations and sequences. The seven scenarios planned to ship with the reference implementation (see Implementations) are intended to collectively cover I-1 through I-10. Until those scenarios are public, the path to "behaviourally conformant for v0.1" is to construct equivalent coverage from this chapter's invariants directly.

State Machines

This chapter specifies the lifecycle state machines for the substrate record types whose transitions are non-trivial: claims, jobs, and node registrations. The job FSM is specified normatively in Mesh Coordination; this chapter restates it for completeness.

The Episode and Suggested-action FSMs used by the reference implementation are application-layer conventions and live in Annex: Application Conventions.

For each FSM:

  • States are listed with a brief description.
  • Transitions are listed with the operation that causes them and any normative constraints.
  • Cascading effects on other records are specified.

1. Claim FSM

A claim's lifecycle is the most consequential FSM in the protocol because user-visible recommendations depend on whether the underlying claims are believed.

1.1 States

  • Hint: initial low-confidence guess. Not surfaced to the user.
  • Claim: the system is operating on this as a working belief.
  • Fact: user-confirmed. Frozen against subsequent automatic invalidation.
  • Disputed: the system has conflicting claims about the same subject and predicate; surfacing requires resolution.
  • Rejected: the user (or downstream evidence) has invalidated this claim.
  • Superseded: replaced by a newer claim.
  • Stale: a supporting source was invalidated; the claim's evidential basis no longer holds.

1.2 Transitions

From         | To            | Cause               | Constraint
(creation)   | Hint or Claim | CreateClaim op      | Initial status.
Hint         | Claim         | UpdateClaimStatus   | Confidence threshold passed.
Claim        | Fact          | UserAssert(Confirm) | User-authored. The claim is frozen; subsequent non-user UpdateClaimStatus ops MUST NOT change its status.
Claim        | Rejected      | UserAssert(Reject)  | Triggers the cascade in Section 1.3.
Claim        | Disputed      | UpdateClaimStatus   | Used when a conflicting claim of the same subject and predicate exists.
any non-Fact | Superseded    | SupersedeClaim      | The replacement claim's claim_id MUST already be on the log.
any non-Fact | Stale         | derivation cascade  | Triggered by TombstoneEvidence of a supporting evidence record or Reject of a supporting claim.

1.3 Cascade on rejection

When a claim transitions to Rejected via a user assertion, the implementation MUST traverse the derivation DAG forward and invalidate every record that transitively depended on the rejected claim:

  • Dependent claims transition to Stale.
  • Dependent episodes transition to Stale (Section 2).
  • Dependent suggested actions transition to Rejected (Section 3).
  • Dependent inference snapshots are tagged stale; their artefacts MAY be evicted on the next eviction pass.

The cascade is part of the same logical op apply (per I-8 in Invariants).

1.4 Frozen-fact immunity

A claim with status Fact is frozen. The following constraints apply:

  • A non-user-authored op (UpdateClaimStatus, UpdateClaimConfidence, SupersedeClaim) targeting a frozen claim MUST be rejected.
  • A TombstoneEvidence op MAY cascade into a frozen claim (Section 1.3) only if the op is itself authored by a node with user-assertion authority. Otherwise the cascade stops at the frozen claim's boundary.
  • A UserAssert(Reject) op MAY override frozen state — the user retains the authority to change their mind.
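
A minimal guard for the frozen-Fact rule, with an illustrative status enum; it encodes only the rule that non-user ops cannot move a Fact:

#[derive(PartialEq)]
enum ClaimStatus { Hint, Claim, Fact, Disputed, Rejected, Superseded, Stale }

// Returns whether a status-changing op may be applied to a claim in `current`.
fn may_change_status(current: &ClaimStatus, user_authored: bool) -> bool {
    match current {
        ClaimStatus::Fact => user_authored,   // frozen: only a user assertion may move it
        _ => true,
    }
}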

2. Job FSM

Restated from Mesh Coordination for completeness.

  • Pending: scheduled but not claimed. Eligible for claim.
  • Claimed: a worker holds the lease.
  • Completed: a CompleteJob op terminated the job.

Transitions:

  • (creation) → Pending: ScheduleJob.
  • Pending → Claimed: ClaimWork.
  • Claimed → Pending: YieldWork or ExpireWork.
  • Claimed → Completed: CompleteJob.

Completed is terminal.

3. Node-registration FSM

A node's lifecycle in a mesh is governed by the UCAN delegation graph rather than by an explicit state field, but the observable states are useful to name.

3.1 States

State     | Meaning
Pending   | The node has authored its bootstrap DelegateUcan but the receiving nodes have not yet observed it.
Active    | The bootstrap delegation has been observed; the node may author and receive ops per its capability set.
Suspended | A delegation in the node's chain is no longer in force (typically due to a parent's nbf/exp window) but is not revoked.
Revoked   | The node's authority has been retired by RevokeUcan of a parent in its chain.

3.2 Transitions

From                | To        | Cause
(creation)          | Pending   | First op authored by an unknown node
Pending             | Active    | Bootstrap DelegateUcan observed and verified
Active              | Suspended | Time-bound delegation expired, but parent still valid
Suspended           | Active    | Renewed delegation issued
Active or Suspended | Revoked   | RevokeUcan of a parent in the chain

Revoked is not strictly terminal: the same node identity can later be re-admitted by a new delegation chain. v0.1 implementations MAY treat Revoked as recoverable, provided they re-evaluate the entire authority chain at the time of re-admission.

A node in Suspended MUST NOT have its newly-authored ops applied to projections; the ops remain on the log but are treated as if their authority chain were broken until the suspension lifts.

A node in Revoked MUST have its previously-applied ops re-evaluated per Capabilities.

4. Status precedence under user assertions

Across all FSMs in this chapter, user assertions take precedence over machine-derived state. Concretely:

  • A UserAssert(Confirm) is a one-way trip toward stronger belief; the affected record is frozen against demotion.
  • A UserAssert(Reject) is final; the affected record is invalidated and stays invalidated until a subsequent UserAssert overrides it.
  • A user-authored UpdateActionStatus (or analogous op for other record types) takes precedence over any system-authored op of the same shape.

The mechanism by which the receiving node distinguishes user-authored from system-authored ops is the authoring node's capability set: a node holding UserAssertion.Write without restriction (and bound to the user's own root) is "user-bearing" in the sense the spec needs. The exact implementation is in Capabilities.

Mesh Coordination

Part 2 of the specification, first chapter. This chapter is the work-distribution layer of Likewise. It depends on the substrate (Part 1) for op log, sync, signatures, capabilities, and projections; it adds the vocabulary by which multiple nodes cooperate on a single user's work.

The companion chapter Inference Audit covers the second concern of Part 2: how inference calls performed by audited nodes become recoverable artefacts on the log.

An implementation that wants to be a substrate peer — for example, an organisation's node consuming a scoped slice of a user's graph for its own internal purposes — does not need to implement this chapter. It can sync, verify, authorise, and read the log without participating in the work-routing machinery. An implementation that wants to participate in distributed work on a user's behalf — the reference implementation, a server the user runs at home, a delegated organisation node the user has asked to handle inference jobs — does need this chapter.

This chapter specifies how multiple nodes cooperate on a single user's mesh: how work is scheduled and claimed, how the designated coordinator's role differs from a peer's, how the owner routes specific job kinds to specific nodes, and how conflicts between concurrent claims are resolved.

The relevant operations were enumerated in Operations; this chapter specifies their semantics, state-machine effects, and authority requirements. The inference-snapshot artefact format that audited nodes emit when they execute work is specified in Inference Audit.

1. Roles in a mesh

A mesh has the following roles. A single node MAY hold more than one role.

  • Owner. The node that holds the user's root UCAN delegation. The owner is the only node authorised to author DesignateCoordinator and RouteKind ops (per Capabilities). In a typical deployment the user's phone is the owner.
  • Coordinator. The node currently designated to run the deterministic derivation pass. There is exactly one coordinator per mesh at a given log prefix. The owner designates the coordinator explicitly; there is no automatic election.
  • Worker. Any node with (Job, Claim) authority. Workers claim and execute scheduled jobs.
  • Peer. A node that is none of the above; it receives the log under whatever caveats apply to its delegation.

These roles are protocol-level. An implementation MAY add finer-grained internal roles (an "ingestion" role, a "surfacing" role); they are out of scope for v0.1.

2. The job state machine

A job is created by a ScheduleJob op and proceeds through the following states:

   Pending  ──ClaimWork──►  Claimed  ──CompleteJob──►  Completed
      ▲                        │
      │                        ├──YieldWork────────►  Pending
      │                        │
      └────────ExpireWork──────┘  (when lease HLC-deadline passes)

Transitions:

  • ScheduleJob: creates a job in Pending.
  • ClaimWork: a Pending job becomes Claimed. Authored by the worker; carries lease_duration_ms.
  • CompleteJob: a Claimed job becomes Completed. Authored by the current claimer.
  • YieldWork: a Claimed job returns to Pending. Authored by the current claimer.
  • ExpireWork: a Claimed job whose HLC-relative deadline has passed returns to Pending. May be authored by any node, not only the original claimer.

Completed is terminal. A job once completed is not re-claimed; subsequent ClaimWork ops naming a completed job MUST be rejected.
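
A sketch of the resulting transition function, with the conflict-resolution and routing checks of Sections 4 and 5 omitted; names are illustrative:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum JobState { Pending, Claimed, Completed }

enum JobOp { ScheduleJob, ClaimWork, CompleteJob, YieldWork, ExpireWork }

/// Apply a job op to the job's current state (`None` = not yet created).
/// An Err means the op MUST be rejected, e.g. ClaimWork on a Completed job.
fn apply_job_op(state: Option<JobState>, op: &JobOp) -> Result<JobState, &'static str> {
    match (state, op) {
        (None, JobOp::ScheduleJob) => Ok(JobState::Pending),
        (Some(JobState::Pending), JobOp::ClaimWork) => Ok(JobState::Claimed),
        (Some(JobState::Claimed), JobOp::CompleteJob) => Ok(JobState::Completed),
        (Some(JobState::Claimed), JobOp::YieldWork | JobOp::ExpireWork) => Ok(JobState::Pending),
        _ => Err("op is not admissible in the job's current state"),
    }
}
```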

3. Lease expiry

A ClaimWork op carries lease_duration_ms and is authored at HLC timestamp claim_op.timestamp. The lease's effective deadline is:

deadline_wall_ms = claim_op.timestamp.wall_ms + lease_duration_ms

A job is considered expired at any point where some node's HLC has wall_ms > deadline_wall_ms. Expiry is measured against the HLC, not against any node's local wall clock; this makes expiry robust to clock skew across the mesh (see Clocks).

Any node MAY emit an ExpireWork op once it observes expiry. Multiple nodes MAY emit concurrent ExpireWork ops for the same job; the receiving nodes apply them idempotently.

A claimer that wishes to extend its lease MUST do so by emitting a fresh ClaimWork op (with a new op_id and current timestamp) before the previous deadline. There is no separate lease-renewal op in v0.1.
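
A sketch of the deadline arithmetic and the expiry check, assuming an HLC carrying a wall_ms component plus a logical counter (the authoritative shape is in Clocks); names are illustrative:

```rust
/// Hybrid logical clock value; the authoritative shape is defined in Clocks.
#[derive(Clone, Copy, Debug)]
struct Hlc {
    wall_ms: u64,
    counter: u32, // logical component, not consulted for lease expiry
}

struct Lease {
    claim_timestamp: Hlc,   // HLC timestamp of the accepted ClaimWork op
    lease_duration_ms: u64, // lease_duration_ms carried by the ClaimWork op
}

impl Lease {
    /// deadline_wall_ms = claim_op.timestamp.wall_ms + lease_duration_ms
    fn deadline_wall_ms(&self) -> u64 {
        self.claim_timestamp.wall_ms.saturating_add(self.lease_duration_ms)
    }

    /// Expiry is judged against the observing node's HLC, never its local wall
    /// clock, so every node converges on the same answer despite clock skew.
    fn is_expired(&self, observer_hlc: Hlc) -> bool {
        observer_hlc.wall_ms > self.deadline_wall_ms()
    }
}
```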

4. Conflicting claims

Two workers may emit ClaimWork ops for the same Pending job concurrently. The receiving node resolves the conflict by HLC total order: the ClaimWork with the smaller HLC value is the winner, and subsequent ClaimWork ops on the same job while it is Claimed MUST be rejected.

If both ops have an indistinguishable HLC value (which is only possible if the HLC tick discipline is violated; see Clocks), the receiver MUST reject both ops as an integrity failure rather than picking arbitrarily.

A worker whose ClaimWork was rejected SHOULD NOT immediately re-attempt; it SHOULD wait for the current lease to expire (after which the job is Pending again) or for a YieldWork op from the current claimer.

5. RouteKind

A RouteKind op directs all jobs of a given kind to a single node. While a route is set:

  • The owner-authored RouteKind is the authoritative target.
  • A ClaimWork op naming a job whose kind is routed MUST be rejected unless the claimer matches the routed target.
  • A worker whose (Job, Claim) capability admits the kind but who is not the routed target MUST NOT successfully claim routed jobs.

RouteKind ops follow last-write-wins semantics by HLC total order: the most recent RouteKind for a given kind is in force. Setting route to None clears the directive; the kind returns to the default "any eligible worker may claim."

RouteKind is owner-only; per Capabilities, an op carrying a RouteKind payload from a non-owner MUST be rejected.

A useful pattern enabled by RouteKind: a phone with no GPU schedules synthesis jobs and routes them to a server with a GPU. The phone never claims those jobs because the route restricts claiming to the server. The same delegation graph that authorises the server to do the work also authorises it to read the prompt context.
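
A sketch of the claim-eligibility check a receiver might apply under active routes, assuming a last-write-wins route table keyed by kind; names are illustrative:

```rust
use std::collections::HashMap;

type NodeId = String;  // illustrative; real node identities are defined elsewhere in Part 1
type JobKind = String; // e.g. "cortex.synthesize.window"

/// Last-write-wins route table derived from RouteKind ops.
/// A `None` target records an explicit "route cleared" directive.
struct RouteTable {
    routes: HashMap<JobKind, Option<NodeId>>,
}

impl RouteTable {
    /// May `claimer` claim a Pending job of `kind`?
    /// An active route restricts claiming to the routed target; a cleared or
    /// absent route returns the kind to "any eligible worker may claim".
    fn may_claim(&self, kind: &str, claimer: &NodeId) -> bool {
        match self.routes.get(kind) {
            Some(Some(target)) => target == claimer,
            Some(None) | None => true,
        }
    }
}
```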

6. DesignateCoordinator

The coordinator is the node responsible for the deterministic derivation pass: the part of the inference pipeline that two nodes seeing the same op log MUST agree about. This typically includes auto-observation of evidence, entity resolution, and the rhythm pass (see the reference implementation's pipeline documentation for examples).

DesignateCoordinator ops:

  • MUST be authored by the mesh owner.
  • Take effect at their HLC timestamp.
  • May be re-issued at any time to change coordinator. The most recent DesignateCoordinator op (by HLC total order) is in force.

There is exactly one coordinator at any HLC timestamp. A node that is not the coordinator MUST NOT author derivation ops that the coordinator would normally author. An op that violates this rule MUST be rejected on receive.

An implementation MAY track the "non-coordinator drift" — the case where the coordinator has been quiet for an unusually long time — and surface it to the owner so the owner can re-designate. v0.1 does not specify an automatic re-designation; the owner remains in control.

7. Causal dependencies between jobs

The causal_deps field on every operation (introduced in Operations) carries a possibly-empty set of predecessor OpIds. For job ops, causal_deps is the mechanism for DAG chaining: a synthesis job can depend on the completion of a tool-use job by including the tool-use job's CompleteJob op id in its causal_deps.

A node receiving a job op with non-empty causal_deps MUST:

  • Verify each dependency is present on the local log (the op was either authored locally or received from a peer).
  • Defer applying the dependent op to its projections until all dependencies are present and applied.

Job dependencies form a DAG. A cycle in the dependency graph is a protocol violation; an implementation that detects one MUST reject the offending op.
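
A sketch of the dependency gate, assuming the receiver can ask its local log whether an op id is present and applied; names are illustrative:

```rust
use std::collections::HashSet;

/// Illustrative op id; the real OpId shape is defined in Operations.
type OpId = [u8; 32];

/// The only question the gate needs the local log to answer.
struct AppliedOps {
    applied: HashSet<OpId>,
}

/// A job op with non-empty causal_deps is applied to projections only once
/// every dependency is present on the local log and already applied;
/// otherwise the op is deferred and re-checked as new ops arrive.
fn ready_to_apply(log: &AppliedOps, causal_deps: &[OpId]) -> bool {
    causal_deps.iter().all(|dep| log.applied.contains(dep))
}
```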

8. The work roster

Implementations maintain a work roster projection that tracks each job's current state, claimer, and lease deadline. The roster is derived from the op log per the rules in this chapter. The protocol does not specify the roster's storage shape; it specifies only the queries the roster must answer:

  • "What is the current state of job X?"
  • "Which jobs of kind K are currently Pending and admitted by any active route?"
  • "Which Claimed jobs are past their deadline?"

These queries are sufficient to drive the worker loop: periodically check the roster for Pending jobs the local node may claim, attempt to claim them, execute, and emit CompleteJob (or YieldWork).
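
A sketch of that loop against the three roster queries, with op authoring and handler execution stubbed out; all names are illustrative (job ids are shown as u64 for brevity):

```rust
/// The three queries the work roster must answer, as a trait the loop drives.
trait WorkRoster {
    /// Pending jobs the local node may claim, honouring any active route.
    fn pending_claimable(&self, local_node: &str) -> Vec<u64>;
    /// Current state of a job ("Pending", "Claimed", "Completed").
    fn state_of(&self, job_id: u64) -> Option<&'static str>;
    /// Claimed jobs whose HLC lease deadline has passed.
    fn expired_claims(&self) -> Vec<u64>;
}

// Stubs standing in for op authoring and handler execution.
fn emit_expire_work(_job: u64) {}
fn emit_complete_job(_job: u64, _artifacts: Vec<u64>) {}
fn emit_yield_work(_job: u64) {}
fn try_claim(_job: u64, _node: &str) -> bool { true }
fn execute(_job: u64) -> Result<Vec<u64>, ()> { Ok(Vec::new()) }

/// One pass of the worker loop, run on a jittered cadence (Section 9).
fn worker_tick(roster: &dyn WorkRoster, local_node: &str) {
    // Any node may expire a lease whose HLC deadline has passed.
    for job_id in roster.expired_claims() {
        emit_expire_work(job_id);
    }
    // Claim what we are eligible for, execute, and close or yield.
    for job_id in roster.pending_claimable(local_node) {
        if try_claim(job_id, local_node) {
            match execute(job_id) {
                Ok(artifacts) => emit_complete_job(job_id, artifacts),
                Err(()) => emit_yield_work(job_id),
            }
        }
    }
}
```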

9. Worker etiquette

These are SHOULD-level recommendations for worker implementations to keep the mesh healthy:

  • A worker SHOULD NOT claim more jobs than it can complete within the lease duration.
  • A worker SHOULD emit YieldWork if it knows it cannot complete a job (low battery, going to sleep, handler error).
  • A worker SHOULD NOT race other workers for Pending jobs. After a ClaimWork is accepted by some node, other workers SHOULD back off until expiry or yield.
  • A worker SHOULD jitter its claim cadence to avoid thundering-herd effects in a mesh with many concurrent workers.

10. Job kinds

The kind field on a job is a typed work-kind string. The protocol does not constrain the namespace, but recommends reverse-DNS-prefix style. Examples used by the reference implementation:

  • cortex.extract.tier1 — per-evidence deterministic extraction.
  • cortex.enrich.tier2 — per-entity interpretive enrichment.
  • cortex.synthesize.window — per-window episode synthesis.
  • cortex.action.<verb> — action-execution handlers.
  • cortex.tool.<name> — tool-use handlers feeding inference.

A job kind is application-defined (the kind tells a handler what to do); the protocol's interest is solely in routing claims by prefix. An implementation MAY register handlers for kinds it recognises and ignore kinds it does not; a node that does not recognise a kind simply never claims jobs of that kind.

11. Inference produced by jobs

When a job's handler invokes a model, the resulting inference call is governed by Inference Audit — specifically, whether the executing node is operating under audit-in-force conditions and, if so, what artefact and link records the call must leave behind.

The relevant interactions with this chapter are:

  • The source_job field of a likewise.inference.snapshot artefact links the snapshot back to the job whose CompleteJob op closed it. Implementations correlate the two via the substrate's causal_deps mechanism.
  • The job's output_artifacts field on CompleteJob SHOULD include the snapshot's artifact_id when the job's handler performed audited inference.
  • Job kinds in the reverse-DNS namespace cortex.synthesize.*, cortex.extract.*, and cortex.tool.* are conventionally used for inference work; they are application-defined and not normative for v0.1.

See Inference Audit for the snapshot artefact's full content format, the linking rules from inference outputs back to snapshots, and the snapshot lifecycle.

Inference Audit

Part 2 of the specification, second chapter. This chapter depends on the substrate (Part 1) for op log, capabilities, and the generic artefact mechanism, and on the previous chapter (Mesh Coordination) for job and lease semantics. It specifies the convention by which inference calls become recoverable artefacts on the log.

The Likewise substrate (Part 1) lets a user own the canonical record of facts derived about them. The mesh coordination layer (the previous chapter in Part 2) lets multiple nodes cooperate on the work of producing those derived records. This chapter specifies the third concern of Part 2: how the inference itself — the model calls that produce many of the protocol's derived records — becomes recoverable and auditable.

Audit is what closes the "how did it know?" loop. A recommendation, a derived claim, a synthesised episode — each can be traced back, mechanically, to the model call that produced it, the prompt and context fed to that call, and the model's literal output. The mechanism is the likewise.inference.snapshot artefact: a typed artefact emitted alongside any audited inference call, riding the substrate's generic artefact machinery.

The chapter is short because the mechanism is small. Audit requires three things:

  1. a rule about when a node must emit snapshots,
  2. a content format for the snapshot artefact, and
  3. a linking convention that ties produced records back to the snapshot.

Each is specified in turn below.

1. When a snapshot must be emitted

A node MUST emit a likewise.inference.snapshot artefact for every model call it performs in either of the following cases:

  1. The node is operating under the user's root delegation. The reference implementation, and any node a user runs on their own devices, falls into this category. For such nodes audit is the default; an inference call without a corresponding snapshot is a violation of v0.1 conformance, regardless of what other invariants the implementation satisfies. This is what makes the user's own personal mesh auditable end-to-end.

  2. The node is operating under a delegation whose caveats include audit_inference: true. A user delegating to an organisation's node MAY attach this caveat (specified in UCAN and Caveats §5.6) to require the delegated node to emit snapshots for inference performed against the delegated data. The snapshots themselves become operations on the user's log, completing the audit loop across the delegation boundary.

In all other cases — a delegated node operating without an audit caveat — snapshot emission is optional. The node MAY emit snapshots for its own bookkeeping but is not required to. What the node does internally with the data it received is governed by the delegation's other caveats and by whatever out-of-band agreements the user and the delegated party have, not by this chapter.

This split is deliberate. The protocol's role is to let the user decide whether audit applies, not to mandate it for every party that ever processes a piece of the user's graph. Mandating audit universally would be unenforceable across organisational boundaries; making it caveat-controlled gives the user the lever they need without overreaching.

2. The likewise.inference.snapshot artefact

A likewise.inference.snapshot artefact is a CreateArtifact op (see Operations §8.1) whose artifact_type is the literal string "likewise.inference.snapshot". The artefact's content (the bytes referenced by content_hash, optionally inlined via content_inline) MUST be a postcard-encoded record with the following fields, in this order:

Field                    | Purpose
model_id                 | The identifier of the model used (e.g. "gemma-4-E2B-Q4_K_M").
model_version            | The model-specific version or revision tag.
backend                  | The inference backend ("llama-cpp", "litert-lm", ...).
retrieved_context        | The structured set of evidence ids, claim ids, and entity ids that were assembled into the prompt.
prompt                   | The literal prompt sent to the model, including system message and user turns.
output                   | The model's response, including any structured fields the handler parsed out.
telemetry                | Wall-clock duration, token counts (prompt + completion), latency components if available.
started_at, completed_at | HLC values bracketing the call.

The artefact's envelope source_job field MUST be set to the job_id of the job whose handler made the call. The inputs_used field MUST list every evidence id in retrieved_context (the substrate's generic-artefact contract already requires this for any artefact produced from evidence).

Additional implementation-specific fields MAY be present in the encoded record. Future minor versions of this specification MAY add reserved fields; an implementation that does not understand a future field MUST preserve it during round-trip rather than discarding it.
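
A sketch of the content record as a Rust type, assuming the serde and postcard crates (with postcard's alloc feature); the Rust types, sub-structures, and id representations shown are illustrative, and the field table above is the normative statement:

```rust
use serde::{Deserialize, Serialize};

/// Content of a likewise.inference.snapshot artefact, fields in wire order.
/// Rust types are illustrative; the field table above is normative.
#[derive(Serialize, Deserialize)]
struct InferenceSnapshotContent {
    model_id: String,
    model_version: String,
    backend: String,
    retrieved_context: RetrievedContext,
    prompt: String,
    output: String,
    telemetry: Telemetry,
    started_at: Hlc,
    completed_at: Hlc,
    // Future minor versions may append fields; unknown trailing data should
    // be preserved on round-trip (a real implementation keeps the raw bytes).
}

#[derive(Serialize, Deserialize)]
struct RetrievedContext {
    evidence_ids: Vec<String>,
    claim_ids: Vec<String>,
    entity_ids: Vec<String>,
}

#[derive(Serialize, Deserialize)]
struct Telemetry {
    duration_ms: u64,
    prompt_tokens: u32,
    completion_tokens: u32,
}

#[derive(Serialize, Deserialize)]
struct Hlc { wall_ms: u64, counter: u32 }

/// Produce the bytes referenced by the artefact's content_hash.
fn encode(content: &InferenceSnapshotContent) -> Result<Vec<u8>, postcard::Error> {
    postcard::to_allocvec(content)
}
```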

3. Linking from outputs to snapshots

Any record produced by a snapshot-emitting inference call MUST link back to the snapshot. Specifically:

  • A derived CreateClaim op produced by inference MUST include the snapshot's artifact_id in its provenance field.
  • A CreateArtifact op for any non-snapshot artefact produced by the same job (an embedding, a transcript) MUST set its source_job to the same job whose snapshot also references it, and SHOULD additionally include the snapshot's artifact_id in causal_deps.
  • For nodes that implement the application-layer conventions in the annex, a CreateEpisode op produced by inference MUST include the snapshot's artifact_id in its causal_deps. A CreateSuggestedAction op MUST set its derivation_job to the producing job and MUST additionally include the snapshot's artifact_id in causal_deps.

This is the chain that makes the "how did it know?" question mechanically answerable. Walking from any audited output to its snapshot, then from the snapshot to its retrieved_context, is the literal audit path.

A receiver MUST reject an audited output op whose link to a snapshot is missing or unresolvable on the local log. "Audited output" means an op authored by a node operating under audit-in-force conditions per Section 1, and belonging to a class for which this specification or a convention requires audit linking.
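
A sketch of that receive-side check, assuming the receiver tracks which snapshot artefacts are resolvable on its local log; names are illustrative:

```rust
use std::collections::HashSet;

type ArtifactId = String; // illustrative

/// Snapshot artefacts the receiver has resolved on its local log.
struct KnownSnapshots {
    resolvable: HashSet<ArtifactId>,
}

/// Receive-side rule: an audited output op must carry a snapshot link that
/// resolves locally; outside audit-in-force, no link is required.
fn check_audited_output(
    snapshots: &KnownSnapshots,
    audit_in_force: bool,
    linked_snapshot: Option<&ArtifactId>,
) -> Result<(), &'static str> {
    if !audit_in_force {
        return Ok(());
    }
    match linked_snapshot {
        Some(id) if snapshots.resolvable.contains(id) => Ok(()),
        Some(_) => Err("snapshot link does not resolve on the local log"),
        None => Err("audited output carries no snapshot link"),
    }
}
```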

4. Snapshot lifecycle

Snapshot artefacts inherit the substrate's generic artefact lifecycle (eviction, tombstone-cascade) from Operations §8.1–§8.2. A node MAY set ttl_ms on snapshots it emits; once the TTL elapses, the snapshot is eligible for eviction under storage pressure.

Eviction is irreversible: once evicted, the snapshot's content is gone and the audit chain is broken from that point forward for any output that depended on the evicted snapshot. The CreateArtifact op remains on the log, so the existence of the inference call is still recoverable; only the contents (retrieved context, prompt, output) are lost.

Implementations operating under the user's root delegation SHOULD retain snapshots for at least the lifetime of the records that link to them, treating eviction as a last resort under storage pressure. The user's own audit trail is among the most load-bearing data in the mesh; evicting it freely defeats the point.

Implementations operating under an audit_inference caveat SHOULD respect any retention window the delegating user has expressed in the delegation or in mesh-rules. v0.1 does not specify a wire-level retention-window field; the user communicates retention expectations through other means. Strengthening this is an open issue.

5. Relationship to the audit invariant

The normative consequence of this chapter is captured in Invariants §I-9. That invariant is the binding requirement; this chapter specifies the mechanism by which the invariant is satisfied.

A reader who wants the short version reads I-9. A reader who is implementing the audit layer reads this chapter.

6. What audit does not cover

The audit mechanism specified here is deliberately narrow. It covers:

  • Inference performed by a Likewise node, where "inference" means a call to a model that produces user-visible derived records.

It does not cover:

  • Inference performed off-protocol. A delegated party that receives a slice of the user's data and trains an internal model on it has not made a "Likewise inference call" for the purposes of this chapter, regardless of whether the user might wish they had. The protocol's lever for that scenario is the delegation's caveats — the user can refuse to delegate the data — not the audit invariant.

  • Statistical or aggregate inference. A retailer that receives many users' grocery rhythms and computes population statistics has not, under v0.1, performed a per-user inference call. Their internal pipeline is theirs to govern.

  • Auditing the model itself. The snapshot records the model identifier, but the protocol does not specify how to verify that the named model was actually the model that produced the output. Model attestation is a separate concern.

These exclusions are real and worth being explicit about. Audit is an important property the protocol provides; it is not a property the protocol can extend beyond its own boundaries.

Annex: Application Conventions

This annex describes conventions the reference implementation uses to surface the substrate to a user. The material here is non-normative. A v0.1 conformant node MAY implement these conventions, MAY substitute alternatives, or MAY omit them entirely.

The substrate (Part 1) and the inference pipeline (Part 2) are deliberately separable from the question of "how is this surfaced to a user." Different applications will make different choices about that question. The conventions in this annex are one set of choices that ship with the reference implementation; recording them here lets readers understand what the reference implementation is doing without conflating those choices with the protocol's load-bearing parts.

If you are implementing an interoperable node — for example, an organisation's node that synchronises a scoped slice of a user's graph — you can ignore this annex entirely. The substrate and the inference pipeline are sufficient to participate in a Likewise mesh. Applications that depend on the conventions in this annex will simply not find the records they expect, and that is allowed.

A.1 Episode operations

Episodes are temporally-bounded clusters of related evidence, entities, and claims. The reference implementation surfaces them to a user as narrative units: a trip, a project, a relationship arc, a meaningful day. They are not substrate primitives — nothing in the data model or sync protocol requires them.

A.1.1 CreateEpisode

Field                               | Purpose
episode_id                          | An EpisodeId.
title                               | Short title.
summary                             | Optional longer description.
temporal_start                      | The episode's start time.
temporal_end                        | Optional end time; absence indicates ongoing.
evidence_ids, claim_ids, entity_ids | Supporting records.
confidence                          | Episode-quality score.

Per-run inference provenance for an episode is carried by a likewise.inference.snapshot artefact emitted alongside the CreateEpisode op (when audit is in force per Part 2). Implementations correlate episode and snapshot via causal_deps.

A.1.2 UpdateEpisode

Field                           | Purpose
episode_id                      | Target episode.
title, summary, confidence      | Optional updates.
status                          | Optional transition (Active, Stale, Archived).
claim_ids_add, evidence_ids_add | Supporting records to add.

A.1.3 Episode FSM

State    | Meaning
Active   | The episode is current; surfaceable.
Stale    | A supporting record was invalidated; episode no longer reflects reality.
Archived | The user has set the episode aside.

Transitions:

From            | To        | Cause
(creation)      | Active    | CreateEpisode
Active          | Stale     | derivation cascade or UpdateEpisode { status: Stale }
Active or Stale | Archived  | UpdateEpisode { status: Archived }
Active or Stale | (deleted) | TombstoneEvidence cascading to all supporting evidence

A Reject user assertion targeting an episode transitions it to Stale and triggers the substrate's derivation cascade (see Invariants).

A Confirm user assertion targeting an episode MAY freeze it analogously to claim freezing; the convention does not specify this further for v0.1.

A.2 Suggested-action operations

Suggested actions are recommendations the system surfaces to a user — "send this message," "review this calendar," "reconsider this goal." They are pure UX: the system's outputs as visible to the user, in a refutable, lifecycle-tracked form.

A.2.1 CreateSuggestedAction

Field                                  | Purpose
action_id                              | An ActionId.
title, description                     | User-facing content.
action_type                            | Short identifier ("set_reminder", "create_album", "draft_email", ...).
source_episode                         | The episode that motivated the action.
supporting_claims, supporting_evidence | Provenance.
derivation_job                         | The job that produced the action. Required when audit is in force; suggested actions then trace to inference.
confidence                             | Action-quality score.

A.2.2 UpdateActionStatus

Field            | Purpose
action_id        | Target action.
new_status       | Proposed, Approved, Executing, Completed, Rejected, Dismissed, Failed, Expired.
execution_result | Optional details.

A.2.3 Action FSM

State     | Meaning
Proposed  | The system has surfaced the suggestion; the user has not yet acted.
Approved  | The user accepted the suggestion.
Executing | A handler is performing the action.
Completed | The action finished successfully.
Failed    | A handler reported failure.
Rejected  | The user explicitly rejected the suggestion.
Dismissed | The user dismissed the suggestion (without rejecting it; it MAY resurface).
Expired   | A time-window for relevance passed without user action.

Transitions:

From       | To        | Cause
(creation) | Proposed  | CreateSuggestedAction
Proposed   | Approved  | UpdateActionStatus(Approved) (user-authored)
Proposed   | Rejected  | UpdateActionStatus(Rejected) (user-authored)
Proposed   | Dismissed | UpdateActionStatus(Dismissed) (user-authored)
Proposed   | Expired   | UpdateActionStatus(Expired) (system-authored, when relevance window passes)
Approved   | Executing | UpdateActionStatus(Executing)
Executing  | Completed | UpdateActionStatus(Completed) with execution_result
Executing  | Failed    | UpdateActionStatus(Failed) with execution_result
Dismissed  | Proposed  | UpdateActionStatus(Proposed) (the system may resurface a dismissed action with new evidence)

A Reject user assertion on a suggested action SHOULD prevent the system from re-proposing the same action shape; the convention does not specify the exact mechanism for v0.1.

A.3 Salience projection

The salience projection is a ranking-for-display surface used by the reference implementation to decide which entities, episodes, and suggested actions to show the user now.

It is not a substrate primitive: a node that is not surfacing records to a user — for example, an organisation's node consuming a scoped slice of the graph — has no use for it. Implementations that do surface records to users will need some such projection; this section describes the shape the reference implementation adopted, which alternative implementations may use as a starting point.

A.3.1 Required queries

For an implementation choosing to support the convention:

  • Top-N by salience. Given a salience cap N and a time window, return the top N entities, episodes, or suggested actions ranked by a salience score.
  • Salience for an id. Given an entity, episode, or suggested action, return its current salience score.

A.3.2 Constraints

  • The salience projection SHOULD be in-memory (or fast enough that, from the user's perspective, it functionally is).
  • It SHOULD be small enough that an implementation can rebuild it from the log within seconds at the scale of a single user's data.
  • It MUST NOT be used as a UI store: queries over salience return rankings, not display payloads. Display payloads come from the detail projection (Projections, §2.3).

A.3.3 Score composition

The reference implementation composes salience as a weighted sum of components:

Component     | Weight | Meaning
Recency       | 0.20   | How recently the underlying evidence arrived.
Corroboration | 0.20   | How many independent claims support the record.
Upcoming      | 0.25   | Proximity to a user-visible time horizon (next event, deadline).
Open loops    | 0.25   | Whether the record represents an unresolved commitment.
Affinity      | 0.10   | A user-tunable weighting toward certain entity types.

These weights are not part of the convention. They are recorded here because they are the values the reference implementation ships with; implementations adopting a salience projection are free to choose their own.
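
A sketch of that composition with the shipped weights; the struct and field names are illustrative:

```rust
/// Per-record component scores, each normalised to the range [0, 1].
struct SalienceComponents {
    recency: f32,
    corroboration: f32,
    upcoming: f32,
    open_loops: f32,
    affinity: f32,
}

/// Weighted sum using the weights the reference implementation ships with.
/// The weights are not part of the convention; adopting implementations may
/// choose their own (and may expose the affinity weight as user-tunable).
fn salience(c: &SalienceComponents) -> f32 {
    0.20 * c.recency
        + 0.20 * c.corroboration
        + 0.25 * c.upcoming
        + 0.25 * c.open_loops
        + 0.10 * c.affinity
}
```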

A.4 Why these are conventions, not substrate

The protocol's substrate is sufficient to express the user's knowledge graph and synchronise it across nodes. The inference pipeline (Part 2) is sufficient to perform distributed model calls with auditable provenance. Together those two layers are what makes it possible for the user to own what the system says about them — the load-bearing claim of the protocol.

What episodes, suggested actions, and salience scores add is a particular shape of user-facing application: narratives, recommendations, and a ranking-for-display surface. Those shapes are useful and the reference implementation provides them, but they are not constitutive of the protocol. An implementation without them is still a Likewise implementation; it is just one that has chosen to surface the substrate differently.

The org-as-peer scenario is the cleanest example. A retailer running a Likewise node holds a scoped delegation to a user's grocery-rhythm claim. The retailer's node does not surface anything to the user — it consumes a slice of state to inform its own systems. Episodes, suggested actions, and salience scoring are nonsense in that context. The substrate plus the inference pipeline (if the retailer chooses to participate in distributed inference) are sufficient.

A.5 Compatibility expectations

If your implementation chooses to support these conventions, it SHOULD do so in a way that interoperates with other implementations that also support them. Specifically:

  • Episode and SuggestedAction op variants SHOULD use the field shapes documented above so the reference implementation can consume them.
  • The Episode and Action FSMs SHOULD follow the transitions documented above.
  • Salience score composition is implementation-defined; there is no compatibility requirement.

If your implementation chooses not to support a convention, ops of the unsupported types arriving on the wire from a supporting peer SHOULD be silently ignored at the projection layer. The substrate-level handling — signature verification, authority check, application to the op log — proceeds as for any other op; the implementation simply does not maintain the projection state that the unsupported convention defines.

Open Issues

This chapter catalogues known cross-implementation hazards in v0.1 of the specification. Each entry describes the issue, the concrete risk it presents, and the direction of the eventual resolution. Some of these will be addressed in a future minor version; some will require a major version. Where this chapter makes commitments to future versions, those commitments are non-normative.

The chapter exists for two reasons. First, hiding known hazards from implementers is a worse outcome than acknowledging them up front. Second, an open public list of known issues is how the specification gets corrected — it invites the discussion that produces v0.2 and v1.0.

If you discover a hazard not listed here, please open an issue against the specification repository.

OI-1. Wire format has no version tag

The postcard encoding of an operation does not carry an explicit format-version field. Schema evolution within a minor version is constrained to additions only, but a binary mismatch between two implementations using incompatible schema versions will fail at decode time without a graceful error.

Risk. A future schema migration that is not strictly additive — for example, removing or repurposing a field — will silently corrupt logs read by an implementation expecting the old shape.

Direction. The next major version is expected to introduce a leading version byte (or a varint) on every op envelope, allowing recipients to dispatch on schema version explicitly.

Workaround in v0.1. Implementations SHOULD include the specification version they implement in their bearer token metadata or in a discovery endpoint, so peers can refuse to sync with mismatched versions before the wire format issue manifests. The X-Likewise-Mesh-Rules-Hash header (see Sync) provides a partial signal.

OI-2. Causal frontier cursor is opaque

The since cursor passed to GET /ops is the base64url encoding of a postcard CausalFrontier value. This is opaque from the client's perspective beyond the empty-frontier special case, and there is no negotiation about its format.

Risk. An implementation that changes the underlying CausalFrontier representation in a non-additive way will break clients that hold persisted cursors from an earlier version.

Direction. Future versions are expected to specify a versioned cursor envelope or to standardise the CausalFrontier shape explicitly so changes are detectable.

Workaround in v0.1. Implementations MUST treat cursors as write-once-then-echo: a client sends back exactly what the server returned in X-Likewise-Next-Frontier. Implementations SHOULD discard cached cursors on protocol-version upgrades.

OI-3. Sanitised ops carry no signature, by design

A sanitised op has its signature cleared to None and is distinguished from corruption by a sanitisation marker (see Wire Format).

Risk. A receiver that does not implement marker checking will either reject all sanitised ops as corrupted (denying service to legitimate filtered traffic) or accept all unsigned ops as sanitised (admitting forged traffic). The marker check is the only thing distinguishing the two cases.

Status. This is a design decision, not a defect. The specification's contract is that sanitisation is intentional and the marker is verifiable against the sender's delegation chain.

Direction. A future revision MAY introduce a hash-chain mechanism that lets a receiver verify a sanitised op's provenance to its pre-sanitisation form, addressing the "delegated trust" concern at the cost of additional bytes on the wire. v0.1 does not include this.

OI-4. Mesh-rules drift has no negotiation

Two peers with different X-Likewise-Mesh-Rules-Hash values pause sync (per Sync). There is no automatic protocol for resolving the divergence.

Risk. A long-running mesh whose rules document has incrementally drifted on one node (typically because the operator updated it) will lock that node out of sync until the divergence is resolved manually.

Direction. A future revision is expected to define a mesh-rules-negotiation pre-handshake: peers exchange rule documents and either adopt the newer common version or explicitly refuse to interoperate. The exact mechanism is open.

Workaround in v0.1. Operators MUST manage rules versioning out of band (e.g., by deploying rule updates to all nodes in lockstep). Implementations SHOULD log mesh-rules-hash mismatches loudly enough to catch operator errors early.

OI-5. HLC skew tolerance is implicit

The protocol does not specify a maximum allowable wall-clock skew between a node and the operations it accepts (per Clocks). A node whose clock is far in the future can effectively rewrite the order of the mesh's history by emitting future-dated timestamps; receiving nodes will adopt the larger wall_ms on receive.

Risk. A compromised or malfunctioning node can dominate the HLC ordering for the rest of the mesh, distorting the meaning of "before" and "after" for as long as it does so.

Direction. A future revision is expected to specify a negotiated skew bound as part of the mesh-rules document: operations whose wall_ms exceeds the recipient's local time by more than the bound are rejected.

Workaround in v0.1. Implementations SHOULD warn on operations whose timestamp is more than one hour ahead of the local wall clock. They MAY refuse to accept such ops as a local policy choice, but doing so is not specified by v0.1 and may cause sync to lag.

OI-6. UCAN v0.10 is the wire format

v0.1 implementations carry UCAN v0.10 (JWT-shaped) tokens. The UCAN working group has moved to v1.0 (DAG-CBOR + Varsig + CIDv1 envelopes). v0.10 is no longer the upstream's preferred format.

Risk. Tooling and ecosystem support for v0.10 will atrophy over time. New external libraries will target v1.0 and inter-protocol interop (with other UCAN-using systems) will be harder.

Direction. The next major version is expected to migrate to UCAN v1.0. The migration is non-trivial: token canonical form, signature shape, and the proof-chain reference encoding all change. The migration MAY be staged (v0.10 and v1.0 co-existing during transition) or atomic; the working group will decide.

Workaround in v0.1. Implementations are stuck on v0.10. They SHOULD isolate the UCAN implementation behind a narrow interface so the migration is a contained change.

OI-7. No bulk-transfer mode for first sync

Catching up a long-disconnected node from genesis requires paginated GET /ops calls (per Sync). For a mesh with millions of ops this can be slow.

Risk. Onboarding a new node, or recovering a node that has been offline for an extended period, takes longer than it needs to.

Direction. A future minor version is expected to add a bulk-transfer mode (likely a streaming response with a specific Accept header on GET /ops) that ships a snapshot plus a delta from a known checkpoint.

Workaround in v0.1. Implementations MAY ship the underlying storage offline (USB drive, file copy) for first-time onboarding, then resume incremental sync. This is operator choice, not a protocol mechanism.

OI-8. No server-initiated push hints

The sync protocol is pull-based. A node learns of new operations only when it polls. There is no server-initiated push of "you have new operations to fetch."

Risk. Propagation latency is bounded below by the polling cadence, which is in tension with battery and bandwidth considerations on mobile nodes.

Direction. A future minor version is expected to add an optional WebSocket or webhook endpoint a server can use to hint a peer that fresh operations are available. Hints are advisory; the actual op exchange remains pull-based for authoritative correctness.

Workaround in v0.1. Implementations choose polling cadences that balance latency and resource use (typical defaults: 30 seconds on stable connections, 5 minutes on metered).

OI-9. No confidential sync

A peer can probe a node for the existence of operations it is not authorised to receive by sending crafted since cursors and observing the response shape. The capability filter prevents the operations themselves from being returned, but it does not prevent a peer from learning that the unreachable operations exist.

Risk. An attacker with read access to part of the log can infer the existence and approximate timing of operations they are not authorised to see.

Direction. A future revision is expected to adopt a confidential-sync mechanism (likely modelled on Willow Protocol's private-set-intersection-style approach), where peers cannot probe for unauthorised operations at all. This is a significant protocol redesign and is unlikely to land before a major version bump.

Workaround in v0.1. Implementations SHOULD NOT distinguish "unauthorised op" from "no op" in their response shape (return the filtered op set without any "filtered N" indicator). Implementations MUST NOT return per-op "you are not authorised" errors, which would themselves leak the existence of the filtered ops.

OI-10. Predicate vocabulary is not yet standardised externally

The set of claim predicates is centralised in this specification but is not yet structured for external extension. An application wishing to add a new predicate (for a new domain — health data, financial data, professional context) must either propose it for inclusion in the specification or hijack a generic predicate.

Risk. Without an external-extension mechanism, the predicate vocabulary either grows to encompass every imaginable domain (unwieldy) or fragments across non-standard predicate strings (non-interoperable).

Direction. A future minor version is expected to introduce namespaced predicate prefixes (e.g., org.cortex.location_at versus com.example.medical.diagnosed_with) and a registry mechanism for third-party namespaces.

Workaround in v0.1. Stay within the existing vocabulary where possible. For application-specific extensions, use the custom-metadata field on evidence and let consumers interpret it; do not author claims with non-vocabulary predicates.

How to propose changes

Each open issue here is a candidate for revision. To propose a direction, open an issue on the specification repository (see Contributing) and reference the OI number above. Substantive changes are expected to land in v0.2 (additive minor) or v1.0 (backwards-incompatible major) depending on scope.

Implementations

This page lists known implementations of Likewise and explains what conformance means.

Status

There is no public Likewise implementation at the time this specification was first published. The protocol was developed alongside an in-progress Rust implementation that the authors have been developing under the codename "Cortex". The codename is not a committed product name — the implementation may eventually ship as the baseline Likewise app itself, or under a different name entirely; that decision has not been made. Where this specification refers to "Cortex," read it as "the in-development reference implementation."

Cortex is currently in private development on macOS and iOS. It is not yet released, and this page makes no commitments about its release timing. When it does become public, this page will be updated with repository links, the final name, and conformance notes.

The text below describes the intended shape of the reference implementation and the intended behavioural-conformance suite. Both should be read as forward-looking; neither is currently available for download.

The reference implementation (codename Cortex)

The reference implementation is a Rust implementation of Likewise that runs on macOS and iOS as a small mesh of nodes communicating over HTTP; the user runs a node on each of their devices. It was the implementation against which this specification was written, so where the specification is silent or ambiguous, its intended behaviour is the strongest signal about what was meant. In practice this matters less than it would for a released implementation, because there is not yet a public build for an implementer to compare against.

Intended reference behavioural tests

When the reference implementation is published, it will ship seven end-to-end scenarios that exercise the wire surface against a real engine, real SQLite storage, and real HTTP loopback transport. The intent is that these scenarios constitute the reference suite for behavioural conformance:

  1. solo — single-node ingest, derivation, projection rebuild.
  2. warm-restart — node restart recovers state from the log alone.
  3. enrollment — the UCAN delegation handshake that admits a new node to a mesh.
  4. scoped-enrollment — the same handshake under caveat restrictions, including sanitisation rules and revocation.
  5. claim-lifecycle — claim FSM transitions, derivation DAG cascade on user assertion, and frozen-fact immunity.
  6. tool-use-agent-loop — non-inference job handlers chained with depends_on, inference-snapshot artefacts, and suggested-action approval, on a single node.
  7. mesh-agent-loop — the same loop distributed across three specialist nodes (phone, inference, tools) cooperating via RouteKind and cross-node depends_on.

A second implementation that passes equivalents of these seven scenarios — wired into its own engine and transport, against its own storage — is what "behaviourally conformant for v0.1" is intended to mean. The scenarios are not the spec; the spec is the spec. The scenarios are how we plan to operationalise it once the reference implementation is public.

Compatible implementations

There are no public implementations of any kind at the time of writing. When implementations exist, this page will list them. To submit one, see Contributing.

(Or — open an issue, paste a link to your implementation and a brief description of what it covers, and we will add it.)

What conformance means

The specification distinguishes four levels of conformance:

Level 1 — wire-format conformance. The implementation can read and write operations that an existing v0.1 implementation will accept and apply correctly. It honours the postcard encoding, the canonical signing rules, and the HTTP sync endpoint shape.

Level 2 — semantic conformance. In addition to Level 1, the implementation respects the projection contract — it answers queries about an op log identically (modulo intentional optimisations) to the reference implementation, given the same op log as input.

Level 3 — capability conformance. In addition to Level 2, the implementation honours UCAN delegations and caveats correctly — including sanitisation, transitive revocation, and the attenuation-only re-delegation rule.

Level 4 — full behavioural conformance. In addition to Level 3, the implementation passes equivalents of the seven reference scenarios listed above.

An implementation may claim a level publicly. We strongly recommend stating the claimed level explicitly, together with the test artefacts that demonstrate it, so users can assess trustworthiness without reading the source.

Compatibility expectations across versions

The specification is versioned (see Conventions for the current version). Two implementations on the same major version SHOULD interoperate without negotiation. Two implementations on different major versions MAY refuse to interoperate; the X-Likewise-Mesh-Rules-Hash header on the sync endpoint is the v0.1 mechanism by which a mismatched pair detects this and pauses sync rather than corrupting each other.

A future revision will clarify the negotiation protocol for mesh-rules drift; this is tracked as an open issue.

Implementation notes for new ports

A handful of practical observations from building the reference implementation that may save another implementer time:

  • The HLC tick discipline is the single most common source of divergence bugs. Treat it as load-bearing from day one. See Clocks.
  • The signature canonicalisation rule (clear the signature field on the op, encode, then sign and put the signature back) is easy to get subtly wrong. The detached-JWS output is what crosses the wire; the in-storage representation contains the signature.
  • The projection split exists because collapsing it into one fat state object produces a system that is too slow for ranking, too lossy for UI, and too memory-hungry for inference contexts. Implementers porting from a single-store substrate should resist the urge to fold them.
  • Sanitisation clears signatures intentionally; an implementation that treats signature absence as corruption will reject legitimately filtered ops. Distinguish the two cases up front.
  • Job and lease ops use the HLC for lease expiry, not a wall clock. Implementations that read the wall clock to decide whether a lease is expired will misbehave when nodes have skewed clocks.

Calling the project

The protocol is "Likewise." When citing it, please use that name and a link to this specification.

The reference implementation is currently working under the codename Cortex. The codename is provisional. Its eventual public name is not fixed — it may ship as the baseline Likewise app itself, or under another name. Treat any "Cortex" references in this specification as shorthand for "the in-development reference implementation"; if and when the implementation is released under its final name, this page will be updated.

What is committed: the protocol is Likewise, the standard is this document, and the implementation — whatever its final name — is one realisation of it.