·72 min read·by Super Admin

Software Architecture Has No Rigor — and AI Feeds on It

Every serious engineering field can verify a design before it's built. Software architecture can't — and an AI generating against an unverifiable spec will hallucinate, duplicate, and reorder with impunity. Here's the method I use at Guliel: model software as what it actually is — data, transformations, and a place to run — into a category you can check.

Every other engineering field can check its work

Design a circuit and you can interrogate it. Kirchhoff's laws either hold or they don't; the design is a formal object and "correct" is computed, before a single component is soldered. Structural engineering has its load calculations. Network design has its capacity and routing proofs. Each of those fields has a notion of a design being verifiably wrong — a question you can ask the design itself, on paper, and get a real answer.

Software architecture has nothing of the kind.

I spent years feeling this before I could name it. Every time I designed with boxes and arrows — UML, an architecture diagram, a PRD full of rectangles — something felt off. Not that the diagrams were ugly. It was that my actual reasoning about the system — this composes with that, this would be redundant, this boundary is going to leak — none of it fit on the diagram. The diagram could record a conclusion but not the deduction that produced it. And a picture that holds conclusions but not reasoning can never tell you a conclusion is wrong.

So I went looking. I read the academic software-architecture literature hoping for something rigorous, and found the opposite: dozens of definitions of "architecture," "component," "connector," each subtly different, most unusable, and across all of them no deduction rules — nothing that lets you start from a design and derive a property. You were left to draw, ship, and discover what should have been refactored only once the code existed and the mistake had teeth.

Then I started modelling systems as categories, and the missing thing appeared. A category is not a picture of a system; it is a system you can compute with. Its diagrams have laws. Some paths through them must be equal — they must commute — and that is a fact you can check. An architecture stops being a sketch you defend in a meeting and becomes an object you can be wrong about, on paper, before the code exists.

And there is a reason this stopped being a craftsman's preference and became urgent. We now hand these specs to AI. An unverifiable spec was a slow, quiet tax when humans wrote all the code. Handed to a machine that generates against it at scale, it is something else entirely. I'll come back to that — it is the sharpest argument for everything here.

How I actually design — from the outside in

First, the workflow — because the parts only make sense in the order you'd actually meet them.

I design from the outside in. Not from the database. From the human.

  1. Requirements. What does the user need to accomplish?

  2. Interaction surface. How will they touch it — a screen, a phone, an API call, an email? This is a physical decision, and it comes early, because it constrains everything after it.

  3. Result data. Given that interaction, what data structure makes it feel effortless? This is the shape the screen wants — designed before, and independently of, how that data is produced.

  4. Source data + transformations. Now: what do I need to store, and what algorithms turn the stored shape into the result shape?

  5. Locations. Where does each of those things physically live and run — which server, which thread, the browser, the database?

  6. Components. Finally, bundle all of that into modules a team can own.

Step 3 — the result data — comes before step 4, the source data, and that is deliberate. The shape the interface wants is pulled from the interaction; the source data and the code are pushed to produce it. Designing the database first and hoping a nice screen falls out is how you get screens that fight their own data.
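The pull-then-push ordering of steps 3 and 4 can be sketched in types. This is only an illustration under assumptions: `InvoiceSummaryView`, `StoredInvoice` and `toSummaryView` are hypothetical names invented here, not part of any real schema. The result shape is written first, and the stored shape is whatever is needed to produce it.

```typescript
// Step 3: the result shape, designed from the interaction (hypothetical screen).
type InvoiceSummaryView = {
  title: string;       // what the list row shows
  totalLabel: string;  // preformatted, so the screen does no arithmetic
  overdue: boolean;    // drives a badge
};

// Step 4: the source shape, chosen afterwards so it can produce the result.
type StoredInvoice = {
  number: number;
  customerName: string;
  totalCents: number;
  dueOn: string;       // ISO date, e.g. "2024-03-01"
};

// The transformation that pushes stored data into the shape the screen pulls.
function toSummaryView(inv: StoredInvoice, today: string): InvoiceSummaryView {
  return {
    title: "Invoice #" + inv.number + " for " + inv.customerName,
    totalLabel: (inv.totalCents / 100).toFixed(2),
    overdue: inv.dueOn < today, // ISO dates order correctly as strings
  };
}

const summaryRow = toSummaryView(
  { number: 42, customerName: "Acme", totalCents: 129900, dueOn: "2024-03-01" },
  "2024-03-15"
);
```

The screen component would consume `InvoiceSummaryView` only; changing the storage shape later touches `toSummaryView` and nothing on the interface side.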

That workflow kept producing good systems, so eventually I asked the obvious question: what is it actually made of? Strip the process away, and underneath there are only four things.

The four atoms

A running program does exactly two things, and only two. It holds and transforms data, and it moves that data between physical places. Four atomic parts, in two pairs — and each one is a category: a set of objects together with the structure-preserving maps, the morphisms, between them. That is what will make the whole design checkable. The precise version is one section down; the names first.

The logical pair — what the software means:

  • Data (Dat) — its objects are the data types; its morphisms are the stored, structural relations between them — a field, a foreign key.

  • Transformations (Trn) — its objects are the algorithms that rewrite data. Pure meaning; no notion of where.

The physical pair — what the software occupies:

  • Locations (Loc) — the physical sites where code runs and data rests: a thread, a core, a region of RAM, a disk, a network card.

  • Transmissions (Trm) — moving one piece of data across a boundary, from one location to another.

The logical pair is the software in a vacuum. The physical pair is what drags it into the real world — because software in a vacuum manifests nothing. No screen, no input, no interface. An interface exists only because data is transmitted to a physical boundary a human can see or touch. That is not a detail; it is half of what an architecture is — and the half that boxes-and-arrows diagrams quietly omit.

graph LR
    subgraph Logical["Logical — what it means"]
        Data["Dat — data"]
        Transformations["Trn — transformations"]
    end
    subgraph Physical["Physical — what it occupies"]
        Locations["Loc — locations"]
        Transmissions["Trm — transmissions"]
    end
    Transformations -->|"in"| Data
    Transformations -->|"out"| Data
    Transmissions -->|"from"| Locations
    Transmissions -->|"to"| Locations
    Transmissions -.->|"carry"| Data

    style Data fill:#4f8cf7,color:#fff
    style Transformations fill:#7fc47f,color:#000
    style Locations fill:#f77f7f,color:#fff
    style Transmissions fill:#7fc4c4,color:#000

The categorical machinery

Those four names — Dat, Trn, Loc, Trm — plus the two that complete the method, placement and component, are not metaphors. Each is a real category; the relations among them are real functors. Pinning that down now, before the examples, is deliberate: it is what lets the rest of this article say "morphism," "commute," or "this reduces to that" and mean something you can check — not architecture jargon dressed in Greek letters.

You do not strictly need the formalism to follow the method. But this is written for engineers, and the whole point is rigor, so it belongs here, in front of the examples — not in an appendix. Expand it now, or read the article through and expand it when a term first bites. It opens with the definitions to keep in hand — the notation the whole style gallery below runs on — and ends with a worked reduction you can copy.

Deep dive: the categorical machinery

The loose words in the main post — relation, compose, deduced, can be wrong — each have an exact counterpart. The exact language is category theory, and it is worth the page because it turns "good architecture" from taste into something you can check. This section is the reference: read it once, and every style in the gallery below reads in the same notation.

Definitions to keep in mind

Six terms. The gallery uses nothing else — keep them in hand and skip back here when one bites.

  • Category — a set of objects and arrows (morphisms) between them. Arrows compose: if f : A → B and g : B → C then g ∘ f : A → C exists; composition is associative; every object has an identity arrow. That is the entire definition. Anything shaped like "things, and structure-preserving ways to get from one to another" is a category.

  • Morphism signature (f : A → B): always write the source and the target; the signature is half the content.

  • Partial morphism (f? : A → B): defined on only some of A. It is how optionality (a nullable field, a value present only in one case) is written.

  • Commuting diagram — a diagram commutes when any two paths with the same start and end are equal as morphisms: g ∘ f = h. Commuting is the property you check — a picture cannot fail to commute, a category can.

  • Functor — a structure-preserving map between categories: it sends objects to objects and arrows to arrows, and preserves composition and identities.

  • Free monoid A* — the finite sequences of A under concatenation: a log, a history, an append-only stream.

One more, used only for cross-cutting concerns: a natural transformation is a uniform family of morphisms relating two parallel functors — one arrow per object, every square commuting. "The same wrapper, applied everywhere."
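Composition and commuting can be made concrete with plain functions. The sketch below is an assumption-laden illustration: `compose` and `commutesOn` are helper names invented here, and the cents-to-label functions are hypothetical. Checking on a finite sample is evidence rather than proof, but a single failing input does prove the diagram fails to commute.

```typescript
// A morphism f : A → B, modelled as a plain function.
type Morphism<A, B> = (a: A) => B;

// Composition g ∘ f : A → C, which exists whenever f and g line up.
function compose<A, B, C>(g: Morphism<B, C>, f: Morphism<A, B>): Morphism<A, C> {
  return (a) => g(f(a));
}

// Checks that two paths with the same start and end agree on every sample.
function commutesOn<A, C>(
  samples: A[],
  path1: Morphism<A, C>,
  path2: Morphism<A, C>,
  equal: (x: C, y: C) => boolean
): boolean {
  return samples.every((a) => equal(path1(a), path2(a)));
}

// Hypothetical triangle: cents → euros → label versus a direct cents → label.
const toEuros: Morphism<number, number> = (cents) => cents / 100;
const toLabel: Morphism<number, string> = (euros) => euros.toFixed(2);
const directLabel: Morphism<number, string> = (cents) => (cents / 100).toFixed(2);
const brokenLabel: Morphism<number, string> = (cents) => String(cents);

const sameString = (x: string, y: string) => x === y;
const sampleCents = [0, 199, 129900];

const triangleCommutes = commutesOn(sampleCents, compose(toLabel, toEuros), directLabel, sameString);
const brokenCommutes = commutesOn(sampleCents, compose(toLabel, toEuros), brokenLabel, sameString);
```

Here `triangleCommutes` holds and `brokenCommutes` does not, which is the kind of yes-or-no answer a drawn diagram cannot give.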

The four atoms are four categories

A running system holds and transforms data, and moves it between places. Those are four categories.

  • Dat — objects are data types (entities, plus primitives such as 𝕊, 𝔹, Date); morphisms are the stored, structural relations between them — a field, a foreign key. Rows are products ×, tagged unions are sums (coproducts), a history is a free monoid A*.

  • Trn — a transformation (an algorithm). Here is the first sharp move: a transformation is an object, not an arrow — because the next step places it, and you place objects, not arrows. Each carries two projection morphisms into Dat:

    | Morphism | Signature | Semantics |
    |---|---|---|
    | t_from | Trn → Dat | the input type the transformation consumes |
    | t_to | Trn → Dat | the output type it produces |

    The algorithms still compose — that structure is recovered as the free category on these objects — but now each is a thing you can point at. Effectful transformations carry an explicit marker.

  • Loc — objects are physical execution sites: a thread, a core, a RAM region, a disk, a NIC, and composites of them (Browser, RouteHandler, Postgres). Morphisms are adjacency — "can hand off directly to."

  • Trm — a transmission (one datum crossing a location boundary). Like a transformation it is an object, with three projections:

    | Morphism | Signature | Semantics |
    |---|---|---|
    | c_from | Trm → Loc | source location |
    | c_to | Trm → Loc | target location |
    | carries | Trm → Dat | the datum on the wire — there is no untyped transmission |

Dat and Trn are the logical pair — what the software means. Loc and Trm are the physical pair — what it occupies. An architecture is the application of the logical onto the physical.
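One way to hold these definitions in code is to treat each transformation and transmission as a record whose fields are its projections. This is a minimal sketch, assuming objects of Dat and Loc can be identified by name; `renderInvoice`, `sendHtml`, `composable` and `canShipOutput` are hypothetical names used only for illustration.

```typescript
// Objects of Dat and Loc are identified by name in this sketch.
type DatId = string;
type LocId = string;

// A transformation is an object with two projections into Dat.
type Trn = { id: string; t_from: DatId; t_to: DatId };

// A transmission is an object with three projections.
type Trm = { id: string; c_from: LocId; c_to: LocId; carries: DatId };

// Hypothetical instances, not taken from a real system.
const renderInvoice: Trn = { id: "renderInvoice", t_from: "Invoice", t_to: "InvoiceHtml" };
const sendHtml: Trm = { id: "sendHtml", c_from: "RouteHandler", c_to: "Browser", carries: "InvoiceHtml" };

// Two transformations line up for composition when output type meets input type.
function composable(first: Trn, second: Trn): boolean {
  return first.t_to === second.t_from;
}

// A transmission can deliver a transformation's output if it carries that type.
function canShipOutput(t: Trn, m: Trm): boolean {
  return m.carries === t.t_to;
}
```

Because a `Trm` cannot be constructed without a `carries` value, the "no untyped transmission" rule is enforced by the type itself in this encoding.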

Placement is a span, not a function

That application is the key correction. When you first formalise it, you write "every transformation has a location" — a function runsAt : Trn → Loc. It is false. A validation runs in the Browser and on the RouteHandler. A render runs server-side and client-side. One transformation, many locations.

So "where does T run" is not a function — it is a relation. The fix is to reify the pairing: a placement is its own object, with projection morphisms.

| Placement | Projections | Meaning |
|---|---|---|
| TrnLoc | tl_trn : TrnLoc → Trn · tl_loc : TrnLoc → Loc · tl_cmp : TrnLoc → Cmp | a transformation deployed at a location, inside a component |
| DataLoc | dl_data : DataLoc → Dat · dl_loc : DataLoc → Loc · dl_cmp : DataLoc → Cmp | a datum materialised at a location, inside a component |
| TrmCmp | ts_trm : TrmCmp → Trm · ts_cmp : TrmCmp → Cmp | a transmission used by a component |

Each is a span — an apex object with arrows out to the things it relates. Because a placement is its own object, one transformation can be the target of many TrnLoc projections — one per place it runs.
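As a rough sketch of why reifying placements helps, the snippet below stores placements as records and computes "where does T run" as a query rather than a stored field. The sample data and the helper names `fibreOver` and `runsAt` are assumptions made for illustration; "fibre" here means the set of placements whose `tl_trn` projection equals a given transformation.

```typescript
// A placement record: one row per (transformation, location, component).
type TrnLoc = { tl_trn: string; tl_loc: string; tl_cmp: string };

// Hypothetical placements: the same validation is placed twice.
const trnLocs: TrnLoc[] = [
  { tl_trn: "validateOrder", tl_loc: "Browser", tl_cmp: "Checkout" },
  { tl_trn: "validateOrder", tl_loc: "RouteHandler", tl_cmp: "Checkout" },
  { tl_trn: "renderPage", tl_loc: "RouteHandler", tl_cmp: "Storefront" },
];

// The fibre over T: every placement whose tl_trn projection is T.
function fibreOver(all: TrnLoc[], trn: string): TrnLoc[] {
  return all.filter((p) => p.tl_trn === trn);
}

// "Where does T run" is the image of that fibre under tl_loc.
function runsAt(all: TrnLoc[], trn: string): string[] {
  return fibreOver(all, trn).map((p) => p.tl_loc);
}
```

`runsAt(trnLocs, "validateOrder")` yields two locations and `runsAt(trnLocs, "unused")` yields none, which a single-valued function could not express.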

graph TB
    TrnLoc["TrnLoc<br/>(a placement)"]
    DataLoc["DataLoc<br/>(a placement)"]
    Trn["Trn"]
    Dat["Dat"]
    Loc["Loc"]
    Cmp["Component"]

    TrnLoc -->|"tl_trn"| Trn
    TrnLoc -->|"tl_loc"| Loc
    TrnLoc -->|"tl_cmp"| Cmp
    DataLoc -->|"dl_data"| Dat
    DataLoc -->|"dl_loc"| Loc
    DataLoc -->|"dl_cmp"| Cmp

    style TrnLoc fill:#f7c04f,color:#000
    style DataLoc fill:#f7c04f,color:#000
    style Trn fill:#7fc47f,color:#000
    style Dat fill:#4f8cf7,color:#fff
    style Loc fill:#f77f7f,color:#fff
    style Cmp fill:#cf7fcf,color:#fff

"Where does T run" is then the fibre { tl_loc(tl) : tl_trn(tl) = T } — the set of placements projecting to T. It may be empty (an unused transformation), a singleton, or larger. runsAt was never a function.

Components compose

A component (Cmp) is a cohesive bundle of placements sharing one Cmp value. What it owns is deduced — never stored twice — as the fibres of the projections: data(c) is the DataLocs with dl_cmp = c, behaviour(c) the TrnLocs with tl_cmp = c, locs(c) the locations they occupy.

Components compose. Two of them glue along a shared transmission — a port of one and a port of the other naming the same Trm — and in the composite C = A ⋈ B that transmission becomes internal. Gluing is associative and has an identity (the empty pass-through component), so components form a category Comp: objects are interfaces, morphisms are components, composition is port-matching. This is the formal content of "a system is made of services" — a composed service's data, behaviour and locations are computed from its parts, not redrawn by hand.
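Gluing can be sketched with a deliberately simplified component: just lists of names for exposed ports, internal transmissions and placed transformations. The `Cmp` shape, the `glue` function and the `api`/`store` examples are assumptions for illustration only; a fuller model would carry the placement records themselves.

```typescript
// A simplified component: names only.
type Cmp = {
  ports: string[];     // transmissions still exposed at the boundary
  internal: string[];  // transmissions hidden inside the component
  trns: string[];      // behaviour: transformations placed in it
};

// Glue two components: a transmission named by a port on both sides becomes internal.
function glue(a: Cmp, b: Cmp): Cmp {
  const shared = a.ports.filter((p) => b.ports.indexOf(p) !== -1);
  const open = a.ports.concat(b.ports).filter((p) => shared.indexOf(p) === -1);
  return {
    ports: open,
    internal: a.internal.concat(b.internal, shared),
    trns: a.trns.concat(b.trns),
  };
}

// The identity for gluing in this sketch: a component with nothing in it.
const emptyCmp: Cmp = { ports: [], internal: [], trns: [] };

const api: Cmp = { ports: ["httpRequest", "orderQuery"], internal: [], trns: ["validateOrder"] };
const store: Cmp = { ports: ["orderQuery"], internal: [], trns: ["loadOrder"] };
const backend = glue(api, store);
```

After gluing, `orderQuery` is no longer a port of `backend`; it has moved to `internal`, and the composite's behaviour is the concatenation of its parts rather than something drawn again by hand.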

Coherence laws — where "wrong" lives

Because every relationship is a structure-preserving map, "the architecture is well-formed" becomes a handful of equations. The load-bearing one:

Law 1 — Placement honesty. For every placement of a transformation T at a location L, each input t_from(T) is either materialised at L (a DataLoc over that type at L) or delivered to L (a Trm with carries = t_from(T) and c_to = L). No transformation reads data that is not present at its location.

Read backwards, it is a defect detector: a transformation reading data no transmission carries to its location is a provable hole, not a style choice. The others are the same idea — every transmission is typed and crosses a real boundary (c_from ≠ c_to); a cross-location dependency is mediated by a Trm, never a direct reach; a composed component's parts must not contradict each other. An architecture is well-formed exactly when the maps between its categories commute. A failed law names the missing piece — and that is the whole reason to bother.
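Law 1 is mechanical enough to sketch as a checker. The record shapes below (`PlacedTrn`, `PlacedData`, `Wire`) flatten the projections into fields and are assumptions of this sketch, as are the sample placements; the function returns every placement whose input is neither materialised at its location nor delivered there.

```typescript
type PlacedTrn = { trn: string; t_from: string; loc: string };  // a TrnLoc with its input type resolved
type PlacedData = { data: string; loc: string };                 // a DataLoc
type Wire = { carries: string; c_from: string; c_to: string };   // a Trm

// Law 1: each placed transformation's input is materialised at, or delivered to, its location.
function placementHoles(trns: PlacedTrn[], data: PlacedData[], wires: Wire[]): PlacedTrn[] {
  return trns.filter((p) => {
    const materialised = data.some((d) => d.data === p.t_from && d.loc === p.loc);
    const delivered = wires.some((w) => w.carries === p.t_from && w.c_to === p.loc);
    return !materialised && !delivered;
  });
}

// Hypothetical architecture with one deliberate hole: nothing brings PriceList to the Browser.
const holes = placementHoles(
  [
    { trn: "loadOrder", t_from: "OrderRow", loc: "Postgres" },
    { trn: "renderOrder", t_from: "Order", loc: "Browser" },
    { trn: "priceOrder", t_from: "PriceList", loc: "Browser" },
  ],
  [{ data: "OrderRow", loc: "Postgres" }],
  [{ carries: "Order", c_from: "RouteHandler", c_to: "Browser" }]
);
```

The result names the offending placement (`priceOrder` at `Browser`), and the repair is equally concrete: add a transmission carrying `PriceList` to the `Browser`, or place the transformation where the data already is.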

How a diagram deduces — a worked reduction

This is the move the method exists for, in the formal language. The article runs it on Invoice and Expense; the five steps never change, and apply to any two objects you suspect are one.

Step 1 — Write the naive model down, rigorously. Start where intuition starts — Invoice and Expense, two objects of Dat — and write every morphism out of each, with its target:

| Invoice morphism | target | Expense morphism | target |
|---|---|---|---|
| issuer | Organization | payer | Organization |
| customer | Organization | supplier | Organization |
| total | Money | total | Money |
| lines | LineItem* | lines | LineItem* |
| issuedAt | Date | incurredAt | Date |
| | | categorize | Category |

The instant you write the targets, something the noun never showed surfaces: both objects' morphisms land in the same objects — Organization, Money, Date, LineItem. They share a centre.

Step 2 — Map one category onto the other. Propose a functor F : Expense-cat → Invoice-cat. On objects it is forced — Expense ↦ Invoice, every shared object to itself. On morphisms, pair by target: supplier ↦ customer, payer ↦ issuer, total ↦ total, lines ↦ lines, incurredAt ↦ issuedAt.

Step 3 — Check the squares commute. A mapping means nothing unless it respects structure. Take total: the square commutes exactly when total ∘ F = identity ∘ total — and since F and identity are identities on the targets, that collapses to one question: do the two totals compute the same function? They do. Walk Organization: supplier ↦ customer (both "the other party"), payer ↦ issuer (both "the owning tenant"). Every shared-structure square commutes.

Step 4 — Read the verdict. Every square commuted, and F is a bijection on objects. A functor that is bijective on objects and commutes everywhere is the identity in disguise. There was only ever one category. Invoice and Expense are the same object of Dat — a proof, not a preference.

Step 5 — What did not commute is the other half of the answer. categorize : Expense → Category had no partner; the direction of the money is a genuine difference. The morphisms that fail to commute are the real distinctions — and the deduction hands them to you precisely. They are not erased, they are segregated: kept as partial morphisms on the unified object, selected by a discriminator.

graph LR
    Doc["Document"]
    Org["Organization"]
    Money["Money"]
    DateO["Date"]
    Dir["{ INCOMING, OUTGOING }"]
    Cat["Category"]
    Doc -->|"counterparty"| Org
    Doc -->|"owner"| Org
    Doc -->|"total"| Money
    Doc -->|"date"| DateO
    Doc -->|"direction"| Dir
    Doc -.->|"category? · partial"| Cat
    style Doc fill:#4f8cf7,color:#fff
    style Org fill:#f77f7f,color:#fff
    style Money fill:#f7c04f,color:#000
    style DateO fill:#f7c04f,color:#000
    style Dir fill:#cf7fcf,color:#fff
    style Cat fill:#cf7fcf,color:#fff

So a reduction is two moves at once — consolidate every morphism whose square commuted onto one object, segregate every morphism that did not as a partial morphism. The output — one documents table, a direction column, category nullable, the expenses table dropped — is read straight off the diagram. The model you can verify is the model you build from.
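The bookkeeping part of the five steps can be sketched in code. In the snippet below the morphism tables come from Step 1 and the pairing from Step 2; the function name `reduceObjects` and the table encoding are assumptions of this sketch. It checks only that paired morphisms share a target, which is a necessary condition; whether two paired morphisms mean the same thing (Step 3) remains a judgement the code does not make.

```typescript
type MorphismTable = { [name: string]: string }; // morphism name → target object

const invoiceMorphisms: MorphismTable = {
  issuer: "Organization", customer: "Organization",
  total: "Money", lines: "LineItem*", issuedAt: "Date",
};
const expenseMorphisms: MorphismTable = {
  payer: "Organization", supplier: "Organization",
  total: "Money", lines: "LineItem*", incurredAt: "Date", categorize: "Category",
};

// Step 2's pairing, written by hand: Expense morphism → Invoice morphism.
const pairing: { [expenseName: string]: string } = {
  supplier: "customer", payer: "issuer", total: "total", lines: "lines", incurredAt: "issuedAt",
};

// Consolidate morphisms whose partner exists with the same target; segregate the rest.
function reduceObjects(src: MorphismTable, dst: MorphismTable, pairs: { [k: string]: string }) {
  const consolidated: string[] = [];
  const segregated: string[] = [];
  Object.keys(src).forEach((name) => {
    const partner = pairs[name];
    const targetsAgree = partner !== undefined && dst[partner] === src[name];
    (targetsAgree ? consolidated : segregated).push(name);
  });
  return { consolidated, segregated };
}

const verdict = reduceObjects(expenseMorphisms, invoiceMorphisms, pairing);
```

`verdict.segregated` contains only `categorize`, matching the partial morphism left on the unified object; everything else lands in `verdict.consolidated`.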

One dividend: strategies

Two TrnLocs over the same Trn — possibly different code, same t_from → t_to job — are parallel arrows fitting one slot. That is the exact shape of a strategy: interchangeable implementations behind one interface. Country tax handlers, swappable PDF templates, a live data provider versus a mock — all the same structure. Adding a strategy is adjoining one parallel arrow, never editing the core. The method does not just describe a system; it names the axis along which it extends without being rewritten.
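A strategy in this sense is several functions sharing one signature, selected by key. The sketch below is hypothetical throughout: the country codes `AA`, `BB`, `CC` are placeholders, the rates are made up for illustration, and `taxFor` is an invented helper.

```typescript
// One slot: the same t_from → t_to job (net amount in cents → tax in cents).
type TaxHandler = (netCents: number) => number;

// Parallel arrows filling that slot. Codes and rates are illustrative placeholders only.
const taxHandlers: { [country: string]: TaxHandler } = {
  AA: (net) => Math.round(net * 0.2),
  BB: (net) => Math.round(net * 0.1),
};

// The core depends on the slot, never on a particular handler.
function taxFor(country: string, netCents: number): number {
  const handler = taxHandlers[country];
  if (handler === undefined) throw new Error("no tax handler for " + country);
  return handler(netCents);
}

// Adding a strategy adjoins one parallel arrow; taxFor is not edited.
taxHandlers["CC"] = () => 0;
```

The same shape would cover swappable PDF templates or a live provider versus a mock: the type alias is the slot, and each entry is one implementation of it.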

Software isn't objects — it's data, transformations, and a place to run

We have the four atoms. So here is the question that decides whether you can actually use them: why does almost everyone model software with something else entirely — objects?

Object-oriented programming taught us to model the world as objects: an Invoice is a class, an Expense is a class. It feels natural because the words are nouns. But a class is not one of the atoms — it is a premature bundle of several. A class fuses a Dat object (its fields) with the Trns that act on it (its methods), and silently pins them to one Loc — one heap, one process — and it makes you commit to that bundle before you have done any analysis. The CPU never sees the bundle. It sees Dat, Trn, Loc separately. The object is a story narrated on top — and the story has a cost.

// The object-oriented instinct: every noun gets a class.
class Invoice {
  lineItems: LineItem[];
  customer:  Customer;       // who we billed
  issuer:    Organization;   // ...us
  total(): Money { /* sum line items, apply tax */ }
  send(): void { /* ... */ }
}

class Expense {
  lineItems: LineItem[];
  supplier:  Supplier;       // who billed us
  payer:     Organization;   // ...us, again
  total(): Money { /* sum line items, apply tax — a second time */ }
  categorize(): void { /* ... */ }
}

// ...and, written by someone else on another day, two more nouns:
class Customer { name: string; taxId: string; address: Address; }
class Supplier { name: string; taxId: string; address: Address; }

Four classes — and the rot is already visible: total() written twice, and a Customer and a Supplier that are line-for-line identical. This is not a contrived example. It is what any codebase touched by many hands — or by an AI — drifts into, because nothing makes the redundancy checkable.

Map it onto the atoms and the checking begins. All four — Invoice, Expense, Customer, Supplier — are objects of Dat. Their fields (customer, issuer, name, taxId, …) are Dat-morphisms, the structural maps out of each object. total is not a Dat-morphism — it is an algorithm, an object of Trn. OOP fused the Dat object and its Trns into each class, and that glue is exactly what hid the structure.

Now stop describing the model and compute with it. Write it down the way a category demands — every object of Dat, every morphism, source and target named. Five objects, six morphisms:

graph LR
    Customer["Customer"]
    Supplier["Supplier"]
    Org["Organization"]
    Invoice["Invoice"]
    Expense["Expense"]
    Invoice -->|"i_by"| Org
    Invoice -->|"i_for"| Customer
    Expense -->|"e_by"| Supplier
    Expense -->|"e_for"| Org
    Customer -->|"c_is"| Org
    Supplier -->|"s_is"| Org
    style Customer fill:#4f8cf7,color:#fff
    style Supplier fill:#4f8cf7,color:#fff
    style Invoice fill:#4f8cf7,color:#fff
    style Expense fill:#4f8cf7,color:#fff
    style Org fill:#f77f7f,color:#fff

Read it straight off: an Invoice is issued by an Organization (i_by) for a Customer (i_for); an Expense is billed by a Supplier (e_by) for an Organization (e_for). Customer and Supplier each carry a single morphism — into Organization. Three rules now take the diagram the rest of the way. Nothing else.

Rule 1 — composition. A category is closed under composition: given f : A → B and g : B → C, the morphism g ∘ f : A → C exists — you do not get to choose. And Customer and Supplier carry the same morphism set as Organization (name, taxId, address, into the same primitives), so each is isomorphic to it; c_is and s_is are the witnessing isomorphisms. Compose straight through them:

graph LR
    Invoice["Invoice"]
    Customer["Customer ≅ Org"]
    Org["Organization"]
    Invoice -->|"i_for"| Customer
    Customer -->|"c_is"| Org
    Invoice -.->|"i_for′ = c_is ∘ i_for"| Org
    style Invoice fill:#4f8cf7,color:#fff
    style Customer fill:#9a9a9a,color:#fff
    style Org fill:#f77f7f,color:#fff

i_for′ = c_is ∘ i_for : Invoice → Org, and likewise e_by′ = s_is ∘ e_by : Expense → Org. The composite is not new data — it is forced to exist. And once it does, Customer and Supplier are isomorphic copies of Organization that nothing else reaches: drop them. Three objects remain, and Invoice and Expense now have the same shape — two morphisms, both into Organization:

graph LR
    Invoice["Invoice"]
    Expense["Expense"]
    Org["Organization"]
    Invoice -->|"i_by"| Org
    Invoice -->|"i_for′"| Org
    Expense -->|"e_by′"| Org
    Expense -->|"e_for"| Org
    style Invoice fill:#4f8cf7,color:#fff
    style Expense fill:#4f8cf7,color:#fff
    style Org fill:#f77f7f,color:#fff

That shared centre did not exist a paragraph ago; Rule 1 created it — and it is exactly what the next rule needs.
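Rule 1 can be sketched directly: write the witnessing isomorphism and its inverse, then form the composite. The snippet assumes `address` is a plain string for brevity (the classes above use an `Address` type), and `c_is_inv`, `i_for_prime` and the sample customer are names invented for this sketch.

```typescript
type Organization = { name: string; taxId: string; address: string };
type Customer = { name: string; taxId: string; address: string };
type Invoice = { i_by: Organization; i_for: Customer };

// c_is : Customer → Organization and its inverse witness the isomorphism.
const c_is = (c: Customer): Organization => ({ name: c.name, taxId: c.taxId, address: c.address });
const c_is_inv = (o: Organization): Customer => ({ name: o.name, taxId: o.taxId, address: o.address });

// The composite i_for′ = c_is ∘ i_for : Invoice → Organization.
const i_for_prime = (inv: Invoice): Organization => c_is(inv.i_for);

// Hypothetical sample data for a round trip.
const acme: Customer = { name: "Acme", taxId: "X-1", address: "1 Main St" };
const roundTrip = c_is_inv(c_is(acme));
```

TypeScript's structural typing already treats these two aliases as interchangeable, which is the compiler agreeing that nothing distinguishes `Customer` from `Organization` except the name.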

Rule 2 — a functor that commutes everywhere is an identity in disguise. Propose a functor F from the Expense sub-category to the Invoice one. On objects it is forced: Expense ↦ Invoice, Org ↦ Org. On morphisms, pair them by role: e_by′ ↦ i_for′, e_for ↦ i_by. F is a functor only if it respects the structure — only if every square commutes:

graph LR
    Expense["Expense"]
    Invoice["Invoice"]
    OrgA["Organization"]
    OrgB["Organization"]
    Expense -->|"F"| Invoice
    Expense -->|"e_by′"| OrgA
    Invoice -->|"i_for′"| OrgB
    OrgA -->|"id"| OrgB
    style Expense fill:#4f8cf7,color:#fff
    style Invoice fill:#4f8cf7,color:#fff
    style OrgA fill:#f77f7f,color:#fff
    style OrgB fill:#f77f7f,color:#fff

The square commutes exactly when i_for′ ∘ F = id ∘ e_by′ — and since F and id are identities on the target, that collapses to one plain question: do e_by′ and i_for′ pick out the same Organization? They do — both are the counterparty, the other party to the transaction. Walk the second square, e_for ↦ i_by: both pick out the owner, the tenant the document belongs to. It commutes too.

F is a bijection on objects, and every square commutes. A functor that is bijective on objects and commutes everywhere is not a bridge between two categories — it is the identity in disguise. There was only ever one object. Invoice and Expense are the same object of Dat, written down twice — a proof, not a preference. (The commuting squares are the naturality condition: the exact categorical content of "these two are the same thing.") The duplicated total() from the four-class sketch falls with them — one object carries one total, one Trn; the second was never a transformation, only a class boundary.

Rule 3 — what does not commute is segregated, not erased. The deduction never said "merge everything." categorize : Expense → Category had no partner to pair with; the direction of the money — leaving us, or coming in — is a genuine difference with nothing to commute against. Those are the real distinctions, and the deduction hands them to you precisely: keep each as a discriminator or a partial morphism on the one unified object — consolidate every morphism that commuted, segregate every morphism that did not:

graph LR
    Doc["Document"]
    Org["Organization"]
    Dir["{ INCOMING, OUTGOING }"]
    Cat["Category"]
    Doc -->|"counterparty"| Org
    Doc -->|"owner"| Org
    Doc -->|"direction"| Dir
    Doc -.->|"category? — partial"| Cat
    style Doc fill:#4f8cf7,color:#fff
    style Org fill:#f77f7f,color:#fff
    style Dir fill:#cf7fcf,color:#fff
    style Cat fill:#cf7fcf,color:#fff

counterparty is the unified e_by′ ≡ i_for′; owner the unified e_for ≡ i_by; direction : Document → { INCOMING, OUTGOING } is the discriminator that survived precisely because its square never commuted. Five objects and six morphisms became one object, Document, by two composition steps and a commuting-square check. You did not decide it. You computed it.

In code — and here the method pays a second time. The deduction fixes the category: Dat has one object, Document, with direction a Dat-morphism into a two-element enum, and total the single Trn acting on it. It does not fix the memory layout. You can realise that one object as an array of structures:

// Deduced: one Dat-object. `direction` is a Dat-morphism into a 2-element enum;
// `total` is the single Trn that acts on it.
type Document = {
  lineItems:    LineItem[];
  counterparty: Organization;             // consolidated — Customer + Supplier
  direction:    "OUTGOING" | "INCOMING";  // a Dat-morphism: the discriminator
  category?:    Category;                 // segregated — partial, only when INCOMING
};

// The one Trn, declared exactly once (body elided in this sketch).
declare function total(d: Document): Money;

const ledger: Document[] = [ /* ... */ ];          // array of structures (AoS)

…or, with the exact same categorical content, as a structure of arrays — where the direction morphism is realised not as a stored field but as which array a document lives in:

// Structure of arrays (SoA): same Dat-object, same single `total` Trn — only
// the layout differs. `direction` is now the partition itself: each array is
// one fibre of the morphism (its pre-image over OUTGOING / over INCOMING).
const ledger = {
  invoices: [] as Document[],   // fibre of `direction` over OUTGOING
  expenses: [] as Document[],   // fibre of `direction` over INCOMING
};

Both are faithful to the same category — same single Document object, same single total. AoS versus SoA is a placement and performance decision — SoA is the cache-friendly layout when a computation sweeps one direction at a time — not a modelling one. The category fixed what is true; it deliberately left how to lay it out open. That line — model here, layout there — is exactly what OOP erases by baking a layout into every class, and exactly what the atoms keep sharp.
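To make "same category, different layout" concrete, the sketch below converts between the two layouts and runs one transformation over both. The type is named `LedgerDoc` rather than `Document` only to avoid clashing with the browser's global `Document` type; the fields, `toSoA`, `toAoS` and `sumCents` are assumptions for illustration.

```typescript
type DocDirection = "OUTGOING" | "INCOMING";
type LedgerDoc = { id: string; direction: DocDirection; totalCents: number };
type LedgerSoA = { invoices: LedgerDoc[]; expenses: LedgerDoc[] };

// AoS → partitioned layout: split by the fibres of `direction`.
function toSoA(docs: LedgerDoc[]): LedgerSoA {
  return {
    invoices: docs.filter((d) => d.direction === "OUTGOING"),
    expenses: docs.filter((d) => d.direction === "INCOMING"),
  };
}

// Partitioned layout → AoS: nothing is lost, only the ordering may change.
function toAoS(soa: LedgerSoA): LedgerDoc[] {
  return soa.invoices.concat(soa.expenses);
}

// One transformation, usable on either layout.
const sumCents = (docs: LedgerDoc[]): number => docs.reduce((acc, d) => acc + d.totalCents, 0);

const aosLedger: LedgerDoc[] = [
  { id: "d1", direction: "OUTGOING", totalCents: 5000 },
  { id: "d2", direction: "INCOMING", totalCents: 1200 },
  { id: "d3", direction: "OUTGOING", totalCents: 300 },
];
const soaLedger = toSoA(aosLedger);
```

Summing over `aosLedger` and over `toAoS(soaLedger)` gives the same figure, so the choice between the layouts can be made on access patterns alone.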

"Invoice" and "expense" were never two objects. direction is a morphism on one Dat-object, not a class boundary — and the second total() is not "removed," it was never possible: there is one object to define it on. You arrived here by deduction, not taste. (The fully formal version of this move — functors, fibres, why a bijection-on-objects that commutes everywhere is a single category — is in the categorical-machinery deep dive above; it is the same five steps, and exactly how Guliel's real schema was reduced.)

Components and encapsulation are not the enemy — but they belong at the end, as a conclusion the analysis earned, never as the premise you started from. Bundle too early and you abstract away the commutes before you have seen them.

Placement, and the function that wasn't

Four atoms don't make an architecture yet. An architecture is the application of the logical pair onto the physical pair: this transformation, running on that core; this data, resident in that database.

When I first formalised this, I wrote that application as a simple function: every transformation has a location. One transformation, one place.

That was the old mistake again. A transformation does not have one location. A validation runs in the browser and on the server. A "render" runs server-side and again client-side. The same algorithm gets placed in many spots — sometimes as literally different code that does the same job.

So "where does this run" is not a function. It's a relation — one transformation, many placements. The fix is to make a placement its own first-class thing: a record that says "transformation T, at location L, inside component C." Once placement is something you can point at, you can have as many as the truth requires — and you can check them.

That check is the thing prose architecture could never do. The rule: a transformation may only read data that has been transmitted to its location. A placement that reads data no transmission delivers is not a style choice — it's a defect the architecture can show you, on a whiteboard, before production.

Components compose

A component is a cohesive bundle of placements — data and transformations, placed at locations. A component can span more than one location. And components compose: glue two of them along the transmission that connects them and you get a single larger component, that transmission now an internal detail. Compose enough and you have a system.

This is the property the boxes-and-arrows diagram never had. Composition has rules. A composed component's data, behaviour, and locations are derived from its parts — not redrawn by hand. When two components don't compose cleanly, the gap is a concrete, nameable thing: a missing transmission, a transformation reading data that isn't there. The architecture can be wrong in a way you can point at. And note where components landed: at the end, as the boundary the analysis earned — exactly where the object should have been all along, and never was.

Taming the beast: why this is suddenly urgent

Everything so far I believed when humans wrote all the code. It was true then, but it was a slow tax — a non-rigorous spec is survivable when a human reads between its lines, holds the unwritten context, and notices "wait, isn't this the same as that."

Hand that same spec to an AI and the tax becomes a hemorrhage.

An AI generates against the spec you give it. If "correct" is undefined — and in a prose PRD or a boxes-and-arrows diagram it is undefined — then nothing the AI produces can be wrong. I have watched it, repeatedly: it invents objects that were never in the model. It re-orders the architecture between one session and the next. It copies a function definition into five files. And none of that violates a prose spec, because a prose spec has nothing to violate. The non-rigor that humans quietly absorbed, AI industrialises.

And that is measured, not a grievance. GitClear's study of 211 million lines of changed code found that, as AI coding assistants spread, copy-pasted code climbed from 8.3% to 12.3%, duplicated blocks rose roughly eightfold in 2024, and refactoring fell to a record low — 2024 was the first year on record that duplicated code outpaced refactored code (report write-up). GitClear's own diagnosis is this article's thesis in other words: "it is less likely that the AI will propose reusing a similar function elsewhere … partly because of limited context size." It cannot see the whole model, so it re-types pieces of it — and a prose spec gives nothing to catch that with.

Category theory closes the hole, because it turns the spec into a checkable contract. An olog is the definition — readable straight off the diagram by the categorical rules, no annotation needed. "Correct" becomes something precise: the diagram commutes, the coherence laws hold, no morphism is a redundant copy of a composite that already exists. That is verifiable — by you, by a test, increasingly by the AI itself. The model can hallucinate all it likes; the category is the gate it cannot bluff past.

And because you think categorically about every layer — the PRD, the data model, the architecture — the whole spec is verifiable, not one slice of it. When the AI proposes a parallel object, the rule "if X is Y plus extra morphisms, X is not a new object" catches it. When it copies a function, the duplicate shows up as two morphisms that should have been one composition — a diagram that fails to commute. You are not hoping the AI behaved. You are checking.
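As a minimal sketch of what "the diagram commutes" can look like as a test: assume each object is modelled by a handful of sample values and each morphism by a plain Python function. The names `compose`, `commutes`, and the order-line data are hypothetical, chosen only for illustration. Agreement on samples is a test, not a proof.

```python
from functools import reduce
from typing import Callable, Iterable, Sequence

Morphism = Callable[[object], object]


def compose(path: Sequence[Morphism]) -> Morphism:
    """Compose left to right: compose([f, g])(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), path, x)


def commutes(path_a: Sequence[Morphism], path_b: Sequence[Morphism],
             samples: Iterable[object]) -> bool:
    """True if both paths agree on every sample of their shared source object."""
    fa, fb = compose(path_a), compose(path_b)
    return all(fa(x) == fb(x) for x in samples)


# Hypothetical model: OrderLine -> Product -> Price.
PRODUCT_OF = {"line-1": "widget", "line-2": "gadget"}
PRICE_OF = {"widget": 5, "gadget": 7}


def line_product(line):
    return PRODUCT_OF[line]


def product_price(product):
    return PRICE_OF[product]


def line_price_copy(line):
    # A generated duplicate: the composite, retyped by hand.
    return {"line-1": 5, "line-2": 7}[line]


def line_price_drifted(line):
    # The same duplicate after it drifted: wrong for line-2.
    return {"line-1": 5, "line-2": 9}[line]


samples = list(PRODUCT_OF)
composite = [line_product, product_price]

# Agrees with the composite everywhere: a redundant morphism to delete.
redundant = commutes(composite, [line_price_copy], samples)
# Disagrees somewhere: the diagram fails to commute, so flag it.
inconsistent = not commutes(composite, [line_price_drifted], samples)
```

Either outcome is actionable: a shortcut that agrees with the composite is a copy to remove, and one that disagrees is a bug with a name.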

That is the real meaning of taming the beast. Not better prompting. A verifiable target — an objective function the machine cannot game, because being wrong is finally defined.

Every architecture style is this, with one constraint flipped

Here's the payoff, and the reason I trust the method.

An architectural style — layered, microservices, event-driven, CQRS — is, in the precise sense, a set of constraints on how the parts may be arranged. The method gives you a substrate; a style is a configuration on it. You don't learn a new vocabulary per style. You take the same four atoms and flip one constraint.

  • Request/response: component A names component B; their dependency is one direct transmission.

  • Event-driven: delete that edge. A publishes to a broker; B subscribes. The A→B transmission now factors through a third component. Same atoms — one constraint flipped, and "A doesn't know B exists" falls out for free.
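The two bullets can be sketched side by side. This is an illustrative in-process toy, not a real messaging library: the `Broker` class, the topic name `order.placed`, and the component functions are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict[str, int]], None]


class Broker:
    """Hypothetical broker: the third component the A -> B edge factors through."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Dict[str, int]) -> None:
        for handler in list(self._subscribers[topic]):
            handler(event)


received: List[int] = []


def component_b(event: Dict[str, int]) -> None:
    received.append(event["order_id"])


def component_a_direct(order_id: int) -> None:
    # Request/response: A names B in its own code.
    component_b({"order_id": order_id})


def component_a_evented(broker: Broker, order_id: int) -> None:
    # Event-driven: A names only the broker and a topic.
    broker.publish("order.placed", {"order_id": order_id})


broker = Broker()
broker.subscribe("order.placed", component_b)
component_a_direct(1)
component_a_evented(broker, 2)
```

In the evented version, `component_a_evented` contains no reference to `component_b`; the wiring lives in the `subscribe` call.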

Below is the whole gallery — each style translated into the method. Open the ones you use; they're each self-contained.

Foundational

Layered / n-tier architecture

The oldest move in the book: stack the system in layers — presentation, business logic, data access, database — and let each layer call only the one directly beneath it. Nothing reaches up; nothing skips a level.

The four atoms. Layered constrains the component graph and almost nothing else — so its Dat and Loc are deliberately generic, and that genericity is itself a result (it is what separates "layered" from "n-tier", below).

Trn — each layer owns transformations; the shapes flowing between them:

| Transformation | Signature | Kind |
|---|---|---|
| render | ViewModel → DOM | pure |
| decide | Request → Command | pure |
| load / save | Query ⇄ Record* | effect |
| exec | SQL → Row* | effect |

Trm — transmissions run only between adjacent layers:

| Transmission | Signature | Carries |
|---|---|---|
| q_PB | Presentation → Business | Request |
| q_BD | Business → DataAccess | Query |
| q_D·DB | DataAccess → Database | SQL |

Loc — unconstrained: layered fixes no locations; all four components may share one process. Dat — the per-layer shapes above, with no cross-layer structure imposed.

Placement. Each layer is a component — PresentationCmp, BusinessCmp, DataAccessCmp, DatabaseCmp — and each transformation is a TrnLoc placed inside its layer. The whole content of the style is the shape of the depends-on graph over those components:

graph TD
    P["PresentationCmp"]
    B["BusinessCmp"]
    D["DataAccessCmp"]
    DB["DatabaseCmp"]
    P -->|"q_PB : Request"| B
    B -->|"q_BD : Query"| D
    D -->|"q_D·DB : SQL"| DB
    style P fill:#cf7fcf,color:#fff
    style B fill:#cf7fcf,color:#fff
    style D fill:#cf7fcf,color:#fff
    style DB fill:#cf7fcf,color:#fff

The system is the composite System = PresentationCmp ⋈ BusinessCmp ⋈ DataAccessCmp ⋈ DatabaseCmp, glued along the three adjacent-layer transmissions. Because every dependency points exactly one rank down, the gluing is a single unbranched chain — which is precisely why a layered system stacks.

The defining constraint, as a law. Give the components a rank — a functor rank : Comp → (ℕ, <) numbering the layers — and impose:

depends-on(c, c′) ⟹ rank(c′) = rank(c) + 1

Every dependency edge lands on the immediately next rank: never skips a layer, never points up. That single inference rule is "layered." A skip-level call (rank(c′) > rank(c) + 1) or an upward call (rank(c′) ≤ rank(c)) fails it — visibly, on the diagram, before runtime.
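The rule is small enough to run as a check. A minimal sketch, assuming the depends-on graph is available as a list of `(c, c_prime)` pairs and rank as a dictionary; the function name `layering_violations` and the rank numbering from 0 are choices made for this example.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (c, c_prime): c depends on c_prime


def layering_violations(rank: Dict[str, int],
                        depends_on: List[Edge]) -> List[Tuple[Edge, str]]:
    """Return every edge that breaks rank(c_prime) == rank(c) + 1, with the reason."""
    violations = []
    for c, c_prime in depends_on:
        delta = rank[c_prime] - rank[c]
        if delta <= 0:
            violations.append(((c, c_prime), "upward or same-rank call"))
        elif delta > 1:
            violations.append(((c, c_prime), "skip-level call"))
    return violations


RANK = {"PresentationCmp": 0, "BusinessCmp": 1, "DataAccessCmp": 2, "DatabaseCmp": 3}

layered = [
    ("PresentationCmp", "BusinessCmp"),
    ("BusinessCmp", "DataAccessCmp"),
    ("DataAccessCmp", "DatabaseCmp"),
]
broken = layered + [
    ("PresentationCmp", "DatabaseCmp"),  # skips two layers
    ("DataAccessCmp", "BusinessCmp"),    # points up
]

ok = layering_violations(RANK, layered)   # no violations
bad = layering_violations(RANK, broken)   # both extra edges are reported
```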

Layered vs n-tier — two independent constraints. The rule above constrains Comp only; it says nothing about Loc. N-tier adds a second, separate constraint — on placement: tl_loc is injective across layers, each layer at a distinct location. A monolith satisfies the first and not the second: fully layered, one process. The method shows the two words name two constraints — you can have either without the other.
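The second constraint is a separate predicate over a separate map. A small sketch, assuming placement is simplified to one location per layer; the location names are hypothetical.

```python
from typing import Dict


def is_n_tier(layer_loc: Dict[str, str]) -> bool:
    """n-tier: the layer -> location map is injective (no two layers share a Loc)."""
    return len(set(layer_loc.values())) == len(layer_loc)


# Fully layered, one process: satisfies the rank law, fails injectivity.
monolith = {
    "PresentationCmp": "proc-1",
    "BusinessCmp": "proc-1",
    "DataAccessCmp": "proc-1",
    "DatabaseCmp": "proc-1",
}

# Same four layers, each at its own location.
four_tier = {
    "PresentationCmp": "browser",
    "BusinessCmp": "app-host",
    "DataAccessCmp": "dal-host",
    "DatabaseCmp": "db-host",
}

monolith_is_n_tier = is_n_tier(monolith)
four_tier_is_n_tier = is_n_tier(four_tier)
```

Nothing in `is_n_tier` looks at the depends-on graph, which is the point: the two checks take different inputs.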

What the framework tells you.

  • Cross-cutting concerns are predicted, not a surprise. Logging, auth, telemetry touch every layer — so they cannot be a depends-on edge, which would skip ranks. They are a natural transformation Id ⇒ W applied uniformly across all components — a different construct entirely. That is why every real layered system grows an "aspect" or "middleware" escape hatch: the substrate has no other slot for it.

  • Strictness is one inequality. "Strict" layered forbids skip-calls (rank(c′) = rank(c) + 1 exactly); "relaxed" layered permits them (rank(c′) > rank(c)). The style's only knob is which inequality you write.

  • A linear depends-on is a build order. Compilation, deployment and reasoning all proceed bottom-up because the order is total — there is a unique topological sort, read straight off rank.

Contrast — pipe-and-filter is also a linear chain of components, but the edges are a different kind. Layered edges are call edges: a layer invokes the one beneath and awaits a return — a round trip. Pipe-and-filter edges are compose edges: data flows one way through, no call, no return. The same line shape — q ; q of calls versus f ; g of composition.

Client–server

Two parties at two places. One — the client — wants something and asks for it; the other — the server — holds the real data and answers. The browser asks, the API responds. Every exchange opens on the client side.

The four atoms. Client–server constrains the transmissions — who may open one — and data placement — who owns the authoritative copy. The transformations and their types are whatever the domain needs.

Dat — one domain type Resource, materialised twice; the only structure the style fixes is that the two materialisations are the same Dat object:

| Morphism | Signature | Semantics |
|---|---|---|
| r_id | Resource → Id | the identity both sides agree on |
| r_body | Resource → Json | the resource's payload |
| r_rev | Resource → Revision | which version a copy holds |

Trn — ordinary domain transformations, placed on one side or the other:

| Transformation | Signature | Kind |
|---|---|---|
| read | Query → Resource | effect — server-side, hits the store |
| mutate | Command → Resource | effect — server-side, the only write |
| render | Resource → View | pure — client-side |
| validate | Command → Command | pure — placed both sides |

Loc — exactly two: Client (the asking thread) and Server (the answering process, with its store). Trm — every exchange is a request/response pair:

| Transmission | Signature | Carries |
|---|---|---|
| t_request | Client → Server | Command / Query |
| t_response | Server → Client | Resource |

Placement. read and mutate are TrnLocs at Server; render a TrnLoc at Client; validate has two TrnLocs — one each side. The Resource datum has two DataLocs over the same Dat: the authoritative row at Server, and a copy at Client.

graph LR
    Cl["ClientCmp<br/>TrnLoc: render, validate"]
    DC["Resource DataLoc<br/>(copy, @ Client)"]
    Sv["ServerCmp<br/>TrnLoc: read, mutate, validate"]
    DS["Resource DataLoc<br/>(authoritative, @ Server)"]
    DC -->|"dl_cmp"| Cl
    DS -->|"dl_cmp"| Sv
    Cl -->|"t_request : Command/Query"| Sv
    Sv -.->|"t_response : Resource (reply only)"| Cl
    style Cl fill:#cf7fcf,color:#fff
    style Sv fill:#cf7fcf,color:#fff
    style DC fill:#9a9a9a,color:#fff
    style DS fill:#4f8cf7,color:#fff

Every exchange is the composite — request, then the reply it provokes:

t_response ∘ t_request : Client → Server → Client

t_response is never a free-standing morphism: its domain Server is reached only as the codomain of a prior t_request. The system glues the two components along that pair — System = ClientCmp ⋈ ServerCmp, joined on t_request and t_response — and the depends-on edge runs ClientCmp → ServerCmp only; there is no edge back.

The defining constraint, as a law. Start from peer-to-peer: any component may open a transmission to any other, and no node holds privileged data — the initiation relation on Trm is symmetric. Flip one constraint — make initiation directed:

Designate Client the sole initiator. Every Trm with c_from = Server exists only as a t_response paired after a t_request with c_to = Server: ∀ τ. c_from(τ) = Server ⟹ ∃ ρ. c_from(ρ) = Client ∧ reply(τ) = ρ. No server-side transmission is an opening move.

This strengthens Coherence Law 4: a cross-location dependency is still mediated by a Trm, but now the mediation is one-directional; depends-on carries no ServerCmp → ClientCmp edge. The data asymmetry follows from this, and is not a second axiom: the side that only ever answers is the side it makes sense to trust, so the authoritative DataLoc lands at Server and the Client copy is drawn grey.
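The initiation law can be checked against a recorded exchange. A minimal sketch, assuming transmissions are logged in time order and each reply carries the id of the request it answers; the `Trm` record and its `reply_to` field are hypothetical stand-ins for the `reply` relation in the law.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Trm:
    """One transmission; reply_to is the id of the request it answers, if any."""
    id: str
    c_from: str
    c_to: str
    reply_to: Optional[str] = None


def unsolicited_server_sends(log: List[Trm]) -> List[str]:
    """Ids of Server-originated transmissions that answer no earlier Client request."""
    seen_requests = set()
    bad = []
    for t in log:  # log order is assumed to be time order
        if t.c_from == "Client":
            seen_requests.add(t.id)
        elif t.c_from == "Server" and t.reply_to not in seen_requests:
            bad.append(t.id)
    return bad


valid_log = [
    Trm("req-1", "Client", "Server"),
    Trm("res-1", "Server", "Client", reply_to="req-1"),
]
push_log = valid_log + [
    Trm("push-1", "Server", "Client"),                   # an opening move by the server
    Trm("res-9", "Server", "Client", reply_to="req-9"),  # answers a request never made
]
```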

What the framework tells you.

  • "Never trust the client" is a Law 1 fact. The Client DataLoc is a different placement from the authoritative one — a copy delivered by t_response. Any mutate that must produce authoritative Resource reads t_from = Command against the real store, so by Coherence Law 1 it can only be placed at Server. Trust is a placement, not a slogan.

  • Validation that runs twice is honest. validate has two TrnLocs — one at Client (snappy UI), one at Server (real enforcement). That is the multi-placement the framework was built to express: parallel realisations of one Command → Command contract, not a redundancy to delete.

  • The bottleneck is structural. Because Server is the sole responder and the sole owner of the authoritative DataLoc, every client's depends-on edge points at it. Load concentration shows up on the diagram before it shows up in production.

Contrast — peer-to-peer is the same two-location picture with the initiation constraint lifted: the Trm relation goes symmetric, every node may open a transmission, and no single DataLoc is the authoritative one.

MVC / MVVM

Split the presentation tier into three roles. The Model is the data; the View is what the user sees; and between them sits a Mediator — a Controller (MVC) or a ViewModel (MVVM) — that turns input into Model changes and Model state into something the View can render. The View and the Model never speak directly.

The four atoms. MVC / MVVM constrains the component dependency graph; all three components almost always sit inside one Loc — the presentation tier — so it is not a placement across the wire.

Dat — the central object is Model, the domain state; the View consumes a rendered projection of it:

| Morphism | Signature | Semantics |
|---|---|---|
| m_state | Model → DomainState | the current authoritative value |
| vs_of | Model → ViewState | the render-ready shape (the View's input) |

Trn:

| Transformation | Signature | Kind |
|---|---|---|
| render | Model → ViewState | pure — the View itself |
| intent | Input → Command | pure — interpret a user gesture |
| apply | Model × Command → Model | effect — the Mediator mutates the Model |

Trm — the only transmissions are the three component ports; there is no View → Model transmission:

| Transmission | Signature | Carries |
|---|---|---|
| t_input | View → Mediator | Input |
| t_push | Mediator → View | ViewState |
| t_rw | Mediator → Model | Command / DomainState |

Loc — one site, Presentation (a UI thread, a browser tab). View, Mediator and Model are three components co-located there, not three locations.

Placement. Each role is a Cmp; render is a TrnLoc of ViewCmp, intent and apply are TrnLocs of MediatorCmp, the Model is a DataLoc of ModelCmp — all at Presentation:

graph TD
    M["ModelCmp<br/>DataLoc: Model"]
    Md["MediatorCmp<br/>TrnLoc: intent, apply"]
    V["ViewCmp<br/>TrnLoc: render"]
    L["@ Presentation (Loc)"]
    V -->|"t_input : Input"| Md
    Md -->|"t_rw : Command / DomainState"| M
    Md -->|"t_push : ViewState"| V
    V -.->|"co-located at"| L
    Md -.->|"co-located at"| L
    M -.->|"co-located at"| L
    style M fill:#4f8cf7,color:#fff
    style Md fill:#cf7fcf,color:#fff
    style V fill:#cf7fcf,color:#fff
    style L fill:#f7c04f,color:#000

The View↔Model dependency factors — it is never an edge, only a composite through the Mediator:

t_push ∘ apply ∘ t_rw ∘ intent ∘ t_input : View → Mediator → Model → Mediator → View

The system glues the three ports — System = ViewCmp ⋈ MediatorCmp ⋈ ModelCmp, along t_input/t_push and t_rw — and ViewCmp and ModelCmp share no port directly.

The defining constraint, as a law. Start from the unconstrained presentation component: View reads and writes Model freely — a complete triangle of depends-on edges. Flip one constraint — delete the direct edge:

In the depends-on graph there is no edge ViewCmp → ModelCmp and none ModelCmp → ViewCmp. Every View↔Model dependency is mediated by MediatorCmp — the graph is a triangle with the View–Model edge missing.

That single absent edge is MVC / MVVM. It is Coherence Law 4 (a dependency is mediated by a transmission) used within one location: Law 4 normally mediates because two components sit on different sites; here both sit at Presentation, and the mediation is imposed by design — to keep rendering separable from domain logic — not forced by the wire.
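The absent edge is easy to assert mechanically. A sketch under one simplifying assumption: the three transmissions are read directly as depends-on edges. The helper names `forbidden_edges` and `reachable` are made up for this example.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (c, c_prime): c depends on c_prime

FORBIDDEN: Set[Edge] = {("ViewCmp", "ModelCmp"), ("ModelCmp", "ViewCmp")}


def forbidden_edges(depends_on: Set[Edge]) -> Set[Edge]:
    """Direct View-Model edges present in the graph; empty means the law holds."""
    return depends_on & FORBIDDEN


def reachable(depends_on: Set[Edge], start: str, goal: str) -> bool:
    """Depth-first search: can `start` reach `goal` by following depends-on edges?"""
    stack: List[str] = [start]
    seen: Set[str] = set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(dst for src, dst in depends_on if src == node)
    return False


mvc = {
    ("ViewCmp", "MediatorCmp"),   # t_input
    ("MediatorCmp", "ModelCmp"),  # t_rw
    ("MediatorCmp", "ViewCmp"),   # t_push
}
leaky = mvc | {("ViewCmp", "ModelCmp")}  # a View that reads the Model directly
```

On `mvc`, the View still reaches the Model, but only through the Mediator, and the Model reaches nothing at all.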

MVC vs MVVM is one sub-constraint — the kind of t_push. Both have the same triangle and the same missing edge. They differ only in whether Mediator → View is explicit or deduced:

MVC: t_push is an explicit transmission. The Controller names the View and pushes ViewState by hand.

MVVM: t_push is deduced from a Binding ⊆ ViewField × ViewModelField, a declared data-binding; the framework synthesises the transmission both ways. Here t_push is a real transmission that no component's placements name.
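To make "deduced from a Binding" concrete, here is a sketch of what a binding framework might do internally, assuming the binding is a set of `(view_field, viewmodel_field)` pairs. The function names and field names are hypothetical; real frameworks add change notification, which this omits.

```python
from typing import Dict, Set, Tuple

Binding = Set[Tuple[str, str]]  # (view_field, viewmodel_field)


def derive_push(binding: Binding, viewmodel: Dict[str, object]) -> Dict[str, object]:
    """Synthesise the ViewState to push, using nothing but the binding table."""
    return {view_field: viewmodel[vm_field] for view_field, vm_field in binding}


def derive_input(binding: Binding, view_edits: Dict[str, object]) -> Dict[str, object]:
    """The reverse direction: map edited view fields back to ViewModel fields."""
    vm_of = dict(binding)
    return {vm_of[field]: value for field, value in view_edits.items() if field in vm_of}


binding: Binding = {("title_label", "title"), ("qty_input", "quantity")}
viewmodel = {"title": "Widget", "quantity": 3, "internal_flag": True}

pushed = derive_push(binding, viewmodel)
written_back = derive_input(binding, {"qty_input": 4})
```

Neither function belongs to the View or the ViewModel; both are driven entirely by the `binding` table, which is where the wiring went.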

What the framework tells you.

  • The View is a transformation, so it is testable. render : Model → ViewState has a t_from and a t_to like any Trn. Place that TrnLoc off-screen and you can check its output without a real display — the View is not "the screen", it is a pure function into ViewState.

  • MVVM's wiring is invisible by construction. The binding-derived t_push is exactly the event-driven broker's situation: the decoupling win and the "where did the wiring go" cost sit on the same diagram — a transmission that belongs to a Binding table, not to ViewCmp or MediatorCmp.

  • Extensibility is adjoining one component. A second View over the same Model is a new ViewCmp with its own render TrnLoc and one new t_input/t_push pair to the Mediator. ModelCmp is untouched — it structurally cannot import a View, because no edge reaches it.

  • The missing edge is a fact, not a discipline. "The Model knows nothing of the View" is not a coding convention you must uphold — it is the absence of a Cmp-reference, visible on the triangle before runtime.

Contrast — layered imposes a linear order on the same Component dependency graph; MVC / MVVM imposes a triangular one with a single forbidden edge. Same substrate, same depends-on graph — one is a chain, the other a triangle minus an edge.

Distributed

Microservices

Slice the system into many small, independently deployable services, and give each one its own database. No service reads another's tables. If the order service needs a customer's name, it asks the customer service over the network; it never runs a join. "Database per service" is the whole architecture in three words.

The four atoms. Microservices is a constraint on data placement; the transformations and their types are whatever the domain needs.

Dat — each service owns a private cluster of types (Order, OrderLine for one; Customer, Address for another). No object is privileged; what matters is where each is materialised — see Placement.

Trn — ordinary domain transformations:

| Transformation | Signature | Kind |
|---|---|---|
| placeOrder | OrderDraft → Order | effect |
| lookupCustomer | CustomerId → Customer | effect |

Loc — each service has its own location, and its database is a distinct location again: OrderSvc, OrderDB, CustomerSvc, CustomerDB.

Trm — every cross-service need is a transmission:

| Transmission | Signature | Carries |
|---|---|---|
| q_cust | OrderSvc → CustomerSvc | CustomerId |
| r_cust | CustomerSvc → OrderSvc | Customer |

Placement. Each service is a component; its database is a DataLoc owned by that component alone:

graph LR
    OS["OrderCmp"]
    OD["Order DataLoc<br/>(private)"]
    CS["CustomerCmp"]
    CD["Customer DataLoc<br/>(private)"]
    OD -->|"dl_cmp"| OS
    CD -->|"dl_cmp"| CS
    OS -->|"q_cust : CustomerId"| CS
    CS -.->|"r_cust : Customer"| OS
    style OS fill:#cf7fcf,color:#fff
    style CS fill:#cf7fcf,color:#fff
    style OD fill:#4f8cf7,color:#fff
    style CD fill:#4f8cf7,color:#fff

The defining constraint, as a law. Start from a layered monolith: every layer-component reads one shared database — a single DataLoc the data-access and business components both touch. Flip one constraint — make data placement exclusive:

dl_data(dl) = dl_data(dl′) ⟹ dl_cmp(dl) = dl_cmp(dl′)

No data type is materialised in two components. The store is private. And because the shared DataLoc is gone, a cross-component data need can no longer be a co-located read — it must become a transmission. Coherence Law 4 (a cross-location dependency is mediated by a Trm) stops being advice and becomes the only way two services interact.
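The exclusivity law reads almost directly as code. A minimal sketch, assuming each data placement is recorded as a `(dl_data, dl_cmp)` pair; the function name `shared_data` and the monolith's component names are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

DataLoc = Tuple[str, str]  # (dl_data, dl_cmp): data type, owning component


def shared_data(data_locs: List[DataLoc]) -> Dict[str, Set[str]]:
    """Data types materialised in more than one component, with the offenders."""
    owners: Dict[str, Set[str]] = defaultdict(set)
    for dl_data, dl_cmp in data_locs:
        owners[dl_data].add(dl_cmp)
    return {data: cmps for data, cmps in owners.items() if len(cmps) > 1}


microservices = [
    ("Order", "OrderCmp"),
    ("OrderLine", "OrderCmp"),
    ("Customer", "CustomerCmp"),
    ("Address", "CustomerCmp"),
]
# A shared database, modelled as the same type placed in two components.
monolith = [
    ("Order", "BusinessCmp"),
    ("Order", "DataAccessCmp"),
    ("Customer", "DataAccessCmp"),
]

private = shared_data(microservices)  # empty: every store is private
leaks = shared_data(monolith)         # Order is touched by two components
```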

What the framework tells you.

  • Extensibility is adjoining a component. A new service is a new Cmp with its own private DataLoc and a few transmissions to the services it queries. No existing component's data placement changes — nobody's table grows a column for the newcomer.

  • The "distributed transaction" tax is predicted. What a monolith got from one consistent join, microservices must assemble from several transmissions — each able to be slow, fail, or return stale data. That cost is the direct consequence of forbidding the shared DataLoc; the method shows the isolation win and the eventual-consistency cost on the same diagram.

  • The boundary is a fact, not a discipline. There is no arrow from one component into another's DataLoc — only into its transmission ports. A service can be redeployed, rescaled, rewritten without touching anyone's store, because the diagram structurally forbids the alternative.

Contrast — SOA keeps the private-data services but routes every inter-service transmission through one shared bus component; microservices keep the transmissions point-to-point — "smart endpoints, dumb pipes."

Service-oriented architecture (SOA)

Carve the enterprise into a handful of coarse-grained services — Billing, Inventory, CRM — and wire none of them to each other. Every service talks to one shared piece of plumbing: an enterprise service bus. The bus routes, translates protocols, transforms messages, orchestrates. Services publish to it and receive from it; they never hold a reference to one another.

The four atoms. SOA is a constraint on transmissions — it leaves Dat, Trn and the service-internal structure to the domain.

Dat — each service owns its own cluster of domain types (Invoice for Billing, StockItem for Inventory); no object is privileged. The one object every service shares is Message — the bus's envelope — with a routing discriminator:

| Morphism | Signature | Semantics |
|---|---|---|
| msg_to | Message → ServiceId | logical destination — the routing key |
| msg_payload | Message → Json | the domain datum, in transit |
| msg_format | Message → Format | wire protocol/schema the bus may translate |

Trn — ordinary domain transformations, plus the bus's own:

| Transformation | Signature | Kind |
|---|---|---|
| chargeCard | Invoice → Receipt | effect — a Billing transformation |
| route | Message → Endpoint | pure — the bus picks the second hop |
| translate | Message × Format → Message | pure — the bus's protocol bridge |
| orchestrate | Message → Message* | effect — the bus's multi-step flows |

Loc — each service has its own location: BillingSvc, InventorySvc, CRMSvc — and the bus is a distinct location again, Bus.

Trm — every cross-service need is a transmission, and every one has Bus as an endpoint:

| Transmission | Signature | Carries |
|---|---|---|
| t_out | BillingSvc → Bus | Message |
| t_in | Bus → InventorySvc | Message |

Placement. Each service is a component; the bus is also a component — its own route / translate / orchestrate TrnLocs and a DataLoc holding the routing table — never a wire:

graph TD
    Bus["BusCmp<br/>TrnLoc: route, translate, orchestrate<br/>DataLoc: routing table"]
    A["BillingCmp"]
    B["InventoryCmp"]
    C["CRMCmp"]
    A -->|"t_out : Message"| Bus
    Bus -->|"t_in : Message"| A
    B -->|"t_out : Message"| Bus
    Bus -->|"t_in : Message"| B
    C -->|"t_out : Message"| Bus
    Bus -->|"t_in : Message"| C
    style Bus fill:#cf7fcf,color:#fff
    style A fill:#cf7fcf,color:#fff
    style B fill:#cf7fcf,color:#fff
    style C fill:#cf7fcf,color:#fff

The defining fact: there is no transmission BillingSvc → InventorySvc. A dependency Billing→Inventory does not exist as one edge — it factors through the bus as a composite in the routing category:

t_in ∘ t_out : BillingSvc → Bus → InventorySvc

The first hop is fixed (the service knows only the bus); the second hop is chosen by the bus's route transformation reading msg_to. The components glue to match — System = BillingCmp ⋈ BusCmp ⋈ InventoryCmp ⋈ CRMCmp, every gluing along a t_out / t_in pair — and no two service components share a port directly. Their only common reference is BusCmp.

The defining constraint, as a law. Start from microservices: many services, private data, transmissions running point-to-point, with BillingCmp's placements naming a port on InventoryCmp directly. Flip one constraint — make every inter-component transmission factor through one shared bus:

For every transmission τ with c_from(τ) ≠ c_to(τ) between two service components, c_from(τ) = Bus ∨ c_to(τ) = Bus. No service-to-service Trm exists; every cross-service dependency c → c′ is the composite t_in ∘ t_out through Bus.

This is Coherence Law 4 (cross-location dependencies are mediated by a Trm) strengthened the same way the event-based style strengthens it — mediation must be indirect, through a third component — but here the mediator is a named addressable hub, not an anonymous broker: services route by msg_to, not by event type. The depends-on graph collapses to a star centred on Bus.
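A sketch of the star constraint as a check, assuming transmissions are `(c_from, c_to)` pairs and the hub is literally named `Bus`. Both helper names are hypothetical.

```python
from typing import List, Tuple

Trm = Tuple[str, str]  # (c_from, c_to)


def bypasses_bus(trms: List[Trm], bus: str = "Bus") -> List[Trm]:
    """Inter-component transmissions that have the bus at neither endpoint."""
    return [(a, b) for a, b in trms if a != b and bus not in (a, b)]


def route_via_bus(src: str, dst: str, bus: str = "Bus") -> List[Trm]:
    """Factor a logical dependency src -> dst as t_out then t_in through the bus."""
    return [(src, bus), (bus, dst)]


soa = (route_via_bus("BillingSvc", "InventorySvc")
       + route_via_bus("CRMSvc", "BillingSvc"))
point_to_point = [("BillingSvc", "InventorySvc")]

soa_violations = bypasses_bus(soa)                # empty: every edge touches Bus
direct_violations = bypasses_bus(point_to_point)  # the direct edge is reported
```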

What the framework tells you.

  • The star graph is the SOA selling point, as a fact about shape. Every service points only at Bus, and Bus at every service. Adding a service adjoins one Cmp and one t_out / t_in pair to the hub — zero edges anywhere else. Integration points grow O(n), not O(n²): the absence of service-to-service Trm is what makes that linear.

  • The bus is a real component, so it has real placements. route, translate, orchestrate are TrnLocs placed inside BusCmp, with a DataLoc for the routing table. The framework forbids pretending the bus is "just infrastructure": behaviour you own is behaviour you test, deploy, version and reason about.

  • The honest cost: the bus is one component on every path. Because every inter-service dependency factors through Bus, that single component knows every service's contract and every routing rule — it concentrates load and knowledge. The diagram shows the integration win (the star) and the coupling risk (one node on every path) at once — exactly the critique that later pushed teams toward microservices.
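The linear-versus-quadratic claim is plain arithmetic. The sketch below assumes the worst case for point-to-point, where every ordered pair of services has its own transmission; real systems usually connect far fewer pairs, so treat the second column as an upper bound.

```python
def hub_transmissions(n_services: int) -> int:
    """One t_out / t_in pair per service: 2n transmissions in total."""
    return 2 * n_services


def point_to_point_transmissions(n_services: int) -> int:
    """Worst case: one directed transmission per ordered pair, n(n - 1)."""
    return n_services * (n_services - 1)


# (services, via hub, worst-case point-to-point)
growth = [(n, hub_transmissions(n), point_to_point_transmissions(n))
          for n in (3, 10, 50)]
```

At three services the two counts are equal; by fifty the hub carries 100 transmissions against a possible 2450.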

Contrast — microservices is the same picture with BusCmp deleted and the service-to-service transmissions restored: BillingCmp names a port on InventoryCmp directly, the star collapses back to point-to-point edges — "smart endpoints, dumb pipes."

Service mesh

Take a microservices system and, next to every service, deploy a small proxy — a sidecar — on the same host. A service never speaks to the network directly: every call leaves through its own sidecar and arrives through the peer's sidecar. The proxies handle mTLS, retries, timeouts, load balancing and tracing; the business code stays oblivious. The mesh is the set of all those proxies.

The four atoms. Service mesh is a constraint on transmissions and the components that carry them; Dat and Trn are whatever the domain needs.

Dat — each service owns its domain types (Order, Customer, …); the only datum the mesh itself adds structure to is the wire envelope. No object is privileged — what matters is who carries it across the boundary.

Trn — two flavours. Domain transformations live in the services; the mesh contributes the transmission concerns as its own placed transformations:

| Transformation | Signature | Kind |
|---|---|---|
| placeOrder | OrderDraft → Order | effect — a service Trn |
| mtls | Bytes → Bytes | effect — encrypt/verify, in the proxy |
| retry | Request → Response | effect — resend on failure, in the proxy |
| trace | Span → Span* | effect — emit telemetry, in the proxy |

Loc — each service has its own location, and its sidecar is co-located with it — same Loc: OrderSvc (host of service A and proxy A), PaymentSvc (host of service B and proxy B).

Trm — every inter-service hop is decomposed into proxy-mediated transmissions:

| Transmission | Signature | Carries |
|---|---|---|
| t_out | ServiceA → ProxyA | Request (loopback, same Loc) |
| t_wire | ProxyA → ProxyB | Request (mTLS, cross-Loc) |
| t_in | ProxyB → ServiceB | Request (loopback, same Loc) |

Placement. Each service is a component; the mesh adjoins one more component per location — a proxy component, sharing its service's Loc. The proxy's mtls / retry / trace are TrnLocs placed inside the proxy, not the service:

graph LR
    A["ServiceA Cmp"]
    PA["ProxyA Cmp<br/>TrnLoc: mtls, retry, trace"]
    PB["ProxyB Cmp<br/>TrnLoc: mtls, retry, trace"]
    B["ServiceB Cmp"]
    A -->|"t_out : Request"| PA
    PA -->|"t_wire : Request (mTLS)"| PB
    PB -->|"t_in : Request"| B
    style A fill:#cf7fcf,color:#fff
    style PA fill:#cf7fcf,color:#fff
    style PB fill:#cf7fcf,color:#fff
    style B fill:#cf7fcf,color:#fff

The single logical edge ServiceA → ServiceB is never a Trm — it factors into three transmissions in the routing category:

t_in ∘ t_wire ∘ t_out : ServiceA → ProxyA → ProxyB → ServiceB

t_out and t_in are loopback hops (c_from and c_to share a Loc with their service); only t_wire crosses the network. The system glues to match — System = ServiceA ⋈ ProxyA ⋈ ProxyB ⋈ ServiceB, along those three transmissions — and ServiceA and ServiceB share no port directly: every boundary TrmCmp they own names a proxy, never the peer service.

The defining constraint, as a law. Start from plain microservices: a service's transmission reaches the peer point-to-point — t : ServiceA → ServiceB, a direct port. Flip one constraint — interpose a per-location proxy on every inter-service transmission:

No Trm runs ServiceX → ServiceY. For every inter-service dependency there is a proxy component at each endpoint's Loc, and the transmission factors as t_in ∘ t_wire ∘ t_out through them. The proxy is one uniform component, adjoined once per location.

This is Coherence Law 4 (a cross-location dependency is mediated by a Trm) with the mediator pinned: every mediating transmission must terminate on a proxy. And it is a deliberate §5 inversion. Layered's cross-cutting concern is a natural transformation Id ⇒ W — an aspect woven through component code. The mesh refuses the weaving: it realises the same concern as a uniform component stamped per Loc. Aspects become objects.
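The factoring can be written as a rewrite from one logical edge to three transmissions. A sketch only: the naming rule that turns `ServiceA` into `ProxyA` is an assumption made so the example stays self-contained, and the prefix test in `direct_service_calls` relies on it.

```python
from typing import List, Tuple

Trm = Tuple[str, str, str]  # (name, c_from, c_to)


def proxy_of(service: str) -> str:
    """Hypothetical naming rule: ServiceA -> ProxyA."""
    return service.replace("Service", "Proxy", 1)


def through_mesh(src: str, dst: str) -> List[Trm]:
    """Factor the logical edge src -> dst as t_out, t_wire, t_in."""
    return [
        ("t_out", src, proxy_of(src)),
        ("t_wire", proxy_of(src), proxy_of(dst)),
        ("t_in", proxy_of(dst), dst),
    ]


def direct_service_calls(trms: List[Trm]) -> List[Trm]:
    """Transmissions that run service to service, which the mesh forbids."""
    return [t for t in trms
            if t[1].startswith("Service") and t[2].startswith("Service")]


meshed = through_mesh("ServiceA", "ServiceB")
plain = [("t", "ServiceA", "ServiceB")]
```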

What the framework tells you.

  • A cross-cutting concern, made a deployable. mtls, retry, trace are one component pattern instantiated once per location, applied uniformly across the depends-on graph — the textbook shape of a cross-cutting concern, but as Cmp objects you run and version, not woven code. The §5 natural transformation and this component are two encodings of one idea; the mesh picks the one with an operational surface.

  • Policy moves without touching business logic. Because retry and mtls are TrnLocs in the proxy, upgrading mesh behaviour redeploys proxy components — no service component's placements change. Decoupling of policy from code is structural, not disciplinary.

  • The cost is on the diagram. Every logical hop is now three Trms, and two extra components sit on every call path. The uniform-policy win and the latency-plus-operational-surface cost are both read off the same picture — the proxies are real components with real placements to run and observe.

Contrast — plain microservices is this diagram with the proxies removed and the direct service-to-service transmission restored: t : ServiceA → ServiceB, a shared port. The mesh is the same substrate with one uniform component adjoined at every location.

Broker / message-queue

A producer drops a message onto a queue. A pool of workers pulls from that queue, and each message is handed to exactly one worker. The queue is durable and ordered; messages wait until a worker takes them. This is how you spread a backlog of work across a fleet — a hundred jobs, ten workers, each job done once.

The four atoms.

Dat — the central datum is Message; the only object a producer and a worker both name is Message itself — there is no shared subscription discriminator:

| Morphism | Signature | Semantics |
|---|---|---|
| m_payload | Message → Json | the job's data |
| m_at | Message → Date | when it was enqueued |
| m_offset | Message → ℕ | its position in the queue order |

The queue is the free monoid Message* — finite sequences under concatenation, appended at one end and taken from the other.

Trn:

| Transformation | Signature | Kind |
|---|---|---|
| enqueue | Job → Message | effect — append to the queue |
| match | Message → Subscriber | pure — the broker's routing: pick one |
| handle | Message → JobResult | effect — the worker does the work |

Loc — Producer (the enqueuing thread); Broker (with BrokerQueue, its durable disk); Worker (a competing-consumer thread, usually several).

Trm:

| Transmission | Signature | Carries |
|---|---|---|
| t_enqueue | Producer → Broker | Message |
| t_persist | Broker → BrokerQueue | Message |
| t_deliver | Broker → Worker | Message |
| t_ack | Worker → Broker | Offset |

Placement. enqueue is a TrnLoc at Producer; match a TrnLoc at Broker; handle a TrnLoc at each Worker. The Message datum has three DataLocs over the same Dat — the object in Producer RAM, the persisted entry at BrokerQueue, the in-flight copy at one Worker.

graph LR
    P["ProducerCmp<br/>TrnLoc: enqueue"]
    Bk["BrokerCmp<br/>TrnLoc: match · DataLoc: Message* queue"]
    C1["WorkerCmp A<br/>TrnLoc: handle"]
    C2["WorkerCmp B<br/>TrnLoc: handle"]
    P -->|"t_enqueue : Message"| Bk
    Bk -->|"t_deliver : Message (to one)"| C1
    Bk -->|"t_deliver : Message (to one)"| C2
    C1 -.->|"t_ack : Offset"| Bk
    C2 -.->|"t_ack : Offset"| Bk
    style P fill:#cf7fcf,color:#fff
    style Bk fill:#cf7fcf,color:#fff
    style C1 fill:#cf7fcf,color:#fff
    style C2 fill:#cf7fcf,color:#fff

Like pub/sub, there is no transmission Producer → Worker. The producer–worker dependency factors — it is a composite in the routing category:

t_deliver ∘ t_enqueue : Producer → Broker → Worker

It passes through Broker. The components glue to match — System = ProducerCmp ⋈ BrokerCmp ⋈ WorkerCmp, along t_enqueue and t_deliver — and ProducerCmp and WorkerCmp share no port directly.

The defining constraint, as a law. Start from publish–subscribe: the same broker, the same persisted Message*, the same transmissions — and a routing transformation match : Message → Subscriber* that fans one message out to every subscriber. Flip one constraint — collapse the multiplicity of match:

The broker's routing transformation is match : Message → Subscriber — it selects exactly one consumer. For each Message there is exactly one t_deliver transmission, terminating at one Worker.

That is the only edit. Subscriber* (a list — broadcast) becomes Subscriber (pick one — competing consumers). Where pub/sub turns one event into many t_delivers, the broker turns one message into one. A second consequence follows from the same flip: because each message has a single recipient, the t_deliver transmission is typically initiated by the Worker — it pulls from the queue when it has capacity, rather than being pushed. The arrow's direction in Trm is unchanged; the trigger moves to the consumer end.
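A minimal Python sketch of that flip, under stated assumptions: the `Broker` class, the `run` helper, and the round-robin pick-one policy are hypothetical illustrations (the text only requires that match select exactly one worker), and delivery here is a synchronous push rather than the worker-initiated pull described above. The only difference between the two modes is the multiplicity of what match returns.

```python
from collections import deque
from itertools import cycle


class Broker:
    """Toy in-memory broker; the deque stands in for the persisted Message* queue."""

    def __init__(self, workers):
        self.queue = deque()
        self.workers = workers            # worker name -> list of delivered messages
        self._next = cycle(workers)       # assumed pick-one policy: round-robin

    def enqueue(self, message):           # t_enqueue
        self.queue.append(message)

    def match_one(self, message):         # match : Message -> Subscriber (pick one)
        return next(self._next)

    def match_all(self, message):         # pub/sub's match : Message -> Subscriber*
        return list(self.workers)

    def drain(self, broadcast=False):
        while self.queue:
            message = self.queue.popleft()            # consume from the head (FIFO)
            if broadcast:
                recipients = self.match_all(message)
            else:
                recipients = [self.match_one(message)]
            for name in recipients:                   # one t_deliver per recipient
                self.workers[name].append(message)


def run(messages, broadcast=False):
    """Enqueue all messages, drain the queue, and return what each worker received."""
    workers = {"A": [], "B": []}
    broker = Broker(workers)
    for message in messages:
        broker.enqueue(message)
    broker.drain(broadcast)
    return workers


competing = run(["m1", "m2", "m3", "m4"])
fanned_out = run(["m1", "m2"], broadcast=True)
```

With four messages and two workers, the pick-one mode hands each worker two messages, while the broadcast mode hands every worker every message.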

What the framework tells you.

  • Scaling is adjoining a component. A new worker is a new WorkerCmp, its handle TrnLoc, and one new t_deliver. Because match picks one recipient from the live pool, throughput scales with the number of WorkerCmps — and no producer placement changes.

  • Temporal decoupling is the BrokerQueue DataLoc. t_enqueue writes the Message*; t_deliver reads it — later. The persisted free monoid is the buffer; a burst of enqueues simply lengthens the sequence, and workers drain it at their own pace. Producer and worker need never be co-live.

  • Exactly-once-ish is localised to two arrows. "One message, one worker" lives entirely in match (pick-one) and the t_ack transmission back from the worker. Redelivery, visibility timeouts, and dead-letter handling are all reasoned about on those two named arrows — nowhere else.

  • The order is a property of the monoid, not a worker. m_offset totally orders Message*; match consumes from the head. FIFO is read off the free monoid's structure — no component has to enforce it.

Contrast — publish–subscribe is the same broker component and the same persisted Message*, with the delivery multiplicity set back to broadcast: match : Message → Subscriber*, every subscriber gets every message. Broker/queue and pub/sub are siblings — one substrate, the multiplicity of match the only constraint flipped.

Data-flow

Pipe-and-filter / pipelines

Data enters one end, walks a straight line through a series of independent stages, and leaves transformed at the other. Each stage — a filter — does one job and knows nothing of its neighbours; the connectors between them — the pipes — carry a typed datum forward. A Unix shell command, a build pipeline, an ETL job: all the same shape.

The four atoms. Pipe-and-filter constrains the transformation layer: the whole architecture is one composable chain in the algorithm category.

Dat — the stages of the datum as it is rewritten; what matters is not its internal structure but the sequence of types the chain steps through:

| Morphism | Signature | Semantics |
| --- | --- | --- |
| stage₀ … stageₙ | objects of Dat | the type at each cut of the pipeline — Raw, Parsed, Scored, Report |

Trn — each filter is exactly one transformation:

| Transformation | Signature | Kind |
| --- | --- | --- |
| parse | Raw → Parsed | pure |
| score | Parsed → Scored | pure |
| render | Scored → Report | effect |

Loc — one site per filter, fixed by nothing but the filter itself: FilterLocₐ, FilterLocᵦ, FilterLoc_c. They need not be distinct — a shell pipeline runs them all in one process; a stream job spreads them across machines. Trm — one pipe per adjacent pair:

| Transmission | Signature | Carries |
| --- | --- | --- |
| pipe_AB | FilterLocₐ → FilterLocᵦ | Parsed |
| pipe_BC | FilterLocᵦ → FilterLoc_c | Scored |

Placement. Each filter is a component — FilterCmpₐ, FilterCmpᵦ, FilterCmp_c — and each holds exactly one TrnLoc: its single transformation placed at its single location. A pipe is the Trm welding two of them.

graph LR
    A["FilterCmp A<br/>TrnLoc: parse (Raw→Parsed)"]
    B["FilterCmp B<br/>TrnLoc: score (Parsed→Scored)"]
    C["FilterCmp C<br/>TrnLoc: render (Scored→Report)"]
    A -->|"pipe_AB : Parsed"| B
    B -->|"pipe_BC : Scored"| C
    style A fill:#cf7fcf,color:#fff
    style B fill:#cf7fcf,color:#fff
    style C fill:#cf7fcf,color:#fff

The pipeline is a composite morphism in the algorithm category Alg (§2.2) — arrow composition, made physical:

score ∘ parse is defined because t_to(parse) = Parsed = t_from(score), and render ∘ (score ∘ parse) : Raw → Report is the whole system.

The components glue to match — System = FilterCmpₐ ⋈ FilterCmpᵦ ⋈ FilterCmp_c, along pipe_AB and pipe_BC. Each pipe carries the type the upstream filter produces straight into the downstream filter's input — every gluing port is a typed handoff, never a call awaiting a return.

The defining constraint, as a law. Start from layered — also a linear chain of components — and flip one constraint: the edges are not calls but compositions. Make every component exactly one transformation, and require adjacent filters to compose at the type level:

The depends-on graph is a linear chain c₀ → c₁ → … → cₙ; each cᵢ holds exactly one TrnLoc of one Trn fᵢ; and for every edge cᵢ → cᵢ₊₁, t_to(fᵢ) = t_from(fᵢ₊₁) — the pipe carries exactly that type.

That single equation is "pipe-and-filter." It is Coherence Law 1 (Placement honesty) specialised: filter fᵢ₊₁ reads t_from(fᵢ₊₁) at its location, and the only thing that delivers it is pipe_{i,i+1} — so that pipe τ must satisfy carries(τ) = t_from(fᵢ₊₁), which forces the type-match. A filter whose input type is not the upstream output type is a pipe that cannot be welded — the framework rejects it before anything runs.
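The type-match law can be sketched as a check that runs when the chain is assembled, not when data flows. This is an illustration under assumptions, not the framework's own tooling: `Filter` and `weld` are hypothetical names, types are compared by name as plain strings, and the three lambda bodies are invented stand-ins for parse, score and render.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Filter:
    name: str
    t_from: str          # input type, compared by name
    t_to: str            # output type, compared by name
    run: Callable


def weld(*filters):
    """Compose filters left to right; reject any pipe whose two ends disagree."""
    for up, down in zip(filters, filters[1:]):
        if up.t_to != down.t_from:
            raise TypeError(
                f"cannot weld {up.name} -> {down.name}: {up.t_to} != {down.t_from}"
            )

    def pipeline(x):
        for f in filters:
            x = f.run(x)
        return x

    return pipeline


# Stand-in filter bodies, chosen only to make the example runnable.
parse = Filter("parse", "Raw", "Parsed", lambda raw: [int(t) for t in raw.split(",")])
score = Filter("score", "Parsed", "Scored", lambda xs: sum(xs))
render = Filter("render", "Scored", "Report", lambda s: f"total={s}")

report = weld(parse, score, render)("1,2,3")
```

Calling `weld(parse, render)` raises `TypeError` at assembly time, because parse produces Parsed while render expects Scored: the pipe is rejected before any datum moves.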

What the framework tells you.

  • The edges are COMPOSE, not CALL. Layered's chain is call edges — a round trip, a transmission down and a return up. Pipe-and-filter's chain is compose edges: one transmission, one direction, c_from ≠ c_to, and no return Trm at all. The same line shape, a different morphism.

  • Splicing a filter is adjoining one arrow. Inserting g between fᵢ and fᵢ₊₁ is legal exactly when t_to(fᵢ) = t_from(g) and t_to(g) = t_from(fᵢ₊₁) — the new component re-welds two pipes. No other filter's placement changes; the rest of the chain cannot observe it.

  • Filters are trivially relocatable. A filter's only contract is its t_from → t_to signature, so its TrnLoc is free to move — same thread, separate process, separate machine — touching only its own Loc and the two adjacent pipes. A shell pipeline and a distributed stream job are the same architecture at different placements.

  • The honest cost is shared state. A filter has one input type and one output type — nowhere to keep a lookup table or accumulated history. Such state must be threaded through the pipe (fattening carries) or hoisted into a side channel that breaks the linear chain. The constraint that buys composability is the one that makes cross-stage context awkward.

Contrast — event-driven publish–subscribe also forbids a component from naming its successor — but routes through a broker that may fan out to many consumers. Pipe-and-filter keeps the graph a single line: no broker, no branching, the successor fixed by the next link in the chain.

CQRS

CQRS — Command Query Responsibility Segregation — splits a system along the line between changing data and reading it. The write side accepts commands and owns the source of truth in whatever shape best enforces the rules. The read side answers queries from a separate store, shaped for the screens that consume it — and that store is never written by hand: it is materialised from the write side after every command.

The four atoms. CQRS is a constraint on data placement — two DataLocs where a single-model architecture has one. The transformations either side runs are whatever the domain needs.

Dat — two parallel shapes over the same domain. State is the write-side truth (the shape that enforces invariants); ReadModel is the query-side projection (the shape the screen wants). They are connected, not independent:

| Morphism | Signature | Semantics |
| --- | --- | --- |
| rm_of | ReadModel → State | which write-side truth this view denormalises |
| rm_at | ReadModel → Date | the write-side version this view reflects |

Trn:

| Transformation | Signature | Kind |
| --- | --- | --- |
| decide | Command × State → State | effect — the only write |
| project | State → ReadModel | pure — denormalise truth into a view |
| query | ReadModel → Answer | pure — read-only, never touches State |

Loc — the write store and the read store are distinct locations: WriteDB, ReadDB. Trm — t_decide writes WriteDB; t_project carries each committed State to ReadDB; t_query reads ReadDB back.

| Transmission | Signature | Carries |
| --- | --- | --- |
| t_decide | CommandHandler → WriteDB | State |
| t_project | WriteDB → ReadDB | State |
| t_query | ReadDB → QueryHandler | ReadModel |

Placement. decide is a TrnLoc in the write component; query a TrnLoc in the read component. State is the one authoritative DataLoc — materialised at WriteDB. ReadModel is a second DataLoc at ReadDB, and it is deduced: the output of a placed project, never written by a command.

graph LR
    DC["decide (Trn)"]
    WS["State<br/>(authoritative DataLoc)"]
    PR["project : State→ReadModel (Trn)"]
    RM["ReadModel<br/>(deduced DataLoc)"]
    QY["query (Trn)"]
    DC -->|"t_decide"| WS
    WS -->|"t_project"| PR
    PR -->|"produces"| RM
    RM -->|"t_query"| QY
    style DC fill:#7fc47f,color:#000
    style WS fill:#4f8cf7,color:#fff
    style PR fill:#7fc47f,color:#000
    style RM fill:#9a9a9a,color:#fff
    style QY fill:#7fc47f,color:#000

The read store is a composition, never an independent thing: ReadModel = project(State) — the truth built by decide, then projected. The system glues from two components — System = WriteCmp ⋈ ReadCmp — along the t_project transmission; the write component and the read component share no other port, and the read component has no inbound edge except t_project.

The defining constraint, as a law. Start from the single-model architecture: one DataLoc for an entity, and both decide and query touch that same placement. Flip one constraint — make the read placement separate and derived:

Commands and queries use different DataLocs. The write DataLoc (State) is authoritative; the read DataLoc (ReadModel) is not — it is project of the write DataLoc, and no command writes it directly: ReadModel = project ∘ decide*.

ReadModel is a deduced morphism, so storing it as an independent authority would be a redundant edge — a diagram that fails to commute. This is the post's "design result-data before source-data" rule promoted from one mental step to two physical DataLocs: the read model literally is the result-data shape, the write model the source-data shape, and project is the push between them.

What the framework tells you.

  • Eventual consistency is the latency of one named transformation. ReadDB is correct exactly as far as t_project has run. A query that reads ReadModel before project catches up sees a stale rm_at — the gap between the write DataLoc and the read DataLoc not yet closed by project. The framework names the staleness instead of waving at it: it is one edge's lag, visible on the diagram.

  • Two stores scale independently because they are two DataLocs. Write and read are distinct locations, so ReadDB can be replicated, re-indexed, or re-shaped for the screens without touching WriteDB — the isolation is a fact about the diagram, not a discipline.

  • Read shapes are free projects. A new screen wanting a different denormalisation is a new project' TrnLoc and a new ReadModel' DataLoc — adjoined per the extensibility rule, over an unchanged authoritative State. The write side cannot observe the addition.

  • The cost sits on the project edge. A second DataLoc with no authority of its own, plus a transformation that must run on every write and has its own failure modes. The framework shows the gain (independent reads) and the cost (a derived store to keep current) on the same diagram.
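The staleness and the cost on the project edge can be seen in a small sketch. Everything concrete here is an assumption made for illustration: the shopping-list domain, the field names, and the use of an integer version for rm_at where the table above types it as a Date.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:                 # write-side truth
    version: int
    items: tuple             # (name, price) pairs


@dataclass(frozen=True)
class ReadModel:             # query-side projection
    rm_at: int               # write-side version this view reflects (assumed integer)
    total: int
    count: int


def decide(command, state):  # Command x State -> State, the only write
    name, price = command
    return State(state.version + 1, state.items + ((name, price),))


def project(state):          # State -> ReadModel, pure
    return ReadModel(state.version, sum(p for _, p in state.items), len(state.items))


def query(read_model):       # ReadModel -> Answer, never touches State
    return f"{read_model.count} items, total {read_model.total}"


write_db = State(0, ())
read_db = project(write_db)

write_db = decide(("pen", 3), write_db)
write_db = decide(("ink", 7), write_db)
stale_by = write_db.version - read_db.rm_at   # project has not caught up yet
stale_answer = query(read_db)

read_db = project(write_db)                   # t_project runs, the gap closes
fresh_answer = query(read_db)
```

Between the second `decide` and the next `project`, the read store lags by two versions, and that lag is exactly the difference between the write-side version and rm_at.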

Contrast — event sourcing is the natural pairing: it constrains the write DataLoc to be an append-only Event* log and makes project a fold over it. CQRS alone says nothing about the write store's shape — only that the read store is a separate, derived placement; event sourcing constrains the write store, CQRS the read store.

Event sourcing

Most systems store current state — the account balance, the order status — and overwrite it on every change. Event sourcing refuses to. It stores the sequence of changes themselves — an append-only log of events — and treats that log as the only source of truth. Current state is never written down; it is recomputed from the log whenever it is needed.

The four atoms.

Dat — the central object is Event, and the authoritative store is the free monoid on it:

| Morphism | Signature | Semantics |
| --- | --- | --- |
| ev_type | Event → EventType | which kind of change |
| ev_payload | Event → Json | the change's data |
| ev_at | Event → Date | when it was appended |

The log is Event* — the free monoid: finite sequences under concatenation. State is a projected domain value (a balance, a status).

Trn:

| Transformation | Signature | Kind |
| --- | --- | --- |
| append | Event* × Event → Event* | effect — the only write |
| handle | State × Event → State | pure — one per EventType |
| fold | Event* → State | pure — catamorphism, fold = reduce(handle) |

Loc — the event store (a disk). Trm — t_append writes the store; t_replay reads the log back for a fold.

Placement. The log is the one authoritative DataLoc; State is not a DataLoc at all — it is the output of a placed fold:

graph LR
    AP["append (Trn)"]
    LOG["Event* log<br/>(authoritative DataLoc)"]
    FD["fold : Event*→State (Trn)"]
    ST["State (deduced)"]
    AP -->|"t_append"| LOG
    LOG -->|"t_replay"| FD
    FD -->|"produces"| ST
    style AP fill:#7fc47f,color:#000
    style LOG fill:#4f8cf7,color:#fff
    style FD fill:#7fc47f,color:#000
    style ST fill:#9a9a9a,color:#fff

Current state is a composition, never a stored thing: State = fold(Event*) — the log built by append, then folded.

The defining constraint, as a law. Start from a state-stored architecture: an entity has a DataLoc holding its current value, and each command overwrites it. Flip one constraint:

The only authoritative DataLoc is the append-only Event* log; no transformation overwrites state, and every State value is fold of the log.

This is the post's "deduce, don't store" principle promoted to a whole architecture: State is a deduced morphism (fold ∘ append*), so storing it authoritatively would be a redundant edge — a diagram that fails to commute.

What the framework tells you.

  • Time travel is free because the fold is total. The log is a free monoid, so every prefix is itself a valid Event*. Folding a prefix gives the state as of any past moment; folding with a different handle gives a new projection over the same history — a new Trn, adjoined per the extensibility rule, over an unchanged DataLoc.

  • A cached State is explicitly secondary. Real systems place snapshot DataLocs — memoised fold results — as an optimisation. The framework is blunt that these are deduced, not authoritative: valid exactly as far as fold says so, the same status as a CQRS read model.

  • The honest cost sits on one edge. Replaying the whole log is O(history), and because old events are immutable, fold must handle every event shape the log has ever held. Both costs live on the single t_replay edge from log to fold — the substrate makes them visible, not surprising.
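A short sketch of fold = reduce(handle), including the prefix replay that gives time travel. The bank-account events, the tuple encoding of an Event, and the zero starting balance are assumptions chosen for the example; an immutable tuple stands in for the append-only log.

```python
from functools import reduce


def handle(state, event):            # State x Event -> State
    ev_type, amount = event
    if ev_type == "Deposited":
        return state + amount
    if ev_type == "Withdrawn":
        return state - amount
    raise ValueError(f"unknown event type: {ev_type}")


def fold(log, initial=0):            # Event* -> State
    return reduce(handle, log, initial)


def append(log, event):              # Event* x Event -> Event*
    return log + (event,)            # builds a new tuple; nothing is overwritten


log = ()
for event in [("Deposited", 100), ("Withdrawn", 30), ("Deposited", 5)]:
    log = append(log, event)

balance_now = fold(log)              # fold over the whole history
balance_after_two = fold(log[:2])    # fold over a prefix: the state as of event 2
```

Because `handle` raises on an unknown event type, the sketch also shows the cost named above: the fold has to know every event shape the log has ever held.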

Contrast — CQRS is the natural pairing: the event log is the CQRS write DataLoc, and a CQRS read model is just one fold among several. Event sourcing constrains the write store to be a log; CQRS constrains the read store to be separate and derived.

Decoupling & boundaries

Event-driven / publish–subscribe

A publisher never calls a consumer. It emits an event to a broker, and whoever subscribed to that event type receives it. The publisher does not know who — or whether anyone — is listening. This is implicit invocation.

The four atoms.

Dat — the central datum is Event; the only Dat object a publisher and a subscriber both name is EventType:

| Morphism | Signature | Semantics |
| --- | --- | --- |
| ev_type | Event → EventType | the discriminator both sides share |
| ev_payload | Event → Json | the event's data |
| ev_at | Event → Date | when it was published |

The stream is the free monoid Event* — the append-only log. Subscription = Consumer × EventType.

Trn:

| Transformation | Signature | Kind |
| --- | --- | --- |
| publish | DomainChange → Event | effect — append to the stream |
| match | Event → Subscription* | pure — the broker's routing |
| handle | State × Event → State | pure — per-EventType transition |
| fold | Event* → State | pure — fold = reduce(handle) |

Loc — Producer (the publishing thread); Broker (with BrokerLog, its durable disk); Consumer (a subscribing worker, usually several).

Trm:

| Transmission | Signature | Carries |
| --- | --- | --- |
| t_publish | Producer → Broker | Event |
| t_persist | Broker → BrokerLog | Event |
| t_deliver | Broker → Consumer | Event |
| t_ack | Consumer → Broker | Offset |

Placement. publish is a TrnLoc at Producer; match a TrnLoc at Broker; handle and fold are TrnLocs at Consumer. The Event datum has three DataLocs over the same Dat — the object in Producer RAM, the persisted entry at BrokerLog, the received copy at Consumer.

graph LR
    P["ProducerCmp<br/>TrnLoc: publish"]
    Bk["BrokerCmp<br/>TrnLoc: match · DataLoc: Event* log"]
    C1["ConsumerCmp A<br/>TrnLoc: handle, fold"]
    C2["ConsumerCmp B<br/>TrnLoc: handle, fold"]
    P -->|"t_publish : Event"| Bk
    Bk -->|"t_deliver : Event"| C1
    Bk -->|"t_deliver : Event"| C2
    C1 -.->|"t_ack : Offset"| Bk
    C2 -.->|"t_ack : Offset"| Bk
    style P fill:#cf7fcf,color:#fff
    style Bk fill:#cf7fcf,color:#fff
    style C1 fill:#cf7fcf,color:#fff
    style C2 fill:#cf7fcf,color:#fff

The defining observation: there is no transmission Producer → Consumer. The producer–consumer dependency factors — it is a composite in the routing category:

t_deliver ∘ t_publish : Producer → Broker → Consumer

It passes through Broker. The components glue to match — System = ProducerCmp ⋈ BrokerCmp ⋈ ConsumerCmp, along t_publish and t_deliver — and ProducerCmp and ConsumerCmp share no port directly.

The defining constraint, as a law. Start from request/response: a single transmission t : A → B, and ACmp names BCmp — a direct depends-on edge. Flip one constraint:

Delete the direct edge. Every inter-component dependency is mediated, and mediated indirectly — through the broker. In the depends-on graph there is no edge ConsumerCmp → ProducerCmp; both point at BrokerCmp.

This is Coherence Law 4 at its limit: Law 4 requires every cross-location dependency to be mediated by a transmission; the event-based style strengthens it so no producer/consumer pair is ever directly coupled. "A doesn't know B exists" is not discipline — it is the absence of a Cmp-reference, a fact about the diagram.

What the framework tells you.

  • Extensibility is adjoining an object. A new subscriber is a new ConsumerCmp, its handle / fold TrnLocs, a Subscription, and one new t_deliver. No existing placement changes — the publisher cannot even observe the addition.

  • Temporal decoupling is the BrokerLog DataLoc. t_publish writes the stream; t_deliver reads it — later. The persisted Event* is the buffer; producer and consumer need never be co-live.

  • Fan-out is match. match : Event → Subscription* turns one event into many t_delivers, one per subscriber — the broker's whole job, one transformation.

  • The cost is honest, too. The depends-on graph is sparse and stable (everything points at the broker), but control flow — which consumer runs after a publish — lives in Subscription data and match, in no component's placements. The method shows the extensibility win and the lost-control-flow cost in one diagram.
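A toy in-process broker makes the extensibility and lost-control-flow points concrete. It is a sketch under assumptions: the `Broker` class, the dict-shaped events, and the `OrderPlaced` event type are hypothetical, and delivery is synchronous here, whereas a real broker persists first and delivers later.

```python
from collections import defaultdict


class Broker:
    def __init__(self):
        self.log = []                             # stands in for the Event* stream
        self.subscriptions = defaultdict(list)    # EventType -> handlers

    def subscribe(self, ev_type, handler):        # adjoin one Subscription
        self.subscriptions[ev_type].append(handler)

    def match(self, event):                       # match : Event -> Subscription*
        return list(self.subscriptions.get(event["type"], []))

    def publish(self, event):                     # t_publish, then t_persist
        self.log.append(event)
        for handler in self.match(event):         # one t_deliver per subscription
            handler(event)


broker = Broker()
audit, emails = [], []

broker.subscribe("OrderPlaced", audit.append)
broker.publish({"type": "OrderPlaced", "payload": {"id": 1}})

# A second consumer is adjoined later; the publishing code above is untouched.
broker.subscribe("OrderPlaced", lambda e: emails.append(e["payload"]["id"]))
broker.publish({"type": "OrderPlaced", "payload": {"id": 2}})
```

Which handlers run after a publish is decided entirely by the contents of `subscriptions`; nothing in the publishing code names a consumer.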

Contrast — request/response is the same picture with the broker removed and the direct edge restored: one transmission t : A → B, and ACmp names BCmp — a direct depends-on edge, a shared port.

Hexagonal (ports & adapters)

A domain core — the rules of your business — that touches the outside world only through abstract slots. The core never names a database driver, an HTTP client, or a queue SDK; it declares a port and an adapter is plugged in from outside to fulfil it.

The four atoms. Hexagonal constrains the component graph and the kind of boundary transmission a component may hold — its Dat, Trn and Loc are whatever the domain needs.

Dat — the core owns a private cluster of domain types (Document, Money, Counterparty); a port also fixes a contract type — the datum on the wire, which is the only Dat object the core and its adapters both name.

Trn — domain transformations in the core, plus the adapter's translation:

| Transformation | Signature | Kind |
| --- | --- | --- |
| decide | Command → DocumentDraft | pure — core business rule |
| save | DocumentDraft → SaveResult | effect — port contract |
| pg_save | DocumentDraft → SaveResult | effect — Postgres adapter |
| mem_save | DocumentDraft → SaveResult | effect — in-memory adapter |

save is the contract t_from = DocumentDraft → t_to = SaveResult; pg_save and mem_save are parallel realisations of it — same signature, different code.

Trm — the core's only outward transmission lands on a port, never on a store:

| Transmission | Signature | Carries |
| --- | --- | --- |
| t_port | Core → Port | DocumentDraft |
| t_pg | PgAdapter → Postgres | SQL |
| t_mem | MemAdapter → Heap | DocumentDraft |

Loc — Core (the domain thread); Postgres, Heap, a test double's RAM — each adapter's concrete site. The port has no Loc: it is an interface, not a place.

Placement. The core is a component; each adapter is a separate component; the port is the shared Trm they glue along. decide and the core's t_port are TrnLoc/TrmCmps at Core; pg_save is a TrnLoc at Postgres inside PgAdapterCmp. The core's depends-on graph names the port — never an adapter:

graph LR
    Core["DomainCoreCmp<br/>TrnLoc: decide · TrmCmp: t_port"]
    Port["save port (slot)<br/>contract: Draft→SaveResult"]
    PG["PgAdapterCmp<br/>TrnLoc: pg_save"]
    Mem["MemAdapterCmp<br/>TrnLoc: mem_save"]
    Core -->|"t_port : DocumentDraft"| Port
    PG -.->|"realises (2-cell)"| Port
    Mem -.->|"realises (2-cell)"| Port
    style Core fill:#cf7fcf,color:#fff
    style Port fill:#f7c04f,color:#000
    style PG fill:#cf7fcf,color:#fff
    style Mem fill:#9a9a9a,color:#fff

The system is the composite System = DomainCoreCmp ⋈ PgAdapterCmp, glued along the port: a TrmCmp of the core and a TrmCmp of the adapter name the same Trm, opposite orientation, so in System that transmission becomes internal. Choosing MemAdapterCmp instead gives System′ = DomainCoreCmp ⋈ MemAdapterCmp — the core's placements are byte-identical in both. The core factors out of the choice:

behaviour(DomainCoreCmp) is invariant under which adapter is glued in.

The defining constraint, as a law. Start from layered: there the core (business layer) depends-on the data-access layer directly — a named edge BusinessCmp → DataAccessCmp, the core naming its concrete partner. Flip one constraint:

No TrmCmp of the core component has a c_to (or c_from) inside a concrete external component. Every boundary transmission of the core lands on a port — a Trm typed only by its carries datum. The depends-on graph of the core names ports, never adapters; the concrete component is supplied from outside and glued in along the port.

This tightens Coherence Law 4 in one specific way. Law 4 says a cross-location dependency must be mediated by a transmission; layered satisfies it with an edge straight to DataAccessCmp. Hexagonal forbids that edge from naming a concrete component at all — the mediating Trm is identified only by its contract type, and which component sits on the other end is decided by gluing, later, from outside. The core stops naming what it talks to. That is the style.

What the framework tells you.

  • It is exactly the strategy 2-category (§5 / deep-dive). A port is one Trn slot — the contract DocumentDraft → SaveResult. Every adapter is a parallel TrnLoc realising that one signature: pg_save and mem_save are parallel 1-cells over the same contract, and picking one is selecting a 2-cell. Hexagonal is not a special architecture — it is the strategy 2-category applied to a component's boundary transmissions.

  • Testability is structural, not a virtue. Because the core glues to a port and not an adapter, DomainCoreCmp ⋈ TestDoubleAdapterCmp is a valid composite with zero change to the core's placements (the invariance equation above). "The core is unit-testable in isolation" is just: the core composes with any component whose port-TrmCmp matches the contract.

  • Extensibility is adjoining a component. A new backing technology is a new AdapterCmp with its t_from → t_to TrnLoc and one concrete Trm — a new parallel arrow into the existing port. No existing placement changes; the core cannot observe the addition.

  • The honest cost — adapters are real components. Every port needs at least one adapter, and each adapter is its own bundle of placements with its own concrete Trm to maintain. The indirection that buys swappability also multiplies components; for a system with one database forever, that ceremony earns nothing. The method shows the price on the same diagram as the gain.
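The invariance of the core under adapter choice can be sketched with a structural interface. The names `SavePort`, `MemAdapter`, `RecordingAdapter` and `core`, and the title-casing rule inside `decide`, are hypothetical; a Postgres adapter is left out on purpose, since it would have the same `save` signature and differ only in its body.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class DocumentDraft:
    title: str


@dataclass(frozen=True)
class SaveResult:
    ok: bool
    where: str


class SavePort(Protocol):                       # the port: a contract, not a place
    def save(self, draft: DocumentDraft) -> SaveResult: ...


class MemAdapter:                               # in-memory realisation of the port
    def __init__(self):
        self.heap = []

    def save(self, draft: DocumentDraft) -> SaveResult:
        self.heap.append(draft)
        return SaveResult(True, "heap")


class RecordingAdapter:                         # test double: counts calls only
    def __init__(self):
        self.calls = 0

    def save(self, draft: DocumentDraft) -> SaveResult:
        self.calls += 1
        return SaveResult(True, "test-double")


def decide(command: str) -> DocumentDraft:      # pure core rule (invented for the example)
    return DocumentDraft(title=command.strip().title())


def core(command: str, port: SavePort) -> SaveResult:
    # The core names the port only; which adapter arrives is decided by the caller.
    return port.save(decide(command))


mem, double = MemAdapter(), RecordingAdapter()
result_mem = core("  quarterly report ", mem)
result_double = core("  quarterly report ", double)
```

The same `core` function runs unchanged against both adapters; only the caller's choice of argument differs, which is the gluing step done from outside.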

Contrast — layered keeps the direct edge: the core names the data-access layer — a depends-on edge straight to DataAccessCmp. Hexagonal cuts that edge and routes the dependency through a port, so the concrete partner becomes a pluggable, parallel component glued in from outside.

Serverless / FaaS

You write a function and hand it to the platform. There is no server you rent, no process you keep alive. A request arrives, the platform spins up a container, runs your function once, tears the container down. The next request gets a fresh one. Between invocations nothing of yours is running anywhere.

The four atoms. Serverless constrains the lifetime of Loc — and that single change to one atom cascades into everything else. The transformations and their types are whatever the domain needs.

Dat — ordinary domain types; what matters is where each is materialised. The function's input arrives on the wire; everything else lives in a managed store.

Trn — one placed handler plus the reads it is forced into:

| Transformation | Signature | Kind |
| --- | --- | --- |
| handle | Request → Response | effect — the function body |
| loadState | Key → Record | effect — read from the managed store |
| connect | () → Connection | effect — re-opened every invocation |

Loc — two kinds, and the split is the whole style. EphLoc is the per-invocation container: created when the request arrives, destroyed when handle returns. StateLoc is the managed database / object store — long-lived, owned by the platform, never your function. A third location, EdgeLoc, is only the trigger source.

Trm — every datum reaches the function over a wire, on every call:

| Transmission | Signature | Carries |
| --- | --- | --- |
| t_invoke | EdgeLoc → EphLoc | Request |
| t_load | StateLoc → EphLoc | Record |
| t_write | EphLoc → StateLoc | Record |
| t_return | EphLoc → EdgeLoc | Response |

Placement. handle is a TrnLoc at EphLoc inside FunctionCmp — but the TrnLoc row itself is transient: it is born with the container and discarded when it dies. StateCmp owns the only persistent DataLoc; FunctionCmp owns no DataLoc at all.

graph LR
    Ed["EdgeCmp<br/>(trigger)"]
    Fn["FunctionCmp<br/>TrnLoc: handle @ EphLoc<br/>(transient — no DataLoc)"]
    St["StateCmp<br/>DataLoc: Record @ StateLoc<br/>(persistent)"]
    Ed -->|"t_invoke : Request"| Fn
    St -->|"t_load : Record"| Fn
    Fn -->|"t_write : Record"| St
    Fn -.->|"t_return : Response"| Ed
    style Ed fill:#cf7fcf,color:#fff
    style Fn fill:#cf7fcf,color:#fff
    style St fill:#cf7fcf,color:#fff

Because EphLoc does not survive the call, the placement of handle cannot be composed forward in time — the next request rebuilds it. Every input handle consumes must arrive by transmission within the one invocation window:

t_load ; handle ; t_write : StateLoc → EphLoc → StateLoc

State leaves and re-enters StateLoc every call; it never rests at EphLoc. The components glue along the four boundary transmissions — System = EdgeCmp ⋈ FunctionCmp ⋈ StateCmp — and FunctionCmp's contribution to data(System) is empty: it owns behaviour, never storage.

The defining constraint, as a law. Start from a long-lived server: one component, one stable Loc holding its transformations and its data — caches, pools, indexes — resident across requests. Flip one constraint — drop the lifetime of Loc to a single invocation:

FunctionCmp owns no persistent Loc. Its EphLoc exists only for one t_invoke ; handle ; t_return cycle, so there is no DataLoc at tl_loc(handle).

That makes Coherence Law 1 (Placement honesty) bite at its hardest. Law 1 says a transformation reads only data materialised at or transmitted to its location. On a warm server you satisfy the "materialised at" clause for free — the location persisted, so the cache is already there. In FaaS that clause is structurally unavailable: there is no warm DataLoc at EphLoc. Only the "transmitted to" clause remains. Cold starts and statelessness are not gotchas — they are this missing DataLoc, read straight off the diagram.

What the framework tells you.

  • All persistent data must live in a separate, long-lived DataLoc. Since FunctionCmp owns no durable Loc, the only place a Record can rest is StateLoc in StateCmp — a managed DB or object store. The split between EphLoc and StateLoc is forced, not a design choice.

  • Every datum is transmitted in, every invocation. With the "materialised at" clause of Law 1 gone, each input to handle must be satisfied by a Trm with c_to = EphLoc. Load the row, open the connection, fetch the config — t_load and connect re-fire on every call because the diagram leaves no shortcut.

  • The strategy axis is the only thing left to scale. A spike in load adjoins more parallel EphLocs, each with its own transient TrnLoc over the same handle Trn (§5). No capacity to plan, nothing to keep warm — the framework's extensibility move is the platform's autoscaler.

  • The honest tradeoff is one row of DataLoc. You give up every optimisation that depended on a warm location — pooling, in-memory caches, sticky state — in exchange for owning no Loc at all: billing drops to zero between calls because there is nothing placed to bill for.
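The load, handle, write cycle can be mimicked in plain Python, with a module-level dict standing in for the managed store and a function's local variables standing in for EphLoc. This is only an analogy under assumptions: `STATE_LOC`, `load_state`, `write_state` and the counter record are hypothetical, and no real FaaS platform API is used.

```python
STATE_LOC = {"counter": 0}          # stands in for the managed store (StateLoc)


def load_state(key):                # t_load : StateLoc -> EphLoc
    return STATE_LOC[key]


def write_state(key, record):       # t_write : EphLoc -> StateLoc
    STATE_LOC[key] = record


def handle(request):
    """One invocation: every input arrives by transmission, nothing is kept warm."""
    scratch = {}                                    # lives at EphLoc, gone on return
    scratch["counter"] = load_state("counter")      # reloaded on every call
    scratch["counter"] += request["increment"]
    write_state("counter", scratch["counter"])      # state returns to StateLoc
    return {"status": 200, "counter": scratch["counter"]}


first = handle({"increment": 2})
second = handle({"increment": 3})
```

The second call sees the first call's effect only because it was written back to `STATE_LOC`; the `scratch` dict from the first call no longer exists by then.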

Contrast — client–server pins each component to a durable Loc that holds both its TrnLocs and its DataLocs for the life of the system; peer-to-peer does the same for every peer. Serverless is the same diagram with one constraint flipped — Loc lifetime collapses to a single invocation — so the DataLoc at the function's location vanishes and all state is exiled to a persistent store.

Peer-to-peer

No server. Every node runs the same software and plays the same role: each one both asks other nodes for data and answers their requests. A file-sharing swarm, a blockchain network, a gossip cluster — there is no privileged box in the middle, and no centre to lose.

The four atoms. Peer-to-peer constrains the components — there is only one type of them — and the transmissions — who may open one. The transformations and their types are whatever the domain needs.

Dat — one domain type Resource, materialised at every peer; the style fixes only that all those copies are the same Dat object, with no privileged one:

| Morphism | Signature | Semantics |
| --- | --- | --- |
| r_id | Resource → Id | the identity all peers agree on |
| r_body | Resource → Json | the resource's payload |
| r_rev | Resource → Revision | which version a given copy holds |

Trn — ordinary domain transformations, the same set placed at every peer:

| Transformation | Signature | Kind |
| --- | --- | --- |
| serve | Query → Resource | effect — answer a peer, hits the local store |
| request | Query → Resource | effect — ask a peer for what's missing |
| merge | Resource × Resource → Resource | pure — reconcile two copies by r_rev |
| locate | Id → Peer* | effect — which peers hold this datum |

Loc — one location kind, Peer, instantiated n times: Peer₁, Peer₂, …, Peerₙ, each with its own store. Trm — one transmission kind, and it is symmetric:

| Transmission | Signature | Carries |
| --- | --- | --- |
| t_ask | Peerᵢ → Peerⱼ | Query |
| t_send | Peerⱼ → Peerᵢ | Resource |

The signatures are quantified over all i, j — t_ask exists for every ordered pair, in both directions.

Placement. Every component is an instance of one type, PeerCmp. Each instance places the identical TrnLoc set — serve, request, merge, locate — at its own Peer location, and owns one Resource DataLoc there. No DataLoc is grey; none is authoritative.

graph LR
    A["PeerCmp #1<br/>TrnLoc: serve, request, merge, locate"]
    DA["Resource DataLoc<br/>(replica, @ Peer₁)"]
    B["PeerCmp #2<br/>TrnLoc: serve, request, merge, locate"]
    DB["Resource DataLoc<br/>(replica, @ Peer₂)"]
    C["PeerCmp #3<br/>TrnLoc: serve, request, merge, locate"]
    DC["Resource DataLoc<br/>(replica, @ Peer₃)"]
    DA -->|"dl_cmp"| A
    DB -->|"dl_cmp"| B
    DC -->|"dl_cmp"| C
    A <-->|"t_ask / t_send : Query ⇄ Resource"| B
    B <-->|"t_ask / t_send : Query ⇄ Resource"| C
    A <-->|"t_ask / t_send : Query ⇄ Resource"| C
    style A fill:#cf7fcf,color:#fff
    style B fill:#cf7fcf,color:#fff
    style C fill:#cf7fcf,color:#fff
    style DA fill:#4f8cf7,color:#fff
    style DB fill:#4f8cf7,color:#fff
    style DC fill:#4f8cf7,color:#fff

An exchange is still a composite, but it is not directional — either end may open it:

t_send ∘ t_ask : Peerᵢ → Peerⱼ → Peerᵢ — for every ordered pair (i, j)
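To make the round trip concrete, here is a small sketch of one component type whose instances can each open an exchange with any other. Several things are assumptions for illustration only: a Query is modelled as a plain resource id, an in-process method call stands in for the network, serve returns None when the peer has no copy, and merge keeps the higher integer revision.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Resource:
    r_id: str
    r_body: str
    r_rev: int  # assumption: integer revisions


@dataclass
class PeerCmp:
    """The single component type: every instance places the same transformations."""
    name: str
    store: Dict[str, Resource] = field(default_factory=dict)  # this peer's own DataLoc

    def serve(self, query: str) -> Optional[Resource]:
        # Answer another peer from the local store only.
        return self.store.get(query)

    def request(self, other: "PeerCmp", query: str) -> Optional[Resource]:
        # t_ask carries the Query to `other`; t_send carries the Resource back.
        got = other.serve(query)
        if got is not None:
            mine = self.store.get(query, got)
            self.store[query] = self.merge(mine, got)
        return got

    @staticmethod
    def merge(a: Resource, b: Resource) -> Resource:
        # Keep the copy with the higher revision.
        return a if a.r_rev >= b.r_rev else b


if __name__ == "__main__":
    p1, p2 = PeerCmp("Peer1"), PeerCmp("Peer2")
    p2.store["a"] = Resource("a", "hello", 1)
    p1.store["b"] = Resource("b", "world", 3)
    print(p1.request(p2, "a"))  # Peer1 opens the exchange
    print(p2.request(p1, "b"))  # Peer2 opens the exchange, using the same code
```

Nothing in the class says who is the client. The direction of an exchange is decided only by which instance calls request.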

The components glue along whichever transmissions are live (System = PeerCmp ⋈ PeerCmp ⋈ … ⋈ PeerCmp), and because the factors are all the same interface, the gluing is symmetric: ⋈ has no preferred operand, and depends-on is an undirected graph.

The defining constraint, as a law. Start from client–server: two distinct component types, an asymmetric initiation relation — the client always initiates, the server always serves — and one authoritative DataLoc at the server. Flip one constraint — erase the role distinction:

There is exactly one component type PeerCmp; every component is an instance of it, so the placement sets are equal up to location. And initiation is symmetric: ∀ i, j. (∃ τ. c_from(τ) = Peerᵢ ∧ c_to(τ) = Peerⱼ) ⟺ (∃ ρ. c_from(ρ) = Peerⱼ ∧ c_to(ρ) = Peerᵢ). No node is solely an initiator and none solely a responder.
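The symmetry half of that law is simple enough to check mechanically. The sketch below assumes the input lists only initiating transmissions (the t_ask kind, with replies left out), each reduced to a (c_from, c_to) pair of location names. The function name and that encoding are mine, not part of the method.

```python
from typing import Iterable, Set, Tuple

Transmission = Tuple[str, str]  # (c_from, c_to), both location names


def initiation_is_symmetric(transmissions: Iterable[Transmission]) -> bool:
    """True when every i -> j initiation has a matching j -> i initiation."""
    edges: Set[Transmission] = set(transmissions)
    return all((dst, src) in edges for (src, dst) in edges)


if __name__ == "__main__":
    p2p = [("Peer1", "Peer2"), ("Peer2", "Peer1"),
           ("Peer1", "Peer3"), ("Peer3", "Peer1")]
    client_server = [("Client", "Server")]  # only the client ever initiates
    print(initiation_is_symmetric(p2p))
    print(initiation_is_symmetric(client_server))
```

This covers only the initiation clause. The other half of the law, that every component is an instance of one type, would need the placement sets themselves and is not modelled here.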

This is client–server's directed-initiation law relaxed back to symmetry, with its single authoritative DataLoc deleted. Coherence Law 4 still holds (every cross-location dependency is mediated by a Trm), but the mediation is no longer one-directional: depends-on carries an edge PeerᵢCmp → PeerⱼCmp for every pair of distinct peers i ≠ j, and the reverse edge too.

What the framework tells you.

  • Scaling is adjoining an instance of an existing type. A new peer is one more PeerCmp — no new component type, no new transformation, no central node to reconfigure. The system grows by replicating an object the diagram already contains.

  • Resilience falls out of the symmetry. No component is privileged, so deleting any single PeerCmp leaves a smaller graph of the same shape. No node's loss is structurally different from any other's — there is no single point of failure, by construction, not by redundancy bolted on.

  • The honest cost — no authoritative DataLoc. Client–server gets consistency cheaply: one store, one source of truth. P2P has n DataLocs over one Dat and no canonical one, so Law 1 ("data read here was transmitted here") can only be satisfied for the whole network by a consensus or gossip protocol — merge run pairwise until the replicas agree. That protocol is the price of having no centre, and the diagram shows it.

  • Discovery is its own transformation. With no central registry, finding which peer holds a datum is the locate : Id → Peer* transformation — itself transmission-heavy. Client–server never needs it; the method makes the extra cost visible.
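The cost named in the third point, merge run pairwise until the replicas agree, can be sketched as a toy gossip loop. This is an illustration under simplifying assumptions of mine, not a real protocol: replicas hold only integer revisions per resource id (bodies omitted), revision 0 means "not held", peers are paired at random, and nothing is written while gossip runs.

```python
import random
from typing import Dict, List

Replica = Dict[str, int]  # resource id -> revision held by this replica


def gossip_round(replicas: List[Replica], rng: random.Random) -> None:
    """One pairwise exchange: two random peers merge, keeping the higher revision per id."""
    a, b = rng.sample(replicas, 2)
    for rid in set(a) | set(b):
        newest = max(a.get(rid, 0), b.get(rid, 0))  # assumption: 0 means "not held"
        a[rid] = newest
        b[rid] = newest


def rounds_until_agreement(replicas: List[Replica], seed: int = 0, limit: int = 10_000) -> int:
    """Run pairwise gossip until every replica is equal; return the rounds used."""
    rng = random.Random(seed)
    rounds = 0
    while any(r != replicas[0] for r in replicas):
        if rounds >= limit:
            raise RuntimeError("replicas did not converge within the limit")
        gossip_round(replicas, rng)
        rounds += 1
    return rounds


if __name__ == "__main__":
    replicas = [{"x": 3}, {"x": 1, "y": 2}, {}]
    print(rounds_until_agreement(replicas, seed=1), replicas)
```

The loop has no coordinator and no privileged replica. Agreement is reached only by repeated transmissions, which is the extra traffic the diagram makes visible.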

Contrast — client–server is the same picture with the role split restored: two component types instead of one, an asymmetric initiation relation, and a single authoritative DataLoc the server alone owns.

This is how Guliel is built

None of this is a hobby. It's the working method behind Guliel — the financial operations platform my team builds — and the consolidations it produced are the proof.

Five tables that were never five objects. Guliel started, like every system does, with nouns: customers, suppliers, invoices, expenses, orders. Five tables, five "objects" — the object-oriented instinct, applied to the schema. Modelling it categorically dissolved that. Customer and Supplier aren't objects at all; they are roles on one directed edge between parties — so they collapsed into a single Counterparty. An Expense is not its own object either: it is a Document you received instead of issued — the same object with direction = INCOMING. A purchase Order is a Document with documentType = PURCHASE_ORDER. Five "objects" reduced toward two — and every collapse was deduced, by one rule ("if X is Y plus morphisms, X is not new"), and written down as a checkable reasoning trail. Not a refactor we stumbled into after the code hurt. A deduction we made on the page.
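As a rough illustration of that collapse (not Guliel's actual schema, which this post does not show), here is what "Expense is a Document with direction = INCOMING" looks like as types. The OUTGOING and INVOICE members, the amount_cents field and the helper names are assumptions added so the example runs.

```python
from dataclasses import dataclass
from enum import Enum


class Direction(Enum):
    OUTGOING = "OUTGOING"  # assumed counterpart to INCOMING
    INCOMING = "INCOMING"


class DocumentType(Enum):
    INVOICE = "INVOICE"    # assumed example type
    PURCHASE_ORDER = "PURCHASE_ORDER"


@dataclass(frozen=True)
class Counterparty:
    name: str  # one object; customer and supplier are roles, not separate types


@dataclass(frozen=True)
class Document:
    counterparty: Counterparty
    direction: Direction
    document_type: DocumentType
    amount_cents: int


def is_expense(doc: Document) -> bool:
    # "Expense" is a predicate on Document, not an object of its own.
    return doc.direction is Direction.INCOMING


def is_purchase_order(doc: Document) -> bool:
    # Likewise, an order is a Document with a particular document type.
    return doc.document_type is DocumentType.PURCHASE_ORDER


if __name__ == "__main__":
    acme = Counterparty("Acme")
    bill = Document(acme, Direction.INCOMING, DocumentType.INVOICE, 1200)
    print(is_expense(bill))
```

Each former noun survives only as a function over Document. That is the "X is Y plus morphisms" rule written as code.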

A dispute is not a table. When we designed per-document disputes, the noun instinct said: a dispute is a thing, give it a Dispute table. The category said otherwise. A document carries an append-only log of events; its current state is a fold over that log. A "dispute" is simply a sub-object of that log — the sub-sequence of dispute-typed events. Storing a Dispute table would have stored a morphism that already exists as a composition: a redundant edge, a diagram that doesn't commute. So there is no Dispute table. The category told us not to build one.
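A small sketch of that argument: the current state is a fold over the event log, and the dispute is a filter over the same log, so neither needs storage of its own. The event kinds, state names and transition table below are hypothetical, chosen only to make the example run.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List


@dataclass(frozen=True)
class Event:
    kind: str       # hypothetical kinds: ISSUED, DISPUTE_OPENED, DISPUTE_RESOLVED, PAID
    note: str = ""


def dispute_events(log: List[Event]) -> List[Event]:
    """The 'dispute' is the sub-sequence of dispute-typed events, in log order."""
    return [e for e in log if e.kind.startswith("DISPUTE_")]


def step(state: str, event: Event) -> str:
    # Hypothetical transition table; unknown event kinds leave the state unchanged.
    transitions = {
        "ISSUED": "open",
        "DISPUTE_OPENED": "disputed",
        "DISPUTE_RESOLVED": "open",
        "PAID": "paid",
    }
    return transitions.get(event.kind, state)


def current_state(log: List[Event]) -> str:
    """Current state = left fold of `step` over the append-only log."""
    return reduce(step, log, "draft")


if __name__ == "__main__":
    log = [
        Event("ISSUED"),
        Event("DISPUTE_OPENED", "wrong amount"),
        Event("DISPUTE_RESOLVED"),
        Event("PAID"),
    ]
    print(current_state(log), [e.kind for e in dispute_events(log)])
```

Because both functions read the same log, a separate Dispute table could only duplicate what they already compute, and could then drift out of agreement with it.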

And we build Guliel itself with AI — including, at times, posts like this one. The categorical specs are exactly what make that safe: the olog is the contract the AI generates against, and the contract we verify its output against. That is how AI accelerates the work instead of quietly eroding it.

If you're the kind of engineer who read this far, there's a part of Guliel built for you: a fully typed REST API and an MCP server, so you can wire your own financial operations — issue documents, pull reports, reconcile expenses — straight from your own code or an AI agent. The same method that keeps our architecture honest is what makes that surface clean enough to hand to you.

Explore the Guliel API & MCP →

— Sapir

Tags: architecture, software-design, category-theory, ai, engineering