Software Architecture Has No Rigor — and AI Feeds on It
Every serious engineering field can verify a design before it's built. Software architecture can't, and an AI generating against an unverifiable spec will hallucinate, duplicate, and reorder with impunity. Here's the method I use at Guliel: model software as what it actually is (data, transformations, and a place to run) and turn that model into a category you can check.
Every other engineering field can check its work
Design a circuit and you can interrogate it. Kirchhoff's laws either hold or they don't; the design is a formal object and "correct" is computed, before a single component is soldered. Structural engineering has its load calculations. Network design has its capacity and routing proofs. Each of those fields has a notion of a design being verifiably wrong — a question you can ask the design itself, on paper, and get a real answer.
Software architecture has nothing of the kind.
I spent years feeling this before I could name it. Every time I designed with boxes and arrows — UML, an architecture diagram, a PRD full of rectangles — something felt off. Not that the diagrams were ugly. It was that my actual reasoning about the system — this composes with that, this would be redundant, this boundary is going to leak — none of it fit on the diagram. The diagram could record a conclusion but not the deduction that produced it. And a picture that holds conclusions but not reasoning can never tell you a conclusion is wrong.
So I went looking. I read the academic software-architecture literature hoping for something rigorous, and found the opposite: dozens of definitions of "architecture," "component," "connector," each subtly different, most unusable, and across all of them no deduction rules — nothing that lets you start from a design and derive a property. You were left to draw, ship, and discover what should have been refactored only once the code existed and the mistake had teeth.
Then I started modelling systems as categories, and the missing thing appeared. A category is not a picture of a system; it is a system you can compute with. Its diagrams have laws. Some paths through them must be equal — they must commute — and that is a fact you can check. An architecture stops being a sketch you defend in a meeting and becomes an object you can be wrong about, on paper, before the code exists.
And there is a reason this stopped being a craftsman's preference and became urgent. We now hand these specs to AI. An unverifiable spec was a slow, quiet tax when humans wrote all the code. Handed to a machine that generates against it at scale, it is something else entirely. I'll come back to that — it is the sharpest argument for everything here.
How I actually design — from the outside in
First, the workflow — because the parts only make sense in the order you'd actually meet them.
I design from the outside in. Not from the database. From the human.
1. Requirements. What does the user need to accomplish?
2. Interaction surface. How will they touch it: a screen, a phone, an API call, an email? This is a physical decision, and it comes early, because it constrains everything after it.
3. Result data. Given that interaction, what data structure makes it feel effortless? This is the shape the screen wants, designed before, and independently of, how that data is produced.
4. Source data + transformations. Now: what do I need to store, and what algorithms turn the stored shape into the result shape?
5. Locations. Where does each of those things physically live and run: which server, which thread, the browser, the database?
6. Components. Finally, bundle all of that into modules a team can own.
Step 3 — the result data — comes before step 4, the source data, and that is deliberate. The shape the interface wants is pulled from the interaction; the source data and the code are pushed to produce it. Designing the database first and hoping a nice screen falls out is how you get screens that fight their own data.
That workflow kept producing good systems, so eventually I asked the obvious question: what is it actually made of? Strip the process away, and underneath there are only four things.
The four atoms
A running program does exactly two things, and only two. It holds and transforms data, and it moves that data between physical places. Four atomic parts, in two pairs — and each one is a category: a set of objects together with the structure-preserving maps, the morphisms, between them. That is what will make the whole design checkable. The precise version is one section down; the names first.
The logical pair — what the software means:
- Data (Dat): its objects are the data types; its morphisms are the stored, structural relations between them, such as a field or a foreign key.
- Transformations (Trn): its objects are the algorithms that rewrite data. Pure meaning; no notion of where.
The physical pair — what the software occupies:
- Locations (Loc): the physical sites where code runs and data rests: a thread, a core, a region of RAM, a disk, a network card.
- Transmissions (Trm): moving one piece of data across a boundary, from one location to another.
The logical pair is the software in a vacuum. The physical pair is what drags it into the real world — because software in a vacuum manifests nothing. No screen, no input, no interface. An interface exists only because data is transmitted to a physical boundary a human can see or touch. That is not a detail; it is half of what an architecture is — and the half that boxes-and-arrows diagrams quietly omit.
graph LR
subgraph Logical["Logical — what it means"]
Data["Dat — data"]
Transformations["Trn — transformations"]
end
subgraph Physical["Physical — what it occupies"]
Locations["Loc — locations"]
Transmissions["Trm — transmissions"]
end
Transformations -->|"in"| Data
Transformations -->|"out"| Data
Transmissions -->|"from"| Locations
Transmissions -->|"to"| Locations
Transmissions -.->|"carry"| Data
style Data fill:#4f8cf7,color:#fff
style Transformations fill:#7fc47f,color:#000
style Locations fill:#f77f7f,color:#fff
style Transmissions fill:#7fc4c4,color:#000

The categorical machinery
Those four names — Dat, Trn, Loc, Trm — plus the two that complete the
method, placement and component, are not metaphors. Each is a real
category; the relations among them are real functors. Pinning that down now,
before the examples, is deliberate: it is what lets the rest of this article
say "morphism," "commute," or "this reduces to that" and mean something you can
check — not architecture jargon dressed in Greek letters.
You do not strictly need the formalism to follow the method. But this is written for engineers, and the whole point is rigor, so it belongs here, in front of the examples — not in an appendix. Expand it now, or read the article through and expand it when a term first bites. It opens with the definitions to keep in hand — the notation the whole style gallery below runs on — and ends with a worked reduction you can copy.
Deep dive: the categorical machinery
The loose words in the main post — relation, compose, deduced, can be wrong — each have an exact counterpart. The exact language is category theory, and it is worth the page because it turns "good architecture" from taste into something you can check. This section is the reference: read it once, and every style in the gallery below reads in the same notation.
Definitions to keep in mind
Six terms. The gallery uses nothing else — keep them in hand and skip back here when one bites.
- Category: a set of objects and arrows (morphisms) between them. Arrows compose: if f : A → B and g : B → C then g ∘ f : A → C exists; composition is associative; every object has an identity arrow. That is the entire definition. Anything shaped like "things, and structure-preserving ways to get from one to another" is a category.
- Morphism signature: f : A → B. Always write the source and the target; the signature is half the content.
- Partial morphism: f? : A → B, defined on only some of A. It is how optionality (a nullable field, a value present only in one case) is written.
- Commuting diagram: a diagram commutes when any two paths with the same start and end are equal as morphisms: g ∘ f = h. Commuting is the property you check. A picture cannot fail to commute; a category can.
- Functor: a structure-preserving map between categories: it sends objects to objects and arrows to arrows, and preserves composition and identities.
- Free monoid A*: the finite sequences of A under concatenation: a log, a history, an append-only stream.

One more, used only for cross-cutting concerns: a natural transformation is a uniform family of morphisms relating two parallel functors — one arrow per object, every square commuting. "The same wrapper, applied everywhere."
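
The definitions above can be made concrete in a few lines of TypeScript. This is a minimal sketch under stated assumptions: the `Morphism` record, `compose`, `identity`, `commutesOn`, and the toy `Order` / `Customer` / `Region` types are hypothetical names introduced for illustration, not notation from the method itself.

```typescript
// A morphism carries its signature (source and target) along with the map.
// All names in this block are illustrative assumptions.
type Morphism<A, B> = {
  source: string;
  target: string;
  apply: (a: A) => B;
};

// The identity arrow on an object.
function identity<A>(obj: string): Morphism<A, A> {
  return { source: obj, target: obj, apply: (a) => a };
}

// g ∘ f exists only when f's target is g's source.
function compose<A, B, C>(g: Morphism<B, C>, f: Morphism<A, B>): Morphism<A, C> {
  if (f.target !== g.source) {
    throw new Error(`cannot compose: ${f.target} is not ${g.source}`);
  }
  return { source: f.source, target: g.target, apply: (a) => g.apply(f.apply(a)) };
}

// Toy data: Order -> Customer -> Region, plus a direct Order -> Region field.
type Region = "EU" | "US";
type Customer = { region: Region };
type Order = { customer: Customer; shipRegion: Region; coupon?: string };

const customerOf: Morphism<Order, Customer> = {
  source: "Order", target: "Customer", apply: (o) => o.customer,
};
const regionOf: Morphism<Customer, Region> = {
  source: "Customer", target: "Region", apply: (c) => c.region,
};
const shipRegion: Morphism<Order, Region> = {
  source: "Order", target: "Region", apply: (o) => o.shipRegion,
};
// A partial morphism: defined on only some Orders.
const couponOf: Morphism<Order, string | undefined> = {
  source: "Order", target: "Coupon", apply: (o) => o.coupon,
};

// Two parallel paths commute on a sample when they agree on every element.
function commutesOn<A, B>(p: Morphism<A, B>, q: Morphism<A, B>, samples: A[]): boolean {
  return samples.every((s) => p.apply(s) === q.apply(s));
}

const viaCustomer = compose(regionOf, customerOf); // Order -> Region
const samples: Order[] = [
  { customer: { region: "EU" }, shipRegion: "EU" },
  { customer: { region: "US" }, shipRegion: "EU", coupon: "X1" },
];
```

Checking `commutesOn(viaCustomer, shipRegion, samples)` fails on the second sample, which is the point: the two paths from Order to Region are not the same morphism. Agreement on samples is a test, not a proof, but it is already a question a drawing cannot be asked.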
The four atoms are four categories
A running system holds and transforms data, and moves it between places. Those are four categories.
- Dat: objects are data types (entities, plus primitives ℝ, 𝕊, 𝔹, Date); morphisms are the stored, structural relations between them, such as a field or a foreign key. Rows are products ×, tagged unions are sums ⊕, a history is a free monoid A*.
- Trn: a transformation (an algorithm). Here is the first sharp move: a transformation is an object, not an arrow, because the next step places it, and you place objects, not arrows. Each carries two projection morphisms into Dat:

  | Morphism | Signature | Semantics |
  |---|---|---|
  | t_from | Trn → Dat | the input type the transformation consumes |
  | t_to | Trn → Dat | the output type it produces |

  The algorithms still compose (that structure is recovered as the free category on these objects), but now each is a thing you can point at. Effectful transformations are marked ⊸.
- Loc: objects are physical execution sites: a thread, a core, a RAM region, a disk, a NIC, and composites of them (Browser, RouteHandler, Postgres). Morphisms are adjacency: "can hand off directly to."
- Trm: a transmission (one datum crossing a location boundary). Like a transformation it is an object, with three projections:

  | Morphism | Signature | Semantics |
  |---|---|---|
  | c_from | Trm → Loc | source location |
  | c_to | Trm → Loc | target location |
  | carries | Trm → Dat | the datum on the wire; there is no untyped transmission |

Dat and Trn are the logical pair — what the software means. Loc and
Trm are the physical pair — what it occupies. An architecture is the
application of the logical onto the physical.
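
As a rough illustration of "a transformation is an object with projections", the records below carry `t_from` / `t_to` and `c_from` / `c_to` / `carries` as plain fields. It is a sketch, not the method's implementation: representing `Dat` and `Loc` objects as strings, the `effectful` flag, and the example names (`validateOrder`, `priceOrder`, `submitDraft`) are all assumptions made for brevity.

```typescript
// Assumption: Dat and Loc objects are identified by name only.
type Dat = string;
type Loc = string;

// A transformation is an object with two projections into Dat.
type Trn = { id: string; t_from: Dat; t_to: Dat; effectful: boolean };

// A transmission is an object with three projections. `carries` is a
// required field, so an untyped transmission cannot be written down.
type Trm = { id: string; c_from: Loc; c_to: Loc; carries: Dat };

// Hypothetical example system.
const validateOrder: Trn = { id: "validateOrder", t_from: "OrderDraft", t_to: "Order", effectful: false };
const priceOrder: Trn = { id: "priceOrder", t_from: "Order", t_to: "PricedOrder", effectful: false };
const submitDraft: Trm = { id: "submitDraft", c_from: "Browser", c_to: "RouteHandler", carries: "OrderDraft" };

// Two transformations can be chained when the first's output type is the
// second's input type.
function composable(first: Trn, second: Trn): boolean {
  return first.t_to === second.t_from;
}

// A transmission must cross a real boundary.
function crossesBoundary(t: Trm): boolean {
  return t.c_from !== t.c_to;
}
```

Because each transformation is a value rather than an arrow, it can be listed, placed, and queried, which is what the placement section relies on.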
Placement is a span, not a function
That application is the key correction. When you first formalise it, you write
"every transformation has a location" — a function runsAt : Trn → Loc. It is
false. A validation runs in the Browser and on the RouteHandler. A render
runs server-side and client-side. One transformation, many locations.
So "where does T run" is not a function — it is a relation. The fix is to
reify the pairing: a placement is its own object, with projection morphisms.
| Placement | Projections | Meaning |
|---|---|---|
| TrnLoc | tl_trn, tl_loc, tl_cmp | a transformation deployed at a location, inside a component |
| DataLoc | dl_data, dl_loc, dl_cmp | a datum materialised at a location, inside a component |
| | | a transmission used by a component |

Each is a span — an apex object with arrows out to the things it relates.
Because a placement is its own object, one transformation can be the target of
many TrnLoc projections — one per place it runs.
graph TB
TrnLoc["TrnLoc<br/>(a placement)"]
DataLoc["DataLoc<br/>(a placement)"]
Trn["Trn"]
Dat["Dat"]
Loc["Loc"]
Cmp["Component"]
TrnLoc -->|"tl_trn"| Trn
TrnLoc -->|"tl_loc"| Loc
TrnLoc -->|"tl_cmp"| Cmp
DataLoc -->|"dl_data"| Dat
DataLoc -->|"dl_loc"| Loc
DataLoc -->|"dl_cmp"| Cmp
style TrnLoc fill:#f7c04f,color:#000
style DataLoc fill:#f7c04f,color:#000
style Trn fill:#7fc47f,color:#000
style Dat fill:#4f8cf7,color:#fff
style Loc fill:#f77f7f,color:#fff
style Cmp fill:#cf7fcf,color:#fff

"Where does T run" is then the fibre { tl_loc(tl) : tl_trn(tl) = T }: the
locations of all placements that project to T. It may be empty (an unused
transformation), a singleton, or larger. runsAt was never a function.
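
The fibre is easy to compute once placements are records. A minimal sketch, assuming placements are stored as a flat array and objects are identified by name; the example component names are hypothetical:

```typescript
// A placement of a transformation: an apex record with three projections.
type TrnLoc = { tl_trn: string; tl_loc: string; tl_cmp: string };

// Hypothetical placements: `validate` is deployed twice.
const placements: TrnLoc[] = [
  { tl_trn: "validate", tl_loc: "Browser", tl_cmp: "CheckoutUI" },
  { tl_trn: "validate", tl_loc: "RouteHandler", tl_cmp: "CheckoutAPI" },
  { tl_trn: "render", tl_loc: "Browser", tl_cmp: "CheckoutUI" },
];

// The fibre over a transformation: every location some placement assigns it.
// The result is a list, because "where does T run" is a relation.
function runsAt(all: TrnLoc[], trn: string): string[] {
  return all.filter((tl) => tl.tl_trn === trn).map((tl) => tl.tl_loc);
}
```

Here `runsAt(placements, "validate")` yields two locations, `runsAt(placements, "render")` yields one, and a transformation with no placement yields an empty list.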
Components compose
A component (Cmp) is a cohesive bundle of placements sharing one Cmp
value. What it owns is deduced — never stored twice — as the fibres of the
projections: data(c) is the DataLocs with dl_cmp = c, behaviour(c) the
TrnLocs with tl_cmp = c, locs(c) the locations they occupy.
Components compose. Two of them glue along a shared transmission — a port of
one and a port of the other naming the same Trm — and in the composite
C = A ⋈ B that transmission becomes internal. Gluing is associative and has
an identity (the empty pass-through component), so components form a category
Comp: objects are interfaces, morphisms are components, composition is
port-matching. This is the formal content of "a system is made of services" — a
composed service's data, behaviour and locations are computed from its parts,
not redrawn by hand.
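
A sketch of both halves of this section: ownership deduced as fibres, and gluing along a shared transmission. The names `dataOf`, `behaviourOf`, `locsOf`, `glue`, and the `Component` record with `ports` and `internal` lists are assumptions chosen for illustration; ports are identified by transmission name.

```typescript
type TrnLoc = { tl_trn: string; tl_loc: string; tl_cmp: string };
type DataLoc = { dl_data: string; dl_loc: string; dl_cmp: string };

// Hypothetical placements for a small checkout system.
const trnLocs: TrnLoc[] = [
  { tl_trn: "validate", tl_loc: "Browser", tl_cmp: "CheckoutUI" },
  { tl_trn: "validate", tl_loc: "RouteHandler", tl_cmp: "CheckoutAPI" },
  { tl_trn: "persist", tl_loc: "RouteHandler", tl_cmp: "CheckoutAPI" },
];
const dataLocs: DataLoc[] = [
  { dl_data: "OrderDraft", dl_loc: "Browser", dl_cmp: "CheckoutUI" },
  { dl_data: "Order", dl_loc: "Postgres", dl_cmp: "CheckoutAPI" },
];

const unique = (xs: string[]): string[] => xs.filter((x, i) => xs.indexOf(x) === i);

// What a component owns is computed from the placements, never stored again.
const behaviourOf = (c: string): string[] =>
  trnLocs.filter((t) => t.tl_cmp === c).map((t) => t.tl_trn);
const dataOf = (c: string): string[] =>
  dataLocs.filter((d) => d.dl_cmp === c).map((d) => d.dl_data);
const locsOf = (c: string): string[] =>
  unique(
    trnLocs.filter((t) => t.tl_cmp === c).map((t) => t.tl_loc)
      .concat(dataLocs.filter((d) => d.dl_cmp === c).map((d) => d.dl_loc)),
  );

// Gluing: ports naming the same transmission are matched and become internal.
type Component = { label: string; ports: string[]; internal: string[] };

function glue(a: Component, b: Component): Component {
  const shared = a.ports.filter((p) => b.ports.indexOf(p) !== -1);
  const open = a.ports.concat(b.ports).filter((p) => shared.indexOf(p) === -1);
  return {
    label: `${a.label} ⋈ ${b.label}`,
    ports: unique(open),
    internal: a.internal.concat(b.internal, shared),
  };
}

const ui: Component = { label: "CheckoutUI", ports: ["submitOrder"], internal: [] };
const api: Component = { label: "CheckoutAPI", ports: ["submitOrder", "notifyWarehouse"], internal: [] };
const checkout = glue(ui, api);
```

In the composite, `submitOrder` has moved from `ports` to `internal`, and only `notifyWarehouse` remains visible from outside.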
Coherence laws — where "wrong" lives
Because every relationship is a structure-preserving map, "the architecture is well-formed" becomes a handful of equations. The load-bearing one:
Law 1 — Placement honesty. For every placement of a transformation
Tat a locationL, each inputt_from(T)is either materialised atL(aDataLocover that type atL) or delivered toL(aTrmwithcarries = t_from(T)andc_to = L). No transformation reads data that is not present at its location.
Read backwards, it is a defect detector: a transformation reading data no
transmission carries to its location is a provable hole, not a style choice.
The others are the same idea — every transmission is typed and crosses a real
boundary (c_from ≠ c_to); a cross-location dependency is mediated by a Trm,
never a direct reach; a composed component's parts must not contradict each
other. An architecture is well-formed exactly when the maps between its
categories commute. A failed law names the missing piece — and that is the
whole reason to bother.
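
Law 1 translates almost word for word into a check. The sketch below assumes a flat in-memory model where every object is identified by name and each transformation has a single input type; `Model`, `placementHonesty`, and the example data are hypothetical.

```typescript
type Trn = { id: string; t_from: string; t_to: string };
type TrnLoc = { tl_trn: string; tl_loc: string };
type DataLoc = { dl_data: string; dl_loc: string };
type Trm = { c_from: string; c_to: string; carries: string };
type Model = { trns: Trn[]; trnLocs: TrnLoc[]; dataLocs: DataLoc[]; trms: Trm[] };

// Returns one message per placement that violates Law 1.
function placementHonesty(m: Model): string[] {
  const holes: string[] = [];
  m.trnLocs.forEach((tl) => {
    const trn = m.trns.filter((t) => t.id === tl.tl_trn)[0];
    if (!trn) {
      holes.push(`${tl.tl_trn}: placement refers to an unknown transformation`);
      return;
    }
    // Input is either stored at the location...
    const materialised = m.dataLocs.some(
      (d) => d.dl_data === trn.t_from && d.dl_loc === tl.tl_loc,
    );
    // ...or carried there by some transmission.
    const delivered = m.trms.some(
      (x) => x.carries === trn.t_from && x.c_to === tl.tl_loc,
    );
    if (!materialised && !delivered) {
      holes.push(`${trn.id} at ${tl.tl_loc}: input ${trn.t_from} is neither stored nor delivered there`);
    }
  });
  return holes;
}

// Hypothetical model: `validate` runs in two places, its input lives in one.
const trns: Trn[] = [{ id: "validate", t_from: "OrderDraft", t_to: "Order" }];
const broken: Model = {
  trns,
  trnLocs: [
    { tl_trn: "validate", tl_loc: "Browser" },
    { tl_trn: "validate", tl_loc: "RouteHandler" },
  ],
  dataLocs: [{ dl_data: "OrderDraft", dl_loc: "Browser" }],
  trms: [],
};
// Adding the missing transmission repairs the model.
const repaired: Model = {
  trns,
  trnLocs: broken.trnLocs,
  dataLocs: broken.dataLocs,
  trms: [{ c_from: "Browser", c_to: "RouteHandler", carries: "OrderDraft" }],
};
```

`placementHonesty(broken)` reports the RouteHandler placement and names the missing input; `placementHonesty(repaired)` reports nothing. The failed law names the missing piece, as described above.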
How a diagram deduces — a worked reduction
This is the move the method exists for, in the formal language. The article
runs it on Invoice and Expense; the five steps never change, and apply to
any two objects you suspect are one.
Step 1 — Write the naive model down, rigorously. Start where intuition
starts — Invoice and Expense, two objects of Dat — and write every
morphism out of each, with its target:
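| Invoice morphism | target | Expense morphism | target |
|---|---|---|---|
| customer | Organization | supplier | Organization |
| issuer | Organization | payer | Organization |
| total | Money | total | Money |
| lines | LineItem | lines | LineItem |
| issuedAt | Date | incurredAt | Date |
| | | categorize | Category |
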
The instant you write the targets, something the noun never showed surfaces:
both objects' morphisms land in the same objects — Organization, Money,
Date, LineItem. They share a centre.
Step 2 — Map one category onto the other. Propose a functor
F : Expense-cat → Invoice-cat. On objects it is forced — Expense ↦ Invoice,
every shared object to itself. On morphisms, pair by target: supplier ↦ customer, payer ↦ issuer, total ↦ total, lines ↦ lines, incurredAt ↦ issuedAt.
Step 3 — Check the squares commute. A mapping means nothing unless it
respects structure. Take total: the square commutes exactly when
total ∘ F = identity ∘ total — and since F and identity are identities on
the targets, that collapses to one question: do the two totals compute the
same function? They do. Walk Organization: supplier ↦ customer (both "the
other party"), payer ↦ issuer (both "the owning tenant"). Every shared-
structure square commutes.
Step 4 — Read the verdict. Every square commuted, and F is a bijection on
objects. A functor that is bijective on objects and commutes everywhere is the
identity in disguise. There was only ever one category. Invoice and Expense
are the same object of Dat — a proof, not a preference.
Step 5 — What did not commute is the other half of the answer.
categorize : Expense → Category had no partner; the direction of the money is
a genuine difference. The morphisms that fail to commute are the real
distinctions — and the deduction hands them to you precisely. They are not
erased, they are segregated: kept as partial morphisms on the unified
object, selected by a discriminator.
graph LR
Doc["Document"]
Org["Organization"]
Money["Money"]
DateO["Date"]
Dir["{ INCOMING, OUTGOING }"]
Cat["Category"]
Doc -->|"counterparty"| Org
Doc -->|"owner"| Org
Doc -->|"total"| Money
Doc -->|"date"| DateO
Doc -->|"direction"| Dir
Doc -.->|"category? · partial"| Cat
style Doc fill:#4f8cf7,color:#fff
style Org fill:#f77f7f,color:#fff
style Money fill:#f7c04f,color:#000
style DateO fill:#f7c04f,color:#000
style Dir fill:#cf7fcf,color:#fff
style Cat fill:#cf7fcf,color:#fff

So a reduction is two moves at once: consolidate every morphism whose
square commuted onto one object, segregate every morphism that did not as a
partial morphism. The output — one documents table, a direction column,
category nullable, the expenses table dropped — is read straight off the
diagram. The model you can verify is the model you build from.
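
The mechanical part of the five steps can be sketched as follows. This is an illustration under assumptions: morphisms are reduced to a label and a target name, the `pairing` table encodes the functor proposed in Step 2, and `reduceObjects` is a hypothetical helper. Matching targets is only the necessary half of Step 3; whether two paired morphisms play the same role (for example "the other party") still has to be judged or tested separately.

```typescript
type Arrow = { label: string; target: string };

// Step 1: every morphism out of each object, with its target.
const invoice: Arrow[] = [
  { label: "customer", target: "Organization" },
  { label: "issuer", target: "Organization" },
  { label: "total", target: "Money" },
  { label: "lines", target: "LineItem" },
  { label: "issuedAt", target: "Date" },
];
const expense: Arrow[] = [
  { label: "supplier", target: "Organization" },
  { label: "payer", target: "Organization" },
  { label: "total", target: "Money" },
  { label: "lines", target: "LineItem" },
  { label: "incurredAt", target: "Date" },
  { label: "categorize", target: "Category" },
];

// Step 2: the proposed functor on morphisms (Expense label -> Invoice label).
const pairing: { [expenseLabel: string]: string } = {
  supplier: "customer",
  payer: "issuer",
  total: "total",
  lines: "lines",
  incurredAt: "issuedAt",
};

// Steps 3 to 5: a pair whose targets agree is consolidated; anything without
// a partner, or with a mismatched target, is segregated as a partial morphism.
function reduceObjects(src: Arrow[], dst: Arrow[], f: { [label: string]: string }) {
  const consolidated: string[] = [];
  const segregated: string[] = [];
  src.forEach((a) => {
    const partner = dst.filter((b) => b.label === f[a.label])[0];
    if (partner && partner.target === a.target) {
      consolidated.push(`${a.label} ≡ ${partner.label}`);
    } else {
      segregated.push(`${a.label}?`);
    }
  });
  return { consolidated, segregated };
}

const verdict = reduceObjects(expense, invoice, pairing);
```

For this input, `verdict.consolidated` holds the five paired morphisms and `verdict.segregated` holds only `categorize?`, which is the partial morphism kept on the unified object.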
One dividend: strategies
Two TrnLocs over the same Trn — possibly different code, same t_from → t_to job — are parallel arrows fitting one slot. That is the exact shape of a
strategy: interchangeable implementations behind one interface. Country tax
handlers, swappable PDF templates, a live data provider versus a mock — all the
same structure. Adding a strategy is adjoining one parallel arrow, never
editing the core. The method does not just describe a system; it names the axis
along which it extends without being rewritten.
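
In code, the parallel arrows become interchangeable functions with one signature. A small sketch, assuming amounts in integer cents and a registry keyed by country code; the `TaxHandler` type, the `handlers` table, and the rates shown are illustrative values, not taken from any real system.

```typescript
type Money = number; // assumption: integer cents
type Order = { net: Money; country: string };

// One slot: Order -> Money. Every strategy is an arrow with this signature.
type TaxHandler = (o: Order) => Money;

// Illustrative rates only.
const handlers: { [country: string]: TaxHandler } = {
  DE: (o) => Math.round(o.net * 0.19),
  FR: (o) => Math.round(o.net * 0.2),
};

// The core looks a handler up; it never branches on the country itself.
function tax(o: Order): Money {
  const h = handlers[o.country];
  if (!h) throw new Error(`no tax handler for ${o.country}`);
  return h(o);
}

// Adding a strategy adjoins one parallel arrow. `tax` is not edited.
handlers["IT"] = (o) => Math.round(o.net * 0.22);
```

The same shape covers the other examples in the paragraph: a mock data provider is a second arrow behind the same signature as the live one.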
Software isn't objects — it's data, transformations, and a place to run
We have the four atoms. So here is the question that decides whether you can actually use them: why does almost everyone model software with something else entirely — objects?
Object-oriented programming taught us to model the world as objects: an
Invoice is a class, an Expense is a class. It feels natural because the
words are nouns. But a class is not one of the atoms — it is a premature
bundle of several. A class fuses a Dat object (its fields) with the Trns
that act on it (its methods), and silently pins them to one Loc — one heap,
one process — and it makes you commit to that bundle before you have done any
analysis. The CPU never sees the bundle. It sees Dat, Trn, Loc
separately. The object is a story narrated on top — and the story has a cost.
// The object-oriented instinct: every noun gets a class.
class Invoice {
lineItems: LineItem[];
customer: Customer; // who we billed
issuer: Organization; // ...us
total(): Money { /* sum line items, apply tax */ }
send(): void { /* ... */ }
}
class Expense {
lineItems: LineItem[];
supplier: Supplier; // who billed us
payer: Organization; // ...us, again
total(): Money { /* sum line items, apply tax — a second time */ }
categorize(): void { /* ... */ }
}
// ...and, written by someone else on another day, two more nouns:
class Customer { name: string; taxId: string; address: Address; }
class Supplier { name: string; taxId: string; address: Address; }

Four classes, and the rot is already visible: total() written twice, and a
Customer and a Supplier that are line-for-line identical. This is not a
contrived example. It is what any codebase touched by many hands — or by an
AI — drifts into, because nothing makes the redundancy checkable.
Map it onto the atoms and the checking begins. All four — Invoice, Expense,
Customer, Supplier — are objects of Dat. Their fields (customer,
issuer, name, taxId, …) are Dat-morphisms, the structural maps out
of each object. total is not a Dat-morphism — it is an algorithm, an
object of Trn. OOP fused the Dat object and its Trns into each
class, and that glue is exactly what hid the structure.
Now stop describing the model and compute with it. Write it down the way a
category demands — every object of Dat, every morphism, source and target
named. Five objects, six morphisms:
graph LR
Customer["Customer"]
Supplier["Supplier"]
Org["Organization"]
Invoice["Invoice"]
Expense["Expense"]
Invoice -->|"i_by"| Org
Invoice -->|"i_for"| Customer
Expense -->|"e_by"| Supplier
Expense -->|"e_for"| Org
Customer -->|"c_is"| Org
Supplier -->|"s_is"| Org
style Customer fill:#4f8cf7,color:#fff
style Supplier fill:#4f8cf7,color:#fff
style Invoice fill:#4f8cf7,color:#fff
style Expense fill:#4f8cf7,color:#fff
style Org fill:#f77f7f,color:#fff

Read it straight off: an Invoice is issued by an Organization (i_by)
for a Customer (i_for); an Expense is billed by a Supplier
(e_by) for an Organization (e_for). Customer and Supplier each
carry a single morphism, into Organization. Three rules now take the
diagram the rest of the way. Nothing else.
Rule 1 — composition. A category is closed under composition: given
f : A → B and g : B → C, the morphism g ∘ f : A → C exists — you do not
get to choose. And Customer and Supplier carry the same morphism set as
Organization — name, taxId, address, into the same primitives — so each
is isomorphic to it; c_is and s_is are the witnessing isomorphisms.
Compose straight through them:
graph LR
Invoice["Invoice"]
Customer["Customer ≅ Org"]
Org["Organization"]
Invoice -->|"i_for"| Customer
Customer -->|"c_is"| Org
Invoice -.->|"i_for′ = c_is ∘ i_for"| Org
style Invoice fill:#4f8cf7,color:#fff
style Customer fill:#9a9a9a,color:#fff
style Org fill:#f77f7f,color:#fff

i_for′ = c_is ∘ i_for : Invoice → Org, and likewise
e_by′ = s_is ∘ e_by : Expense → Org. The composite is not new data — it is
forced to exist. And once it does, Customer and Supplier are isomorphic
copies of Organization that nothing else reaches: drop them. Three objects
remain, and Invoice and Expense now have the same shape — two morphisms,
both into Organization:
graph LR
Invoice["Invoice"]
Expense["Expense"]
Org["Organization"]
Invoice -->|"i_by"| Org
Invoice -->|"i_for′"| Org
Expense -->|"e_by′"| Org
Expense -->|"e_for"| Org
style Invoice fill:#4f8cf7,color:#fff
style Expense fill:#4f8cf7,color:#fff
style Org fill:#f77f7f,color:#fff

That shared centre did not exist a paragraph ago; Rule 1 created it, and it is exactly what the next rule needs.
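
Rule 1 can be written as ordinary function composition. The sketch below is an assumption-laden illustration: `Organization` and `Customer` are reduced to two fields (`orgName`, `taxId`; the address is left out for brevity), and `c_is_inverse` and `i_forPrime` are hypothetical names. The round trip through `c_is` and its inverse is what "witnessing isomorphism" means in practice.

```typescript
// Same morphism set on both types (address omitted in this sketch).
type Organization = { orgName: string; taxId: string };
type Customer = { orgName: string; taxId: string };
type Invoice = { i_by: Organization; i_for: Customer };

// The isomorphism and its inverse.
const c_is = (c: Customer): Organization => ({ orgName: c.orgName, taxId: c.taxId });
const c_is_inverse = (o: Organization): Customer => ({ orgName: o.orgName, taxId: o.taxId });

// The stored morphism, and the composite that Rule 1 forces to exist.
const i_for = (i: Invoice): Customer => i.i_for;
const i_forPrime = (i: Invoice): Organization => c_is(i_for(i)); // c_is ∘ i_for

const acme: Customer = { orgName: "Acme", taxId: "T-1" };
const sampleInvoice: Invoice = { i_by: { orgName: "Us", taxId: "T-0" }, i_for: acme };
```

Since `c_is_inverse(c_is(acme))` carries exactly the data `acme` does, nothing is lost by pointing the invoice at `Organization` directly, which is why `Customer` can be dropped.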
Rule 2 — a functor that commutes everywhere is an identity in disguise.
Propose a functor F from the Expense sub-category to the Invoice one. On
objects it is forced: Expense ↦ Invoice, Org ↦ Org. On morphisms, pair them
by role: e_by′ ↦ i_for′, e_for ↦ i_by. F is a functor only if it
respects the structure — only if every square commutes:
graph LR
Expense["Expense"]
Invoice["Invoice"]
OrgA["Organization"]
OrgB["Organization"]
Expense -->|"F"| Invoice
Expense -->|"e_by′"| OrgA
Invoice -->|"i_for′"| OrgB
OrgA -->|"id"| OrgB
style Expense fill:#4f8cf7,color:#fff
style Invoice fill:#4f8cf7,color:#fff
style OrgA fill:#f77f7f,color:#fff
style OrgB fill:#f77f7f,color:#fff

The square commutes exactly when i_for′ ∘ F = id ∘ e_by′, and since F and
id are identities on the target, that collapses to one plain question: do
e_by′ and i_for′ pick out the same Organization? They do — both are
the counterparty, the other party to the transaction. Walk the second
square, e_for ↦ i_by: both pick out the owner, the tenant the document
belongs to. It commutes too.
F is a bijection on objects, and every square commutes. A functor that is
bijective on objects and commutes everywhere is not a bridge between two
categories — it is the identity in disguise. There was only ever one object.
Invoice and Expense are the same object of Dat, written down twice — a
proof, not a preference. (The commuting squares are the naturality condition:
the exact categorical content of "these two are the same thing.") The
duplicated total() from the four-class sketch falls with them — one object
carries one total, one Trn; the second was never a transformation, only a
class boundary.
Rule 3 — what does not commute is segregated, not erased. The deduction
never said "merge everything." categorize : Expense → Category had no partner
to pair with; the direction of the money — leaving us, or coming in — is a
genuine difference with nothing to commute against. Those are the real
distinctions, and the deduction hands them to you precisely: keep each as a
discriminator or a partial morphism on the one unified object —
consolidate every morphism that commuted, segregate every morphism that did
not:
graph LR
Doc["Document"]
Org["Organization"]
Dir["{ INCOMING, OUTGOING }"]
Cat["Category"]
Doc -->|"counterparty"| Org
Doc -->|"owner"| Org
Doc -->|"direction"| Dir
Doc -.->|"category? — partial"| Cat
style Doc fill:#4f8cf7,color:#fff
style Org fill:#f77f7f,color:#fff
style Dir fill:#cf7fcf,color:#fff
style Cat fill:#cf7fcf,color:#fff

counterparty is the unified e_by′ ≡ i_for′; owner the unified
e_for ≡ i_by; direction : Document → { INCOMING, OUTGOING } is the
discriminator that survived precisely because its square never commuted. Five
objects and six morphisms became one — Document — by two composition
steps and a commuting-square check. You did not decide it. You computed it.
In code — and here the method pays a second time. The deduction fixes the
category: Dat has one object, Document, with direction a Dat-morphism
into a two-element enum, and total the single Trn acting on it. It does
not fix the memory layout. You can realise that one object as an array of
structures:
// Deduced: one Dat-object. `direction` is a Dat-morphism into a 2-element enum;
// `total` is the single Trn that acts on it.
type Document = {
  lineItems: LineItem[];
  counterparty: Organization;          // consolidated: Customer + Supplier
  owner: Organization;                 // consolidated: issuer + payer
  direction: "OUTGOING" | "INCOMING";  // a Dat-morphism: the discriminator
  category?: Category;                 // segregated: partial, only when INCOMING
};
// The one Trn, declared exactly once (body omitted in this sketch).
declare function total(d: Document): Money;
const ledger: Document[] = [ /* ... */ ]; // array of structures (AoS)

…or, with the exact same categorical content, as a structure of arrays,
where the direction morphism is realised not as a stored field but as which
array a document lives in:
// Structure of arrays (SoA): same Dat-object, same single `total` Trn — only
// the layout differs. `direction` is now the partition itself: each array is
// one fibre of the morphism (its pre-image over OUTGOING / over INCOMING).
const ledger = {
invoices: [] as Document[], // fibre of `direction` over OUTGOING
expenses: [] as Document[], // fibre of `direction` over INCOMING
};

Both are faithful to the same category: the same single Document object, the
same single total. AoS versus SoA is a placement and performance decision, not
a modelling one. Splitting the ledger by direction is the cache-friendly layout
when a computation sweeps one direction at a time. (Strictly, this layout is a
partition by discriminator; "structure of arrays" usually names the
one-array-per-field layout. The point is the same either way: the layout
varies, the category does not.) The category fixed what is true; it
deliberately left how to lay it out open. That line (model here, layout there)
is exactly what OOP erases by baking a layout into every class, and exactly
what the atoms keep sharp.
"Invoice" and "expense" were never two objects. direction is a morphism on
one Dat-object, not a class boundary — and the second total() is not
"removed," it was never possible: there is one object to define it on. You
arrived here by deduction, not taste. (The fully formal version of this move —
functors, fibres, why a bijection-on-objects that commutes everywhere is a
single category — is in the categorical-machinery deep dive above; it is the
same five steps, and exactly how Guliel's real schema was reduced.)
Components and encapsulation are not the enemy — but they belong at the end, as a conclusion the analysis earned, never as the premise you started from. Bundle too early and you abstract away the commutes before you have seen them.
Placement, and the function that wasn't
Four atoms don't make an architecture yet. An architecture is the application of the logical pair onto the physical pair: this transformation, running on that core; this data, resident in that database.
When I first formalised this, I wrote that application as a simple function: every transformation has a location. One transformation, one place.
That was the old mistake again. A transformation does not have one location. A validation runs in the browser and on the server. A "render" runs server-side and again client-side. The same algorithm gets placed in many spots — sometimes as literally different code that does the same job.
So "where does this run" is not a function. It's a relation — one transformation, many placements. The fix is to make a placement its own first-class thing: a record that says "transformation T, at location L, inside component C." Once placement is something you can point at, you can have as many as the truth requires — and you can check them.
That check is the thing prose architecture could never do. The rule: a transformation may only read data that has been transmitted to its location. A placement that reads data no transmission delivers is not a style choice — it's a defect the architecture can show you, on a whiteboard, before production.
Components compose
A component is a cohesive bundle of placements — data and transformations, placed at locations. A component can span more than one location. And components compose: glue two of them along the transmission that connects them and you get a single larger component, that transmission now an internal detail. Compose enough and you have a system.
This is the property the boxes-and-arrows diagram never had. Composition has rules. A composed component's data, behaviour, and locations are derived from its parts — not redrawn by hand. When two components don't compose cleanly, the gap is a concrete, nameable thing: a missing transmission, a transformation reading data that isn't there. The architecture can be wrong in a way you can point at. And note where components landed: at the end, as the boundary the analysis earned — exactly where the object should have been all along, and never was.
Taming the beast: why this is suddenly urgent
Everything so far I believed when humans wrote all the code. It was true then, but it was a slow tax — a non-rigorous spec is survivable when a human reads between its lines, holds the unwritten context, and notices "wait, isn't this the same as that."
Hand that same spec to an AI and the tax becomes a hemorrhage.
An AI generates against the spec you give it. If "correct" is undefined — and in a prose PRD or a boxes-and-arrows diagram it is undefined — then nothing the AI produces can be wrong. I have watched it, repeatedly: it invents objects that were never in the model. It re-orders the architecture between one session and the next. It copies a function definition into five files. And none of that violates a prose spec, because a prose spec has nothing to violate. The non-rigor that humans quietly absorbed, AI industrialises.
And that is measured, not a grievance. GitClear's study of 211 million lines of changed code found that, as AI coding assistants spread, copy-pasted code climbed from 8.3% to 12.3%, duplicated blocks rose roughly eightfold in 2024, and refactoring fell to a record low — 2024 was the first year on record that duplicated code outpaced refactored code (report write-up). GitClear's own diagnosis is this article's thesis in other words: "it is less likely that the AI will propose reusing a similar function elsewhere … partly because of limited context size." It cannot see the whole model, so it re-types pieces of it — and a prose spec gives nothing to catch that with.
Category theory closes the hole, because it turns the spec into a checkable contract. An olog (an "ontology log": a category drawn as a diagram whose objects and arrows carry plain-language labels) is the definition, readable straight off the diagram by the categorical rules, no annotation needed. "Correct" becomes something precise: the diagram commutes, the coherence laws hold, no morphism is a redundant copy of a composite that already exists. That is verifiable: by you, by a test, increasingly by the AI itself. The model can hallucinate all it likes; the category is the gate it cannot bluff past.
And because you think categorically about every layer — the PRD, the data model, the architecture — the whole spec is verifiable, not one slice of it. When the AI proposes a parallel object, the rule "if X is Y plus extra morphisms, X is not a new object" catches it. When it copies a function, the duplicate shows up as two morphisms that should have been one composition — a diagram that fails to commute. You are not hoping the AI behaved. You are checking.
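
One way such a gate might look in practice is a check that runs whenever a new object is proposed. The sketch is deliberately crude and rests on assumptions: morphisms are compared by label and target only, and `Schema`, `subsumes`, `redundantObjects`, and the example objects are hypothetical.

```typescript
type Arrow = { label: string; target: string };
type Schema = { [objectName: string]: Arrow[] };

// X subsumes Y when every arrow of Y also appears on X.
function subsumes(x: Arrow[], y: Arrow[]): boolean {
  return y.every((ya) => x.some((xa) => xa.label === ya.label && xa.target === ya.target));
}

// "If X is Y plus extra morphisms, X is not a new object":
// list every existing object the proposal merely extends.
function redundantObjects(schema: Schema, proposed: Arrow[]): string[] {
  return Object.keys(schema).filter(
    (existing) => schema[existing].length > 0 && subsumes(proposed, schema[existing]),
  );
}

// The model as it stands.
const schema: Schema = {
  Organization: [
    { label: "name", target: "String" },
    { label: "taxId", target: "String" },
    { label: "address", target: "Address" },
  ],
};

// A generated proposal: Organization again, plus one extra field.
const vendor: Arrow[] = [
  { label: "name", target: "String" },
  { label: "taxId", target: "String" },
  { label: "address", target: "Address" },
  { label: "rating", target: "Number" },
];
// A proposal that shares nothing with the existing model.
const tag: Arrow[] = [{ label: "label", target: "String" }];
```

`redundantObjects(schema, vendor)` returns `["Organization"]`, so the proposal is rejected as an extension of an existing object; `redundantObjects(schema, tag)` returns an empty list. A real gate would compare morphisms by more than their labels, but even this version gives the generator something to fail against.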
That is the real meaning of taming the beast. Not better prompting. A verifiable target — an objective function the machine cannot game, because being wrong is finally defined.
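A minimal sketch of what "checking" can mean here, assuming a toy model in which objects are finite sets and morphisms are plain dictionaries. The names `compose`, `commutes`, and the order/customer/region data are hypothetical illustrations, not part of the method's tooling. A proposed morphism that agrees with an existing composite everywhere is a redundant copy; one that disagrees is a duplicate that has already drifted.

```python
def compose(*morphisms):
    """Compose left to right: compose(f, g)(x) == g[f[x]]."""
    def run(x):
        for m in morphisms:
            x = m[x]
        return x
    return run


def commutes(domain, path_a, path_b):
    """True when both paths send every element of the domain to the same place."""
    return all(compose(*path_a)(x) == compose(*path_b)(x) for x in domain)


# Hypothetical model: Order -> Customer -> Region.
orders = {"o1", "o2"}
customer_of = {"o1": "alice", "o2": "bob"}
region_of = {"alice": "EU", "bob": "US"}

# Two candidate "new" morphisms Order -> Region that a generator might propose.
faithful_copy = {"o1": "EU", "o2": "US"}  # equals the composite: redundant
drifted_copy = {"o1": "EU", "o2": "EU"}   # disagrees with it: the square fails

is_redundant = commutes(orders, [customer_of, region_of], [faithful_copy])
has_drifted = not commutes(orders, [customer_of, region_of], [drifted_copy])
print(is_redundant, has_drifted)
```

Either outcome is actionable: the faithful copy should be replaced by the composite, and the drifted one is a bug that a prose spec would never have surfaced.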
Every architecture style is this, with one constraint flipped
Here's the payoff, and the reason I trust the method.
An architectural style — layered, microservices, event-driven, CQRS — is, in the precise sense, a set of constraints on how the parts may be arranged. The method gives you a substrate; a style is a configuration on it. You don't learn a new vocabulary per style. You take the same four atoms and flip one constraint.
Request/response: component A names component B; their dependency is one direct transmission.
Event-driven: delete that edge. A publishes to a broker; B subscribes. The A→B transmission now factors through a third component. Same atoms — one constraint flipped, and "A doesn't know B exists" falls out for free.
Below is the whole gallery — each style translated into the method. Open the ones you use; they're each self-contained.
Foundational
Layered / n-tier architecture
The oldest move in the book: stack the system in layers — presentation, business logic, data access, database — and let each layer call only the one directly beneath it. Nothing reaches up; nothing skips a level.
The four atoms. Layered constrains the component graph and almost nothing
else — so its Dat and Loc are deliberately generic, and that genericity is
itself a result (it is what separates "layered" from "n-tier", below).
Trn — each layer owns transformations; the shapes flowing between them:
Transformation | Signature | Kind |
|---|---|---|
|
| pure |
|
| pure |
|
| effect |
|
| effect |
Trm — transmissions run only between adjacent layers:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
Loc — unconstrained: layered fixes no locations; all four components may
share one process. Dat — the per-layer shapes above, with no cross-layer
structure imposed.
Placement. Each layer is a component — PresentationCmp, BusinessCmp,
DataAccessCmp, DatabaseCmp — and each transformation is a TrnLoc placed
inside its layer. The whole content of the style is the shape of the
depends-on graph over those components:
graph TD
P["PresentationCmp"]
B["BusinessCmp"]
D["DataAccessCmp"]
DB["DatabaseCmp"]
P -->|"q_PB : Request"| B
B -->|"q_BD : Query"| D
D -->|"q_D·DB : SQL"| DB
style P fill:#cf7fcf,color:#fff
style B fill:#cf7fcf,color:#fff
style D fill:#cf7fcf,color:#fff
style DB fill:#cf7fcf,color:#fff
The system is the composite System = PresentationCmp ⋈ BusinessCmp ⋈ DataAccessCmp ⋈ DatabaseCmp, glued along the three adjacent-layer
transmissions. Because every dependency points exactly one rank down, the
gluing is a single unbranched chain — which is precisely why a layered
system stacks.
The defining constraint, as a law. Give the components a rank — a functor
rank : Comp → (ℕ, <) numbering the layers — and impose:
depends-on(c, c′) ⟹ rank(c′) = rank(c) + 1
Every dependency edge lands on the immediately next rank: never skips a
layer, never points up. That single inference rule is "layered." A
skip-level call (rank(c′) > rank(c) + 1) or an upward call
(rank(c′) ≤ rank(c)) fails it — visibly, on the diagram, before runtime.
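The rank law is small enough to run as a lint over a dependency list. The sketch below assumes ranks numbered from 0 at the presentation layer and a hypothetical `strict` flag that switches between the exact rule above and the looser `rank(c′) > rank(c)` variant; the function name and edge data are illustrative only.

```python
RANK = {
    "PresentationCmp": 0,
    "BusinessCmp": 1,
    "DataAccessCmp": 2,
    "DatabaseCmp": 3,
}


def layering_violations(depends_on, rank=RANK, strict=True):
    """Return every depends-on edge that breaks the layering law."""
    violations = []
    for src, dst in depends_on:
        step = rank[dst] - rank[src]
        # Strict: exactly one rank down. Relaxed: any number of ranks down.
        allowed = (step == 1) if strict else (step >= 1)
        if not allowed:
            violations.append((src, dst))
    return violations


chain = [
    ("PresentationCmp", "BusinessCmp"),
    ("BusinessCmp", "DataAccessCmp"),
    ("DataAccessCmp", "DatabaseCmp"),
]
skip = ("PresentationCmp", "DataAccessCmp")  # skips a layer
upward = ("DataAccessCmp", "BusinessCmp")    # points up

print(layering_violations(chain + [skip, upward]))
```

The upward edge fails under both settings; the skip-level edge fails only when `strict` is true.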
Layered vs n-tier — two independent constraints. The rule above constrains
Comp only; it says nothing about Loc. N-tier adds a second, separate
constraint — on placement: tl_loc is injective across layers, each layer at a
distinct location. A monolith satisfies the first and not the second: fully
layered, one process. The method shows the two words name two constraints — you
can have either without the other.
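The second constraint can be checked on its own, which is the point of calling the two independent. As a simplifying assumption, the sketch treats `tl_loc` as a plain map from each layer component to a location name; the location names are hypothetical.

```python
def is_n_tier(tl_loc):
    """N-tier holds when the placement map is injective: no two layers share a location."""
    locations = list(tl_loc.values())
    return len(set(locations)) == len(locations)


layers = ("PresentationCmp", "BusinessCmp", "DataAccessCmp", "DatabaseCmp")

# A layered monolith: four layers, one process.
monolith = {layer: "app-process" for layer in layers}

# The same four layers, each at its own (hypothetical) location.
four_tier = {
    "PresentationCmp": "browser",
    "BusinessCmp": "app-server",
    "DataAccessCmp": "data-service",
    "DatabaseCmp": "db-host",
}

print(is_n_tier(monolith), is_n_tier(four_tier))
```

Nothing in this check looks at the dependency graph, so a system can pass it while failing the rank law, and the other way round.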
What the framework tells you.
Cross-cutting concerns are predicted, not a surprise. Logging, auth, telemetry touch every layer — so they cannot be a depends-on edge, which would skip ranks. They are a natural transformation Id ⇒ W applied uniformly across all components — a different construct entirely. That is why every real layered system grows an "aspect" or "middleware" escape hatch: the substrate has no other slot for it.
Strictness is one inequality. "Strict" layered forbids skip-calls (rank(c′) = rank(c) + 1 exactly); "relaxed" layered permits them (rank(c′) > rank(c)). The style's only knob is which inequality you write.
A linear depends-on is a build order. Compilation, deployment and reasoning all proceed bottom-up because the order is total — there is a unique topological sort, read straight off rank.
Contrast — pipe-and-filter is also a linear chain of components, but the
edges are a different kind. Layered edges are call edges: a layer invokes the
one beneath and awaits a return — a round trip. Pipe-and-filter edges are
compose edges: data flows one way through, no call, no return. The same line
shape — q ; q of calls versus f ; g of composition.
Client–server
Two parties at two places. One — the client — wants something and asks for it; the other — the server — holds the real data and answers. The browser asks, the API responds. Every exchange opens on the client side.
The four atoms. Client–server constrains the transmissions — who may open one — and data placement — who owns the authoritative copy. The transformations and their types are whatever the domain needs.
Dat — one domain type Resource, materialised twice; the only structure the
style fixes is that the two materialisations are the same Dat object:
Morphism | Signature | Semantics |
|---|---|---|
|
| the identity both sides agree on |
|
| the resource's payload |
|
| which version a copy holds |
Trn — ordinary domain transformations, placed on one side or the other:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| effect |
|
| pure — client-side |
|
| pure — placed both sides |
Loc — exactly two: Client (the asking thread) and Server (the answering
process, with its store). Trm — every exchange is a request/response pair:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
Placement. read and mutate are TrnLocs at Server; render a
TrnLoc at Client; validate has two TrnLocs — one each side. The
Resource datum has two DataLocs over the same Dat: the authoritative row
at Server, and a copy at Client.
graph LR
Cl["ClientCmp<br/>TrnLoc: render, validate"]
DC["Resource DataLoc<br/>(copy, @ Client)"]
Sv["ServerCmp<br/>TrnLoc: read, mutate, validate"]
DS["Resource DataLoc<br/>(authoritative, @ Server)"]
DC -->|"dl_cmp"| Cl
DS -->|"dl_cmp"| Sv
Cl -->|"t_request : Command/Query"| Sv
Sv -.->|"t_response : Resource (reply only)"| Cl
style Cl fill:#cf7fcf,color:#fff
style Sv fill:#cf7fcf,color:#fff
style DC fill:#9a9a9a,color:#fff
style DS fill:#4f8cf7,color:#fff
Every exchange is the composite — request, then the reply it provokes:
t_response ∘ t_request : Client → Server → Client
t_response is never a free-standing morphism: its domain Server is reached
only as the codomain of a prior t_request. The system glues the two
components along that pair — System = ClientCmp ⋈ ServerCmp, joined on
t_request and t_response — and the depends-on edge runs ClientCmp → ServerCmp only; there is no edge back.
The defining constraint, as a law. Start from peer-to-peer: any component
may open a transmission to any other, and no node holds privileged data — the
initiation relation on Trm is symmetric. Flip one constraint — make
initiation directed:
Designate Client the sole initiator. Every Trm with c_from = Server exists only as a t_response paired after a t_request with c_to = Server: ∀ τ. c_from(τ) = Server ⟹ ∃ ρ. c_from(ρ) = Client ∧ reply(τ) = ρ. No server-side transmission is an opening move.
This strengthens Coherence Law 4: a cross-location dependency is still mediated
by a Trm, but now the mediation is one-directional — depends-on carries
no ServerCmp → ClientCmp edge. The data asymmetry follows, it is not a second
axiom: the side that only ever answers is the side it makes sense to trust, so
the authoritative DataLoc lands at Server and the Client copy is grey.
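The initiation law reads directly as a predicate over a list of transmissions. This is a sketch under stated assumptions: the `Trm` record, its `reply_to` field (standing in for `reply(τ)`), and the transmission names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trm:
    name: str
    c_from: str
    c_to: str
    reply_to: Optional[str] = None  # name of the request this transmission answers


def unsolicited(trms):
    """Names of server-originated transmissions that answer no client request."""
    by_name = {t.name: t for t in trms}
    bad = []
    for t in trms:
        if t.c_from != "Server":
            continue
        request = by_name.get(t.reply_to) if t.reply_to else None
        answers_client = (
            request is not None
            and request.c_from == "Client"
            and request.c_to == "Server"
        )
        if not answers_client:
            bad.append(t.name)
    return bad


exchange = [
    Trm("get_order", "Client", "Server"),
    Trm("order_body", "Server", "Client", reply_to="get_order"),
]
server_push = Trm("promo_push", "Server", "Client")  # an opening move from the server

print(unsolicited(exchange + [server_push]))
```

A server push is exactly the transmission the style forbids, and it is the only one the check reports.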
What the framework tells you.
"Never trust the client" is a Law 1 fact. The Client DataLoc is a different placement from the authoritative one — a copy delivered by t_response. Any mutate that must produce authoritative Resource reads t_from = Command against the real store, so by Coherence Law 1 it can only be placed at Server. Trust is a placement, not a slogan.
Validation that runs twice is honest. validate has two TrnLocs — one at Client (snappy UI), one at Server (real enforcement). That is the multi-placement the framework was built to express: parallel realisations of one Command → Command contract, not a redundancy to delete.
The bottleneck is structural. Because Server is the sole responder and the sole owner of the authoritative DataLoc, every client's depends-on edge points at it. Load concentration shows up on the diagram before it shows up in production.
Contrast — peer-to-peer is the same two-location picture with the
initiation constraint lifted: the Trm relation goes symmetric, every node may
open a transmission, and no single DataLoc is the authoritative one.
MVC / MVVM
Split the presentation tier into three roles. The Model is the data; the View is what the user sees; and between them sits a Mediator — a Controller (MVC) or a ViewModel (MVVM) — that turns input into Model changes and Model state into something the View can render. The View and the Model never speak directly.
The four atoms. MVC / MVVM constrains the component dependency graph;
all three components almost always sit inside one Loc — the presentation
tier — so it is not a placement across the wire.
Dat — the central object is Model, the domain state; the View consumes a
rendered projection of it:
Morphism | Signature | Semantics |
|---|---|---|
|
| the current authoritative value |
|
| the render-ready shape (the View's input) |
Trn:
Transformation | Signature | Kind |
|---|---|---|
|
| pure — the View itself |
|
| pure — interpret a user gesture |
|
| effect |
Trm — the only transmissions are the three component ports; there is no
View → Model transmission:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
Loc — one site, Presentation (a UI thread, a browser tab). View, Mediator
and Model are three components co-located there, not three locations.
Placement. Each role is a Cmp; render is a TrnLoc of ViewCmp,
intent and apply are TrnLocs of MediatorCmp, the Model is a DataLoc
of ModelCmp — all at Presentation:
graph TD
M["ModelCmp<br/>DataLoc: Model"]
Md["MediatorCmp<br/>TrnLoc: intent, apply"]
V["ViewCmp<br/>TrnLoc: render"]
L["@ Presentation (Loc)"]
V -->|"t_input : Input"| Md
Md -->|"t_rw : Command / DomainState"| M
Md -->|"t_push : ViewState"| V
V -.->|"co-located at"| L
Md -.->|"co-located at"| L
M -.->|"co-located at"| L
style M fill:#4f8cf7,color:#fff
style Md fill:#cf7fcf,color:#fff
style V fill:#cf7fcf,color:#fff
style L fill:#f7c04f,color:#000
The View↔Model dependency factors — it is never an edge, only a composite through the Mediator:
t_push ∘ apply ∘ t_rw ∘ intent ∘ t_input : View → Mediator → Model → Mediator → View
The system glues the three ports — System = ViewCmp ⋈ MediatorCmp ⋈ ModelCmp,
along t_input/t_push and t_rw — and ViewCmp and ModelCmp share no
port directly.
The defining constraint, as a law. Start from the unconstrained
presentation component: View reads and writes Model freely — a complete
triangle of depends-on edges. Flip one constraint — delete the direct edge:
In the depends-on graph there is no edge ViewCmp → ModelCmp and none ModelCmp → ViewCmp. Every View↔Model dependency is mediated by MediatorCmp — the graph is a triangle with the View–Model edge missing.
That single absent edge is MVC / MVVM. It is Coherence Law 4 (a dependency
is mediated by a transmission) used within one location: Law 4 normally
mediates because two components sit on different sites; here both sit at
Presentation, and the mediation is imposed by design — to keep rendering
separable from domain logic — not forced by the wire.
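The absent edge can be asserted rather than trusted. The sketch below assumes the `depends-on` graph is available as a list of `(from, to)` pairs; both helper names are hypothetical. It checks two things: no direct View-Model edge in either direction, and the View can still reach the Model through the Mediator.

```python
def direct_view_model_edges(depends_on):
    """Edges that join ViewCmp and ModelCmp directly, in either direction."""
    pair = {"ViewCmp", "ModelCmp"}
    return [(a, b) for a, b in depends_on if {a, b} == pair]


def reachable(depends_on, start, goal):
    """Depth-first search for a directed path from start to goal."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in depends_on if a == node)
    return False


mvc = [
    ("ViewCmp", "MediatorCmp"),   # t_input
    ("MediatorCmp", "ModelCmp"),  # t_rw
    ("MediatorCmp", "ViewCmp"),   # t_push
]
leaky = mvc + [("ViewCmp", "ModelCmp")]  # a View that reads the Model itself

print(direct_view_model_edges(mvc), direct_view_model_edges(leaky))
print(reachable(mvc, "ViewCmp", "ModelCmp"))
```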
MVC vs MVVM is one sub-constraint — the kind of t_push. Both have the
same triangle and the same missing edge. They differ only in whether
Mediator → View is explicit or deduced:
MVC: t_push is an explicit transmission — the Controller names the View and pushes ViewState by hand.
MVVM: t_push is deduced from a Binding ⊆ ViewField × ViewModelField — a declared data-binding; the framework synthesises the transmission both ways. t_push is a real transmission that no component's placements name.
What the framework tells you.
The View is a transformation, so it is testable. render : Model → ViewState has a t_from and a t_to like any Trn. Place that TrnLoc off-screen and you can check its output without a real display — the View is not "the screen", it is a pure function into ViewState.
MVVM's wiring is invisible by construction. The binding-derived t_push is exactly the event-driven broker's situation: the decoupling win and the "where did the wiring go" cost sit on the same diagram — a transmission that belongs to a Binding table, not to ViewCmp or MediatorCmp.
Extensibility is adjoining one component. A second View over the same Model is a new ViewCmp with its own render TrnLoc and one new t_input/t_push pair to the Mediator. ModelCmp is untouched — it structurally cannot import a View, because no edge reaches it.
The missing edge is a fact, not a discipline. "The Model knows nothing of the View" is not a coding convention you must uphold — it is the absence of a Cmp-reference, visible on the triangle before runtime.
Contrast — layered imposes a linear order on the same Component
dependency graph; MVC / MVVM imposes a triangular one with a single forbidden
edge. Same substrate, same depends-on graph — one is a chain, the other a
triangle minus an edge.
Distributed
Microservices
Slice the system into many small, independently deployable services, and give each one its own database. No service reads another's tables. If the order service needs a customer's name it asks the customer service — over the network — it never runs a join. "Database per service" is the whole architecture in three words.
The four atoms. Microservices is a constraint on data placement; the transformations and their types are whatever the domain needs.
Dat — each service owns a private cluster of types (Order, OrderLine for
one; Customer, Address for another). No object is privileged; what matters
is where each is materialised — see Placement.
Trn — ordinary domain transformations:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| effect |
Loc — each service has its own location, and its database is a distinct
location again: OrderSvc, OrderDB, CustomerSvc, CustomerDB.
Trm — every cross-service need is a transmission:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
Placement. Each service is a component; its database is a DataLoc owned
by that component alone:
graph LR
OS["OrderCmp"]
OD["Order DataLoc<br/>(private)"]
CS["CustomerCmp"]
CD["Customer DataLoc<br/>(private)"]
OD -->|"dl_cmp"| OS
CD -->|"dl_cmp"| CS
OS -->|"q_cust : CustomerId"| CS
CS -.->|"r_cust : Customer"| OS
style OS fill:#cf7fcf,color:#fff
style CS fill:#cf7fcf,color:#fff
style OD fill:#4f8cf7,color:#fff
style CD fill:#4f8cf7,color:#fff
The defining constraint, as a law. Start from a layered monolith: every
layer-component reads one shared database — a single DataLoc the data-access
and business components both touch. Flip one constraint — make data placement
exclusive:
dl_data(dl) = dl_data(dl′) ⟹ dl_cmp(dl) = dl_cmp(dl′)
No data type is materialised in two components. The store is private. And
because the shared DataLoc is gone, a cross-component data need can no longer
be a co-located read — it must become a transmission. Coherence Law 4 (a
cross-location dependency is mediated by a Trm) stops being advice and becomes
the only way two services interact.
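The exclusivity law is a one-pass check over the data placements. A minimal sketch, assuming each `DataLoc` is represented as a `(dl_data, dl_cmp)` pair; the function name and the sample placements are hypothetical.

```python
from collections import defaultdict


def shared_data(data_locs):
    """Map each data type placed in more than one component to its owners."""
    owners = defaultdict(set)
    for dl_data, dl_cmp in data_locs:
        owners[dl_data].add(dl_cmp)
    return {data: sorted(cmps) for data, cmps in owners.items() if len(cmps) > 1}


private = [
    ("Order", "OrderCmp"),
    ("OrderLine", "OrderCmp"),
    ("Customer", "CustomerCmp"),
]
# The order service quietly materialises Customer in its own store.
leaky = private + [("Customer", "OrderCmp")]

print(shared_data(private))
print(shared_data(leaky))
```

An empty result means every data type has exactly one owning component, which is "database per service" stated as a testable property.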
What the framework tells you.
Extensibility is adjoining a component. A new service is a new Cmp with its own private DataLoc and a few transmissions to the services it queries. No existing component's data placement changes — nobody's table grows a column for the newcomer.
The "distributed transaction" tax is predicted. What a monolith got from one consistent join, microservices must assemble from several transmissions — each able to be slow, fail, or return stale data. That cost is the direct consequence of forbidding the shared DataLoc; the method shows the isolation win and the eventual-consistency cost on the same diagram.
The boundary is a fact, not a discipline. There is no arrow from one component into another's DataLoc — only into its transmission ports. A service can be redeployed, rescaled, rewritten without touching anyone's store, because the diagram structurally forbids the alternative.
Contrast — SOA keeps the small, private-data services but routes every inter-service transmission through one shared bus component; microservices keep the transmissions point-to-point — "smart endpoints, dumb pipes."
Service-oriented architecture (SOA)
Carve the enterprise into a handful of coarse-grained services — Billing, Inventory, CRM — and wire none of them to each other. Every service talks to one shared piece of plumbing: an enterprise service bus. The bus routes, translates protocols, transforms messages, orchestrates. Services publish to it and receive from it; they never hold a reference to one another.
The four atoms. SOA is a constraint on transmissions — it leaves Dat,
Trn and the service-internal structure to the domain.
Dat — each service owns its own cluster of domain types (Invoice for
Billing, StockItem for Inventory); no object is privileged. The one object
every service shares is Message — the bus's envelope — with a routing
discriminator:
Morphism | Signature | Semantics |
|---|---|---|
|
| logical destination — the routing key |
|
| the domain datum, in transit |
|
| wire protocol/schema the bus may translate |
Trn — ordinary domain transformations, plus the bus's own:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| pure — the bus picks the second hop |
|
| pure — the bus's protocol bridge |
|
| effect |
Loc — each service has its own location: BillingSvc, InventorySvc,
CRMSvc — and the bus is a distinct location again, Bus.
Trm — every cross-service need is a transmission, and every one has Bus as
an endpoint:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
Placement. Each service is a component; the bus is also a component —
its own route / translate / orchestrate TrnLocs and a DataLoc holding
the routing table — never a wire:
graph TD
Bus["BusCmp<br/>TrnLoc: route, translate, orchestrate<br/>DataLoc: routing table"]
A["BillingCmp"]
B["InventoryCmp"]
C["CRMCmp"]
A -->|"t_out : Message"| Bus
Bus -->|"t_in : Message"| A
B -->|"t_out : Message"| Bus
Bus -->|"t_in : Message"| B
C -->|"t_out : Message"| Bus
Bus -->|"t_in : Message"| C
style Bus fill:#cf7fcf,color:#fff
style A fill:#cf7fcf,color:#fff
style B fill:#cf7fcf,color:#fff
style C fill:#cf7fcf,color:#fff
The defining fact: there is no transmission BillingSvc → InventorySvc. A
dependency Billing→Inventory does not exist as one edge — it factors through
the bus as a composite in the routing category:
t_in ∘ t_out : BillingSvc → Bus → InventorySvc
The first hop is fixed (the service knows only the bus); the second hop is
chosen by the bus's route transformation reading msg_to. The components glue
to match — System = BillingCmp ⋈ BusCmp ⋈ InventoryCmp ⋈ CRMCmp, every gluing
along a t_out / t_in pair — and no two service components share a port
directly. Their only common reference is BusCmp.
The defining constraint, as a law. Start from microservices: many services,
private data, transmissions running point-to-point — BillingCmp's placements
name a port on InventoryCmp directly. Flip one constraint — make every
inter-component transmission factor through one shared bus:
For every transmission τ with c_from(τ) ≠ c_to(τ) between two service components, c_from(τ) = Bus ∨ c_to(τ) = Bus. No service-to-service Trm exists; every cross-service dependency c → c′ is the composite t_in ∘ t_out through Bus.
This is Coherence Law 4 (cross-location dependencies are mediated by a Trm)
strengthened the same way the event-based style strengthens it — mediation must
be indirect, through a third component — but here the mediator is a named
addressable hub, not an anonymous broker: services route by msg_to, not by
event type. The depends-on graph collapses to a star centred on Bus.
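Both the star shape and the rule that enforces it fit in a few lines. The sketch assumes transmissions are `(c_from, c_to)` pairs and that the hub is named `"Bus"`; `star` and `bypasses` are hypothetical helper names.

```python
def star(services, hub="Bus"):
    """Build the SOA transmissions: one t_out and one t_in per service."""
    trms = []
    for service in services:
        trms += [(service, hub), (hub, service)]
    return trms


def bypasses(trms, hub="Bus"):
    """Transmissions between distinct components that do not touch the hub."""
    return [(a, b) for a, b in trms if a != b and hub not in (a, b)]


three = star(["Billing", "Inventory", "CRM"])
four = star(["Billing", "Inventory", "CRM", "Shipping"])
shortcut = ("Billing", "Inventory")  # a direct service-to-service call

print(len(three), len(four))
print(bypasses(three + [shortcut]))
```

Adding the fourth service adds exactly two transmissions, which is the linear growth in integration points that the star is meant to buy.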
What the framework tells you.
The star graph is the SOA selling point, as a fact about shape. Every service points only at Bus, and Bus at every service. Adding a service adjoins one Cmp and one t_out/t_in pair to the hub — zero edges anywhere else. Integration points grow O(n), not O(n²): the absence of service-to-service Trm is what makes that linear.
The bus is a real component, so it has real placements. route, translate, orchestrate are TrnLocs placed inside BusCmp, with a DataLoc for the routing table. The framework forbids pretending the bus is "just infrastructure": behaviour you own is behaviour you test, deploy, version and reason about.
The honest cost: the bus is one component on every path. Because every inter-service dependency factors through Bus, that single component knows every service's contract and every routing rule — it concentrates load and knowledge. The diagram shows the integration win (the star) and the coupling risk (one node on every path) at once — exactly the critique that later pushed teams toward microservices.
Contrast — microservices is the same picture with BusCmp deleted and the
service-to-service transmissions restored: BillingCmp names a port on
InventoryCmp directly, the star collapses back to point-to-point edges —
"smart endpoints, dumb pipes."
Service mesh
Take a microservices system and, next to every service, deploy a small proxy — a sidecar — on the same host. A service never speaks to the network directly: every call leaves through its own sidecar and arrives through the peer's sidecar. The proxies handle mTLS, retries, timeouts, load balancing and tracing; the business code stays oblivious. The mesh is the set of all those proxies.
The four atoms. Service mesh is a constraint on transmissions and the
components that carry them; Dat and Trn are whatever the domain needs.
Dat — each service owns its domain types (Order, Customer, …); the only
datum the mesh itself adds structure to is the wire envelope. No object is
privileged — what matters is who carries it across the boundary.
Trn — two flavours. Domain transformations live in the services; the mesh
contributes the transmission concerns as its own placed transformations:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| effect |
|
| effect |
|
| effect |
Loc — each service has its own location, and its sidecar is co-located
with it — same Loc: OrderSvc (host of service A and proxy A),
PaymentSvc (host of service B and proxy B).
Trm — every inter-service hop is decomposed into proxy-mediated transmissions:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
Placement. Each service is a component; the mesh adjoins one more
component per location — a proxy component, sharing its service's Loc. The
proxy's mtls / retry / trace are TrnLocs placed inside the proxy, not
the service:
graph LR
A["ServiceA Cmp"]
PA["ProxyA Cmp<br/>TrnLoc: mtls, retry, trace"]
PB["ProxyB Cmp<br/>TrnLoc: mtls, retry, trace"]
B["ServiceB Cmp"]
A -->|"t_out : Request"| PA
PA -->|"t_wire : Request (mTLS)"| PB
PB -->|"t_in : Request"| B
style A fill:#cf7fcf,color:#fff
style PA fill:#cf7fcf,color:#fff
style PB fill:#cf7fcf,color:#fff
style B fill:#cf7fcf,color:#fff
The single logical edge ServiceA → ServiceB is never a Trm — it factors
into three transmissions in the routing category:
t_in ∘ t_wire ∘ t_out : ServiceA → ProxyA → ProxyB → ServiceB
t_out and t_in are loopback hops (c_from and c_to share a Loc with
their service); only t_wire crosses the network. The system glues to match —
System = ServiceA ⋈ ProxyA ⋈ ProxyB ⋈ ServiceB, along those three
transmissions — and ServiceA and ServiceB share no port directly: every
boundary TrmCmp they own names a proxy, never the peer service.
The defining constraint, as a law. Start from plain microservices: a
service's transmission reaches the peer point-to-point — t : ServiceA → ServiceB, a direct port. Flip one constraint — interpose a per-location
proxy on every inter-service transmission:
No Trm runs ServiceX → ServiceY. For every inter-service dependency there is a proxy component at each endpoint's Loc, and the transmission factors as t_in ∘ t_wire ∘ t_out through them. The proxy is one uniform component, adjoined once per location.
This is Coherence Law 4 (a cross-location dependency is mediated by a Trm)
with the mediator pinned: every mediating transmission must terminate on a
proxy. And it is a deliberate §5 inversion. Layered's cross-cutting concern
is a natural transformation Id ⇒ W — an aspect woven through component
code. The mesh refuses the weaving: it realises the same concern as a
uniform component stamped per Loc. Aspects become objects.
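The factoring can be written as a function from a logical dependency to its three hops, with a check that no hop joins two services directly. This is an illustrative sketch: the `Proxy(...)` naming and both helpers are assumptions, not mesh tooling.

```python
def factor(src, dst):
    """Expand one logical service-to-service dependency into three transmissions."""
    proxy_src, proxy_dst = f"Proxy({src})", f"Proxy({dst})"
    return [
        ("t_out", src, proxy_src),         # loopback: service to its own sidecar
        ("t_wire", proxy_src, proxy_dst),  # the only hop that crosses the network
        ("t_in", proxy_dst, dst),          # loopback: peer sidecar to peer service
    ]


def direct_service_edges(trms, services):
    """Transmissions whose two endpoints are both services (forbidden in a mesh)."""
    return [t for t in trms if t[1] in services and t[2] in services]


services = {"ServiceA", "ServiceB"}
hops = factor("ServiceA", "ServiceB")
plain = [("t", "ServiceA", "ServiceB")]  # the plain-microservices edge

print([name for name, _, _ in hops])
print(direct_service_edges(hops, services), direct_service_edges(plain, services))
```

The hop count makes the cost visible too: one logical edge becomes three transmissions through two extra components.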
What the framework tells you.
A cross-cutting concern, made a deployable. mtls, retry, trace are one component pattern instantiated once per location, applied uniformly across the depends-on graph — the textbook shape of a cross-cutting concern, but as Cmp objects you run and version, not woven code. The §5 natural transformation and this component are two encodings of one idea; the mesh picks the one with an operational surface.
Policy moves without touching business logic. Because retry and mtls are TrnLocs in the proxy, upgrading mesh behaviour redeploys proxy components — no service component's placements change. Decoupling of policy from code is structural, not disciplinary.
The cost is on the diagram. Every logical hop is now three Trms, and two extra components sit on every call path. The uniform-policy win and the latency-plus-operational-surface cost are both read off the same picture — the proxies are real components with real placements to run and observe.
Contrast — plain microservices is this diagram with the proxies removed and
the direct service-to-service transmission restored: t : ServiceA → ServiceB,
a shared port. The mesh is the same substrate with one uniform component
adjoined at every location.
Broker / message-queue
A producer drops a message onto a queue. A pool of workers pulls from that queue, and each message is handed to exactly one worker. The queue is durable and ordered; messages wait until a worker takes them. This is how you spread a backlog of work across a fleet — a hundred jobs, ten workers, each job done once.
The four atoms.
Dat — the central datum is Message; the only object a producer and a worker
both name is Message itself — there is no shared subscription discriminator:
Morphism | Signature | Semantics |
|---|---|---|
|
| the job's data |
|
| when it was enqueued |
|
| its position in the queue order |
The queue is the free monoid Message* — finite sequences under
concatenation, appended at one end and taken from the other.
Trn:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| pure — the broker's routing: pick one |
|
| effect |
Loc — Producer (the enqueuing thread); Broker (with BrokerQueue, its
durable disk); Worker (a competing-consumer thread, usually several).
Trm:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Placement. enqueue is a TrnLoc at Producer; match a TrnLoc at
Broker; handle a TrnLoc at each Worker. The Message datum has three
DataLocs over the same Dat — the object in Producer RAM, the persisted
entry at BrokerQueue, the in-flight copy at one Worker.
graph LR
P["ProducerCmp<br/>TrnLoc: enqueue"]
Bk["BrokerCmp<br/>TrnLoc: match · DataLoc: Message* queue"]
C1["WorkerCmp A<br/>TrnLoc: handle"]
C2["WorkerCmp B<br/>TrnLoc: handle"]
P -->|"t_enqueue : Message"| Bk
Bk -->|"t_deliver : Message (to one)"| C1
Bk -->|"t_deliver : Message (to one)"| C2
C1 -.->|"t_ack : Offset"| Bk
C2 -.->|"t_ack : Offset"| Bk
style P fill:#cf7fcf,color:#fff
style Bk fill:#cf7fcf,color:#fff
style C1 fill:#cf7fcf,color:#fff
style C2 fill:#cf7fcf,color:#fff
Like pub/sub, there is no transmission Producer → Worker. The
producer–worker dependency factors — it is a composite in the routing
category:
t_deliver ∘ t_enqueue : Producer → Broker → Worker
It passes through Broker. The components glue to match —
System = ProducerCmp ⋈ BrokerCmp ⋈ WorkerCmp, along t_enqueue and
t_deliver — and ProducerCmp and WorkerCmp share no port directly.
The defining constraint, as a law. Start from publish–subscribe: the same
broker, the same persisted Message*, the same transmissions — and a routing
transformation match : Message → Subscriber* that fans one message out to
every subscriber. Flip one constraint — collapse the multiplicity of match:
The broker's routing transformation is match : Message → Subscriber — it selects exactly one consumer. For each Message there is exactly one t_deliver transmission, terminating at one Worker.
That is the only edit. Subscriber* (a list — broadcast) becomes Subscriber
(pick one — competing consumers). Where pub/sub turns one event into many
t_delivers, the broker turns one message into one. A second consequence
follows from the same flip: because each message has a single recipient, the
t_deliver transmission is typically initiated by the Worker — it pulls
from the queue when it has capacity, rather than being pushed. The arrow's
direction in Trm is unchanged; the trigger moves to the consumer end.
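The flip from `Subscriber*` to `Subscriber` shows up as the return type of the routing function. The sketch below is a toy, in-memory illustration: the `Queue` class is hypothetical, and round-robin selection is an assumption standing in for whatever pick-one policy a real broker applies (including worker-initiated pulls).

```python
from collections import deque
from itertools import cycle


def fan_out(message, subscribers):
    """Pub/sub routing, Message -> Subscriber*: one delivery per subscriber."""
    return [(subscriber, message) for subscriber in subscribers]


class Queue:
    """Competing consumers, Message -> Subscriber: one delivery per message."""

    def __init__(self, workers):
        self.pending = deque()        # the ordered backlog, oldest at the left
        self._next = cycle(workers)   # assumed policy: round-robin over workers

    def enqueue(self, message):
        self.pending.append(message)

    def deliver(self):
        """Hand the head of the queue to exactly one worker."""
        return (next(self._next), self.pending.popleft())


broadcast = fan_out("m1", ["A", "B"])

queue = Queue(["A", "B"])
for message in ("m1", "m2", "m3"):
    queue.enqueue(message)
deliveries = [queue.deliver() for _ in range(3)]

print(broadcast)
print(deliveries)
```

Under broadcast, one message produces two deliveries; under the queue, three messages produce three deliveries in enqueue order, each to a single worker.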
What the framework tells you.
Scaling is adjoining a component. A new worker is a new WorkerCmp, its handle TrnLoc, and one new t_deliver. Because match picks one recipient from the live pool, throughput scales with the number of WorkerCmps — and no producer placement changes.
Temporal decoupling is the BrokerQueue DataLoc. t_enqueue writes the Message*; t_deliver reads it — later. The persisted free monoid is the buffer; a burst of enqueues simply lengthens the sequence, and workers drain it at their own pace. Producer and worker need never be co-live.
Exactly-once-ish is localised to two arrows. "One message, one worker" lives entirely in match (pick-one) and the t_ack transmission back from the worker. Redelivery, visibility timeouts, and dead-letter handling are all reasoned about on those two named arrows — nowhere else.
The order is a property of the monoid, not a worker. m_offset totally orders Message*; match consumes from the head. FIFO is read off the free monoid's structure — no component has to enforce it.
Contrast — publish–subscribe is the same broker component and the same
persisted Message*, with the delivery multiplicity set back to broadcast:
match : Message → Subscriber*, every subscriber gets every message. Broker/
queue and pub/sub are siblings — one substrate, the multiplicity of match the
only constraint flipped.
Data-flow
Pipe-and-filter / pipelines
Data enters one end, walks a straight line through a series of independent stages, and leaves transformed at the other. Each stage — a filter — does one job and knows nothing of its neighbours; the connectors between them — the pipes — carry a typed datum forward. A Unix shell command, a build pipeline, an ETL job: all the same shape.
The four atoms. Pipe-and-filter constrains the transformation layer: the whole architecture is one composable chain in the algorithm category.
Dat — the stages of the datum as it is rewritten; what matters is not its
internal structure but the sequence of types the chain steps through:
Morphism | Signature | Semantics |
|---|---|---|
| objects of | the type at each cut of the pipeline — |
Trn — each filter is exactly one transformation:
| Transformation | Signature | Kind |
|---|---|---|
| parse | Raw → Parsed | pure |
| score | Parsed → Scored | pure |
| render | Scored → Report | effect |
Loc — one site per filter, fixed by nothing but the filter itself:
FilterLocₐ, FilterLocᵦ, FilterLoc_c. They need not be distinct — a shell
pipeline runs them all in one process; a stream job spreads them across
machines. Trm — one pipe per adjacent pair:
| Transmission | Signature | Carries |
|---|---|---|
| pipe_AB | FilterLocₐ → FilterLocᵦ | Parsed |
| pipe_BC | FilterLocᵦ → FilterLoc_c | Scored |
Placement. Each filter is a component — FilterCmpₐ, FilterCmpᵦ,
FilterCmp_c — and each holds exactly one TrnLoc: its single
transformation placed at its single location. A pipe is the Trm welding two
of them.
graph LR
A["FilterCmp A<br/>TrnLoc: parse (Raw→Parsed)"]
B["FilterCmp B<br/>TrnLoc: score (Parsed→Scored)"]
C["FilterCmp C<br/>TrnLoc: render (Scored→Report)"]
A -->|"pipe_AB : Parsed"| B
B -->|"pipe_BC : Scored"| C
style A fill:#cf7fcf,color:#fff
style B fill:#cf7fcf,color:#fff
style C fill:#cf7fcf,color:#fff
The pipeline is a composite morphism in the algorithm category Alg (§2.2): arrow composition, made physical.
score ∘ parse is defined because t_to(parse) = Parsed = t_from(score), and render ∘ (score ∘ parse) : Raw → Report is the whole system.
The components glue to match — System = FilterCmpₐ ⋈ FilterCmpᵦ ⋈ FilterCmp_c,
along pipe_AB and pipe_BC. Each pipe carries the type the upstream filter
produces straight into the downstream filter's input — every gluing port is a
typed handoff, never a call awaiting a return.
The defining constraint, as a law. Start from layered — also a linear chain of components — and flip one constraint: the edges are not calls but compositions. Make every component exactly one transformation, and require adjacent filters to compose at the type level:
The depends-on graph is a linear chain c₀ → c₁ → … → cₙ; each cᵢ holds exactly one TrnLoc of one Trn fᵢ; and for every edge cᵢ → cᵢ₊₁, t_to(fᵢ) = t_from(fᵢ₊₁), so the pipe carries exactly that type.
That single equation is "pipe-and-filter." It is Coherence Law 1
(Placement honesty) specialised: filter fᵢ₊₁ reads t_from(fᵢ₊₁) at its
location, and the only thing that delivers it is pipe_{i,i+1}, so that pipe τ
must satisfy carries(τ) = t_from(fᵢ₊₁), which forces the type-match. A filter
whose input type is not the upstream output type is a pipe that cannot be
welded: the framework rejects it before anything runs.
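A small Python sketch makes the weld check concrete. It is an illustration under stated assumptions: the `Filter` class and `weld` function are hypothetical, type names are compared as plain strings at construction time (standing in for a real type checker), and the three filter bodies are invented. `render` is kept side-effect free here so the example stays testable, although the text classes it as an effect.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass(frozen=True)
class Filter:
    name: str
    t_from: str                     # input type name
    t_to: str                       # output type name
    run: Callable[[Any], Any]

def weld(filters: List[Filter]) -> Callable[[Any], Any]:
    """Compose a chain, rejecting it if any adjacent pair violates
    t_to(f_i) == t_from(f_{i+1})."""
    for up, down in zip(filters, filters[1:]):
        if up.t_to != down.t_from:
            raise TypeError(
                f"pipe {up.name}->{down.name} cannot be welded: "
                f"{up.t_to} != {down.t_from}"
            )

    def pipeline(datum: Any) -> Any:
        for f in filters:
            datum = f.run(datum)
        return datum

    return pipeline

# Hypothetical filter bodies, matching the signatures in the diagram.
parse = Filter("parse", "Raw", "Parsed", lambda raw: [int(x) for x in raw.split(",")])
score = Filter("score", "Parsed", "Scored", lambda xs: sum(xs))
render = Filter("render", "Scored", "Report", lambda s: f"total={s}")

system = weld([parse, score, render])      # Raw -> Report
report = system("1,2,3")

try:
    weld([parse, render])                  # Parsed != Scored: rejected before running
    rejected = False
except TypeError:
    rejected = True
```

The rejection happens inside `weld`, before any datum flows, which mirrors the claim that a mismatched pipe is refused at design time rather than discovered at run time.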
What the framework tells you.
The edges are COMPOSE, not CALL. Layered's chain is call edges: a round trip, a transmission down and a return up. Pipe-and-filter's chain is compose edges: one transmission, one direction, c_from ≠ c_to, and no return Trm at all. The same line shape, a different morphism.
Splicing a filter is adjoining one arrow. Inserting g between fᵢ and fᵢ₊₁ is legal exactly when t_to(fᵢ) = t_from(g) and t_to(g) = t_from(fᵢ₊₁); the new component re-welds two pipes. No other filter's placement changes; the rest of the chain cannot observe it.
Filters are trivially relocatable. A filter's only contract is its t_from → t_to signature, so its TrnLoc is free to move (same thread, separate process, separate machine), touching only its own Loc and the two adjacent pipes. A shell pipeline and a distributed stream job are the same architecture at different placements.
The honest cost is shared state. A filter has one input type and one output type, with nowhere to keep a lookup table or accumulated history. Such state must be threaded through the pipe (fattening carries) or hoisted into a side channel that breaks the linear chain. The constraint that buys composability is the one that makes cross-stage context awkward.
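The splicing rule is simple enough to write down as a predicate. The sketch below reduces each filter to its signature and checks the two equations from the text; `can_splice`, `splice`, and the `dedupe` filter are hypothetical names introduced only for illustration.

```python
from typing import List, Tuple

# A filter reduced to its signature: (name, t_from, t_to).
Sig = Tuple[str, str, str]

def can_splice(chain: List[Sig], i: int, g: Sig) -> bool:
    """True when g may be inserted between chain[i] and chain[i + 1]:
    t_to(f_i) == t_from(g) and t_to(g) == t_from(f_{i+1})."""
    upstream, downstream = chain[i], chain[i + 1]
    return upstream[2] == g[1] and g[2] == downstream[1]

def splice(chain: List[Sig], i: int, g: Sig) -> List[Sig]:
    """Return a new chain with g welded in; the original chain is untouched."""
    if not can_splice(chain, i, g):
        raise TypeError(f"{g[0]} does not fit between {chain[i][0]} and {chain[i + 1][0]}")
    return chain[: i + 1] + [g] + chain[i + 1:]

chain: List[Sig] = [
    ("parse", "Raw", "Parsed"),
    ("score", "Parsed", "Scored"),
    ("render", "Scored", "Report"),
]
dedupe: Sig = ("dedupe", "Parsed", "Parsed")   # hypothetical extra filter

longer = splice(chain, 0, dedupe)              # legal: between parse and score
fits_after_score = can_splice(chain, 1, dedupe)
```

Every existing signature appears unchanged in `longer`, which is the "no other placement changes" property in miniature.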
Contrast — event-driven publish–subscribe also forbids a component from naming its successor — but routes through a broker that may fan out to many consumers. Pipe-and-filter keeps the graph a single line: no broker, no branching, the successor fixed by the next link in the chain.
CQRS
CQRS — Command Query Responsibility Segregation — splits a system along the line between changing data and reading it. The write side accepts commands and owns the source of truth in whatever shape best enforces the rules. The read side answers queries from a separate store, shaped for the screens that consume it — and that store is never written by hand: it is materialised from the write side after every command.
The four atoms. CQRS is a constraint on data placement — two DataLocs
where a single-model architecture has one. The transformations either side runs
are whatever the domain needs.
Dat — two parallel shapes over the same domain. State is the write-side
truth (the shape that enforces invariants); ReadModel is the query-side
projection (the shape the screen wants). They are connected, not independent:
Morphism | Signature | Semantics |
|---|---|---|
|
| which write-side truth this view denormalises |
|
| the write-side version this view reflects |
Trn:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| pure — denormalise truth into a view |
|
| pure — read-only, never touches |
Loc — the write store and the read store are distinct locations:
WriteDB, ReadDB. Trm — t_decide writes WriteDB; t_project carries
each committed State to ReadDB; t_query reads ReadDB back.
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
Placement. decide is a TrnLoc in the write component; query a
TrnLoc in the read component. State is the one authoritative DataLoc —
materialised at WriteDB. ReadModel is a second DataLoc at ReadDB, and
it is deduced: the output of a placed project, never written by a command.
graph LR
DC["decide (Trn)"]
WS["State<br/>(authoritative DataLoc)"]
PR["project : State→ReadModel (Trn)"]
RM["ReadModel<br/>(deduced DataLoc)"]
QY["query (Trn)"]
DC -->|"t_decide"| WS
WS -->|"t_project"| PR
PR -->|"produces"| RM
RM -->|"t_query"| QY
style DC fill:#7fc47f,color:#000
style WS fill:#4f8cf7,color:#fff
style PR fill:#7fc47f,color:#000
style RM fill:#9a9a9a,color:#fff
style QY fill:#7fc47f,color:#000
The read store is a composition, never an independent thing:
ReadModel = project(State) — the truth built by decide, then projected. The
system glues from two components — System = WriteCmp ⋈ ReadCmp — along the
t_project transmission; the write component and the read component share
no other port, and the read component has no inbound edge except
t_project.
The defining constraint, as a law. Start from the single-model
architecture: one DataLoc for an entity, and both decide and query touch
that same placement. Flip one constraint — make the read placement separate and
derived:
Commands and queries use different DataLocs. The write DataLoc (State) is authoritative; the read DataLoc (ReadModel) is not. It is project of the write DataLoc, and no command writes it directly: ReadModel = project ∘ decide*.
ReadModel is a deduced morphism, so storing it as an independent authority
would be a redundant edge — a diagram that fails to commute. This is the post's
"design result-data before source-data" rule promoted from one mental step to
two physical DataLocs: the read model literally is the result-data shape,
the write model the source-data shape, and project is the push between them.
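As a rough Python sketch of that law: commands only ever produce a new `State`, and the `ReadModel` exists solely as the output of `project`. The account/balance domain, the command shape `(account, delta)`, and the fields `total` and `richest` are assumptions made for the example; `rm_at` follows the name used later in this section for the write-side version a view reflects.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class State:                      # write-side truth
    version: int = 0
    balances: Tuple[Tuple[str, int], ...] = ()

@dataclass(frozen=True)
class ReadModel:                  # query-side projection
    rm_at: int                    # write-side version this view reflects
    total: int
    richest: str

def decide(state: State, command: Tuple[str, int]) -> State:
    """Apply an (account, delta) command to the authoritative state."""
    account, delta = command
    balances = dict(state.balances)
    balances[account] = balances.get(account, 0) + delta
    return State(state.version + 1, tuple(sorted(balances.items())))

def project(state: State) -> ReadModel:
    """Pure denormalisation: the only way a ReadModel comes into being."""
    balances = dict(state.balances)
    richest = max(balances, key=lambda k: balances[k]) if balances else ""
    return ReadModel(state.version, sum(balances.values()), richest)

write_db = State()
for cmd in [("alice", 50), ("bob", 20), ("alice", -10)]:
    write_db = decide(write_db, cmd)      # commands touch WriteDB only

read_db = project(write_db)               # ReadDB is deduced, never written by hand
```

Nothing in `decide` mentions `ReadModel`, so there is no code path by which a command could write the read store directly.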
What the framework tells you.
Eventual consistency is the latency of one named transformation. ReadDB is correct exactly as far as t_project has run. A query that reads ReadModel before project catches up sees a stale rm_at: the gap between the write DataLoc and the read DataLoc not yet closed by project. The framework names the staleness instead of waving at it: it is one edge's lag, visible on the diagram.
Two stores scale independently because they are two DataLocs. Write and read are distinct locations, so ReadDB can be replicated, re-indexed, or re-shaped for the screens without touching WriteDB. The isolation is a fact about the diagram, not a discipline.
Read shapes are free projects. A new screen wanting a different denormalisation is a new project' TrnLoc and a new ReadModel' DataLoc, adjoined per the extensibility rule, over an unchanged authoritative State. The write side cannot observe the addition.
The cost sits on the project edge. A second DataLoc with no authority of its own, plus a transformation that must run on every write and has its own failure modes. The framework shows the gain (independent reads) and the cost (a derived store to keep current) on the same diagram.
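The staleness described in the first point can be made measurable. In the sketch below, which assumes a simple integer version counter and a hypothetical `Cqrs` class, the lag is just the difference between the write-side version and the `rm_at` the read side last saw.

```python
from typing import List, Optional

class Cqrs:
    """Toy model: only versions are tracked, not the data itself."""

    def __init__(self) -> None:
        self.write_version = 0               # WriteDB: authoritative counter
        self.pending: List[int] = []         # committed versions not yet projected
        self.rm_at: Optional[int] = None     # ReadDB: version the view reflects

    def decide(self) -> None:
        """Commit one command on the write side."""
        self.write_version += 1
        self.pending.append(self.write_version)

    def run_project(self) -> None:
        """Drain the t_project edge: bring ReadDB up to the latest commit."""
        if self.pending:
            self.rm_at = self.pending[-1]
            self.pending.clear()

    def lag(self) -> int:
        """Staleness, measured in commits the read side has not yet seen."""
        return self.write_version - (self.rm_at or 0)

system = Cqrs()
system.decide()
system.decide()
lag_before = system.lag()      # two commits, none projected yet
system.run_project()
lag_after = system.lag()       # read side has caught up
```

A real deployment would measure the lag in time or log offsets rather than a counter, but the shape is the same: one number, attached to one edge.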
Contrast — event sourcing is the natural pairing: it constrains the write
DataLoc to be an append-only Event* log and makes project a fold over
it. CQRS alone says nothing about the write store's shape — only that the
read store is a separate, derived placement; event sourcing constrains the
write store, CQRS the read store.
Event sourcing
Most systems store current state — the account balance, the order status — and overwrite it on every change. Event sourcing refuses to. It stores the sequence of changes themselves — an append-only log of events — and treats that log as the only source of truth. Current state is never written down; it is recomputed from the log whenever it is needed.
The four atoms.
Dat — the central object is Event, and the authoritative store is the free
monoid on it:
Morphism | Signature | Semantics |
|---|---|---|
|
| which kind of change |
|
| the change's data |
|
| when it was appended |
The log is Event* — the free monoid: finite sequences under
concatenation. State is a projected domain value (a balance, a status).
Trn:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| pure — one per |
|
| pure — catamorphism, |
Loc — the event store (a disk). Trm — t_append writes the store;
t_replay reads the log back for a fold.
Placement. The log is the one authoritative DataLoc; State is not a
DataLoc at all — it is the output of a placed fold:
graph LR
AP["append (Trn)"]
LOG["Event* log<br/>(authoritative DataLoc)"]
FD["fold : Event*→State (Trn)"]
ST["State (deduced)"]
AP -->|"t_append"| LOG
LOG -->|"t_replay"| FD
FD -->|"produces"| ST
style AP fill:#7fc47f,color:#000
style LOG fill:#4f8cf7,color:#fff
style FD fill:#7fc47f,color:#000
style ST fill:#9a9a9a,color:#fff
Current state is a composition, never a stored thing:
State = fold(Event*) — the log built by append, then folded.
The defining constraint, as a law. Start from a state-stored architecture:
an entity has a DataLoc holding its current value, and each command
overwrites it. Flip one constraint:
The only authoritative DataLoc is the append-only Event* log; no transformation overwrites state, and every State value is fold of the log.
This is the post's "deduce, don't store" principle promoted to a whole
architecture: State is a deduced morphism (fold ∘ append*), so storing it
authoritatively would be a redundant edge — a diagram that fails to commute.
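A minimal Python sketch of "State = fold(Event*)", assuming a bank-balance domain. The event kinds `Deposited` and `Withdrawn`, the field names, and the in-memory list standing in for the event store are all hypothetical.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List

@dataclass(frozen=True)
class Event:
    e_type: str       # which kind of change (hypothetical field name)
    amount: int       # the change's data

def handle(state: int, event: Event) -> int:
    """One step of the fold: apply a single event to a balance."""
    if event.e_type == "Deposited":
        return state + event.amount
    if event.e_type == "Withdrawn":
        return state - event.amount
    return state      # unknown event kinds leave the state unchanged

def fold(log: List[Event]) -> int:
    """Current state is recomputed from the log; it is never stored."""
    return reduce(handle, log, 0)

log: List[Event] = []             # the only authoritative store

def append(event: Event) -> None:
    log.append(event)             # append-only: existing entries are never overwritten

append(Event("Deposited", 100))
append(Event("Withdrawn", 30))
balance = fold(log)
```

There is no `balance` variable that a command updates; the only write in the program is `append`.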
What the framework tells you.
Time travel is free because the fold is total. The log is a free monoid, so every prefix is itself a valid Event*. Folding a prefix gives the state as of any past moment; folding with a different handle gives a new projection over the same history: a new Trn, adjoined per the extensibility rule, over an unchanged DataLoc.
A cached State is explicitly secondary. Real systems place snapshot DataLocs (memoised fold results) as an optimisation. The framework is blunt that these are deduced, not authoritative: valid exactly as far as fold says so, the same status as a CQRS read model.
The honest cost sits on one edge. Replaying the whole log is O(history), and because old events are immutable, fold must handle every event shape the log has ever held. Both costs live on the single t_replay edge from log to fold; the substrate makes them visible, not surprising.
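Time travel and snapshots can both be shown with the same fold. The sketch assumes events are `(kind, amount)` tuples and that a snapshot is nothing more than a memoised fold of a prefix plus the offset it covers; both are illustrative choices, not a prescribed storage format.

```python
from functools import reduce
from typing import List, Tuple

Event = Tuple[str, int]          # (kind, amount): a hypothetical event shape

def handle(balance: int, event: Event) -> int:
    kind, amount = event
    return balance + amount if kind == "Deposited" else balance - amount

def fold(log: List[Event], start: int = 0) -> int:
    return reduce(handle, log, start)

log: List[Event] = [("Deposited", 100), ("Withdrawn", 30), ("Deposited", 5)]

# Time travel: any prefix of the log is itself a valid log.
balance_after_two = fold(log[:2])

# Snapshot: a memoised fold of a prefix, plus the offset it is valid up to.
snapshot_offset = 2
snapshot = fold(log[:snapshot_offset])

# Resuming from the snapshot must agree with replaying the whole history.
from_snapshot = fold(log[snapshot_offset:], start=snapshot)
from_scratch = fold(log)
```

The equality of `from_snapshot` and `from_scratch` is what "valid exactly as far as fold says so" means in practice: a snapshot that disagrees with a full replay is simply wrong.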
Contrast — CQRS is the natural pairing: the event log is the CQRS write
DataLoc, and a CQRS read model is just one fold among several. Event
sourcing constrains the write store to be a log; CQRS constrains the read
store to be separate and derived.
Decoupling & boundaries
Event-driven / publish–subscribe
A publisher never calls a consumer. It emits an event to a broker, and whoever subscribed to that event type receives it. The publisher does not know who — or whether anyone — is listening. This is implicit invocation.
The four atoms.
Dat — the central datum is Event; the only Dat object a publisher and a
subscriber both name is EventType:
Morphism | Signature | Semantics |
|---|---|---|
|
| the discriminator both sides share |
|
| the event's data |
|
| when it was published |
The stream is the free monoid Event* — the append-only log.
Subscription = Consumer × EventType.
Trn:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| pure — the broker's routing |
|
| pure — per- |
|
| pure — |
Loc — Producer (the publishing thread); Broker (with BrokerLog, its
durable disk); Consumer (a subscribing worker, usually several).
Trm:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Placement. publish is a TrnLoc at Producer; match a TrnLoc at
Broker; handle and fold are TrnLocs at Consumer. The Event datum has
three DataLocs over the same Dat — the object in Producer RAM, the
persisted entry at BrokerLog, the received copy at Consumer.
graph LR
P["ProducerCmp<br/>TrnLoc: publish"]
Bk["BrokerCmp<br/>TrnLoc: match · DataLoc: Event* log"]
C1["ConsumerCmp A<br/>TrnLoc: handle, fold"]
C2["ConsumerCmp B<br/>TrnLoc: handle, fold"]
P -->|"t_publish : Event"| Bk
Bk -->|"t_deliver : Event"| C1
Bk -->|"t_deliver : Event"| C2
C1 -.->|"t_ack : Offset"| Bk
C2 -.->|"t_ack : Offset"| Bk
style P fill:#cf7fcf,color:#fff
style Bk fill:#cf7fcf,color:#fff
style C1 fill:#cf7fcf,color:#fff
style C2 fill:#cf7fcf,color:#fff
The defining observation: there is no transmission Producer → Consumer.
The producer–consumer dependency factors — it is a composite in the routing
category:
t_deliver ∘ t_publish : Producer → Broker → Consumer
It passes through Broker. The components glue to match —
System = ProducerCmp ⋈ BrokerCmp ⋈ ConsumerCmp, along t_publish and
t_deliver — and ProducerCmp and ConsumerCmp share no port directly.
The defining constraint, as a law. Start from request/response: a single
transmission t : A → B, and ACmp names BCmp — a direct depends-on edge.
Flip one constraint:
Delete the direct edge. Every inter-component dependency is mediated, and mediated indirectly, through the broker. In the depends-on graph there is no edge ConsumerCmp → ProducerCmp; both point at BrokerCmp.
This is Coherence Law 4 at its limit: Law 4 requires every cross-location
dependency to be mediated by a transmission; the event-based style strengthens
it so no producer/consumer pair is ever directly coupled. "A doesn't know B
exists" is not discipline — it is the absence of a Cmp-reference, a fact about
the diagram.
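A toy in-process broker shows what "the absence of a Cmp-reference" looks like in code. Everything here is an assumption made for the sketch: the `Broker` class, the `DocumentIssued` event type, and synchronous delivery inside `publish` (a real broker would deliver later, from the persisted log).

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]

class Broker:
    def __init__(self) -> None:
        self.log: List[Event] = []                                   # Event*
        self.subscriptions: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self.subscriptions[event_type].append(handler)

    def match(self, event: Event) -> List[Handler]:
        """Routing: one event in, the subscribed handlers out."""
        return list(self.subscriptions[str(event["type"])])

    def publish(self, event: Event) -> None:
        self.log.append(event)
        for deliver in self.match(event):
            deliver(event)

def make_producer(broker: Broker) -> Callable[[str], None]:
    """The producer holds a reference to the broker and to nothing else."""
    return lambda doc_id: broker.publish({"type": "DocumentIssued", "id": doc_id})

broker = Broker()
emails: List[object] = []
ledger: List[object] = []

broker.subscribe("DocumentIssued", lambda e: emails.append(e["id"]))
issue = make_producer(broker)
issue("doc-1")

# Adjoining a new consumer: the producer code above is not touched.
broker.subscribe("DocumentIssued", lambda e: ledger.append(e["id"]))
issue("doc-2")
```

`make_producer` takes only the broker as an argument, so there is no name in its scope through which it could reach a consumer.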
What the framework tells you.
Extensibility is adjoining an object. A new subscriber is a new ConsumerCmp, its handle/fold TrnLocs, a Subscription, and one new t_deliver. No existing placement changes; the publisher cannot even observe the addition.
Temporal decoupling is the BrokerLog DataLoc. t_publish writes the stream; t_deliver reads it, later. The persisted Event* is the buffer; producer and consumer need never be co-live.
Fan-out is match. match : Event → Subscription* turns one event into many t_delivers, one per subscriber: the broker's whole job, one functor.
The cost is honest, too. The depends-on graph is sparse and stable (everything points at the broker), but control flow (which consumer runs after a publish) lives in Subscription data and match, in no component's placements. The method shows the extensibility win and the lost-control-flow cost in one diagram.
Contrast — request/response is the same picture with the broker removed and
the direct edge restored: one transmission t : A → B, and ACmp names
BCmp — a direct depends-on edge, a shared port.
Hexagonal (ports & adapters)
A domain core — the rules of your business — that touches the outside world only through abstract slots. The core never names a database driver, an HTTP client, or a queue SDK; it declares a port and an adapter is plugged in from outside to fulfil it.
The four atoms. Hexagonal constrains the component graph and the kind of
boundary transmission a component may hold — its Dat, Trn and Loc are
whatever the domain needs.
Dat — the core owns a private cluster of domain types (Document, Money,
Counterparty); a port also fixes a contract type — the datum on the wire,
which is the only Dat object the core and its adapters both name.
Trn — domain transformations in the core, plus the adapter's translation:
Transformation | Signature | Kind |
|---|---|---|
|
| pure — core business rule |
|
| effect |
|
| effect |
|
| effect |
save is the contract t_from = DocumentDraft → t_to = SaveResult; pg_save
and mem_save are parallel realisations of it — same signature, different
code.
Trm — the core's only outward transmission lands on a port, never on a store:
Transmission | Signature | Carries |
|---|---|---|
|
|
|
|
|
|
|
|
|
Loc — Core (the domain thread); Postgres, Heap, a test double's RAM —
each adapter's concrete site. The port has no Loc: it is an interface, not
a place.
Placement. The core is a component; each adapter is a separate component;
the port is the shared Trm they glue along. decide and the core's t_port
are TrnLoc/TrmCmps at Core; pg_save is a TrnLoc at Postgres inside
PgAdapterCmp. The core's depends-on graph names the port — never an adapter:
graph LR
Core["DomainCoreCmp<br/>TrnLoc: decide · TrmCmp: t_port"]
Port["save port (slot)<br/>contract: Draft→SaveResult"]
PG["PgAdapterCmp<br/>TrnLoc: pg_save"]
Mem["MemAdapterCmp<br/>TrnLoc: mem_save"]
Core -->|"t_port : DocumentDraft"| Port
PG -.->|"realises (2-cell)"| Port
Mem -.->|"realises (2-cell)"| Port
style Core fill:#cf7fcf,color:#fff
style Port fill:#f7c04f,color:#000
style PG fill:#cf7fcf,color:#fff
style Mem fill:#9a9a9a,color:#fff
The system is the composite System = DomainCoreCmp ⋈ PgAdapterCmp, glued
along the port: a TrmCmp of the core and a TrmCmp of the adapter name the
same Trm, opposite orientation, so in System that transmission becomes
internal. Choosing MemAdapterCmp instead gives System′ = DomainCoreCmp ⋈ MemAdapterCmp; the core's placements are byte-identical in both. The core
factors out of the choice:
behaviour(DomainCoreCmp) is invariant under which adapter is glued in.
The defining constraint, as a law. Start from layered: there the core
(business layer) depends-on the data-access layer directly — a named edge
BusinessCmp → DataAccessCmp, the core naming its concrete partner. Flip one
constraint:
No TrmCmp of the core component has a c_to (or c_from) inside a concrete external component. Every boundary transmission of the core lands on a port: a Trm typed only by its carries datum. The depends-on graph of the core names ports, never adapters; the concrete component is supplied from outside and glued in along the port.
This tightens Coherence Law 4 in one specific way. Law 4 says a cross-location
dependency must be mediated by a transmission; layered satisfies it with an
edge straight to DataAccessCmp. Hexagonal forbids that edge from naming a
concrete component at all: the mediating Trm is identified only by its
contract type, and which component sits on the other end is decided by gluing,
later, from outside. The core stops naming what it talks to. That is the style.
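In Python the port can be sketched with `typing.Protocol`. The contract `DocumentDraft → SaveResult` comes from the text; the field names, the validation rule inside `issue_document`, and the two adapters are hypothetical. A Postgres adapter is left out because it would need a live database; an in-memory adapter and a failing test double stand in for the parallel realisations.

```python
from dataclasses import dataclass
from typing import Dict, Protocol

@dataclass(frozen=True)
class DocumentDraft:
    doc_id: str
    total: int

@dataclass(frozen=True)
class SaveResult:
    ok: bool

class SavePort(Protocol):
    """The port: typed only by its contract, DocumentDraft -> SaveResult."""
    def save(self, draft: DocumentDraft) -> SaveResult: ...

class MemAdapter:
    """One realisation of the contract, backed by a dict."""
    def __init__(self) -> None:
        self.rows: Dict[str, DocumentDraft] = {}

    def save(self, draft: DocumentDraft) -> SaveResult:
        self.rows[draft.doc_id] = draft
        return SaveResult(ok=True)

class RejectingAdapter:
    """A test double: same signature, always reports failure."""
    def save(self, draft: DocumentDraft) -> SaveResult:
        return SaveResult(ok=False)

def issue_document(port: SavePort, doc_id: str, total: int) -> str:
    """Core logic: names the port, never a concrete adapter."""
    if total <= 0:
        return "rejected"
    return "saved" if port.save(DocumentDraft(doc_id, total)).ok else "failed"

mem = MemAdapter()
with_mem = issue_document(mem, "d1", 120)
with_double = issue_document(RejectingAdapter(), "d1", 120)
```

`issue_document` is the same function object in both calls; only the component glued in along the port differs.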
What the framework tells you.
It is exactly the strategy 2-category (§5 / deep-dive). A port is one Trn slot: the contract DocumentDraft → SaveResult. Every adapter is a parallel TrnLoc realising that one signature: pg_save and mem_save are parallel 1-cells over the same contract, and picking one is selecting a 2-cell. Hexagonal is not a special architecture; it is the strategy 2-category applied to a component's boundary transmissions.
Testability is structural, not a virtue. Because the core glues to a port and not an adapter, DomainCoreCmp ⋈ TestDoubleAdapterCmp is a valid composite with zero change to the core's placements (the invariance equation above). "The core is unit-testable in isolation" is just: the core composes with any component whose port-TrmCmp matches the contract.
Extensibility is adjoining a component. A new backing technology is a new AdapterCmp with its t_from → t_to TrnLoc and one concrete Trm, a new parallel arrow into the existing port. No existing placement changes; the core cannot observe the addition.
The honest cost: adapters are real components. Every port needs at least one adapter, and each adapter is its own bundle of placements with its own concrete Trm to maintain. The indirection that buys swappability also multiplies components; for a system with one database forever, that ceremony earns nothing. The method shows the price on the same diagram as the gain.
Contrast — layered keeps the direct edge: the core names the data-access
layer — a depends-on edge straight to DataAccessCmp. Hexagonal cuts that
edge and routes the dependency through a port, so the concrete partner becomes a
pluggable, parallel component glued in from outside.
Serverless / FaaS
You write a function and hand it to the platform. There is no server you rent, no process you keep alive. A request arrives, the platform spins up a container, runs your function once, tears the container down. The next request gets a fresh one. Between invocations nothing of yours is running anywhere.
The four atoms. Serverless constrains the lifetime of Loc — and that
single change to one atom cascades into everything else. The transformations and
their types are whatever the domain needs.
Dat — ordinary domain types; what matters is where each is materialised. The
function's input arrives on the wire; everything else lives in a managed store.
Trn — one placed handler plus the reads it is forced into:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| effect |
|
| effect |
Loc — two kinds, and the split is the whole style. EphLoc is the
per-invocation container: created when the request arrives, destroyed when
handle returns. StateLoc is the managed database / object store — long-lived,
owned by the platform, never your function. EdgeLoc is the trigger source.
Trm — every datum reaches the function over a wire, on every call:
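| Transmission | Signature | Carries |
|---|---|---|
| t_invoke | EdgeLoc → EphLoc | Request |
| t_load | StateLoc → EphLoc | Record |
| t_write | EphLoc → StateLoc | Record |
| t_return | EphLoc → EdgeLoc | Response |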
Placement. handle is a TrnLoc at EphLoc inside FunctionCmp — but the
TrnLoc row itself is transient: it is born with the container and discarded
when it dies. StateCmp owns the only persistent DataLoc; FunctionCmp owns
no DataLoc at all.
graph LR
Ed["EdgeCmp<br/>(trigger)"]
Fn["FunctionCmp<br/>TrnLoc: handle @ EphLoc<br/>(transient — no DataLoc)"]
St["StateCmp<br/>DataLoc: Record @ StateLoc<br/>(persistent)"]
Ed -->|"t_invoke : Request"| Fn
St -->|"t_load : Record"| Fn
Fn -->|"t_write : Record"| St
Fn -.->|"t_return : Response"| Ed
style Ed fill:#cf7fcf,color:#fff
style Fn fill:#cf7fcf,color:#fff
style St fill:#cf7fcf,color:#fff
Because EphLoc does not survive the call, the placement of handle cannot be
composed forward in time — the next request rebuilds it. Every input handle
consumes must arrive by transmission within the one invocation window:
t_load ; handle ; t_write : StateLoc → EphLoc → StateLoc
State leaves and re-enters StateLoc every call; it never rests at EphLoc. The
components glue along the four boundary transmissions —
System = EdgeCmp ⋈ FunctionCmp ⋈ StateCmp — and FunctionCmp's contribution
to data(System) is empty: it owns behaviour, never storage.
The defining constraint, as a law. Start from a long-lived server: one
component, one stable Loc holding its transformations and its data — caches,
pools, indexes — resident across requests. Flip one constraint — drop the
lifetime of Loc to a single invocation:
FunctionCmp owns no persistent Loc. Its EphLoc exists only for one t_invoke ; handle ; t_return cycle, so there is no DataLoc at tl_loc(handle).
That makes Coherence Law 1 (Placement honesty) bite at its hardest. Law 1
says a transformation reads only data materialised at or transmitted to its
location. On a warm server you satisfy the "materialised at" clause for free —
the location persisted, so the cache is already there. In FaaS that clause is
structurally unavailable: there is no warm DataLoc at EphLoc. Only the
"transmitted to" clause remains. Cold starts and statelessness are not gotchas —
they are this missing DataLoc, read straight off the diagram.
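The missing "materialised at" clause can be simulated in plain Python, with no real FaaS platform involved. In this sketch the `StateStore` class stands in for StateLoc, the local dict inside `invoke` stands in for EphLoc, and the counter domain is invented; the load counter exists only to make the repeated transmission visible.

```python
from typing import Dict

class StateStore:
    """Stands in for StateLoc: long-lived, outside the function."""

    def __init__(self) -> None:
        self.records: Dict[str, int] = {}
        self.loads = 0

    def load(self, key: str) -> int:
        self.loads += 1
        return self.records.get(key, 0)

    def write(self, key: str, value: int) -> None:
        self.records[key] = value

def invoke(store: StateStore, key: str, delta: int) -> int:
    """One invocation: everything local is created here and dropped on return."""
    eph_cache: Dict[str, int] = {}        # EphLoc: empty on every call
    eph_cache[key] = store.load(key)      # t_load
    result = eph_cache[key] + delta       # handle
    store.write(key, result)              # t_write
    return result                         # t_return

store = StateStore()
first = invoke(store, "counter", 1)
second = invoke(store, "counter", 1)
```

The second call cannot reuse what the first one loaded, because `eph_cache` no longer exists; the load count grows by one per invocation.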
What the framework tells you.
All persistent data must live in a separate, long-lived DataLoc. Since FunctionCmp owns no durable Loc, the only place a Record can rest is StateLoc in StateCmp, a managed DB or object store. The split between EphLoc and StateLoc is forced, not a design choice.
Every datum is transmitted in, every invocation. With the "materialised at" clause of Law 1 gone, each input to handle must be satisfied by a Trm with c_to = EphLoc. Load the row, open the connection, fetch the config: t_load and connect re-fire on every call because the diagram leaves no shortcut.
The strategy axis is the only thing left to scale. A spike in load adjoins more parallel EphLocs, each with its own transient TrnLoc over the same handle Trn (§5). No capacity to plan, nothing to keep warm; the framework's extensibility move is the platform's autoscaler.
The honest tradeoff is one row of DataLoc. You give up every optimisation that depended on a warm location (pooling, in-memory caches, sticky state) in exchange for owning no Loc at all: billing drops to zero between calls because there is nothing placed to bill for.
Contrast — client–server pins each component to a durable Loc that holds
both its TrnLocs and its DataLocs for the life of the system; peer-to-peer
does the same for every peer. Serverless is the same diagram with one constraint
flipped — Loc lifetime collapses to a single invocation — so the DataLoc at
the function's location vanishes and all state is exiled to a persistent store.
Peer-to-peer
No server. Every node runs the same software and plays the same role: each one both asks other nodes for data and answers their requests. A file-sharing swarm, a blockchain network, a gossip cluster — there is no privileged box in the middle, and no centre to lose.
The four atoms. Peer-to-peer constrains the components — there is only one type of them — and the transmissions — who may open one. The transformations and their types are whatever the domain needs.
Dat — one domain type Resource, materialised at every peer; the style
fixes only that all those copies are the same Dat object, with no privileged
one:
Morphism | Signature | Semantics |
|---|---|---|
|
| the identity all peers agree on |
|
| the resource's payload |
|
| which version a given copy holds |
Trn — ordinary domain transformations, the same set placed at every peer:
Transformation | Signature | Kind |
|---|---|---|
|
| effect |
|
| effect |
|
| pure — reconcile two copies by |
|
| effect |
Loc — one location kind, Peer, instantiated n times: Peer₁, Peer₂,
…, Peerₙ, each with its own store. Trm — one transmission kind, and it is
symmetric:
| Transmission | Signature | Carries |
|---|---|---|
| t_ask | Peerᵢ → Peerⱼ | Query |
| t_send | Peerⱼ → Peerᵢ | Resource |
The signatures are quantified over all i, j — t_ask exists for every
ordered pair, in both directions.
Placement. Every component is an instance of one type, PeerCmp. Each
instance places the identical TrnLoc set — serve, request, merge,
locate — at its own Peer location, and owns one Resource DataLoc there.
No DataLoc is grey; none is authoritative.
graph LR
A["PeerCmp #1<br/>TrnLoc: serve, request, merge, locate"]
DA["Resource DataLoc<br/>(replica, @ Peer₁)"]
B["PeerCmp #2<br/>TrnLoc: serve, request, merge, locate"]
DB["Resource DataLoc<br/>(replica, @ Peer₂)"]
C["PeerCmp #3<br/>TrnLoc: serve, request, merge, locate"]
DC["Resource DataLoc<br/>(replica, @ Peer₃)"]
DA -->|"dl_cmp"| A
DB -->|"dl_cmp"| B
DC -->|"dl_cmp"| C
A <-->|"t_ask / t_send : Query ⇄ Resource"| B
B <-->|"t_ask / t_send : Query ⇄ Resource"| C
A <-->|"t_ask / t_send : Query ⇄ Resource"| C
style A fill:#cf7fcf,color:#fff
style B fill:#cf7fcf,color:#fff
style C fill:#cf7fcf,color:#fff
style DA fill:#4f8cf7,color:#fff
style DB fill:#4f8cf7,color:#fff
style DC fill:#4f8cf7,color:#fff
An exchange is still a composite, but it is not directional; either end may open it:
t_send ∘ t_ask : Peerᵢ → Peerⱼ → Peerᵢ, for every ordered pair (i, j)
The components glue along whichever transmissions are live —
System = PeerCmp ⋈ PeerCmp ⋈ … ⋈ PeerCmp — and because the factors are all
the same interface, the gluing is symmetric: ⋈ has no preferred operand,
and depends-on is an undirected graph.
The defining constraint, as a law. Start from client–server: two distinct
component types, an asymmetric initiation relation — the client always
initiates, the server always serves — and one authoritative DataLoc at the
server. Flip one constraint — erase the role distinction:
There is exactly one component type PeerCmp; every component is an instance of it, so the placement sets are equal up to location. And initiation is symmetric: ∀ i, j. (∃ τ. c_from(τ) = Peerᵢ ∧ c_to(τ) = Peerⱼ) ⟺ (∃ ρ. c_from(ρ) = Peerⱼ ∧ c_to(ρ) = Peerᵢ). No node is solely an initiator and none solely a responder.
This is client–server's directed-initiation law relaxed back to symmetry, and
its single authoritative DataLoc deleted. Coherence Law 4 still holds —
every cross-location dependency is mediated by a Trm — but the mediation is no
longer one-directional: depends-on carries an edge PeerᵢCmp → PeerⱼCmp for
every j, and the reverse edge too.
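The symmetry law is directly checkable once transmissions are reduced to `(c_from, c_to)` pairs. The sketch below is one way to phrase the check in Python; the function names and the three-peer full mesh are assumptions, and note that the law itself demands only symmetry, not that every pair be connected.

```python
from typing import Set, Tuple

Trm = Tuple[str, str]        # (c_from, c_to) of one transmission

def initiation_is_symmetric(transmissions: Set[Trm]) -> bool:
    """For every pair i, j: an i -> j transmission exists
    if and only if a j -> i transmission exists."""
    return all((dst, src) in transmissions for src, dst in transmissions)

def full_mesh(peers: Tuple[str, ...]) -> Set[Trm]:
    """Every ordered pair of distinct peers (an assumed topology)."""
    return {(a, b) for a in peers for b in peers if a != b}

p2p = full_mesh(("Peer1", "Peer2", "Peer3"))
client_server = {("Client", "Server")}       # the client always initiates

p2p_ok = initiation_is_symmetric(p2p)
cs_ok = initiation_is_symmetric(client_server)
```

Client–server fails the check for exactly the reason the text gives: its one edge has no reverse.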
What the framework tells you.
Scaling is adjoining an instance of an existing type. A new peer is one more PeerCmp: no new component type, no new transformation, no central node to reconfigure. The system grows by replicating an object the diagram already contains.
Resilience falls out of the symmetry. No component is privileged, so deleting any single PeerCmp leaves a smaller graph of the same shape. No node's loss is structurally different from any other's; there is no single point of failure, by construction, not by redundancy bolted on.
The honest cost: no authoritative DataLoc. Client–server gets consistency cheaply: one store, one source of truth. P2P has n DataLocs over one Dat and no canonical one, so Law 1 ("data read here was transmitted here") can only be satisfied for the whole network by a consensus or gossip protocol: merge run pairwise until the replicas agree. That protocol is the price of having no centre, and the diagram shows it.
Discovery is its own transformation. With no central registry, finding which peer holds a datum is the locate : Id → Peer* transformation, itself transmission-heavy. Client–server never needs it; the method makes the extra cost visible.
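"merge run pairwise until the replicas agree" can be sketched as a single gossip round. This is a deliberately naive illustration: the highest-version-wins policy, the `(version, payload)` replica shape, and the all-pairs exchange order are assumptions, and ties between equal versions with different payloads are not handled. Real consensus and gossip protocols are considerably more involved.

```python
from itertools import combinations
from typing import Dict, Tuple

Copy = Tuple[int, str]                  # (version, payload) of one replica

def merge(a: Copy, b: Copy) -> Copy:
    """Reconcile two copies: the higher version wins (an assumed policy)."""
    return a if a[0] >= b[0] else b

def gossip_round(replicas: Dict[str, Copy]) -> Dict[str, Copy]:
    """Every pair of peers exchanges copies once and keeps the merged result."""
    out = dict(replicas)
    for p, q in combinations(sorted(out), 2):
        winner = merge(out[p], out[q])
        out[p] = out[q] = winner
    return out

replicas = {"Peer1": (1, "draft"), "Peer2": (3, "final"), "Peer3": (2, "review")}
after = gossip_round(replicas)
converged = len(set(after.values())) == 1
```

No peer is consulted as the source of truth; agreement emerges only from the pairwise exchanges, which is the cost the text attributes to having no centre.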
Contrast — client–server is the same picture with the role split restored:
two component types instead of one, an asymmetric initiation relation, and a
single authoritative DataLoc the server alone owns.
This is how Guliel is built
None of this is a hobby. It's the working method behind Guliel — the financial operations platform my team builds — and the consolidations it produced are the proof.
Five tables that were never five objects. Guliel started, like every
system does, with nouns: customers, suppliers, invoices, expenses,
orders. Five tables, five "objects" — the object-oriented instinct, applied
to the schema. Modelling it categorically dissolved that. Customer and
Supplier aren't objects at all; they are roles on one directed edge between
parties — so they collapsed into a single Counterparty. An Expense is not
its own object either: it is a Document you received instead of issued —
the same object with direction = INCOMING. A purchase Order is a Document
with documentType = PURCHASE_ORDER. Five "objects" reduced toward two — and
every collapse was deduced, by one rule ("if X is Y plus morphisms, X is not
new"), and written down as a checkable reasoning trail. Not a refactor we
stumbled into after the code hurt. A deduction we made on the page.
A dispute is not a table. When we designed per-document disputes, the noun
instinct said: a dispute is a thing, give it a Dispute table. The category
said otherwise. A document carries an append-only log of events; its current
state is a fold over that log. A "dispute" is simply a sub-object of that
log — the sub-sequence of dispute-typed events. Storing a Dispute table would
have stored a morphism that already exists as a composition: a redundant
edge, a diagram that doesn't commute. So there is no Dispute table. The
category told us not to build one.
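The shape of that argument fits in a few lines of Python. To be clear, this is not Guliel's schema or code: the event type names, the `(type, payload)` tuple, and the status values are invented for the example. It only illustrates how a dispute can be read off the document's log as a sub-sequence and a fold, with no separate table.

```python
from typing import List, Tuple

Event = Tuple[str, str]      # (event type, payload): a hypothetical shape

# Assumed names for dispute-typed events; not taken from any real system.
DISPUTE_TYPES = {"DisputeOpened", "DisputeReplied", "DisputeResolved"}

def dispute_events(log: List[Event]) -> List[Event]:
    """The dispute is the sub-sequence of dispute-typed events, in log order."""
    return [e for e in log if e[0] in DISPUTE_TYPES]

def dispute_status(log: List[Event]) -> str:
    """Fold the sub-sequence into a current status."""
    status = "none"
    for kind, _ in dispute_events(log):
        status = "resolved" if kind == "DisputeResolved" else "open"
    return status

document_log: List[Event] = [
    ("Issued", "invoice 42"),
    ("DisputeOpened", "wrong amount"),
    ("Paid", "partial"),
    ("DisputeResolved", "credit note sent"),
]

current = dispute_status(document_log)
as_of_third_event = dispute_status(document_log[:3])
```

Because the status is computed from the log, it cannot drift out of sync with it, which is the practical meaning of "a diagram that commutes".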
And we build Guliel itself with AI — including, at times, posts like this one. The categorical specs are exactly what make that safe: the olog is the contract the AI generates against, and the contract we verify its output against. That is how AI accelerates the work instead of quietly eroding it.
If you're the kind of engineer who read this far, there's a part of Guliel built for you: a fully typed REST API and an MCP server, so you can wire your own financial operations — issue documents, pull reports, reconcile expenses — straight from your own code or an AI agent. The same method that keeps our architecture honest is what makes that surface clean enough to hand to you.
Explore the Guliel API & MCP →
— Sapir