Boundary-Driven Testing
Testing difficulty is architectural evidence. When a component cannot be exercised in isolation, the problem is not the tests — it is the structure.
The Problem
The conventional framing of testing as a discipline separate from design produces a particular kind of pain. Teams adopt frameworks, mandate coverage minimums, and write guidelines about what to test. The tests improve. The pain persists. A change to a business rule breaks seventeen tests, most of which are not about business rules. An integration test requires spinning up four services to assert one value. An end-to-end test passes in isolation and fails in CI for reasons nobody can reproduce. More coverage, more pain.
The problem is structural, not methodological. The tests are not wrong. The structure the tests are attempting to exercise is wrong. Components have absorbed responsibilities they should not have. Dependencies are hidden rather than declared. Boundaries have been drawn in the wrong place, or not drawn at all. No amount of better tooling, higher coverage requirements, or more disciplined test writing fixes a structural problem. Fix the structure; the tests follow.
The Core Insight
The test spiral — unit, integration, end-to-end, system, user acceptance — is not a testing methodology. It is an architectural map. Each ring of the spiral corresponds to a level of architectural scope, and that scope is determined entirely by where boundaries have been placed.
Get the boundaries right and the spiral populates itself: each role has clear targets, predictable scope, and low maintenance overhead. Get them wrong and the spiral collapses — unit tests become integration tests in disguise, E2E tests become the only reliable safety net, and the entire suite grows expensive while providing diminishing confidence.
In a system with no meaningful component boundaries, unit scope and system scope are the same thing. There is nothing below the full system that can be isolated. The spiral collapses into E2E by default, because E2E is the only level at which anything coherent can be exercised.
Testing Is Not Separate from Design
A component designed around a coherent responsibility, with explicit inputs, explicit outputs, and dependencies passed rather than acquired, is inherently testable. No additional effort is required to make it so. The same structural choices that allow the component to change without cascading effects allow it to be tested without elaborate setup.
The inverse is equally true. A component that cannot be unit-tested without mocking half the system is not badly tested — it is badly structured. The test difficulty is diagnostic. It reveals that the component has absorbed responsibilities it should not have, or that its dependencies are implicit rather than declared, or that the boundary between it and its collaborators has been drawn in the wrong place.
This is why the structural models defined in VBD and EBD are also testing models. The same role taxonomy that makes components replaceable makes them mockable. The same communication rules that prevent coupling prevent test contamination. The same line that isolates change isolates test scope.
Test Profiles by Role
Each component role in VBD and EBD has a characteristic test profile — not assigned arbitrarily, but derived from its structural position, responsibilities, and communication rules. Every role carries unit tests. The weight and character of those tests differs by role. Integration tests arise at the seams between roles.
Engines — The Unit Test Core
Engines are the most logic-dense role and the natural home of the unit test suite. An Engine encapsulates business rules: given inputs, apply policy, produce a result. It has no workflow awareness, no sibling Engine dependencies, and no reason to reach outward beyond a Resource Accessor — which it receives through an explicit, mockable interface.
This structural position is what makes Engines straightforwardly testable. Mock the Accessor, supply controlled inputs, assert on the output. The Engine’s communication constraints ensure there is nothing else to mock. The test scope is exactly the Engine and nothing more. Every business rule, every policy variant, every edge case and failure mode belongs here — and each one is fast, isolated, and cheap to run.
- Mock the Resource Accessor only — nothing else should be reachable
- Test every business rule, policy variant, and edge case
- Test every failure mode the Engine can produce
- Engine unit tests are the densest and most numerous in the suite
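To make the pattern concrete, here is a minimal Python sketch. `DiscountEngine`, `StubRateAccessor`, and the discount policy are hypothetical names invented for illustration; the structural point is only that the Accessor is the single injected, mockable seam.

```python
# Hypothetical Engine: applies a discount policy using rates fetched
# through an injected Resource Accessor -- the only mockable seam.
class DiscountEngine:
    def __init__(self, rate_accessor):
        self.rate_accessor = rate_accessor  # explicit, injected dependency

    def price(self, customer_tier: str, amount: float) -> float:
        rate = self.rate_accessor.discount_rate(customer_tier)
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"invalid rate {rate!r}")  # failure mode under test
        return round(amount * (1 - rate), 2)

# Unit test: mock the Accessor only -- nothing else is reachable.
class StubRateAccessor:
    def __init__(self, rate):
        self._rate = rate

    def discount_rate(self, tier):
        return self._rate

engine = DiscountEngine(StubRateAccessor(0.25))
assert engine.price("gold", 100.0) == 75.0  # policy happy path

bad = DiscountEngine(StubRateAccessor(1.5))
try:
    bad.price("gold", 100.0)
    raise AssertionError("expected ValueError")
except ValueError:
    pass  # failure mode exercised in isolation
```

Because the Engine's only outward reach is the injected Accessor, the whole test runs in-process with one stub and no framework machinery.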
Flows — The EBD Equivalent
Flows in EBD occupy the same structural position as Engines in VBD. A Flow receives shared state from the Experience, steps through Interactions, and emits a completion event carrying accumulated state upward. The rules that forbid a Flow from calling sibling Flows or making direct backend calls keep its unit scope tight.
Mock the backend call at the Experience boundary. Simulate Interaction events through a test harness. Assert on what the Flow emits at completion, what it emits on skip, and how it handles each conditional path through its Interaction sequence. The Flow’s behavior is fully exercisable without a running backend or a real browser environment.
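A minimal harness sketch, assuming an event-callback style of Flow; `AddressFlow` and the event shapes are hypothetical, not part of EBD itself.

```python
# Hypothetical Flow sketch: receives shared state, consumes Interaction
# events, and emits a completion or skip event upward via a callback.
class AddressFlow:
    def __init__(self, shared_state, emit):
        self.state = dict(shared_state)
        self.emit = emit  # Experience-provided callback; no backend access

    def on_interaction_event(self, event):
        if event["type"] == "address_entered":
            self.state["address"] = event["value"]
            self.emit({"type": "flow_complete", "state": self.state})
        elif event["type"] == "skipped":
            self.emit({"type": "flow_skipped", "state": self.state})

# Harness: simulate Interaction events, assert on what the Flow emits.
emitted = []
flow = AddressFlow({"user": "u1"}, emitted.append)
flow.on_interaction_event({"type": "address_entered", "value": "12 Elm St"})

assert emitted == [{"type": "flow_complete",
                    "state": {"user": "u1", "address": "12 Elm St"}}]
```

No browser and no backend appear anywhere: the emit callback and simulated events are the Flow's entire observable surface.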
Managers — Orchestration Under Test
Managers carry unit tests, but the character of those tests differs from Engine tests. Engine tests assert business rules — the bulk of the domain logic lives in Engines and demands exhaustive coverage. Manager tests assert orchestration: given this response from an Engine, did the Manager route correctly? Given a domain failure, did it handle and respond appropriately? Given multiple Engines in sequence, does the Manager compose them in the right order with the right inputs?
Manager unit tests are comparatively fewer than Engine tests because the Manager has less behavior to assert directly. It does not compute — it coordinates. What it does assert is consequential: every state that can arrive from its collaborators, and every routing decision that follows. Mock all collaborators. Feed controlled responses representing every state each contract can produce. Verify the Manager’s decisions, not its collaborators’ behavior.
- Mock all Engines and Resource Accessors
- Assert the orchestration sequence — which components are called, in what order, with what inputs
- Assert every routing decision: success paths, domain failures, unexpected errors
- Fewer tests than an Engine — not because rigor is lower, but because the behavior surface is narrower
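A sketch of the pattern using `unittest.mock`; `OrderManager` and its two Engines are hypothetical names, and the contract states are illustrative.

```python
from unittest.mock import Mock

# Hypothetical Manager: composes a validation Engine and a pricing Engine,
# routing on each contract state. All collaborators are mocked.
class OrderManager:
    def __init__(self, validator, pricer):
        self.validator, self.pricer = validator, pricer

    def place_order(self, order):
        result = self.validator.validate(order)
        if result["status"] == "invalid":
            return {"status": "rejected", "reason": result["reason"]}
        quote = self.pricer.price(order)
        return {"status": "accepted", "total": quote}

validator, pricer = Mock(), Mock()
manager = OrderManager(validator, pricer)

# Domain failure: the Manager must reject without ever calling the pricer.
validator.validate.return_value = {"status": "invalid", "reason": "no stock"}
assert manager.place_order({"sku": "A"}) == {"status": "rejected",
                                             "reason": "no stock"}
pricer.price.assert_not_called()

# Success: assert sequence and inputs, not the Engines' internals.
validator.validate.return_value = {"status": "ok"}
pricer.price.return_value = 42.0
assert manager.place_order({"sku": "A"}) == {"status": "accepted",
                                             "total": 42.0}
pricer.price.assert_called_once_with({"sku": "A"})
```

Note that the assertions are entirely about routing and call order — the mocked Engines are fed states, never exercised.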
Experiences — The EBD Equivalent
Experiences in EBD mirror Managers in VBD at the orchestration layer. An Experience composes Flows, holds accumulated journey state, dispatches to the backend, and advances the journey in response to Flow completion and skip signals. Unit tests assert journey composition logic: which Flows execute, in what order, under what conditions, and how the Experience responds to each possible signal from each Flow. The backend is mocked. Flows are mocked.
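A journey-composition sketch under the same mocking rule; `SignupExperience`, the Flow names, and the signal shapes are all hypothetical.

```python
# Journey-composition sketch: Flows and backend are mocked; the test asserts
# which Flows run, in what order, and what state reaches the backend.
class SignupExperience:
    def __init__(self, flows, dispatch):
        self.flows = flows          # ordered (name, mocked-Flow) pairs
        self.dispatch = dispatch    # mocked backend dispatcher
        self.state, self.order = {}, []

    def run(self):
        for name, flow in self.flows:
            signal = flow(self.state)
            if signal["type"] == "complete":
                self.state.update(signal["state"])
            self.order.append(name)
        self.dispatch(self.state)

sent = []
exp = SignupExperience(
    [("email",   lambda s: {"type": "complete", "state": {"email": "a@b.c"}}),
     ("profile", lambda s: {"type": "complete", "state": {"name": "Ada"}})],
    sent.append)
exp.run()

assert exp.order == ["email", "profile"]
assert sent == [{"email": "a@b.c", "name": "Ada"}]
```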
Resource Accessors — Translation Under Test
Accessors sit at the system’s external boundary. Their job is translation: convert a domain request into an external call, convert the response back. Whether the external system is reachable, whether it is correctly provisioned, whether it performs within acceptable bounds — none of these are Accessor concerns. They belong to system testing and deployment verification.
Unit tests for an Accessor verify the translation. Mock the data source driver, control what it returns, assert that the Accessor’s output matches the expected domain representation. The Accessor has no business connecting to a real database in a unit or integration test. Its correctness is about the translation. The infrastructure’s correctness is about the infrastructure.
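A translation-only sketch; `CustomerAccessor`, the query string, and the row shape are invented for illustration — the point is that the mocked driver is the test's entire external world.

```python
from unittest.mock import Mock

# Hypothetical Accessor: translates a domain query into a driver call and
# the raw row back into a domain shape. The driver is mocked -- no database.
class CustomerAccessor:
    def __init__(self, driver):
        self.driver = driver

    def get_customer(self, customer_id):
        row = self.driver.query(
            "SELECT * FROM customers WHERE id = ?", (customer_id,))
        if row is None:
            return None  # contract: absent customer maps to None, not an error
        return {"id": row["id"], "name": f'{row["first"]} {row["last"]}'}

driver = Mock()
driver.query.return_value = {"id": 7, "first": "Ada", "last": "Lovelace"}

accessor = CustomerAccessor(driver)
assert accessor.get_customer(7) == {"id": 7, "name": "Ada Lovelace"}

driver.query.return_value = None
assert accessor.get_customer(8) is None  # translation of the empty case
```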
Interactions and Utilities — Narrow and Fast
Interactions are atomic. They render, receive user input, and emit events. They carry no flow logic, make no backend calls, and have no awareness of adjacent components. Render in a harness, simulate the input event, assert what was emitted. Props and callbacks are the entire interface — no mocks needed.
Utilities are simpler still: inputs in, outputs out, no side effects. Given input X, assert output Y. The only exception is a Utility wrapping an external sink — a log transport, a telemetry exporter — where the sink gets mocked. Everything else is pure function territory.
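Both profiles fit in a few lines; `format_currency` and `QuantityInput` are hypothetical examples of a pure Utility and an atomic Interaction respectively.

```python
def format_currency(cents: int) -> str:
    """Pure Utility: inputs in, outputs out, no side effects."""
    return f"${cents // 100}.{cents % 100:02d}"

class QuantityInput:
    """Interaction sketch: receives user input, emits an event. Props and
    callbacks are the entire interface -- no mocks needed."""
    def __init__(self, on_change):
        self.on_change = on_change

    def simulate_user_input(self, value: str):
        self.on_change({"type": "quantity_changed", "value": int(value)})

# Utility test: given input X, assert output Y.
assert format_currency(1999) == "$19.99"

# Interaction test: simulate the input event, assert what was emitted.
events = []
QuantityInput(events.append).simulate_user_input("3")
assert events == [{"type": "quantity_changed", "value": 3}]
```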
Test Profiles at a Glance
Each role's test level, what it validates, and what gets mocked:
- Engine: unit tests; validates business rules, policy variants, and failure modes; mocks the Resource Accessor only
- Flow: unit tests; validates completion, skip, and conditional paths; mocks the backend at the Experience boundary
- Manager: unit tests; validates orchestration sequence and routing decisions; mocks all Engines and Accessors
- Experience: unit tests; validates journey composition and signal handling; mocks Flows and the backend
- Resource Accessor: unit tests; validates translation to and from the external representation; mocks the data source driver
- Interaction: unit tests; validates rendering and emitted events; mocks nothing
- Utility: unit tests; validates pure input-to-output behavior; mocks only a wrapped external sink
The Integration Seams
Integration tests verify the seams between roles — not individual components in isolation. Everything is still mocked at the external boundary. Real external systems do not enter until E2E. The distinction from unit tests is scope, not realism: a unit test exercises one component against mocked dependencies; an integration test exercises the collaboration between two components against mocked dependencies at the outer edge.
Manager → Engine
Does the Manager invoke the Engine with the correct inputs? Does it handle every state the Engine’s contract can emit — success, domain failure, unexpected error — and route accordingly? The Engine is mocked. Feed it controlled responses representing each state it might return. What you are testing is the Manager’s response to each, not the Engine’s behavior.
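A seam sketch in which the mocked Engine is fed each state its contract can emit; `ShippingManager` and the contract states are hypothetical.

```python
from unittest.mock import Mock

# Seam sketch: feed the mocked Engine every contract state and assert only
# the Manager's routing in response to each.
class ShippingManager:
    def __init__(self, quote_engine):
        self.quote_engine = quote_engine

    def quote(self, parcel):
        try:
            result = self.quote_engine.quote(parcel)
        except Exception:
            return {"status": "error"}        # unexpected-error route
        if result["status"] == "no_carrier":
            return {"status": "unavailable"}  # domain-failure route
        return {"status": "ok", "price": result["price"]}

engine = Mock()
manager = ShippingManager(engine)

engine.quote.return_value = {"status": "ok", "price": 9.5}
assert manager.quote({})["status"] == "ok"            # success state

engine.quote.return_value = {"status": "no_carrier"}
assert manager.quote({})["status"] == "unavailable"   # domain failure

engine.quote.side_effect = RuntimeError("boom")
assert manager.quote({})["status"] == "error"         # unexpected error
```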
Engine → Resource Accessor
Does the Engine correctly use the Accessor’s contract? Does it handle all return states — including partial results, empty sets, and infrastructure errors? The Accessor is mocked. No database is involved. You are testing whether the Engine correctly interprets the Accessor’s interface and handles everything the contract allows.
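The same discipline from the Engine's side, as a sketch; `RecommendationEngine` and the Accessor contract (list, empty list, `ConnectionError`) are illustrative assumptions.

```python
from unittest.mock import Mock

# Seam sketch: the Accessor is mocked; the test checks that the Engine
# interprets every return state the contract allows.
class RecommendationEngine:
    def __init__(self, history_accessor):
        self.history = history_accessor

    def top_picks(self, user_id, limit=3):
        try:
            orders = self.history.recent_orders(user_id)
        except ConnectionError:
            return []          # infrastructure error -> safe default
        if not orders:
            return []          # empty set is a valid contract state
        return [o["sku"] for o in orders][:limit]

accessor = Mock()
engine = RecommendationEngine(accessor)

accessor.recent_orders.return_value = [{"sku": "A"}, {"sku": "B"}]
assert engine.top_picks("u1") == ["A", "B"]       # normal result

accessor.recent_orders.return_value = []
assert engine.top_picks("u1") == []               # empty set handled

accessor.recent_orders.side_effect = ConnectionError
assert engine.top_picks("u1") == []               # degraded, not crashed
```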
Manager → Resource Accessor
Managers sometimes interact with Accessors directly — for reads that inform orchestration decisions, or for state persistence the Manager owns. Test these paths the same way: mock the Accessor, exercise the Manager’s handling of every response state the Accessor’s contract defines.
Experience → Flow (EBD)
Does the Experience pass correct shared state to each Flow? Does it handle Flow completion events and skip signals correctly? Does it advance the journey as designed when Flows complete in sequence, complete early, or signal that conditions for execution were not met? The backend is mocked. Flows are mocked. You are verifying journey composition — not backend behavior or Flow-level logic.
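Unlike the unit-level sketches above, a seam test here pairs a real Flow with a real Experience and mocks only the backend dispatcher at the outer edge. `ConsentFlow`, `OnboardingExperience`, and the signal shapes are hypothetical.

```python
# Seam sketch: real Experience drives a real Flow; only the backend
# dispatcher at the outer edge is mocked.
class ConsentFlow:
    def __init__(self, shared_state, emit):
        self.state, self.emit = dict(shared_state), emit

    def start(self):
        if self.state.get("consent_given"):           # condition not met:
            self.emit({"type": "flow_skipped"})       # signal a skip
        else:
            self.state["consent_given"] = True
            self.emit({"type": "flow_complete", "state": self.state})

class OnboardingExperience:
    def __init__(self, dispatch):
        self.dispatch, self.state, self.advanced = dispatch, {}, []

    def run_flow(self, name, flow_cls):
        self._current = name
        flow_cls(self.state, self._on_signal).start()

    def _on_signal(self, signal):
        if signal["type"] == "flow_complete":
            self.state.update(signal["state"])        # accumulate state
        self.advanced.append((self._current, signal["type"]))

    def finish(self):
        self.dispatch(self.state)                     # mocked backend call

calls = []
exp = OnboardingExperience(calls.append)
exp.run_flow("consent", ConsentFlow)
exp.run_flow("consent_again", ConsentFlow)  # condition already satisfied
exp.finish()

assert exp.advanced == [("consent", "flow_complete"),
                        ("consent_again", "flow_skipped")]
assert calls == [{"consent_given": True}]
```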
Mock Placement Is Architectural Evidence
Where you place mocks tells you where your boundaries are. Where you are forced to place mocks tells you where your boundaries should be.
The rule: mock at the role boundary, not inside the role. Each component role has one natural mock point — the interface at which it hands off to the next tier. Mock that interface and nothing else.
When a unit test requires mocking more than the single boundary below the component under test, something is wrong. Either the component has absorbed responsibilities that belong at a different tier, or its dependencies are implicit rather than injected, or a Resource Accessor is missing and the component is reaching directly into infrastructure it should not see. Mock proliferation is always a structural signal — not a testing problem, and not a problem that better mocking frameworks solve.
The inverse is also worth examining. An Engine unit test that requires no mocks at all is either genuinely pure-computation — which is fine — or is only exercising the easy path through logic that silently delegates to collaborators the test never reaches. Coverage numbers tell you how many lines ran. They do not tell you whether the logic that matters was actually exercised.
Scenarios Validate Architecture and Tests Simultaneously
VBD and EBD both use core scenarios as architectural validation mechanisms. A core use case in VBD should be traceable through the component hierarchy without bypassing communication rules. A core user journey in EBD should trace through Experience → Flow → Interaction without boundary leakage. If a scenario requires an Engine to call another Engine, or a Flow to call the backend directly, the boundaries need adjustment — not the test.
These same scenarios are the test scenarios that matter most. Scenarios that validate structural boundaries naturally exercise the most load-bearing code paths, the most significant collaborations, and the most complete representations of what the system is actually for. The same scenario, traced at unit scope, integration scope, and E2E scope, asks three different questions and produces three different kinds of confidence. Together they cover the full surface of what the system must do.