Boundary-Driven Testing

Testing difficulty is architectural evidence. When a component cannot be exercised in isolation, the problem is not the tests — it is the structure.


The Problem

The conventional framing of testing as a discipline separate from design produces a particular kind of pain. Teams adopt frameworks, mandate coverage minimums, and write guidelines about what to test. The tests improve. The pain persists. A change to a business rule breaks seventeen tests, most of which are not about business rules. An integration test requires spinning up four services to assert one value. An end-to-end test passes in isolation and fails in CI for reasons nobody can reproduce. More coverage, more pain.

The problem is structural, not methodological. The tests are not wrong. The structure the tests are attempting to exercise is wrong. Components have absorbed responsibilities they should not have. Dependencies are hidden rather than declared. Boundaries have been drawn in the wrong place, or not drawn at all. No amount of better tooling, higher coverage requirements, or more disciplined test writing fixes a structural problem. Fix the structure; the tests follow.

The Core Insight

The test spiral — unit, integration, end-to-end, system, user acceptance — is not a testing methodology. It is an architectural map. Each ring of the spiral corresponds to a level of architectural scope, and that scope is determined entirely by where boundaries have been placed.

Get the boundaries right and the spiral populates itself: each role has clear targets, predictable scope, and low maintenance overhead. Get them wrong and the spiral collapses — unit tests become integration tests in disguise, E2E tests become the only reliable safety net, and the entire suite grows expensive while providing diminishing confidence.

In a system with no meaningful component boundaries, unit scope and system scope are the same thing. There is nothing below the full system that can be isolated. The spiral collapses into E2E by default, because E2E is the only level at which anything coherent can be exercised.

Testing Is Not Separate from Design

A component designed around a coherent responsibility, with explicit inputs, explicit outputs, and dependencies passed rather than acquired, is inherently testable. No additional effort is required to make it so. The same structural choices that allow the component to change without cascading effects allow it to be tested without elaborate setup.

The inverse is equally true. A component that cannot be unit-tested without mocking half the system is not badly tested — it is badly structured. The test difficulty is diagnostic. It reveals that the component has absorbed responsibilities it should not have, or that its dependencies are implicit rather than declared, or that the boundary between it and its collaborators has been drawn in the wrong place.

This is why the structural models defined in VBD and EBD are also testing models. The same role taxonomy that makes components replaceable makes them mockable. The same communication rules that prevent coupling prevent test contamination. The same line that isolates change isolates test scope.


Test Profiles by Role

Each component role in VBD and EBD has a characteristic test profile — not assigned arbitrarily, but derived from its structural position, responsibilities, and communication rules. Every role carries unit tests. The weight and character of those tests differ by role. Integration tests arise at the seams between roles.

Engines — The Unit Test Core

Engines are the most logic-dense role and the natural home of the unit test suite. An Engine encapsulates business rules: given inputs, apply policy, produce a result. It has no workflow awareness, no sibling Engine dependencies, and no reason to reach outward beyond a Resource Accessor — which it receives through an explicit, mockable interface.

This structural position is what makes Engines straightforwardly testable. Mock the Accessor, supply controlled inputs, assert on the output. The Engine’s communication constraints ensure there is nothing else to mock. The test scope is exactly the Engine and nothing more. Every business rule, every policy variant, every edge case and failure mode belongs here — and each one is fast, isolated, and cheap to run.

  • Mock the Resource Accessor only — nothing else should be reachable
  • Test every business rule, policy variant, and edge case
  • Test every failure mode the Engine can produce
  • Engine unit tests are the densest and most numerous in the suite
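The pattern above can be sketched in a few lines. This is a hypothetical example — the `DiscountEngine`, its Accessor interface, and the discount rule are all illustrative, not taken from the text — but it shows the shape: the Accessor arrives through the constructor, so the mock is one object literal.

```typescript
// Hypothetical Engine with an injected, mockable Accessor (names are illustrative).
interface PricingAccessor {
  getBasePrice(sku: string): number;
}

class DiscountEngine {
  // The Accessor is passed in, not acquired — the only thing a test must mock.
  constructor(private accessor: PricingAccessor) {}

  priceFor(sku: string, loyaltyYears: number): number {
    const base = this.accessor.getBasePrice(sku);
    // Business rule under test: 5% off per loyalty year, capped at 20%.
    const discount = Math.min(loyaltyYears * 0.05, 0.2);
    return base * (1 - discount);
  }
}

// Unit test: mock the Resource Accessor only — nothing else is reachable.
const mockAccessor: PricingAccessor = { getBasePrice: () => 100 };
const engine = new DiscountEngine(mockAccessor);

const twoYears = engine.priceFor("SKU-1", 2);   // standard discount path
const tenYears = engine.priceFor("SKU-1", 10);  // cap edge case
```

Every policy variant gets a one-line test like the two calls above: controlled input, asserted output, no setup beyond the single mock.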

Flows — The EBD Equivalent

Flows in EBD occupy the same structural position as Engines in VBD. A Flow receives shared state from the Experience, steps through Interactions, and emits a completion event carrying accumulated state upward. Its rule against calling sibling Flows and its prohibition on direct backend calls keep the unit scope tight.

Mock the backend call at the Experience boundary. Simulate Interaction events through a test harness. Assert on what the Flow emits at completion, what it emits on skip, and how it handles each conditional path through its Interaction sequence. The Flow’s behavior is fully exercisable without a running backend or a real browser environment.
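A minimal harness might look like the following. The `AddressFlow`, its state shape, and its completion condition are hypothetical — the point is that Interaction events are simulated as plain method calls and the assertion targets only what the Flow emits.

```typescript
// Hypothetical Flow: event names and state shape are illustrative, not from the text.
type FlowEvent = { type: "completed" | "skipped"; state: Record<string, unknown> };

class AddressFlow {
  private state: Record<string, unknown>;
  private emitted: FlowEvent[] = [];

  // Shared state arrives from the Experience; no backend, no browser.
  constructor(sharedState: Record<string, unknown>) {
    this.state = { ...sharedState };
  }

  // The harness drives the Flow by simulating Interaction events.
  onInteraction(field: string, value: unknown): void {
    this.state[field] = value;
    if (this.state["street"] && this.state["city"]) {
      // Completion event carries accumulated state upward.
      this.emitted.push({ type: "completed", state: this.state });
    }
  }

  events(): FlowEvent[] { return this.emitted; }
}

// Harness: feed Interaction events, assert only on what the Flow emits.
const flow = new AddressFlow({ userId: "u-1" });
flow.onInteraction("street", "1 Main St");
flow.onInteraction("city", "Springfield");

const completion = flow.events()[0];
```

The skip path and each conditional route through the Interaction sequence get the same treatment: drive events in, assert the emitted signal.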

Managers — Orchestration Under Test

Managers carry unit tests, but the character of those tests differs from Engine tests. Engine tests assert business rules — the bulk of the domain logic lives there and demands exhaustive coverage. Manager tests assert orchestration: given this response from an Engine, did the Manager route correctly? Given a domain failure, did it handle and respond appropriately? Given multiple Engines in sequence, does it compose them in the right order with the right inputs?

Manager unit tests are comparatively fewer than Engine tests because the Manager has less behavior to assert directly. It does not compute — it coordinates. What it does assert is consequential: every state that can arrive from its collaborators, and every routing decision that follows. Mock all collaborators. Feed controlled responses representing every state each contract can produce. Verify the Manager’s decisions, not its collaborators’ behavior.

  • Mock all Engines and Resource Accessors
  • Assert the orchestration sequence — which components are called, in what order, with what inputs
  • Assert every routing decision: success paths, domain failures, unexpected errors
  • Fewer tests than an Engine — not because rigor is lower, but because the behavior surface is narrower

Experiences — The EBD Equivalent

Experiences in EBD mirror Managers in VBD at the orchestration layer. An Experience composes Flows, holds accumulated journey state, dispatches to the backend, and advances the journey in response to Flow completion and skip signals. Unit tests assert journey composition logic: which Flows execute, in what order, under what conditions, and how the Experience responds to each possible signal from each Flow. The backend is mocked. Flows are mocked.
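The same shape applies at the Experience tier. In this illustrative sketch (the `CheckoutExperience`, its Flow names, and the signal vocabulary are assumptions, not from the text), each mocked Flow emits a controlled signal and the test asserts journey composition.

```typescript
// Hypothetical Experience composing mocked Flows (all names are illustrative).
type Signal = "completed" | "skipped";
interface Flow { run(state: object): Signal; }

class CheckoutExperience {
  readonly executed: string[] = [];
  constructor(private flows: Record<string, Flow>) {}

  run(): string {
    const state = {};
    for (const [name, flow] of Object.entries(this.flows)) {
      const signal = flow.run(state);
      // Composition logic under test: record completions, advance past skips.
      if (signal === "completed") this.executed.push(name);
    }
    return "journey-complete";
  }
}

// Mock every Flow: control which signal each emits, assert the journey shape.
const exp = new CheckoutExperience({
  address: { run: () => "completed" },
  giftWrap: { run: () => "skipped" },   // conditions not met — journey still advances
  payment: { run: () => "completed" },
});
const journeyResult = exp.run();
```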

Resource Accessors — Translation Under Test

Accessors sit at the system’s external boundary. Their job is translation: convert a domain request into an external call, convert the response back. Whether the external system is reachable, whether it is correctly provisioned, whether it performs within acceptable bounds — none of these are Accessor concerns. They belong to system testing and deployment verification.

Unit tests for an Accessor verify the translation. Mock the data source driver, control what it returns, assert that the Accessor’s output matches the expected domain representation. The Accessor has no business connecting to a real database in a unit or integration test. Its correctness is about the translation. The infrastructure’s correctness is about the infrastructure.
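In code, the translation test reduces to a mapping assertion. The driver interface, row shape, and domain type below are hypothetical stand-ins:

```typescript
// Hypothetical Accessor translating driver rows into domain objects.
interface DbDriver { query(sql: string): { id: number; full_name: string }[]; }

type User = { id: number; name: string };

class UserAccessor {
  constructor(private driver: DbDriver) {}

  // Translation only: external row shape -> domain representation.
  findAll(): User[] {
    return this.driver
      .query("SELECT id, full_name FROM users")
      .map((row) => ({ id: row.id, name: row.full_name }));
  }
}

// Mock the data source driver — no live database is involved.
const driverMock: DbDriver = {
  query: () => [{ id: 1, full_name: "Ada Lovelace" }],
};
const users = new UserAccessor(driverMock).findAll();
```

The assertion is on the shape of `users`, not on any connection: the infrastructure's correctness is verified elsewhere.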

Interactions and Utilities — Narrow and Fast

Interactions are atomic. They render, receive user input, and emit events. They carry no flow logic, make no backend calls, and have no awareness of adjacent components. Render in a harness, simulate the input event, assert what was emitted. Props and callbacks are the entire interface — no mocks needed.
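Stripped of any UI framework, the test collapses to props in, events out. This sketch assumes nothing beyond the text's description — the `TextInputInteraction` and its callback are illustrative:

```typescript
// Hypothetical Interaction: props and callbacks are the entire interface.
type Props = { label: string; onSubmit: (value: string) => void };

class TextInputInteraction {
  constructor(private props: Props) {}

  // Simulates the user's input event; the Interaction only emits.
  userTypesAndSubmits(value: string): void {
    this.props.onSubmit(value);
  }
}

// No mocks needed — the callback itself captures what was emitted.
const emitted: string[] = [];
const interaction = new TextInputInteraction({
  label: "Email",
  onSubmit: (v) => emitted.push(v),
});
interaction.userTypesAndSubmits("a@example.com");
```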

Utilities are simpler still: inputs in, outputs out, no side effects. Given input X, assert output Y. The only exception is a Utility wrapping an external sink — a log transport, a telemetry exporter — where the sink gets mocked. Everything else is pure function territory.
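Both cases fit in a few lines. The formatter and the log-transport wrapper below are illustrative examples of the two Utility kinds the text describes:

```typescript
// Pure Utility: inputs in, outputs out, no side effects.
function formatCurrency(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

// The one exception: a Utility wrapping an external sink, where the sink is mocked.
interface LogSink { write(line: string): void; }

function logEvent(sink: LogSink, event: string): void {
  sink.write(`[event] ${event}`);
}

// Pure function: given input X, assert output Y — no setup at all.
const price = formatCurrency(1999);

// Sink wrapper: mock the sink, assert what was written to it.
const lines: string[] = [];
const sinkMock: LogSink = { write: (l) => lines.push(l) };
logEvent(sinkMock, "user-created");
```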


Test Profiles at a Glance

Each component role's test level, what it validates, and what gets mocked.

Engine (VBD)
Test Level: Unit + Integration
Validates: Business rules, policies, every variant and edge case; correct use of the Accessor contract
Mock: Resource Accessor — at both levels
The densest test surface in the system. Unit tests cover every business rule, every policy variant, every failure mode — the Accessor is mocked, scope is exactly the Engine. Integration tests verify the Engine → Resource Accessor seam: does the Engine call the Accessor with correct inputs? Does it handle every state the Accessor's contract can return? The Accessor is still mocked — you are feeding it controlled responses. No database, no infrastructure. The integration test verifies the contract at the seam, not the components on either side of it.

Manager (VBD)
Test Level: Unit + Integration
Validates: Orchestration sequence, routing decisions, response handling at each seam
Mock: All Engines and Accessors — at both levels
Manager unit tests are fewer than Engine tests — not because rigor is lower, but because the behavior surface is narrower. The Manager does not compute; it coordinates. Unit tests assert orchestration decisions: which Engine was called, with what inputs, and how every response state was routed. Integration tests focus on the seam contracts: does the Manager call each Engine with the correct inputs? Does it correctly handle every state those mocked Engines can return? Engines are still mocked — you are feeding them controlled responses, not running them. The integration test verifies the wiring, not the components on either end of it.

Resource Accessor (VBD)
Test Level: Unit
Validates: Translation — domain request → external call shape, response → domain object
Mock: Data source driver only
The Accessor's correctness is about the translation, not the infrastructure. Whether the external system is reachable is a system-level fact, not a test target. Mock the driver, control what it returns, assert the mapping. No live database needed.

Flow (EBD)
Test Level: Unit
Validates: Progression through Interactions, completion events, conditional paths
Mock: Backend call at Experience boundary
Same principle as Engine. Simulate Interaction events through a test harness, assert on what the Flow emits at completion and on skip. The rule against sibling Flow calls keeps the scope tight — the Flow is fully exercisable without a running backend or real browser environment.

Experience (EBD)
Test Level: Unit + Integration
Validates: Journey composition, Flow sequencing, completion and skip handling
Mock: Backend + Flows — at both levels
The EBD equivalent of the Manager. Unit tests assert journey composition: which Flows execute, in what order, and how the Experience handles every signal each Flow can emit. Integration tests verify the seam contracts: does the Experience pass correct state to each Flow, and handle every signal those mocked Flows can return? Flows are still mocked. The backend is always mocked. You are verifying the wiring of the journey, not what happens inside each Flow.

Interaction (EBD)
Test Level: Unit
Validates: Renders correctly, accepts input, emits the right events
Mock: Nothing — props and callbacks are the entire interface
Atomic and indivisible. Render in a harness, simulate the input event, assert what was emitted. No flow logic, no backend calls, no awareness of adjacent components. The test surface is as narrow as the component.

Utility (VBD and EBD)
Test Level: Unit
Validates: Pure inputs and outputs, no side effects
Mock: External sink only, if applicable
Given input X, assert output Y. No domain knowledge, no workflow context — pure function territory. The only exception is a Utility wrapping an external sink (a log transport, a telemetry exporter), where the sink gets mocked. Everything else requires no setup at all.

The Integration Seams

Integration tests verify the seams between roles — not individual components in isolation. Everything is still mocked at the external boundary. Real external systems do not enter until E2E. The distinction from unit tests is scope, not realism: a unit test exercises one component against mocked dependencies; an integration test exercises the collaboration between two components against mocked dependencies at the outer edge.

Manager → Engine

Does the Manager invoke the Engine with the correct inputs? Does it handle every state the Engine’s contract can emit — success, domain failure, unexpected error — and route accordingly? The Engine is mocked. Feed it controlled responses representing each state it might return. What you are testing is the Manager’s response to each, not the Engine’s behavior.

Engine → Resource Accessor

Does the Engine correctly use the Accessor’s contract? Does it handle all return states — including partial results, empty sets, and infrastructure errors? The Accessor is mocked. No database is involved. You are testing whether the Engine correctly interprets the Accessor’s interface and handles everything the contract allows.
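One seam test per contract state is the concrete form of this. In the illustrative sketch below (the `AverageEngine`, the `StatsAccessor`, and its three-state contract are assumptions), each mocked Accessor feeds one state the contract allows, and the assertions check the Engine's interpretation of each:

```typescript
// Hypothetical seam test: drive the Engine with every state the contract defines.
type AccessorResult =
  | { kind: "rows"; rows: number[] }
  | { kind: "empty" }
  | { kind: "error"; message: string };

interface StatsAccessor { fetch(): AccessorResult; }

class AverageEngine {
  constructor(private accessor: StatsAccessor) {}

  average(): number | null {
    const result = this.accessor.fetch();
    switch (result.kind) {
      case "rows":
        return result.rows.reduce((a, b) => a + b, 0) / result.rows.length;
      case "empty":
        return null;                                  // empty set handled explicitly
      case "error":
        throw new Error(`upstream: ${result.message}`); // infrastructure error surfaced
    }
  }
}

// One mocked Accessor per contract state — no database involved at any point.
const withRows = new AverageEngine({ fetch: () => ({ kind: "rows", rows: [2, 4] }) });
const withEmpty = new AverageEngine({ fetch: () => ({ kind: "empty" }) });
const withError = new AverageEngine({ fetch: () => ({ kind: "error", message: "down" }) });
```

If the contract gains a state — a partial result, say — the seam test suite gains exactly one mocked response and one assertion.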

Manager → Resource Accessor

Managers sometimes interact with Accessors directly — for reads that inform orchestration decisions, or for state persistence the Manager owns. Test these paths the same way: mock the Accessor, exercise the Manager’s handling of every response state the Accessor’s contract defines.

Experience → Flow (EBD)

Does the Experience pass correct shared state to each Flow? Does it handle Flow completion events and skip signals correctly? Does it advance the journey as designed when Flows complete in sequence, complete early, or signal that conditions for execution were not met? The backend is mocked. Flows are mocked. You are verifying journey composition — not backend behavior or Flow-level logic.


Mock Placement Is Architectural Evidence

Where you place mocks tells you where your boundaries are. Where you are forced to place mocks tells you where your boundaries should be.

The rule: mock at the role boundary, not inside the role. Each component role has one natural mock point — the interface at which it hands off to the next tier. Mock that interface and nothing else.

When a unit test requires mocking more than the single boundary below the component under test, something is wrong. Either the component has absorbed responsibilities that belong at a different tier, or its dependencies are implicit rather than injected, or a Resource Accessor is missing and the component is reaching directly into infrastructure it should not see. Mock proliferation is always a structural signal — not a testing problem, and not a problem that better mocking frameworks solve.

The inverse is also worth examining. An Engine unit test that requires no mocks at all is either genuinely pure-computation — which is fine — or is only exercising the easy path through logic that silently delegates to collaborators the test never reaches. Coverage numbers tell you how many lines ran. They do not tell you whether the logic that matters was actually exercised.

Scenarios Validate Architecture and Tests Simultaneously

VBD and EBD both use core scenarios as architectural validation mechanisms. A core use case in VBD should be traceable through the component hierarchy without bypassing communication rules. A core user journey in EBD should trace through Experience → Flow → Interaction without boundary leakage. If a scenario requires an Engine to call another Engine, or a Flow to call the backend directly, the boundaries need adjustment — not the test.

These same scenarios are the test scenarios that matter most. Scenarios that validate structural boundaries naturally exercise the most load-bearing code paths, the most significant collaborations, and the most complete representations of what the system is actually for. The same scenario, traced at unit scope, integration scope, and E2E scope, asks three different questions and produces three different kinds of confidence. Together they cover the full surface of what the system must do.
