Error Handling and Self-Healing Flows
Errors in a Tiny Systems flow are just messages. They travel through ports you control. You pick which components expose error ports and where you wire them, and that decides where the flow recovers and where it gives up.
That's the whole resilience model. No separate retry layer, no execution-state CRD, no "suspend on error" primitive. Ports and edges, like the rest of the flow.
How errors travel through a chain
Three nodes wired together. The middle one does no error handling. The last one fails.
+----------+     +--------------+     +----------+
|  Node A  | --> |    Node B    | --> |  Node C  |
|          |     | (no error    |     |          |
|          |     |  port wired) |     | (errors  |
|          |     |              |     |  out)    |
+----------+     +--------------+     +----------+

What happens:
- Node C's handler fails and returns a failure value.
- Node B receives that failure as the result of its outgoing call. B has no error port wired, so it passes the failure up to its caller. No special code needed; that's how every component behaves by default.
- Node A receives the failure as the result of its call to B.
No error port has fired. The error travelled invisibly up the chain via return values.
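The three steps above can be modelled in a few lines of Python. This is a toy sketch, not the Tiny Systems SDK: `Result`, `Handler`, and `make_passthrough` are hypothetical names, and a "node" here is just a function that returns its downstream call's result.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Outcome of handling one message: success, or a failure description."""
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


# Hypothetical handler signature: take a message, return a Result.
Handler = Callable[[dict], Result]


def node_c(msg: dict) -> Result:
    # Node C's handler fails and returns a failure value.
    return Result(error="node C: upstream API unavailable")


def make_passthrough(name: str, downstream: Handler, trace: list) -> Handler:
    """Build a node with no error port wired. It forwards the message and
    returns whatever the downstream call returned, failure included."""
    def handle(msg: dict) -> Result:
        result = downstream(msg)
        trace.append((name, result.ok))  # record what this node saw
        return result
    return handle


trace: list = []
node_b = make_passthrough("B", node_c, trace)
node_a = make_passthrough("A", node_b, trace)

result = node_a({"order_id": 42})
```

In this model, `result` at the top of the chain carries Node C's failure unchanged, and `trace` shows that B saw it before A did. Nothing was emitted anywhere; the failure only moved through return values.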
Enabling an error port catches the failure
Now wire Node A's error port to a recovery flow. Could be a log, an alert, a retry node, whatever fits.
                                              +--------+
+----------+     +----------+     +----+      | Failure|
|  Node A  | --> |  Node B  | --> | C  |      | Handler|
|          |     |          |     +----+      |        |
|  error o-+--------------------------------> |        |
+----------+     +----------+                 +--------+

With Enable Error Port turned on for Node A, A's component logic checks the result of every call it makes. If a downstream call returns a failure, A emits a message on its error port carrying that failure. The error now flows down a second path on the canvas, the recovery path, instead of bubbling further up.
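A minimal sketch of that check, again as a toy Python model rather than real component code. `with_error_port` and `failure_handler` are hypothetical names. The sketch assumes that once the failure has been emitted on the error port, A's caller sees the outcome of the recovery path rather than the raw failure.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Outcome of handling one message: success, or a failure description."""
    error: Optional[str] = None


# Hypothetical handler signature: take a message, return a Result.
Handler = Callable[[dict], Result]


def with_error_port(downstream: Handler, error_port: Handler) -> Handler:
    """Model a node whose error port is enabled and wired to `error_port`."""
    def handle(msg: dict) -> Result:
        result = downstream(msg)
        if result.error is not None:
            # Emit the failure on the error port: the recovery path.
            # Assumption: the caller then sees the recovery path's outcome.
            return error_port({"error": result.error, "original": msg})
        return result
    return handle


handled: list = []


def failure_handler(msg: dict) -> Result:
    # Stand-in for a log, an alert, or any other recovery flow.
    handled.append(msg)
    return Result()


def node_c(msg: dict) -> Result:
    return Result(error="node C failed")


def node_b(msg: dict) -> Result:
    # No error port wired: B passes C's failure up unchanged.
    return node_c(msg)


node_a = with_error_port(node_b, failure_handler)
result = node_a({"order_id": 42})
```

Here the failure from C reaches the failure handler together with the original message, and the caller of A gets a success result.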
Error ports are try/catch boundaries on the canvas
The same pattern scales up. Wherever you enable an error port in a chain, you're drawing a "catch" line on the graph.
- Everything between two enabled error ports behaves as one unit for error handling. A failure anywhere inside bubbles up and out of the nearest enabled error port upstream.
- A chain with no error ports enabled propagates failure all the way to the top-level trigger. For example, an HTTP-triggered flow fails the request, and a scheduled signal logs the error and gives up.
- A chain with error ports on every node treats each component as its own recovery boundary. More wiring on the canvas, but more granular.
You pick the granularity. You can tighten it later by enabling more error ports as the flow matures.
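To see the "nearest boundary wins" rule in isolation, here is a toy Python model of a four-node chain with error ports enabled on A and B only. The `node` factory and the convention that a handler returns `None` on success or an error string on failure are assumptions made for this sketch.

```python
from typing import Callable, Optional

# Toy convention: a handler returns None on success, an error string on failure.
Handler = Callable[[dict], Optional[str]]


def node(name: str,
         downstream: Optional[Handler] = None,
         fails: bool = False,
         error_port: Optional[Handler] = None) -> Handler:
    """Build a node. If `error_port` is given, the node acts as a catch boundary."""
    def handle(msg: dict) -> Optional[str]:
        if fails:
            return f"{name} failed"
        err = downstream(msg) if downstream else None
        if err is not None and error_port is not None:
            # Boundary: route the failure down the recovery path.
            return error_port({"error": err, "caught_by": name})
        return err  # no boundary here: pass the outcome up as-is
    return handle


caught: list = []


def recorder(msg: dict) -> Optional[str]:
    # Stand-in recovery flow that records which boundary fired.
    caught.append((msg["caught_by"], msg["error"]))
    return None


# A -> B -> C -> D, where D fails and only A and B have error ports enabled.
d = node("D", fails=True)
c = node("C", downstream=d)
b = node("B", downstream=c, error_port=recorder)
a = node("A", downstream=b, error_port=recorder)

outcome = a({})
```

The failure from D passes through C untouched and is caught at B, the nearest boundary. A's error port never fires, because from A's point of view the call to B succeeded.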
When to enable an error port
Turn Enable Error Port on when:
- The work has external side effects (sending an email, writing to a DB, calling a paid API) and you want to handle failure visibly: write to a dead-letter, alert a human, fall back somewhere.
- You want the upstream caller to see "success" while a failure case routes to a different path. Without an error port, the upstream caller sees the raw failure.
- There's a sensible recovery action you can express as another flow segment: re-queue, send to manual review, fall back to a different API.
Leave it off when:
- Failure should just propagate up. A pure transform that gets bad input has no business "recovering"; let the failure bubble.
- The upstream is itself an error-handling node that wants to see the raw failure.
Recovery patterns
A few shapes worth knowing.
Dead-letter routing. Wire the error port to a Slack send, an email, or a kv.store under a failed/ prefix. Useful when humans investigate failures later.
Retry with backoff. Wire the error port back to the same node's input through a delay node. Add a counter to the payload so you stop retrying after N attempts.
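A sketch of the retry shape under stated assumptions: the loop stands in for the edge from the error port back to the node's input, `sleep` stands in for the delay node, and the `attempt` field, `MAX_ATTEMPTS`, and the exponential delay are hypothetical choices, not platform defaults. After the last attempt the failure goes to a dead-letter handler.

```python
import time
from typing import Callable, Optional

# Toy convention: a handler returns None on success, an error string on failure.
Handler = Callable[[dict], Optional[str]]

MAX_ATTEMPTS = 3  # hypothetical limit carried alongside the payload counter


def make_retry_loop(work: Handler,
                    dead_letter: Handler,
                    base_delay: float = 0.01,
                    sleep: Callable[[float], None] = time.sleep) -> Handler:
    """Model: error port -> delay node -> back to the same node's input."""
    def handle(msg: dict) -> Optional[str]:
        while True:
            err = work(msg)
            if err is None:
                return None
            attempt = msg.get("attempt", 1)
            if attempt >= MAX_ATTEMPTS:
                # Stop retrying and hand the failure to the dead-letter path.
                return dead_letter({"error": err, "payload": msg})
            # Delay node: wait longer after each failed attempt.
            sleep(base_delay * 2 ** (attempt - 1))
            # Increment the counter in the payload before re-entering the node.
            msg = {**msg, "attempt": attempt + 1}
    return handle


calls: list = []


def flaky(msg: dict) -> Optional[str]:
    # Fails on the first two calls, succeeds on the third.
    calls.append(msg.get("attempt", 1))
    return None if len(calls) >= 3 else "timeout"


dead: list = []
delays: list = []
retrying = make_retry_loop(flaky, lambda m: dead.append(m), sleep=delays.append)
outcome = retrying({"job": "send-email"})
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting. The counter lives in the payload, as the pattern suggests, so the retry limit travels with the message rather than in hidden state.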
Fallback path. Wire the error port to an alternate implementation. A slower-but-more-reliable provider, a cached result, or a degraded response.
Compensating action. When the failure happened mid-multi-step work, wire the error port to a node that undoes whatever the earlier steps did. Delete the half-created resource. Refund the partial charge.
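The compensating-action shape can be sketched the same way. This toy Python model pairs each step with an undo function and runs the undos in reverse order when a later step fails. `run_with_compensation` and the step names are hypothetical; on the canvas, each undo would be a node wired to an error port.

```python
from typing import Callable, List, Optional, Tuple

# Each step is a (do, undo) pair. `do` returns None on success or an error string.
Step = Tuple[Callable[[], Optional[str]], Callable[[], None]]


def run_with_compensation(steps: List[Step]) -> Optional[str]:
    """Run steps in order. On failure, undo the completed steps in reverse."""
    completed: List[Callable[[], None]] = []
    for do, undo in steps:
        err = do()
        if err is not None:
            for compensate in reversed(completed):
                compensate()
            return err
        completed.append(undo)
    return None


log: List[str] = []


def reserve_stock() -> Optional[str]:
    log.append("reserve stock")
    return None


def release_stock() -> None:
    log.append("release stock")


def charge_card() -> Optional[str]:
    log.append("charge card")
    return "card declined"


def refund_card() -> None:
    log.append("refund card")


outcome = run_with_compensation([
    (reserve_stock, release_stock),
    (charge_card, refund_card),
])
```

Only steps that completed are undone. In this run the charge itself failed, so the stock reservation is released and no refund is issued.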
What this means for "is the flow durable"
A flow is as resilient as its error wiring. There's no hidden "make it durable" switch. Durability comes from the topology you draw. Error ports define the recovery edges; without them, failure terminates the run at the nearest unbuffered point.
That's intentional. The framework doesn't try to retry transparently behind your back, because doing so would hide the failure mode (and the cost) from you. The canvas shows exactly what happens on success and exactly what happens on failure.
See also
- Nodes and Edges: how ports and edges work in general
- Flow Basics: the model of node execution and message flow