Scenario: “Create and send invoice to new client in Iceland for 1,000,000 ISK.” That's seven steps: create customer, add bank details, create project, log hours, create invoice, convert currency, send. Step six is where things usually break.
Why stateless AI breaks on multi-step workflows
LLMs are stateless: each call is independent, and the model sees only what fits in its context window. In a long chain of tool calls, earlier turns get truncated, so the model “forgets” the original goal and stops halfway or, worse, repeats steps it already completed.
Our approach: explicit goal state
We added a thread-level goal tracker: a state machine that records the goal, completed steps, pending retries, and blocker resolutions. The AI doesn't have to remember; the system does.
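Here's a minimal sketch of what such a tracker could look like, not our production code. The `GoalState` class, its method names, and the step names `convert_currency` and `send_invoice` are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GoalState:
    """Per-thread record of a multi-step goal (hypothetical shape)."""
    goal: str
    plan: list[str]                                         # ordered step names
    completed: list[str] = field(default_factory=list)
    blocked: dict[str, str] = field(default_factory=dict)   # step -> reason

    def _check(self, step: str) -> None:
        if step not in self.plan:
            raise ValueError(f"unknown step: {step}")

    def mark_completed(self, step: str) -> None:
        self._check(step)
        self.blocked.pop(step, None)       # a completed step is no longer blocked
        if step not in self.completed:     # recording the same step twice is a no-op
            self.completed.append(step)

    def mark_blocked(self, step: str, reason: str) -> None:
        self._check(step)
        self.blocked[step] = reason

    def next_step(self) -> Optional[str]:
        """First planned step that is not completed, or None when the goal is done."""
        for step in self.plan:
            if step not in self.completed:
                return step
        return None


# Usage: the invoice scenario, with assumed names for the last two steps.
state = GoalState(
    goal="create and send invoice",
    plan=["create_customer", "add_bank_details", "create_project", "log_hours",
          "create_invoice", "convert_currency", "send_invoice"],
)
for done in ["create_customer", "add_bank_details", "create_project", "log_hours"]:
    state.mark_completed(done)
state.mark_blocked("create_invoice", "awaiting currency conversion")
print(state.next_step())  # create_invoice
```

Because the state lives outside the model, it survives context truncation: every new call can be told exactly where the thread stands.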
The confirmation complication
Human-in-the-loop confirmation pauses the flow: when a step needs approval, we can't just continue. To resume afterwards, we added an afterSuccess field on blocked operations: “When update_customer succeeds → automatically retry create_invoice with original parameters.”
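One way to model this is sketched below. It assumes the operation waiting for approval carries its own follow-up. `PendingOperation`, `FollowUp`, `approve`, and `run_tool` are hypothetical names, and `after_success` is simply a Python-style spelling of afterSuccess.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FollowUp:
    tool: str                  # operation to retry, e.g. "create_invoice"
    params: dict[str, Any]     # its original parameters, saved at block time


@dataclass
class PendingOperation:
    tool: str                  # operation awaiting human approval
    params: dict[str, Any]
    after_success: Optional[FollowUp] = None


def approve(op: PendingOperation,
            run_tool: Callable[[str, dict[str, Any]], Any]) -> list[Any]:
    """Run an approved operation, then its follow-up; return results in order.

    Assumes run_tool raises on failure, so the follow-up only runs
    after the approved operation has succeeded.
    """
    results = [run_tool(op.tool, op.params)]
    if op.after_success is not None:
        follow_up = op.after_success
        results.append(run_tool(follow_up.tool, follow_up.params))
    return results
```

The retry never goes back through the model, so the original parameters can't drift between the first attempt and the second.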
What we inject into the AI's context
We put explicit instructions in the system prompt: “Current goal: create and send invoice. Completed: create_customer, add_bank_details, create_project, log_hours. Pending: create_invoice (blocked, awaiting currency conversion). Do not repeat completed steps.” This reduced “lost goal” failures from 40% to under 3%.
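Rendering that block from stored state is straightforward. The helper below is an illustrative sketch. `render_goal_context` and its argument shapes are assumptions, not our actual prompt builder.

```python
def render_goal_context(goal: str, completed: list[str], pending: dict[str, str]) -> str:
    """Build the goal summary injected into the system prompt.

    `pending` maps each unfinished step to a status note ("" if there is none).
    """
    parts = [f"Current goal: {goal}."]
    if completed:
        parts.append("Completed: " + ", ".join(completed) + ".")
    if pending:
        items = [f"{step} ({note})" if note else step for step, note in pending.items()]
        parts.append("Pending: " + ", ".join(items) + ".")
    parts.append("Do not repeat completed steps.")
    return " ".join(parts)


prompt_block = render_goal_context(
    goal="create and send invoice",
    completed=["create_customer", "add_bank_details", "create_project", "log_hours"],
    pending={"create_invoice": "blocked, awaiting currency conversion"},
)
print(prompt_block)
```

Regenerating this text on every call means the model always sees the current state, no matter how much earlier conversation has been truncated.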
Lessons learned
- Don't trust the AI to remember goals — persist them in your own state machine
- Use deterministic orchestration for multi-step flows; the AI handles the “what,” the system handles the “when”
- Separate concerns: keep intent extraction, tool selection, and execution state in separate layers
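To make the “what” versus “when” split concrete, here is a small sketch of a deterministic loop. `run_plan`, `propose_params`, and `execute` are hypothetical names, and a real system would also handle approvals and failures.

```python
from typing import Any, Callable

Params = dict[str, Any]


def run_plan(plan: list[str],
             completed: set[str],
             propose_params: Callable[[str], Params],
             execute: Callable[[str, Params], None]) -> list[str]:
    """Run the remaining steps in plan order; return the steps executed in this call."""
    executed = []
    for step in plan:
        if step in completed:
            continue                    # the system never repeats a completed step
        params = propose_params(step)   # the AI's job: the "what"
        execute(step, params)           # the system's job: the "when"
        completed.add(step)
        executed.append(step)
    return executed
```

The model is only ever asked to fill in parameters for the step the system has already chosen, so it cannot skip ahead or loop back.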