Scenario: “Create and send invoice to new client in Iceland for 1,000,000 ISK.” That's seven steps: create customer, add bank details, create project, log hours, create invoice, convert currency, send. Step six is where things usually break.

Why stateless AI breaks on multi-step workflows

LLMs are stateless: each call is independent, and the model only sees what fits in its context window. On long chains of tool calls, earlier context gets truncated, so the model “forgets” the original goal and stops halfway or, worse, repeats steps it has already completed.

Our approach: explicit goal state

We added a thread-level goal tracker: a state machine that records the goal, the completed steps, any pending retries, and how each blocker gets resolved. The AI doesn't have to remember; the system does.
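As a rough illustration, a tracker like this could be as small as the sketch below. The type and function names (GoalState, PendingRetry, startGoal, completeStep, nextStep) are hypothetical and only mirror the four things the tracker is described as recording; this is not the production schema.

```typescript
// Hypothetical record shape: fields mirror what the tracker is said to record.
interface PendingRetry {
  step: string;                        // step to retry later
  params: { [key: string]: unknown };  // its original parameters
  blockedBy: string;                   // what has to be resolved first
}

interface GoalState {
  goal: string;
  plan: string[];                      // ordered step names
  completed: string[];
  pendingRetries: PendingRetry[];
}

function startGoal(goal: string, plan: string[]): GoalState {
  return { goal, plan: plan.slice(), completed: [], pendingRetries: [] };
}

// Record a finished step. Idempotent, so a repeated tool call cannot double-count.
function completeStep(state: GoalState, step: string): void {
  if (state.plan.indexOf(step) === -1) {
    throw new Error(`Unknown step: ${step}`);
  }
  if (state.completed.indexOf(step) === -1) {
    state.completed.push(step);
  }
}

// First planned step that has not completed, or undefined when the goal is done.
function nextStep(state: GoalState): string | undefined {
  for (const step of state.plan) {
    if (state.completed.indexOf(step) === -1) return step;
  }
  return undefined;
}
```

Because the plan and the completed list live outside the model, a truncated context window cannot change what nextStep returns.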

The confirmation complication

Human-in-the-loop confirmation pauses the flow: when a step needs approval, we can't just continue. To pick up where we left off, we added an afterSuccess field on blocked operations, for example: “When update_customer succeeds → automatically retry create_invoice with original parameters.”
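Here is one way the afterSuccess idea could look in code. Only the afterSuccess field name comes from our system; ToolCall, AwaitingApproval, resolveApproval, and the exact placement of the field are assumptions made for this sketch.

```typescript
// Hypothetical types: only the afterSuccess field name is taken from the text.
interface ToolCall {
  tool: string;
  params: { [key: string]: unknown };
}

interface AwaitingApproval {
  call: ToolCall;           // the step that is paused until a human approves it
  afterSuccess?: ToolCall;  // what to retry, with its original parameters, once `call` succeeds
}

interface ApprovalResult {
  remaining: AwaitingApproval[];  // entries still waiting
  followUps: ToolCall[];          // calls to retry now
}

// Called when `succeededTool` finishes: release its entries and collect their follow-ups.
function resolveApproval(queue: AwaitingApproval[], succeededTool: string): ApprovalResult {
  const remaining: AwaitingApproval[] = [];
  const followUps: ToolCall[] = [];
  for (const entry of queue) {
    if (entry.call.tool === succeededTool) {
      if (entry.afterSuccess) followUps.push(entry.afterSuccess);
    } else {
      remaining.push(entry);
    }
  }
  return { remaining, followUps };
}
```

Storing the original parameters next to the blocked call means the retry does not depend on the model reconstructing them after the pause.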

What we inject into the AI's context

We inject explicit instructions into the system prompt on every turn: “Current goal: create and send invoice. Completed: create_customer, add_bank_details, create_project, log_hours. Pending: create_invoice (blocked, awaiting currency conversion). Do not repeat completed steps.” This reduced “lost goal” failures from 40% to under 3%.
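Generating that text from the stored state, rather than writing it by hand, might look like the sketch below. GoalSnapshot, PendingStep, and renderGoalContext are hypothetical names, and the punctuation in the rendered string is simplified slightly.

```typescript
// Hypothetical snapshot of the tracker state at the start of a turn.
interface PendingStep {
  name: string;
  reason?: string;  // set when the step is blocked
}

interface GoalSnapshot {
  goal: string;
  completed: string[];
  pending: PendingStep[];
}

// Render the snapshot as one line of plain instructions for the system prompt.
function renderGoalContext(snapshot: GoalSnapshot): string {
  const pending = snapshot.pending
    .map(p => (p.reason ? `${p.name} (blocked, ${p.reason})` : p.name))
    .join(", ");
  return [
    `Current goal: ${snapshot.goal}.`,
    `Completed: ${snapshot.completed.join(", ") || "none"}.`,
    `Pending: ${pending || "none"}.`,
    "Do not repeat completed steps.",
  ].join(" ");
}
```

Rebuilding the string every turn keeps the prompt in sync with the state machine, so the model never sees a stale list of completed steps.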

Lessons learned

  • Don't trust the AI to remember goals — persist them in your own state machine
  • Use deterministic orchestration for multi-step flows; the AI handles the “what,” the system handles the “when”
  • Separate concerns: keep intent extraction, tool selection, and execution state in separate layers
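To make the last two points concrete, here is a minimal sketch of the split. The layer interfaces and names (Layers, runPlan, handleMessage) are assumptions for illustration; the post names the layers but not their signatures, and real execution would be asynchronous.

```typescript
// Hypothetical layer boundaries.
interface Layers {
  extractIntent: (message: string) => string;  // AI: what the user wants
  selectTools: (intent: string) => string[];   // AI: which tools, in what order
  execute: (tool: string) => boolean;          // system: run one tool, report success
}

// The orchestrator owns the "when": walk the plan in order, skip completed
// steps, and stop at the first failure so the step stays pending for a retry.
function runPlan(plan: string[], completed: string[], execute: (tool: string) => boolean): string[] {
  const done = completed.slice();
  for (const tool of plan) {
    if (done.indexOf(tool) !== -1) continue;  // never repeat a completed step
    if (!execute(tool)) break;                // blocked or failed: stop here
    done.push(tool);
  }
  return done;
}

// The AI layers decide the "what"; the deterministic loop decides the "when".
function handleMessage(message: string, completed: string[], layers: Layers): string[] {
  const intent = layers.extractIntent(message);
  const plan = layers.selectTools(intent);
  return runPlan(plan, completed, layers.execute);
}
```

The model can be swapped or re-prompted without touching runPlan, and the loop can be tested with stub layers and no model at all.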

Want to try it? Join the beta →