Standard operating sequence
- Readiness check: /health/ready
- Start run with explicit goal and mode
- Monitor /api/v1/events and /api/v1/runs/{id}/graph
- Apply pause/resume/cancel controls as needed
- Archive evidence (events, graph, attempts)
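
The readiness and monitoring steps above can be sketched as a few small helpers. This is a minimal illustration, not the service's client: the base URL, the helper names, and the idea that a 200 response from /health/ready means "ready" are all assumptions. Only the three endpoint paths come from the sequence above.

```python
import urllib.error
import urllib.request
from urllib.parse import quote, urljoin

# Assumption: the service is reachable here; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def ready_url(base: str = BASE_URL) -> str:
    """URL for the readiness check."""
    return urljoin(base, "/health/ready")


def events_url(base: str = BASE_URL) -> str:
    """URL for the event stream used while monitoring."""
    return urljoin(base, "/api/v1/events")


def graph_url(run_id: str, base: str = BASE_URL) -> str:
    """URL for one run's graph; the run id is percent-encoded."""
    return urljoin(base, f"/api/v1/runs/{quote(run_id, safe='')}/graph")


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for a GET, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # non-2xx responses still carry a status code
    except urllib.error.URLError:
        return 0


def is_ready(base: str = BASE_URL, fetch_status=http_status) -> bool:
    """Assumption: a 200 from /health/ready means it is safe to start a run."""
    return fetch_status(ready_url(base)) == 200
```

Passing `fetch_status` as a parameter keeps the check testable without a live service, for example `is_ready(fetch_status=lambda url: 503)` returns `False`.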
Incident handling
- Use pause first when uncertainty is high.
- Cancel on a policy breach or uncontrolled retries.
- Attach evidence paths to the postmortem timeline.
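
The pause-versus-cancel rule above can be written as a small decision function. This is a sketch under stated assumptions: the `Action` enum, the function name, and the three boolean inputs are hypothetical, and letting cancel take precedence over pause when both conditions hold is an interpretation of the rules rather than something they state.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    PAUSE = "pause"
    CANCEL = "cancel"


def choose_action(policy_breach: bool,
                  uncontrolled_retry: bool,
                  high_uncertainty: bool) -> Action:
    """Pick a run control for an incident.

    Assumption: cancel conditions outrank pause when both apply.
    """
    if policy_breach or uncontrolled_retry:
        return Action.CANCEL
    if high_uncertainty:
        return Action.PAUSE
    return Action.CONTINUE
```

For example, `choose_action(False, False, True)` yields `Action.PAUSE`, while any policy breach yields `Action.CANCEL` regardless of the other inputs.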
Gate-minded operations
- Quality target >= 0.90
- Time improvement >= 25%
- Cost improvement >= 20%
- Regression must remain false
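
The four gates above can be checked mechanically. The sketch below is illustrative only: the `GateResult` type, its field names, and the choice to express improvements as fractions (0.25 for 25%) are assumptions, and how quality or improvement is measured against a baseline is left to whatever produces these numbers.

```python
from dataclasses import dataclass
from typing import List

# Thresholds taken from the gate list; improvements are fractions of baseline.
MIN_QUALITY = 0.90
MIN_TIME_IMPROVEMENT = 0.25
MIN_COST_IMPROVEMENT = 0.20


@dataclass(frozen=True)
class GateResult:
    quality: float
    time_improvement: float  # 0.25 means 25% faster than baseline
    cost_improvement: float  # 0.20 means 20% cheaper than baseline
    regression: bool


def failed_gates(result: GateResult) -> List[str]:
    """Return the names of every gate the result does not meet."""
    failures = []
    if result.quality < MIN_QUALITY:
        failures.append("quality")
    if result.time_improvement < MIN_TIME_IMPROVEMENT:
        failures.append("time")
    if result.cost_improvement < MIN_COST_IMPROVEMENT:
        failures.append("cost")
    if result.regression:
        failures.append("regression")
    return failures


def passes_gates(result: GateResult) -> bool:
    """True only when all four gates hold."""
    return not failed_gates(result)
```

Returning the list of failed gates, rather than a bare boolean, gives the postmortem or release note something concrete to cite.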