Skip to content

SYSTEM Cited by 1 source

Atlassian Long Horizon

Atlassian's Long Horizon is a reasoning-native agent orchestrator that powers Rovo Chat. It replaced the Hybrid Orchestrator (a hierarchical multi-agent system with per-product sub-agents) with a fundamentally different design: one LLM, one context, one iterative loop running up to 150 iterations per user turn.

Architecture

  • Single reasoning loop. One LLM holds the full conversation state and all available tools. No intermediate agents paraphrase results. The model that decides to make a tool call is the same model that reads its response and decides what to do next.
  • Flattened tool surface. Every product operation (Jira, Confluence, Bitbucket, JSM, Compass, plus third-party connectors) is a typed, namespaced action (jira__search_issues, google_calendar__list_events) callable directly.
  • Progressive disclosure. Each product namespace exposes two meta-tools (get_tool_schema + invoke_tool) with a one-line summary of all tools in the namespace description. Frequently-used tools stay permanently flat. Schema cost: fetched once per tool per task.
  • SKILL.md per namespace. Hand-authored domain guides encoding which tool to reach for, multi-step recipes, and gotchas — replacing the implicit expertise that previously lived in sub-agent prompts.
  • Context compaction service. Runs before each model call; trims or summarises older tool outputs while preserving recent results at full resolution. Pruned content is offloaded for on-demand retrieval.
  • Child instances. For wide tasks, spawns parallel Long Horizon instances — each with its own clean context — to own independent research strands. Parent synthesises finished results.
  • Prompt layer ordering. Assembled from most-stable to most-volatile for maximal prefix-cache reuse. Explicit cache_control markers for Anthropic; implicit caching for OpenAI/Gemini.
  • Adaptive reasoning. Calibrates reasoning depth per query (minimal for lookups, deep for multi-step research).

Production numbers

Metric Value
Iteration budget Up to 150
Timeout 20 minutes
LLM calls per tool use 1 (vs 2 in Hybrid Orchestrator)
Offline accuracy 77% (+8.5% vs previous)
Task completion (Confluence) +23% relative
Chat success rate (online A/B) +0.83%
Perceived latency −37% (streaming progress)

Predecessor: Hybrid Orchestrator

The Hybrid Orchestrator was a hierarchical multi-agent system (coordinator + per-product specialist sub-agents). Limitations that motivated Long Horizon: - Information loss. Orchestrator only saw summaries from sub-agents; never raw tool outputs, intermediate reasoning, or errors. - Limited iteration depth. Optimised for 2–5 step tasks; multi-step research hit the ceiling. - Models outgrew the box. Frontier models handle longer contexts and larger tool surfaces reliably; fragmenting context across sub-agents prevented them from using these capabilities. - N-way model migration. Every model upgrade required re-tuning N sub-agents. Long Horizon: one model migration, not N.

Seen in

Last updated · 542 distilled / 1,571 read