
JSONL output streaming

JSONL (JSON Lines) is a text format in which every line is a valid, self-contained JSON object. As a streaming protocol for long-running processes, it resolves a specific pain: standard JSON requires an enclosing array to be balanced ([ ... ]), so a process that crashes mid-stream leaves an unparseable payload on disk. With JSONL, every line is independently parseable; truncation costs at most one trailing record.

Specification: jsonlines.org.
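The truncation-tolerance property is easy to see in code. A minimal sketch of a crash-tolerant reader (the function name is illustrative): parse line by line, and treat the first unparseable line as the torn tail.

```typescript
// Parse a possibly-truncated JSONL buffer. Every complete line is an
// independent record; a crash mid-write costs at most one trailing line.
function parseJsonl(text: string): unknown[] {
  const events: unknown[] = [];
  for (const line of text.split("\n")) {
    if (line.trim() === "") continue; // final newline leaves an empty element
    try {
      events.push(JSON.parse(line));
    } catch {
      break; // torn tail from a mid-write crash; keep everything before it
    }
  }
  return events;
}
```

Feeding it a stream cut off mid-record, e.g. `parseJsonl('{"a":1}\n{"b":2}\n{"c":')`, recovers the two complete events; a closed-array payload cut at the same point would yield nothing.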

Why CI and agent harnesses converge on JSONL

Every CI system parsing structured output from a long-running child eventually lands on something like JSONL because the alternatives are worse:

  • Closed JSON array [...] — missing trailing ] on crash leaves the payload unparseable, and the whole array must be buffered in memory.
  • Framed binary (protobuf, msgpack) — not human-inspectable; needs a framing layer; heavier toolchain.
  • Plaintext logs — unstructured; must be re-parsed by regex; lossy.
  • Separate file per event — filesystem overhead; ordering needs external sequencing.

JSONL sidesteps all four. Read a line, parse it, move on. No upfront buffering, no balanced-bracket requirement, grep-able with standard tools.

The streaming-producer discipline

A process emitting JSONL commits to:

  • One event per line. No multi-line JSON objects, no pretty-printing.
  • Newline-terminated (\n), including the final event.
  • Flush frequently. stdout is line-buffered only when attached to a terminal; a pipe is fully buffered by default, so explicit flushes are needed.
  • Each line is self-contained. No references to prior lines' shapes; parsers can be stateless.
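The producer side reduces to a one-liner once the discipline is fixed. A hedged sketch (the `emitEvent` helper is illustrative, not from the source):

```typescript
// Emit one event per line, compact and newline-terminated.
// JSON.stringify with no indentation guarantees a single physical line:
// any raw newline inside a string value is escaped to \n by the serializer.
function emitEvent(write: (chunk: string) => void, event: object): void {
  write(JSON.stringify(event) + "\n");
}
```

In a Node/Bun producer you would call it as `emitEvent((s) => process.stdout.write(s), { type: "step_finish" })`; `process.stdout.write` hands the chunk to the pipe immediately rather than holding it in a userland line buffer.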

The streaming-consumer discipline

A process consuming JSONL from a child:

  • Buffer-then-flush, not line-at-a-time. Cloudflare's AI Code Review consumer buffers 100 lines or 50 ms, whichever comes first, "to save our disks from a slow but painful appendFileSync death."
  • Real-time pattern matching. Consumers watch the stream for specific events (step_finish, error, session.idle) and react — not by processing the whole file at the end.
  • Tolerate partial tail. If the producer crashes, the tail is at worst one unparseable line; parse-what-you-can and continue.
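The buffer-then-flush rule can be sketched as a small accumulator. The 100-line / 50 ms thresholds mirror the Cloudflare figures quoted above; the class and method names here are hypothetical:

```typescript
// Buffer incoming JSONL lines; flush on whichever comes first:
// a size threshold (default 100 lines) or a delay threshold (default 50 ms).
class JsonlSink {
  private buffer: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  constructor(
    private flushFn: (lines: string[]) => void,
    private maxLines = 100,
    private maxDelayMs = 50,
  ) {}

  push(line: string): void {
    this.buffer.push(line);
    if (this.buffer.length >= this.maxLines) {
      this.flush(); // size threshold hit
    } else if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.maxDelayMs);
    }
  }

  flush(): void {
    if (this.timer !== null) { clearTimeout(this.timer); this.timer = null; }
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.flushFn(batch); // e.g. one appendFile call for the whole batch
  }
}
```

One disk write per batch instead of one per line is exactly what spares the disk from the per-line appendFileSync pattern.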

Event shapes that ride JSONL in agent harnesses

From Cloudflare's OpenCode embedding:

  • step_finish — carries reason (length, stop, tool_use, …) and cumulative token usage. Triggers retry when reason: "length" (hit max-tokens mid-sentence).
  • error — structured union type (APIError, ProviderAuthError, ContextOverflowError, MessageAbortedError, …). Classifier reads this to decide shouldFailback before invoking the circuit breaker.
  • session.idle — primary completion signal; backed by a 3-second polling fallback.
  • Token-usage events — extracted from step_finish to drive per-reviewer cost accounting in real time.
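Because each line is self-contained, these shapes map naturally onto a discriminated union. A sketch approximating the events above (field names beyond those quoted in the text, such as `usage` and `name`, are assumptions):

```typescript
// Hypothetical union over the event shapes described above.
type StreamEvent =
  | { type: "step_finish";
      reason: "length" | "stop" | "tool_use";
      usage: { inputTokens: number; outputTokens: number } }
  | { type: "error";
      name: "APIError" | "ProviderAuthError"
          | "ContextOverflowError" | "MessageAbortedError" }
  | { type: "session.idle" };

// React per-event as lines stream past, not by post-processing the file:
// reason "length" means the model hit max-tokens mid-sentence, so retry.
function shouldRetry(event: StreamEvent): boolean {
  return event.type === "step_finish" && event.reason === "length";
}
```

Narrowing on the `type` field is what makes stateless per-line consumers cheap: every handler sees one complete, fully-typed record.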

The ARG_MAX / E2BIG gotcha JSONL solves at the boundary

Child processes running LLM agents land on JSONL for a related reason: the input direction has a symmetric problem. Large merge-request descriptions passed as a command-line argument hit Linux's ARG_MAX limit, and the exec fails with E2BIG. Solution: pipe the prompt via stdin, and receive results via stdout as JSONL. Both ends of the IPC channel become streams — no argv, no payload files. Cloudflare's Bun.spawn invocation encodes exactly this choice:

Bun.spawn(
  ["bun", opencodeScript, "--format", "json", "--agent", "review_coordinator", "run"],
  { stdin: Buffer.from(prompt), stdout: "pipe", stderr: "pipe" }
);
