JSONL output streaming¶
JSONL (JSON Lines) is a text format in which every line is a valid, self-contained JSON object. As a streaming protocol for long-running processes it resolves a specific pain: standard JSON requires an enclosing array to be balanced ([ ... ]), so a process that crashes mid-stream leaves an unparseable payload on disk. With JSONL, every line is independently parseable; truncation costs at most one trailing record.
Specification: jsonlines.org.
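The truncation tolerance can be shown in a few lines. This is a minimal sketch (event names and token counts are illustrative): a payload whose final record was cut off mid-write still yields every complete record.

```typescript
// Parse a JSONL payload whose last line was truncated by a crash.
// Each line is self-contained, so only the incomplete tail is lost.
const payload =
  '{"event":"step_start","step":1}\n' +
  '{"event":"step_finish","step":1,"tokens":812}\n' +
  '{"event":"step_sta'; // producer died mid-write

const records: unknown[] = [];
for (const line of payload.split("\n")) {
  if (line.trim() === "") continue;
  try {
    records.push(JSON.parse(line)); // each line parses independently
  } catch {
    // at most one unparseable trailing line; skip and continue
  }
}
console.log(records.length); // prints 2
```

Contrast with a closed JSON array: `JSON.parse` on the same truncated payload would fail outright, losing all three records.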
Why CI and agent harnesses converge on JSONL¶
Every CI system parsing structured output from a long-running child eventually lands on something like JSONL because the alternatives are worse:
| Alternative | Failure mode |
|---|---|
| Closed JSON array `[...]` | Missing trailing `]` on crash → unparseable; must buffer entire payload in memory. |
| Framed binary (protobuf, msgpack) | Not human-inspectable; needs a framing layer; heavier toolchain. |
| Plaintext logs | Not structured; must be re-parsed by regex; lossy. |
| Separate file per event | Filesystem overhead; ordering needs external sequencing. |
JSONL sidesteps all four. Read a line, parse it, move on. No upfront buffering, no balanced-bracket requirement, grep-able with standard tools.
The streaming-producer discipline¶
A process emitting JSONL commits to:
- One event per line. No multi-line JSON objects, no pretty-printing.
- Newline-terminated (`\n`), including the final event.
- Flush frequently. stdout is line-buffered at a terminal but fully buffered when attached to a pipe.
- Each line is self-contained. No references to prior lines' shapes; parsers can be stateless.
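The producer side amounts to a few lines. A minimal sketch (the event shape is illustrative, not OpenCode's schema):

```typescript
// Minimal JSONL producer: one event per line, newline-terminated.
type Event = Record<string, unknown>;

// Serialize one event as exactly one newline-terminated line.
// With no indent argument, JSON.stringify escapes any embedded
// newlines, so the output is guaranteed to be a single line.
function toJsonlLine(ev: Event): string {
  return JSON.stringify(ev) + "\n";
}

// Write per event rather than accumulating a large in-process
// buffer, so a consumer on the other end of the pipe sees events
// as they happen.
function emit(ev: Event): void {
  process.stdout.write(toJsonlLine(ev));
}

emit({ event: "step_start", step: 1 });
emit({ event: "step_finish", step: 1, reason: "stop" });
```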
The streaming-consumer discipline¶
A process consuming JSONL from a child:
- Buffer-then-flush, not line-at-a-time. Cloudflare's AI Code Review consumer buffers 100 lines or 50 ms, whichever comes first, "to save our disks from a slow but painful
appendFileSyncdeath." - Real-time pattern matching. Consumers watch the stream for specific events (
step_finish,error,session.idle) and react — not by processing the whole file at the end. - Tolerate partial tail. If the producer crashes, the tail is at worst one unparseable line; parse-what-you-can and continue.
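The buffer-then-flush rule can be sketched as follows. This is an assumption-laden reconstruction, not Cloudflare's code: the 100-line / 50 ms thresholds come from the source, but the class name, sink callback, and file path are hypothetical.

```typescript
import { appendFileSync } from "node:fs";

const MAX_LINES = 100; // flush on line count...
const FLUSH_MS = 50;   // ...or on timeout, whichever comes first

class JsonlBuffer {
  private lines: string[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  // sink receives one batched chunk per flush
  constructor(private sink: (chunk: string) => void) {}

  push(line: string): void {
    this.lines.push(line);
    if (this.lines.length >= MAX_LINES) {
      this.flush();
    } else if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), FLUSH_MS);
    }
  }

  flush(): void {
    if (this.timer !== null) { clearTimeout(this.timer); this.timer = null; }
    if (this.lines.length === 0) return;
    this.sink(this.lines.join("\n") + "\n"); // one write per batch
    this.lines = [];
  }
}

// Wired to disk (path illustrative): one appendFileSync call per
// batch instead of one per line.
const toDisk = new JsonlBuffer((chunk) => appendFileSync("events.jsonl", chunk));
```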
Event shapes that ride JSONL in agent harnesses¶
From Cloudflare's OpenCode embedding:
- `step_finish` — carries `reason` (`length`, `stop`, `tool_use`, …) and cumulative token usage. Triggers retry when `reason: "length"` (hit max-tokens mid-sentence).
- `error` — structured union type (`APIError`, `ProviderAuthError`, `ContextOverflowError`, `MessageAbortedError`, …). Classifier reads this to decide `shouldFailback` before invoking the circuit breaker.
- `session.idle` — primary completion signal; backed by 3-second polling fallback.
- Token-usage events — extracted from `step_finish` to drive per-reviewer cost accounting in real time.
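The events above suggest a discriminated union on the consumer side. A hedged sketch — the event names, `reason` values, and error names are from the source, but the field layout beyond that is assumed, not OpenCode's actual schema:

```typescript
// Assumed field layout; only the names quoted above are sourced.
type StepFinishEvent = {
  type: "step_finish";
  reason: "length" | "stop" | "tool_use" | string;
  usage: { inputTokens: number; outputTokens: number }; // cumulative
};

type ErrorEvent = {
  type: "error";
  name:
    | "APIError"
    | "ProviderAuthError"
    | "ContextOverflowError"
    | "MessageAbortedError"
    | string;
};

type SessionIdleEvent = { type: "session.idle" };

type AgentEvent = StepFinishEvent | ErrorEvent | SessionIdleEvent;

// The retry trigger described above: step_finish with reason "length"
// means the model hit max-tokens mid-sentence.
function shouldRetry(ev: AgentEvent): boolean {
  return ev.type === "step_finish" && ev.reason === "length";
}
```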
The ARG_MAX / E2BIG gotcha JSONL solves at the boundary¶
A related reason child processes running LLM agents land on JSONL: the input direction has a symmetric problem. Large merge-request descriptions passed as a command-line argument hit Linux's ARG_MAX limit (E2BIG). Solution: pipe the prompt via stdin, and receive results via stdout as JSONL. Both ends of the IPC channel become streams — no argv, no payload files. Cloudflare's Bun.spawn invocation encodes exactly this choice:
```ts
Bun.spawn(
  ["bun", opencodeScript, "--format", "json", "--agent", "review_coordinator", "run"],
  { stdin: Buffer.from(prompt), stdout: "pipe", stderr: "pipe" }
);
```
Seen in¶
- sources/2026-04-20-cloudflare-orchestrating-ai-code-review-at-scale — canonical use: OpenCode child process emits JSONL; orchestrator buffers 100 lines / 50 ms; events drive retry, heartbeat, token accounting.
Related¶
- systems/opencode — producer; `--format json` emits JSONL.
- systems/cloudflare-ai-code-review — consumer.
- patterns/jsonl-streaming-child-process — the full embedding pattern this concept is the transport of.
- concepts/ai-thinking-heartbeat — a consumer-side artefact built on top of the "no new JSONL event in N seconds" signal.