
CONCEPT Cited by 1 source

OpenTelemetry collector — receiver / processor / exporter pipeline

The OpenTelemetry Collector is OTel's vendor-neutral intermediary between data sources (applications, SDKs, infra agents, metric streams) and observability backends (Grafana Cloud, Honeycomb, Lightstep, AWS X-Ray, Jaeger, Prometheus, etc.). What makes the collector the central-hub primitive of OTel is its load-bearing internal abstraction: a three-stage pipeline, Receivers → Processors → Exporters. Each stage is independently pluggable, and the data passed between stages is always in OTel's internal format.

The three stages

The 2026-05-13 source canonicalises the triad verbatim:

"The OpenTelemetry collector operates through three primary components that work together in a processing flow: Receivers accept data in specified formats (like Prometheus or OpenTelemetry Protocol (OTLP)) and translate it into OpenTelemetry's internal format; Processors manipulate and enrich the data as it flows through (filtering unnecessary data, batching for performance, transforming to mask sensitive information, or adding metadata like Kubernetes attributes); and Exporters send the processed data to destination backends such as Grafana Cloud, AWS X-Ray, Lightstep or Honeycomb." (Source)

Stage | Role | Examples
--- | --- | ---
Receivers | Accept incoming telemetry in N input formats; translate to OTel internal format | OTLP/gRPC, OTLP/HTTP, Prometheus scrape, statsd, fluentd, AWS Metric Streams, syslog
Processors | Manipulate / enrich / filter / batch in-flight | Batch, attributes (add/edit/remove), filter, memory-limiter, k8s-attributes, tail-sampling
Exporters | Send out in N output formats to N backends | OTLP, Prometheus remote-write, AWS X-Ray, CloudWatch, Grafana Cloud, Honeycomb, Splunk
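
The flow through the three stages can be illustrated with a toy model. This is a minimal sketch, not the collector's implementation: the Record class, the "name:value" wire format, and the functions line_receiver, add_region and run_pipeline are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    """Hypothetical stand-in for the collector's internal format."""
    name: str
    value: float
    attributes: Dict[str, str] = field(default_factory=dict)


def line_receiver(payload: str) -> List[Record]:
    """Receiver: translate 'name:value' lines (an invented format) into Records."""
    records = []
    for line in payload.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        name, _, raw = line.partition(":")
        records.append(Record(name=name.strip(), value=float(raw)))
    return records


def add_region(records: List[Record]) -> List[Record]:
    """Processor: enrich every record with a static attribute."""
    for record in records:
        record.attributes["region"] = "example-region"
    return records


def run_pipeline(payload, receiver, processors, exporters):
    """Receive once, apply processors in order, then hand the result to each exporter."""
    records = receiver(payload)
    for process in processors:
        records = process(records)
    for export in exporters:
        export(records)
    return records


# Two in-memory lists stand in for two backends.
backend_a: List[Record] = []
backend_b: List[Record] = []
run_pipeline("cpu:0.5\nmem:12", line_receiver, [add_region],
             [backend_a.extend, backend_b.extend])
```

Only the receiver knows the input format and only the exporters know the destinations; the processor sees nothing but Record objects, which is the property the table describes.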

Why the three-stage shape matters: vendor-neutral fan-out

The collector's value proposition is not any single stage — it's the decoupling the three stages enforce. Two architectural properties fall out of the design:

  • N receivers × M exporters with a single internal format in the middle. Without the OTel internal format, every source-format × destination-format pair would need a custom translator (an N×M matrix). With the internal format, you need N receivers + M exporters (an N+M sum). This is the same argument that made LLVM's IR load-bearing for compiler back-ends.
  • Multiple exporters per pipeline = vendor-neutral fan-out. A single collector can simultaneously export the same metric stream to AWS CloudWatch (for native AWS integration), Grafana Cloud (for dashboards), and Honeycomb (for high-cardinality query). The customer can switch exporters without changing application instrumentation, removing a load-bearing source of vendor lock-in.
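
Both properties can be made concrete with a small sketch. Everything here is assumed for illustration: the function names, the dictionary-based record shape, and the two output encodings are hypothetical and do not correspond to real exporter formats.

```python
import json
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical internal format


def translators_needed(n_inputs: int, m_outputs: int, shared_format: bool) -> int:
    """Count adapters: N + M with a shared internal format, N * M without one."""
    return n_inputs + m_outputs if shared_format else n_inputs * m_outputs


def json_exporter(records: List[Record]) -> List[str]:
    """Encode each record as a JSON line (illustrative output format)."""
    return [json.dumps(r, sort_keys=True) for r in records]


def text_exporter(records: List[Record]) -> List[str]:
    """Encode each record as 'name value' (illustrative output format)."""
    return [f"{r['name']} {r['value']}" for r in records]


def fan_out(records: List[Record],
            exporters: Dict[str, Callable[[List[Record]], List[str]]]) -> Dict[str, List[str]]:
    """Send the same records to every configured exporter."""
    return {name: export(records) for name, export in exporters.items()}


records = [{"name": "cpu_utilization", "value": 0.42}]
out = fan_out(records, {"backend_a": json_exporter, "backend_b": text_exporter})
```

With the seven receiver examples and seven exporter examples from the table above, translators_needed(7, 7, False) is 49 pairwise translators, while translators_needed(7, 7, True) is 14 adapters. Adding or removing a backend only changes the dictionary passed to fan_out; the code that produced the records is untouched.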

Why the processor stage matters

Receivers and exporters are largely format adapters; the processor stage is where production observability problems are solved:

  • Filtering — drop high-volume / low-signal metrics before egress; the cost lever for SaaS observability bills.
  • Batching — coalesce per-record exports into per-batch HTTP requests; the throughput lever.
  • Attribute manipulation — add cluster_id, region, team from collector config; the standardisation lever.
  • PII masking — strip / redact sensitive fields before egress; the compliance lever.
  • Tail sampling — sample traces based on whole-trace outcome (errors, latency); the trace-cost lever.

The processor stage is where the collector earns its keep relative to a thin proxy. "Filtering unnecessary data, batching for performance, transforming to mask sensitive information, or adding metadata like Kubernetes attributes" — every one of these is a cross-cutting concern that doesn't belong in application code or in backend ingestion.
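
The five levers listed above can each be sketched as a small function over in-flight records. This is a simplified illustration under assumed data shapes (plain dictionaries), not the behaviour of the collector's actual filter, batch, attributes, or tail-sampling processors; all function names and field names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List, Set

Record = Dict[str, object]  # hypothetical internal format


def drop_by_prefix(records: List[Record], prefixes: Iterable[str]) -> List[Record]:
    """Filtering: drop records whose name starts with any of the given prefixes."""
    blocked = tuple(prefixes)
    return [r for r in records if not str(r["name"]).startswith(blocked)]


def add_attributes(records: List[Record], extra: Dict[str, str]) -> List[Record]:
    """Attribute manipulation: merge static attributes into every record."""
    return [{**r, "attributes": {**r.get("attributes", {}), **extra}} for r in records]


def mask_keys(records: List[Record], sensitive: Set[str]) -> List[Record]:
    """PII masking: replace the values of sensitive attribute keys."""
    masked = []
    for r in records:
        attrs = {k: ("***" if k in sensitive else v)
                 for k, v in r.get("attributes", {}).items()}
        masked.append({**r, "attributes": attrs})
    return masked


def batch(records: List[Record], size: int) -> Iterator[List[Record]]:
    """Batching: yield fixed-size chunks so each export is one request per chunk."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def tail_sample(traces: Dict[str, List[Record]], latency_ms: float) -> Dict[str, List[Record]]:
    """Tail sampling: keep a whole trace if any span errored or was slow."""
    return {
        trace_id: spans
        for trace_id, spans in traces.items()
        if any(s["error"] or s["duration_ms"] > latency_ms for s in spans)
    }


records = [
    {"name": "debug.heartbeat", "value": 1, "attributes": {}},
    {"name": "http.requests", "value": 7, "attributes": {"user_email": "a@example.com"}},
    {"name": "http.errors", "value": 2, "attributes": {}},
]
kept = drop_by_prefix(records, ["debug."])
masked = mask_keys(add_attributes(kept, {"team": "payments"}), {"user_email"})
batches = list(batch(masked, 2))

traces = {
    "t1": [{"error": False, "duration_ms": 20}],
    "t2": [{"error": True, "duration_ms": 5}],
    "t3": [{"error": False, "duration_ms": 900}],
}
sampled = tail_sample(traces, latency_ms=500)
```

The sampling decision needs every span of a trace, which is why it is made after the whole trace has been seen rather than per span at ingestion time.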

Architectural role: the collector as central hub

In the 2026-05-13 source's customer architecture, the OpenTelemetry collector sits inside the customer's VPC on EC2 (with internal-NLB ingress) and is explicitly framed as a central hub:

"The collector is a central hub that receives, processes, and forwards telemetry data (metrics, traces, and logs) from various sources to multiple destinations in a vendor-neutral way."

The hub property is the receiver-processor-exporter pipeline made operational. From the customer's standpoint, the upstream producers (CloudWatch Metric Streams, application SDKs, infra agents) and downstream backends (Grafana Cloud, Honeycomb, X-Ray) all change independently of each other. The collector's three stages absorb the change.

Seen in

  • sources/2026-05-13-aws-streaming-cloudwatch-metrics-to-vpc-based-opentelemetry-collectors-using-lambda: first canonical wiki home. The post canonicalises the receiver / processor / exporter triad verbatim and names the collector explicitly as the "central hub" abstraction. The collector's fan-out role is illustrated with named exporter destinations: Grafana Cloud, AWS X-Ray, Lightstep, Honeycomb. Receiver examples named: Prometheus, OTLP. Processor examples named: filter / batch / mask / k8s-attribute. The post does not disclose the customer's specific processor configuration; the value is in canonicalising the architectural shape.