SYSTEM Cited by 1 source

Redpanda Connect dynamic plugins

Redpanda Connect dynamic plugins is a Beta plugin framework introduced in Redpanda Connect v4.56.0 (2025-06-17, Apache 2.0) that allows external programs — written in any language with gRPC support — to act as Redpanda Connect inputs, processors, or outputs without being compiled into the main Go binary. It is the structural successor to the previous "compiled plugins" model (Go-only, built into the binary), reframing Go plugins as the performance-critical path and dynamic plugins as the flexibility / language-choice path.

Architecture

The feature ships with three new compiled plugins in the Redpanda Connect binary — one per component type: BatchInput, BatchProcessor, BatchOutput. These act as dispatch shims: when a pipeline references a dynamic plugin by name, the shim boots the plugin's executable as a subprocess and forwards messages to it over a gRPC service whose interface "closely mirrors the existing interfaces defined for plugins within Redpanda Connect's core engine, Benthos" (Source: sources/2025-06-17-redpanda-introducing-multi-language-dynamic-plugins-for-redpanda-connect).

                        Redpanda Connect host process (Go)
                        ┌──────────────────────────────────┐
                        │  pipeline                        │
                        │  ┌──────────────────────────┐    │
                        │  │ BatchProcessor shim      │    │
                        │  │ (compiled plugin)        │    │
                        │  └─────────────┬────────────┘    │
                        └────────────────│─────────────────┘
                                         │ gRPC over Unix socket
                                         │ (protobuf-serialized
                                         │  message batches)
                        ┌──────────────────────────────────┐
                        │  Plugin subprocess               │
                        │  (Python / Go / any gRPC lang)   │
                        │  implements Benthos-mirrored     │
                        │  BatchProcessor service          │
                        └──────────────────────────────────┘

Four load-bearing properties of this shape (Source: sources/2025-06-17-redpanda-introducing-multi-language-dynamic-plugins-for-redpanda-connect):

  • Language agnosticism: "Write plugins in virtually any language that supports gRPC." Language SDKs ship for Go and Python at launch; any language with a gRPC code generator could implement the protocol itself.
  • Process isolation: "Plugins run in separate processes, so crashes won't take down the main Redpanda Connect engine." Canonicalized as concepts/subprocess-plugin-isolation.
  • Protobuf wire format: "efficient data transfer with Protocol Buffers". The serialization path is the same class as any gRPC service; the wire cost is the protobuf encode plus the Unix socket traversal, not full JSON / text serialization.
  • One-subprocess-per-plugin mapping: "Each plugin maps to a single subprocess, keeping things modular and isolated." No implicit multiplexing of distinct plugins into one subprocess.
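The process-isolation property can be illustrated with a minimal sketch. This is not Redpanda Connect's shim code: the "plugin" here is a hypothetical throwaway child process that raises an uncaught exception, and run_plugin is an illustrative helper name. It only shows the general point that a failure in a child process surfaces to the parent as an exit code rather than as a crash of the parent.

```python
import subprocess
import sys

# Hypothetical stand-in for a plugin executable: a child Python process
# that dies with an uncaught exception, simulating a plugin crash.
CRASHING_PLUGIN = [sys.executable, "-c", "raise RuntimeError('plugin bug')"]


def run_plugin(argv: list[str]) -> int:
    """Launch argv as a child process, wait for it, and return its exit code."""
    completed = subprocess.run(argv, capture_output=True)
    return completed.returncode


exit_code = run_plugin(CRASHING_PLUGIN)

# Execution reaches this line only because the failure stayed in the child.
host_still_running = True
print(f"plugin exit code: {exit_code}; host still running: {host_still_running}")
```

How the real host reacts to a plugin exit (restart, fail the pipeline, or something else) is not described in the source, so the sketch stops at observing the exit code.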

Batch-only component types

Only the batch variants of each component shape (BatchInput, BatchProcessor, BatchOutput) are available as dynamic plugins. The rationale is explicit: "We use batch components exclusively to amortize the cost of cross-process communication." (Source: sources/2025-06-17-redpanda-introducing-multi-language-dynamic-plugins-for-redpanda-connect). Canonicalized as concepts/batch-only-component-for-ipc-amortization. The single-message component types (Input, Processor, Output) are deliberately excluded — their per-message cost model would be dominated by protobuf + socket overhead, making them a worse shape for cross-process implementation.
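The amortization argument can be made concrete with a toy cost model. The figures below (50 µs of fixed overhead per cross-process call, 1 µs of per-message work) are invented for illustration and are not measurements from the post; total_ipc_cost_us is a hypothetical helper.

```python
import math


def total_ipc_cost_us(
    n_messages: int,
    batch_size: int,
    per_call_overhead_us: int,
    per_message_us: int,
) -> int:
    """Toy model: every cross-process call pays a fixed overhead,
    and every message pays a per-message serialization cost."""
    calls = math.ceil(n_messages / batch_size)
    return calls * per_call_overhead_us + n_messages * per_message_us


# Illustrative, made-up numbers: 10,000 messages, 50 us per call, 1 us per message.
one_at_a_time = total_ipc_cost_us(10_000, 1, 50, 1)
batched = total_ipc_cost_us(10_000, 100, 50, 1)

print(one_at_a_time)  # 510000
print(batched)        # 15000
```

Under these assumed numbers the fixed per-call overhead dominates the single-message case and nearly disappears once 100 messages share one call, which is the shape of the argument for exposing only batch components.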

Plugin surface — the developer contract

A dynamic plugin is packaged as three artefacts:

  1. An executable (any language with a gRPC server). The executable implements the batch Input / Processor / Output gRPC service against the Benthos-mirrored protocol.
  2. A plugin descriptor (plugin.yaml) declaring the plugin's name, type (input / processor / output), the command argv array to launch the executable, and any configurable fields. Example from the post:

     name: yell
     summary: Just the simplest example
     command: ["uv", "run", "main.py"]
     type: processor
     fields: []

  3. A pipeline YAML (connect.yaml) referencing the plugin by name as if it were a native component:

     pipeline:
       processors:
         - yell: {}

Execution: rpk connect run --rpc-plugins=plugin.yaml connect.yaml. The rpk CLI is responsible for spawning the plugin subprocess, establishing the Unix socket, and wiring the shim to the subprocess.
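As a rough aid for plugin authors, the descriptor shape described above can be sanity-checked before handing it to rpk. The checks below are assumptions inferred from the example descriptor, not the validation rules rpk actually applies, and validate_descriptor is a hypothetical helper. The descriptor is written as a Python dict to avoid depending on a YAML library.

```python
VALID_TYPES = {"input", "processor", "output"}


def validate_descriptor(descriptor: dict) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks sane.

    These rules are guesses based on the example descriptor, not rpk's own checks.
    """
    problems = []

    name = descriptor.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name must be a non-empty string")

    if descriptor.get("type") not in VALID_TYPES:
        problems.append("type must be one of: input, processor, output")

    command = descriptor.get("command")
    is_argv = (
        isinstance(command, list)
        and len(command) > 0
        and all(isinstance(arg, str) for arg in command)
    )
    if not is_argv:
        problems.append("command must be a non-empty list of strings")

    if not isinstance(descriptor.get("fields", []), list):
        problems.append("fields must be a list")

    return problems


# The example descriptor from the post, expressed as a dict.
yell_descriptor = {
    "name": "yell",
    "summary": "Just the simplest example",
    "command": ["uv", "run", "main.py"],
    "type": "processor",
    "fields": [],
}

print(validate_descriptor(yell_descriptor))  # []
```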

Language SDKs at launch

  • Go SDK: "provides a familiar environment for existing Redpanda Connect developers, with type-safe interfaces that mirror the core Redpanda Connect components". Framed as a stepping stone for Go developers who previously had to compile plugins in; lets them iterate without rebuilding the binary.
  • Python SDK: explicitly the headline target. "The Python SDK opens up Redpanda Connect to one of the most popular languages for data processing and AI/ML workloads." Stated ecosystem access: PyTorch / TensorFlow / Hugging Face Transformers (deep learning frameworks), LangChain / LlamaIndex (LLM orchestration), Pandas / NumPy / SciPy (data processing).

The @redpanda_connect.processor decorator is the Python-side entry point; the SDK handles the gRPC server wiring. Minimal Python processor:

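import asyncio
import logging

import redpanda_connect


# The decorator marks yell as a processor; the SDK handles the gRPC server wiring.
@redpanda_connect.processor
def yell(msg: redpanda_connect.Message) -> redpanda_connect.Message:
    # Upper-case the payload and return the modified message.
    msg.payload = msg.payload.upper()
    return msg


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Hand the processor to the SDK's entry point.
    asyncio.run(redpanda_connect.processor_main(yell))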
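Because the processor body is an ordinary function over a message, its logic can be exercised without the SDK or a running host. The sketch below uses FakeMessage, a hypothetical stand-in that models only a payload attribute; it is not the SDK's Message class, and the assumption that the payload is bytes is ours, not the post's.

```python
from dataclasses import dataclass


@dataclass
class FakeMessage:
    """Hypothetical stand-in for redpanda_connect.Message; models only payload."""

    payload: bytes


def yell_logic(msg: FakeMessage) -> FakeMessage:
    # Same transformation as the yell processor, minus the SDK decorator.
    msg.payload = msg.payload.upper()
    return msg


result = yell_logic(FakeMessage(payload=b"hello, world"))
print(result.payload)  # b'HELLO, WORLD'
```

Keeping the transformation in a plain function like this makes it easy to unit test, with the decorated processor reduced to a thin wrapper.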

When to use dynamic vs compiled plugins

Explicit guidance from the post (Source: sources/2025-06-17-redpanda-introducing-multi-language-dynamic-plugins-for-redpanda-connect):

"Use compiled plugins for most standard use cases and best performance. Use dynamic plugins when you need language flexibility or to wrap existing libraries not available in Go."

And the reverse framing:

"For performance-critical workloads where every microsecond counts, the best approach remains using native Go plugins compiled directly into the Redpanda Connect binary. Dynamic plugins shine for flexibility and language choice, while compiled plugins offer maximum performance."

Canonicalized as patterns/compiled-vs-dynamic-plugin-tradeoff. The dynamic plugin model is additive to the compiled-plugin model, not a replacement.

Stated use cases

The post lists five opening use cases for dynamic plugins:

  1. Real-time ML inference: a Python processor plugin running a pre-trained BERT model from Hugging Face for sentiment analysis on streaming customer feedback; messages are enriched with sentiment scores and emotion classifications.
  2. Wrapping non-Go libraries: bringing library ecosystems (Python / Java / C++) that have no Go equivalents into a Redpanda Connect pipeline without rewriting the library.
  3. Complex data transformations: NumPy / SciPy statistical anomaly detection, time-series forecasting, and NLP, running inside the streaming pipeline.
  4. Lower barrier to entry: "Allow data scientists and ML engineers to contribute plugins in Python without learning Go." An organizational / contributor-ecosystem argument, not a technical one.
  5. Independent deployment: plugins can be upgraded without rebuilding the core Redpanda Connect binary; a plugin and the host process have different release cadences.
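To give a flavor of the batch-level logic the third use case has in mind, here is a small z-score anomaly check over one batch of numeric readings. It uses only the standard library rather than NumPy / SciPy, the threshold and sample values are invented, and flag_anomalies is a hypothetical helper that is not wired to the SDK.

```python
import statistics


def flag_anomalies(values: list[float], threshold: float = 2.0) -> list[int]:
    """Return indices of values whose z-score magnitude exceeds threshold.

    Uses the population standard deviation of the batch itself.
    """
    if len(values) < 2:
        return []
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Made-up batch: seven ordinary readings and one outlier at index 7.
batch = [10, 11, 9, 10, 12, 10, 11, 50]
print(flag_anomalies(batch))  # [7]
```

A real plugin would apply something like this inside a batch processor and attach the result to each message; how that attachment works depends on SDK details the post does not spell out.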

Licensing

Apache 2.0. Explicitly declared. Stands in contrast to the Enterprise-licensed CDC input connectors covered in the 2025-03-18 Redpanda Connect post — the plugin framework is open; connectors written on top may carry any license.

Known gaps at launch (2025-06-17 Beta)

  • No performance numbers. The post asserts batch-only components amortize the IPC cost but doesn't quantify: no throughput delta vs compiled plugins, no p99 latency on the cross-process hop, no benchmark against a reference workload.
  • No gRPC .proto published inline. The protocol "closely mirrors" Benthos's existing plugin interfaces — implementors must consult the Go / Python SDK source rather than a spec.
  • No process lifecycle details. Crash recovery, socket cleanup, supervisor model for crashed plugins, and shared state between host and plugin on restart are all unspecified.
  • No horizontal scaling model for CPU-bound plugins. One subprocess per plugin is the described shape. A Python plugin doing heavy CPU work cannot, per this post, spawn N worker subprocesses under one plugin descriptor.
  • Beta stability only. Protocol stability across minor Redpanda Connect versions is not guaranteed at v4.56.0.
