Ockam and Redpanda partner to launch the world's first zero-trust streaming data platform.
Matthew Gregory CEO
Published 2024-05-30
Ockam teamed up with Redpanda to launch Redpanda Connect with Ockam: the first zero-trust streaming data platform. This is a natural partnership because both companies have the same ethos: to enable every developer to build distributed systems, at scale, with simple tools. Both companies' products are based on popular open source projects and are built by veteran, high-performing teams.
Redpanda Connect with Ockam is the only enterprise-scale, zero-trust, streaming data platform that's simultaneously easy to deploy, secure-by-design, and highly performant.
“Enterprise executives tell me that their high value streaming applications take a long time to build and to ship,” said Matthew Gregory, Ockam's CEO. “Their biggest obstacle is the construction of secure streaming data connections across their siloed enterprise, and between partner companies. Secure data streaming is now simple to set up, trivial to maintain, and almost impossible to mess up. We've unlocked data so application development can move fast.”
Enterprise Kafka product vendors defer zero-trust implementation details to their customers. It's difficult, even for sophisticated engineering organizations, to pull together a team that can choose and assemble the dozens of parts required to build a secure-by-design streaming data platform. Redpanda Connect with Ockam empowers a single developer to easily create end-to-end encrypted streaming pipelines that can connect hundreds of applications across clouds and hybrid infrastructure. It builds trust in data streams, eliminates sources of risk to critical data, and removes the operational complexities inherent in managing keys at scale.
Redpanda provides a declarative data streaming service that enables developers and organizations to build simple, chained, and stateless processing steps for their data pipelines. Our partnership with Redpanda brings a truly zero-trust capability to these pipelines. Ockam's cryptographic protocols and data integrity guarantees remove the risk of MITM and supply chain attacks that could otherwise tamper with data in flight and lead to data poisoning, which is highlighted as one of the OWASP Top 10 risks for companies that leverage AI and Large Language Model (LLM) systems. The mutual authentication and encrypted data protocols that Ockam provides also mean organizations can transmit highly sensitive data with cryptographic guarantees that only the intended recipients will be able to decrypt and process it. This will unlock new high value applications that businesses may have been unwilling to explore due to privacy and governance risk, or complexity.
“We help the world's largest companies build and scale streaming applications,” said Alex Gallego, Redpanda's CEO. “Redpanda Connect with Ockam was conceived to empower our customers to build streaming data applications with ease. Redpanda Connect comes with over 200 pre-built integrations that will quickly connect to and unlock your data silos. Ockam provides the mutual authentication and end-to-end encryption that zero-trust streaming data applications require. Together with Ockam, we've abstracted away all of the complex pieces that cause engineering teams to stumble.”
Confluent's “2023 Data Streaming Report: Moving Up the Maturity Curve” recently surveyed 2,250 IT and engineering leaders from organizations that use streaming data.
- 74% reported that their streaming data applications are blocked because of fragmented projects, uncoordinated teams, and constrained engineering budgets.
- 32% claim that they successfully “reduced security risks” after implementing data streaming services across their applications. However, 94% have high security expectations for these same applications.
These are alarming numbers! Despite the obvious benefits that a real-time streaming data architecture provides, companies struggle to ship data pipelines to their applications. Often, they're blocked for several quarters. Once they get to production, they realize that their data doesn't have the security guarantees that they need or expect from their streaming data infrastructure.
Redpanda Connect with Ockam aims to move these numbers to 0% and 100%, respectively. Let's go!
Connect, secure, and stream—all in one simple platform
Connect with Redpanda Connect (formerly known as Benthos)
Redpanda Connect empowers engineers to build connectors and integrations to over 200 applications, including SQL, Mongo, Snowflake, Cassandra, Redis, Splunk, AWS Lambda, Azure CosmosDB, and GCP Cloud Storage. Redpanda Connect can process streaming data with transforms, mapping, filtering, hydration, and enrichment capabilities.
Secure with Ockam
Zero-trust in your data streaming infrastructure allows you to have absolute trust in your data and applications. Ockam adds the guarantees of data confidentiality, integrity, and authenticity, so you can enforce data governance and privacy policies at scale.
Each producer creates its own unique cryptographically provable identity and encryption keys. The producers use their keys to establish trusted secure channels through Redpanda Cloud and to the consumers across the streaming platform. All the data that moves through your streaming data platform is encrypted in motion - even when it's passing through a broker. Keys, enrollments, and credentials are safely created, stored, rotated, and revoked automagically so there's almost nothing to manage.
Ockam makes it simple to Build Trust across your entire data layer - in a way that's almost impossible to mess up.
Stream with Redpanda Cloud
Redpanda Cloud is a complete streaming data platform delivered as a fully managed service with automated upgrades and patching, data and partition balancing, and 24×7 support. It continuously monitors and maintains your clusters along with the underlying infrastructure to meet strict performance, availability, reliability, and security requirements.
Give it a try
The example below will show you how to use Ockam to connect any input source (e.g., MongoDB, PostgreSQL, a Kafka stream, etc.) to any output (e.g., Snowflake, S3, Splunk, etc.), automatically encrypting all your data in motion. You can run this complete example locally in less than 2 minutes by copying the relevant sections below. To understand each configuration and command in more detail, and how to extend this for other uses, read on below.
First, copy the code block below and save it to a file named `consumer.yaml`:

    # consumer.yaml

    input:
      ockam_kafka:
        seed_brokers: [rp-node-0:9092]
        topics: [topic_A]
        consumer_group: example_group
        ockam_allow_producer: producer
        ockam_relay: consumer_relay
        ockam_enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}

    pipeline:
      processors:
        - bloblang: |
            root = this
            root.data.message = this.data.message.uppercase()

    output:
      stdout: {}
      # snowflake_put:
      #   account: acme
      #   ...
Next, save the code block below to `producer.yaml`:

    # producer.yaml

    input:
      generate:
        count: 1000
        interval: "@every 1s"
        mapping: |
          root = {
            "_producer": hostname(),
            "data": { "email": fake("email"), "message": fake("sentence") }
          }
      # mongodb:
      #   url: mongodb://localhost
      #   database: orders
      #   ...

    output:
      ockam_kafka:
        seed_brokers: [rp-node-0:9092]
        topic: topic_A
        ockam_route_to_consumer: /project/default/service/forward_to_consumer_relay/secure/api
        ockam_allow_consumer: consumer
        ockam_enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}
All that's left is to run the commands below to start the services, and data will start flowing. These commands use Homebrew to install tools and Docker to run containers, so before you begin, please set up Homebrew and Docker on your machine.
    # Setup Redpanda
    brew install redpanda-data/tap/redpanda
    rpk container start

    # Setup Ockam
    brew install build-trust/ockam/ockam
    ockam enroll

    # Setup Consumer
    ockam project ticket --usage-count 1 --expires-in 10m \
      --attribute consumer --relay consumer_relay > ticket

    docker run --rm --network redpanda --name consumer \
      -v "$(pwd)/consumer.yaml:/connect.yaml" \
      -e OCKAM_ENROLLMENT_TICKET="$(cat ticket)" ghcr.io/build-trust/redpanda-connect

    # The previous command will block and print the output so you can see
    # the messages being consumed by the consumer.
    #
    # Run the following commands in a new terminal window.

    # Setup Producers
    for i in $(seq 1 3); do
      ockam project ticket --usage-count 1 --expires-in 10m \
        --attribute producer > ticket

      docker run --rm -d --network redpanda --name "producer$i" \
        -v "$(pwd)/producer.yaml:/connect.yaml" \
        -e OCKAM_ENROLLMENT_TICKET="$(cat ./ticket)" ghcr.io/build-trust/redpanda-connect
    done

    # It can take a minute or so for the messages to start flowing.
    # Observe that the above consumer can decrypt and transform messages.

    # However, if you look at the messages inside redpanda console at
    # localhost:8080 or using the below rpk command, you'll notice that
    # the messages are encrypted.
    rpk topic consume topic_A

    # Cleanup
    docker stop $(docker ps -q --filter "name=producer" --filter "name=consumer")
    rpk container purge
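Since the commands above depend on Homebrew and Docker, a quick preflight check can save a failed run. The snippet below is a minimal sketch, not part of the original walkthrough; `missing_tools` is a hypothetical helper name, and it only checks that the `brew` and `docker` executables are on your `PATH`, not that the Docker daemon is running.

```shell
# Hypothetical helper: print each named tool that is NOT found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Check the two prerequisites named in this article.
missing="$(missing_tools brew docker)"
if [ -n "$missing" ]; then
  # Unquoted on purpose so several names print on one line.
  echo "Missing tools:" $missing
else
  echo "Homebrew and Docker are on PATH."
fi
```

If anything is reported missing, install it first and then run the setup commands.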
Detailed explanation of each step
Walking through the commands one by one explains what each step achieves, starting with getting Redpanda running locally:
    brew install redpanda-data/tap/redpanda
    rpk container start
These two commands install Redpanda and start a container. Now there is a broker running in a container that we will use to stream and process data with Redpanda Connect.
    brew install build-trust/ockam/ockam
    ockam enroll
The `brew` command is how you install Ockam Command into your local environment. `ockam enroll` will open a browser window and guide you through either signing up or signing in to Ockam Orchestrator. Ockam Orchestrator will allow you to securely manage the enrollment of thousands of processing nodes, or more, without the need to manually push identities, keys, or credentials to each node. This initial `enroll` will establish your local machine as an Administrator within the project that has been set up for you within Ockam Orchestrator.
    ockam project ticket --usage-count 1 --expires-in 10m \
      --attribute consumer --relay consumer_relay > ticket
The `ockam project ticket` command is how you generate a ticket that will admit another node into your project. The arguments here specify:
- The ticket is valid for use `1` time. This ensures that the ticket serves only its intended purpose: once it has been used, a leaked copy gives an attacker no opportunity to reuse it to gain access to your project.
- A time-to-live (TTL)/expiry period of `10` minutes. This also reduces the risk posed by a ticket leaking; you can adjust it to any time window that's appropriate for your use case.
- An attribute of `consumer`. This will add an attribute to any node that enrolls with this ticket, and policy definitions can then use those attributes to apply Attribute Based Access Controls and restrict which nodes can communicate with each other.
- Permission for any node that enrolls with this ticket to set up a relay named `consumer_relay`. This provides a named route for other nodes to find this node, even if the nodes are on different networks.
The output of that command, the enrollment ticket, is then redirected to a file named `ticket`.
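Because a ticket admits a node into your project until it is used or expires, you may prefer to keep the ticket file readable only by your own user. The snippet below is an optional hardening sketch and not part of the original walkthrough: `save_secret` is a hypothetical helper, and a placeholder string stands in for the real output of `ockam project ticket`.

```shell
# Hypothetical helper: read a secret on stdin and store it in a file
# that only the current user can read or write (mode 600).
save_secret() {
  # Remove any existing file first so the restrictive umask applies.
  rm -f -- "$1"
  ( umask 077 && cat > "$1" )
}

# Placeholder input; in the walkthrough the left side of the pipe would be
# the `ockam project ticket ...` command shown above.
printf '%s' 'not-a-real-ticket' | save_secret ./ticket.example
ls -l ./ticket.example   # permissions column should read -rw-------
rm -f ./ticket.example
```

Deleting the ticket file once the container has started is a reasonable follow-up, since the ticket is single-use anyway.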
    docker run --rm --network redpanda --name consumer \
      -v "$(pwd)/consumer.yaml:/connect.yaml" \
      -e OCKAM_ENROLLMENT_TICKET="$(cat ticket)" ghcr.io/build-trust/redpanda-connect
This is where you start your consumer with Redpanda Connect. It uses Docker to start a new container named `consumer`, on an isolated network named `redpanda` where the broker is available, mounting the `consumer.yaml` file and passing in the contents of the `ticket` file as an environment variable.
The `consumer.yaml` file defines how to receive the data:
    input:
      ockam_kafka:
        seed_brokers: [rp-node-0:9092]
        topics: [topic_A]
        consumer_group: example_group
        ockam_allow_producer: producer
        ockam_relay: consumer_relay
        ockam_enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}
This input block shows the standard Kafka configuration options that specify the broker, topics, and consumer group to use. The remaining three lines are the Ockam-specific additions that allow this input source to receive and process end-to-end encrypted data streams:
- `ockam_allow_producer`: the attribute that must exist on the other node before the two nodes are allowed to establish a trusted channel with each other.
- `ockam_relay`: the name of the relay to connect this node to.
- `ockam_enrollment_ticket`: the enrollment ticket to use to admit this node into the Ockam Orchestrator project.
    pipeline:
      processors:
        - bloblang: |
            root = this
            root.data.message = this.data.message.uppercase()
The pipeline and processors block does a basic transformation of received data, and converts the `message` attribute to be all in upper-case. While not an entirely useful transformation in production, this allows you to independently verify that the consumer was able to decrypt and process the data.
    output:
      stdout: {}
      # snowflake_put:
      #   account: acme
      #   ...
The last part of the `consumer.yaml` configuration is the output block. This example implements a simple `stdout` output. As the commented-out Snowflake block highlights, other destinations can receive your final processed data stream by providing their configuration settings.
The code snippet for starting data producers shows how Ockam makes enrolling any number of nodes painless:
    for i in $(seq 1 3); do
      ockam project ticket --usage-count 1 --expires-in 10m \
        --attribute producer > ticket

      docker run --rm -d --network redpanda --name "producer$i" \
        -v "$(pwd)/producer.yaml:/connect.yaml" \
        -e OCKAM_ENROLLMENT_TICKET="$(cat ./ticket)" ghcr.io/build-trust/redpanda-connect
    done
The `for i in $(seq 1 3); do` line creates a `for` loop that will create `3` separate producers. Each producer will generate a unique identity, enroll into the Ockam Orchestrator project, and establish a secure channel with the consumer node. If you feel ambitious you can increase that `3` to a higher number. Just be aware that the script will run a new Docker container for each producer, so you could quickly exhaust your local system resources.
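If you do want to experiment with the number of producers, the loop above can be sketched with the count pulled out into a variable. This is an illustrative variant, not the article's script: `NUM_PRODUCERS`, `DRY_RUN`, and `start_producer` are hypothetical names, and the default is a dry run that only prints which containers would start, so you can check the count before committing local resources.

```shell
# Hypothetical knobs, both overridable from the environment.
NUM_PRODUCERS="${NUM_PRODUCERS:-3}"
DRY_RUN="${DRY_RUN:-1}"  # 1 = only print what would start; 0 = really start

start_producer() {
  name="producer$1"
  if [ "$DRY_RUN" = "1" ]; then
    echo "would start $name"
    return 0
  fi
  # Same two commands as in the walkthrough: one fresh ticket per producer.
  ockam project ticket --usage-count 1 --expires-in 10m \
    --attribute producer > ticket
  docker run --rm -d --network redpanda --name "$name" \
    -v "$(pwd)/producer.yaml:/connect.yaml" \
    -e OCKAM_ENROLLMENT_TICKET="$(cat ./ticket)" ghcr.io/build-trust/redpanda-connect
}

i=1
while [ "$i" -le "$NUM_PRODUCERS" ]; do
  start_producer "$i"
  i=$((i + 1))
done
```

Running it with `DRY_RUN=0 NUM_PRODUCERS=5` would start five producers, assuming Redpanda and the consumer are already running as described earlier.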
The `ockam project ticket` command generates an enrollment ticket, just as it did in the earlier section that explained the setup of the consumer. The primary difference with this ticket is that the attribute is set to `producer` and there is no need to set up a relay for a producer.
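To make the difference between the two tickets concrete, here is a small sketch that builds both command lines from a role and an optional relay name. `ticket_cmd` is a hypothetical helper that only prints the command; it uses just the flags that appear in this article and does not call Ockam itself.

```shell
# Hypothetical helper: print the `ockam project ticket` command line for a
# role, adding --relay only when a relay name is supplied.
ticket_cmd() {
  role="$1"
  relay="${2:-}"
  cmd="ockam project ticket --usage-count 1 --expires-in 10m --attribute $role"
  if [ -n "$relay" ]; then
    cmd="$cmd --relay $relay"
  fi
  printf '%s\n' "$cmd"
}

ticket_cmd consumer consumer_relay   # consumer: attribute plus relay
ticket_cmd producer                  # producer: attribute only
```

The consumer needs the relay so producers have a named route to reach it; producers initiate the connection, so they need only the attribute.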
The `docker run` command runs the same image as before, on the same network, giving each container a unique name with a prefix of `producer`, and passing in the `producer.yaml` and `ticket` for each node.
The `producer.yaml` configuration file has two blocks of importance:
    input:
      generate:
        count: 1000
        interval: "@every 1s"
        mapping: |
          root = {
            "_producer": hostname(),
            "data": { "email": fake("email"), "message": fake("sentence") }
          }
      # mongodb:
      #   url: mongodb://localhost
      #   database: orders
      #   ...
The input block here generates random data, which is useful for a self-contained example but is something you would replace in a real-world use case. Every second it sends new data to the output: the hostname (allowing you to verify that there are 3 separate nodes producing data) and fake data for the `email` and `message` attributes in the payload. As explained earlier, the consumer later converts the `message` field to upper-case before writing it to `stdout`.
For a real-world use case, the commented out block gives an example of how to use a Mongo database as the input source for a producer.
    output:
      ockam_kafka:
        seed_brokers: [rp-node-0:9092]
        topic: topic_A
        ockam_route_to_consumer: /project/default/service/forward_to_consumer_relay/secure/api
        ockam_allow_consumer: consumer
        ockam_enrollment_ticket: ${OCKAM_ENROLLMENT_TICKET}
The output block in `producer.yaml` contains the standard Kafka configuration, along with three Ockam-specific options:
- `ockam_route_to_consumer`: the full route to the relay that was set up for the consumer.
- `ockam_allow_consumer`: the attribute that must exist on the other node before the two nodes are allowed to establish a trusted channel with each other.
- `ockam_enrollment_ticket`: the enrollment ticket to use to admit this node into the Ockam Orchestrator project.
Now everything is set up and running. You'll see in the log output that the consumer is receiving data from different hosts, and that the `message` in each payload is upper-case. If you open the Redpanda Console or run `rpk topic consume topic_A`, you'll see that the encrypted messages in `topic_A` are not readable by the broker or anybody else that might have access to the topic.
By adding or changing a few lines in the `consumer.yaml` and `producer.yaml` files it's possible to:
- Use an existing self-hosted Redpanda broker or Redpanda Cloud instance.
- Add more inputs into `producer.yaml` and/or outputs into `consumer.yaml` to connect, secure, and stream data between any of the hundreds of different integrations supported by Redpanda Connect.