
Ever Wanted to Run a Data Engine on Kubernetes? SeaTunnel Just Made It Weirdly Simple

by SeaTunnel, April 10th, 2025

Too Long; Didn't Read

A how-to guide for deploying SeaTunnel in a clustered mode on Kubernetes, enabling distributed data processing with the Zeta engine as the compute backend.


SeaTunnel supports running the Zeta engine in cluster mode, which lets you deploy the engine natively on Kubernetes for more efficient application deployment and management. In this guide, we will walk through running the Zeta engine in cluster mode with SeaTunnel on Kubernetes and see how to take better advantage of the engine.


1. Upload SeaTunnel to the server.

I have previously decompressed the package and executed the install-plugin.sh script, so for convenience this guide uses a SeaTunnel distribution with the plugins already installed.


The commands below unpack the distribution, run install-plugin.sh, and re-package the result. After install-plugin.sh finishes, the lib directory contains the downloaded plugin jars (see the quick check after the commands):


# unpack the distribution
tar -zxvf apache-seatunnel-2.3.3-bin.tar.gz
# download the connector plugins
sh apache-seatunnel-2.3.3/bin/install-plugin.sh
# re-package the distribution, plugins included, for the Docker build
tar -czvf apache-seatunnel-2.3.3-bin.tar.gz apache-seatunnel-2.3.3
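
To confirm the jars are in place before re-packaging, list the directory (path taken from the commands above):

ls apache-seatunnel-2.3.3/lib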



2. Build the SeaTunnel image.

Create a Dockerfile in the same directory as the SeaTunnel tarball. Configure it as follows; you can choose the base image and SeaTunnel version yourself:


FROM openjdk:8
ENV SEATUNNEL_HOME="/opt/seatunnel"
ENV SEATUNNEL_VERSION="2.3.3"
COPY /apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz /opt/apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
WORKDIR /opt
RUN tar -xzvf apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
RUN mv apache-seatunnel-${SEATUNNEL_VERSION} seatunnel
RUN rm -f /opt/apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz
WORKDIR /opt/seatunnel


Execute the command:


docker build -t seatunnel:2.3.3 -f Dockerfile .


3. View the image:


docker images


The seatunnel:2.3.3 image should appear in the output.


4. Load the image into Kubernetes. Here I'm using Minikube for demonstration:
minikube image load seatunnel:2.3.3

For more details on loading images, refer to the earlier guide: Mastering SeaTunnel Running Zeta Engine in local mode on Kubernetes: A Step-by-Step Guide.
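
If your cluster runs on kind rather than Minikube, the equivalent load command (assuming the default kind cluster name) is:

kind load docker-image seatunnel:2.3.3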


5. Create config maps as follows:


kubectl create configmap hazelcast-client --from-file=config/hazelcast-client.yaml
kubectl create configmap hazelcast --from-file=config/hazelcast.yaml
kubectl create configmap seatunnelmap --from-file=config/seatunnel.yaml
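
To confirm that the three ConfigMaps were created, list them:

kubectl get configmaps
# hazelcast, hazelcast-client, and seatunnelmap should appear in the output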


6. Use Reloader to automatically restart the pods after the ConfigMaps are updated:


wget https://raw.githubusercontent.com/stakater/Reloader/master/deployments/kubernetes/reloader.yaml
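
Downloading the manifest only fetches the file; assuming you want Reloader running in the cluster so the annotation used in the next step takes effect, apply it as well:

kubectl apply -f reloader.yaml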

7. Create seatunnel-cluster.yml as follows:


apiVersion: v1
kind: Service
metadata:
  name: seatunnel
spec:
  selector:
    app: seatunnel
  ports:
  - port: 5801
    name: seatunnel
  clusterIP: None
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: seatunnel
  annotations:
    configmap.reloader.stakater.com/reload: "hazelcast,hazelcast-client,seatunnelmap"
spec:
  serviceName: "seatunnel"
  replicas: 3
  selector:
    matchLabels:
      app: seatunnel
  template:
    metadata:
      labels:
        app: seatunnel
    spec:
      containers:
        - name: seatunnel
          image: seatunnel:2.3.3
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 5801
              name: client
          command: ["/bin/sh","-c","/opt/seatunnel/bin/seatunnel-cluster.sh -DJvmOption=-Xms2G -Xmx2G"]
          resources:
            limits:
              cpu: "1"
              memory: 4G
            requests:
              cpu: "1"
              memory: 2G
          volumeMounts:
            - mountPath: "/opt/seatunnel/config/hazelcast.yaml"
              name: hazelcast
              subPath: hazelcast.yaml
            - mountPath: "/opt/seatunnel/config/hazelcast-client.yaml"
              name: hazelcast-client
              subPath: hazelcast-client.yaml
            - mountPath: "/opt/seatunnel/config/seatunnel.yaml"
              name: seatunnelmap
              subPath: seatunnel.yaml
      volumes:
        - name: hazelcast
          configMap:
            name: hazelcast
        - name: hazelcast-client
          configMap:
            name: hazelcast-client
        - name: seatunnelmap
          configMap:
            name: seatunnelmap


8. Execute:

kubectl apply -f seatunnel-cluster.yml
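
To wait until all three replicas are up before moving on, you can watch the rollout (names taken from the manifest above):

kubectl rollout status statefulset/seatunnel
kubectl get pods -l app=seatunnel -o wide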


9. Modify the configuration in config maps:

kubectl edit cm hazelcast

Modify the cluster member addresses. Headless-service access is used here, so the general format for pod-to-pod communication is <pod-name>.<service-name>.<namespace>.svc.cluster.local (a sketch of the edited file follows the reminder below).


Example:

  • seatunnel-0.seatunnel.default.svc.cluster.local
  • seatunnel-1.seatunnel.default.svc.cluster.local
  • seatunnel-2.seatunnel.default.svc.cluster.local

Reminder: use spaces instead of tabs when editing the YAML. Otherwise, it will cause errors.
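
As a reference, here is a minimal sketch of the join-related part of hazelcast.yaml after the edit, assuming the default namespace and the 5801 port used throughout this guide (keep the other settings from the shipped file as they are):

hazelcast:
  cluster-name: seatunnel
  network:
    join:
      tcp-ip:
        enabled: true
        member-list:
          - seatunnel-0.seatunnel.default.svc.cluster.local
          - seatunnel-1.seatunnel.default.svc.cluster.local
          - seatunnel-2.seatunnel.default.svc.cluster.local
    port:
      auto-increment: false
      port: 5801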


kubectl edit cm hazelcast-client
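
The client configuration should point at the same members. A minimal sketch, again assuming the default namespace:

hazelcast-client:
  cluster-name: seatunnel
  network:
    cluster-members:
      - seatunnel-0.seatunnel.default.svc.cluster.local:5801
      - seatunnel-1.seatunnel.default.svc.cluster.local:5801
      - seatunnel-2.seatunnel.default.svc.cluster.local:5801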


kubectl edit cm seatunnelmap


Modify it to point at your own HDFS address, which the Zeta engine uses for checkpoint storage.
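
As a sketch only, assuming the 2.3.3 layout of seatunnel.yaml and using hdfs://namenode:9000 as a placeholder address, the checkpoint storage section would look roughly like this:

seatunnel:
  engine:
    checkpoint:
      interval: 10000
      timeout: 60000
      storage:
        type: hdfs
        max-retained: 3
        plugin-config:
          namespace: /tmp/seatunnel/checkpoint_snapshot
          storage.type: hdfs
          # placeholder; replace with your own HDFS address
          fs.defaultFS: hdfs://namenode:9000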

10. After the ConfigMaps are edited, Reloader restarts the pods, and you can see the result with kubectl get pods.


11. After all nodes have finished updating and are running, you can enter a container to check that the configuration files have been updated:

kubectl exec -it seatunnel-0 -- /bin/bash
cat config/hazelcast.yaml


12. Check the logs inside the container:


tail -200f logs/seatunnel-engine-server.log


We can see that the cluster is running normally.


13. Run tasks:

You can open a new connection, log in to another pod, and submit a job to test the cluster:


kubectl exec -it seatunnel-1 -- /bin/bash
bin/seatunnel.sh --config config/v2.streaming.conf.template

We can see that tasks have also started running in other pods.
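
One way to confirm this from outside the pod is to tail the engine server log on another member, just as in step 12:

kubectl exec -it seatunnel-0 -- tail -100 logs/seatunnel-engine-server.log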


That’s all for running the Zeta engine in cluster mode with SeaTunnel on Kubernetes!