Building a Kubernetes Data Platform / Raspberry Pi 5 and Jetson Nano Cluster Environment
1. Installation Environment
![[Figure 1] Cluster Spec](/blog-software/docs/record/kubernetes-data-platform-raspberrypi5-jetnano-cluster/images/cluster-spec.png)
[Figure 1] Cluster Spec
![[Figure 2] Cluster Components](/blog-software/docs/record/kubernetes-data-platform-raspberrypi5-jetnano-cluster/images/cluster-component.png)
[Figure 2] Cluster Components
2. OS Installation
2.1. Raspberry Pi 5
![[Figure 3] Raspberry Pi Imager](/blog-software/docs/record/kubernetes-data-platform-raspberrypi5-jetnano-cluster/images/raspberry-pi-imager-ubuntu-server.png)
[Figure 3] Raspberry Pi Imager
Install Ubuntu Server 24.04 using Raspberry Pi Imager.
![[Figure 4] Raspberry Pi Imager General](/blog-software/docs/record/kubernetes-data-platform-raspberrypi5-jetnano-cluster/images/raspberry-pi-imager-general.png)
[Figure 4] Raspberry Pi Imager General
![[Figure 5] Raspberry Pi Imager Services](/blog-software/docs/record/kubernetes-data-platform-raspberrypi5-jetnano-cluster/images/raspberry-pi-imager-services.png)
[Figure 5] Raspberry Pi Imager Services
Set the hostname, username, password, timezone, and SSH server, then write the OS image to the uSD card.
- Hostname: see the hostnames in [Figure 1]
- Username/Password: temp/temp
2.2. Jetson Nano
Install the OS on the uSD card by following the Install Guide.
3. Network Configuration
3.1. Raspberry Pi 5
Switch to the root user.
sudo -s
Referring to the Network information in [Figure 1], set a static IP in the /etc/netplan/50-cloud-init.yaml file as follows.
network:
  ethernets:
    eth0:
      addresses:
        - [IP Address]/24
      nameservers:
        addresses:
          - 8.8.8.8
      routes:
        - to: default
          via: 192.168.1.1
  version: 2
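Because the same file is written on every Raspberry Pi 5 node with only the address changing, the config can be generated from a small function. The following is a minimal sketch: `render_netplan` is a hypothetical helper name, and the gateway and DNS defaults are simply the values used above.

```shell
# Hypothetical helper: prints a netplan config for one node.
# Arguments: static IP (without prefix length), optional gateway, optional DNS server.
render_netplan() {
  ip=$1
  gateway=${2:-192.168.1.1}
  dns=${3:-8.8.8.8}
  cat <<EOF
network:
  ethernets:
    eth0:
      addresses:
        - ${ip}/24
      nameservers:
        addresses:
          - ${dns}
      routes:
        - to: default
          via: ${gateway}
  version: 2
EOF
}

# Example: print the config for 192.168.1.71 (the master address used later in this post).
render_netplan 192.168.1.71
```

On a node, the output could be redirected to /etc/netplan/50-cloud-init.yaml and activated with `netplan apply`.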
3.2. Jetson Nano
Switch to the root user.
sudo -s
Referring to the Network information in [Figure 1], set a static IP.
nmcli con mod "Wired connection 1" \
ipv4.addresses "[IP Address]/24" \
ipv4.gateway "192.168.1.1" \
ipv4.dns "8.8.8.8" \
ipv4.method "manual"
4. containerd, kubelet Installation
4.1. Raspberry Pi 5
Load the kernel modules.
cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
Set the sysctl parameters.
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
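Before moving on, it can be worth confirming that the three parameters are actually active. The sketch below reads them from the /proc/sys tree; `check_sysctl` and `check_k8s_sysctls` are hypothetical helper names, and the optional root argument exists only so the functions can be tried against a scratch directory.

```shell
# check_sysctl KEY EXPECTED [ROOT]
# Reads a kernel parameter from the /proc/sys tree and compares it with EXPECTED.
# A MISSING result for the net.bridge.* keys usually means br_netfilter is not loaded.
check_sysctl() {
  key=$1
  expected=$2
  root=${3:-/proc/sys}
  path="$root/$(printf '%s' "$key" | tr '.' '/')"
  if [ ! -r "$path" ]; then
    echo "MISSING $key"
    return 1
  fi
  actual=$(cat "$path")
  if [ "$actual" = "$expected" ]; then
    echo "OK $key = $actual"
  else
    echo "MISMATCH $key = $actual (expected $expected)"
    return 1
  fi
}

# Checks the three parameters written to /etc/sysctl.d/k8s.conf.
check_k8s_sysctls() {
  root=${1:-/proc/sys}
  rc=0
  for key in net.bridge.bridge-nf-call-iptables \
             net.bridge.bridge-nf-call-ip6tables \
             net.ipv4.ip_forward; do
    check_sysctl "$key" 1 "$root" || rc=1
  done
  return $rc
}
```

Running `check_k8s_sysctls` on a node prints one line per parameter and returns a non-zero status if any of them is missing or not set to 1.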
Install containerd.
apt update
apt install -y containerd
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd.service
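kubeadm configures the kubelet with the systemd cgroup driver by default, so the `sed` step above matters: containerd needs to use the same driver. A minimal sketch for checking that the substitution took effect follows; `systemd_cgroup_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if FILE enables the systemd cgroup driver
# and no "SystemdCgroup = false" line is left behind.
systemd_cgroup_enabled() {
  file=${1:-/etc/containerd/config.toml}
  grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$file" &&
    ! grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*false' "$file"
}
```

On a node, `systemd_cgroup_enabled && echo "containerd uses the systemd cgroup driver"` checks the default config path.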
Install kubelet and kubeadm.
apt-get update
apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo chmod 644 /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet=1.30.8-1.1 kubeadm=1.30.8-1.1
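The package version is repeated for each package and again later for kubectl, so keeping it in one variable can reduce the chance of a mismatch. This is only a sketch: `pinned_packages` and `K8S_PKG_VERSION` are hypothetical names introduced here.

```shell
# Package version used throughout this post; override via the environment if needed.
K8S_PKG_VERSION=${K8S_PKG_VERSION:-1.30.8-1.1}

# Hypothetical helper: prints "name=version" for every package name given.
pinned_packages() {
  for pkg in "$@"; do
    printf '%s=%s\n' "$pkg" "$K8S_PKG_VERSION"
  done
}

pinned_packages kubelet kubeadm

# On a node (as root), the list could be consumed like this:
#   apt-get install -y $(pinned_packages kubelet kubeadm)
#   apt-mark hold kubelet kubeadm   # keeps "apt upgrade" from moving the cluster version
```

Holding the packages with `apt-mark hold` is optional, but it prevents a routine package upgrade from changing the kubelet version unintentionally.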
4.2. Jetson Nano
Disable swap memory.
swapoff -a
mv /etc/systemd/nvzramconfig.sh /etc/systemd/nvzramconfig.sh.back
Load the kernel modules.
cat <<EOF | tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter
Set the sysctl parameters.
cat <<EOF | tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
Install containerd together with the NVIDIA Container Toolkit.
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt-get update
apt-get install -y nvidia-container-toolkit containerd
mkdir -p /etc/containerd
containerd config default | tee /etc/containerd/config.toml
nvidia-ctk runtime configure --runtime=containerd --set-as-default
systemctl restart containerd
Install kubelet and kubeadm.
apt-get update
mkdir -p /etc/apt/keyrings
apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
chmod 644 /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo chmod 644 /etc/apt/sources.list.d/kubernetes.list
apt-get update
apt-get install -y kubelet=1.30.8-1.1 kubeadm=1.30.8-1.1
5. Kubernetes Cluster Setup
5.1. Master Node (Raspberry Pi 5)
Install kubectl.
apt-get install -y kubectl=1.30.8-1.1
Initialize the Kubernetes cluster.
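cat <<EOF | tee kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
# NOTE: the two validity-period fields below are defined in kubeadm.k8s.io/v1beta4 (Kubernetes v1.31+).
# With v1beta3 on v1.30, kubeadm likely ignores them with a warning.
certificateValidityPeriod: 876000h
caCertificateValidityPeriod: 876000h
kubernetesVersion: "v1.30.8"
networking:
  podSubnet: "10.244.0.0/16"
EOF
kubeadm init --config kubeadm-config.yaml
# The join command below is printed at the end of the kubeadm init output.
# It is not run on the master node; it is used in 5.2 to join the other nodes.
# kubeadm join 192.168.1.71:6443 --token e5t05s.1z4zbpm3oxdhskya --discovery-token-ca-cert-hash sha256:01c2bf6ead65ea0e9c39186d92a51baa9aa6dc6963b900cd825d7e14dcb08fba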
Copy the kubectl config file.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Install the flannel CNI plugin.
kubectl apply -f https://github.com/flannel-io/flannel/releases/download/v0.26.2/kube-flannel.yml
5.2. Compute, Storage, GPU Nodes
SSH into each node and join it to the Kubernetes cluster.
kubeadm join 192.168.1.71:6443 --token e5t05s.1z4zbpm3oxdhskya --discovery-token-ca-cert-hash sha256:01c2bf6ead65ea0e9c39186d92a51baa9aa6dc6963b900cd825d7e14dcb08fba
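Bootstrap tokens expire after 24 hours by default, so a join command copied from an older `kubeadm init` run may be rejected. The sketch below extracts the token from a join command and checks that it has the expected `[a-z0-9]{6}.[a-z0-9]{16}` shape; `join_token` and `valid_bootstrap_token` are hypothetical helper names, and the check covers format only, not expiry.

```shell
# Hypothetical helper: checks the bootstrap token format "[a-z0-9]{6}.[a-z0-9]{16}".
valid_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Hypothetical helper: pulls the value that follows "--token " out of a join command line.
join_token() {
  printf '%s\n' "$1" | sed -n 's/.*--token \([^ ]*\).*/\1/p'
}

cmd='kubeadm join 192.168.1.71:6443 --token e5t05s.1z4zbpm3oxdhskya --discovery-token-ca-cert-hash sha256:01c2bf6ead65ea0e9c39186d92a51baa9aa6dc6963b900cd825d7e14dcb08fba'
token=$(join_token "$cmd")
if valid_bootstrap_token "$token"; then
  echo "token format ok: $token"
fi

# If the token has expired, a fresh join command can be printed on the master node with:
#   kubeadm token create --print-join-command
```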
Remove the control-plane label and taint from the master node.
kubectl taint node dp-master node-role.kubernetes.io/control-plane:NoSchedule-
kubectl label node dp-master node-role.kubernetes.io/control-plane-
Assign a role to each node.
kubectl label node dp-master node-role.kubernetes.io/master=""
kubectl label node dp-compute-01 node-role.kubernetes.io/compute=""
kubectl label node dp-compute-02 node-role.kubernetes.io/compute=""
kubectl label node dp-storage-01 node-role.kubernetes.io/storage=""
kubectl label node dp-storage-02 node-role.kubernetes.io/storage=""
kubectl label node dp-gpu-01 node-role.kubernetes.io/gpu=""
kubectl label node dp-gpu-02 node-role.kubernetes.io/gpu=""
kubectl label node dp-master node-group.dp.ssup2="master"
kubectl label node dp-compute-01 node-group.dp.ssup2="compute"
kubectl label node dp-compute-02 node-group.dp.ssup2="compute"
kubectl label node dp-storage-01 node-group.dp.ssup2="storage"
kubectl label node dp-storage-02 node-group.dp.ssup2="storage"
kubectl label node dp-gpu-01 node-group.dp.ssup2="gpu"
kubectl label node dp-gpu-02 node-group.dp.ssup2="gpu"
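The fourteen label commands above all follow from one node-to-group mapping, so they can also be generated from a table. This is a minimal sketch; `NODE_GROUPS` and `label_commands` are hypothetical names, and the function only prints the commands rather than running them.

```shell
# Node-to-group mapping taken from the commands above (one "node group" pair per line).
NODE_GROUPS='dp-master master
dp-compute-01 compute
dp-compute-02 compute
dp-storage-01 storage
dp-storage-02 storage
dp-gpu-01 gpu
dp-gpu-02 gpu'

# Hypothetical helper: prints the role label and node-group label command for each node.
label_commands() {
  printf '%s\n' "$NODE_GROUPS" | while read -r node group; do
    [ -n "$node" ] || continue
    printf 'kubectl label node %s node-role.kubernetes.io/%s=""\n' "$node" "$group"
    printf 'kubectl label node %s node-group.dp.ssup2="%s"\n' "$node" "$group"
  done
}

label_commands
```

After reviewing the printed commands, they could be executed with `label_commands | sh` on a machine where kubectl is configured.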
Check the node roles.
kubectl get nodes
NAME            STATUS   ROLES                  AGE   VERSION
dp-compute-01   Ready    compute                16d   v1.30.8
dp-compute-02   Ready    compute                16d   v1.30.8
dp-gpu-01       Ready    gpu                    16d   v1.30.8
dp-gpu-02       Ready    gpu                    16d   v1.30.8
dp-master       Ready    control-plane,master   16d   v1.30.8
dp-storage-01   Ready    storage                16d   v1.30.8
dp-storage-02   Ready    storage                16d   v1.30.8
Configure CoreDNS to run only on the master node.
kubectl patch deployment coredns -n kube-system -p '{"spec":{"template":{"spec":{"nodeSelector":{"node-group.dp.ssup2":"master"}}}}}'
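The patch above is a nested JSON document that sets a single nodeSelector entry, which is easy to mistype by hand. The sketch below builds the same document from a label key and value; `node_selector_patch` is a hypothetical helper name, and the commented kubectl lines show one possible way to apply and verify it.

```shell
# Hypothetical helper: prints a patch document that sets one nodeSelector entry
# on a Deployment's pod template.
node_selector_patch() {
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"%s":"%s"}}}}}\n' "$1" "$2"
}

node_selector_patch node-group.dp.ssup2 master

# Possible use against the cluster, followed by a placement check:
#   kubectl -n kube-system patch deployment coredns -p "$(node_selector_patch node-group.dp.ssup2 master)"
#   kubectl -n kube-system rollout status deployment coredns
#   kubectl -n kube-system get pods -l k8s-app=kube-dns -o wide
```

The last command lists the CoreDNS Pods with their node names, which should all show the master node once the rollout completes.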
6. Data Component Installation
# MetalLB
helm upgrade --install --create-namespace --namespace metallb metallb metallb -f metallb/values.yaml
kubectl apply -f metallb/ip-address-pool.yaml
kubectl apply -f metallb/l2-advertisement.yaml
# Cert Manager
helm upgrade --install --create-namespace --namespace cert-manager cert-manager cert-manager -f cert-manager/values.yaml
# PostgreSQL (ID/PW: postgres/root123!)
helm upgrade --install --create-namespace --namespace postgresql postgresql postgresql -f postgresql/values.yaml
kubectl -n postgresql exec -it postgresql-0 -- bash -c 'PGPASSWORD=root123! psql -U postgres -c "create database dagster;"'
kubectl -n postgresql exec -it postgresql-0 -- bash -c 'PGPASSWORD=root123! psql -U postgres -c "create database metastore;"'
kubectl -n postgresql exec -it postgresql-0 -- bash -c 'PGPASSWORD=root123! psql -U postgres -c "create database ranger;"'
kubectl -n postgresql exec -it postgresql-0 -- bash -c 'PGPASSWORD=root123! psql -U postgres -c "create database mlflow;"'
kubectl -n postgresql exec -it postgresql-0 -- bash -c 'PGPASSWORD=root123! psql -U postgres -c "create database mlflow_auth;"'
# Redis (ID/PW: default/default)
helm upgrade --install --create-namespace --namespace redis redis redis -f redis/values.yaml
# ArgoCD (ID/PW: default/default)
helm upgrade --install --create-namespace --namespace argo-cd argo-cd argo-cd -f argo-cd/values.yaml
# YuniKorn
helm upgrade --install --create-namespace --namespace yunikorn yunikorn yunikorn -f yunikorn/values.yaml
# KEDA
helm upgrade --install --create-namespace --namespace keda keda keda -f keda/values.yaml
# Longhorn
helm upgrade --install --create-namespace --namespace longhorn longhorn longhorn -f longhorn/values.yaml
# MinIO (ID/PW: root/root123!)
helm upgrade --install --create-namespace --namespace minio minio minio -f minio/values.yaml
brew install minio/stable/mc
mc alias set dp http://$(kubectl -n minio get service minio -o jsonpath="{.status.loadBalancer.ingress[0].ip}"):9000 root root123!
mc mb dp/spark/logs
mc mb dp/dagster/pipelines
# ZincSearch (ID/PW: admin/Rootroot123!)
helm upgrade --install --create-namespace --namespace zincsearch zincsearch zincsearch -f zincsearch/values.yaml
INDEX_MAPPING='{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 1
    }
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "name": {
        "type": "text"
      },
      "description": {
        "type": "text"
      },
      "created_at": {
        "type": "date",
        "format": "strict_date_optional_time||epoch_millis"
      }
    }
  }
}'
curl -s -X PUT "http://$(kubectl -n zincsearch get service zincsearch -o jsonpath='{.status.loadBalancer.ingress[0].ip}'):4080/ranger" -u 'admin:Rootroot123!' -H "Content-Type: application/json" -d "$INDEX_MAPPING"
# Nvidia Device Plugin
helm upgrade --install --create-namespace --namespace nvidia-device-plugin nvidia-device-plugin nvidia-device-plugin -f nvidia-device-plugin/values.yaml
# Nvidia Jetson Exporter
# Prometheus
helm upgrade --install --create-namespace --namespace prometheus prometheus prometheus -f prometheus/values.yaml
# Prometheus Node Exporter
helm upgrade --install --create-namespace --namespace prometheus-node-exporter prometheus-node-exporter prometheus-node-exporter -f prometheus-node-exporter/values.yaml
# kube-state-metrics
helm upgrade --install --create-namespace --namespace kube-state-metrics kube-state-metrics kube-state-metrics -f kube-state-metrics/values.yaml
# Loki
helm upgrade --install --create-namespace --namespace loki loki loki -f loki/values.yaml
# Promtail
helm upgrade --install --create-namespace --namespace promtail promtail promtail -f promtail/values.yaml
# Grafana (ID/PW: admin/root123!)
helm upgrade --install --create-namespace --namespace grafana grafana grafana -f grafana/values.yaml
# Ranger
helm upgrade --install --create-namespace --namespace ranger ranger ranger -f ranger/values.yaml
# Hive Metastore
helm upgrade --install --create-namespace --namespace hive-metastore hive-metastore hive-metastore -f hive-metastore/values.yaml
# Kafka (ID/PW: user/user)
helm upgrade --install --create-namespace --namespace kafka kafka kafka -f kafka/values.yaml
helm upgrade --install --create-namespace --namespace kafka kafka-ui kafka-ui -f kafka-ui/values.yaml
# Airflow (ID/PW: admin/admin)
helm upgrade --install --create-namespace --namespace airflow airflow airflow -f airflow/values.yaml
# Dagster
helm upgrade --install --create-namespace --namespace dagster dagster dagster -f dagster/values.yaml
# Spark Operator
helm upgrade --install --create-namespace --namespace spark-operator spark-operator spark-operator -f spark-operator/values.yaml
# Spark History Server
helm upgrade --install --create-namespace --namespace spark-history-server spark-history-server spark-history-server -f spark-history-server/values.yaml
# Flink Kubernetes Operator
helm upgrade --install --create-namespace --namespace flink-kubernetes-operator flink-kubernetes-operator flink-kubernetes-operator -f flink-kubernetes-operator/values.yaml
# Trino (ID: root)
helm upgrade --install --create-namespace --namespace trino trino trino -f trino/values.yaml
# JupyterHub (ID/PW: root/root123!)
helm upgrade --install --create-namespace --namespace jupyterhub jupyterhub jupyterhub -f jupyterhub/values.yaml
# Apache Ranger
# MLflow (ID/PW: root/root123!)
helm upgrade --install --create-namespace --namespace mlflow mlflow mlflow -f mlflow/values.yaml
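Almost every install command above follows one pattern: the release name, the chart directory, and the namespace are the same word, and the values file lives at `<name>/values.yaml`. A minimal sketch that generates the command from that pattern follows; `helm_install_cmd` is a hypothetical helper name, and it only prints the command instead of running it.

```shell
# Hypothetical helper: prints the install command for a local chart directory.
# RELEASE doubles as the chart directory name; NAMESPACE defaults to RELEASE.
helm_install_cmd() {
  release=$1
  namespace=${2:-$1}
  printf 'helm upgrade --install --create-namespace --namespace %s %s %s -f %s/values.yaml\n' \
    "$namespace" "$release" "$release" "$release"
}

helm_install_cmd metallb
helm_install_cmd kafka-ui kafka   # kafka-ui is the one release installed into another namespace

# After installation, the release states can be reviewed with:
#   helm list -A
```

The generated lines match the commands listed above, so they can be compared against the list or piped to `sh` one component at a time.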
References
- Nvidia Containerd : https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html
- Airflow on Kubernetes : https://zerohertz.github.io/k8s-airflow/