Building a small data lake with K8s
Airflow + K8s + S3[MinIO] + Pandas/PySpark/Polars + Delta Lake + RDBMS + Velero
WIKI: https://github.com/veinkr/K8sDataLake/wiki
- Airflow Cluster
- Install CMD
- CI/CD script
- MinIO with UI
- Delta Lake
- Spark on K8s
- Jupyter
- API for downstream
- Dashboard
- K8s Backup/Migration
- Flink
- Node Selector Strategy (affinities/tolerations)
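
As an illustration of the node selector item, here is a minimal sketch of how a pod can be pinned to a labelled node pool and allowed onto tainted nodes. It only builds a standard Kubernetes Pod manifest as a Python dict. The helper `build_pod_spec`, the pod name, the image, the `workload=spark` label, and the `dedicated=spark` taint are hypothetical and are not taken from this repository.

```python
import json


def build_pod_spec(name, image, node_label, taint_key, taint_value):
    """Return a Pod manifest with a required node affinity and one toleration.

    All argument values are illustrative assumptions, not this project's settings.
    """
    label_key, label_value = node_label
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            # Affinity: only schedule onto nodes carrying the given label.
            "affinity": {
                "nodeAffinity": {
                    "requiredDuringSchedulingIgnoredDuringExecution": {
                        "nodeSelectorTerms": [
                            {
                                "matchExpressions": [
                                    {
                                        "key": label_key,
                                        "operator": "In",
                                        "values": [label_value],
                                    }
                                ]
                            }
                        ]
                    }
                }
            },
            # Toleration: allow scheduling onto nodes tainted
            # <taint_key>=<taint_value>:NoSchedule.
            "tolerations": [
                {
                    "key": taint_key,
                    "operator": "Equal",
                    "value": taint_value,
                    "effect": "NoSchedule",
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = build_pod_spec(
        name="spark-job",                # hypothetical pod name
        image="example/spark:latest",    # hypothetical image
        node_label=("workload", "spark"),
        taint_key="dedicated",
        taint_value="spark",
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, since Kubernetes accepts JSON manifests as well as YAML. The affinity restricts where the pod may run, while the toleration only permits it on tainted nodes, so the two are usually combined when a node pool is dedicated to one workload.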