This page lists Helm charts and/or Docker images that are potential candidates for inclusion in a cloud-native PNDA.
This set of Helm charts is authored by the developers of the Kubernetes contributions to Apache Spark.
- Note from the repo: "Note that the HDFS charts are currently in pre-alpha quality. They are also being heavily revised and are subject to change."
- The HDFS DataNode does not use Kubernetes Persistent Volumes.
Gradiant developed Alpine-based containers for HDFS:
- Not tested in Kubernetes.
- Helm charts on the roadmap.
- Deployment of Spark version 1.5.1.
Gradiant developed Alpine-based containers for Spark 2.x Standalone Deployment.
- The Spark UI is not well integrated with Kubernetes.
- Private Gradiant Helm charts are available (making them public is under internal discussion).
Zero to JupyterHub is an official set of Jupyter images and Helm charts. The main drawback is that the pyspark-notebook image is larger than 5 GB.
The Jupyter notebook Docker image published by Jupyter is 2.1 GB in 61 layers (6 GB uncompressed):
Gradiant developed Alpine-based containers for Jupyter with data science libraries included. The image is 955 MB in 21 layers (2 GB uncompressed):
The Dockerfile for further customization is public at:
- No user management (no JupyterHub).
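To put the two Jupyter image footprints quoted above in relative terms, the following is a minimal sketch that uses only the sizes and layer counts stated on this page. It assumes 1 GB = 1000 MB. The names `IMAGES`, `reduction`, and `compare` are illustrative and do not come from either project.

```python
# Figures as quoted on this page; 1 GB is taken as 1000 MB (an assumption).
IMAGES = {
    "jupyter": {"compressed_mb": 2100, "uncompressed_mb": 6000, "layers": 61},
    "gradiant": {"compressed_mb": 955, "uncompressed_mb": 2000, "layers": 21},
}


def reduction(before, after):
    """Return the fractional reduction when going from `before` to `after`."""
    return 1 - after / before


def compare(base, candidate):
    """Return the per-metric fractional reduction of `candidate` relative to `base`."""
    return {key: reduction(base[key], candidate[key]) for key in base}


if __name__ == "__main__":
    savings = compare(IMAGES["jupyter"], IMAGES["gradiant"])
    for metric, value in savings.items():
        # Print each metric as a percentage with one decimal place.
        print(f"{metric}: {value:.1%} smaller")
```

By these figures the Gradiant image is somewhat less than half the compressed size of the Jupyter image and about a third of its uncompressed size and layer count. The comparison covers footprint only, not the set of libraries each image ships.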