Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

Package Usage: docker: apache/spark-py


11 versions
Latest release: about 1 year ago
322,094 downloads total

View more package details: https://packages.ecosyste.ms/registries/hub.docker.com/packages/apache/spark-py

Dependent Repos 15

syedhassaanahmed/portable-etl
E2E data pipeline showcasing Spark's capabilities on Cloud and Edge
  • v3.1.3 src/docker-compose.yml

Size: 1.21 MB - Last synced: 9 months ago - Pushed: 9 months ago

duyet/spark-docker
Spark image for running on Kubernetes
  • v3.3.0 spark-py/v3.3.0-hadoop3/Dockerfile

Size: 64.5 KB - Last synced: 4 days ago - Pushed: 4 days ago

maziyarpanahi/k8-zeppelin-spark
  • v3.2.2 images/spark/native/Dockerfile

Size: 111 KB - Last synced: 9 months ago - Pushed: 9 months ago

forrest-bajbek/pyspark-regression
A tool for regression testing Spark Dataframes in Python
  • latest Dockerfile

Size: 605 KB - Last synced: 26 days ago - Pushed: over 1 year ago

karmanovalexey/spark-food-clusterization
  • latest Dockerfile

Last synced: over 1 year ago

thalox-portal/feature-engineering
  • v3.3.0 Dockerfile

Size: 466 KB - Last synced: over 1 year ago

karmanovalexey/spark-recommendation-system
  • latest Dockerfile

Last synced: over 1 year ago

damian-barsotti/spark-docker-thrift
Docker compose cluster with running thrift server
  • ${SPARK_VERSION} spark/Dockerfile

Size: 2.34 MB - Last synced: about 1 year ago - Pushed: about 1 year ago

soldni/docker-images
A collection of scripts that build docker images for various use-cases.
  • latest apache-py-beaker/Dockerfile
  • latest spark-py-beaker-with-python/Dockerfile

Size: 73.2 KB - Last synced: 10 months ago - Pushed: about 1 year ago

caraml-dev/caraml-store
Feature Store for CaraML Platform
  • v3.1.3 caraml-store-spark/docker/Dockerfile

Size: 749 KB - Last synced: 8 days ago - Pushed: 8 days ago

khorshuheng/caraml-store Fork of caraml-dev/caraml-store
Feature Store for CaraML Platform
  • v3.1.3 caraml-store-spark/docker/Dockerfile

Size: 479 KB - Last synced: about 1 year ago - Pushed: over 1 year ago

tdthatcher/HPC Fork of NREL/HPC
A collection of various resources, examples, and executables for the general NREL HPC user community's benefit.
  • latest applications/spark/docker/python/Dockerfile

Size: 131 MB - Last synced: about 1 month ago - Pushed: about 1 month ago

alexpirogovski/quickstart-amazon-eks Fork of firstval/quickstart-amazon-eks
Amazon EKS Quick Start
  • v3.2.1 initContainers/spark-app/Dockerfile

Size: 421 MB - Last synced: 9 months ago - Pushed: 10 months ago

yandthj/HPC Fork of NREL/HPC
A collection of various resources, examples, and executables for the general NREL HPC user community's benefit.
  • latest applications/spark/docker/python/Dockerfile

Size: 159 MB - Last synced: 2 days ago - Pushed: 2 days ago

A6u7H/clickhouse_spark_ml
  • latest Dockerfile

Size: 106 KB - Last synced: over 1 year ago - Pushed: over 1 year ago

A6u7H/clickhouse
  • latest Dockerfile

Size: 121 KB - Last synced: over 1 year ago - Pushed: over 1 year ago

nkongenelly/assignments
assignments
  • v3.1.3 Dockerfile

Size: 4.7 MB - Last synced: 12 days ago - Pushed: over 1 year ago

2q3ridcz/handson-apache-spark-for-application-engineer
  • v3.3.1 .devcontainer/docker-compose.yml

Last synced: about 1 year ago

vhuni/Pyspark-structured-streaming-exercise-master
  • latest docker-compose.yml

Size: 1.95 KB - Last synced: about 1 year ago - Pushed: about 1 year ago

antonsold/spark-movielens
  • latest Dockerfile

Size: 6.28 MB - Last synced: over 1 year ago - Pushed: over 1 year ago

srivyshnavi93/vysh_temp
  • v3.3.1 archive/pyspark_container/Dockerfile
  • latest archive/spark_from_debian/spark_master_worker/Dockerfile
  • v3.3.1 spark_cluster/Dockerfile

Size: 80.4 MB - Last synced: 4 months ago - Pushed: about 1 year ago

shinie19/Recruitment_ETL_Pipeline
  • latest Dockerfile

Size: 22.5 MB - Last synced: about 1 year ago - Pushed: about 1 year ago

Zakhar-S1/SGH-BigDataFinaleProject
  • v3.3.0 spark_streaming/Dockerfile

Size: 474 KB - Last synced: 12 months ago - Pushed: over 1 year ago

Future-Outlier/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 15.1 MB - Last synced: 16 days ago - Pushed: 17 days ago

cdreetz/flytesnacks Fork of flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 36.5 MB - Last synced: about 2 months ago - Pushed: 11 months ago

cdreetz/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 6.64 MB - Last synced: about 2 months ago - Pushed: 11 months ago

Arnan-Dee/DataSci-Eng
  • v3.3.2 spark/Dockerfile

Size: 315 MB - Last synced: about 1 year ago - Pushed: about 1 year ago

calilisantos/candidates_finder
Looking up details of the followers of a GitHub profile via the REST API.
  • v3.4.0 docker-compose.yml

Size: 8.81 MB - Last synced: 5 days ago - Pushed: 6 days ago

DavidMertz/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 6.2 MB - Last synced: 11 months ago - Pushed: 11 months ago

ugurpy/csv2mongo
Streaming data application with PySpark from CSV to MongoDB using Docker.
  • latest Dockerfile

Size: 1.7 MB - Last synced: 11 months ago - Pushed: 11 months ago

smruti61/test_11_jun
  • latest Dockerfile

Size: 116 KB - Last synced: 12 months ago - Pushed: 12 months ago

cuddihyd-cornell/sysen5260-2023-mstdn-nlp
Data pipeline project
  • v3.4.0 docker-compose.yml
  • v3.4.0 jupyter/Dockerfile
  • v3.4.0 rest/Dockerfile

Size: 67.4 KB - Last synced: about 1 year ago - Pushed: about 1 year ago

starfoxbra/pocspark
  • latest Dockerfile

Size: 1.76 MB - Last synced: 11 months ago - Pushed: 11 months ago

pathwaycom/pathway-benchmarks
Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
  • v3.3.1 pagerank-iterative-graph-processing/pagerank_spark/docker-compose.yml

Size: 4.74 MB - Last synced: about 2 months ago - Pushed: 4 months ago

aybidi/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 6.22 MB - Last synced: 10 months ago - Pushed: 10 months ago

Hilton-AH/NLP-with-PySpark
Mastodon Toot Extraction with Natural Language Processing, Apache Spark, Apache Hadoop, PySpark, and MapReduce
  • v3.4.0 Dockerfile
  • v3.4.0 jupyter/Dockerfile
  • v3.4.0 rest/Dockerfile
  • v3.4.0 sparkcode/Dockerfile

Size: 7.81 KB - Last synced: 10 months ago - Pushed: 10 months ago

troychiu/flytesnacks Fork of flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 7.06 MB - Last synced: 6 months ago - Pushed: 6 months ago

tjkuson/Pyspark-structured-streaming-exercise Fork of ChrisPWilliams/Pyspark-structured-streaming-exercise
  • latest docker-compose.yml

Size: 1000 Bytes - Last synced: 10 months ago - Pushed: about 1 year ago

Salazander/portable-etl Fork of syedhassaanahmed/portable-etl
E2E data pipeline built on Spark Structured Streaming, showcased in both Cloud and Edge
  • v3.3.2 src/docker-compose.yml

Size: 1.05 MB - Last synced: 10 months ago - Pushed: about 1 year ago

terry-cao/HPC Fork of NREL/HPC
A collection of various resources, examples, and executables for the general NREL HPC user community's benefit. Use the following website for accessing documentation.
  • latest applications/spark/docker/python/Dockerfile

Size: 75.5 MB - Last synced: 9 months ago - Pushed: 9 months ago

costwise/optscale Fork of hystax/optscale
Hystax OptScale - Cloud Cost Optimization, FinOps and MLOps platform. Includes UI, all the source code and deployment instructions.
  • v3.3.0 deploy/docker_images/ohsu/Dockerfile
  • v3.3.0 optscale-deploy/docker_images/ohsu/Dockerfile

Size: 11.7 MB - Last synced: 9 months ago - Pushed: 12 months ago

gnanaprakash-ravi/zingg Fork of zinggAI/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
  • v3.1.3 Dockerfile
  • v3.1.3 docker/mac/Dockerfile

Size: 438 MB - Last synced: 3 months ago - Pushed: 3 months ago

MemVerge/flytekit-float Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 6.63 MB - Last synced: about 2 months ago - Pushed: about 2 months ago

MemVerge/flytesnacks Fork of flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 7.37 MB - Last synced: about 1 month ago - Pushed: about 1 month ago

Tom-Newton/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 10.3 MB - Last synced: about 2 months ago - Pushed: about 2 months ago

euanjudd/sparkify_user_churn_prediction
  • latest Dockerfile

Size: 102 KB - Last synced: 2 days ago - Pushed: 2 days ago

Mormur22/PrivateCloudArchitectureOnKubernetes
  • latest jupyter/spark/Dockerfile

Size: 110 MB - Last synced: about 2 months ago - Pushed: 9 months ago

shauryaparanjape/spark-docker
Repository for running Spark master and worker nodes in Docker containers
  • latest spark-scripts/Dockerfile

Size: 1.95 KB - Last synced: 9 months ago - Pushed: over 1 year ago

ssen85/flytesnacks Fork of flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 36.4 MB - Last synced: 8 months ago - Pushed: 8 months ago

flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 21.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago

squiishyy/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 17.4 MB - Last synced: 9 months ago - Pushed: 9 months ago

Future-Outlier/flytesnacks Fork of flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 7.18 MB - Last synced: 5 months ago - Pushed: 5 months ago

k1nshuk/HPC Fork of NREL/HPC
A collection of various resources, examples, and executables for the general NREL HPC user community's benefit.
  • latest applications/spark/docker/python/Dockerfile

Size: 83 MB - Last synced: 3 months ago - Pushed: 5 months ago

deivy311/DockerComposePySpark
  • v3.4.0 Dockerfile

Size: 6.84 KB - Last synced: 9 months ago - Pushed: 9 months ago

zinggAI/zingg-vikas Fork of zinggAI/zingg
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
  • v3.1.3 Dockerfile
  • v3.1.3 docker/mac/Dockerfile

Size: 438 MB - Last synced: 11 days ago - Pushed: 11 days ago

hhcs9527/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 7.01 MB - Last synced: 8 months ago - Pushed: 8 months ago

coreweave/kubernetes-cloud
Getting Started with the CoreWeave Kubernetes GPU Cloud
  • $SPARK_VERSION spark/docker/Dockerfile

Size: 195 MB - Last synced: 31 minutes ago - Pushed: about 14 hours ago

ericwudayi/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 8.22 MB - Last synced: 6 months ago - Pushed: 6 months ago

flyteorg/flytesnacks
Flyte Documentation 📖
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 37.3 MB - Last synced: 17 days ago - Pushed: 17 days ago

Michkail/LessonStore
  • v3.4.0 docker-compose-spark.yml

Size: 4.55 MB - Last synced: 6 months ago - Pushed: 6 months ago

Yicheng-Lu-llll/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 22.5 MB - Last synced: 4 months ago - Pushed: 4 months ago

chfstudio/optscale Fork of hystax/optscale
Hystax OptScale - Cloud Cost Optimization, FinOps and MLOps platform. Includes UI, all the source code and deployment instructions.
  • v3.3.0 deploy/docker_images/ohsu/Dockerfile
  • v3.3.0 optscale-deploy/docker_images/ohsu/Dockerfile

Size: 12.1 MB - Last synced: 17 days ago - Pushed: 18 days ago

ehab-eb/spond-data-engineer Fork of spondcorp/spond-data-engineer
  • 3.3.1 Dockerfile

Size: 286 KB - Last synced: 8 months ago - Pushed: 8 months ago

asdfjkalsdfla/gavotesbackend
Data pipelines to summarize Georgia voting and early voting results
  • latest Dockerfile

Size: 235 KB - Last synced: 4 days ago - Pushed: 4 days ago

mdjong1/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 12.6 MB - Last synced: 6 months ago - Pushed: 6 months ago

ringohoffman/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 7.92 MB - Last synced: 7 months ago - Pushed: 7 months ago

troychiu/flytekit Fork of flyteorg/flytekit
Extensible Python SDK for developing Flyte tasks and workflows. Simple to get started and learn and highly extensible.
  • 3.3.1 plugins/flytekit-spark/Dockerfile

Size: 9.14 MB - Last synced: 2 months ago - Pushed: 2 months ago

iammuho/spark-test
Test spark repository for the task
  • latest Custom_folder/Dockerfile

Size: 11.7 KB - Last synced: 7 months ago - Pushed: 7 months ago

romibuzi/iceberg-pyspark-demo
Usage of Apache Iceberg format with PySpark
  • v3.2.4 Dockerfile

Size: 20.5 KB - Last synced: about 2 months ago - Pushed: 7 months ago

wahoo14/apache_project
  • v3.3.0 Consumer/Dockerfile

Size: 50.3 MB - Last synced: 6 months ago - Pushed: 6 months ago

dyllamt/apache-sample
Helm charts and source code for an apache data application.
  • v3.4.0 src/spark/Dockerfile

Size: 810 KB - Last synced: 6 months ago - Pushed: 6 months ago

datafabrichub/nebulasnacks
  • 3.3.1 examples/k8s_spark_plugin/Dockerfile

Size: 5.11 MB - Last synced: 4 months ago - Pushed: 6 months ago

lucimidori92/challenge_shape
Repository containing the challenge proposed by the company Shape
  • latest Dockerfile

Size: 3.91 KB - Last synced: 6 months ago - Pushed: 6 months ago

datafabrichub/nebulakit
  • 3.3.1 plugins/nebulakit-spark/Dockerfile

Size: 3.47 MB - Last synced: 4 months ago - Pushed: 5 months ago

sbow/Datascience
No models / data, datascience repo
  • latest docker-compose.yml

Size: 317 KB - Last synced: about 1 month ago - Pushed: about 1 month ago

GMaffio99/kube-mondrian
Thesis project: Distributed Anonymization on Kubernetes
  • v3.4.0 Dockerfile

Size: 6.16 MB - Last synced: 3 months ago - Pushed: 3 months ago

Gustavo-H-Martins/desafio_panvel-data_engineer
  • ${SPARK_VERSION} Dockerfile

Size: 2.4 GB - Last synced: 4 months ago - Pushed: 4 months ago

Gustavo-H-Martins/metodologia_medalhao_pipeline_engenharia_de_dados
  • ${SPARK_VERSION} Dockerfile

Size: 313 MB - Last synced: 4 months ago - Pushed: 4 months ago

ismaHenzel/potgres-to-delta-cdc-using-debezium-kafka-spark
This project establishes a versatile layer capable of handling Change Data Capture (CDC) for any PostgreSQL table dynamically. It achieves this by leveraging a schema registry to load the appropriate types seamlessly into Spark using Avro. This ensures a flexible and extensible solution, eliminating the need for additional code for each new table.
  • v3.4.0 Dockerfile

Size: 11.7 KB - Last synced: about 2 months ago - Pushed: 4 months ago

garfado/optscale
  • v3.3.0 docker_images/ohsu/Dockerfile

Size: 565 MB - Last synced: 4 months ago - Pushed: 9 months ago

klima7/Spark-Playground
Environment for learning and experimenting with Apache Spark
  • latest .devcontainer/Dockerfile

Size: 1.14 MB - Last synced: 4 months ago - Pushed: 4 months ago

WesleyJw/Learning
This repository hosts content created from courses, reading books, and personal studies to record and document all knowledge acquired throughout my years of study. This repository is continuously evolving.
  • v3.2.4 Kubernets/spark_on_kubernetes_series/Dockerfile

Size: 352 MB - Last synced: 3 months ago - Pushed: 3 months ago

MarcusLe02/realtime-pipeline-hiring-platform
Real-time data engineering pipeline for an American hiring platform
  • v3.1.3 Dockerfile

Size: 2.42 MB - Last synced: 3 months ago - Pushed: 3 months ago

M4nihere/cafebot-operators
  • latest spark-operator/Dockerfile
  • latest spark-operator/history-server/Dockerfile
  • latest spark-operator/spark-operator/Dockerfile

Size: 2.34 MB - Last synced: about 2 months ago - Pushed: 3 months ago

MohammedTalhi/scale-project
  • v3.3.0 deploy/docker_images/ohsu/Dockerfile
  • v3.3.0 optscale-deploy/docker_images/ohsu/Dockerfile

Size: 5.89 MB - Last synced: about 2 months ago - Pushed: about 1 year ago