GitHub topics: spark-datasource
huggingface/pyspark_huggingface
PySpark custom data source for Hugging Face Datasets
Language: Python - Size: 201 KB - Last synced at: 4 days ago - Pushed at: 18 days ago - Stars: 6 - Forks: 4

StabRise/spark-pdf
PDF data source for Apache Spark that reads PDF files directly into a DataFrame and runs OCR on them
Language: Scala - Size: 5.72 MB - Last synced at: 29 days ago - Pushed at: about 2 months ago - Stars: 66 - Forks: 3

rejeb/netcdf-spark-parser
Scala/Spark parser for reading NetCDF files
Language: Scala - Size: 88.9 KB - Last synced at: 16 days ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

spark-root/laurelin
Allows reading ROOT TTrees into Apache Spark as DataFrames
Language: Java - Size: 934 KB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 4

miraisolutions/spark-bigquery
Google BigQuery data source for Apache Spark
Language: Scala - Size: 188 KB - Last synced at: about 1 year ago - Pushed at: about 5 years ago - Stars: 18 - Forks: 6
