An open API service providing repository metadata for many open source software ecosystems.

GitHub / ev2900 / Glue_Aggregate_Small_Files

PySpark script that aggregates small Parquet files under an S3 prefix into larger files. Designed to run on AWS Glue.
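The description above can be illustrated with a minimal sketch. This is not the repository's actual script: the function names `target_file_count` and `compact_parquet`, the 128 MiB default target size, and the use of a separate destination path are all assumptions made for illustration. The idea is to estimate how many output files are needed for a given total input size, then read every Parquet file under the source prefix and rewrite the data with that many partitions.

```python
import math


def target_file_count(total_bytes: int, target_file_bytes: int = 128 * 1024 * 1024) -> int:
    """Estimate how many output files keep each file near target_file_bytes.

    The 128 MiB default is an assumed target, not a value taken from the repository.
    """
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("total_bytes must be >= 0 and target_file_bytes must be > 0")
    # Always write at least one file, even for an empty or tiny input.
    return max(1, math.ceil(total_bytes / target_file_bytes))


def compact_parquet(spark, source_path: str, dest_path: str, num_files: int) -> None:
    """Rewrite all Parquet files under source_path as num_files larger files.

    `spark` is an existing SparkSession (in a Glue job, typically obtained
    from the GlueContext). Both paths are hypothetical placeholders.
    """
    df = spark.read.parquet(source_path)
    # repartition() shuffles the rows into exactly num_files partitions,
    # and Spark writes one output file per non-empty partition.
    df.repartition(num_files).write.mode("overwrite").parquet(dest_path)
```

Writing to a destination different from the source avoids overwriting a path that Spark is still reading from. For example, `target_file_count(10 * 1024**3)` suggests 80 output files for 10 GiB of input at the assumed 128 MiB target.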

JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ev2900%2FGlue_Aggregate_Small_Files
PURL: pkg:github/ev2900/Glue_Aggregate_Small_Files

Stars: 1
Forks: 0
Open issues: 0

License: None
Language: Python
Size: 149 KB
Dependencies parsed at: Pending

Created at: over 2 years ago
Updated at: 11 days ago
Pushed at: 11 days ago
Last synced at: 11 days ago

Topics: aws, glue, pyspark, s3, small-files
