An open API service providing repository metadata for many open source software ecosystems.

gitlab.com / siddie / stackexchange-dump-spark-research-tools

Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org. This project is both an example of, and a framework for, building ETL pipelines over that data with Apache Spark and Java.
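As a rough illustration of the extract and transform steps such a pipeline performs, here is a minimal sketch. It is not taken from this repository. To stay dependency-free it uses the JDK's built-in StAX parser instead of Spark, and it assumes the dump layout of one `<row .../>` element per record with the fields stored as XML attributes. The class name `DumpRowExtractor` and both method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the project: plain-JDK stand-in for a Spark job.
final class DumpRowExtractor {

    // Extract: read every <row .../> element into a map of attribute name -> value.
    // Assumption: each record is a single <row> element whose fields are attributes.
    static List<Map<String, String>> extractRows(String xml) throws XMLStreamException {
        List<Map<String, String>> rows = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "row".equals(reader.getLocalName())) {
                    Map<String, String> row = new LinkedHashMap<>();
                    for (int i = 0; i < reader.getAttributeCount(); i++) {
                        row.put(reader.getAttributeLocalName(i), reader.getAttributeValue(i));
                    }
                    rows.add(row);
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    // Transform: keep only rows whose PostTypeId is "1".
    // Assumption: PostTypeId "1" marks a question in the posts file.
    static List<Map<String, String>> questionsOnly(List<Map<String, String>> rows) {
        List<Map<String, String>> questions = new ArrayList<>();
        for (Map<String, String> row : rows) {
            if ("1".equals(row.get("PostTypeId"))) {
                questions.add(row);
            }
        }
        return questions;
    }
}
```

Calling `DumpRowExtractor.questionsOnly(DumpRowExtractor.extractRows(xml))` on the contents of a posts file would return only the question rows. In a Spark job the same two steps would typically become a read followed by a filter over a distributed dataset.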

JSON API: http://repos.ecosyste.ms/api/v1/hosts/gitlab.com/repositories/siddie%2Fstackexchange-dump-spark-research-tools
PURL: pkg:gitlab/siddie/stackexchange-dump-spark-research-tools

Stars: 0
Forks: 0
Open issues: 0

License: None
Language:
Dependencies parsed at: Pending

Created at: about 6 years ago
Updated at: almost 6 years ago
Last synced at: over 2 years ago

Topics: Apache Spark, ETL, big data, big data ETL, spark, stack exchange, stackexchange
