Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub / Morawetz / Speech-to-text-data_collection

Speech-to-text data collection with Kafka, Airflow, and Spark. The project builds a deployable pipeline that posts text and audio files to a data lake and receives them from it, applies transformations in a distributed manner, and loads the results into a warehouse in a format suitable for training a speech-to-text model.

JSON API: https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Morawetz%2FSpeech-to-text-data_collection
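
The JSON API address above appears to follow a hosts/<host>/repositories/<owner>%2F<name> pattern, with the slash in the repository's full name percent-encoded. The sketch below rebuilds that address and fetches the record using only the Python standard library. The helper names repository_url and fetch_repository are hypothetical, the URL pattern is inferred from this single example, and no assumption is made about which fields the response contains.

```python
import json
import urllib.request
from urllib.parse import quote

# Root inferred from the JSON API URL listed on this page.
API_ROOT = "https://repos.ecosyste.ms/api/v1"


def repository_url(host: str, full_name: str) -> str:
    """Build the repository lookup URL (hypothetical helper).

    safe="" makes quote() encode "/" as "%2F", matching the URL above.
    """
    return (
        f"{API_ROOT}/hosts/{quote(host, safe='')}"
        f"/repositories/{quote(full_name, safe='')}"
    )


def fetch_repository(host: str, full_name: str, timeout: float = 10.0) -> dict:
    """Fetch and parse the JSON record (hypothetical helper, needs network)."""
    with urllib.request.urlopen(repository_url(host, full_name), timeout=timeout) as response:
        return json.load(response)


# Example call (requires network access; response fields are not assumed here):
# record = fetch_repository("GitHub", "Morawetz/Speech-to-text-data_collection")
# print(sorted(record))
```

Passing safe="" matters here: by default quote() leaves "/" unescaped, which would produce a different path from the one listed above.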

Stars: 2
Forks: 8
Open Issues: 8

License: None
Language: Python
Repo Size: 38.5 MB
Dependencies: 15

Created: over 2 years ago
Updated: almost 2 years ago
Last pushed: over 2 years ago
Last synced: about 1 year ago
