GitHub / huggingface / olm-datasets
Pipeline for pulling and processing online language model pretraining data from the web
JSON API: http://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Folm-datasets
PURL: pkg:github/huggingface/olm-datasets
Stars: 178
Forks: 23
Open issues: 1
License: apache-2.0
Language: Python
Size: 324 KB
Dependencies parsed at: Pending
Created at: almost 3 years ago
Updated at: 2 months ago
Pushed at: almost 2 years ago
Last synced at: 8 days ago