An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: datarecipes

NVIDIA/NeMo-Curator

Scalable data pre-processing and curation toolkit for LLMs

Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 879 - Forks: 124

data-prep-kit/data-prep-kit

Open source project for data preparation for LLM application builders

Language: HTML - Size: 219 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 622 - Forks: 193