GitHub topics: datarecipes
NVIDIA/NeMo-Curator
Scalable data pre-processing and curation toolkit for LLMs
Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 879 - Forks: 124

data-prep-kit/data-prep-kit
Open-source project for data preparation for LLM application builders
Language: HTML - Size: 219 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 622 - Forks: 193
