GitHub topics: datacuration
WDscholia/scholia
Wikidata-based scholarly profiles
Language: JavaScript - Size: 4.81 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 234 - Forks: 82

chapmanjacobd/library
99+ CLI tools to build, browse, and blend your media library
Language: Python - Size: 200 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 425 - Forks: 13

NVIDIA/NeMo-Curator
Scalable data preprocessing and curation toolkit for LLMs
Language: Jupyter Notebook - Size: 7.66 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 879 - Forks: 124

data-prep-kit/data-prep-kit
Open-source project for data preparation for LLM application builders
Language: HTML - Size: 219 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 622 - Forks: 193

GaloRomero/pepadbPosgreScript
PostgreSQL code for archaeological data management
Language: SQL - Size: 14.4 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

kosson/sva21
This repo contains materials, datasets, and solutions that were used during the Astra Summer School (Școala de vară Astra), first edition, 2021
Size: 3.57 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

purvasingh96/Data-Collection-for-CarZam
An image and data web scraper built to crawl the CarMax website and store relevant information for vehicle identification projects.
Language: Python - Size: 70.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 1

benjaminocampo/DataCuration
Exploration and curation of a dataset provided by a Kaggle competition (https://www.kaggle.com/dansbecker/melbourne-housing-snapshot) covering properties sold in Melbourne in 2016 and 2017. The goal of this project is to prepare a well-structured matrix that a model can use to estimate property prices.
Language: Jupyter Notebook - Size: 14.2 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 2
