Ecosyste.ms: Repos
An open API service providing repository metadata for many open source software ecosystems.
GitHub topics: data-versioning
dolthub/dolt
Dolt – Git for Data
Language: Go - Size: 137 MB - Last synced: about 7 hours ago - Pushed: about 8 hours ago - Stars: 17,101 - Forks: 482
BemiHQ/bemi
Automatic data change tracking for PostgreSQL
Language: TypeScript - Size: 2.65 MB - Last synced: about 14 hours ago - Pushed: about 17 hours ago - Stars: 160 - Forks: 2
aws/amazon-finspace-examples
This repo contains sample code and sample notebooks to illustrate how to work with Amazon FinSpace
Language: Jupyter Notebook - Size: 127 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 21 - Forks: 23
daefresh/awesome-data-temporality
A curated list to help you manage temporal data across many modalities 🚀.
Size: 1.87 MB - Last synced: 1 day ago - Pushed: over 1 year ago - Stars: 99 - Forks: 2
koordinates/kart
Distributed version-control for geospatial and tabular data
Language: Python - Size: 107 MB - Last synced: 5 days ago - Pushed: 5 days ago - Stars: 503 - Forks: 39
Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
Size: 572 KB - Last synced: about 21 hours ago - Pushed: 6 months ago - Stars: 679 - Forks: 36
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
Language: Go - Size: 136 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 4,054 - Forks: 328
datopian/ckanext-versions
A CKAN extension for data versioning.
Language: Python - Size: 313 KB - Last synced: 20 days ago - Pushed: 11 months ago - Stars: 8 - Forks: 6
wandb/wandb
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.
Language: Python - Size: 89.8 MB - Last synced: 29 days ago - Pushed: 29 days ago - Stars: 8,194 - Forks: 604
BemiHQ/bemi-prisma
Automatic data change tracking for Prisma
Language: TypeScript - Size: 295 KB - Last synced: 28 days ago - Pushed: about 1 month ago - Stars: 64 - Forks: 0
GitDataAI/jiaozifs
A Git-like version control file system for data lineage & data collaboration.
Language: Go - Size: 1.66 MB - Last synced: about 1 month ago - Pushed: about 2 months ago - Stars: 41 - Forks: 2
quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
Language: Jupyter Notebook - Size: 228 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1,311 - Forks: 92
iusztinpaul/energy-forecasting
🌀 The Full Stack 7-Steps MLOps Framework | Learn MLE & MLOps for free by designing, building and deploying an end-to-end ML batch system ~ source code + 2.5 hours of reading & video materials
Language: Python - Size: 4.1 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 775 - Forks: 174
jomariya23156/full-stack-on-prem-cv-mlops
"1 config, 1 command from Jupyter Notebook to serve Millions of users", Full-stack On-Premises MLOps system for Computer Vision from Data versioning to Model monitoring and drift detection.
Language: Jupyter Notebook - Size: 15.2 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 34 - Forks: 2
pier4all/mongoose-versioned
Document versioning library for MongoDB using the mongoose package.
Language: JavaScript - Size: 527 KB - Last synced: about 1 month ago - Pushed: 5 months ago - Stars: 14 - Forks: 7
BemiHQ/bemi-typeorm
Automatic data change tracking for TypeORM
Language: TypeScript - Size: 73.2 KB - Last synced: 28 days ago - Pushed: 2 months ago - Stars: 19 - Forks: 0
ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
Language: R - Size: 1.5 MB - Last synced: about 1 month ago - Pushed: 6 months ago - Stars: 80 - Forks: 1
albagc/auto-data-version
Obtain data versioning tag using ML models
Language: Jupyter Notebook - Size: 8.18 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
ksm26/LLMOps
This course navigates through the LLMOps pipeline, enabling you to preprocess training data for supervised fine-tuning and deploy custom Large Language Models (LLMs).
Language: Jupyter Notebook - Size: 1.98 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
leeper/data-versioning
Collecting thoughts about data versioning
Size: 16.6 KB - Last synced: 28 days ago - Pushed: almost 5 years ago - Stars: 107 - Forks: 8
RecallGraph/RecallGraph
A versioning data store for time-variant graph data.
Language: JavaScript - Size: 4.31 MB - Last synced: about 1 month ago - Pushed: over 1 year ago - Stars: 331 - Forks: 25
martysai/artificial-text-detection
Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.
Language: Python - Size: 262 KB - Last synced: 1 day ago - Pushed: 9 months ago - Stars: 14 - Forks: 1
cs-uche/Car-Prices-Prediction
Advanced Machine Learning Regression: Predicting Car Prices
Language: Jupyter Notebook - Size: 10.1 MB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0
KalyanM45/Data-Version-Control-Demo
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
Language: Python - Size: 67.5 MB - Last synced: 19 days ago - Pushed: 10 months ago - Stars: 3 - Forks: 0
layerai-archive/sdk 📦
Metadata store for Production ML
Language: Python - Size: 2.22 MB - Last synced: about 17 hours ago - Pushed: over 1 year ago - Stars: 89 - Forks: 7
mucozcan/awesome-ml-infra
Articles, tutorials, and tools about creating scalable and sustainable ML/DL systems.
Size: 5.86 KB - Last synced: 5 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 1
d-lowl/bunny-party
A demonstration of how DVC and MLFlow can be used in the task of data relabeling
Language: Python - Size: 25.1 MB - Last synced: 5 months ago - Pushed: 8 months ago - Stars: 4 - Forks: 0
data-as-code/dac
Python Data as Code core implementation
Language: Python - Size: 814 KB - Last synced: 2 months ago - Pushed: 3 months ago - Stars: 6 - Forks: 0
wrgl/wrgl
Git-like data versioning.
Language: Go - Size: 3.47 MB - Last synced: 9 months ago - Pushed: 9 months ago - Stars: 38 - Forks: 0
VineetKT/ML_fastapi_on_Heroku_CI-CD
Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.
Language: Jupyter Notebook - Size: 4.91 MB - Last synced: 9 months ago - Pushed: almost 3 years ago - Stars: 2 - Forks: 2
dolthub/kedro-dolt
Kedro-Dolt Hook Plugin
Language: Python - Size: 73.2 KB - Last synced: 20 days ago - Pushed: over 1 year ago - Stars: 4 - Forks: 2
lucapug/github_actions_CI_CD
following best practices to productionize an ML project
Language: Jupyter Notebook - Size: 2.33 MB - Last synced: about 1 month ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0
OElesin/modeldb-aws
Verta ai ModelDB on AWS Cloud with integration into Amazon SageMaker for ML training data versioning and experiment tracking
Language: TypeScript - Size: 392 KB - Last synced: 19 days ago - Pushed: about 4 years ago - Stars: 1 - Forks: 0
pytholic/ClearML
Testing and implementations with ClearML
Language: Python - Size: 5.78 MB - Last synced: 23 days ago - Pushed: 8 months ago - Stars: 0 - Forks: 0
pier4all/data-versioning
Repository for evaluating the different approaches to data versioning
Language: JavaScript - Size: 23.2 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 1
lsjsj92/data_version_control
practice about data_version_control(DVC)
Size: 1000 Bytes - Last synced: about 1 year ago - Pushed: over 4 years ago - Stars: 1 - Forks: 0
fair-data-austria/dbrepo 📦
A Data Preservation Repository Supporting FAIR Principles, Data Versioning and Reproducible Queries
Language: Java - Size: 92.3 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 6 - Forks: 0
NewronAI/newron-sdk
Newron is a data-centric ML platform to easily build, manage, deploy and continuously improve models through data driven development.
Language: Python - Size: 1.11 MB - Last synced: 14 days ago - Pushed: over 1 year ago - Stars: 3 - Forks: 4
prathameshThakur/dvc-mlflow-test
DVC + MLflow for data monitoring and ML lifecycle management
Language: Jupyter Notebook - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0
neptune-ai/project-tabular-data-version
Project with tabular data versioned with Artifacts.
Language: Python - Size: 10.7 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0
zensors/droplet
A JSON-based format for working with machine learning data, with a focus on data interoperability.
Size: 1.69 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 7 - Forks: 0
datopian/ckanext-versioning
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.
Language: Python - Size: 385 KB - Last synced: 20 days ago - Pushed: about 3 years ago - Stars: 7 - Forks: 4