GitHub topics: data-version-control
treeverse/lakeFS
lakeFS - Data version control for your data lake | Git for data
Language: Go - Size: 149 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,660 - Forks: 373

iterative/dvc
🦉 Data Versioning and ML Experiments
Language: Python - Size: 19.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 14,444 - Forks: 1,210

dolthub/dolt
Dolt – Git for Data
Language: Go - Size: 148 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 18,661 - Forks: 556

quiltdata/quilt
Quilt is a data mesh for connecting people with actionable data
Language: TypeScript - Size: 164 MB - Last synced at: 2 days ago - Pushed at: 7 days ago - Stars: 1,339 - Forks: 91

git-lfs-fuse/git-lfs-fuse
Mount remote repositories, models and datasets managed by Git LFS locally.
Language: Go - Size: 293 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 42 - Forks: 3

zincware/ZnTrack
Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.
Language: Python - Size: 9.33 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 52 - Forks: 5

GitDataAI/jzfs
Git based Version Control File System for joint management of code, data, model and their relationship.
Language: Rust - Size: 3.92 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 106 - Forks: 11

ropensci/gittargets
Data version control for reproducible analysis pipelines in R with {targets}.
Language: R - Size: 1.5 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 88 - Forks: 1

splitgraph/sgr
sgr (command line client for Splitgraph) and the splitgraph Python library
Language: Python - Size: 9.38 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 321 - Forks: 17

data-as-code/dac
Python Data as Code core implementation
Language: Python - Size: 814 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 0

datopian/ckanext-versions
A CKAN extension for data versioning.
Language: Python - Size: 362 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 6

joseph-nagel/dvc-playground
Playground for learning DVC
Language: Python - Size: 2.93 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

daefresh/awesome-data-temporality
A curated list to help you manage temporal data across many modalities 🚀.
Size: 1.87 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 111 - Forks: 2

data-drift/data-drift
Metrics Observability & Troubleshooting
Language: HTML - Size: 11.7 MB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 323 - Forks: 12

Shuyib/data-version-ctrl
Data version control with Makefile and DVC for a regression task to estimate insurance costs for certain individuals.
Language: Python - Size: 678 KB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 1

HarshStats/Chicken-Disease-Classification-Using-MLOPS-DVC-Pipeline-
The Chicken Disease Classification Using MLOps DVC Pipeline project utilizes the VGG16 architecture to analyze images of chicken fecal matter, enabling early disease detection and reducing economic losses in poultry farming.
Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ajithvcoder/dvc-aws-s3-bucket-workflow-setup
Tutorial for connecting DVC to AWS S3 and running GitHub Actions
Size: 1.23 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

cr21/DVC-pytorch-lightning-MLOps
Data Version Control (DVC), Hydra, and PyTorch Lightning integration for MLOps
Language: Python - Size: 67.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ajithvcoder/dvc-gdrive-workflow-setup
Tutorial for connecting DVC to Google Drive and running GitHub Actions
Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

MArpogaus/dvc-stage
Stop programming common DVC stages. Configure them.
Language: Python - Size: 177 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Md-Emon-Hasan/DVC-Turotial
📂 Comprehensive guide on using DVC for efficient and reproducible machine learning projects, covering essential commands and workflows.
Size: 27.3 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

martysai/artificial-text-detection
Python framework for artificial text detection: NLP approaches for comparing natural text against text generated by neural networks.
Language: Python - Size: 262 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 1

ejhusom/d2m
A machine learning pipeline taking you from raw data to a fully trained machine learning model - from data to model (d2m).
Language: Python - Size: 29.8 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

VineetKT/ML_fastapi_on_Heroku_CI-CD
Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.
Language: Jupyter Notebook - Size: 4.91 MB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 2

abdmuffid/DVC-Basics
In this repository, an ML-Ops task is undertaken to practice configuring and storing data using DVC on GitHub. The goal is to explore how DVC seamlessly integrates for efficient data management, enhancing reproducibility and scalability in machine learning workflows.
Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

wrgl/wrgl
Git-like data versioning.
Language: Go - Size: 3.49 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 0

RuiFilipeCampos/git-datasets
Declaratively create, transform, manage and version ML datasets.
Language: Python - Size: 120 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

blaz-cerpnjak/dvc-git-example
DVC - Data Version Control Basics
Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Ezzaldin97/Batch-Serving-ML-Pipeline
Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools
Language: Python - Size: 1.67 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

data-mill-cloud/mastro 📦
Metadata management in Go
Language: Go - Size: 187 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

KalyanM45/Data-Version-Control-Demo
The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.
Language: Python - Size: 67.5 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

NtreevSoft/Crema
Metadata server & client tools for game development
Language: C# - Size: 4.44 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 67 - Forks: 15

aws-samples/amazon-sagemaker-experiments-dvc-demo
SageMaker Experiments and DVC
Language: Jupyter Notebook - Size: 1.15 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 2

ClimateImpactLab/DataFS
An abstraction layer for data storage systems
Language: Python - Size: 920 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

lsjsj92/data_version_control
Practice with data version control (DVC)
Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ericdasse28/dvc-test
Just to try out DVC
Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Michael95-m/simple_demo_dvc
Demonstration of how to use DVC (Data Version Control)
Language: Jupyter Notebook - Size: 412 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mlrepa/dvc-2-data-versioning
Lesson 2 tutorial: Versioning Data and Model for the ML REPA School course: Machine Learning experiments reproducibility and engineering with DVC
Language: Jupyter Notebook - Size: 6.49 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 4

DiegoBiagini/NatuReddit
Personal project aimed at developing a ML service which resembles a production environment system
Language: Python - Size: 55.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

wlandau/user-conf-2022
useR! 2022 talk
Language: HTML - Size: 2.41 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

zensors/droplet
A JSON-based format for working with machine learning data, with a focus on data interoperability.
Size: 1.69 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0

datopian/ckanext-versioning
Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.
Language: Python - Size: 385 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 7 - Forks: 4
