An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-version-control

treeverse/lakeFS

lakeFS - Data version control for your data lake | Git for data

Language: Go - Size: 149 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,660 - Forks: 373

iterative/dvc

🦉 Data Versioning and ML Experiments

Language: Python - Size: 19.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 14,444 - Forks: 1,210

dolthub/dolt

Dolt – Git for Data

Language: Go - Size: 148 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 18,661 - Forks: 556

quiltdata/quilt

Quilt is a data mesh for connecting people with actionable data

Language: TypeScript - Size: 164 MB - Last synced at: 2 days ago - Pushed at: 7 days ago - Stars: 1,339 - Forks: 91

git-lfs-fuse/git-lfs-fuse

Mount remote repositories, models and datasets managed by Git LFS locally.

Language: Go - Size: 293 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 42 - Forks: 3

zincware/ZnTrack

Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.

Language: Python - Size: 9.33 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 52 - Forks: 5

GitDataAI/jzfs

Git-based version control file system for joint management of code, data, models, and their relationships.

Language: Rust - Size: 3.92 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 106 - Forks: 11

ropensci/gittargets

Data version control for reproducible analysis pipelines in R with {targets}.

Language: R - Size: 1.5 MB - Last synced at: 5 days ago - Pushed at: 10 months ago - Stars: 88 - Forks: 1

splitgraph/sgr

sgr (command line client for Splitgraph) and the splitgraph Python library

Language: Python - Size: 9.38 MB - Last synced at: 4 days ago - Pushed at: about 1 year ago - Stars: 321 - Forks: 17

data-as-code/dac

Python Data as Code core implementation

Language: Python - Size: 814 KB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 0

datopian/ckanext-versions

A CKAN extension for data versioning.

Language: Python - Size: 362 KB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 9 - Forks: 6

joseph-nagel/dvc-playground

Playground for learning DVC

Language: Python - Size: 2.93 KB - Last synced at: 30 days ago - Pushed at: 30 days ago - Stars: 0 - Forks: 0

daefresh/awesome-data-temporality

A curated list to help you manage temporal data across many modalities 🚀.

Size: 1.87 MB - Last synced at: 8 days ago - Pushed at: over 2 years ago - Stars: 111 - Forks: 2

data-drift/data-drift

Metrics Observability & Troubleshooting

Language: HTML - Size: 11.7 MB - Last synced at: 12 days ago - Pushed at: about 1 year ago - Stars: 323 - Forks: 12

Shuyib/data-version-ctrl

Data version control with Makefile and DVC for a regression task to estimate insurance costs for certain individuals.

Language: Python - Size: 678 KB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 6 - Forks: 1

HarshStats/Chicken-Disease-Classification-Using-MLOPS-DVC-Pipeline-

The Chicken Disease Classification Using MLOps DVC Pipeline project utilizes the VGG16 architecture to analyze images of chicken fecal matter, enabling early disease detection and reducing economic losses in poultry farming.

Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0

ajithvcoder/dvc-aws-s3-bucket-workflow-setup

Tutorial for connecting DVC to AWS S3 and running GitHub Actions

Size: 1.23 MB - Last synced at: 2 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 1

cr21/DVC-pytorch-lightning-MLOps

Data Version Control (DVC), Hydra, and PyTorch Lightning integration for MLOps

Language: Python - Size: 67.5 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

ajithvcoder/dvc-gdrive-workflow-setup

Tutorial for connecting DVC to Google Drive and running GitHub Actions

Size: 1.71 MB - Last synced at: about 1 month ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

MArpogaus/dvc-stage

Stop programming common dvc stages. Configure them.

Language: Python - Size: 177 KB - Last synced at: 5 days ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

Md-Emon-Hasan/DVC-Turotial

📂 Comprehensive guide on using DVC for efficient and reproducible machine learning projects, covering essential commands and workflows.

Size: 27.3 KB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 1 - Forks: 0

martysai/artificial-text-detection

Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.

Language: Python - Size: 262 KB - Last synced at: 16 days ago - Pushed at: over 1 year ago - Stars: 15 - Forks: 1

ejhusom/d2m

A machine learning pipeline taking you from raw data to a fully trained machine learning model - from data to model (d2m).

Language: Python - Size: 29.8 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 1

VineetKT/ML_fastapi_on_Heroku_CI-CD

Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.

Language: Jupyter Notebook - Size: 4.91 MB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 2

abdmuffid/DVC-Basics

In this repository, an ML-Ops task is undertaken to practice configuring and storing data using DVC on GitHub. The goal is to explore how DVC seamlessly integrates for efficient data management, enhancing reproducibility and scalability in machine learning workflows.

Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

wrgl/wrgl

Git-like data versioning.

Language: Go - Size: 3.49 MB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 40 - Forks: 0

RuiFilipeCampos/git-datasets

Declaratively create, transform, manage and version ML datasets.

Language: Python - Size: 120 KB - Last synced at: about 1 month ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

blaz-cerpnjak/dvc-git-example

DVC - Data Version Control Basics

Size: 4.88 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Ezzaldin97/Batch-Serving-ML-Pipeline

Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools

Language: Python - Size: 1.67 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

data-mill-cloud/mastro 📦

Metadata management in Go

Language: Go - Size: 187 MB - Last synced at: 10 months ago - Pushed at: about 2 years ago - Stars: 7 - Forks: 0

KalyanM45/Data-Version-Control-Demo

The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.

Language: Python - Size: 67.5 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

NtreevSoft/Crema

Meta data server & client tools for game development

Language: C# - Size: 4.44 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 67 - Forks: 15

aws-samples/amazon-sagemaker-experiments-dvc-demo

SageMaker Experiments and DVC

Language: Jupyter Notebook - Size: 1.15 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 13 - Forks: 2

ClimateImpactLab/DataFS

An abstraction layer for data storage systems

Language: Python - Size: 920 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

lsjsj92/data_version_control

Practice with data version control (DVC)

Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ericdasse28/dvc-test

Just to try out DVC

Size: 9.77 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

Michael95-m/simple_demo_dvc

Demonstration of how to use DVC (Data Version Control)

Language: Jupyter Notebook - Size: 412 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mlrepa/dvc-2-data-versioning

Lesson 2 tutorial: Versioning Data and Model for the ML REPA School course: Machine Learning experiments reproducibility and engineering with DVC

Language: Jupyter Notebook - Size: 6.49 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 5 - Forks: 4

DiegoBiagini/NatuReddit

Personal project aimed at developing a ML service which resembles a production environment system

Language: Python - Size: 55.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

wlandau/user-conf-2022

useR! 2022 talk

Language: HTML - Size: 2.41 MB - Last synced at: about 1 month ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

zensors/droplet

A JSON-based format for working with machine learning data, with a focus on data interoperability.

Size: 1.69 MB - Last synced at: 5 months ago - Pushed at: almost 3 years ago - Stars: 7 - Forks: 0

datopian/ckanext-versioning

Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.

Language: Python - Size: 385 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 7 - Forks: 4
