An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-version-control

iterative/dvc

🦉 Data Versioning and ML Experiments

Language: Python - Size: 19.6 MB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 14,706 - Forks: 1,240

treeverse/lakeFS

lakeFS - Data version control for your data lake | Git for data

Language: Go - Size: 152 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,787 - Forks: 382

dolthub/dolt

Dolt – Git for Data

Language: Go - Size: 150 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 18,914 - Forks: 568

GitDataAI/jzfs

Git-based version control file system for joint management of code, data, models, and their relationships.

Language: Rust - Size: 6.38 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 108 - Forks: 11

quiltdata/quilt

Quilt is a data mesh for connecting people with actionable data

Language: TypeScript - Size: 165 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,342 - Forks: 91

zincware/ZnTrack

Create, visualize, run & benchmark DVC pipelines in Python & Jupyter notebooks.

Language: Python - Size: 9.7 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 53 - Forks: 5

data-drift/data-drift

Metrics Observability & Troubleshooting

Language: HTML - Size: 11.7 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 322 - Forks: 12

daefresh/awesome-data-temporality

A curated list to help you manage temporal data across many modalities 🚀.

Size: 1.87 MB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 115 - Forks: 2

splitgraph/sgr

sgr (command line client for Splitgraph) and the splitgraph Python library

Language: Python - Size: 9.38 MB - Last synced at: 11 days ago - Pushed at: about 1 year ago - Stars: 323 - Forks: 18

MArpogaus/dvc-stage

Stop programming common dvc stages. Configure them.

Language: Python - Size: 157 KB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

git-lfs-fuse/git-lfs-fuse

Mount remote repositories, models and datasets managed by Git LFS locally.

Language: Go - Size: 253 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 50 - Forks: 3

aliyzd95/project-dnn-ser-pipeline

This repository contains a complete machine learning pipeline for Speech Emotion Recognition (SER) using Deep Neural Networks (DNNs).

Language: Python - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

ropensci/gittargets

Data version control for reproducible analysis pipelines in R with {targets}.

Language: R - Size: 1.5 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 88 - Forks: 1

data-as-code/dac

Python Data as Code core implementation

Language: Python - Size: 814 KB - Last synced at: 15 days ago - Pushed at: 4 months ago - Stars: 9 - Forks: 0

datopian/ckanext-versions

A CKAN extension for data versioning.

Language: Python - Size: 358 KB - Last synced at: 27 days ago - Pushed at: about 1 month ago - Stars: 9 - Forks: 7

joseph-nagel/dvc-playground

Playground for learning DVC

Language: Python - Size: 2.93 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

Shuyib/data-version-ctrl

Data version control with Makefile and DVC for a regression task to estimate insurance costs for certain individuals.

Language: Python - Size: 678 KB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 6 - Forks: 1

HarshStats/Chicken-Disease-Classification-Using-MLOPS-DVC-Pipeline-

The Chicken Disease Classification Using MLOps DVC Pipeline project utilizes the VGG16 architecture to analyze images of chicken fecal matter, enabling early disease detection and reducing economic losses in poultry farming.

Language: Jupyter Notebook - Size: 31.9 MB - Last synced at: 5 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0

ajithvcoder/dvc-aws-s3-bucket-workflow-setup

Tutorial on connecting DVC to AWS S3 and running GitHub Actions

Size: 1.23 MB - Last synced at: 5 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 1

cr21/DVC-pytorch-lightning-MLOps

Data Version Control (DVC), Hydra, and PyTorch Lightning integration for MLOps

Language: Python - Size: 67.5 MB - Last synced at: 4 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

ajithvcoder/dvc-gdrive-workflow-setup

Tutorial on connecting DVC to Google Drive and running GitHub Actions

Size: 1.71 MB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

Md-Emon-Hasan/DVC-Turotial

📂 Comprehensive guide on using DVC for efficient and reproducible machine learning projects, covering essential commands and workflows.

Size: 27.3 KB - Last synced at: 5 months ago - Pushed at: 11 months ago - Stars: 1 - Forks: 0

aws-samples/amazon-sagemaker-experiments-dvc-demo

SageMaker Experiments and DVC

Language: Jupyter Notebook - Size: 1.15 MB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 15 - Forks: 2

Michael95-m/simple_demo_dvc

Demonstration of how to use DVC (Data Version Control)

Language: Jupyter Notebook - Size: 412 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

martysai/artificial-text-detection

Python framework for artificial text detection: NLP approaches to compare natural text against text generated by neural networks.

Language: Python - Size: 262 KB - Last synced at: 29 days ago - Pushed at: almost 2 years ago - Stars: 15 - Forks: 1

ejhusom/d2m

A machine learning pipeline taking you from raw data to a fully trained machine learning model - from data to model (d2m).

Language: Python - Size: 29.8 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 5 - Forks: 1

VineetKT/ML_fastapi_on_Heroku_CI-CD

Deploying a Machine Learning Model on Heroku with FastAPI using CI/CD tools such as GitHub Actions and Heroku Automatic Deployment.

Language: Jupyter Notebook - Size: 4.91 MB - Last synced at: 12 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 2

abdmuffid/DVC-Basics

In this repository, an ML-Ops task is undertaken to practice configuring and storing data using DVC on GitHub. The goal is to explore how DVC seamlessly integrates for efficient data management, enhancing reproducibility and scalability in machine learning workflows.

Size: 20.5 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

wrgl/wrgl

Git-like data versioning.

Language: Go - Size: 3.49 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 40 - Forks: 0

RuiFilipeCampos/git-datasets

Declaratively create, transform, manage and version ML datasets.

Language: Python - Size: 120 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

blaz-cerpnjak/dvc-git-example

DVC - Data Version Control Basics

Size: 4.88 KB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Ezzaldin97/Batch-Serving-ML-Pipeline

Create a robust, simple, efficient, and modern end-to-end ML batch serving pipeline using a set of modern open-source/free platforms and tools

Language: Python - Size: 1.67 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1

data-mill-cloud/mastro 📦

Metadata management in Go

Language: Go - Size: 187 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0

KalyanM45/Data-Version-Control-Demo

The provided demo project demonstrates the practical implementation and advantages of using DVC. It showcases how DVC simplifies data versioning and model versioning while working in tandem with Git to create a cohesive version control system tailored for data science projects.

Language: Python - Size: 67.5 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 3 - Forks: 0

NtreevSoft/Crema

Metadata server & client tools for game development

Language: C# - Size: 4.44 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 67 - Forks: 15

ClimateImpactLab/DataFS

An abstraction layer for data storage systems

Language: Python - Size: 920 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 2

lsjsj92/data_version_control

Practice with data version control (DVC)

Size: 1000 Bytes - Last synced at: 4 months ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

ericdasse28/dvc-test

Just to try out DVC

Size: 9.77 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

mlrepa/dvc-2-data-versioning

Lesson 2 tutorial: Versioning Data and Model for the ML REPA School course: Machine Learning experiments reproducibility and engineering with DVC

Language: Jupyter Notebook - Size: 6.49 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 4

DiegoBiagini/NatuReddit

Personal project aimed at developing an ML service that resembles a production system

Language: Python - Size: 55.6 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

wlandau/user-conf-2022

useR! 2022 talk

Language: HTML - Size: 2.41 MB - Last synced at: 4 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

zensors/droplet

A JSON-based format for working with machine learning data, with a focus on data interoperability.

Size: 1.69 MB - Last synced at: 7 months ago - Pushed at: about 3 years ago - Stars: 7 - Forks: 0

datopian/ckanext-versioning

Deprecated. See https://github.com/datopian/ckanext-versions. ⏰ CKAN extension providing data versioning (metadata and files) based on git and github.

Language: Python - Size: 385 KB - Last synced at: 27 days ago - Pushed at: over 4 years ago - Stars: 7 - Forks: 4