Topic: "synthetic-data"
stefan-jansen/machine-learning-for-trading
Code for Machine Learning for Algorithmic Trading, 2nd edition.
Language: Jupyter Notebook - Size: 652 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 14,872 - Forks: 4,601

modelscope/data-juicer
Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷
Language: Python - Size: 453 MB - Last synced at: about 9 hours ago - Pushed at: 3 days ago - Stars: 5,142 - Forks: 267

lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Language: Python - Size: 33.8 MB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 4,612 - Forks: 342

Kiln-AI/Kiln
The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
Language: Python - Size: 23 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 4,096 - Forks: 296

nucleuscloud/neosync
Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
Language: Go - Size: 184 MB - Last synced at: 23 days ago - Pushed at: 23 days ago - Stars: 3,910 - Forks: 167

DLR-RM/BlenderProc
A procedural Blender pipeline for photorealistic training image generation
Language: Python - Size: 91.6 MB - Last synced at: 4 days ago - Pushed at: 26 days ago - Stars: 3,200 - Forks: 482

sdv-dev/SDV
Synthetic data generation for tabular data
Language: Python - Size: 31.8 MB - Last synced at: 2 days ago - Pushed at: 3 days ago - Stars: 3,150 - Forks: 380

pgmpy/pgmpy
Python library for causal inference. Supports causal discovery, identification, effect estimation, prediction, and simulation with a scikit-learn style API.
Language: Python - Size: 13.4 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3,032 - Forks: 867

argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Language: Python - Size: 554 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 2,862 - Forks: 216

synthetichealth/synthea
Synthetic Patient Population Simulator
Language: Java - Size: 742 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 2,702 - Forks: 766

hitsz-ids/synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.
Language: Python - Size: 4.19 MB - Last synced at: 27 days ago - Pushed at: 28 days ago - Stars: 2,375 - Forks: 384

unrealcv/unrealcv
UnrealCV: Connecting Computer Vision to Unreal Engine
Language: C++ - Size: 18.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2,048 - Forks: 451

ydataai/ydata-synthetic
Synthetic data generators for tabular and time-series data
Language: Jupyter Notebook - Size: 16.3 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,570 - Forks: 252

GreenmaskIO/greenmask
PostgreSQL database anonymization and synthetic data generation tool
Language: Go - Size: 32.7 MB - Last synced at: 10 days ago - Pushed at: 10 days ago - Stars: 1,521 - Forks: 42

bespokelabsai/curator
Synthetic data curation for post-training and structured data extraction
Language: Python - Size: 62.6 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 1,487 - Forks: 120

shuttle-hq/synth
The Declarative Data Generator
Language: Rust - Size: 32.3 MB - Last synced at: 11 days ago - Pushed at: 12 months ago - Stars: 1,443 - Forks: 109

sdv-dev/CTGAN
Conditional GAN for generating synthetic tabular data.
Language: Python - Size: 1.84 MB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1,442 - Forks: 320

plurai-ai/intellagent
A framework for comprehensive diagnosis and optimization of agents using simulated, realistic synthetic interactions
Language: Python - Size: 14.2 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 1,071 - Forks: 133

datadreamer-dev/DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
Language: Python - Size: 895 KB - Last synced at: 12 days ago - Pushed at: 7 months ago - Stars: 1,052 - Forks: 53

huggingface/aisheets
Build, enrich, and transform datasets using AI models with no code
Language: TypeScript - Size: 1.7 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 968 - Forks: 93

BatsResearch/bonito
A lightweight library for generating synthetic instruction tuning datasets for your data without GPT.
Language: Python - Size: 796 KB - Last synced at: 22 days ago - Pushed at: about 2 months ago - Stars: 787 - Forks: 49

magpie-align/magpie
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!
Language: Python - Size: 1.08 MB - Last synced at: about 9 hours ago - Pushed at: 6 months ago - Stars: 765 - Forks: 69

Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
Size: 572 KB - Last synced at: 1 day ago - Pushed at: almost 2 years ago - Stars: 724 - Forks: 36

nicolas-hbt/pygraft
Configurable Generation of Synthetic Schemas and Knowledge Graphs at Your Fingertips
Language: Python - Size: 699 KB - Last synced at: 20 days ago - Pushed at: about 1 year ago - Stars: 688 - Forks: 45

jofpin/synthBTC
A tool that uses advanced Monte Carlo simulations and Turbit parallel processing to create possible Bitcoin prediction scenarios.
Language: JavaScript - Size: 6.46 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 669 - Forks: 403

gretelai/gretel-synthetics
Synthetic data generators for structured and unstructured text, featuring differentially private learning.
Language: Python - Size: 2.35 MB - Last synced at: 3 months ago - Pushed at: 6 months ago - Stars: 642 - Forks: 91

mostly-ai/mostlyai
Synthetic Data SDK ✨
Language: Python - Size: 14.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 633 - Forks: 55

SciPhi-AI/synthesizer 📦
A multi-purpose LLM framework for RAG and data creation.
Language: Python - Size: 31.5 MB - Last synced at: 7 days ago - Pushed at: over 1 year ago - Stars: 628 - Forks: 53

sdv-dev/Copulas
A library to model multivariate data using copulas.
Language: Python - Size: 30.5 MB - Last synced at: 5 days ago - Pushed at: 12 days ago - Stars: 608 - Forks: 117

paulbricman/thisrepositorydoesnotexist
A curated list of awesome projects which use Machine Learning to generate synthetic content.
Size: 34.2 KB - Last synced at: 6 days ago - Pushed at: over 2 years ago - Stars: 584 - Forks: 39

vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
Language: Python - Size: 6.8 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 571 - Forks: 76

sparkfish/augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Language: Python - Size: 254 MB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 455 - Forks: 56

lukehinds/promptwright
Generate large synthetic data
Language: Python - Size: 13.9 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 440 - Forks: 32

plaitpy/plaitpy
plait.py - a fake data modeler
Language: Python - Size: 1 MB - Last synced at: 19 days ago - Pushed at: over 6 years ago - Stars: 436 - Forks: 22

yandex-research/tab-ddpm
[ICML 2023] The official implementation of the paper "TabDDPM: Modelling Tabular Data with Diffusion Models"
Language: Python - Size: 183 KB - Last synced at: 6 months ago - Pushed at: about 1 year ago - Stars: 426 - Forks: 97

databrickslabs/dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
Language: Python - Size: 11.1 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 425 - Forks: 79

GeorgeCazenavette/mtt-distillation
Official code for our CVPR '22 paper "Dataset Distillation by Matching Training Trajectories"
Language: Python - Size: 38.6 MB - Last synced at: 4 months ago - Pushed at: about 1 year ago - Stars: 420 - Forks: 58

wenbowen123/iros20-6d-pose-tracking
[IROS 2020] se(3)-TrackNet: Data-driven 6D Pose Tracking by Calibrating Image Residuals in Synthetic Domains
Language: Python - Size: 84.8 MB - Last synced at: 4 months ago - Pushed at: about 2 years ago - Stars: 407 - Forks: 67

Unity-Technologies/SynthDet 📦
SynthDet - An end-to-end object detection pipeline using synthetic data
Language: C# - Size: 2.19 MB - Last synced at: 25 days ago - Pushed at: 9 months ago - Stars: 385 - Forks: 56

gszfwsb/NCFM
Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function: A Minmax Perspective" (NCFM) in CVPR 2025 (Highlight).
Language: Python - Size: 7.68 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 379 - Forks: 29

Data-Centric-AI-Community/awesome-data-centric-ai
Open-Source Software, Tutorials, and Research on Data-Centric AI 🤖
Language: Jupyter Notebook - Size: 6.73 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 338 - Forks: 47

microsoft/genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
Language: Jupyter Notebook - Size: 14.6 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 332 - Forks: 32

BMW-InnovationLab/BMW-Labeltool-Lite
This repository provides you with an easy-to-use labeling tool for State-of-the-art Deep Learning training purposes. It supports Auto-Labeling.
Language: C# - Size: 478 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 323 - Forks: 47

Nicholasli1995/EvoSkeleton
Official project website for the CVPR 2020 paper (Oral Presentation) "Cascaded deep monocular 3D human pose estimation with evolutionary training data"
Language: Python - Size: 17.1 MB - Last synced at: over 1 year ago - Pushed at: about 4 years ago - Stars: 323 - Forks: 43

tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
Language: Python - Size: 4.29 MB - Last synced at: 1 day ago - Pushed at: 2 months ago - Stars: 321 - Forks: 52

Unity-Technologies/Robotics-Object-Pose-Estimation
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.
Language: Python - Size: 38.6 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 321 - Forks: 78

Unity-Technologies/PeopleSansPeople
Unity's privacy-preserving human-centric synthetic data generator
Language: C# - Size: 446 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 314 - Forks: 35

ZumoLabs/zpy
Synthetic data for computer vision. An open source toolkit using Blender and Python.
Language: Python - Size: 29.3 MB - Last synced at: 10 days ago - Pushed at: almost 4 years ago - Stars: 313 - Forks: 34

milaan9/Clustering-Datasets
This repository contains the collection of UCI (real-life) datasets and Synthetic (artificial) datasets (with cluster labels and MATLAB files) ready to use with clustering algorithms.
Size: 99.2 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 312 - Forks: 236

tirthajyoti/pydbgen
Random dataframe and database table generator
Language: Python - Size: 687 KB - Last synced at: 3 months ago - Pushed at: about 4 years ago - Stars: 309 - Forks: 58

nickkunz/smogn
Synthetic Minority Over-Sampling Technique for Regression
Language: Python - Size: 730 KB - Last synced at: 12 months ago - Pushed at: over 1 year ago - Stars: 308 - Forks: 76

fjxmlzn/DoppelGANger
[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions
Language: Python - Size: 67.4 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 307 - Forks: 74

LinkedAi/flip
Synthetic Image generation with Flip. Generate thousands of new 2D images from a small batch of objects and backgrounds.
Language: Python - Size: 80.1 MB - Last synced at: 2 months ago - Pushed at: over 2 years ago - Stars: 307 - Forks: 35

davanstrien/awesome-synthetic-datasets
awesome synthetic (text) datasets
Language: Jupyter Notebook - Size: 188 KB - Last synced at: 12 days ago - Pushed at: 2 months ago - Stars: 295 - Forks: 12

sdv-dev/TGAN
Generative adversarial training for generating synthetic tabular data.
Language: Python - Size: 7.84 MB - Last synced at: 2 months ago - Pushed at: almost 3 years ago - Stars: 289 - Forks: 91

openxrlab/xrfeitoria
OpenXRLab Synthetic Data Rendering Toolbox
Language: Python - Size: 1.29 MB - Last synced at: 3 days ago - Pushed at: 7 days ago - Stars: 287 - Forks: 21

debidatta/syndata-generation
Code used to generate synthetic scenes and bounding box annotations for object detection. This was used to generate data used in the Cut, Paste and Learn paper
Language: Python - Size: 6.44 MB - Last synced at: almost 2 years ago - Pushed at: almost 5 years ago - Stars: 280 - Forks: 72

sdv-dev/SDGym
Benchmarking synthetic data generation methods.
Language: Python - Size: 3.15 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 277 - Forks: 63

expectedparrot/edsl
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Language: Python - Size: 128 MB - Last synced at: about 9 hours ago - Pushed at: about 11 hours ago - Stars: 271 - Forks: 26

kevinlin311tw/CDCL-human-part-segmentation
Repository for Paper: Cross-Domain Complementary Learning Using Pose for Multi-Person Part Segmentation (TCSVT20)
Language: Python - Size: 5.67 MB - Last synced at: almost 2 years ago - Pushed at: about 5 years ago - Stars: 255 - Forks: 43

sdv-dev/SDMetrics
Metrics to evaluate quality and efficacy of synthetic datasets.
Language: Python - Size: 3.06 MB - Last synced at: 25 days ago - Pushed at: 25 days ago - Stars: 243 - Forks: 50

KodCode-AI/kodcode
✨ A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions - all in one framework
Language: Python - Size: 40.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 243 - Forks: 13

Project-AgML/AgML
AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.
Language: Python - Size: 212 MB - Last synced at: 6 days ago - Pushed at: 5 months ago - Stars: 235 - Forks: 34

worldbank/REaLTabFormer
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
Language: Jupyter Notebook - Size: 13.4 MB - Last synced at: 22 days ago - Pushed at: about 2 months ago - Stars: 234 - Forks: 29

jrieke/shape-detection
🟣 Object detection of abstract shapes with neural networks
Language: Jupyter Notebook - Size: 1.12 MB - Last synced at: 7 days ago - Pushed at: almost 5 years ago - Stars: 221 - Forks: 126

ndrplz/surround_vehicles_awareness
Learn to map surrounding vehicles onto a bird's eye view of the scene.
Language: Python - Size: 6.12 MB - Last synced at: about 2 months ago - Pushed at: over 5 years ago - Stars: 210 - Forks: 71

firmai/datagene
DataGene - Identify How Similar TS Datasets Are to One Another (by @firmai)
Language: Jupyter Notebook - Size: 1.12 MB - Last synced at: about 22 hours ago - Pushed at: over 3 years ago - Stars: 205 - Forks: 24

statice/awesome-synthetic-data
A curated list of awesome synthetic data tools (open source and commercial).
Size: 8.79 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 199 - Forks: 28

TonicAI/masquerade
A Postgres Proxy to Mask Data in Realtime
Language: C# - Size: 84 KB - Last synced at: 15 days ago - Pushed at: over 1 year ago - Stars: 197 - Forks: 15

RichardObi/medigan
medigan - A Python Library of Pretrained Generative Models for Medical Image Synthesis
Language: Python - Size: 106 MB - Last synced at: 3 days ago - Pushed at: about 1 year ago - Stars: 180 - Forks: 21

rungalileo/agent-leaderboard
Ranking LLMs on agentic tasks
Language: Jupyter Notebook - Size: 16.6 MB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 178 - Forks: 18

AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations, visualizations, a collection of popular datasets) NeurIPS'24
Language: Python - Size: 8.63 MB - Last synced at: 14 days ago - Pushed at: about 2 months ago - Stars: 177 - Forks: 19

ku21fan/STR-Fewer-Labels
Scene Text Recognition (STR) methods trained with fewer real labels (CVPR 2021)
Language: Jupyter Notebook - Size: 1.61 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 166 - Forks: 26

zjrwtx/SFT-data-builder
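Use free LLM APIs together with your private-domain data to generate SFT training data (entirely free of charge). Supports the training data formats of tools such as llamafactory. synthetic data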
Language: JavaScript - Size: 502 KB - Last synced at: 4 months ago - Pushed at: 10 months ago - Stars: 161 - Forks: 17

Shuyu-XJTU/APTM
The official code of "Towards Unified Text-based Person Retrieval: A Large-scale Multi-Attribute and Language Search Benchmark"
Language: Python - Size: 2.32 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 160 - Forks: 14

rapiddweller/rapiddweller-benerator-ce
BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.
Language: Java - Size: 35.3 MB - Last synced at: about 1 month ago - Pushed at: 2 months ago - Stars: 150 - Forks: 26

MhLiao/SynthText3D
Project page of SynthText3D
Language: C++ - Size: 1.44 MB - Last synced at: 4 months ago - Pushed at: over 5 years ago - Stars: 145 - Forks: 23

DataformerAI/dataformer
Solving data for LLMs - Create quality synthetic datasets!
Language: Python - Size: 278 KB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 143 - Forks: 12

anton-jeran/FAST-RIR
This is the official implementation of our neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.
Language: Python - Size: 4.47 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 143 - Forks: 26

gist-ailab/uoais
Codes of paper "Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling", ICRA 2022
Language: Python - Size: 14.9 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 142 - Forks: 28

atapour/monocularDepth-Inference
Inference pipeline for the CVPR paper entitled "Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer" (http://atapour.co.uk/papers/atapour18monocular.pdf).
Language: Python - Size: 6.9 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 141 - Forks: 37

gretelai/awesome-synthetic-data
📖 A curated list of resources dedicated to synthetic data
Size: 40 KB - Last synced at: 11 days ago - Pushed at: about 3 years ago - Stars: 133 - Forks: 10

aimclub/BAMT
Repository of a data modeling and analysis tool based on Bayesian networks
Language: Python - Size: 106 MB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 132 - Forks: 21

allenai/pixmo-docs
ACL 2025: Synthetic data generation pipelines for text-rich images.
Language: Python - Size: 6.43 MB - Last synced at: 26 days ago - Pushed at: 6 months ago - Stars: 132 - Forks: 20

khawar-islam/diffuseMix
Official PyTorch implementation of DiffuseMix : Label-Preserving Data Augmentation with Diffusion Models (CVPR'2024)
Language: Python - Size: 1.74 MB - Last synced at: 24 days ago - Pushed at: 6 months ago - Stars: 121 - Forks: 8

sdv-dev/DeepEcho
Synthetic Data Generation for mixed-type, multivariate time series.
Language: Python - Size: 767 KB - Last synced at: 20 days ago - Pushed at: 28 days ago - Stars: 116 - Forks: 16

stefan-jansen/synthetic-data-for-finance
Material for QuantUniversity talk on Synthetic Data Generation for Finance.
Language: Jupyter Notebook - Size: 757 KB - Last synced at: 5 months ago - Pushed at: almost 5 years ago - Stars: 110 - Forks: 45

LiheYoung/FreeMask
[NeurIPS 2023] FreeMask: Synthetic Images with Dense Annotations Make Stronger Segmentation Models
Language: Python - Size: 13 MB - Last synced at: over 1 year ago - Pushed at: almost 2 years ago - Stars: 107 - Forks: 1

kirill-vish/Beyond-INet
Code for experiments for "ConvNet vs Transformer, Supervised vs CLIP: Beyond ImageNet Accuracy"
Language: Python - Size: 130 MB - Last synced at: 5 months ago - Pushed at: 12 months ago - Stars: 101 - Forks: 6

neurallambda/awesome-reasoning
a curated list of data for reasoning ai
Size: 89.8 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 101 - Forks: 5

microsoft/DPSDA
Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight]
Language: Python - Size: 9.54 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 100 - Forks: 14

barseghyanartur/faker-file
Create files with fake data. In many formats. With no effort.
Language: Python - Size: 2.57 MB - Last synced at: 2 months ago - Pushed at: 3 months ago - Stars: 94 - Forks: 6

firmai/mtss-gan 📦
MTSS-GAN: Multivariate Time Series Simulation with Generative Adversarial Networks (by @firmai)
Size: 3.62 MB - Last synced at: about 22 hours ago - Pushed at: almost 5 years ago - Stars: 94 - Forks: 30

Baukebrenninkmeijer/table-evaluator
Evaluate real and synthetic datasets against each other
Language: Python - Size: 9.51 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 92 - Forks: 28

sunchang0124/dp_cgans
A library to generate synthetic tabular or RDF data using Conditional Generative Adversary Networks (GANs) combined with Differential Privacy techniques.
Language: Python - Size: 266 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 92 - Forks: 27

ruirangerfan/Three-Filters-to-Normal
Three-Filters-to-Normal: An Accurate and Ultrafast Surface Normal Estimator (RAL+ICRA'21)
Language: C++ - Size: 85.3 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 91 - Forks: 14

Data-Centric-AI-Community/awesome-python-for-data-science
A curated list of awesome resources such as books, tutorials, courses, open-source libraries, exercises, and other materials that support Pythonistas in the making, and Pythonistas migrating into Data Science! 📊
Language: Jupyter Notebook - Size: 51.8 MB - Last synced at: 9 days ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 19

justchenhao/IAug_CDNet
Official Pytorch Implementation of Adversarial Instance Augmentation for Building Change Detection in Remote Sensing Images.
Language: Python - Size: 16.9 MB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 88 - Forks: 19

statice/anonymeter
A Unified Framework for Quantifying Privacy Risk in Synthetic Data according to the GDPR
Language: Jupyter Notebook - Size: 1.77 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 87 - Forks: 22

privateai/deid-examples
Example scripts that showcase how to use Private AI Text to de-identify, redact, hash, tokenize, mask and synthesize PII in text.
Language: Jupyter Notebook - Size: 37.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 81 - Forks: 1
