GitHub topics: synthetic-data
Francis-Calingo/Canadian-Rental-Prices-and-Immigration-ML-Predictive-Model
Language: Jupyter Notebook - Size: 8.26 MB - Last synced at: about 13 hours ago - Pushed at: about 14 hours ago - Stars: 0 - Forks: 1

mostly-ai/mostlyai
Synthetic Data SDK
Language: Python - Size: 14.1 MB - Last synced at: about 23 hours ago - Pushed at: about 23 hours ago - Stars: 569 - Forks: 45

mostly-ai/mostlyai-mock
Synthetic Data as You See Fit
Language: Python - Size: 716 KB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 6 - Forks: 2

expectedparrot/edsl
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Language: Python - Size: 124 MB - Last synced at: about 24 hours ago - Pushed at: 1 day ago - Stars: 252 - Forks: 25

lk-geimfari/mimesis
Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.
Language: Python - Size: 33.8 MB - Last synced at: about 13 hours ago - Pushed at: about 1 month ago - Stars: 4,589 - Forks: 341

sparkfish/augraphy
Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes
Language: Python - Size: 245 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 427 - Forks: 51

igor-olikh/syntetic-data-generator
A comprehensive toolkit for generating high-quality synthetic datasets using Meta's Llama Synthetic Data Kit. Supports PDFs, videos, documents & more for AI fine-tuning and testing.
Size: 393 KB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 0 - Forks: 0

DerwenAI/kleptosyn
Synthetic data generation for investigative graphs based on patterns of bad-actor tradecraft.
Language: Jupyter Notebook - Size: 1.88 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 6 - Forks: 0

microsoft/genalog
Genalog is an open source, cross-platform python package allowing generation of synthetic document images with custom degradations and text alignment capabilities.
Language: Jupyter Notebook - Size: 14.6 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 329 - Forks: 34

pgmpy/pgmpy
Python Library for Causal and Probabilistic Modeling using Bayesian Networks
Language: Python - Size: 13.1 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 2,981 - Forks: 859

sdv-dev/Copulas
A library to model multivariate data using copulas.
Language: Python - Size: 31.7 MB - Last synced at: about 19 hours ago - Pushed at: about 19 hours ago - Stars: 595 - Forks: 116

Renumics/awesome-open-data-centric-ai
Curated list of open source tooling for data-centric AI on unstructured data.
Size: 572 KB - Last synced at: about 2 hours ago - Pushed at: over 1 year ago - Stars: 718 - Forks: 36

synthetichealth/synthea
Synthetic Patient Population Simulator
Language: Java - Size: 742 MB - Last synced at: 1 day ago - Pushed at: 2 days ago - Stars: 2,581 - Forks: 741

allenmonkey970/ben10-synthetic-battles
This project builds on the Ben 10 Alien Universe Realistic Battle Dataset and adds a synthetic, expanded dataset for testing and analysis.
Language: Jupyter Notebook - Size: 3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

chaudharijeel9673/linux-syslog-insights
Explore "linux-syslog-insights" to gain valuable insights into Linux server activity through a custom Splunk dashboard. Analyze trends in authentication, detect brute-force attempts, and monitor CPU anomalies to enhance your system's security.
Language: Python - Size: 1.01 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 0 - Forks: 0

synthesizer-project/synthesizer
Synthesizer - a code for creating synthetic astrophysical observables
Language: Python - Size: 17.4 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 30 - Forks: 13

nucleuscloud/neosync
Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.
Language: Go - Size: 175 MB - Last synced at: 1 day ago - Pushed at: 1 day ago - Stars: 3,875 - Forks: 156

modelscope/data-juicer
Data processing for and with foundation models!
Language: Python - Size: 223 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 4,607 - Forks: 243

Kiln-AI/Kiln
The easiest tool for fine-tuning LLM models, synthetic data generation, and collaborating on datasets.
Language: Python - Size: 19.3 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 3,789 - Forks: 266

synthesized-io/tdk-demo
This is a collection of TDK demo projects that use different databases and options
Language: YAML - Size: 69.4 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 17 - Forks: 4

eggai-tech/qa-extraction-with-human-review
Question & answer extraction with human review
Language: Jupyter Notebook - Size: 7.98 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

sean-zw/SynthECG
This repository hosts advanced models for generating ECG signals using deep learning techniques. Contributions are welcome, so feel free to fork and submit your improvements!
Language: Python - Size: 11.7 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

dbt-labs/jaffle-shop-generator
A simple CLI for generating synthetic Jaffle Shop data.
Language: Python - Size: 6.5 MB - Last synced at: 2 days ago - Pushed at: 3 months ago - Stars: 40 - Forks: 9

Vini09-cpu/agentin
AI Agents for Technology Services
Size: 1000 Bytes - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

sdv-dev/SDV
Synthetic data generation for tabular data
Language: Python - Size: 31.6 MB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 3,026 - Forks: 365

Data-Centric-AI-Community/awesome-python-for-data-science
A curated list of awesome resources such as books, tutorials, courses, open-source libraries, exercises, and other materials that support Pythonistas in the making, and Pythonistas migrating into Data Science!
Language: Jupyter Notebook - Size: 51.8 MB - Last synced at: 1 day ago - Pushed at: about 1 year ago - Stars: 86 - Forks: 19

ahmad-alismail/LLM_based_Synthetic_Data_Generation
A curated and continuously updated collection of papers, tools, and datasets on synthetic data generation using LLMs and agentic workflows.
Size: 28.3 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 0 - Forks: 0

sdv-dev/CTGAN
Conditional GAN for generating synthetic tabular data.
Language: Python - Size: 1.83 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1,410 - Forks: 314

Deezpa/PyTorch-CreditScoring-ThinFile
A PyTorch-based deep learning extension to my PhD thesis on credit scoring of thin-file consumers.
Size: 7.81 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

vanderschaarlab/DECAF Fork of trentkyono/DECAF
DECAF: Generating Fair Synthetic Data Using Causally-Aware Generative Networks
Language: Python - Size: 35.2 KB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 20 - Forks: 10

ImJaeSung/Synthesizers
Implementations of various synthesizers with pytorch.
Language: Python - Size: 14.7 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 1 - Forks: 0

naomibaes/Synthetic-LSC_pipeline
Synthetic datasets to evaluate key dimensions of LSC (Sentiment, Intensity, Breadth), generated using LLMs and WordNet from the LSC-Eval framework.
Language: Jupyter Notebook - Size: 31.5 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 0 - Forks: 0

Project-AgML/AgML
AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.
Language: Python - Size: 212 MB - Last synced at: 6 days ago - Pushed at: 2 months ago - Stars: 228 - Forks: 33

gada17/synthetic-to-viewbinding-migrator
Convert Kotlin Android Fragments from synthetic imports to ViewBinding. Automate the migration of old fragment code to modern, type-safe view binding in Android projects.
Language: Python - Size: 10.7 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

privateai/pai-thin-client
A Python client for interacting with Private AI's API
Language: Python - Size: 736 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 22 - Forks: 3

argilla-io/distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Language: Python - Size: 554 MB - Last synced at: 4 days ago - Pushed at: 12 days ago - Stars: 2,755 - Forks: 205

Data-Centric-AI-Community/awesome-data-centric-ai
Open-Source Software, Tutorials, and Research on Data-Centric AI
Language: Jupyter Notebook - Size: 6.73 MB - Last synced at: about 20 hours ago - Pushed at: over 1 year ago - Stars: 337 - Forks: 46

DLR-RM/BlenderProc
A procedural Blender pipeline for photorealistic training image generation
Language: Python - Size: 96 MB - Last synced at: 5 days ago - Pushed at: 2 months ago - Stars: 3,110 - Forks: 466

tanaos/synthex-python
Generate high-quality, large-scale synthetic datasets
Language: Python - Size: 270 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 3 - Forks: 1

vanderschaarlab/synthcity
A library for generating and evaluating synthetic tabular data for privacy, fairness and data augmentation.
Language: Python - Size: 6.77 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 555 - Forks: 76

sdv-dev/SDMetrics
Metrics to evaluate quality and efficacy of synthetic datasets.
Language: Python - Size: 2.75 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 236 - Forks: 49

tdspora/syngen
Open-source version of the TDspora synthetic data generation algorithm.
Language: Jupyter Notebook - Size: 18.2 MB - Last synced at: 2 days ago - Pushed at: 2 days ago - Stars: 17 - Forks: 9

bespokelabsai/curator
Synthetic data curation for post-training and structured data extraction
Language: Python - Size: 62.6 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 1,391 - Forks: 109

mostly-ai/mostlyai-engine
Synthetic Data Engine
Language: Python - Size: 2.5 MB - Last synced at: 7 days ago - Pushed at: 8 days ago - Stars: 62 - Forks: 5

mostly-ai/mostlyai-qa
Synthetic Data Quality Assurance
Language: HTML - Size: 131 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 57 - Forks: 5

gretelai/gretel-python-client
The Gretel Python Client allows you to interact with the Gretel REST API.
Language: Python - Size: 31.1 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 56 - Forks: 19

GreenmaskIO/greenmask
PostgreSQL database anonymization and synthetic data generation tool
Language: Go - Size: 32.3 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1,437 - Forks: 35

sdv-dev/SDGym
Benchmarking synthetic data generation methods.
Language: Python - Size: 3.06 MB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 274 - Forks: 63

SchweizerischeBundesbahnen/SynPopToolbox
SynPopToolbox is a Python framework designed for analysis, visualization and manipulation of a synthetic population produced by the land-use simulation software FaLC (https://github.com/falc-sim-org/FaLC) and related subproducts. Contact: [email protected]
Language: Python - Size: 70.4 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 6 - Forks: 0

magpie-align/magpie
[ICLR 2025] Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing. Your efficient and high-quality synthetic data generation pipeline!
Language: Python - Size: 1.08 MB - Last synced at: 9 days ago - Pushed at: 3 months ago - Stars: 712 - Forks: 62

shuttle-hq/synth
The Declarative Data Generator
Language: Rust - Size: 32.3 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 1,418 - Forks: 108

openxrlab/xrfeitoria
OpenXRLab Synthetic Data Rendering Toolbox
Language: Python - Size: 1.28 MB - Last synced at: 6 days ago - Pushed at: 3 months ago - Stars: 281 - Forks: 20

vincentkoc/tiny_qa_benchmark_pp
Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultra-fast LLM smoke-testing & regression-catching.
Language: Python - Size: 306 KB - Last synced at: about 17 hours ago - Pushed at: about 1 month ago - Stars: 23 - Forks: 0

KI-AIM/Cinnamon
Cinnamon is a modular application designed to offer robust functionalities for data anonymization, synthetization, and evaluation.
Language: Java - Size: 40.5 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 21 - Forks: 1

gretelai/awesome-synthetic-data
A curated list of resources dedicated to synthetic data
Size: 40 KB - Last synced at: 10 days ago - Pushed at: almost 3 years ago - Stars: 131 - Forks: 10

SigVarGen/SigVarGen
SigVarGen is a Python framework for time-series signal generation, data augmentation, and anomaly simulation. It creates diverse 1D signal variants under controlled conditions, including idle-state, perturbed, and noisy signals.
Language: Jupyter Notebook - Size: 84.3 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 5 - Forks: 0

RichardObi/frd-score
Official implementation of the Fréchet Radiomics Distance | pip install frd-score
Language: Python - Size: 2.07 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 3 - Forks: 1

OllieBoyne/BlenderSynth
Synthetic Blender Dataset Production
Language: Python - Size: 34.9 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 76 - Forks: 7

harveybc/feature-extractor
Application for training an autoencoder for generating an encoder that can be used as feature extractor for dimensionality and noise reduction, while the decoder can be used for synthetic data generation. Supports dynamic plugin integration, allowing users to extend its capabilities by adding custom encoder and decoder models.
Language: Python - Size: 184 MB - Last synced at: about 8 hours ago - Pushed at: about 9 hours ago - Stars: 5 - Forks: 0

yashmaurya01/Awesome-ML-Privacy-Mitigations
A curated collection of privacy-preserving machine learning techniques, tools, and practical evaluations. Focuses on differential privacy, federated learning, secure computation, and synthetic data generation for implementing privacy in ML workflows.
Size: 146 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 2 - Forks: 0

tanaos/tanaos-docs
Documentation for our synthetic data generation SDKs and APIs
Language: TypeScript - Size: 2.46 MB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

pedrodevog/SynthECG
This repository provides the first systematic evaluation framework for synthetic 10-second 12-lead ECGs from diagnostic class-conditioned generative models.
Language: Python - Size: 12.7 KB - Last synced at: 7 days ago - Pushed at: 13 days ago - Stars: 1 - Forks: 0

davanstrien/awesome-synthetic-datasets
awesome synthetic (text) datasets
Language: Jupyter Notebook - Size: 184 KB - Last synced at: about 17 hours ago - Pushed at: 8 months ago - Stars: 282 - Forks: 11

data-catering/data-caterer Fork of pflooky/data-caterer
Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool.
Language: Scala - Size: 2.8 MB - Last synced at: 6 days ago - Pushed at: 6 days ago - Stars: 57 - Forks: 8

KodCode-AI/kodcode
A synthetic dataset generation framework that produces diverse coding questions and verifiable solutions, all in one framework
Language: Python - Size: 40.6 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 229 - Forks: 10

gszfwsb/NCFM
Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function: A Minmax Perspective" (NCFM) in CVPR 2025 (Highlight).
Language: Python - Size: 1.11 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 364 - Forks: 27

aimclub/BAMT
Repository of a data modeling and analysis tool based on Bayesian networks
Language: Python - Size: 106 MB - Last synced at: 5 days ago - Pushed at: about 1 month ago - Stars: 129 - Forks: 20

ndiwawan/qa-generator-with-human-review
QA Generator with Human Review: this repository lets you generate QA pairs from documents, incorporating a human review process through Label Studio. Track sources, filter quality, and export in multiple formats for effective dataset creation.
Language: Python - Size: 46.9 KB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

ajaykr2712/ML_DS
Daily Curated Open Source Learnings of ML
Language: Jupyter Notebook - Size: 73.9 MB - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 0 - Forks: 0

javi22020/CharacterGen
Tool to generate identity-consistent LoRA training data.
Language: Python - Size: 381 KB - Last synced at: 4 days ago - Pushed at: 10 days ago - Stars: 1 - Forks: 0

Shuyu-XJTU/SVTA
The official repo of "Towards Scalable Video Anomaly Retrieval: A Synthetic Video-Text Benchmark"
Size: 1000 Bytes - Last synced at: 16 days ago - Pushed at: 16 days ago - Stars: 1 - Forks: 0

SCAI-BIO/syndat
Synthetic data quality evaluation & visualization
Language: Python - Size: 188 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 2 - Forks: 0

mirpo/datamatic
Generate synthetic datasets using local LLMs via Ollama and LMstudio with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other major language models.
Language: Go - Size: 79.1 KB - Last synced at: about 17 hours ago - Pushed at: about 18 hours ago - Stars: 1 - Forks: 0

plaitpy/plaitpy
plait.py - a fake data modeler
Language: Python - Size: 1 MB - Last synced at: 16 days ago - Pushed at: over 6 years ago - Stars: 435 - Forks: 22

zjunlp/Knowledge2Data
Spatial Knowledge Graph-Guided Synthesis for Multimodal LLMs
Language: Python - Size: 1.51 MB - Last synced at: 8 days ago - Pushed at: 18 days ago - Stars: 2 - Forks: 0

sergio-sanz-rodriguez/Synthetic-To-Real-Object-Detection-Edition-2
Training object-detection deep learning models using 100% synthetic data.
Language: Python - Size: 22.4 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 0 - Forks: 0

sassoftware/dpmm
dpmm: a library for synthetic tabular data generation with rich functionality and end-to-end Differential Privacy guarantees
Language: Python - Size: 661 KB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 3 - Forks: 0

tabularis-ai/be_great
A novel approach for synthesizing tabular data using pretrained large language models
Language: Python - Size: 4.29 MB - Last synced at: 9 days ago - Pushed at: about 1 month ago - Stars: 312 - Forks: 52

SciPhi-AI/synthesizer
A multi-purpose LLM framework for RAG and data creation.
Language: Python - Size: 31.5 MB - Last synced at: 6 days ago - Pushed at: over 1 year ago - Stars: 626 - Forks: 53

unrealcv/unrealcv
UnrealCV: Connecting Computer Vision to Unreal Engine
Language: C++ - Size: 18.1 MB - Last synced at: 17 days ago - Pushed at: 2 months ago - Stars: 2,008 - Forks: 444

microsoft/DPSDA
Private Evolution: Generating DP Synthetic Data without Training [ICLR 2024, ICML 2024 Spotlight]
Language: Python - Size: 8.64 MB - Last synced at: 1 day ago - Pushed at: 23 days ago - Stars: 97 - Forks: 13

Goodbyefrog/synthetic-ping-data-generator
Modular Java application to generate synthetic user, device, and event data for data engineering pipelines and software testing.
Language: Java - Size: 29.3 KB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

databrickslabs/dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for test, POCs, and other uses in Databricks environments including in Delta Live Tables pipelines
Language: Python - Size: 11.1 MB - Last synced at: 12 days ago - Pushed at: about 1 month ago - Stars: 407 - Forks: 74

IDanK0/Deepseek-Dataset-Generator
Deepseek-Dataset-Generator creates conversational datasets for fine-tuning LLMs via the DeepSeek API. It supports several formats (ChatML, ShareGPT, Alpaca, JSON, CSV), simple configuration via YAML, and detailed logs. Ideal for quickly generating realistic, customized data.
Language: Python - Size: 165 KB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

srivathsan96/Splunk-Admin-Monitoring-Dashboard
Splunk project analyzing simulated Apache web logs to detect failing endpoints, access trends, slow APIs, suspicious patterns, and usage by device/browser. Includes complex SPL queries and visual storytelling.
Language: Python - Size: 997 KB - Last synced at: 19 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

jknafou/TransCorpus
TransCorpus is a scalable toolkit for large-scale, parallel translation and preprocessing of text corpora, built for language model pretraining and research.
Language: Python - Size: 5.91 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

worldbank/REaLTabFormer
A suite of auto-regressive and Seq2Seq (sequence-to-sequence) transformer models for tabular and relational synthetic data generation.
Language: Jupyter Notebook - Size: 12.3 MB - Last synced at: 16 days ago - Pushed at: 21 days ago - Stars: 228 - Forks: 28

ThomasRochefortB/open-agentinstruct
An open-source recreation of the AgentInstruct agentic workflow for synthetic data generation
Language: Python - Size: 372 KB - Last synced at: 3 days ago - Pushed at: 2 months ago - Stars: 16 - Forks: 0

sdv-dev/TGAN
Generative adversarial training for generating synthetic tabular data.
Language: Python - Size: 7.84 MB - Last synced at: 16 days ago - Pushed at: over 2 years ago - Stars: 290 - Forks: 91

starfishdata/starfish
Synthetic data generation to fuel AI models
Language: Python - Size: 14 MB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 30 - Forks: 1

RiccardoSenica/synthetic-consumer-data
Generate synthetic consumers and their weekly purchase history using AI. Create synthetic data with detailed profiles, shopping habits, and consistent spending patterns.
Language: TypeScript - Size: 350 KB - Last synced at: 21 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

tempo-sim/Tempo
The Tempo Unreal Engine plugins
Language: C++ - Size: 6.76 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 15 - Forks: 5

agr78/PRLx-GAN
Generative modeling and latent projection label denoising approach to create synthetic rim lesions on QSM
Language: Shell - Size: 7.12 MB - Last synced at: 3 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

martinkuhn94/PALSYN
PALSYN is a tool that generates privacy-preserving, process-oriented synthetic data using Autoregressive Sequence Models and differential privacy techniques.
Language: Python - Size: 11.8 MB - Last synced at: 15 days ago - Pushed at: 22 days ago - Stars: 1 - Forks: 2

intervene-EU-H2020/synthetic_data
Software program for generating synthetic datasets for genotypes and phenotypes
Language: Jupyter Notebook - Size: 82.2 MB - Last synced at: 14 days ago - Pushed at: about 2 years ago - Stars: 15 - Forks: 3

Baukebrenninkmeijer/table-evaluator
Evaluate real and synthetic datasets against each other
Language: Jupyter Notebook - Size: 7.21 MB - Last synced at: 17 days ago - Pushed at: 29 days ago - Stars: 89 - Forks: 28

SherAndrei/blender-gen-dataset
Generate synthetic datasets with Blender
Language: Python - Size: 2.6 MB - Last synced at: 11 days ago - Pushed at: 22 days ago - Stars: 0 - Forks: 0

rapiddweller/datamimic
Model-Driven test data generation platform enabling developers to create realistic, scalable, and privacy-compliant test data. Features model-driven data generation, GDPR compliance, and seamless Python integration.
Language: Python - Size: 14.3 MB - Last synced at: 22 days ago - Pushed at: 22 days ago - Stars: 25 - Forks: 2

roboflow/magic-scissors
Synthetic data for object detection and segmentation
Language: Python - Size: 877 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 3

firmai/mtss-gan
MTSS-GAN: Multivariate Time Series Simulation with Generative Adversarial Networks (by @firmai)
Size: 3.62 MB - Last synced at: 2 days ago - Pushed at: over 4 years ago - Stars: 94 - Forks: 30
