An open API service providing repository metadata for many open source software ecosystems.

Topic: "synthetic-data-generation"

nucleuscloud/neosync

Open Source Data Security Platform for Developers to Monitor and Detect PII, Anonymize Production Data and Sync it across environments.

Language: Go - Size: 165 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 3,847 - Forks: 154

sdv-dev/SDV

Synthetic data generation for tabular data

Language: Python - Size: 31 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 2,597 - Forks: 331

sdv-dev/CTGAN

Conditional GAN for generating synthetic tabular data.

Language: Python - Size: 1.82 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 1,372 - Forks: 307

sdv-dev/Copulas

A library to model multivariate data using copulas.

Language: Python - Size: 27.5 MB - Last synced at: 9 days ago - Pushed at: 18 days ago - Stars: 585 - Forks: 116

mostly-ai/mostlyai

Synthetic Data SDK ✨

Language: Python - Size: 12.6 MB - Last synced at: about 17 hours ago - Pushed at: 3 days ago - Stars: 417 - Forks: 32

microsoft/genalog

Genalog is an open source, cross-platform Python package that generates synthetic document images with custom degradations and text alignment capabilities.

Language: Jupyter Notebook - Size: 14.6 MB - Last synced at: 1 day ago - Pushed at: over 1 year ago - Stars: 322 - Forks: 32

Unity-Technologies/PeopleSansPeople

Unity's privacy-preserving human-centric synthetic data generator

Language: C# - Size: 446 MB - Last synced at: 9 days ago - Pushed at: about 1 year ago - Stars: 310 - Forks: 35

fjxmlzn/DoppelGANger

[IMC 2020 (Best Paper Finalist)] Using GANs for Sharing Networked Time Series Data: Challenges, Initial Promise, and Open Questions

Language: Python - Size: 67.4 KB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 300 - Forks: 75

sdv-dev/DeepEcho

Synthetic Data Generation for mixed-type, multivariate time series.

Language: Python - Size: 755 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 111 - Forks: 15

ing-bank/INGenious

INGenious Playwright Studio

Language: Java - Size: 11.7 MB - Last synced at: 16 days ago - Pushed at: 21 days ago - Stars: 90 - Forks: 27

netsharecmu/NetShare

(SIGCOMM '22) Practical GAN-based Synthetic IP Header Trace Generation using NetShare

Language: Python - Size: 4.29 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 58 - Forks: 17

Graph-COM/GraphMaker

[TMLR] GraphMaker: Can Diffusion Models Generate Large Attributed Graphs?

Language: Python - Size: 1.86 MB - Last synced at: 5 days ago - Pushed at: 5 days ago - Stars: 56 - Forks: 7

microsoft/CodeMixed-Text-Generator

This tool helps automatically generate grammatically valid synthetic code-mixed data by utilizing linguistic theories such as Equivalence Constraint Theory and Matrix Language Theory.

Language: Jupyter Notebook - Size: 3.79 MB - Last synced at: 1 day ago - Pushed at: 9 months ago - Stars: 54 - Forks: 12

mostly-ai/mostlyai-engine

Synthetic Data Engine 💎

Language: Python - Size: 2.01 MB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 53 - Forks: 2

openraven/mockingbird

A toolset to test data classification engines that generates mock data in various file formats, sizes and data profiles.

Language: Python - Size: 564 KB - Last synced at: 19 days ago - Pushed at: over 1 year ago - Stars: 43 - Forks: 6

Unity-Technologies/AnthroNet

Unity's Privacy-Preserving Novel Human Body Model Trained Solely on Synthetic Data and Corresponding Dense Anthropometric Measurements

Language: Rich Text Format - Size: 158 MB - Last synced at: 10 days ago - Pushed at: over 1 year ago - Stars: 35 - Forks: 1

ritaranx/ClinGen

[ACL 2024 Findings] This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models".

Language: Python - Size: 663 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 28 - Forks: 1

keishihara/flow-matching

Flow Matching implemented in PyTorch

Language: Python - Size: 307 MB - Last synced at: 7 days ago - Pushed at: 3 months ago - Stars: 27 - Forks: 3

dannylee1020/openpo

Language: Python - Size: 10.7 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 27 - Forks: 0

kkyuhun94/dalda

[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling

Language: Python - Size: 948 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 26 - Forks: 4

gongouveia/Whisper-Synthetic-ASR-Dataset-Generator

This UI serves as a Synthetic ASR Dataset Generator powered by/for OpenAI Whisper, enabling users to capture audio, transcribe it on the fly, and manage the generated dataset 🤗. Fine-tune Whisper on enhanced and custom datasets.

Language: Python - Size: 1.57 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 23 - Forks: 0

codezakh/DataEnvGym

A testbed for agents and environments that can automatically improve models through data generation.

Language: Python - Size: 9.16 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 19 - Forks: 5

VCL3D/BlenderScripts

Scripts for data generation using Blender and 3D datasets like Matterport3D.

Language: Python - Size: 37.1 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 16 - Forks: 3

Graph-COM/LayerDAG

[ICLR 2025 Spotlight] LayerDAG: A Layerwise Autoregressive Diffusion Model of Directed Acyclic Graphs

Language: Python - Size: 1.24 MB - Last synced at: 17 days ago - Pushed at: 3 months ago - Stars: 15 - Forks: 0

OmarSamirz/ImageFromTextGenerator

IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, apply over 10 built-in noise effects, and customize fonts and layouts. IFTG supports all languages and offers endless noise combinations, including custom noise creation.

Language: Python - Size: 15.2 MB - Last synced at: 18 days ago - Pushed at: 18 days ago - Stars: 14 - Forks: 1

apple/ml-interactive-data-augmentation

Interactive Data Augmentation (CHI 2025)

Language: Svelte - Size: 73 MB - Last synced at: 3 days ago - Pushed at: about 1 month ago - Stars: 12 - Forks: 1

stefanrmmr/differentially_private_synthetic_data

Differentially Private Synthetic Data Generation [DP-SDG] - Experimental Setups & Knowledge Base - WORK IN PROGRESS

Language: Jupyter Notebook - Size: 5.23 MB - Last synced at: 19 days ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 2

SidharthMacherla/conjurer

R Package to generate synthetic data.

Language: R - Size: 211 KB - Last synced at: 5 months ago - Pushed at: 10 months ago - Stars: 9 - Forks: 4

jpdefrutos/DDMR

3D image registration training framework using adaptive loss weighting and synthetic data generation

Language: Python - Size: 413 KB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 8 - Forks: 2

aliseyfi75/COSCI-GAN

Codebase for "Generating multivariate time series with COmmon Source CoordInated GAN (COSCI-GAN)"

Language: Jupyter Notebook - Size: 93.8 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 8 - Forks: 2

arya-upm/mVARbox

mVARbox is a MATLAB toolbox for uni/multivariate data series analysis in both time/space and frequency domains, with a focus on multivariate autoregressive (VAR) models

Language: MATLAB - Size: 32.6 MB - Last synced at: 8 days ago - Pushed at: 9 months ago - Stars: 7 - Forks: 0

jaabmar/private-pgd

Implementation for the paper "Privacy-preserving data release leveraging optimal transport and particle gradient descent"

Language: Python - Size: 26.2 MB - Last synced at: 9 days ago - Pushed at: 2 months ago - Stars: 6 - Forks: 1

PMBio/Health-Privacy-Challenge

The starter kit for the CAMDA 2025 Health Privacy Challenge.

Language: Python - Size: 43 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 5

dannylee1020/pyper

Synthetic data generation for LLM instruction tuning

Language: Python - Size: 142 KB - Last synced at: 3 days ago - Pushed at: 3 months ago - Stars: 4 - Forks: 0

AvaAvarai/Dynamic_Coordinates_Vis_System

Build visual machine learning models with multidimensional general line coordinate visualizations by interactive classification and synthetic data generation tools.

Language: Python - Size: 30.4 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 2

marzekan/WCGAN-GP

TensorFlow 2 implementation of Wasserstein Conditional GAN with Gradient Penalty (WCGAN-GP) for synthetic data generation

Language: Jupyter Notebook - Size: 103 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 2

starfishdata/starfish

Synthetic data generation to fuel AI models

Language: Python - Size: 683 KB - Last synced at: 3 days ago - Pushed at: 4 days ago - Stars: 3 - Forks: 0

TNO-SDG/tabular.eval.utility_metrics

TNO PET Lab - Synthetic Data Generation (SDG) - Tabular - Evaluation - Utility Metrics

Language: Python - Size: 280 KB - Last synced at: 9 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

shaadclt/Ragas-Synthetic-Test-Data-Generation

This project demonstrates how to generate synthetic test data for Retrieval Augmented Generation (RAG) using Ragas.

Language: Jupyter Notebook - Size: 102 KB - Last synced at: 12 days ago - Pushed at: 8 months ago - Stars: 3 - Forks: 0

ShendoxParadox/Few-shot-satellite-image-classification-OPS-SAT

Few-shot satellite image classification for bringing deep learning on board OPS-SAT

Language: Python - Size: 3.39 GB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 3 - Forks: 0

lparolari/harlequin

Code and DataLoader for the Harlequin dataset 🎨 described in the paper "Harlequin: Color-driven Generation of Synthetic Data for Referring Expression Comprehension", presented at ICPR'24

Language: Python - Size: 3.42 MB - Last synced at: 2 days ago - Pushed at: 5 months ago - Stars: 2 - Forks: 0

mtnmunuklu/logen

Generates synthetic logs for Sigma rules

Language: Go - Size: 66.4 KB - Last synced at: 10 months ago - Pushed at: about 1 year ago - Stars: 2 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-Markov-Chain-Monte-Carlo

Synthetic Data Generation by Markov Chain Monte Carlo (MCMC)

Language: MATLAB - Size: 5.86 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-Sequential-Monte-Carlo

Synthetic Data Generation by Sequential Monte Carlo (SMC)

Language: MATLAB - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-of-Body-Motion-Data-by-Neural-Gas-Network-for-Emotion-Recognition

Synthetic Data Generation of Body Motion Data by Neural Gas Network for Emotion Recognition

Language: Python - Size: 14.5 MB - Last synced at: 15 days ago - Pushed at: 15 days ago - Stars: 1 - Forks: 1

starfishdata/starfish-cli

A powerful synthetic Q&A data generation CLI that enables AI-driven dataset creation for ML training, research, and automation.

Language: Python - Size: 20.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-Supervised-Neural-Gas-Network

Synthetic Data Generation by Supervised Neural Gas Network for Physiological Emotion Recognition Data

Language: Python - Size: 2.23 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

SundareshSankaran/SDG---SMOTE-Synthetic-Data-Generation

Upstream repository for a custom step to generate synthetic data based on an input table, using the Synthetic Minority Oversampling TEchnique (SMOTE). SMOTE is an oversampling technique which identifies new data observations in the neighborhood of closely associated original observations.

Language: SAS - Size: 465 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 0
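
The SMOTE idea described in the entry above (new observations placed in the neighborhood of closely associated original observations) can be sketched in a few lines. This is a minimal illustration in plain Python, not the code of the listed SAS repository: the function name `smote`, its parameters, and the sample points are all hypothetical, and it assumes purely numeric features with Euclidean distance.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Return n_new synthetic points, each interpolated between a randomly
    chosen minority point and one of its k nearest minority neighbours.

    Hypothetical helper for illustration only.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class observations with two numeric features.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=5)
```

Because every synthetic point is a convex combination of two original points, it always lies on the line segment between them, which is why the generated data stays inside the region already occupied by the minority class.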

patrickamadeus/vqa-nle-llava

Novel approach that leverages LVLMs to efficiently generate high-quality synthetic VQA-NLE datasets.

Language: Python - Size: 29.1 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

R-N/ml-utility-loss

Integrating Machine Learning Utility in Tabular Data Synthesizer Training using Loss Function Learning

Language: Python - Size: 1.68 GB - Last synced at: 21 days ago - Pushed at: 9 months ago - Stars: 1 - Forks: 0

HoomanRamezani/drone-defect-detection

Temporal + CNN vision model for classification of windmill defects, with Unreal Engine data generation and a custom data augmentation suite

Language: Python - Size: 1.18 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

an-seunghwan/synthesizers

Implementations of various synthesizers with PyTorch.

Language: Python - Size: 3.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

JJavierRosales/scapy

Machine Learning Python library for Spacecraft Conjunction Assessment optimisation.

Language: Jupyter Notebook - Size: 122 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

TNO-SDG/graph.gen.graphbin

TNO PET Lab - Synthetic Data Generation (SDG) - Graph - Generation - GraphBin

Language: Python - Size: 2.66 MB - Last synced at: 8 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-ShallowNN

Synthetic Data Generation by ShallowNN

Language: MATLAB - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-Differential-Evolution

Synthetic Data Generation by Differential Evolution (DE)

Size: 3.91 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-by-Kernel-Density-Estimation

Synthetic Data Generation by Kernel Density Estimation (KDE)

Language: MATLAB - Size: 3.91 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/SDG_by_GMDH

Synthetic Data Generation (SDG) by Group Method of Data Handling (GMDH)

Language: MATLAB - Size: 4.88 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/SDG_by_ARX

Synthetic Data Generation (SDG) by nonlinear AutoRegressive with eXogenous input (ARX) model

Language: MATLAB - Size: 19.5 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/SDGbySMOTE

Synthetic Minority Over-sampling Technique (SMOTE) for Synthetic Data Generation (SDG)

Language: MATLAB - Size: 7.81 KB - Last synced at: 7 months ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/SDGVanillaGAN

Synthetic Data Generation (SDG) Using Vanilla GAN

Language: MATLAB - Size: 9.72 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

SeyedMuhammadHosseinMousavi/Synthetic-Data-Generation-SDG-by-Gaussian-Mixture-Model-GMM-Distribution

Synthetic Data Generation (SDG) by Gaussian Mixture Model (GMM) Distribution

Language: MATLAB - Size: 3.91 KB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

markbader/midi-spline-interpolation

Generates midi notes to join two different midi snippets by interpolation with a cubic spline curve

Language: Python - Size: 94.7 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

AlgoMathITMO/SynEvaRecSimulator

The public repo of experiments from paper "Performance Ranking of Recommender Systems on Simulated Data" by Stavinova et al.

Language: Jupyter Notebook - Size: 35.2 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

A-Bak/fingerprint-wgan

Wasserstein Generative Adversarial Network (WGAN) with Gradient Penalty (GP) for generation of synthetic diseased fingerprints.

Language: Jupyter Notebook - Size: 10.1 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

LLNL/SYNDATA

SYNDATA software includes a suite of statistical/machine learning models to generate discrete/categorical synthetic data.

Language: Python - Size: 34.2 KB - Last synced at: 10 days ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

Replicon-genetics/rg_exploder_shared

Python code for generating synthetic sequence data: DNASEQ and RNASEQ reads for use as standards in genomics data analysis pipelines

Language: Python - Size: 209 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

ioanacretu97/MC-GAN_Multichannel_Signal_Generation

Synthesis of multimodal cardiological signals using a conditional Wasserstein generative adversarial network.

Language: Jupyter Notebook - Size: 154 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

josericodata/SyntheticDataGeneratorApp

Generate and download free synthetic datasets instantly! A Streamlit app with built-in statistical validation tools like Chi-Square and Mutual Information.

Language: Python - Size: 7.41 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

HamzaEzzRa/Synthetic-Dice

Unity environment for generating synthetic images of dice to train detection/classification algorithms.

Language: C# - Size: 6.13 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

villacampaporta/synthetic-dielectric-data-gen

🚀 Synthetic Data Generation for Dielectric Characterization using Machine Learning | TVAE & CTGAN for Data Augmentation in Sensor Applications

Language: Jupyter Notebook - Size: 44.8 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

an-seunghwan/MaCoDE

Official implementation of 'Masked Language Modeling Becomes Conditional Density Estimation for Tabular Data Synthesis' (MaCoDE) with PyTorch.

Language: Jupyter Notebook - Size: 1.28 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Arif264Shaik/Ola-Performance-Analysis

This repository analyzes ride performance data for Ola using a synthetic dataset generated with ChatGPT. Leveraging MySQL, Power BI, and Excel, the project reveals insights into bookings, cancellations, customer ratings, and driver performance, supporting data-driven decision-making in the ride-hailing industry.

Language: Jupyter Notebook - Size: 3.8 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

yongchoooon/dalda (fork of kkyuhun94/dalda)

[ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling

Language: Python - Size: 948 KB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

Luckilyeee/Solar-Flare-Prediction-through-Time-Series-Data-Augmentation

Solar Flare Prediction through Time Series Data Augmentation

Language: Python - Size: 22.4 MB - Last synced at: 6 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

CatSatOK/Prophets-of-Profit-Evaluating-Synthetic-Data-Techniques-in-Financial-Forecasting-Models

A comparative investigation into the use of WGAN-GP, CTGAN, TimeGAN and DoppelGANger for generating synthetic financial time series data for use in forecasting models

Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

baew-seattleu/SDGnE

Synthetic Data Generation and Evaluation

Language: Jupyter Notebook - Size: 3.92 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

sodascience/workshop_syntheticdata_osf2022

Files for the synthetic data presentation at the Open Science Festival 2022

Size: 2.28 MB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

an-seunghwan/DistVAE

Official PyTorch implementation of the NeurIPS 2023 paper "Distributional Learning of Variational AutoEncoder: Application to Synthetic Data Generation"

Language: Python - Size: 41.4 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

nderus/Generative-models-Lesson

Parent project - Training School

Language: Jupyter Notebook - Size: 2.03 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

AlgoMathITMO/SynEvaRec (fork of vldpro/SynEvaRec)

A Framework for Evaluating Recommender Systems on Synthetic Data Classes

Language: Jupyter Notebook - Size: 68.7 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

AliValiyev/Utility-of-Synthetic-Data-in-Machine-learning-tasks.

In this repository, I tried to investigate the utility of synthetic data generated by DataSynthesizer and Synthetic Data Vault in machine learning tasks. I applied the Random Forest, Logistic Regression, Support Vector Machine, K-Nearest Neighbor, and Naive Bayes algorithms to the synthetic data and made a comparison.

Language: Python - Size: 2.12 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

Related Topics
synthetic-data (27), machine-learning (13), deep-learning (11), synthetic-dataset-generation (10), python (9), data-augmentation (8), gan (7), data-generation (7), generative-adversarial-network (7), time-series (6), privacy (5), sdg (5), generative-model (5), pytorch (5), differential-privacy (5), synthetic-data-generator (4), large-language-model (4), generative-ai (4), data-science (3), dataset-generation (3), tabular-data (3), graph-generation (3), smote (3), gans (3), fine-tuning (3), sdv (3), llm (3), large-language-models (3), tensorflow (2), graph-neural-networks (2), test-data-generator (2), generative-models (2), smote-sampling (2), human-activity-recognition (2), human-centric-ml (2), human-pose-estimation (2), unity3d (2), pet-lab (2), tno (2), distributional-learning (2), data-visualization (2), tabular (2), computer-vision (2), data-analysis (2), diffusion-model (2), unity (2), synthetic-datasets (2), object-detection (2), faker (2), docker (2), golang (2), applied-ml-research (2), vae (2), synthetic (2), autoregressive-models (2), data-classification (2), emotion-recognition (2), ddpm (2), synthetic-tabular-data (2), privacy-enhancing-technologies (2), simulation (2), ai (2), image-generation (2), recommender-system (2), evaluation (2), autoencoders (2), large-language-model-agent (1), billing-7054 (1), dummy-data-generator (1), r (1), applied-machine-learning (1), rpackage (1), statistics (1), clinical-research (1), privacy-protection (1), cli (1), llms (1), body-motion-data (1), data-scarcity (1), mathematics (1), multimodal-deep-learning (1), deep-neural-networks (1), reinforcement-learning-environments (1), ai-feedback (1), computervision (1), dpo (1), finetuning (1), huggingface (1), llm-evaluation (1), rlaif (1), rlhf (1), ner (1), ocr-recognition (1), synthetic-images (1), computer-graphics (1), text-alignment (1), referring-expression-comprehension (1), instruction-tuning (1), ubuntu (1), streamlit (1)