Topic: "dataset-generator"
Dev-Tarek/sketched-webpages-generator
Customizable open-source software to generate randomized sketched web-pages.
Language: JavaScript - Size: 72.2 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 37 - Forks: 6

hetpandya/youtube_tts_data_generator
A Python library to generate speech datasets from YouTube videos
Language: Python - Size: 81.1 KB - Last synced at: 28 days ago - Pushed at: 11 months ago - Stars: 36 - Forks: 8

meyerls/PEGASUS
[IROS24] Official repository for "PEGASUS: Physically Enhanced Gaussian Splatting Simulation System for 6DoF Object Pose Dataset Generation"
Language: Python - Size: 6.33 MB - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 35 - Forks: 0

ATISLabs/SyntheticDatasets.jl
Collection of artificial data generators in Julia
Language: Julia - Size: 259 KB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 24 - Forks: 4

msorkhpar/wiki-entity-summarization
This repository hosts a comprehensive suite for generating graph-based entity summarization datasets from user-selected Wikipedia pages. Using a series of interconnected modules, it leverages Wikidata and Wikipedia dumps to construct a dataset, alongside auto-generated ground truths.
Language: Python - Size: 35.3 MB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 21 - Forks: 0

tombax7/FLITC-application
A data-driven, deep-learning-based fault diagnosis application for radial, active distribution grids
Language: Python - Size: 5.34 MB - Last synced at: 27 days ago - Pushed at: about 2 years ago - Stars: 21 - Forks: 4

Jaesung-Jun/Cut-And-Save-Faces
collect pictures
Language: Python - Size: 73.9 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 18 - Forks: 6

nileshprasad137/keystroke-dynamics-datagen
Generate dataset for keystroke timings for exploratory and research purposes.
Language: Python - Size: 1.49 MB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 15 - Forks: 4

alexppppp/synthetic-dataset-object-detection
How to Create Synthetic Dataset for Computer Vision (Object Detection) (Article on Medium)
Language: Jupyter Notebook - Size: 159 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 5

OmarSamirz/ImageFromTextGenerator
IFTG (ImageFromTextGenerator) is a Python package that simplifies creating robust datasets for OCR models. Generate images from text, apply over 10 built-in noise effects, and customize fonts and layouts. IFTG supports all languages and offers endless noise combinations, including custom noise creation.
Language: Python - Size: 15.2 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 14 - Forks: 1

fedecalendino/reddit-graph
Graph representation of Reddit
Language: Python - Size: 151 KB - Last synced at: 4 days ago - Pushed at: about 2 years ago - Stars: 10 - Forks: 0

M-Farag/rawbuilder
an elegant datasets factory
Language: Python - Size: 126 KB - Last synced at: 4 days ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 0

iwangjian/pyloader
🐳 PyLoader: An asynchronous Python dataloader for loading big datasets, supporting PyTorch and TensorFlow 2.x.
Language: Python - Size: 182 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 0

DolbyUUU/Sudoku4LLM
Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments.
Language: Python - Size: 29.3 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 4 - Forks: 0

christiangarcia0311/data-exploration-analysis
Data Exploration is the initial step in data analysis, where users explore a large data set in an unstructured way to uncover initial patterns, characteristics and points of interest.
Language: Python - Size: 118 KB - Last synced at: about 1 month ago - Pushed at: 5 months ago - Stars: 4 - Forks: 1

PatricioGuinle/CoffeMIDI
A MIDI Content-Based Recommendation System
Language: Jupyter Notebook - Size: 2.98 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 4 - Forks: 1

joshuaboud/gen-dataset
Command line tool to quickly generate a lot of files in a lot of directories
Language: C++ - Size: 267 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 4 - Forks: 0

leomaurodesenv/travel-dataset-generator
A tool to generate a synthetic dataset of corporate travel
Language: Jupyter Notebook - Size: 37.1 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 0

StarlangSoftware/DataGenerator-Py
Classification dataset generator library for high-level NLP tasks
Language: Python - Size: 105 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0

yas-sim/simple-annotation-toolkit
The simplest ROI annotation toolkit for object detection tasks
Language: Python - Size: 813 KB - Last synced at: about 2 months ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 1

ZEKE320/llm-dataset-generator
The LLM Dataset Generator is an open source tool for generating text data compatible with various language models supported by LangChain. You can customize it to meet your specific needs, making it a valuable resource for researchers, developers, and organizations working on NLP applications.
Language: Jupyter Notebook - Size: 558 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 2 - Forks: 0

ElsevierSoftwareX/SOFTX-D-20-00055 Fork of agsoto/webgenerator
An open-source software for synthetic web-based user interface and content dataset generation. To cite this Original Software Publication: https://www.sciencedirect.com/science/article/pii/S2352711022000073
Size: 33.8 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

JC-ProgJava/Handwritten-Digit-Dataset
A collection of 107,730 28x28 PNG files of digits from 0-9, with a dataset generator.
Size: 125 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 0

filiptronicek/dataset-creator Fork of ultralytics/flickr_scraper
Simple Flickr Image Scraper and compression script
Language: Python - Size: 38.1 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

realm-tech/docgen
A document generator used to fully create training and evaluation datasets for OCR applications
Language: Python - Size: 32.5 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

RohitMidha23/youtube-video-scraper
Unleash the power of YouTube with this efficient scraper - download videos with just a search query!
Language: Python - Size: 5.86 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

Spr-Aachen/Easy-DataSet-Creator-Tool-For-Image-Classification
A simple dataset creation tool for image classification, still a work in progress
Language: Jupyter Notebook - Size: 85 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

rbsathish/file_renamer_with_gui
Using this tool, you can rename the files in a selected directory. It is useful for ML dataset preparation and other uses, e.g. Frame_01, Image01
Language: Python - Size: 25.5 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

StarlangSoftware/DataGenerator-Cy
Classification dataset generator library for high-level NLP tasks
Language: Cython - Size: 72.3 KB - Last synced at: 8 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

serpo-dev/python-kandinsky-api
Script for bulk Kandinsky/FusionBrain AI image generation with API key rotation, auto-saving, rate-limiting, progress tracking, and failover handling; suited to datasets, content creation, and API testing
Language: Python - Size: 3.91 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

VendenIX/YoutubeDatasetGenerator
This repository provides a tool to create a dataset of images from a YouTube video by capturing one image every 10 seconds in 480p resolution.
Language: Shell - Size: 1000 Bytes - Last synced at: about 1 month ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

red-shock/Mass-Image-Downloader
A browser extension which allows you to download all images on a page as well as aggregate them.
Language: JavaScript - Size: 150 KB - Last synced at: 10 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

adeiskandarzulkarnaen/AksaraGenerator-Flutter
Dataset Generator for CNN Handwriting Recognition
Language: Dart - Size: 247 KB - Last synced at: about 1 month ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

dennis-barrett/dimdates-dot-com
Source code for the Kimball-style date dimension generator dimdates.com.
Language: JavaScript - Size: 839 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

JoshWarn/Multi-Label-Shapes-Toy-Dataset-Generator
An easy-to-use multi-label image dataset generator.
Language: Python - Size: 103 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

rodrigo-barraza/inscriptor
BLIP-2 captioning, mass captioning, question answering, and other tools.
Language: Jupyter Notebook - Size: 491 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

lnschroeder/carla-dataset-generator
A multi-modal video dataset generator using CARLA simulator.
Language: Python - Size: 195 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

hygull/kaggle_dataset_creator
A Python package that lets data scientists, data engineers, and data analysts create a dataset in CSV or JSON form, so that it can be either submitted to Kaggle's dataset collection or used with Pandas and similar tools.
Language: Python - Size: 46.9 KB - Last synced at: 23 days ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ale-ch/nyc-re-data
Create a dataset with sale price estimates for over 700k buildings in New York City.
Language: R - Size: 11.7 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

quickgrid/vision-tools
A bunch of CLI tools for deep learning and computer vision.
Language: Python - Size: 5.11 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

async-research/youtube-scraper
Scrape videos and video metadata from YouTube
Language: Python - Size: 13.7 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

jkee58/EMNIST-Detection Fork of tensorflow/models
EMNIST Detection built with TensorFlow
Language: Python - Size: 1.3 GB - Last synced at: 10 months ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

rbsathish/Bulk-Image-Resizer
Bulk image resizer with GUI
Language: Python - Size: 3.91 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

varunveeraa/face_extractor
Face extraction from webcam for training purposes
Language: Python - Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Pratap2018/Dataset_generator
Language: Python - Size: 6.84 KB - Last synced at: 2 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

anandhupvr/shapes Fork of cjpurackal/shapes
A dataset for validating computer vision models for classification, detection and segmentation before testing it out with real world datasets
Language: Python - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0
