An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: dataset-creation

fxrhan/smart-media-sampler

Professional Python tool for intelligently selecting and copying media files with advanced filtering, performance optimization, and resume capabilities. Perfect for dataset creation, content curation, and large-scale media management.

Language: Python - Size: 19.5 KB - Last synced at: about 3 hours ago - Pushed at: about 5 hours ago - Stars: 1 - Forks: 0

tanaos/synthex

Generate high-quality, large-scale synthetic datasets 📊🧪

Language: Python - Size: 149 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 0 - Forks: 0

muhammad-fiaz/cp-dataset-gui

A PyQt6 GUI for managing, visualizing, and analyzing competitive programming datasets.

Language: Python - Size: 154 KB - Last synced at: 4 days ago - Pushed at: 4 days ago - Stars: 1 - Forks: 0

sanyabeast/useful-scripts

A versatile collection of utility scripts for everyday tasks including image processing, file management, audio manipulation, and ML tools. Features cross-platform solutions with a focus on automation and productivity.

Language: JavaScript - Size: 7.76 MB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 0 - Forks: 0

Mindful-AI-Assistants/SP2024-Election-Analysis

📊 An analysis of voting patterns in São Paulo's 2024 elections, focusing on voter behavior, absenteeism, and geographic trends.

Language: HTML - Size: 86.6 MB - Last synced at: 1 day ago - Pushed at: 5 days ago - Stars: 7 - Forks: 3

Particle1904/DatasetHelpers

Dataset Helper program to automatically select, rescale, and tag datasets (composed of images and text) for machine learning training.

Language: C# - Size: 3.07 MB - Last synced at: 8 days ago - Pushed at: 9 days ago - Stars: 205 - Forks: 11

odoma-ch/quagga

Quagga – question answering over graphs

Language: HTML - Size: 1.28 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 1 - Forks: 0

chadsr/marktplaats-scraper

Marktplaats.nl (Dutch Classifieds) Listing Scraper

Language: Python - Size: 502 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 5 - Forks: 1

ylogx/aesthetics

Image Aesthetics Toolkit - includes Fisher Vector implementation, AVA (Image Aesthetic Visual Analysis) dataset and fast multi-threaded downloader

Language: Python - Size: 4.17 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 227 - Forks: 54

gops77/cp-dataset-gui

🖥️ Create and manage datasets visually with cp-dataset-gui, streamlining your data handling process for efficient project development.

Language: Python - Size: 142 KB - Last synced at: 17 days ago - Pushed at: 17 days ago - Stars: 0 - Forks: 0

ynop/audiomate

Python library for handling audio datasets.

Language: Python - Size: 9.07 MB - Last synced at: 3 days ago - Pushed at: about 2 years ago - Stars: 138 - Forks: 28

D-Ogi/WatermarkRemover-AI

AI-Powered Watermark Remover using Florence-2 and LaMA Models: A Python application leveraging state-of-the-art deep learning models to effectively remove watermarks from images with a user-friendly PyQt6 interface.

Language: Python - Size: 63.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 659 - Forks: 104

viktor-shcherb/fact-annotation

Build lightweight knowledge graphs from text in minutes. Annotate entities, link facts, and visualize relationships — all in one Streamlit app, with Google Sheets integration.

Language: Python - Size: 584 KB - Last synced at: 10 days ago - Pushed at: 4 months ago - Stars: 3 - Forks: 0

AmmarkoV/RGBDAcquisition

A uniform library wrapper for input from V4L2, Freenect, OpenNI, OpenNI2, DepthSense, Intel Realsense, OpenGL simulations, and other types of video and depth input.

Language: C - Size: 19.6 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 57 - Forks: 12

gruporaia/TTS-AutoTuning

Pipeline for automatic fine-tuning of Text to Speech models.

Language: Python - Size: 2.65 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

sysrqmagician/quicklabel

Simple Text-2-Image datasetting tool for hobbyists

Language: Rust - Size: 285 KB - Last synced at: 24 days ago - Pushed at: 2 months ago - Stars: 2 - Forks: 1

DFKI-NI/syclops

Syclops is a tool for creating synthetic data from 3D virtual environments with photorealistic renderings and pixel-perfect annotations.

Language: Python - Size: 29.6 MB - Last synced at: 7 days ago - Pushed at: 2 months ago - Stars: 12 - Forks: 2

rakki194/yipyap

A dataset editor from the future!

Language: TypeScript - Size: 4.32 MB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 2 - Forks: 0

ProGamerGov/dream-creator

Quickly and easily create / train a custom DeepDream model

Language: Python - Size: 44 MB - Last synced at: 2 days ago - Pushed at: about 3 years ago - Stars: 67 - Forks: 6

tanaos/synthex-python

Generate high-quality, large-scale synthetic datasets 📊🧪

Language: Python - Size: 276 KB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 3 - Forks: 1

theallyprompts/PixelPruner

PixelPruner is a user-friendly image cropping app for AI-generated art. It supports PNG, JPG, JPEG, and WEBP formats. Easily crop, preview, and manage images with interactive previews, thumbnail views, rotation tools, and customizable output folders. Streamline your workflow and achieve perfect crops every time with PixelPruner.

Language: Python - Size: 126 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 32 - Forks: 3

K-sel/sapiens

Data visualization project using d3.js. The chosen theme is the history of humankind, with the book Sapiens as the source. The dataset will be created from scratch by the team for the needs of the project.

Language: JavaScript - Size: 11.9 MB - Last synced at: 14 days ago - Pushed at: 4 months ago - Stars: 2 - Forks: 0

usernam3/shopify-app-store-scraper

Crawler behind the Shopify App Marketplace dataset

Language: Python - Size: 68.4 KB - Last synced at: 4 months ago - Pushed at: 9 months ago - Stars: 75 - Forks: 22

ahmedbesbes/dataset-builder

A script to help you quickly build custom computer vision datasets

Language: Python - Size: 14.5 MB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 35 - Forks: 6

VeyDlin/SankakuParser

The script for parsing sankakucomplex

Language: Python - Size: 656 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 5 - Forks: 2

apereiracv/cr-plates-generator

Costa Rican license plate dataset generator

Language: Python - Size: 1.79 MB - Last synced at: 28 days ago - Pushed at: almost 6 years ago - Stars: 13 - Forks: 5

michaelscutari/protclust

protclust is a Python library for protein sequence analysis that integrates MMseqs2 for fast clustering and provides tools for creating robust machine learning datasets. It offers cluster-aware data splitting to prevent sequence similarity bias in model evaluation, along with comprehensive protein embedding capabilities for feature generation.

Language: Python - Size: 354 KB - Last synced at: 13 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

maya-mp/WMATA_Analysis

This project analyzes WMATA Metro ridership trends to optimize transit planning and business strategies through data-driven insights.

Language: Jupyter Notebook - Size: 43.4 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0

benjaminvdb/DBRD

110k Dutch Book Reviews Dataset for Sentiment Analysis

Language: Python - Size: 34.2 KB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 29 - Forks: 3

amineHorseman/images-web-crawler

This package is a complete tool for creating a large dataset of images (designed especially, but not only, for machine learning enthusiasts). It can crawl the web, download images, rename / resize / convert the images, and merge folders.

Language: Python - Size: 39.1 KB - Last synced at: 5 months ago - Pushed at: about 7 years ago - Stars: 105 - Forks: 24

JadynHax/scpscraper

A Python library designed for scraping data from the SCP wiki.

Language: Python - Size: 216 KB - Last synced at: 2 months ago - Pushed at: almost 5 years ago - Stars: 15 - Forks: 4

varungupta31/dashcam_anonymizer

Code to Blur Human Faces and Vehicle License Plates in Video and Images using a SoTA Object Detection model YOLOv8

Language: Python - Size: 124 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 37 - Forks: 6

mosesab/Categorize-News-Headlines-With-Word-Embeddings

A simple project that creates a dataset of news headlines with Primary Category, Secondary Category, Date, Day, Month, Year, Sentiment, SentimentPolarity, Emotion, and Url. All news headlines are scraped from the Punch newspaper and sorted into a CSV file.

Language: Python - Size: 709 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

alac/txt_to_dataset

Turn txt files into an instruction dataset, using Oobabooga's text generation webui to add metadata.

Language: Python - Size: 823 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 0

Aravinda89/google_image_downloader

google image downloader

Language: Jupyter Notebook - Size: 32.2 KB - Last synced at: 6 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

philipperemy/japanese-street-addresses-scraper

Scraper for Japanese street addresses (住所).

Language: Python - Size: 7.02 MB - Last synced at: 4 months ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 2

alexzgu/karaoke_subtitle_dataset

Generates timestamped vocal tokens from color-formatted WebVTT files.

Language: Python - Size: 910 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

YomnaEskander/NAS-A-News-Analysis-Chatbot-

We developed a system to streamline the news consumption process, providing a seamless and smooth user experience. Using a chatbot interface, the system delivers analyzed and summarized daily news from trusted sources.

Language: Jupyter Notebook - Size: 2.84 MB - Last synced at: 5 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

ArchAngelAries/TagScribeR

A tool to streamline AI image captioning

Language: Python - Size: 190 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 0

tam0w/poverty_data

Attempting to analyse and estimate poverty indicators at the Indian district level. First ever district level dataset with a poverty indicator.

Language: Jupyter Notebook - Size: 184 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 0

dtflare/GPTparser

Use GPTparser with your OpenAI API to scrape & parse files into structured JSON files.

Language: Python - Size: 173 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 0

khirotaka/tartare 📦

Tartare: Make homebrew image dataset for machine learning.

Language: Python - Size: 2.74 MB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

esencgr/Python_Scripts_Projects

Data Extraction & Dataset Creation & Data Scraping

Language: Python - Size: 6.34 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 2

wITTus/catenc

Category/Label encoder for the shell written in Rust.

Language: Rust - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

1fmusic/2021PCOR-ML-AI Fork of onc-healthit/2021PCOR-ML-AI

Through this project, ONC in partnership with National Institutes of Health (NIH) National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), advanced the application of AI/ML in patient-centered outcomes research (PCOR) by generating high quality training datasets for a chronic kidney disease (CKD) use case – predicting mortality within the first 90 days of dialysis.

Language: Jupyter Notebook - Size: 20.9 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

bharathsudharsan/COVID-away

Code for paper 'Avoid touching your face: A hand-to-face 3d motion dataset (covid-away) and trained models for smartwatches'

Language: Python - Size: 102 MB - Last synced at: 3 months ago - Pushed at: about 3 years ago - Stars: 18 - Forks: 4

MJ10/Unix-Project

This repository contains code for the Project 'Image Scraper in BASH'

Language: Shell - Size: 521 KB - Last synced at: 6 months ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 1

codiceSpaghetti/T4SA-2.0

This project creates the T4SA 2.0 dataset, i.e. a big set of data to train visual models for Sentiment Analysis in the Twitter domain using a cross-modal student-teacher approach.

Language: Jupyter Notebook - Size: 2.73 GB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

CristianTuretta/DDoS-Network-Flow-Forensics-Analyser-

We are developing a tool to analyse recorded network traffic in order to detect and investigate source IP addresses that may have contributed to a DDoS UDP flood attack. The tool also generates sample pcap datasets.

Language: Python - Size: 637 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 9 - Forks: 2

skywalker023/sodaverse

🥤🧑🏻‍🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization"

Language: Python - Size: 1000 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 176 - Forks: 8

jazibdawre/DatasetCreator

Script for creating a dataset for AI, ML applications

Language: Python - Size: 72.9 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

DebeshJha/Kvasir-SEG

Kvasir-SEG: A Segmented Polyp Dataset

Size: 648 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

silenterus/deepspeech-cleaner

Multi-Language Dataset Cleaner/Creator for Mozilla's DeepSpeech Framework

Language: Python - Size: 389 KB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 47 - Forks: 7

krisbolton/tweet-annotation-tool 📦

Annotate tweets with sentiment scores to create sentiment analysis datasets.

Language: Python - Size: 111 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1

aqaqsubin/mmtod-pc

Multimodal TOD for Psychiatric Counseling

Language: Jupyter Notebook - Size: 1.04 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

AkshatM/BoxOfficeMojo-Data

Parsing utility to convert BoxOfficeMojo-like tables to pandas DataFrame objects for analysis

Language: Python - Size: 415 KB - Last synced at: almost 2 years ago - Pushed at: over 10 years ago - Stars: 2 - Forks: 2

bazukas/soyla

Simple terminal application to record speech datasets

Language: Python - Size: 84 KB - Last synced at: 3 months ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

sinjoysaha/Disney-Movies-Wiki-WebScraper

Web Scraping Wikipedia for Disney Movies to create a Disney Movies dataset and then cleaning the data to perform further Data Analysis using the cleaned JSON

Language: Jupyter Notebook - Size: 1.83 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

calista-ai/crowdsourcing-app

A Web Application to collect data from pairwise image comparisons via crowdsourcing

Language: JavaScript - Size: 3.17 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 2

StepanTita/news-contest

This repository contains Jupyter notebooks detailing the experiments conducted in our research paper on Ukrainian news classification. We introduce a framework for simple classification dataset creation with minimal labeling effort, and further compare several pretrained models for the Ukrainian language.

Language: Jupyter Notebook - Size: 698 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

p-minier/neuroblastoma

Final Study Project on Neuroblastoma Classification with Convolutional Neural Networks

Language: Python - Size: 12 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

phe-sto/google-images-download Fork of hardikvasa/google-images-download

Python 3 script and API for crawling Google Images to create a giant image dataset.

Language: Python - Size: 246 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 0

babua/TTSDatasetRecorder

A simple app for recording speech datasets.

Language: Python - Size: 302 KB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 20 - Forks: 3

windj007/docato

DOCument lAbeling TOol - a simple web-based appearance-aware tool to label text documents for information extraction

Language: JavaScript - Size: 1.27 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 4 - Forks: 1

rodrigo-barraza/inscriptor

Blip 2 Captioning, Mass Captioning, Question Answering, and other tools.

Language: Jupyter Notebook - Size: 491 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

nkb-tech/dataset-collection

Framework to collect dataset in COCO format for images/videos using pretrained neural networks

Language: Python - Size: 121 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

fatemenajafi135/Irony-detection

Persian irony detection: includes a Persian dataset, automatic dataset creation, and fine-tuning of transformer-based language models for the task.

Language: Python - Size: 1.89 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

sukrutrao/crowdsourced-data-simulator

A program that simulates answers given by a crowd to multiple choice questions with either a single or multiple answers correct, and writes it to a CSV

Language: Python - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

michaelhasey/Archi_Base

A dataset creation tool to aggregate, sort and label large volumes of architectural imagery.

Language: Jupyter Notebook - Size: 925 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

nonetrix/booru2trainingdata

This script takes a Danbooru ID and puts it in a format best suited for training or fine-tuning anime txt2img models.

Language: Python - Size: 35.2 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

Conchylicultor/ClickAndCrop

A simple Qt program to easily extract and label samples from videos. Used for dataset creation.

Language: C++ - Size: 568 KB - Last synced at: 4 months ago - Pushed at: over 8 years ago - Stars: 12 - Forks: 1

vusec/pandacap

A framework for streamlining the capture of PANDA execution traces.

Language: Shell - Size: 177 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 55 - Forks: 3

mhwasil/automatic_dataset_collection_and_annotation

A package for automatic dataset collection, annotation, and semantic label generation using ROS. Supported dataset formats are VOC and KITTI.

Size: 1000 Bytes - Last synced at: over 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

Roar-Network/roar-dataset

Artificial dataset of user-roar-rating records for training all machine learning models to be used in Roar Network. Roar-dataset is based on Yahoo Answers Topic.

Language: Jupyter Notebook - Size: 35.2 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

ibiscp/Synthetic-Plants

Dataset augmentation with Generative Adversarial Network for crop/weed segmentation

Language: Python - Size: 6.65 MB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 12 - Forks: 3

ElsevierSoftwareX/SOFTX-D-20-00065 Fork of alessandrosebastianelli/SentinelDataDownloaderTool

A set of scripts that use the Google Earth Engine Python API for the automatic creation of datasets for AI applications. To cite this Original Software Publication: https://www.sciencedirect.com/science/article/pii/S2352711021000728

Size: 331 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

LeviBorodenko/dortmund2array

Tool to convert datasets from "Benchmark Data Sets for Graph Kernels" (K. Kersting et al., 2016) into a format suitable for deep learning research.

Language: Python - Size: 25.4 KB - Last synced at: 10 days ago - Pushed at: over 5 years ago - Stars: 2 - Forks: 0

vudat081299/RecordFastProject

A very fast recorder for iOS. Creates sound data for training models. Exports .wav files by default.

Language: Swift - Size: 137 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

BakingBrains/Movie-Genre-Classifier

This model is based on a dataset I created. The original dataset is here: https://drive.google.com/file/d/1iQV5kKF_KGZL9ALx9MMXk_Lg7PklBLCE/view

Language: Jupyter Notebook - Size: 4.62 MB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

rbsathish/Bulk-Image-Resizer

Bulk image resizer with GUI

Language: Python - Size: 3.91 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

sahilparekh/coco2017-tfrecord

Create selective class tfrecord from coco2017 dataset

Language: Python - Size: 140 KB - Last synced at: 10 months ago - Pushed at: almost 6 years ago - Stars: 2 - Forks: 0

mthompson64/dsci510_final_project

DSCI 510 Final Project - Explore the relationship between median income in neighborhoods (quantified by ZIP code), the number of electric vehicle charging locations, and the number of vegan/ vegetarian restaurants in the neighborhood. Currently limited to California only.

Language: Jupyter Notebook - Size: 499 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Adiprogrammer7/Disney_movies_dataset_creation

Collecting data from Wikipedia and OMDB, plus some data polishing for the sake of dataset creation :)

Language: Jupyter Notebook - Size: 285 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

UBC-NLP/ara_emotion_naacl2018

This repository provides our datasets for Arabic emotion detection in Twitter

Size: 1.5 MB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 8 - Forks: 3

syreal17/Chimera

(WIP) Create tens of binaries from GitHub projects with the same compiler flags

Language: Shell - Size: 15.6 KB - Last synced at: 6 days ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 2

seralexger/filmaffinity-scraper

Unofficial class for scraping and collecting info from FilmAffinity

Language: Python - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 2

klangner/dataset-recorder

iOS application for creating datasets for Machine Learning projects

Language: Swift - Size: 20.9 MB - Last synced at: over 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

tianhaoz95/capstone

A set of tools to generate and label datasets from academic papers

Language: Python - Size: 93.8 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Related Keywords
dataset-creation (88), dataset (24), dataset-generation (22), python (20), machine-learning (19), computer-vision (9), data-science (8), python3 (7), image-processing (7), datasets (6), deep-learning (6), data (5), dataset-manager (5), nlp (5), image-classification (5), ai (5), dataset-generator (4), crowdsourcing (4), scraping (3), gui (3), dataset-filtering (3), scraper (3), selenium (3), image (3), web-scraping (3), beautifulsoup (3), webscraping (3), synthetic-data (3), data-analysis (3), transformers (3), image-recognition (2), instance-segmentation (2), knowledge-graph (2), web-scraper (2), detection (2), sentiment-analysis (2), opencv (2), google-images-downloader (2), segmentation (2), images (2), data-collection (2), big-data (2), webscraper (2), text-to-speech (2), tts (2), speech-recognition (2), labeling-tool (2), ios-app (2), corpus-tools (2), audio-datasets (2), cnn (2), license-plate-recognition (2), dataset-analysis (2), stable-diffusion (2), visualization (2), uv (2), leetcode-solutions (2), leetcode (2), huggingface (2), hf-datasets (2), hackerrank (2), geeksforgeeks (2), datasets-preparation (2), dataset-collection (2), cp-dataset-gui (2), neural-networks (2), data-mining (2), data-engineering (2), data-analytics (2), image-captioning (2), pretrained-models (2), file-management (2), wikipedia (2), data-cleaning (2), scraper-engine (2), data-scraping (2), google-images (2), crawler (2), image-segmentation (2), beautifulsoup4 (2), image-manipulation (2), docker (2), classification (2), jupyter-notebook (1), pig-latin (1), speech-to-text (1), pig (1), task-oriented-dialogue (1), multimodal (1), udp-flood (1), chatgpt (1), commonsense (1), pca-analysis (1), dialogue (1), mapreduce (1), dialogue-generation (1), jupyter (1), polyp-segmentation-task (1), polyp-segmentation (1), polyp-detection (1)