GitHub topics: datasets-preparation
HiagoFF/oneclick-image-downloader-extension
Chrome extension to download images with one click, saving time on image dataset creation.
Language: JavaScript - Size: 167 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

Beejixx/oneclick-image-downloader-extension
Chrome extension to download images with one click, saving time on image dataset creation.
Language: JavaScript - Size: 167 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1 - Forks: 0

Sytheflay1/oneclick-image-downloader-extension
Chrome extension to download images with one click, saving time on image dataset creation.
Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 2 - Forks: 0

nmicovic/katachi
Katachi is a Python framework for validating and processing hierarchical directory structures using YAML-based schemas. It ensures your folders and files follow expected shapes, naming rules, and relationships—before any processing begins. Use it to enforce structure, catch issues early, and keep your data pipelines reliable.
Language: Python - Size: 2.54 MB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 0 - Forks: 0

AndyTheFactory/newspaper4k
📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.
Language: HTML - Size: 24.5 MB - Last synced at: 29 days ago - Pushed at: 3 months ago - Stars: 772 - Forks: 81

franklinkemta/oneclick-image-downloader-extension
Chrome extension to download images with one click, saving time on image dataset creation.
Language: JavaScript - Size: 4.02 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 6 - Forks: 1

edobranchi/PokeTCG_downloader
Automatic image downloader for Pokemon cards
Language: Python - Size: 19.1 MB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 0

visual-layer/visuallayer
Simplify Your Visual Data Ops. Find and visualize issues with your computer vision datasets such as duplicates, anomalies, data leakage, mislabels and others.
Language: Jupyter Notebook - Size: 90.1 MB - Last synced at: 14 days ago - Pushed at: about 1 month ago - Stars: 69 - Forks: 3

karmazinoleh/hackEmotion-hackathon
Website for organising and collecting emotion datasets with a smart validation system
Language: Java - Size: 78.6 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

ProGamerGov/powershell-dataset-tools
Language: PowerShell - Size: 108 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0

basillicus/traincraft
Atomic Dataset Generator for training ML potentials
Language: Python - Size: 105 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0

reverseame/MALVADA
MALVADA: Malware Execution Traces Dataset generation.
Language: Python - Size: 37.4 MB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 3 - Forks: 2

birenkamdar/actigraphy
For actigraphy CSV files downloaded from Philips devices. This Stata do-file bulk-imports, appends, and organizes variables from any number of CSV files to generate a clean file ready for analysis.
Language: Stata - Size: 15.6 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Sanyamjin/BRIDGE_ANAMOLY_DETECTION
Developed a machine learning model for SpectoV for an internship second screening round. Generated a dataset with temperature, strain, and vibration as features and anomaly as the class.
Language: Jupyter Notebook - Size: 38.1 KB - Last synced at: 4 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

yevh/anonymizer
Anonymize sensitive data in your datasets.
Language: Python - Size: 1.16 MB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 12 - Forks: 1

sbl-sdsc/kg-import
kg-import automates the ingestion of heterogeneous datasets into a Knowledge Graph.
Language: Jupyter Notebook - Size: 902 KB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 6 - Forks: 4

nicolay-r/arekit-ss
Low-resource context relation sampler for contexts with relations, for fact-checking and fine-tuning your LLMs, powered by AREkit
Language: Python - Size: 2 MB - Last synced at: about 1 month ago - Pushed at: 6 months ago - Stars: 3 - Forks: 0

hasanirtiza/Pedestron
[Pedestron] Generalizable Pedestrian Detection: The Elephant In The Room. @ CVPR2021
Language: Python - Size: 64.8 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 689 - Forks: 157

0ssamaak0/DLTA-AI
Data Labeling, Tracking and Annotation with AI
Language: Python - Size: 233 MB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 309 - Forks: 39

rishiswethan/ExtractSegmentationHabitat
This repo helps people who have trouble extracting segmentation images and masks from replica and matterport3d-habitat
Language: Python - Size: 17.6 KB - Last synced at: 9 days ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Dartvauder/NeuroTrainerWebUI
(Windows/Linux) Local WebUI for fine-tuning, evaluation, and generation with neural network models (LLM and StableDiffusion) in Python (Gradio interface)
Language: Python - Size: 1.15 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

nicolay-r/SemEval2024-Task3
The supplementary service over the THoR Chain-of-Thought framework, as part of the SemEval-2024 Task 3 paper
Language: Python - Size: 39.1 KB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

serp-ai/datasets
Datasets
Size: 6.67 MB - Last synced at: about 1 month ago - Pushed at: almost 2 years ago - Stars: 7 - Forks: 1

vivesweb/row-math-ml-csv
Check row data from a CSV to extract the number and percentage of empty, null, na, and nan values, and extract the type of each value (string, numeric, date, ip, empty, null, na, nan). Count(empty cols), percentage(empty cols), zero values, ....
Language: PHP - Size: 174 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

vivesweb/cli-graph-ml
PHP CLI for visualizing machine learning datasets as bar graphs. Detect outliers. See your data before training
Language: PHP - Size: 434 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 2

banda-larga/dataset-editor
Conversations / Instructions Editor
Language: Python - Size: 5.86 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

stellar-gen-ai/stellar-dataset
Official Code for the dataset exploration of Stellar: Systematic Evaluation of Human-Centric Personalized Text-to-Image Methods
Language: Jupyter Notebook - Size: 991 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1

AymenBOUGUERRA/Tool-for-making-noisy-images-and-their-masks-for-Unet-appliation-
While working on a Unet project, I created a program that adds noise, a random grid (textbook), and a random shade of grey to images. Depending on the variation, this tool outputs one of two image pairs: the first variation outputs the noisy image itself and the clear one (this gave better results with the Unet application), while the second variation outputs the noisy image and the noise as its mask
Language: Python - Size: 5.86 KB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

lyooyl/AVADatasetMake
Make a custom AVADataset dataset.
Language: Python - Size: 56 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

lucadiliello/asnq-challenging
ASNQ without trivial negative answers.
Language: Python - Size: 13.7 KB - Last synced at: 1 day ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

dennis-n-schneider/datasets
A plugin for the judo project, enabling reproducible dataset management.
Language: Makefile - Size: 19.5 KB - Last synced at: about 1 year ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

MostHappyCougar/HDF5ImageMarker
Utility for making datasets of images and the point coordinates that the user has marked up on those images
Language: Python - Size: 1.9 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

HSaurabh0919/tresta
Tresta contains implementations related to heuristics, reinforcement learning, and graph-based learning
Language: Jupyter Notebook - Size: 1.19 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

FelosRG/Herramientas-Proyecto-Solar
Tools and libraries for downloading and manipulating GOES-16 satellite data and solar radiation data, as well as a script for generating datasets with both types of data for training machine learning models.
Language: Jupyter Notebook - Size: 38.4 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Praveen2795/Basic-Data-Analysis-Projects
This repository will contain multiple data analysis projects written in Python.
Language: Jupyter Notebook - Size: 14.9 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

amandascm/Databases-preprocessing
Preprocessing of widely known databases in Python
Language: Python - Size: 2.14 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

marizombie/bing-images-downloader
Simple Python app for downloading Bing images with the help of the Image Search API and Visual Search API; can be used for dataset preparation
Language: Python - Size: 19.5 KB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 1
