GitHub topics: duplicate-removal
FaNa-AI/preprocessing-Titanic
A Python script for cleaning the Titanic dataset by handling missing values, removing duplicates, and fixing data inconsistencies.
Size: 104 KB - Last synced at: 12 days ago - Pushed at: 12 days ago - Stars: 0 - Forks: 0

Jim-JMCD/DuplicateFF
Duplicate file finder, a small Linux app that produces reports on duplicate and unique files using SHA-256 checksums. Input can be one or more directories, with optional filters on maximum file size and parts of file names. Output is multiple CSV (spreadsheet) reports that can be used to move or delete duplicates. Runs on Linux and on Windows via WSL, MSYS2, or Git Bash.
Size: 31.3 KB - Last synced at: 18 days ago - Pushed at: 19 days ago - Stars: 1 - Forks: 0
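
As an illustration of the checksum approach described above, here is a minimal Python sketch that groups files by their SHA-256 digest. It is not taken from DuplicateFF; the function names `sha256_of` and `find_duplicates` and the `max_size` parameter are hypothetical, chosen only to mirror the description.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(directories, max_size=None):
    """Map each digest to the files sharing it, keeping only groups of 2+.

    max_size (hypothetical filter): skip files larger than this many bytes.
    """
    by_hash = defaultdict(list)
    for directory in directories:
        for path in sorted(Path(directory).rglob("*")):
            if not path.is_file():
                continue
            if max_size is not None and path.stat().st_size > max_size:
                continue
            by_hash[sha256_of(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Each value in the returned mapping is one group of byte-identical files; a report or delete step would keep one path per group and act on the rest.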

harshasrisri/dedup
Remove local files that are duplicates of files in another path
Language: Rust - Size: 103 KB - Last synced at: 1 day ago - Pushed at: 22 days ago - Stars: 1 - Forks: 0

Vasishta03/PixelPerfector
I have developed a working prototype of PixelPerfector. I tested it using sample images from my neighborhood, which included both duplicates and images with poor lighting conditions. I experimented with the thresholds to find the optimal values for accurate detection.
Language: Python - Size: 5.86 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

Killer242/csv-deduper
CSV-Deduper efficiently removes duplicate rows from a given CSV file.
Language: Python - Size: 25.4 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
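
For readers unfamiliar with the idea, removing duplicate CSV rows can be sketched with the standard library alone. This is an assumed approach, not CSV-Deduper's actual code; `dedupe_csv_rows` is a hypothetical name, and the sketch treats two rows as duplicates only when every field matches exactly.

```python
import csv


def dedupe_csv_rows(source, destination):
    """Copy rows from source to destination, dropping exact repeats.

    source and destination are open text file objects. The first
    occurrence of each row is kept, so the original order is preserved.
    Returns the number of rows written.
    """
    seen = set()
    writer = csv.writer(destination, lineterminator="\n")
    kept = 0
    for row in csv.reader(source):
        key = tuple(row)  # lists are unhashable, tuples are not
        if key in seen:
            continue
        seen.add(key)
        writer.writerow(row)
        kept += 1
    return kept
```

The set of seen rows grows with the number of distinct rows, so very large files might call for hashing each row or sorting on disk instead.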

MBit08/DoppelFiles
Program that finds duplicates within one or more designated folders and moves all copies to a destination folder. Usability improvements coming soon.
Language: Python - Size: 274 KB - Last synced at: 2 months ago - Pushed at: 2 months ago - Stars: 0 - Forks: 0

waived/duplicate-remover
Windows utility that checks a folder and its sub-folders for files with identical MD5 hashes and preps them for deletion.
Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

cr33p1ngp4ck3t/Data-Sweeper
Language: Python - Size: 10.7 KB - Last synced at: 3 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

Kilemonn/Duplicate-File-Remover
A command-line tool that takes input directories and creates an output directory containing only the unique files from those directories. Files are considered unique based on their content hash.
Language: Go - Size: 36.1 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 1 - Forks: 1

bakdata/dedupe
Java DSL for (online) deduplication
Language: Java - Size: 1.01 MB - Last synced at: 2 months ago - Pushed at: 5 months ago - Stars: 20 - Forks: 2

visiuun/Folder-duplicates-deleter
Bulk duplicate file deleter for directories, written in Python.
Language: Python - Size: 5.86 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 0 - Forks: 0

BaseMax/URLExtractify
URLExtractify is a lightweight, web-based tool to extract URLs from pasted content efficiently. It allows users to remove duplicates, sort URLs alphabetically, and customize URL extraction preferences through a simple, intuitive interface.
Language: JavaScript - Size: 103 KB - Last synced at: 3 days ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

softonus-io/prettier-plugin-duplicate-remover
A Prettier plugin that removes duplicate class names in class and className attributes, ensuring cleaner, more efficient code in frontend projects like React, Vue.js, and Angular.
Language: JavaScript - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: 7 months ago - Stars: 2 - Forks: 1

VMC10/Simple-Duplicate-Cleaner
A simple app written in Python to delete duplicate files
Language: Python - Size: 2.93 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

JakubJanowski/File-Duplicate-Finder
A small desktop application to help you organize disk space and backup folders by searching duplicated files by their content.
Language: C# - Size: 8.26 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

seanima9/SpotifyToYoutubeMusic
Converts Spotify playlists to YouTube playlists using their respective APIs.
Language: Python - Size: 75.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

danieldotwav/Remove-Duplicates-From-Sorted-Array-I-and-II
This Java program efficiently removes duplicate elements from a sorted array in-place, ensuring the original order of elements is maintained. It's designed to optimize space and time complexity while handling various array scenarios, including empty arrays and arrays with consecutive or non-consecutive duplicates.
Language: Java - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0
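
The in-place technique described above is commonly done with a read pointer and a write pointer. The sketch below is in Python rather than the repository's Java and is not the author's code; `remove_duplicates_sorted` and `max_repeats` are hypothetical names. It assumes the "I and II" in the repository name refers to keeping at most one and at most two copies of each value, respectively.

```python
def remove_duplicates_sorted(values, max_repeats=1):
    """Trim a sorted list in place so each value appears at most
    max_repeats times. Returns the new length.

    Uses O(1) extra space: `write` marks the end of the kept prefix,
    and a value is kept unless the element max_repeats positions
    back in that prefix is already equal to it.
    """
    write = 0
    for value in values:
        if write < max_repeats or value != values[write - max_repeats]:
            values[write] = value
            write += 1
    del values[write:]  # drop the leftover tail
    return write
```

Because the input is sorted, equal values are adjacent, which is what lets a single comparison against the kept prefix decide whether to keep each element.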

Robson-Teixeira/java-design-patterns-I-loja
Repository for the course Jornada do Conhecimento de Back-End Java (Intermediate Level) - Design Patterns in Java I: good programming practices, from the Alura platform.
Language: Java - Size: 25.4 KB - Last synced at: 12 days ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

EmreCanKURAN/DuplicateFileRemover
Removes the duplicate files.
Language: Python - Size: 4.88 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

Demez/duplicate_file_finder
Language: Python - Size: 18.6 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

cryham/py-dirt
Python command-line tool to find and delete duplicated files. It can also rename files with an added rating prefix. Has a few options, e.g. across subdirs, using hash, stats, etc.
Language: Python - Size: 80.1 KB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 1 - Forks: 0

mdhasnainali/Duplicate-Files-Finder
Discover and eliminate duplicate files effortlessly with my Duplicate Files Finder project. This Python-based tool efficiently scans any directory, pinpointing identical files to save disk space. Organize your data hassle-free and optimize storage utilization with this intuitive utility.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

CCBR/spacesavers
Optimizing disk space on Biowulf
Language: Python - Size: 807 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 5 - Forks: 0

BloodyToolzz/Bloody-Duplicates-Remover
Simple duplicates remover for text files with too much text.
Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

Bonniface/CleanData
New way of Cleaning Data in R.
Language: R - Size: 13 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

alessandrobelottidev/DuplicatesCleaner
Simple yet fast and powerful software, written in Python, to remove any kind of duplicated files in a folder and its subfolders. This repository includes the script and a GUI for a better user experience.
Language: Python - Size: 41 KB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

yolona-oss/rm-clones
Small duplicate remover based on the SHA-256 algorithm
Language: C - Size: 47.9 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

SSbit01/Duplicate-Remover
It's a simple Python program that searches and deletes all duplicate files in a folder
Language: Python - Size: 3.91 KB - Last synced at: over 2 years ago - Pushed at: almost 4 years ago - Stars: 0 - Forks: 0

silicontrip/yajdf
Yet another Java duplicates finder
Language: Java - Size: 29.3 KB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

chaitanya100100/B-Trees-Duplicate-Removal
Duplicate Removal from Database Relation using B-Trees and Hashing
Language: Python - Size: 51.8 KB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

dhruvraj-singh-rawat/Duplito
A machine learning model that identifies variations in names to recognize a unique person, and thereby deduplicates records coming from multiple sources.
Language: Jupyter Notebook - Size: 1.41 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0
