GitHub topics: near-duplicate-detection
justinbt1/Akin
Python library for detecting near-duplicate texts in a corpus at scale.
Language: Python - Size: 2.78 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 9 - Forks: 0
iscc/iscc-specs
ISCC: International Standard Content Code
Language: Python - Size: 6.74 MB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 48 - Forks: 9
kamil-sita/image-copy-finder
Multi-module project focused on near-duplicate search for images.
Language: Java - Size: 182 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1
Logan-Fouts/Thesis
Bachelor's Thesis on Near-Duplicate Image Detection. This repo contains all resources, code, and documentation developed during the process.
Language: Python - Size: 1.15 GB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 0 - Forks: 0
Luis-Varona/shadowseek
A CLI tool for near-duplicate detection in text files, written in Rust with no dependencies on runtime environments.
Language: Rust - Size: 30.3 KB - Last synced at: 4 months ago - Pushed at: 7 months ago - Stars: 1 - Forks: 0
LexTypeC/smlr
A Simple Image Clustering Script using CLIP and Hierarchical Clustering
Language: Python - Size: 25.4 KB - Last synced at: 4 months ago - Pushed at: over 2 years ago - Stars: 37 - Forks: 3
vitali-fedulov/images4
Image similarity in Golang. Version 4 (LATEST)
Language: Go - Size: 890 KB - Last synced at: 11 months ago - Pushed at: over 1 year ago - Stars: 89 - Forks: 10
s-emanuilov/LangVec
Language of Vectors (LangVec) is a simple Python library designed for transforming numerical vector data into a language-like structure using a predefined set of words (lexicon).
Language: Python - Size: 1 MB - Last synced at: 27 days ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0
SasheVuchkov/near-duplicate-docs
Simple library for finding duplicate and near-duplicate text documents in massive sets/libraries/databases
Language: TypeScript - Size: 2 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 9 - Forks: 0
vitali-fedulov/imagehash2
Fast image similarity search with hash tables (Golang). Version 2 (LATEST)
Language: Go - Size: 33.2 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 0
vitali-fedulov/imagehash
Fast image similarity search with hash tables (Golang). Version 1
Language: Go - Size: 43 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 3 - Forks: 1
MaviVestini/ADM-LT_HW1
First homework for the Advanced Data Mining course
Language: HTML - Size: 5.91 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1
santurini/Search-Engine-Evaluation-and-Near-Duplicate-Detection
Exploiting the PyTerrier library to perform Search Engine Evaluation and Near Duplicate Detection on different datasets.
Language: Jupyter Notebook - Size: 267 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0
giulio-derasmo/Search-Engine-Evaluation-and-Near-Duplicate-Detection
Exploiting the PyTerrier library to build a Search Engine and solve Near Duplicate Detection tasks.
Language: Jupyter Notebook - Size: 547 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 1
sayakpaul/near-dup-parser
Holds code for a near-duplicate image parser using optimized image classifiers.
Language: Jupyter Notebook - Size: 6.32 MB - Last synced at: 5 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1