GitHub topics: minhash-similarity
beowolx/rensa
High-performance MinHash implementation in Rust with Python bindings for efficient similarity estimation and deduplication of large datasets
Language: Python - Size: 112 KB - Last synced at: 20 days ago - Pushed at: 20 days ago - Stars: 80 - Forks: 9

shreyansh26/MinHash-Implemenation
A simple MinHash implementation based on the explanation in Stanford's Mining of Massive Datasets course
Language: Python - Size: 7.4 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

jio-gl/datahipsters
DataHipsters is a service that implements MinHash similarity on a key-value database (Google App Engine/GCloud), including an API for k-nearest neighbors (k-NN) as used in online recommender systems.
Language: Python - Size: 4.04 MB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

Barb02/MPEI_MovieLensProject
Interactive program for finding similar users and movies
Language: MATLAB - Size: 15.8 MB - Last synced at: 12 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

dynatrace-research/set-sketch-paper
SetSketch: Filling the Gap between MinHash and HyperLogLog
Language: C++ - Size: 23.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 46 - Forks: 5

AIn0n/FMHD
A collection of fast MinHash distance algorithms
Language: C++ - Size: 288 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 1
