GitHub topics: minwise-hashing
oertl/treeminhash
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Language: C++ - Size: 2.62 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

dynatrace-research/set-sketch-paper
SetSketch: Filling the Gap between MinHash and HyperLogLog
Language: C++ - Size: 23.7 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 46 - Forks: 5

oertl/bagminhash
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
Language: C++ - Size: 1.02 MB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 26 - Forks: 6

santurini/Search-Engine-Evaluation-and-Near-Duplicate-Detection
Uses the PyTerrier library for search engine evaluation and near-duplicate detection on different datasets.
Language: Jupyter Notebook - Size: 267 KB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0
