GitHub topics: minwise-hashing
oertl/treeminhash
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Language: C++ - Size: 2.62 MB - Last synced at: 10 days ago - Pushed at: about 2 years ago - Stars: 14 - Forks: 3

dynatrace-research/set-sketch-paper
SetSketch: Filling the Gap between MinHash and HyperLogLog
Language: C++ - Size: 23.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 46 - Forks: 5

oertl/bagminhash
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
Language: C++ - Size: 1.02 MB - Last synced at: 10 days ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 6

santurini/Search-Engine-Evaluation-and-Near-Duplicate-Detection
Uses the PyTerrier library to perform search engine evaluation and near-duplicate detection on several datasets.
Language: Jupyter Notebook - Size: 267 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0
