GitHub topics: data-sketches
dynatrace-oss/hash4j
Dynatrace hash library for Java
Language: Java - Size: 37.5 MB - Last synced at: 11 days ago - Pushed at: 11 days ago - Stars: 116 - Forks: 13

andrewmcloud/consimilo
A Clojure library for querying large datasets by similarity
Language: Clojure - Size: 536 KB - Last synced at: 9 days ago - Pushed at: over 6 years ago - Stars: 65 - Forks: 4

ekzhu/datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Language: Python - Size: 5.68 MB - Last synced at: 4 months ago - Pushed at: over 1 year ago - Stars: 2,699 - Forks: 299

oertl/hyperloglog-sketch-estimation-paper
Paper about the estimation of cardinalities from HyperLogLog sketches
Language: TeX - Size: 51.6 MB - Last synced at: 7 days ago - Pushed at: over 4 years ago - Stars: 62 - Forks: 6

Btsan/ApproximateSketch
Approximate Sketches for Join Size Estimation (SIGMOD'24)
Language: Python - Size: 18.1 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 2 - Forks: 0

ikegami-yukino/madoka-python
Memory-efficient Count-Min Sketch Counter (based on Madoka C++ library)
Language: C++ - Size: 231 KB - Last synced at: 2 months ago - Pushed at: over 6 years ago - Stars: 26 - Forks: 2

Shozye/sketcher
Program to test the performance of data sketches such as FastExpSketch and QSketch
Language: C++ - Size: 45.9 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

dynatrace-oss/dynahist
DynaHist: A Dynamic Histogram Library for Java
Language: Java - Size: 1.84 MB - Last synced at: 4 months ago - Pushed at: 12 months ago - Stars: 45 - Forks: 9

dynatrace-research/exaloglog-paper
ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale
Language: Java - Size: 2.27 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

dynatrace-research/ultraloglog-paper
UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
Language: Python - Size: 4.23 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

turu/yalal
Yet Another Lame Algorithm Library
Language: Python - Size: 50.8 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

isarn/isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Language: Scala - Size: 1.33 MB - Last synced at: 5 months ago - Pushed at: over 1 year ago - Stars: 29 - Forks: 12

erikerlandson/cdf-splining-prototype
A Prototype For Fitting Monotonic Cubic Splines to a T-Digest Sketch
Language: Jupyter Notebook - Size: 1.2 MB - Last synced at: 6 months ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

galprz/dns-random-subdomains-ddos-attack
Implementation of "Mitigating DNS random subdomain DDoS attacks by distinct heavy hitters sketches"
Language: Jupyter Notebook - Size: 1.11 MB - Last synced at: over 2 years ago - Pushed at: almost 6 years ago - Stars: 8 - Forks: 3

justinfargnoli/simhash
A barebones implementation of the simhash data sketching algorithm.
Language: Go - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 0
