GitHub topics: data-sketching
isarn/isarn-sketches
Sketching data structures for Scala, including t-digest
Language: Scala - Size: 1.32 MB - Last synced at: 21 days ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 5

andrewmcloud/consimilo
A Clojure library for similarity queries over large data sets
Language: Clojure - Size: 536 KB - Last synced at: 8 days ago - Pushed at: about 6 years ago - Stars: 63 - Forks: 4

dynatrace-research/exaloglog-paper
ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale
Language: Java - Size: 2.27 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 8 - Forks: 1

dynatrace-research/ultraloglog-paper
UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting
Language: Python - Size: 4.23 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

isarn/isarn-sketches-spark
Routines and data structures for using isarn-sketches idiomatically in Apache Spark
Language: Scala - Size: 1.33 MB - Last synced at: 13 days ago - Pushed at: 11 months ago - Stars: 29 - Forks: 12

sanxore/spark-theta-sketch-udfs
This project aims to expose the Yahoo Theta Sketch API as Spark SQL UDFs
Language: Scala - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 0

isarn/isarn-sketches-algebird-api
Type-classes to interface isarn-sketches with Algebird
Language: Scala - Size: 301 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 2

justinfargnoli/simhash
A barebones implementation of the simhash data sketching algorithm.
Language: Go - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0
