An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-sketching

isarn/isarn-sketches

Sketching data structures for Scala, including t-digest

Language: Scala - Size: 1.32 MB - Last synced at: 21 days ago - Pushed at: over 3 years ago - Stars: 15 - Forks: 5

andrewmcloud/consimilo

A Clojure library for querying large data sets by similarity

Language: Clojure - Size: 536 KB - Last synced at: 8 days ago - Pushed at: about 6 years ago - Stars: 63 - Forks: 4

dynatrace-research/exaloglog-paper

ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale

Language: Java - Size: 2.27 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 8 - Forks: 1

dynatrace-research/ultraloglog-paper

UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting

Language: Python - Size: 4.23 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

isarn/isarn-sketches-spark

Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Language: Scala - Size: 1.33 MB - Last synced at: 13 days ago - Pushed at: 11 months ago - Stars: 29 - Forks: 12

sanxore/spark-theta-sketch-udfs

This project aims to expose the Yahoo Theta Sketch API as Spark SQL UDFs

Language: Scala - Size: 9.77 KB - Last synced at: over 1 year ago - Pushed at: about 6 years ago - Stars: 3 - Forks: 0

isarn/isarn-sketches-algebird-api

Type-classes to interface isarn-sketches with Algebird

Language: Scala - Size: 301 KB - Last synced at: about 2 months ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 2

justinfargnoli/simhash

A barebones implementation of the simhash data sketching algorithm.

Language: Go - Size: 7.81 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0