Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: data-sketches

ekzhu/datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Language: Python - Size: 5.68 MB - Last synced: 5 days ago - Pushed: 7 days ago - Stars: 2,381 - Forks: 289

dynatrace-oss/dynahist

DynaHist: A Dynamic Histogram Library for Java

Language: Java - Size: 1.82 MB - Last synced: 9 days ago - Pushed: 9 days ago - Stars: 41 - Forks: 7

dynatrace-oss/hash4j

Dynatrace hash library for Java

Language: Java - Size: 40.4 MB - Last synced: 2 months ago - Pushed: 2 months ago - Stars: 73 - Forks: 9

dynatrace-research/exaloglog-paper

ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale

Language: Java - Size: 2.27 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 8 - Forks: 1

dynatrace-research/ultraloglog-paper

UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting

Language: Python - Size: 4.23 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

turu/yalal

Yet Another Lame Algorithm Library

Language: Python - Size: 50.8 KB - Last synced: about 2 months ago - Pushed: almost 2 years ago - Stars: 2 - Forks: 0

ikegami-yukino/madoka-python

Memory-efficient Count-Min Sketch Counter (based on Madoka C++ library)

Language: C++ - Size: 231 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 25 - Forks: 2

justinfargnoli/simhash

A barebones implementation of the simhash data sketching algorithm.

Language: Go - Size: 7.81 KB - Last synced: 11 months ago - Pushed: almost 3 years ago - Stars: 1 - Forks: 0

isarn/isarn-sketches-spark

Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Language: Scala - Size: 1.33 MB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 30 - Forks: 12

andrewmcloud/consimilo

A Clojure library for querying large data-sets on similarity

Language: Clojure - Size: 536 KB - Last synced: about 1 month ago - Pushed: over 5 years ago - Stars: 62 - Forks: 4

oertl/hyperloglog-sketch-estimation-paper

Paper about the estimation of cardinalities from HyperLogLog sketches

Language: TeX - Size: 51.6 MB - Last synced: over 1 year ago - Pushed: about 3 years ago - Stars: 51 - Forks: 4

galprz/dns-random-subdomains-ddos-attack

Implementation of "Mitigating DNS random subdomain DDoS attacks by distinct heavy hitters sketches"

Language: Jupyter Notebook - Size: 1.11 MB - Last synced: over 1 year ago - Pushed: over 4 years ago - Stars: 8 - Forks: 3

erikerlandson/cdf-splining-prototype

A Prototype For Fitting Monotonic Cubic Splines to a Tdigest Sketch

Language: Jupyter Notebook - Size: 1.2 MB - Last synced: over 1 year ago - Pushed: over 5 years ago - Stars: 1 - Forks: 0

Related Keywords
data-sketches (13), hyperloglog (6), data-sketching (5), cardinality-estimation (4), data-structures (3), minhash (3), java (2), stream-processing (2), count-distinct (2), simhash (2), scala (2), probabilistic-data-structures (2), t-digest (2), python (2), lsh-forest (2), lsh (2), jaccard-similarity (2), sketching-algorithm (1), pyspark (1), spark (1), splines (1), spark-ml (1), feature-importance (1), datasets (1), dataset (1), dataframes (1), dataframe (1), apache-spark (1), aggregator (1), golang (1), go (1), python-wrapper (1), memory-efficient (1), consistent-hashing (1), spline-interpolation (1), monotonic-splines (1), density-functions (1), cumulative-distribution-function (1), mirai-bot (1), mirai (1), heavy-hitters (1), dns (1), ddos-attacks (1), sketch-data-structures (1), hyperloglog-sketches (1), similarity-search (1), similarity (1), recommender-system (1), plagiarism-detection (1), minhash-lsh-algorithm (1), hamming-distance (1), document-similarity (1), cosine-distance (1), collaborative-filtering (1), clojure (1), variable-importance (1), udaf (1), sketches (1), quantiles (1), quantile-estimation (1), quantile (1), order-statistics (1), memory-efficiency (1), histogram-library (1), histogram (1), hdrhistogram (1), dynamic-allocation (1), ddsketch (1), compression-algorithm (1), approximation-algorithms (1), weighted-quantiles (1), top-k (1), search (1), lsh-ensemble (1), locality-sensitive-hashing (1), hnsw (1), data-summary (1), counter (1), machine-learning-algorithms (1), cuckoo-filter (1), bloom-filter (1), algorithm-library (1), ultraloglog (1), cardinality (1), sketch (1), hll-algorithm (1), wyhash (1), superminhash (1), streaming-algorithms (1), non-cryptographic-hash-functions (1), murmur3 (1), jumphash (1), imohash (1), hashing-algorithm (1), hash-functions (1), hash-algorithm (1), hash (1), farmhash (1)