An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: cardinality-estimation

dynatrace-oss/hash4j

Dynatrace hash library for Java

Language: Java - Size: 37 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 103 - Forks: 11

Wind-Gone/awesome-ai4db-paper

Papers related to AI4DB techniques

Size: 114 KB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 60 - Forks: 6

koykov/pbtk

Probabilistic data structures toolkit.

Language: Go - Size: 698 KB - Last synced at: 8 days ago - Pushed at: 8 days ago - Stars: 0 - Forks: 0

mikeheddes/fast-multi-join-sketch

Fast Cardinality Estimation of Multi-Join Queries Using Sketches

Language: Python - Size: 23 MB - Last synced at: 10 days ago - Pushed at: about 1 year ago - Stars: 16 - Forks: 1

bcgsc/ntCard

Estimating k-mer coverage histogram of genomics data

Language: C++ - Size: 1.24 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 78 - Forks: 9

cloudflare/cardinality-estimator

A crate for estimating the cardinality of distinct elements in a stream or dataset.

Language: Rust - Size: 571 KB - Last synced at: 16 days ago - Pushed at: 2 months ago - Stars: 20 - Forks: 5

ascv/HyperLogLog

Fast HyperLogLog for Python.

Language: C - Size: 306 KB - Last synced at: 11 days ago - Pushed at: 3 months ago - Stars: 104 - Forks: 19

Wind-Gone/ai4db-datasets

Datasets Used in AI4DB Research Work

Size: 19.5 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 6 - Forks: 0

oertl/hyperloglog-sketch-estimation-paper

Paper about the estimation of cardinalities from HyperLogLog sketches

Language: TeX - Size: 51.6 MB - Last synced at: 24 days ago - Pushed at: almost 4 years ago - Stars: 62 - Forks: 6

jlumbroso/affirmative-sampling

Reference implementation of the Affirmative Sampling algorithm by Jérémie Lumbroso and Conrado Martínez (2022). 🍀

Language: Python - Size: 794 KB - Last synced at: 11 days ago - Pushed at: almost 3 years ago - Stars: 5 - Forks: 0

DylanPenney/Huawei-European-University-Challenge-2024 📦

Submission for the hackathon Huawei European University Challenge 2024 - Fundamental Software. Place: 15th

Language: Makefile - Size: 413 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

LiveRamp/HyperMinHash-java

Union, intersection, and set cardinality in loglog space

Language: Java - Size: 572 KB - Last synced at: 26 days ago - Pushed at: almost 2 years ago - Stars: 56 - Forks: 10

sykwon/teddy-dream

[VLDB'22] Cardinality Estimation of Approximate Substring Queries using Deep Learning.

Language: Python - Size: 104 MB - Last synced at: 29 days ago - Pushed at: over 1 year ago - Stars: 9 - Forks: 1

omung789/TechArena-Track-2

Submission for Huawei European University Challenge 2024. Team: Knee Surgery. Place:

Language: C++ - Size: 499 KB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

outbrain-inc/outrank

A Python library for efficient feature ranking and selection on sparse data sets.

Language: Python - Size: 2.83 MB - Last synced at: 5 months ago - Pushed at: 5 months ago - Stars: 19 - Forks: 3

pagegitss/UAE

A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation

Language: Python - Size: 11.3 MB - Last synced at: 6 months ago - Pushed at: over 3 years ago - Stars: 25 - Forks: 10

wurenzhi/learned_ndv_estimator

Learned model to estimate number of distinct values (NDV) of a population using a small sample.

Language: Python - Size: 1.22 MB - Last synced at: 6 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 1

alibaba/pilotscope

PilotScope is a middleware to bridge the gaps of deploying AI4DB (Artificial Intelligence for Databases) algorithms into actual database systems.

Language: Python - Size: 122 MB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 140 - Forks: 17

preciz/talan

Probabilistic data structures (bloom filter / counting bloom filter / linear counter)

Language: Elixir - Size: 90.8 KB - Last synced at: 12 days ago - Pushed at: 9 months ago - Stars: 6 - Forks: 2

Asoke26/Simpli-Squared

Simpli-Squared is a statistics-free join ordering algorithm that works without cardinality estimates.

Language: Python - Size: 171 KB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 0

Sana0124/EDA-Feature-Engineering

Exploratory data analysis of 2 datasets

Language: Jupyter Notebook - Size: 5.17 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

OnizukaLab/nar-cardest

Robust Cardinality Estimator by Non-autoregressive Model

Language: Python - Size: 157 MB - Last synced at: 11 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

dynatrace-research/exaloglog-paper

ExaLogLog: Space-Efficient and Practical Approximate Distinct Counting up to the Exa-Scale

Language: Java - Size: 2.27 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 8 - Forks: 1

dynatrace-research/ultraloglog-paper

UltraLogLog: A Practical and More Space-Efficient Alternative to HyperLogLog for Approximate Distinct Counting

Language: Python - Size: 4.23 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 0 - Forks: 0

MatejaMaric/kafka-go-cardinality

Estimating cardinality for a data stream using Go and Apache Kafka

Language: Go - Size: 43.9 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 1

dynatrace-research/set-sketch-paper

SetSketch: Filling the Gap between MinHash and HyperLogLog

Language: C++ - Size: 23.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 46 - Forks: 5

DataManagementLab/deepdb-public

Implementation of DeepDB: Learn from Data, not from Queries!

Language: Python - Size: 2.84 MB - Last synced at: about 1 year ago - Pushed at: over 2 years ago - Stars: 83 - Forks: 25

OnizukaLab/Scardina

Scalable Join Cardinality Estimator

Language: Python - Size: 384 KB - Last synced at: 11 months ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 2

asjadsyed/AnalyticsMesh

Distributed Cardinality Tracking

Language: Python - Size: 20.5 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

ZhengtongYan/CEB (fork of learnedsystems/CEB)

Cardinality Estimation Benchmark

Language: Python - Size: 3.86 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Wind-Gone/STAT-Ai4CardinalityEstimation-CodeBase

Current STAT Learning-based Cardinality Estimation Code Base

Language: C - Size: 13.7 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 4 - Forks: 0

for0nething/FACE-A-Normalizing-Flow-based-Cardinality-Estimator

A PyTorch implementation of FACE: A Normalizing Flow based Cardinality Estimator

Language: Python - Size: 6.84 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 17 - Forks: 4

anthonysyk/go-cardinality

go-cardinality is a Go library that calculates the cardinality and distinct count of values in a dataset, providing efficient and accurate estimations.

Language: Go - Size: 261 KB - Last synced at: 10 months ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

darkLord19/hyperloglog

HyperLogLog implementation in Go.

Language: Go - Size: 1.95 KB - Last synced at: 3 days ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

surajiyer/Database-SQL-joins-algorithm

An implementation of the algorithms presented in the paper "Cardinality Estimation Done Right: Index-Based Join Sampling"

Language: Python - Size: 59.6 KB - Last synced at: 9 days ago - Pushed at: about 8 years ago - Stars: 9 - Forks: 6

ethantrott/hyperloglog-estimation

Python implementations of the Flajolet-Martin, LogLog, SuperLogLog, and HyperLogLog cardinality estimation algorithms, used to estimate the number of unique traffic violations in NYC in the 2019 fiscal year

Language: Python - Size: 94.7 KB - Last synced at: 9 months ago - Pushed at: over 5 years ago - Stars: 4 - Forks: 2

neurocard/neurocard

State-of-the-art neural cardinality estimators for join queries

Language: Python - Size: 52.2 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 51 - Forks: 18

ZhengtongYan/sql-generation (fork of zhouxh19/sql-generation)

Size: 452 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

naru-project/naru

Neural Relation Understanding: neural cardinality estimators for tabular data

Language: Python - Size: 54.7 KB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 76 - Forks: 25

esalini22/gene-hll

HyperLogLog in C++ and OpenMP for computing genome similarity using the Jaccard index

Language: C++ - Size: 185 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 0

waltercai/pqo-opensource

Language: C - Size: 30.8 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 5

lucaswo/cardest

Code for Local Deep Learning Models for Cardinality Estimation

Language: Python - Size: 2.87 MB - Last synced at: about 2 years ago - Pushed at: about 3 years ago - Stars: 8 - Forks: 2

lucaswo/local-cardinality-estimation

Cardinality estimation with local models

Language: Python - Size: 2.82 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

martinkiefer/feedback-kde

Self-Tuning GPU-Accelerated Kernel Density Estimators

Language: C - Size: 28.9 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 2

var-skip/var-skip

Code for variable skipping ICML 2020 paper

Language: Python - Size: 102 KB - Last synced at: about 2 years ago - Pushed at: almost 5 years ago - Stars: 6 - Forks: 3

Kyziridis/Probabilistic-Counting

Some Algorithms to Count Unique Elements

Language: Python - Size: 432 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Related Keywords
cardinality-estimation (46), hyperloglog (14), learned-database (6), query-optimization (6), machine-learning (6), database (6), cardinality (5), deep-generative-model (4), learned-database-components (4), postgresql (4), ai4db (4), data-sketches (4), minhash (3), cpp (3), count-distinct (3), streaming-algorithms (3), self-supervised-learning (3), density-estimation (3), hyperloglog-sketches (3), query-optimizer (3), deep-learning (3), transformers (2), autoregressive-neural-networks (2), jaccard-similarity-estimation (2), jaccard-similarity (2), loglog (2), python (2), probabilistic-programming (2), unsupervised-learning (2), sketch-data-structures (2), java (2), go (2), probabilistic-data-structures (2), sketch (2), data-sketching (2), approximate-query-processing (1), sketch-algorithm (1), minwise-hashing-algorithm (1), farmhash (1), sum-product-networks (1), crdt (1), benchmarking (1), experiment (1), papercode (1), minwise-hashing (1), minhash-sketches (1), minhash-similarity (1), minhash-lsh-algorithm (1), locality-sensitive-hashing (1), jaccard (1), intersection (1), inclusion-exclusion (1), estimation (1), cosine-similarity (1), streams (1), kafka (1), golang (1), ultraloglog (1), hll-algorithm (1), jaccard-distance (1), jaccard-index (1), kmer (1), openmp (1), parallel-computing (1), parallel-programming (1), stream-processing (1), agm-bound (1), entropic-bound (1), sigmod (1), c (1), opencl (1), optimization (1), pattern-matching (1), loglog-counting (1), numpy (1), trailing-zero (1), consistent-hashing (1), normalizing-flow (1), count (1), dimension-table (1), enum (1), joins (1), pandas-dataframe (1), sql (1), flajolet-martin (1), python-implementations (1), superloglog (1), data-sampling (1), ml-for-systems (1), probabilistic-models (1), cost-estimation (1), generative-model (1), genomics (1), jumphash (1), k-mer-counting (1), k-mer-frequency (1), distinct-elements (1), probalistic-data-structures (1), sketches (1), hashing-algorithm (1)