An open API service providing repository metadata for many open source software ecosystems.

Topic: "lsh-algorithm"

ritchie46/lsh-rs

Locality Sensitive Hashing in Rust with Python bindings

Language: Rust - Size: 511 KB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 115 - Forks: 21

guofei9987/pyLSHash

Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Hash-based image similarity and text similarity.

Language: Python - Size: 257 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 6

oertl/probminhash

ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity

Language: C++ - Size: 6.26 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 42 - Forks: 6

xiaogp/recsys_faiss

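A content-based related-product recommendation implementation built on fasttext + faiss, with an API query interface served by nginx+uwsgi+flask / gunicorn+uvicorn+fastapi, plus a Spark implementation using Ansj+Word2vec+LSH+Phoenix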

Language: Python - Size: 41.3 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 40 - Forks: 16

AaronYang2333/DSCI_553

USC 2020 Spring DSCI 553 (Foundations and Applications of Data Mining). Score: 94

Language: ReScript - Size: 265 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 34 - Forks: 21

Infini-AI-Lab/MagicPIG

MagicPIG: LSH Sampling for Efficient LLM Generation

Language: Python - Size: 54.3 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 27 - Forks: 0

RishabhMaheshwary/query-attack

A Query Efficient Natural Language Attack in a Black Box Setting

Language: Python - Size: 1.67 MB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 4

oertl/treeminhash

TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation

Language: C++ - Size: 2.62 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

munnafaisal/Deep-Object-Search-With-Hash

Search your object with hash

Language: Python - Size: 10.2 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 5

shaltielshmid/MinHashSharp

A Robust Library in C# for Similarity Estimation

Language: C# - Size: 39.1 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

lehuutrung1412/ImageRetrieval

Content-based image retrieval system built with deep learning, applying large-scale similarity search techniques such as KD-tree, LSH, and Faiss.

Language: Python - Size: 4.58 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

justinfargnoli/lshforest

An implementation of LSH Forest based on the following paper (http://infolab.stanford.edu/~bawa/Pub/similarity.pdf).

Language: Go - Size: 29.3 KB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 1

theatina/CryptoRecommendation

Recommendation system for cryptocurrencies, using data collected from users' tweets, with 10-fold cross-validation. Based on the cryptocoins mentioned in each user's tweets, the program runs algorithms on the data and recommends other cryptocoins to each user. (README in Greek, soon to be translated into English.)

Language: C - Size: 9.2 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 7 - Forks: 0

ZiadSheriif/IntelliQuery

A semantic search indexing system designed to efficiently retrieve top matching results from a database of 20 million documents. Given the embedding of a search query, it quickly identifies and returns the most relevant documents

Language: Jupyter Notebook - Size: 5.84 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 6 - Forks: 4

Alexdruso/ID2222-Data-Mining-Sanvito-Stuart

Lab assignments for the course ID2222-Data Mining at KTH

Language: Roff - Size: 62.1 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 3

muyuuuu/high-performance-LSH

A high-concurrency LSH algorithm using a thread pool, implemented in C++

Language: C++ - Size: 47.9 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

kochlisGit/Big-Data-Algorithms

Implementation of algorithms for big data using Python, NumPy, and pandas.

Language: Python - Size: 28.8 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

NikolasGialitsis/LSH-and-Cube

LSH and Cube Implementation (Hashing and Querying Points on Higher Dimensions)

Language: C++ - Size: 8.26 MB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

aidaLabDEI/LEIT-motifs

Scalable mining of multidimensional time series motifs.

Language: Python - Size: 64.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Mrugank97/KNNavigate

Scaling Up Nearest Neighbor Search : How Dataset Size and Dimensionality Affect KNN Variants

Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

eduardosantoshf/most-frequent-itemsets 📦

MDLE First Assignment. The objective of this project was to implement the A-Priori algorithm to find the most frequent itemsets in a list of conditions for a large set of patients, to extract rules giving associations between conditions, and to implement and apply LSH to identify similar news articles in a dataset.

Language: Jupyter Notebook - Size: 24.7 MB - Last synced at: about 13 hours ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

SwamiKannan/Natural-Language-Processing-Specialization

Coursera's Natural Language Processing specialization

Language: HTML - Size: 3.68 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-1

Vectors: nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's, for clustering only) with the L2 metric.

Language: C - Size: 15.8 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

ludwigfriborg/SwiftNilsimsa

Nilsimsa implementation as a Swift package

Language: Swift - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

MohammadJavadArdestani/NLP-with-Classification-and-Vector-Spaces

Language: Jupyter Notebook - Size: 9.93 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

Vedant2311/Data-Mining-Algorithms

Repository for all assignments of the course COL761: Data Mining (Fall 2020), taught at IIT Delhi

Language: C++ - Size: 4.9 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

FrancescoMonaco/span

Euclidean Minimum Spanning Tree approximation with a parameterless LSH index

Language: C++ - Size: 251 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

akshatrajsaxena/Implementing-LSH

Implementation of LSH in order to find the similarity in a large dataset

Language: Jupyter Notebook - Size: 2.88 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

MajaJuri/Analiza-velikih-skupova-podataka

Implementation of the algorithms presented in the course Analysis of Massive Datasets (Analiza velikih skupova podataka, AVSP)

Language: Java - Size: 1.03 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Lefteris-Souflas/Movie-Rating-User-Similarity

Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard Similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report

Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

DevPhamPham/NCKH_PySpark

Language: Python - Size: 354 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

leiyunin/Locality-Sensitive-Hashing-and-Collaborative-Filtering-on-Yelp-Data

The assignment comprises two main tasks: implementing LSH to identify similar businesses based on user ratings and developing various collaborative filtering recommendation systems to predict user ratings for businesses.

Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Yasar2019/BigData-HW03

Finding similar documents using LSH with MapReduce on multi-node Spark Cluster

Language: Python - Size: 71 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

LM1997610/ADM_HW4

Homework_4 for Algorithmic Methods for Data Mining (ADM), MSc in Data Science at La Sapienza University of Rome

Language: Jupyter Notebook - Size: 3.65 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

C-Ritam98/SimToReal

Unnatural Language Processing

Language: Jupyter Notebook - Size: 518 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

imRP21/Summer-Research-Internship-2022

This repo presents the research paper I worked on during my summer research internship in 2022.

Size: 12.3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

santurini/MinHash-LSH-From-Scratch

Implementing a simplified copy of the Shazam application from scratch using MinHashing and LSH.

Language: Python - Size: 210 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

mark-antal-csizmadia/finding-similar-items-textually-similar-documents

Finding Similar Items: Textually Similar Documents

Language: Jupyter Notebook - Size: 451 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

pedroalbanese/lshsum

TTAK.KO-12.0276 LSH Recursive Hasher

Language: Go - Size: 23.4 KB - Last synced at: 4 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

MenesesGHZ/locality-sensitive-hashing

LSH algorithm made with C++

Language: Makefile - Size: 5.39 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-2 Fork of giannhskp/Software-Development-for-Algorithmic-Problems_Project-2

📈 Time Series: nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's, for clustering only) with the L2, discrete Fréchet, and continuous Fréchet metrics.

Language: C - Size: 33.9 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

julialwang/docuSearch

A Python program that uses LSH (locality-sensitive hashing) to search a CSV file and retrieve the filenames that contain words similar to the user's input.

Language: Python - Size: 91.8 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

JaiJaveria/Data_Mining

Projects involving Frequent Itemset Mining and analysis of hierarchical space partitioning techniques

Language: HTML - Size: 203 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hugofpaiva/mpei-p1 📦

Practical assignment for the course Métodos Probabilísticos para Engenharia Informática (Probabilistic Methods for Informatics Engineering), UA 2019/2020

Language: Java - Size: 39.9 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

FilipeLopesPires/SpellChecker

SpellChecker: an application to check for spelling errors.

Language: Java - Size: 3.54 MB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

spyros-briakos/Autoencoder-Dimensionality-Reduction

Autoencoder dimensionality reduction, EMD-Manhattan metrics comparison and classifier based clustering on MNIST dataset.

Language: C++ - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

AndreasTraut/Deep_learning_explorations

Example of the Locality Sensitive Hashing (LSH) algorithm. Relevant for Big Data.

Language: Jupyter Notebook - Size: 118 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

xadityax/Locality-Sensitive-Hashing-DNA-Seqs

Implementing Locality Sensitive Hashing for DNA Sequences.

Language: Python - Size: 1.77 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

Related Topics
lsh (10), lsh-implementation (8), minhash (6), locality-sensitive-hashing (6), clustering (5), minhash-lsh-algorithm (4), data-mining (4), shingling (4), hashing (4), jaccard-similarity (3), cosine-similarity (3), pcy (3), pyspark (3), bloom-filter (3), apriori-algorithm (3), hypercube (3), python (3), faiss (2), similarity (2), similarity-search (2), dimensionality-reduction (2), min-hashing (2), frequent-itemset-mining (2), similar-items (2), jaccard-similarity-estimation (2), r-tree (2), fp-tree (2), recommendation-system (2), nearest-neighbors (2), clustering-algorithm (2), spark (2), cpp (2), gna (1), cluster (1), modularity (1), node-ranking (1), simhash (1), fuzzy-hash (1), imagehash (1), a-priori-algorithm (1), association-rules (1), min-hash (1), logistic-regression (1), naive-bayes-classifier (1), text-process (1), transformation-matrix (1), tweet-analysis (1), l2-distance (1), rust (1), big-data-processing (1), jabeja (1), euclidean-distances (1), range-search (1), data-science (1), linux (1), lloyds (1), kmeansplusplus (1), vectors (1), cube (1), adversarial-attacks (1), dna-sequences (1), cosine-distance (1), deep-neural-networks (1), triest (1), machine-learning (1), nlp (1), clusters (1), collaborative-filtering-algorithm (1), dgim (1), spectral-clustering (1), data-mining-algorithms (1), minhash-sketches (1), sketch (1), hash-algorithm (1), jaccard (1), jaccard-coefficient (1), jaccard-distance (1), jaccard-index (1), locality-sensitive (1), minwise-hashing (1), minwise-hashing-algorithm (1), similarity-measures (1), similarity-metric (1), sketching (1), sketching-algorithm (1), weighted-sets (1), motif-discovery (1), timeseries (1), minimum-spanning-tree (1), deduplication (1), deduplication-filter (1), statistics (1), coursera (1), n-grams (1), natural-language-processing (1), part-of-speech-tagger (1), specialization (1), stochastic-gradient-descent (1), viterbi-algorithm (1), word2vec-algorithm (1)