Topic: "lsh-algorithm"
ritchie46/lsh-rs
Locality Sensitive Hashing in Rust with Python bindings
Language: Rust - Size: 511 KB - Last synced at: 20 days ago - Pushed at: almost 2 years ago - Stars: 115 - Forks: 21

guofei9987/pyLSHash
Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Hash-based image similarity and text similarity.
Language: Python - Size: 257 KB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 59 - Forks: 6

oertl/probminhash
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
Language: C++ - Size: 6.26 MB - Last synced at: about 2 months ago - Pushed at: over 4 years ago - Stars: 42 - Forks: 6

xiaogp/recsys_faiss
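A content-based related-product recommendation implementation built on fasttext + faiss, with an API query interface served by nginx+uwsgi+flask / gunicorn+uvicorn+fastapi; adds a Spark implementation using Ansj+Word2vec+LSH+Phoenix.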
Language: Python - Size: 41.3 MB - Last synced at: over 2 years ago - Pushed at: almost 3 years ago - Stars: 40 - Forks: 16

AaronYang2333/DSCI_553
USC 2020 Spring DSCI 553 (Foundations and Applications of Data Mining). Score: 94
Language: ReScript - Size: 265 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 34 - Forks: 21

Infini-AI-Lab/MagicPIG
MagicPIG: LSH Sampling for Efficient LLM Generation
Language: Python - Size: 54.3 MB - Last synced at: 8 months ago - Pushed at: 8 months ago - Stars: 27 - Forks: 0

RishabhMaheshwary/query-attack
A Query Efficient Natural Language Attack in a Black Box Setting
Language: Python - Size: 1.67 MB - Last synced at: 10 months ago - Pushed at: over 3 years ago - Stars: 16 - Forks: 4

oertl/treeminhash
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Language: C++ - Size: 2.62 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

munnafaisal/Deep-Object-Search-With-Hash
Search your object with hash
Language: Python - Size: 10.2 MB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 5

shaltielshmid/MinHashSharp
A Robust Library in C# for Similarity Estimation
Language: C# - Size: 39.1 KB - Last synced at: 13 days ago - Pushed at: over 1 year ago - Stars: 8 - Forks: 1

lehuutrung1412/ImageRetrieval
Content-based image retrieval system built with deep learning, applying large-scale similarity search techniques such as KD-tree, LSH, and Faiss.
Language: Python - Size: 4.58 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 8 - Forks: 3

justinfargnoli/lshforest
An implementation of LSH Forest based on the following paper (http://infolab.stanford.edu/~bawa/Pub/similarity.pdf).
Language: Go - Size: 29.3 KB - Last synced at: 12 months ago - Pushed at: almost 4 years ago - Stars: 7 - Forks: 1

theatina/CryptoRecommendation
Recommendation system for cryptocurrencies, using data collected from users' tweets, with 10-fold cross-validation. Based on the cryptocoins found in each user's tweets, the program runs algorithms on the data and recommends other cryptocoins to each user. (README in Greek, but soon to be translated into English.)
Language: C - Size: 9.2 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 7 - Forks: 0

ZiadSheriif/IntelliQuery
A semantic search indexing system designed to efficiently retrieve the top matching results from a database of 20 million documents. Given the embedding of a search query, it quickly identifies and returns the most relevant documents.
Language: Jupyter Notebook - Size: 5.84 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 6 - Forks: 4

Alexdruso/ID2222-Data-Mining-Sanvito-Stuart
Lab assignments for the course ID2222-Data Mining at KTH
Language: Roff - Size: 62.1 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 3

muyuuuu/high-performance-LSH
High-concurrency LSH algorithm using a thread pool, implemented in C++
Language: C++ - Size: 47.9 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 0

kochlisGit/Big-Data-Algorithms
Implementation of algorithms for big data using Python, NumPy, and pandas.
Language: Python - Size: 28.8 MB - Last synced at: over 2 years ago - Pushed at: about 5 years ago - Stars: 2 - Forks: 0

NikolasGialitsis/LSH-and-Cube
LSH and Cube Implementation (Hashing and Querying Points on Higher Dimensions)
Language: C++ - Size: 8.26 MB - Last synced at: almost 2 years ago - Pushed at: over 6 years ago - Stars: 2 - Forks: 0

aidaLabDEI/LEIT-motifs
Scalable mining of multidimensional time series motifs.
Language: Python - Size: 64.8 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Mrugank97/KNNavigate
Scaling Up Nearest Neighbor Search: How Dataset Size and Dimensionality Affect KNN Variants
Language: Jupyter Notebook - Size: 1.71 MB - Last synced at: 12 months ago - Pushed at: 12 months ago - Stars: 1 - Forks: 0

eduardosantoshf/most-frequent-itemsets 📦
MDLE First Assignment. The objective of this project was to implement the A-Priori algorithm to find the most frequent itemsets in lists of conditions for a large set of patients, then derive associations between conditions by extracting rules, and also to implement and apply LSH to identify similar news articles in a dataset.
Language: Jupyter Notebook - Size: 24.7 MB - Last synced at: about 13 hours ago - Pushed at: almost 2 years ago - Stars: 1 - Forks: 0

SwamiKannan/Natural-Language-Processing-Specialization
Coursera's Natural Language Processing specialization
Language: HTML - Size: 3.68 MB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-1
Vectors: nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's, for clustering only) with the L2 metric.
Language: C - Size: 15.8 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 1

ludwigfriborg/SwiftNilsimsa
Nilsimsa implementation as a Swift package
Language: Swift - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

MohammadJavadArdestani/NLP-with-Classification-and-Vector-Spaces
Language: Jupyter Notebook - Size: 9.93 MB - Last synced at: over 2 years ago - Pushed at: about 4 years ago - Stars: 1 - Forks: 1

Vedant2311/Data-Mining-Algorithms
Repository for all assignments of the course COL761: Data Mining (Fall 2020), taught at IIT Delhi
Language: C++ - Size: 4.9 MB - Last synced at: over 2 years ago - Pushed at: over 4 years ago - Stars: 1 - Forks: 0

FrancescoMonaco/span
Euclidean Minimum Spanning Tree approximation with a parameterless LSH index
Language: C++ - Size: 251 KB - Last synced at: 19 days ago - Pushed at: 19 days ago - Stars: 0 - Forks: 0

akshatrajsaxena/Implementing-LSH
Implementation of LSH in order to find the similarity in a large dataset
Language: Jupyter Notebook - Size: 2.88 MB - Last synced at: 2 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

MajaJuri/Analiza-velikih-skupova-podataka
Implementation of algorithms presented in the course Analysis of Massive Datasets (Analiza velikih skupova podataka, AVSP)
Language: Java - Size: 1.03 MB - Last synced at: 2 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Lefteris-Souflas/Movie-Rating-User-Similarity
Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report.
Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

DevPhamPham/NCKH_PySpark
Language: Python - Size: 354 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

leiyunin/Locality-Sensitive-Hashing-and-Collaborative-Filtering-on-Yelp-Data
The assignment comprises two main tasks: implementing LSH to identify similar businesses based on user ratings and developing various collaborative filtering recommendation systems to predict user ratings for businesses.
Language: Python - Size: 6.84 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

Yasar2019/BigData-HW03
Finding similar documents using LSH with MapReduce on multi-node Spark Cluster
Language: Python - Size: 71 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

LM1997610/ADM_HW4
Homework_4 for Algorithmic Methods for Data Mining (ADM), MSc in Data Science at La Sapienza University of Rome
Language: Jupyter Notebook - Size: 3.65 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

C-Ritam98/SimToReal
Unnatural Language Processing
Language: Jupyter Notebook - Size: 518 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

imRP21/Summer-Research-Internship-2022
This repo presents the research paper I worked on during my summer research internship in 2022.
Size: 12.3 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

santurini/MinHash-LSH-From-Scratch
Implementing a simplified copy of the Shazam application from scratch using MinHashing and LSH.
Language: Python - Size: 210 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

mark-antal-csizmadia/finding-similar-items-textually-similar-documents
Finding Similar Items: Textually Similar Documents
Language: Jupyter Notebook - Size: 451 KB - Last synced at: over 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

pedroalbanese/lshsum
TTAK.KO-12.0276 LSH Recursive Hasher
Language: Go - Size: 23.4 KB - Last synced at: 4 months ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

MenesesGHZ/locality-sensitive-hashing
LSH algorithm made with C++
Language: Makefile - Size: 5.39 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-2 Fork of giannhskp/Software-Development-for-Algorithmic-Problems_Project-2
📈 Time series: nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's, for clustering only) with the metrics L2, discrete Fréchet, and continuous Fréchet.
Language: C - Size: 33.9 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

julialwang/docuSearch
A Python program that uses LSH (locality-sensitive hashing) to search and retrieve filenames from a CSV file that contains words similar to the user's input.
Language: Python - Size: 91.8 KB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

JaiJaveria/Data_Mining
Projects involving Frequent Itemset Mining and analysis of hierarchical space partitioning techniques
Language: HTML - Size: 203 MB - Last synced at: over 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

hugofpaiva/mpei-p1 📦
Practical assignment for the course Probabilistic Methods for Computer Engineering (Métodos Probabilísticos para Engenharia Informática), UA 2019/2020
Language: Java - Size: 39.9 MB - Last synced at: about 1 year ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

FilipeLopesPires/SpellChecker
SpellChecker: an application to check for spelling errors.
Language: Java - Size: 3.54 MB - Last synced at: 14 days ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 1

spyros-briakos/Autoencoder-Dimensionality-Reduction
Autoencoder dimensionality reduction, EMD-Manhattan metrics comparison, and classifier-based clustering on the MNIST dataset.
Language: C++ - Size: 16.2 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

AndreasTraut/Deep_learning_explorations
Example of the Locality Sensitive Hashing (LSH) algorithm. Relevant for Big Data.
Language: Jupyter Notebook - Size: 118 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

xadityax/Locality-Sensitive-Hashing-DNA-Seqs
Implementing Locality Sensitive Hashing for DNA Sequences.
Language: Python - Size: 1.77 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0
