Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: lsh-algorithm

MajaJuri/Analiza-velikih-skupova-podataka

Implementation of the algorithms presented in the course Analiza velikih skupova podataka (AVSP, Analysis of Large Datasets)

Language: Java - Size: 1.03 MB - Last synced: 4 days ago - Pushed: 4 days ago - Stars: 0 - Forks: 0

guofei9987/pyLSHash

Locality Sensitive Hashing, fuzzy-hash, min-hash, simhash, aHash, pHash, dHash. Hash-based image similarity and text similarity

Language: Python - Size: 257 KB - Last synced: about 9 hours ago - Pushed: 5 months ago - Stars: 47 - Forks: 5

hugofpaiva/mpei-p1 📦

Practical assignment for the course unit Métodos Probabilísticos para Engenharia Informática (Probabilistic Methods for Computer Engineering), UA 2019/2020

Language: Java - Size: 39.9 MB - Last synced: 29 days ago - Pushed: about 3 years ago - Stars: 0 - Forks: 1

ritchie46/lsh-rs

Locality Sensitive Hashing in Rust with Python bindings

Language: Rust - Size: 511 KB - Last synced: 21 days ago - Pushed: 11 months ago - Stars: 103 - Forks: 20

Lefteris-Souflas/Movie-Rating-User-Similarity

Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard Similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report

Language: Jupyter Notebook - Size: 1.22 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 0 - Forks: 0

ZiadSheriif/IntelliQuery

A semantic search indexing system designed to efficiently retrieve top matching results from a database of 20 million documents. Given the embedding of a search query, it quickly identifies and returns the most relevant documents

Language: Jupyter Notebook - Size: 5.84 MB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 1 - Forks: 0

DevPhamPham/NCKH_PySpark

Language: Python - Size: 354 MB - Last synced: about 2 months ago - Pushed: about 2 months ago - Stars: 0 - Forks: 0

leiyunin/Locality-Sensitive-Hashing-and-Collaborative-Filtering-on-Yelp-Data

The assignment comprises two main tasks: implementing LSH to identify similar businesses based on user ratings and developing various collaborative filtering recommendation systems to predict user ratings for businesses.

Language: Python - Size: 6.84 KB - Last synced: 3 months ago - Pushed: 3 months ago - Stars: 0 - Forks: 0

shaltielshmid/MinHashSharp

A Robust Library in C# for Similarity Estimation

Language: C# - Size: 39.1 KB - Last synced: 27 days ago - Pushed: 6 months ago - Stars: 1 - Forks: 0

Yasar2019/BigData-HW03

Finding similar documents using LSH with MapReduce on multi-node Spark Cluster

Language: Python - Size: 71 MB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

Alexdruso/ID2222-Data-Mining-Sanvito-Stuart

Lab assignments for the course ID2222-Data Mining at KTH

Language: Roff - Size: 62.1 MB - Last synced: 6 months ago - Pushed: over 1 year ago - Stars: 3 - Forks: 3

xadityax/Locality-Sensitive-Hashing-DNA-Seqs

Implementing Locality Sensitive Hashing for DNA Sequences.

Language: Python - Size: 1.77 MB - Last synced: 7 months ago - Pushed: over 3 years ago - Stars: 0 - Forks: 0

imRP21/Summer-Research-Internship-2022

This repo presents the research paper I worked on during my summer research internship in 2022.

Size: 12.3 MB - Last synced: 8 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-2 Fork of giannhskp/Software-Development-for-Algorithmic-Problems_Project-2

📈 Time Series - Nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's for clustering only), with metrics: L2, Discrete and Continuous Fréchet.

Language: C - Size: 33.9 MB - Last synced: 8 months ago - Pushed: about 2 years ago - Stars: 0 - Forks: 0

lehuutrung1412/ImageRetrieval

Builds a content-based image retrieval system using deep learning, applying large-scale similarity search techniques such as k-d tree, LSH, and Faiss.

Language: Python - Size: 4.58 MB - Last synced: 8 months ago - Pushed: over 2 years ago - Stars: 8 - Forks: 3

AndreasTraut/Deep_learning_explorations

Example of the Locality-Sensitive Hashing (LSH) algorithm. Relevant for Big Data

Language: Jupyter Notebook - Size: 118 MB - Last synced: 8 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

NikolasGialitsis/LSH-and-Cube

LSH and Cube Implementation (Hashing and Querying Points on Higher Dimensions)

Language: C++ - Size: 8.26 MB - Last synced: 10 months ago - Pushed: over 5 years ago - Stars: 2 - Forks: 0

pedroalbanese/lshsum

TTAK.KO-12.0276 LSH Recursive Hasher

Language: Go - Size: 23.4 KB - Last synced: 11 months ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

justinfargnoli/lshforest

An implementation of LSH Forest based on the following paper (http://infolab.stanford.edu/~bawa/Pub/similarity.pdf).

Language: Go - Size: 29.3 KB - Last synced: 11 months ago - Pushed: almost 3 years ago - Stars: 5 - Forks: 1

LM1997610/ADM_HW4

Homework_4 for Algorithmic Methods for Data Mining (ADM), MSc in Data Science at La Sapienza University of Rome

Language: Jupyter Notebook - Size: 3.65 MB - Last synced: 3 months ago - Pushed: about 1 year ago - Stars: 0 - Forks: 0

Sitaras/Software-Development-for-Algorithmic-Problems_Project-1

Vectors - Nearest neighbor search and clustering using the LSH and Hypercube algorithms (plus Lloyd's for clustering only), with the L2 metric.

Language: C - Size: 15.8 MB - Last synced: 8 months ago - Pushed: about 2 years ago - Stars: 1 - Forks: 1

ludwigfriborg/SwiftNilsimsa

Nilsimsa implementation as a Swift package

Language: Swift - Size: 18.6 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 1 - Forks: 0

RishabhMaheshwary/query-attack

A Query Efficient Natural Language Attack in a Black Box Setting

Language: Python - Size: 1.67 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 12 - Forks: 4

AaronYang2333/DSCI_553

USC 2020 Spring DSCI 553 (Foundations and Applications of Data Mining). Score: 94

Language: ReScript - Size: 265 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 34 - Forks: 21

oertl/probminhash

ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity

Language: C++ - Size: 6.26 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 33 - Forks: 3

xiaogp/recsys_faiss

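A content-based related-product recommendation implementation built on fasttext + faiss; nginx+uwsgi+flask / gunicorn+uvicorn+fastapi provide the API query interface; adds a Spark implementation using Ansj+Word2vec+LSH+Phoenix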

Language: Python - Size: 41.3 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 40 - Forks: 16

munnafaisal/Deep-Object-Search-With-Hash

Search your object with hash

Language: Python - Size: 10.2 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 12 - Forks: 5

kochlisGit/Big-Data-Algorithms

Implementation of algorithms for big data using python, numpy, pandas.

Language: Python - Size: 28.8 MB - Last synced: about 1 year ago - Pushed: about 4 years ago - Stars: 2 - Forks: 0

Vedant2311/Data-Mining-Algorithms

Repository for all assignments of the course COL761: Data Mining (Fall 2020), taught at IIT Delhi

Language: C++ - Size: 4.9 MB - Last synced: about 1 year ago - Pushed: over 3 years ago - Stars: 1 - Forks: 0

oertl/treeminhash

TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation

Language: C++ - Size: 2.62 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 12 - Forks: 3

C-Ritam98/SimToReal

Unnatural Language Processing

Language: Jupyter Notebook - Size: 518 KB - Last synced: 4 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

santurini/MinHash-LSH-From-Scratch

Implementing a simplified copy of Shazam application from scratch using MinHashing and LSH.

Language: Python - Size: 210 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 1

theatina/CryptoRecommendation

Recommendation system for cryptocurrencies, using data collected from users' tweets, with 10-fold cross-validation. Based on the cryptocoins mentioned in each user's tweets, the program recommends other cryptocoins to that user. (README is in Greek; an English translation is planned.)

Language: C - Size: 9.2 MB - Last synced: about 1 year ago - Pushed: about 5 years ago - Stars: 7 - Forks: 0

muyuuuu/high-performance-LSH

High-concurrency LSH algorithm using a thread pool, implemented in C++

Language: C++ - Size: 47.9 KB - Last synced: about 1 year ago - Pushed: about 2 years ago - Stars: 3 - Forks: 0

SwamiKannan/Natural-Language-Processing-Specialization

Coursera's Natural Language Processing specialization

Language: HTML - Size: 3.68 MB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 1 - Forks: 0

mark-antal-csizmadia/finding-similar-items-textually-similar-documents

Finding Similar Items: Textually Similar Documents

Language: Jupyter Notebook - Size: 451 KB - Last synced: about 1 year ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

MenesesGHZ/locality-sensitive-hashing

LSH algorithm made with C++

Language: Makefile - Size: 5.39 MB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 0 - Forks: 0

julialwang/docuSearch

A Python program that uses LSH (locality-sensitive hashing) to search a CSV file and retrieve filenames that contain words similar to the user's input.

Language: Python - Size: 91.8 KB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

JaiJaveria/Data_Mining

Projects involving Frequent Itemset Mining and analysis of hierarchical space partitioning techniques

Language: HTML - Size: 203 MB - Last synced: about 1 year ago - Pushed: over 2 years ago - Stars: 0 - Forks: 0

FilipePires98/SpellChecker

SpellChecker: an application to check for spelling errors.

Language: Java - Size: 3.54 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 0 - Forks: 1

MohammadJavadArdestani/NLP-with-Classification-and-Vector-Spaces

Language: Jupyter Notebook - Size: 9.93 MB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 1 - Forks: 1

spyros-briakos/Autoencoder-Dimensionality-Reduction

Autoencoder dimensionality reduction, EMD-Manhattan metrics comparison and classifier based clustering on MNIST dataset.

Language: C++ - Size: 16.2 MB - Last synced: 9 months ago - Pushed: about 3 years ago - Stars: 0 - Forks: 0

Related Keywords
lsh-algorithm 42 lsh 9 lsh-implementation 8 locality-sensitive-hashing 6 minhash 6 clustering 5 hashing 4 minhash-lsh-algorithm 4 shingling 4 data-mining 4 pyspark 3 pcy 3 bloom-filter 3 cosine-similarity 3 python 3 jaccard-similarity 3 hypercube 3 apriori-algorithm 2 similar-items 2 fp-tree 2 spark 2 clustering-algorithm 2 dimensionality-reduction 2 frequent-itemset-mining 2 min-hashing 2 nearest-neighbors 2 recommendation-system 2 cpp 2 similarity-search 2 jaccard-similarity-estimation 2 faiss 2 similarity 2 r-tree 2 top-k-query 1 hash-algorithm 1 jaccard 1 jaccard-coefficient 1 jaccard-distance 1 jaccard-index 1 locality-sensitive 1 minwise-hashing 1 minwise-hashing-algorithm 1 similarity-measures 1 similarity-metric 1 natural-language-processing 1 collaborative-filtering-algorithm 1 deeplearning 1 object-detection 1 object-search 1 search-engine 1 yolov3 1 a-priori 1 big-data-processing 1 frequent-itemsets 1 min-hasing 1 multihash-pcy 1 multistage-pcy 1 stream-mining 1 streams 1 fp-tree-c-implementation 1 fsg 1 gaston 1 gspan-algorithm 1 subgraph-mining 1 part-of-speech-tagger 1 specialization 1 stochastic-gradient-descent 1 viterbi-algorithm 1 word2vec-algorithm 1 textual-similarity 1 java 1 murmur 1 spell-checker 1 word-suggestion 1 logistic-regression 1 naive-bayes-classifier 1 text-process 1 transformation-matrix 1 tweet-analysis 1 approximate-nearest-neighbor-search 1 autoencoder 1 bottleneck 1 earth-movers-distance 1 manhattan-distance 1 mnist-dataset 1 sketching 1 sketching-algorithm 1 weighted-sets 1 attention 1 bert 1 hierarchical-models 1 lstm 1 rnn 1 seq2seq 1 jupyter-notebook 1 librosa 1 c 1 cryptocurrency 1 datascience 1 text-mining 1