An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: locality-sensitive-hashing

ryputtam/Locality-Sensitive-Hashing-Plagiarism-Detection

Implementation of Locality Sensitive Hashing to detect plagiarism

Language: Jupyter Notebook - Size: 170 KB - Last synced at: 10 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 1

akole123/Locality-Sensitive-Hashing

An advanced technique to find similarities in files.

Language: Python - Size: 10.3 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

Lizhen0909/LSHVec

Language: HTML - Size: 5.88 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 6 - Forks: 4

gasparian/lsh-search-go

Locality-sensitive hashing index implementation [EXPERIMENT]

Language: Go - Size: 2.29 MB - Last synced at: 2 days ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

VictorBusque/LSH-on-documents 📦

C++ implementation of Locality-Sensitive Hashing over txt documents, using Jaccard Similarity.

Language: C++ - Size: 27.3 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 1

KeremZaman/semantic-sh

semantic-sh is a SimHash implementation that detects and groups similar texts by harnessing the power of word vectors and transformer-based language models (BERT).

Language: Python - Size: 40 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 23 - Forks: 3

pyt243/IR-LSH

Locality sensitive hashing based plagiarism checker

Language: Python - Size: 2.81 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 3 - Forks: 0

vsnupoudel/NLP_Specialization

Contains work done for NLP Specialization courses from DeepLearning.AI

Language: Jupyter Notebook - Size: 14 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

Nikoletos-K/WinnER

A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems. [B. Sc. Thesis]

Language: Jupyter Notebook - Size: 154 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

stepping1st/hyperplane-hash

Finding top n nearest neighbor points by hyperplane hashing. LSH

Language: Java - Size: 13.9 MB - Last synced at: almost 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

SwamiKannan/Natural-Language-Processing-Specialization

Coursera's Natural Language Processing specialization

Language: HTML - Size: 3.68 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 1 - Forks: 0

santurini/Search-Engine-Evaluation-and-Near-Duplicate-Detection

Exploiting the PyTerrier library to perform Search Engine Evaluation and Near Duplicate Detection on different datasets.

Language: Jupyter Notebook - Size: 267 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

giulio-derasmo/Search-Engine-Evaluation-and-Near-Duplicate-Detection

Exploiting the PyTerrier library to build a Search Engine and solve Near Duplicate Detection tasks.

Language: Jupyter Notebook - Size: 547 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 1

saaay71/spark-LSH

Locality Sensitive Hashing (LSH) using Spark for clustering

Language: Python - Size: 17.6 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

tian-kun/Fly-LSH

Implementation of a locality-sensitive hashing (LSH) algorithm inspired by how the fruit fly's olfactory circuit encodes odors (Dasgupta et al., 2017).

Language: Python - Size: 988 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 3 - Forks: 0

bergwald/ai

A collection of notebooks on artificial intelligence, with a focus on deep learning and data engineering.

Language: Jupyter Notebook - Size: 16.7 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 1 - Forks: 0

MenesesGHZ/locality-sensitive-hashing

LSH algorithm made with C++

Language: Makefile - Size: 5.39 MB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

andriyka/LSH-accommodation-clustering

Locality-sensitive hashing algorithm implemented using Python and Spark for clustering housing data from Airbnb

Language: Jupyter Notebook - Size: 24.4 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 4 - Forks: 0

shr1611/Data-mining-Plagiarism-Check

A Java program to check for plagiarism across multiple documents using Shingling, MinHashing and Locality Sensitive Hashing.

Language: Java - Size: 38.1 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0

fearofcode/lsh_rs

Simple standalone multi-threaded locality sensitive hashing implementation in Rust

Language: Rust - Size: 10.7 KB - Last synced at: 6 days ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

akshayratnawat/NaturalLanguageProcessingSpecialization

Language: Jupyter Notebook - Size: 680 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

PetropoulakisPanagiotis/nearest-neighbor-search

Nearest neighbor search. Methods: LSH, hypercube, and exhaustive search. C++

Language: C++ - Size: 13.8 MB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 8 - Forks: 1

jinyeom/lsh-knn

[Experiment] Approximate k-nearest neighbors (k-NN) with locality-sensitive hashing (LSH)

Language: Jupyter Notebook - Size: 14.6 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 2 - Forks: 0

CrosleyZack/cse515

Code developed for CSE 515 Multimedia Web Databases

Language: Python - Size: 140 MB - Last synced at: 5 days ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 3

steven-s/minhash-document-clusters

Minhash clustering of text documents

Language: Scala - Size: 33.2 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 4 - Forks: 1

krishnakaushik25/Locality-Sensitive-Hashing

This deep learning project uses locality-sensitive hashing to find similar images (lookalikes) and identify the customers most likely to click on an ad.

Language: Python - Size: 41.8 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

davidsbatista/MuSICo

A Minwise Hashing Method for Addressing Relationship Extraction from Text

Language: Java - Size: 37.4 MB - Last synced at: about 1 year ago - Pushed at: about 8 years ago - Stars: 5 - Forks: 2

Ankit-Kumar-Saini/Coursera_Natural_Language_Processing_Specialization

Implementation of state-of-the-art NLP models using transformers for tasks including machine translation, text-summarization, chatbots, and question answering.

Language: Jupyter Notebook - Size: 18.7 MB - Last synced at: about 2 years ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 2

michielbuddingh/nilsimsa

A native Go implementation of Nilsimsa

Language: C - Size: 54.7 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

sjmoran/GRH

Graph Regularised Hashing code

Language: MATLAB - Size: 22.5 KB - Last synced at: about 2 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 0

Kejie-Wang/lsh

Language: C++ - Size: 4.98 MB - Last synced at: about 2 years ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0

alepfu/kmeans-lsh

k-means implementation using locality-sensitive hashing

Language: Java - Size: 7.81 KB - Last synced at: 3 months ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 1

HCMY/UnCategoticalableAlgorithm

Locality-sensitive hashing, Traveling Salesman Problem, Kevin Bacon Game, Genetic Algorithm

Language: C++ - Size: 2.82 MB - Last synced at: about 2 years ago - Pushed at: over 5 years ago - Stars: 0 - Forks: 0

duartefellipe/minmaxcsa

MinMax Circular Sector Arc for External Plagiarism’s Heuristic Retrieval Stage code

Language: Python - Size: 32.2 KB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 3 - Forks: 4

SidJain1412/LocalitySensitiveHashing

Testing a Recommendation System Using Locality Sensitive Hashing

Language: Jupyter Notebook - Size: 33.2 KB - Last synced at: about 2 years ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 2

coderthetyler/mhash-c

An implementation of the MinHashing algorithm in C using POSIX threads.

Language: C - Size: 3.86 MB - Last synced at: 12 months ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

farosato/Data-Mining-homeworks

Homework for the Data Mining course of the M.Sc. in Engineering in Computer Science at Università degli Studi di Roma "La Sapienza" (A.Y. 2016/2017), done in collaboration with Giacomo Lanciano and Francisco Ferreres.

Language: TeX - Size: 31.8 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

acgtun/hsearch

HSEARCH: fast and accurate protein sequence motif search and clustering

Language: C++ - Size: 3.66 MB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 1 - Forks: 0

cwuu/DataMining-LearningFromLargeDataSet-Task1

ETH Zurich Fall 2017

Language: Python - Size: 496 KB - Last synced at: almost 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

worldofnick/Machine-Learning

Collection of code covering various topics in Machine Learning

Language: Jupyter Notebook - Size: 3.48 MB - Last synced at: about 2 years ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

Cirice/Hash64

A Python3 library implementing locality sensitive hashing.

Language: Python - Size: 6.84 KB - Last synced at: about 2 years ago - Pushed at: about 8 years ago - Stars: 2 - Forks: 0

Related Keywords
locality-sensitive-hashing (141), lsh (31), minhash (23), python (19), nearest-neighbor-search (17), clustering (11), machine-learning (11), data-mining (10), jaccard-similarity (10), minhash-lsh-algorithm (9), similarity-search (9), deep-learning (8), simhash (8), hashing (8), approximate-nearest-neighbor-search (8), lsh-algorithm (6), information-retrieval (6), jaccard-similarity-estimation (6), nlp (5), natural-language-processing (5), data-science (5), big-data (5), search-engine (5), hypercube (4), deduplication (4), naive-bayes-classifier (4), jaccard-distance (4), k-means-clustering (4), near-duplicate-detection (4), minwise-hashing (4), time-series (4), sentiment-analysis (4), tlsh (4), random-projections (4), k-means (4), logistic-regression (4), kd-tree (4), machine-translation (4), k-nearest-neighbors (4), image-processing (4), numpy (4), cosine-similarity (4), minwise-hashing-algorithm (3), pyspark (3), svd (3), hierarchical-clustering (3), page-rank (3), hashing-algorithm (3), collaborative-filtering (3), recommender-system (3), mapreduce (3), text-mining (3), minhash-sketches (3), nearest-neighbors (3), word-embeddings (3), transformers (3), pca (3), python3 (3), high-dimensional-data (3), cosine-distance (3), java (3), pytorch (3), golang (3), go (3), similarity (3), shingling (3), elasticsearch (3), document-similarity (3), hyperloglog (2), random-indexing (2), machine-learning-algorithms (2), digest (2), hash (2), spark (2), datamining (2), nodejs (2), nilsimsa (2), pyterrier (2), search-engine-optimization (2), knn (2), dimensionality-reduction (2), bert (2), word2vec (2), part-of-speech-tagging (2), spam (2), time-series-classification (2), principal-component-analysis (2), ann (2), quantization (2), approximate-nearest-neighbors (2), attention-model (2), tf-idf (2), qalsh (2), latent-dirichlet-allocation (2), dataset (2), viterbi-algorithm (2), sketch (2), weighted-sets (2), zero-shot-learning (2), image (2)