An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: locality-sensitive-hashing

spotify/annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

Language: C++ - Size: 1.93 MB - Last synced at: about 13 hours ago - Pushed at: 9 months ago - Stars: 13,685 - Forks: 1,187

AddictedCS/soundfingerprinting

Open source audio fingerprinting in .NET. An efficient algorithm for acoustic fingerprinting written purely in C#.

Language: C# - Size: 60.5 MB - Last synced at: about 13 hours ago - Pushed at: about 1 month ago - Stars: 977 - Forks: 195

dr-mushtaq/Natural-language-processing

This repository covers all things Natural Language Processing - an A-Z guide to the world of Data Science. This supplement contains implementations of algorithms, statistical methods and techniques (in Python)

Language: Jupyter Notebook - Size: 1.49 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 14 - Forks: 4

JoshEngels/FLINNG

A fast high dimensional near neighbor search algorithm based on group testing and locality sensitive hashing

Language: C++ - Size: 117 MB - Last synced at: 3 days ago - Pushed at: over 1 year ago - Stars: 23 - Forks: 4

ekzhu/datasketch

MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW

Language: Python - Size: 5.68 MB - Last synced at: 14 days ago - Pushed at: 11 months ago - Stars: 2,667 - Forks: 296

alexklibisz/elastiknn

Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.

Language: Scala - Size: 139 MB - Last synced at: 9 days ago - Pushed at: 15 days ago - Stars: 379 - Forks: 49

amatov/DifferentialDiagnosisCBIR

Image retrieval can facilitate medical diagnosis by identifying categories of phenotypes, similar to those of a new patient presented for diagnosis, which have already been assigned a diagnosis

Language: Python - Size: 604 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 1 - Forks: 0

FALCONN-LIB/FALCONN

FAst Lookups of Cosine and Other Nearest Neighbors (based on fast locality-sensitive hashing)

Language: C - Size: 4.93 MB - Last synced at: 8 days ago - Pushed at: 11 months ago - Stars: 1,150 - Forks: 194

SethEra666/disk-optimization

Disk optimization refers to the process of improving the performance and efficiency of a storage device, such as a hard disk drive (HDD) or solid-state drive (SSD). This involves a variety of techniques aimed at optimizing how data is stored, accessed, and managed on the disk, ultimately improving system performance

Size: 2.93 KB - Last synced at: 27 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

glaslos/tlsh

TLSH lib in Golang

Language: Go - Size: 658 KB - Last synced at: 15 days ago - Pushed at: over 2 years ago - Stars: 138 - Forks: 16

serega/gaoya

Locality Sensitive Hashing

Language: Rust - Size: 236 KB - Last synced at: 14 days ago - Pushed at: almost 2 years ago - Stars: 72 - Forks: 7

justinbt1/Akin

Python library for detecting near duplicate texts in a corpus at scale.

Language: Python - Size: 2.77 MB - Last synced at: 6 days ago - Pushed at: about 2 months ago - Stars: 8 - Forks: 0

zyocum/dedup

Find duplicate text files.

Language: Python - Size: 19.5 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 14 - Forks: 3

loretoparisi/lshash

locality sensitive hashing (LSHASH) for Python3

Language: Python - Size: 63.5 KB - Last synced at: about 1 month ago - Pushed at: about 1 year ago - Stars: 65 - Forks: 8

rdspring1/LSH_DeepLearning

Scalable and Sustainable Deep Learning via Randomized Hashing

Language: Java - Size: 26.4 KB - Last synced at: 17 days ago - Pushed at: almost 3 years ago - Stars: 93 - Forks: 22

dselivanov/LSHR

Locality Sensitive Hashing In R

Language: R - Size: 98.6 KB - Last synced at: 9 days ago - Pushed at: over 6 years ago - Stars: 40 - Forks: 13

cmdevries/LMW-tree

Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering

Language: C++ - Size: 74.5 MB - Last synced at: 2 months ago - Pushed at: about 3 years ago - Stars: 74 - Forks: 20

idealista/tlsh-js

JavaScript port of TLSH (Trend Micro Locality Sensitive Hash)

Language: JavaScript - Size: 311 KB - Last synced at: 16 days ago - Pushed at: almost 4 years ago - Stars: 162 - Forks: 16

Ivagnesmanuel/Big-data-computing-homework

Big data computing homework

Language: Shell - Size: 939 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

dataplayer12/Fly-LSH

An implementation of efficient LSH inspired by fruit fly brain

Language: Python - Size: 214 KB - Last synced at: 19 days ago - Pushed at: over 6 years ago - Stars: 88 - Forks: 27

checktor/face_amnesia

Face detection and retrieval in image and video files.

Language: Python - Size: 102 MB - Last synced at: 23 days ago - Pushed at: 4 months ago - Stars: 7 - Forks: 2

dstein64/aghasher

An implementation of Anchor Graph Hashing (Liu et al. 2011) in Python.

Language: Python - Size: 26.2 MB - Last synced at: 8 days ago - Pushed at: 4 months ago - Stars: 9 - Forks: 3

lgautier/mashing-pumpkins

Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.

Language: C - Size: 1.4 MB - Last synced at: 3 days ago - Pushed at: 4 months ago - Stars: 21 - Forks: 3

Nikoletos-K/Winner-Take-All-Hash-Python

🥇Winner Take All Hash algorithm by J. Yagnik, implemented in Python.

Language: Jupyter Notebook - Size: 28.3 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 0

duhaime/minhash

Quickly estimate the similarity between many sets

Language: JavaScript - Size: 1010 KB - Last synced at: 10 days ago - Pushed at: over 2 years ago - Stars: 51 - Forks: 11

jfeser/loopfinder

Extract looping GIFs from longer videos using locality-sensitive hashing.

Language: Python - Size: 13.7 KB - Last synced at: 3 days ago - Pushed at: over 8 years ago - Stars: 5 - Forks: 0

RobCyberLab/Machine-Learning-Search

🔎Machine Learning Search🔍

Language: Python - Size: 1.63 MB - Last synced at: 23 days ago - Pushed at: 5 months ago - Stars: 1 - Forks: 0

oertl/treeminhash

TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation

Language: C++ - Size: 2.62 MB - Last synced at: 9 days ago - Pushed at: about 2 years ago - Stars: 14 - Forks: 3

oertl/probminhash

ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity

Language: C++ - Size: 6.26 MB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 42 - Forks: 6

Forthoney/doc_sim

Approximate document similarity with Minhash + Locality Sensitive Hashing

Language: Ruby - Size: 48.8 KB - Last synced at: 11 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 2

MNoorFawi/lshashing

python library to perform Locality-Sensitive Hashing for faster nearest neighbors search in high dimensional data

Language: Python - Size: 5.24 MB - Last synced at: 19 days ago - Pushed at: 8 months ago - Stars: 19 - Forks: 2

james-bowman/nlp

Selected Machine Learning algorithms for natural language processing and semantic analysis in Golang

Language: Go - Size: 396 KB - Last synced at: 9 months ago - Pushed at: almost 4 years ago - Stars: 445 - Forks: 45

chanzuckerberg/ExpressionMatrix2 📦

Software for exploration of gene expression data from single-cell RNA sequencing.

Language: C++ - Size: 13.4 MB - Last synced at: 24 days ago - Pushed at: almost 6 years ago - Stars: 28 - Forks: 5

petroniocandido/clshq_tk

Contrastive-LSH Embedding and Tokenization Technique for Multivariate Time Series Classification

Language: Jupyter Notebook - Size: 842 KB - Last synced at: 9 months ago - Pushed at: 10 months ago - Stars: 0 - Forks: 0

aj-talaei/Natural_Language_Processing_Specialization

This repository contains my coursework and projects completed during the Natural Language Processing Specialization offered by DeepLearning.AI.

Language: Jupyter Notebook - Size: 44.4 MB - Last synced at: 4 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ajaichemmanam/OTLSH-Tracker

A Tracking Framework for MOT Challenge

Language: Python - Size: 29.3 KB - Last synced at: 25 days ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

michielbuddingh/spamsum

A native go implementation of spamsum

Language: Go - Size: 59.6 KB - Last synced at: 10 months ago - Pushed at: about 7 years ago - Stars: 19 - Forks: 3

learning2hash/learning2hash.github.io

Website for "Awesome Learning to Hash" https://learning2hash.github.io

Language: HTML - Size: 67.1 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 17 - Forks: 5

edawson/mkmh

Generate kmers/minimizers/hashes/MinHash signatures, including with multiple kmer sizes.

Language: C++ - Size: 204 KB - Last synced at: 20 days ago - Pushed at: over 4 years ago - Stars: 24 - Forks: 2

dynatrace-research/set-sketch-paper

SetSketch: Filling the Gap between MinHash and HyperLogLog

Language: C++ - Size: 23.7 MB - Last synced at: 12 months ago - Pushed at: over 3 years ago - Stars: 46 - Forks: 5

zhaoxiaofei/bindash

Fast and precise comparison of genomes and metagenomes (in the order of terabytes) on a typical personal laptop

Language: C++ - Size: 985 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 51 - Forks: 7

taslanidis/Nearest-Neighbors

ANN - Approximate Nearest Neighbors Index with Locality Sensitive Hashing and Hyper Cube projections for vectors and multi-dimensional data.

Language: C++ - Size: 7.46 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

Shao-Group/lsb-learn

A learning algorithm for locality-sensitive bucketing functions

Language: Python - Size: 117 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 2

anirban-code-to-live/tipr-first-assignment

Hola, amigos! Welcome to the first assignment of TIPR-2019.

Language: Python - Size: 10.5 MB - Last synced at: about 1 year ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 6

kamil-sita/image-copy-finder

Multi module project focused on near-duplicate search for images.

Language: Java - Size: 176 KB - Last synced at: 9 months ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 1

huzaifakhan04/music-recommendation-web-application-based-on-rhythmic-similarity-using-locality-sensitive-hashing

This repository contains a web application that integrates with a music recommendation system, which leverages a dataset of 3,415 audio files, each lasting thirty seconds, utilising a Locality-Sensitive Hashing (LSH) implementation to determine rhythmic similarity, as part of an assignment for the Fundamental of Big Data Analytics (DS2004) course.

Language: Jupyter Notebook - Size: 4.93 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 5 - Forks: 0

adamliesko/tlsh

TLSH (Trend Micro Locality Sensitive Hash) library for Ruby

Language: Ruby - Size: 549 KB - Last synced at: 8 months ago - Pushed at: over 7 years ago - Stars: 25 - Forks: 3

SalmaHisham/Analysis-of-the-MovieLen-dataset

Explores the MovieLens dataset (1M version) to uncover valuable insights into user behavior, demographics, movie popularity, and community structures. Various tasks, including data preprocessing, clustering, community detection, and recommendation systems, provide a holistic understanding of the dataset.

Language: Jupyter Notebook - Size: 4.99 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 1

oertl/bagminhash

BagMinHash - Minwise Hashing Algorithm for Weighted Sets

Language: C++ - Size: 1.02 MB - Last synced at: 9 days ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 6

dbrcina/AVSP-FER-2020-21

Lab solutions for Analysis of Massive Datasets ("Analiza velikih skupova podataka") course at FER 2020/21

Language: Java - Size: 1.32 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

kernelmethod/LSHFunctions.jl

Locality-sensitive hashing (LSH) in Julia.

Language: Julia - Size: 1.47 MB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 14 - Forks: 1

idealista/tlsh

Java port of TLSH (Trend Micro Locality Sensitive Hash)

Language: Java - Size: 30.3 KB - Last synced at: 24 days ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 8

AmbarChatterjee/ADM_HW4_Group3

This repository contains code and analysis for a homework assignment on recommendation systems and clustering algorithms in Python. Implements techniques like minhash, LSH, feature engineering, dimensionality reduction, K-means and DBSCAN clustering.

Language: Jupyter Notebook - Size: 48.1 MB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

zyocum/simphon

Proof-of-concept for measuring similarity of phoneme sequences using locality sensitive hashing (LSH).

Language: Jupyter Notebook - Size: 1.23 MB - Last synced at: 4 days ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

nekcht/minhash-lsh-evaluation

Assessing MinHash LSH for text similarity. Compares with kNN using BART embeddings as ground truth. Involves data preprocessing, shingle creation, LSH experiments. Findings inform LSH's efficiency in document similarity tasks, enhancing understanding of LSH techniques.

Language: Jupyter Notebook - Size: 367 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

akdel/locality-sensitive-hashing

fast and simple locality-sensitive hashing implemented in (numba + numpy)

Language: Python - Size: 22.5 KB - Last synced at: 14 days ago - Pushed at: over 3 years ago - Stars: 3 - Forks: 1

geraked/bigdata

Implementation of Big Data Analytics Algorithms in Python

Language: Jupyter Notebook - Size: 11 MB - Last synced at: about 2 months ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

sjmoran/NPQ

Neighbourhood Preserving Quantisation (NPQ) code

Language: PostScript - Size: 214 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

HuangQiang/P2HNNS

Point-to-Hyperplane NNS Beyond the Unit Hypersphere (SIGMOD 2021)

Language: C++ - Size: 42.1 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 10 - Forks: 1

nepiskopos/duplicate-questions-detection-lsh

Knowledge extraction through Data Analysis, including Locality Sensitive Hashing (LSH).

Language: Jupyter Notebook - Size: 423 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

xadityax/Locality-Sensitive-Hashing-DNA-Seqs

Implementing Locality Sensitive Hashing for DNA Sequences.

Language: Python - Size: 1.77 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

alexklibisz/elastik-nearest-neighbors 📦

Go to: https://github.com/alexklibisz/elastiknn

Language: Python - Size: 56.4 MB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 253 - Forks: 62

giannhskp/Software-Development-for-Algorithmic-Problems_Project-2

Neighbor Search and Clustering for Time-Series using Locality-sensitive hashing and Randomized Projection to Hypercube. Time series comparison is performed using Discrete Frechet or Continuous Frechet metric.

Language: C - Size: 33.9 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 1

gurushida/mnemophonix

A simple audio fingerprinting system

Language: C - Size: 316 KB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 25 - Forks: 4

giannhskp/Software-Development-for-Algorithmic-Problems_Project-1 Fork of Sitaras/Software-Development-for-Algorithmic-Problems_Project-1

Neighbor Search and Clustering for Vectors using Locality-sensitive hashing and Randomized Projection to Hypercube

Language: C - Size: 16.4 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 1 - Forks: 0

petroniocandido/nde_tsc

Neural Density Estimation for Time Series Classification

Language: Python - Size: 93.8 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

GemsLab/hashing-based-network-discovery

Hashing-based network discovery from time series

Language: Python - Size: 137 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 8 - Forks: 3

hkthiet2999/Massive-Data-Processing-Course

Research about Massive Data Processing Techniques in Data Science

Language: Jupyter Notebook - Size: 14.3 MB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 2 - Forks: 2

rihenperry/csuci-mscs-thesis-dist-web-crawler

Documents my master's-level thesis work on building a continuous, topical web crawler based on Mercator (1999)

Language: TeX - Size: 27.4 MB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0

PreferredAI/recommendation-retrieval

A tutorial on scalable retrieval of matrix factorization recommendations

Language: Jupyter Notebook - Size: 51.9 MB - Last synced at: 12 months ago - Pushed at: about 6 years ago - Stars: 26 - Forks: 8

ByJuanDiego/db2-project-3

Face Recognition

Language: Python - Size: 13.6 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

micts/jss

Fast Jaccard similarity search for abstract sets (documents, products, users, etc.) using MinHashing and Locality Sensitive Hashing

Language: Python - Size: 23.4 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 3 - Forks: 0

GemsLab/HashAlign

Hashing-based network alignment based on structural features

Language: Python - Size: 34.2 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 4 - Forks: 3

petroniocandido/ts_lsh

Locality-Sensitive Hashing based embedding for High Dimensional Multivariate Time Series

Language: Python - Size: 83 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

annieyan/NearestNeighbor-LSH

Use KD-trees and Locality Sensitive Hashing (LSH) to find exact and approximate nearest neighbor

Language: Java - Size: 3.29 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 2 - Forks: 0

giacomolanciano/Data-Mining-homeworks

Homeworks done within Data Mining course of M.Sc. in Engineering in Computer Science at Università degli Studi di Roma "La Sapienza" (A.Y. 2016/2017), in collaboration with Fabio Rosato and Francisco Ferreres.

Language: TeX - Size: 31.8 MB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 0

ichuniq/Massive-Data-Analysis

Renowned data mining algorithms implemented in PySpark

Language: Jupyter Notebook - Size: 532 KB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

mavaddat/tlsh Fork of trendmicro/tlsh

Locality-sensitive hashing algorithm to identify similar messages. Designed for a range of security and digital forensic applications.

Language: Max - Size: 7.77 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

cchatzis/Nearest-Neighbour-LSH

C++ program that, given a vectorised dataset and query set, performs locality sensitive hashing, finding either the Nearest Neighbour (NN) or the Neighbours within a specified range of each point in the query set, using either Euclidean distance or Cosine Similarity.

Language: C++ - Size: 1.27 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 7 - Forks: 2

thekaranacharya/ai-visual-reasoning

A pipeline to explain any CNN Image Classification model outputs using a combination of GradCAM(visual) and Case-based Reasoning methods

Language: Jupyter Notebook - Size: 11.5 MB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 2 - Forks: 1

FabrizioSandri/data-mining-project

Data mining project 2022/2023 - Query recommendation system

Language: Python - Size: 29.2 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 1

wherefortravel/minhash-node-rs

MinHash and LSH index written in Rust for Node.js

Language: Rust - Size: 207 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 12 - Forks: 1

florianmorath/Data-Mining

Data Mining course at ETH Zürich.

Language: Python - Size: 71.9 MB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 1 - Forks: 0

johneckberg/HilbertNearestNeighbors

A possible twist on previous work done to use space filling curves as a locality sensitive hashing mechanism

Language: Python - Size: 18.6 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

RUSH-LAB/LSH_Memory

One-Shot Learning using Nearest-Neighbor Search (NNS) and Locality-Sensitive Hashing LSH

Language: Python - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: almost 7 years ago - Stars: 70 - Forks: 18

HuangQiang/QALSH_Mem

Query-Aware LSH for Approximate NNS (Memory Version of QALSH)

Language: C++ - Size: 7.85 MB - Last synced at: almost 2 years ago - Pushed at: almost 4 years ago - Stars: 13 - Forks: 0

ahiralesc/NNS

Nearest neighbor search (NNS)

Language: Python - Size: 214 KB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 2

jasonfilippou/DimReduce

Implementations of 3 linear and non-linear dimensionality reduction algorithms

Language: Python - Size: 48.3 MB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1

super-m-a-n/Time-Series-Similarity-Search-and-Clustering

similarity search and clustering algorithms for time-series represented as euclidean polygonal curves

Language: C++ - Size: 9.68 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

super-m-a-n/LSH-Hashing-and-Centroid-Based-Clustering

approximation algorithms for exact nearest neighbors search and clustering on multi-dimensional vectors

Language: C++ - Size: 2.3 MB - Last synced at: almost 2 years ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

DiceTechJobs/VectorsInSearch

Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015

Language: Python - Size: 49.8 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 82 - Forks: 15

rdspring1/LSH_Memory

One-Shot Learning using Nearest-Neighbor Search (NNS) and Locality-Sensitive Hashing LSH

Language: Python - Size: 40 KB - Last synced at: 19 days ago - Pushed at: over 7 years ago - Stars: 2 - Forks: 2

Hussein-Fadl/cs422-locality-sensitive-hashing

This project implements a large-scale data processing pipeline over IMDB dataset for rating aggregation and similarity search.

Language: Scala - Size: 332 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

RUSH-LAB/LSH_DeepLearning Fork of rdspring1/LSH_DeepLearning

Scalable and Sustainable Deep Learning via Randomized Hashing

Language: Java - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 0

ludwigfriborg/SwiftNilsimsa

Nilsimsa implementation as a swift package

Language: Swift - Size: 18.6 KB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

nicoDs96/Document-Similarity-using-Python-and-PySpark

Document Similarity with Apache Spark using Locality Sensitive Hashing and Python

Language: Jupyter Notebook - Size: 444 KB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 6 - Forks: 1

bahromnematov/Easy_Localization

Easy_Localization

Language: C++ - Size: 261 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 0

HuangQiang/QALSH

Query-Aware LSH for Approximate NNS (PVLDB 2015 and VLDBJ 2017)

Language: C++ - Size: 9.95 MB - Last synced at: about 2 years ago - Pushed at: almost 4 years ago - Stars: 20 - Forks: 8

mendesk/image-ndd-lsh

Near-duplicate image detection using Locality Sensitive Hashing

Language: Python - Size: 16.1 MB - Last synced at: about 2 years ago - Pushed at: over 3 years ago - Stars: 49 - Forks: 13

SSQ/Coursera-UW-Machine-Learning-Clustering-Retrieval

Language: Python - Size: 81.9 MB - Last synced at: about 2 years ago - Pushed at: about 6 years ago - Stars: 28 - Forks: 28

Related Keywords
locality-sensitive-hashing (141), lsh (31), minhash (23), python (19), nearest-neighbor-search (17), clustering (11), machine-learning (11), data-mining (10), jaccard-similarity (10), minhash-lsh-algorithm (9), similarity-search (9), deep-learning (8), simhash (8), hashing (8), approximate-nearest-neighbor-search (8), lsh-algorithm (6), information-retrieval (6), jaccard-similarity-estimation (6), nlp (5), natural-language-processing (5), data-science (5), big-data (5), search-engine (5), hypercube (4), deduplication (4), naive-bayes-classifier (4), jaccard-distance (4), k-means-clustering (4), near-duplicate-detection (4), minwise-hashing (4), time-series (4), sentiment-analysis (4), tlsh (4), random-projections (4), k-means (4), logistic-regression (4), kd-tree (4), machine-translation (4), k-nearest-neighbors (4), image-processing (4), numpy (4), cosine-similarity (4), minwise-hashing-algorithm (3), pyspark (3), svd (3), hierarchical-clustering (3), page-rank (3), hashing-algorithm (3), collaborative-filtering (3), recommender-system (3), mapreduce (3), text-mining (3), minhash-sketches (3), nearest-neighbors (3), word-embeddings (3), transformers (3), pca (3), python3 (3), high-dimensional-data (3), cosine-distance (3), java (3), pytorch (3), golang (3), go (3), similarity (3), shingling (3), elasticsearch (3), document-similarity (3), hyperloglog (2), random-indexing (2), machine-learning-algorithms (2), digest (2), hash (2), spark (2), datamining (2), nodejs (2), nilsimsa (2), pyterrier (2), search-engine-optimization (2), knn (2), dimensionality-reduction (2), bert (2), word2vec (2), part-of-speech-tagging (2), spam (2), time-series-classification (2), principal-component-analysis (2), ann (2), quantization (2), approximate-nearest-neighbors (2), attention-model (2), tf-idf (2), qalsh (2), latent-dirichlet-allocation (2), dataset (2), viterbi-algorithm (2), sketch (2), weighted-sets (2), zero-shot-learning (2), image (2)