GitHub topics: jaccard-similarity
ekzhu/datasketch
MinHash, LSH, LSH Forest, Weighted MinHash, HyperLogLog, HyperLogLog++, LSH Ensemble and HNSW
Language: Python - Size: 5.68 MB - Last synced at: 3 days ago - Pushed at: 12 months ago - Stars: 2,699 - Forks: 299

ashvardanian/jaccard-index
Optimizing bit-level Jaccard Index and Population Counts for large-scale quantized Vector Search via Harley-Seal CSA and Lookup Tables
Language: Jupyter Notebook - Size: 76.2 KB - Last synced at: 7 days ago - Pushed at: 7 days ago - Stars: 18 - Forks: 1

PiotrTymoszuk/FGFR-BLCA
Genetic alterations and expression of genes coding for FGF ligands and FGF receptors in urothelial cancer
Language: R - Size: 163 MB - Last synced at: 9 days ago - Pushed at: 9 days ago - Stars: 0 - Forks: 0

adrg/strutil
Go metrics for calculating string similarity and other string utility functions
Language: Go - Size: 111 KB - Last synced at: 3 days ago - Pushed at: 19 days ago - Stars: 382 - Forks: 25

izikeros/sentence-plagiarism
Compare sentences from input document with all sentences from reference documents - find very similar ones.
Language: Python - Size: 244 KB - Last synced at: 5 days ago - Pushed at: 15 days ago - Stars: 3 - Forks: 0

matiskay/html-similarity
Compare html similarity using structural and style metrics
Language: Python - Size: 64.5 KB - Last synced at: 5 days ago - Pushed at: about 2 years ago - Stars: 211 - Forks: 23

Jingnan-Jia/segmentation_metrics
A package to compute medical segmentation metrics.
Language: Python - Size: 171 KB - Last synced at: 8 days ago - Pushed at: 10 months ago - Stars: 159 - Forks: 12

dennismgoetz/DataMining
"Data Mining" course at the University of Trento
Language: Jupyter Notebook - Size: 68.6 MB - Last synced at: 7 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 0

chrismattmann/tika-similarity
Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.
Language: Python - Size: 3.22 MB - Last synced at: 21 days ago - Pushed at: about 1 month ago - Stars: 107 - Forks: 60

RobCyberLab/Ngram-Similarity-Engine
🤖Ngram Similarity Engine📚
Language: Python - Size: 3.62 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 1 - Forks: 0

Dakshmulundkar/SocialVoyage
Social Voyage is a travel matchmaking app that connects users based on shared destinations, group size, gender, and age. It features secure authentication, profile management, friend requests, and real-time matchmaking using Jaccard similarity. Built with Flask, MongoDB, and a modern UI, it makes travel social and fun! 🚀
Language: HTML - Size: 1.91 MB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 1

miltiadiss/CEID_NE4338-Multidimensional-Data-Structures
This project implements multi-dimensional indices (k-d trees, quad trees, range trees, R-trees) for querying computer scientists' data by surname, awards, and publications, with education similarity measured using LSH, comparing the methods experimentally.
Language: Python - Size: 3.29 MB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

adityapathakk/match-resume-with-jobDescription Fork of adityapathak-cubastion/match-resume-with-jobDescription
This project aims to make the process of matching resumes with a particular job description much faster. Simply enter the required job-description and all the resumes that need to be filtered and run the script to find the top scorer as well as the 'n' best matching resumes! Built using Python, Hugging Face and Scikit-Learn.
Language: Python - Size: 219 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

adityapathak-cubastion/match-resume-with-jobDescription
This project aims to make the process of matching resumes with a particular job description much faster. Simply enter the required job-description and all the resumes that need to be filtered and run the script to find the top scorer as well as the 'n' best matching resumes! Built using Python, Hugging Face and Scikit-Learn.
Language: Python - Size: 265 KB - Last synced at: about 1 month ago - Pushed at: 3 months ago - Stars: 0 - Forks: 1

xSenzaki/Automated-Essay-Checker
A project requirement for the subject 'CS303 - Automata Theory'
Language: Python - Size: 0 Bytes - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

ashithapallath/KNN-Distance-Measures
This project compares k-NN performance using different distance metrics. Euclidean, Manhattan, and Minkowski achieved 100% accuracy, making them ideal for numerical data. Cosine Similarity performed well (93.33%), while Hamming and Jaccard were ineffective (33.33%).
Language: Jupyter Notebook - Size: 89.8 KB - Last synced at: about 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

berksudan/Where-is-the-Answer
A Turkish NLP tool built as a computer project. Used: Python 3, Word2Vec, Natural Language Processing Techniques, Linux Bash Script.
Language: Python - Size: 183 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

mrkkrp/text-metrics
Calculate various string metrics efficiently in Haskell
Language: Haskell - Size: 122 KB - Last synced at: 22 days ago - Pushed at: 4 months ago - Stars: 44 - Forks: 4

iMD10/CS315-Texts-Similarity
This repository showcases a project developed for the CS315 Algorithms Design and Analysis course, focusing on finding the similarity between two texts using Jaccard Similarity.
Language: Python - Size: 3.7 MB - Last synced at: 30 days ago - Pushed at: 5 months ago - Stars: 4 - Forks: 0

italo-batista/lsh-semantic-similarity
Locality Sensitive Hashing for semantic similarity (Python 3.x)
Language: Python - Size: 9.77 KB - Last synced at: 17 days ago - Pushed at: almost 7 years ago - Stars: 15 - Forks: 2

Abdelrahman-Amen/Word-Embedding
This code showcases text preprocessing (tokenization, stopword removal, and standardization), training a Word2Vec model to generate word embeddings, and analyzing word relationships using metrics like cosine similarity and Jaccard index. It also visualizes high-dimensional embeddings in 2D using MDS, illustrating how similar words cluster together
Language: Jupyter Notebook - Size: 793 KB - Last synced at: 2 months ago - Pushed at: 4 months ago - Stars: 0 - Forks: 0

MrPowers/spark-stringmetric
Spark functions to run popular phonetic and string matching algorithms
Language: Scala - Size: 457 KB - Last synced at: about 2 months ago - Pushed at: about 3 years ago - Stars: 60 - Forks: 6

lgautier/mashing-pumpkins
Minhash and maxhash library in Python, combining flexibility, expressivity, and performance.
Language: C - Size: 1.4 MB - Last synced at: 1 day ago - Pushed at: 5 months ago - Stars: 21 - Forks: 3

sumn2u/string-comparisons
A collection of string comparisons algorithms
Language: JavaScript - Size: 700 KB - Last synced at: about 2 months ago - Pushed at: about 1 year ago - Stars: 14 - Forks: 5

atkamara/Taxability
Descriptive, predictive analysis of taxability
Language: Jupyter Notebook - Size: 48.2 MB - Last synced at: 6 months ago - Pushed at: 6 months ago - Stars: 1 - Forks: 0

SonakshiA/Similarity-Score-Techniques
The repository shows 6 techniques to measure similarity to determine how similar two pieces of text are. Similarity Measure plays an important role in document/information retrieval, machine translation, question-answering, and document matching.
Language: Jupyter Notebook - Size: 4.88 KB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 0

ikajdan/article_similarity_analysis
Analysis of the similarity between articles based on their content using TF-IDF and LDA
Language: Python - Size: 8.11 MB - Last synced at: 1 day ago - Pushed at: 7 months ago - Stars: 0 - Forks: 1

wajahati/ZAROORAT-ReactNative-Firebase-App
ZAROORAT is a react native app where users can buy and sell stuff.
Language: JavaScript - Size: 1.8 MB - Last synced at: 7 months ago - Pushed at: 7 months ago - Stars: 0 - Forks: 2

IgorSAlencar/SimilaridadeJaccardCosseno
Code developed for the final undergraduate project (TCC) of the Mathematics teaching degree at IFSP - Campus Itaquaquecetuba, as part of the requirements for obtaining the degree. The project applies cosine similarity and Jaccard similarity techniques to analyze customer feedback.
Language: Jupyter Notebook - Size: 213 KB - Last synced at: about 2 months ago - Pushed at: 8 months ago - Stars: 0 - Forks: 0

vickumar1981/stringdistance
A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.
Language: Scala - Size: 1.27 MB - Last synced at: 10 days ago - Pushed at: about 3 years ago - Stars: 78 - Forks: 14

Animesh-Chourey/Loan-Classifier
Trained machine learning algorithms (Logistic Regression, KNN, SVM, Decision Tree) after performing visualization and preprocessing tasks on a loan dataset. Assessed the algorithms' performance using evaluation metrics such as F1-score, log loss, and Jaccard similarity score.
Language: Jupyter Notebook - Size: 29.3 KB - Last synced at: 3 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

SasheVuchkov/near-duplicate-docs
Simple library for finding duplicate and near-duplicate text documents in massive sets/libraries/databases
Language: TypeScript - Size: 2 MB - Last synced at: about 1 month ago - Pushed at: over 3 years ago - Stars: 9 - Forks: 0

oertl/treeminhash
TreeMinHash: Fast Sketching for Weighted Jaccard Similarity Estimation
Language: C++ - Size: 2.62 MB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 14 - Forks: 3

oertl/probminhash
ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity
Language: C++ - Size: 6.26 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 42 - Forks: 6

FaridYusifli/AMDM_hw4
Homework 4 of Algorithmic Methods for Data Mining. We deal with networks and graphs with about 1,000,000 nodes.
Language: Jupyter Notebook - Size: 1.54 MB - Last synced at: 10 months ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

dartseoengineer/keyword-clustering
This repository provides a Python script to cluster keywords based on the similarity of their associated URLs, calculated using the Jaccard similarity coefficient.
Language: Python - Size: 14.6 KB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

kumaranjalij/Flora-Genie
Flora Genie is a personalized plant recommendation system designed to help amateur gardeners select the most suitable plants for their homes or gardens.
Language: Jupyter Notebook - Size: 227 MB - Last synced at: 11 months ago - Pushed at: 11 months ago - Stars: 0 - Forks: 0

andrewmcloud/consimilo
A Clojure library for querying large data-sets on similarity
Language: Clojure - Size: 536 KB - Last synced at: 9 days ago - Pushed at: over 6 years ago - Stars: 63 - Forks: 4

ppw0/minhash
find similar text files quickly
Language: Python - Size: 53.7 KB - Last synced at: 6 months ago - Pushed at: about 4 years ago - Stars: 6 - Forks: 1

adriacabeza/Document-similarity-detection-using-hashing
Document similarity detection using hashing
Language: TeX - Size: 16 MB - Last synced at: 3 months ago - Pushed at: over 6 years ago - Stars: 3 - Forks: 1

vokter/vokter-scheduler
(WIP)
Size: 0 Bytes - Last synced at: about 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

vokter/vokter-client-java
Sample Jetty/Jersey2 server that interoperates with a running Vokter server (https://github.com/vokter/vokter).
Language: Java - Size: 7.81 KB - Last synced at: about 1 year ago - Pushed at: almost 9 years ago - Stars: 0 - Forks: 0

vokter/vokter-server
(WIP) HTTP server that deploys and distributes Vokter (https://github.com/vokter/vokter) through a REST API.
Size: 3.91 KB - Last synced at: about 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

kavya76/Search-Engine
A simple search engine for Environmental News NLP archive
Language: Jupyter Notebook - Size: 1.43 MB - Last synced at: about 1 year ago - Pushed at: over 3 years ago - Stars: 4 - Forks: 1

john-fotis/Movie-Recommender
A movie recommender written in Go that suggests movies considering various factors within a particular dataset, encompassing users, movies, and movie ratings.
Language: Go - Size: 1.45 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Lefteris-Souflas/Movie-Rating-User-Similarity
Explored Jaccard distance, Min-Hashing, and LSH for user similarity in a movie rating dataset. Tasks involve dataset preprocessing, exact Jaccard Similarity computation, Min-Hash signatures, and LSH implementation. Results and observations are documented in code, output files, and a report
Language: Jupyter Notebook - Size: 1.22 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

Lefteris-Souflas/Entity-Resolution
Addressed Entity Resolution challenges. Tasks include schema-agnostic blocking, pairwise comparisons, Meta-Blocking graph construction, and Jaccard similarity computation. Deliverables include source code, reports, and reproducibility guidelines in Python
Language: Jupyter Notebook - Size: 4.54 MB - Last synced at: 3 months ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

sdevalapurkar/similar-questions
👯 Algorithms using Jaccard similarity to identify questions from a list that are similar to one another
Language: Python - Size: 13.6 MB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

dynatrace-research/set-sketch-paper
SetSketch: Filling the Gap between MinHash and HyperLogLog
Language: C++ - Size: 23.7 MB - Last synced at: about 1 year ago - Pushed at: almost 4 years ago - Stars: 46 - Forks: 5

vkbandari/job_recommendation_engine
recommendation of jobs by various machine learning models
Language: Jupyter Notebook - Size: 8.42 MB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0

emarkou/Text-Similarity
A text similarity computation using minhashing and Jaccard distance on the Reuters dataset
Language: R - Size: 69.3 KB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 16 - Forks: 5

mtshikomba/jaccard_text_summarizer
Using the Jaccard ranking algorithm to summarize a document
Language: Jupyter Notebook - Size: 25.4 KB - Last synced at: about 1 year ago - Pushed at: almost 5 years ago - Stars: 0 - Forks: 0

AdrianaMacc/Covid-19-BigData-Project
SARS-CoV-2 genome analysis using Big Data algorithms to find clusters of similar mutations that belong to different clades, mutate together, and generate the corresponding clade.
Language: Jupyter Notebook - Size: 513 KB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

ngiambla/syn_sugar
Extracting topics using rules.
Language: PureBasic - Size: 888 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

imenbkr/Fraud-Detection-Project
An application for fraud detection in medicine packages and tablets.
Language: Python - Size: 24 MB - Last synced at: about 1 year ago - Pushed at: about 1 year ago - Stars: 0 - Forks: 0

youssefelmougy/jaccard-selector
Asynchronous Distributed Actor-based Approach to Jaccard Similarity for Genome Comparisons
Language: Fortran - Size: 112 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

christinebuckler/provider-prescriber
Language: Jupyter Notebook - Size: 30.5 MB - Last synced at: over 1 year ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1

oertl/bagminhash
BagMinHash - Minwise Hashing Algorithm for Weighted Sets
Language: C++ - Size: 1.02 MB - Last synced at: about 1 month ago - Pushed at: over 4 years ago - Stars: 26 - Forks: 6

MovieTone/JaccardDocumentComparison
Document Comparison web application based on Jaccard Similarity Index. The uploaded file is compared to all previously uploaded ones. Built with Java/JSP
Language: CSS - Size: 16.6 KB - Last synced at: about 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

leocvml/DeepTool
Language: Python - Size: 156 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 1 - Forks: 1

cankobanz/multithreaded-scientific-search-engine
This is a school project from an Operating Systems course in which threads, mutexes, semaphores, task pools, and critical sections are used to ensure synchronization among threads.
Language: C++ - Size: 42 KB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 0

harryyizihan/predict_champions
League of Legends Champion Recommender System
Language: Jupyter Notebook - Size: 27.1 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

ManishaLagisetty/Travel-Recommendation-System
Machine Learning, Python
Language: Jupyter Notebook - Size: 10.3 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

khaosdoctor/sound-recommender
Simple API to recommend songs
Language: TypeScript - Size: 59.6 KB - Last synced at: 6 days ago - Pushed at: about 1 year ago - Stars: 4 - Forks: 0

EslamElbassel/Indexing-and-Documents-Similarity
Measures the similarity between documents by calculating their Jaccard similarity and provides a similarity score based on how similar their sentences are to each other
Language: Java - Size: 7.81 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

ratthapon/simple-shape-classification
Simple shape recognition using Jaccard similarity, implemented in MATLAB.
Language: Matlab - Size: 1000 Bytes - Last synced at: over 1 year ago - Pushed at: about 9 years ago - Stars: 1 - Forks: 0

BeardedMorganKeller/MovieRecommendationEngine
IMDB Movie Recommendation Engine. Uses Jaccard similarity of genres and title similarity
Language: Python - Size: 687 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

iamtusharbhatia/Machine-Learning
This repository contains various assignments that I have done as a part of the Machine Learning course.
Language: R - Size: 3.62 MB - Last synced at: over 1 year ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0

dominic-sagers/MovieLens-20M-Recommender-System
Using the MovieLens 20 Million review dataset, this project aims to explore different ways to design, evaluate, and explain recommender systems algorithms. Different item-based and user-based recommender systems are showcased as well as a hybrid algorithm using a modified page-rank algorithm.
Language: Jupyter Notebook - Size: 0 Bytes - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

jayvatti/spellChecker
Spell Checker using a Hash Table
Language: C++ - Size: 109 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

EdDuarte/similarity-search-java
Easy-to-use Java similarity algorithms for text and numeric-series
Language: Java - Size: 149 KB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 18 - Forks: 10

usc-isi-i2/ppjoin
PPJoin and P4Join Python 3 implementation
Language: Python - Size: 172 KB - Last synced at: over 1 year ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 0

iamr2k/JaccardSimilarity
Flask app to find similar movies using Jaccard similarity
Language: CSS - Size: 13.2 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

ada-k/TweetsClassification
Exploring the performance of Jaccard and cosine similarities, then visualising their output using k-means and k-means with PCA. Additional input on time series analysis, web scraping, and Twitter scraping.
Language: Jupyter Notebook - Size: 525 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 14 - Forks: 9

ddellagiacoma/datamining-2016-project
Four different ways to predict reviews' rating through text analysis
Language: Java - Size: 5.65 MB - Last synced at: over 1 year ago - Pushed at: over 8 years ago - Stars: 0 - Forks: 0

nepiskopos/duplicate-questions-detection-lsh
Knowledge extraction through Data Analysis, including Locality Sensitive Hashing (LSH).
Language: Jupyter Notebook - Size: 423 KB - Last synced at: over 1 year ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

samuel-bohman/jaccard-index
Function for calculating the Jaccard index and Jaccard distance for binary attributes
Language: R - Size: 2.93 KB - Last synced at: over 1 year ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 0

Raghuls-github/Best-Classifier
A set of code and algorithms to find various regressions and then the Jaccard score, F1 score, and log loss.
Language: Jupyter Notebook - Size: 120 KB - Last synced at: over 1 year ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0

hellojudger/AntirattanLite
A simple program to compute the similarity between the solution and the code, function by function.
Language: Python - Size: 31.3 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

NikosMav/DataAnalysis-Netflix
A notebook for movie and TV show recommendations using Boolean and TF-IDF methods. Get personalized suggestions based on text descriptions and choose the method that suits your preferences.
Language: Jupyter Notebook - Size: 582 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0

Sitaras/Data-Mining
Project 1: 🎬🍿 Movie-Recommendation-System, Project 2: 📰🔍Fake News Detection System
Language: Jupyter Notebook - Size: 9.3 MB - Last synced at: over 1 year ago - Pushed at: about 3 years ago - Stars: 6 - Forks: 0

MagallanesFito/weheart
Meet people just like you
Language: Python - Size: 34.1 MB - Last synced at: over 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0

micts/jss
Fast Jaccard similarity search for abstract sets (documents, products, users, etc.) using MinHashing and Locality Sensitive Hashing
Language: Python - Size: 23.4 KB - Last synced at: over 1 year ago - Pushed at: about 5 years ago - Stars: 3 - Forks: 0

thejchap/catch
Matches gym partners based on schedule, location, and interests using augmented interval trees and Jaccard indices
Language: Ruby - Size: 483 KB - Last synced at: almost 2 years ago - Pushed at: about 6 years ago - Stars: 0 - Forks: 0

92amartins/simple-recommender
A simple content-based recommender system
Language: R - Size: 1000 Bytes - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

fagnercarvalho/QuestionSimilarityTest
Testing Jaccard similarity and Cosine similarity techniques to calculate the similarity between two questions.
Language: C# - Size: 6.84 KB - Last synced at: almost 2 years ago - Pushed at: almost 7 years ago - Stars: 1 - Forks: 0

DorinK/AI-Recommendation-Systems
Third Assignment in 'Artificial Intelligence' course by Dr. Ram Meshulam at Bar-Ilan University
Language: Python - Size: 2.91 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0

holopoj/FHCP
Implementation of the paper "Finding Highly Correlated Pairs with Powerful Pruning" in Java.
Language: Java - Size: 1.56 MB - Last synced at: almost 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 1

mariofv/DocSim
Minhash text analyzer developed for the Algorithmics course.
Language: C++ - Size: 43.1 MB - Last synced at: almost 2 years ago - Pushed at: over 7 years ago - Stars: 0 - Forks: 1

Abdelrahman-Hussain/Jaccard_similarity
A simple application to calculate the Jaccard similarity between an input query and stored docs.
Language: Java - Size: 3.91 KB - Last synced at: almost 2 years ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0

srsviegas/ufrgs-ed-jaccard
Program that computes the Jaccard coefficient between two text files | Data Structures course at UFRGS
Language: C - Size: 536 KB - Last synced at: almost 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

ulf1/simiscore-syntax
An ML API to compute the Jaccard similarity based on shingled subtrees of the dependency grammar.
Language: Python - Size: 64.5 KB - Last synced at: about 2 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

akelsch/spotify-recommender
Recommender system based on the Spotify Million Playlist Dataset
Language: Java - Size: 1.31 MB - Last synced at: almost 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 1

Pooja-Bhojwani/linked-eed
The aim is to build a job recommender system that takes skills from LinkedIn and jobs from Indeed and returns the best jobs available for you according to your skills.
Language: Python - Size: 443 KB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 29 - Forks: 17

chanddu/Sentence-similarity-based-on-Semantic-nets-and-Corpus-Statistics-
This is an implementation of the paper written by Yuhua Li, David McLean, Zuhair A. Bandar, James D. O’Shea, and Keeley Crockett
Language: Python - Size: 2.93 KB - Last synced at: about 1 year ago - Pushed at: over 5 years ago - Stars: 21 - Forks: 9

abdo-essam/Inverted-Index
Implements an inverted index to support text search. The inverted index is built from a set of documents, where each document is represented by a unique integer ID.
Language: Java - Size: 872 KB - Last synced at: 2 months ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

salimtirit/multithreaded-search-engine
Multithreaded scientific search engine in C++ that uses Jaccard Similarity to summarize relevant paper abstracts.
Language: C++ - Size: 61.5 KB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

anshul1004/TweetsClustering
Clustering similar tweets using K-means clustering algorithm and Jaccard distance metric
Language: Python - Size: 3.32 MB - Last synced at: over 1 year ago - Pushed at: over 5 years ago - Stars: 7 - Forks: 4

am-tropin/restaurant-europe
🇪🇺🍽 The project classifies restaurants by various features using XGBoost and scikit-learn models and gives content-based recommendations of European restaurants using Jaccard metric from SciPy.
Language: Jupyter Notebook - Size: 36.6 MB - Last synced at: about 2 years ago - Pushed at: about 2 years ago - Stars: 0 - Forks: 0

elifmeseci/link-prediction-on-complex-networks
Using neighborhood-based link prediction methods to predict new links that will occur in networks created from darts championship competitions
Language: Jupyter Notebook - Size: 5.08 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 2 - Forks: 1
