GitHub topics: string-distance
feature23/StringSimilarity.NET
A .NET port of java-string-similarity
Language: C# - Size: 566 KB - Last synced at: 7 days ago - Pushed at: 9 days ago - Stars: 504 - Forks: 72
cicirello/JavaPermutationTools
A Java library for computation on permutations and sequences
Language: Java - Size: 6.79 MB - Last synced at: 7 days ago - Pushed at: 10 days ago - Stars: 19 - Forks: 11
matthieugomez/StringDistances.jl
String Distances in Julia
Language: Julia - Size: 416 KB - Last synced at: 3 days ago - Pushed at: about 2 months ago - Stars: 141 - Forks: 18
Turnerj/Quickenshtein
Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support
Language: C# - Size: 324 KB - Last synced at: 14 days ago - Pushed at: about 2 years ago - Stars: 310 - Forks: 13
jens-muenker/fuzzywuzzy-kotlin
Android library for string matching based on JavaWuzzy. The algorithm uses Levenshtein distance to calculate similarity between strings.
Language: Kotlin - Size: 186 KB - Last synced at: 26 days ago - Pushed at: 26 days ago - Stars: 8 - Forks: 1
adrg/strutil
Go metrics for calculating string similarity and other string utility functions
Language: Go - Size: 123 KB - Last synced at: 26 days ago - Pushed at: 27 days ago - Stars: 404 - Forks: 27
ac000/libac
A C library of miscellaneous utility functions
Language: C - Size: 336 KB - Last synced at: about 1 month ago - Pushed at: about 1 month ago - Stars: 1 - Forks: 2
aeye-sa/jedi-ts
A TypeScript implementation of JSON Edit dIstance, as described in the paper "JEDI: These aren't the JSON documents you're looking for..."
Language: TypeScript - Size: 324 KB - Last synced at: about 2 months ago - Pushed at: about 2 months ago - Stars: 0 - Forks: 0
fasiha/mudderjs
Lexicographically-subdivide the “space” between strings, by defining an alternate non-base-ten number system using a pre-defined dictionary of symbol↔︎number mappings. Handy for ordering NoSQL keys.
Language: JavaScript - Size: 508 KB - Last synced at: about 2 months ago - Pushed at: 10 months ago - Stars: 129 - Forks: 9
agext/levenshtein
Levenshtein distance and similarity metrics with customizable edit costs and Winkler-like bonus for common prefix.
Language: Go - Size: 23.4 KB - Last synced at: 19 days ago - Pushed at: about 5 years ago - Stars: 91 - Forks: 7
Daniel-Liu-c0deb0t/triple_accel
Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.
Language: Rust - Size: 182 KB - Last synced at: about 1 month ago - Pushed at: over 2 years ago - Stars: 109 - Forks: 12
andrewjsaid/levenshtypo
A fuzzy string dictionary based on Levenshtein automata
Language: C# - Size: 1.68 MB - Last synced at: about 2 months ago - Pushed at: 5 months ago - Stars: 6 - Forks: 0
technikhil314/offline-diff-viewer
A privacy-focused, easily shareable, open-source diff viewer with anonymous tracking.
Language: Vue - Size: 1.37 MB - Last synced at: about 2 months ago - Pushed at: 3 months ago - Stars: 105 - Forks: 22
cicirello/jpt-examples
Example programs for the JavaPermutationTools (JPT) library
Language: Java - Size: 325 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 1 - Forks: 0
cadmiumcr/cadmium
Natural Language Processing (NLP) library for Crystal
Language: Crystal - Size: 9.24 MB - Last synced at: about 1 month ago - Pushed at: almost 4 years ago - Stars: 208 - Forks: 14
ArveH/StringStuff
C# version of calculating the Damerau Levenshtein Distance. Code is based on the Wikipedia article.
Language: C# - Size: 12.7 KB - Last synced at: 2 months ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 0
hbollon/go-edlib
📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.
Language: Go - Size: 83 KB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 523 - Forks: 27
tdebatty/java-string-similarity
Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...
Language: Java - Size: 729 KB - Last synced at: 5 months ago - Pushed at: over 3 years ago - Stars: 2,726 - Forks: 417
anirbanmu/str_metrics
Ruby gem (native extension in Rust) providing implementations of various string metrics
Language: Ruby - Size: 94.7 KB - Last synced at: 6 days ago - Pushed at: over 3 years ago - Stars: 77 - Forks: 2
J535D165/recordlinkage
A powerful and modular toolkit for record linkage and duplicate detection in Python
Language: Python - Size: 70 MB - Last synced at: 7 months ago - Pushed at: almost 2 years ago - Stars: 1,007 - Forks: 156
xdrop/fuzzywuzzy
Java implementation of the well-known Python fuzzywuzzy fuzzy string matching algorithm. Fuzzy search for Java.
Language: Java - Size: 415 KB - Last synced at: 7 months ago - Pushed at: over 2 years ago - Stars: 839 - Forks: 125
dedupeio/pyhacrf Fork of dirko/pyhacrf
📐 Hidden alignment conditional random field for classifying string pairs.
Language: Python - Size: 428 KB - Last synced at: 23 minutes ago - Pushed at: 17 days ago - Stars: 24 - Forks: 12
hyperjumptech/beda
Beda is a Go library for detecting how similar two strings are
Language: Go - Size: 20.5 KB - Last synced at: 7 months ago - Pushed at: almost 5 years ago - Stars: 55 - Forks: 5
sandinmyjoints/equivalency
Declaratively define rules for string equivalence so you can focus on the differences that matter.
Language: JavaScript - Size: 1.57 MB - Last synced at: 2 months ago - Pushed at: 8 months ago - Stars: 4 - Forks: 0
wyndow/fuzzywuzzy
Fuzzy string matching for PHP
Language: PHP - Size: 3.91 KB - Last synced at: 3 months ago - Pushed at: over 1 year ago - Stars: 72 - Forks: 31
simonschoelly/InformationDistances.jl
A small Julia library for calculating the normalized compression distance.
Language: Julia - Size: 268 KB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 1
dexyk/stringosim
String similarity functions, string distances, Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS (Longest Common Subsequence), Cosine similarity...
Language: Go - Size: 16.6 KB - Last synced at: 8 months ago - Pushed at: about 8 years ago - Stars: 61 - Forks: 8
lovit/levenshtein_finder
Similar string search in Levenshtein distance
Language: Python - Size: 3.02 MB - Last synced at: 3 months ago - Pushed at: over 4 years ago - Stars: 21 - Forks: 2
andreiamatuni/strings
strings for zig
Size: 490 KB - Last synced at: 7 months ago - Pushed at: over 5 years ago - Stars: 14 - Forks: 3
iesl/stance
Learned string similarity for entity names using optimal transport.
Language: Python - Size: 71.3 KB - Last synced at: 5 months ago - Pushed at: about 5 years ago - Stars: 35 - Forks: 3
Dynom/TySug
A project for helping to prevent typos. TySug (Typo Suggestions) suggests alternative words with respect to keyboard layouts
Language: Go - Size: 440 KB - Last synced at: about 2 months ago - Pushed at: almost 3 years ago - Stars: 19 - Forks: 3
nkkarpov/editdistancek
LMS algorithm for computing edit distance with SIMD optimizations
Language: Rust - Size: 18.6 KB - Last synced at: about 2 months ago - Pushed at: over 1 year ago - Stars: 11 - Forks: 2
dedupeio/affinegap
📐 A Cython implementation of the affine gap string distance
Language: Cython - Size: 71.3 KB - Last synced at: 14 days ago - Pushed at: almost 3 years ago - Stars: 57 - Forks: 10
dev-ahmadbilal/string-master
A comprehensive JS/TS library with 18 specialized classes for string manipulation, conversion, validation, and more. Streamline your development with powerful, all-in-one solutions.
Language: TypeScript - Size: 600 KB - Last synced at: 8 months ago - Pushed at: 12 months ago - Stars: 4 - Forks: 0
sumn2u/string-comparisons
A collection of string comparisons algorithms
Language: JavaScript - Size: 700 KB - Last synced at: 6 months ago - Pushed at: over 1 year ago - Stars: 14 - Forks: 5
t-ski/string-similarity-algorithms
Common string similarity algorithm implementations.
Language: Python - Size: 6.84 KB - Last synced at: 2 days ago - Pushed at: over 1 year ago - Stars: 2 - Forks: 0
lorenzocestaro/seqalign
Collection of sequence alignment algorithms.
Language: JavaScript - Size: 366 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 25 - Forks: 3
ywu94/python-text-distance
A Python implementation of a variety of text/string distance and similarity metrics. No GPL!
Language: Python - Size: 62.5 KB - Last synced at: 8 months ago - Pushed at: over 5 years ago - Stars: 8 - Forks: 2
mtingers/hashfuzz
Detects similarities between strings & generates similarity hash
Language: Python - Size: 16.6 KB - Last synced at: 9 months ago - Pushed at: almost 7 years ago - Stars: 2 - Forks: 1
OlivierBinette/groupbyrule
Deduplicate data using fuzzy and deterministic matching rules.
Language: Python - Size: 11.9 MB - Last synced at: over 1 year ago - Pushed at: over 2 years ago - Stars: 7 - Forks: 0
Elara6331/pak
This repository is a mirror. Do not post issues or PRs here.
Language: Go - Size: 42 KB - Last synced at: 8 months ago - Pushed at: over 4 years ago - Stars: 2 - Forks: 0
mehrandvd/Simila
A project for string similarities.
Language: C# - Size: 2.2 MB - Last synced at: 7 months ago - Pushed at: about 2 years ago - Stars: 13 - Forks: 5
jancajthaml-go/levenstein
Levenshtein string distance
Language: Go - Size: 17.6 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 1 - Forks: 0
sarveshj/Longest-Subsequence-Alphabetical-Order
program that prints the longest substring in which the letters occur in alphabetical order
Language: Python - Size: 2.93 KB - Last synced at: over 2 years ago - Pushed at: about 8 years ago - Stars: 0 - Forks: 0
bitfoundation/Simila Fork of mehrandvd/Simila
A project for string similarities.
Language: C# - Size: 2.2 MB - Last synced at: 3 months ago - Pushed at: about 5 years ago - Stars: 9 - Forks: 1
TeodorDyakov/wildcard-trie
String trie that supports wildcard search
Language: Java - Size: 14.6 KB - Last synced at: over 2 years ago - Pushed at: almost 5 years ago - Stars: 5 - Forks: 7
ecomp-shONgit/string-distance
A set of (string) distance functions written in JavaScript / Python / PHP.
Language: PHP - Size: 95.7 KB - Last synced at: over 2 years ago - Pushed at: about 3 years ago - Stars: 17 - Forks: 3
lagodiuk/levenshtein-top-k
Algorithm for the derivation of the top-K string alignments, based on the Levenshtein distance.
Language: Java - Size: 1.09 MB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 4 - Forks: 0
sfischer13/python-stringmetric 📦
🐍 Python implementations of common string distance and similarity algorithms
Language: Python - Size: 7.81 KB - Last synced at: 2 months ago - Pushed at: over 8 years ago - Stars: 1 - Forks: 2
obulkin/string-dist
A Python library for calculating string distances using C extensions (with a pure Python fallback)
Language: Python - Size: 15.6 KB - Last synced at: 3 months ago - Pushed at: over 5 years ago - Stars: 16 - Forks: 3
ZippeyKeys12/pymatching
String comparisons
Language: Python - Size: 62.5 KB - Last synced at: over 2 years ago - Pushed at: over 5 years ago - Stars: 1 - Forks: 0
Adamishere/Fuzzymatching
Matching records based on imperfect strings using string distances to assign the closest match. Optimized for large files on a single computer.
Language: R - Size: 7.81 KB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 6 - Forks: 2
henrik9999/string-similarity
Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.
Language: PHP - Size: 11.7 KB - Last synced at: about 2 months ago - Pushed at: about 2 years ago - Stars: 1 - Forks: 2
sp1ff/damerau-levenshtein
Comparison of a few algorithms for computing Damerau–Levenshtein distance
Language: Shell - Size: 154 KB - Last synced at: almost 3 years ago - Pushed at: over 4 years ago - Stars: 3 - Forks: 1
subpath/pystrsim
Python wrapper for Rust's strsim library
Language: Python - Size: 17.6 KB - Last synced at: almost 3 years ago - Pushed at: over 3 years ago - Stars: 1 - Forks: 0
seehuhn/go-levenshtein
compute the Levenshtein distance between two strings in Go
Language: Go - Size: 14.6 KB - Last synced at: 12 months ago - Pushed at: about 3 years ago - Stars: 0 - Forks: 0
dedupeio/highered
CRF Edit Distance
Language: Python - Size: 9.77 KB - Last synced at: 2 months ago - Pushed at: over 5 years ago - Stars: 6 - Forks: 4
Alex-Werner/khal
Utils for node project
Language: JavaScript - Size: 43.9 KB - Last synced at: about 2 months ago - Pushed at: over 8 years ago - Stars: 5 - Forks: 1
calebwin/quill
A high-level API for computing edit distance
Language: Java - Size: 9.77 KB - Last synced at: 3 months ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 1
ailgroup/string_dist
String distances in rust
Language: Rust - Size: 33.2 KB - Last synced at: over 2 years ago - Pushed at: over 6 years ago - Stars: 1 - Forks: 2
agricolamz/2017_ANDAN_course
Course for ANDAN Summer School about strings and texts in R
Size: 47.3 MB - Last synced at: 6 months ago - Pushed at: almost 8 years ago - Stars: 2 - Forks: 0
krokerik/capital-guesser
Language: JavaScript - Size: 8.79 KB - Last synced at: about 1 year ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0
lqdc/pysimstr
Fast(ish) string similarity for one vs many comparisons.
Language: Python - Size: 8.79 KB - Last synced at: over 1 year ago - Pushed at: over 9 years ago - Stars: 2 - Forks: 1
johnny-morrice/anagram
Fun anagram ranking tool for golang
Language: Go - Size: 16.6 KB - Last synced at: about 2 months ago - Pushed at: over 8 years ago - Stars: 2 - Forks: 0
paulirwin/StringSimilarity.NET Fork of feature23/StringSimilarity.NET
Language: C# - Size: 482 KB - Last synced at: over 2 years ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0
stuartcraig/lev
A Mata-based Stata command to calculate Levenshtein edit distance.
Language: Stata - Size: 1.95 KB - Last synced at: over 2 years ago - Pushed at: almost 8 years ago - Stars: 0 - Forks: 0
xiy/distance
A go-kit based microservice for the Levenshtein distance algorithm
Language: Lua - Size: 12.7 KB - Last synced at: almost 3 years ago - Pushed at: over 9 years ago - Stars: 1 - Forks: 0
elazzabi/fuzzy-string-matching
Get the degree of resemblance between two strings
Language: JavaScript - Size: 3.91 KB - Last synced at: 6 months ago - Pushed at: over 9 years ago - Stars: 0 - Forks: 0