Ecosyste.ms: Repos

An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: string-distance

cicirello/JavaPermutationTools

A Java library for computation on permutations and sequences

Language: Java - Size: 6.41 MB - Last synced: 3 days ago - Pushed: 4 days ago - Stars: 7 - Forks: 11

feature23/StringSimilarity.NET

A .NET port of java-string-similarity

Language: C# - Size: 509 KB - Last synced: 4 days ago - Pushed: 6 months ago - Stars: 430 - Forks: 70

obulkin/string-dist

A Python library for calculating string distances using C extensions (with a pure Python fallback)

Language: Python - Size: 15.6 KB - Last synced: 5 days ago - Pushed: over 3 years ago - Stars: 16 - Forks: 3

lovit/levenshtein_finder

Similar string search in Levenshtein distance

Language: Python - Size: 3.02 MB - Last synced: 1 day ago - Pushed: almost 3 years ago - Stars: 21 - Forks: 2

J535D165/recordlinkage

A powerful and modular toolkit for record linkage and duplicate detection in Python

Language: Python - Size: 70 MB - Last synced: 8 days ago - Pushed: 3 months ago - Stars: 913 - Forks: 150

cicirello/jpt-examples

Example programs for the JavaPermutationTools (JPT) library

Language: Java - Size: 250 KB - Last synced: 13 days ago - Pushed: 13 days ago - Stars: 0 - Forks: 0

lorenzocestaro/seqalign

Collection of sequence alignment algorithms.

Language: JavaScript - Size: 366 KB - Last synced: 16 days ago - Pushed: about 1 year ago - Stars: 25 - Forks: 2

adrg/strutil

Golang metrics for calculating string similarity and other string utility functions

Language: Go - Size: 99.6 KB - Last synced: 14 days ago - Pushed: 14 days ago - Stars: 280 - Forks: 18

sandinmyjoints/equivalency

Declaratively define rules for string equivalence so you can focus on the differences that matter.

Language: JavaScript - Size: 1.56 MB - Last synced: 23 days ago - Pushed: about 2 months ago - Stars: 4 - Forks: 0

hbollon/go-edlib

📚 String comparison and edit distance algorithms library, featuring: Levenshtein, LCS, Hamming, Damerau-Levenshtein (OSA and adjacent transpositions algorithms), Jaro-Winkler, Cosine, etc.

Language: Go - Size: 76.2 KB - Last synced: 18 days ago - Pushed: almost 2 years ago - Stars: 452 - Forks: 23

fasiha/mudderjs

Lexicographically-subdivide the “space” between strings, by defining an alternate non-base-ten number system using a pre-defined dictionary of symbol↔︎number mappings. Handy for ordering NoSQL keys.

Language: JavaScript - Size: 508 KB - Last synced: 8 days ago - Pushed: over 1 year ago - Stars: 112 - Forks: 9

nkkarpov/editdistancek

LMS algorithm for computing edit distance with SIMD optimizations

Language: Rust - Size: 18.6 KB - Last synced: 24 days ago - Pushed: about 2 months ago - Stars: 5 - Forks: 1

matthieugomez/StringDistances.jl

String Distances in Julia

Language: Julia - Size: 413 KB - Last synced: 11 days ago - Pushed: about 1 month ago - Stars: 135 - Forks: 18

Turnerj/Quickenshtein

Making the quickest and most memory efficient implementation of Levenshtein Distance with SIMD and Threading support

Language: C# - Size: 324 KB - Last synced: 13 days ago - Pushed: 7 months ago - Stars: 273 - Forks: 14

wyndow/fuzzywuzzy

Fuzzy string matching for PHP

Language: PHP - Size: 3.91 KB - Last synced: 14 days ago - Pushed: about 4 years ago - Stars: 70 - Forks: 23

tdebatty/java-string-similarity

Implementation of various string similarity and distance algorithms: Levenshtein, Jaro-Winkler, n-Gram, Q-Gram, Jaccard index, Longest Common Subsequence edit distance, cosine similarity ...

Language: Java - Size: 729 KB - Last synced: 24 days ago - Pushed: almost 2 years ago - Stars: 2,663 - Forks: 399

sumn2u/string-comparisons

A collection of string comparison algorithms

Language: JavaScript - Size: 700 KB - Last synced: about 1 month ago - Pushed: about 1 month ago - Stars: 7 - Forks: 5

technikhil314/offline-diff-viewer

A privacy-focused, easily shareable, open source, tracking-free diff viewer.

Language: Vue - Size: 2.56 MB - Last synced: 18 days ago - Pushed: 19 days ago - Stars: 42 - Forks: 11

agext/levenshtein

Levenshtein distance and similarity metrics with customizable edit costs and Winkler-like bonus for common prefix.

Language: Go - Size: 23.4 KB - Last synced: 19 days ago - Pushed: over 3 years ago - Stars: 85 - Forks: 6

Dynom/TySug

A project to help prevent typos. TySug (Typo Suggestions) suggests alternative words with respect to keyboard layouts

Language: Go - Size: 440 KB - Last synced: 14 days ago - Pushed: about 1 year ago - Stars: 18 - Forks: 3

hyperjumptech/beda

Beda is a Golang library for detecting how similar two strings are

Language: Go - Size: 20.5 KB - Last synced: 29 days ago - Pushed: over 3 years ago - Stars: 50 - Forks: 3

cadmiumcr/cadmium

Natural Language Processing (NLP) library for Crystal

Language: Crystal - Size: 9.24 MB - Last synced: 18 days ago - Pushed: over 2 years ago - Stars: 201 - Forks: 16

Daniel-Liu-c0deb0t/triple_accel

Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

Language: Rust - Size: 182 KB - Last synced: 19 days ago - Pushed: about 1 year ago - Stars: 93 - Forks: 10

dexyk/stringosim

String similarity functions, string distances, Jaccard, Levenshtein, Hamming, Jaro-Winkler, Q-grams, N-grams, LCS - Longest Common Subsequence, Cosine similarity...

Language: Go - Size: 16.6 KB - Last synced: 3 months ago - Pushed: over 6 years ago - Stars: 55 - Forks: 8

xdrop/fuzzywuzzy

Java fuzzy string matching implementation of the well-known Python fuzzywuzzy algorithm. Fuzzy search for Java

Language: Java - Size: 415 KB - Last synced: 4 months ago - Pushed: 10 months ago - Stars: 758 - Forks: 112

dedupeio/affinegap

📐 A Cython implementation of the affine gap string distance

Language: Cython - Size: 71.3 KB - Last synced: 28 days ago - Pushed: over 1 year ago - Stars: 58 - Forks: 9

OlivierBinette/groupbyrule

Deduplicate data using fuzzy and deterministic matching rules.

Language: Python - Size: 11.9 MB - Last synced: 18 days ago - Pushed: about 1 year ago - Stars: 7 - Forks: 0

jens-muenker/fuzzywuzzy-kotlin

Android library for string matching based on the JavaWuzzy Python algorithm. The algorithm uses Levenshtein distance to calculate similarity between strings.

Language: Kotlin - Size: 170 KB - Last synced: 6 months ago - Pushed: 6 months ago - Stars: 0 - Forks: 0

ywu94/python-text-distance

A Python implementation of a variety of text/string distance and similarity metrics. No GPL!

Language: Python - Size: 62.5 KB - Last synced: 25 days ago - Pushed: about 4 years ago - Stars: 7 - Forks: 2

simonschoelly/InformationDistances.jl

A small Julia library for calculating the normalized compression distance.

Language: Julia - Size: 268 KB - Last synced: 11 days ago - Pushed: almost 3 years ago - Stars: 4 - Forks: 0

anirbanmu/str_metrics

Ruby gem (native extension in Rust) providing implementations of various string metrics

Language: Ruby - Size: 94.7 KB - Last synced: 24 days ago - Pushed: about 2 years ago - Stars: 79 - Forks: 2

jancajthaml-go/levenstein

levenstein string distance

Language: Go - Size: 17.6 KB - Last synced: 10 months ago - Pushed: about 3 years ago - Stars: 1 - Forks: 0

sarveshj/Longest-Subsequence-Alphabetical-Order

Program that prints the longest substring in which the letters occur in alphabetical order

Language: Python - Size: 2.93 KB - Last synced: 10 months ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

bitfoundation/Simila Fork of mehrandvd/Simila

A project for string similarities.

Language: C# - Size: 2.2 MB - Last synced: 18 days ago - Pushed: over 3 years ago - Stars: 9 - Forks: 1

mehrandvd/Simila

A project for string similarities.

Language: C# - Size: 2.14 MB - Last synced: 10 months ago - Pushed: almost 5 years ago - Stars: 10 - Forks: 4

iesl/stance

Learned string similarity for entity names using optimal transport.

Language: Python - Size: 71.3 KB - Last synced: 10 months ago - Pushed: over 3 years ago - Stars: 31 - Forks: 3

Elara6331/pak

This repository is a mirror. Do not post issues or PRs here.

Language: Go - Size: 42 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 1 - Forks: 0

TeodorDyakov/wildcard-trie

String trie that supports wildcard search

Language: Java - Size: 14.6 KB - Last synced: about 1 year ago - Pushed: about 3 years ago - Stars: 5 - Forks: 7

jhermsmeier/node-sift-distance

SIFT distance algorithm

Language: JavaScript - Size: 160 KB - Last synced: 21 days ago - Pushed: over 9 years ago - Stars: 7 - Forks: 1

ecomp-shONgit/string-distance

A set of (string) distance functions written in JavaScript / Python / PHP.

Language: PHP - Size: 95.7 KB - Last synced: 11 months ago - Pushed: over 1 year ago - Stars: 17 - Forks: 3

lagodiuk/levenshtein-top-k

Algorithm for the derivation of the top-K string alignments, based on the Levenshtein distance.

Language: Java - Size: 1.09 MB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 4 - Forks: 0

sfischer13/python-stringmetric 📦

🐍 Python implementations of common string distance and similarity algorithms

Language: Python - Size: 7.81 KB - Last synced: 17 days ago - Pushed: almost 7 years ago - Stars: 1 - Forks: 2

ZippeyKeys12/pymatching

String comparisons

Language: Python - Size: 62.5 KB - Last synced: about 1 year ago - Pushed: almost 4 years ago - Stars: 1 - Forks: 0

Adamishere/Fuzzymatching

Matching records based on imperfect strings using string distances to assign the closest match. Optimized for large files on a single computer.

Language: R - Size: 7.81 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 6 - Forks: 2

henrik9999/string-similarity

Finds degree of similarity between two strings, based on Dice's Coefficient, which is mostly better than Levenshtein distance.

Language: PHP - Size: 11.7 KB - Last synced: 25 days ago - Pushed: 6 months ago - Stars: 1 - Forks: 1

sp1ff/damerau-levenshtein

Comparison of a few algorithms for computing Damerau–Levenshtein distance

Language: Shell - Size: 154 KB - Last synced: about 1 year ago - Pushed: almost 3 years ago - Stars: 3 - Forks: 1

subpath/pystrsim

Python wrapper for Rust's strsim library

Language: Python - Size: 17.6 KB - Last synced: about 1 year ago - Pushed: almost 2 years ago - Stars: 1 - Forks: 0

seehuhn/go-levenshtein

Compute the Levenshtein distance between two strings in Go

Language: Go - Size: 14.6 KB - Last synced: 9 months ago - Pushed: over 1 year ago - Stars: 0 - Forks: 0

dedupeio/pyhacrf Fork of dirko/pyhacrf

📐 Hidden alignment conditional random field for classifying string pairs.

Language: Python - Size: 406 KB - Last synced: 3 days ago - Pushed: 3 months ago - Stars: 24 - Forks: 12

dedupeio/highered

CRF Edit Distance

Language: Python - Size: 9.77 KB - Last synced: about 1 month ago - Pushed: about 4 years ago - Stars: 6 - Forks: 4

ac000/libac

A C library of miscellaneous utility functions

Language: C - Size: 321 KB - Last synced: 8 months ago - Pushed: 8 months ago - Stars: 1 - Forks: 2

Alex-Werner/khal

Utils for Node projects

Language: JavaScript - Size: 43.9 KB - Last synced: 24 days ago - Pushed: about 7 years ago - Stars: 5 - Forks: 1

calebwin/quill

A high-level API for computing edit distance

Language: Java - Size: 9.77 KB - Last synced: 11 days ago - Pushed: over 5 years ago - Stars: 1 - Forks: 1

ailgroup/string_dist

String distances in Rust

Language: Rust - Size: 33.2 KB - Last synced: 11 months ago - Pushed: almost 5 years ago - Stars: 1 - Forks: 2

agricolamz/2017_ANDAN_course

Course for ANDAN Summer School about strings and texts in R

Size: 47.3 MB - Last synced: about 1 year ago - Pushed: about 6 years ago - Stars: 2 - Forks: 0

lqdc/pysimstr

Fast(ish) string similarity for one vs many comparisons.

Language: Python - Size: 8.79 KB - Last synced: 16 days ago - Pushed: almost 8 years ago - Stars: 2 - Forks: 1

johnny-morrice/anagram

Fun anagram ranking tool for Golang

Language: Go - Size: 16.6 KB - Last synced: 9 months ago - Pushed: about 7 years ago - Stars: 2 - Forks: 0

mtingers/hashfuzz

Detects similarities between strings & generates similarity hash

Language: Python - Size: 16.6 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

paulirwin/StringSimilarity.NET Fork of feature23/StringSimilarity.NET

Language: C# - Size: 482 KB - Last synced: about 1 year ago - Pushed: over 5 years ago - Stars: 0 - Forks: 0

stuartcraig/lev

A Mata-based Stata command to calculate Levenshtein edit distance.

Language: Stata - Size: 1.95 KB - Last synced: about 1 year ago - Pushed: over 6 years ago - Stars: 0 - Forks: 0

xiy/distance

A go-kit based microservice for the Levenshtein distance algorithm

Language: Lua - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: about 8 years ago - Stars: 1 - Forks: 0

ArveH/StringStuff

C# version of calculating the Damerau Levenshtein Distance. Code is based on the Wikipedia article.

Language: C# - Size: 12.7 KB - Last synced: about 1 year ago - Pushed: about 7 years ago - Stars: 0 - Forks: 0

elazzabi/fuzzy-string-matching

Get the degree of resemblance between two strings

Language: JavaScript - Size: 3.91 KB - Last synced: about 1 year ago - Pushed: about 8 years ago - Stars: 0 - Forks: 0

jhermsmeier/sift.c

SIFT string distance algorithm

Language: C - Size: 105 KB - Last synced: about 1 year ago - Pushed: almost 10 years ago - Stars: 0 - Forks: 0