An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: entity-matching

AI-team-UoA/pyJedAI

An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.

Language: Python - Size: 139 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 76 - Forks: 12

deweylab/MetaSRA-pipeline

MetaSRA: normalized sample-specific metadata for the Sequence Read Archive

Language: Python - Size: 27.4 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 43 - Forks: 14

Senzing/awesome

Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.

Language: Python - Size: 244 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 57 - Forks: 2

tshu-w/EMBer

Code and data for the paper "Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction" (IJCAI 2022)

Language: Python - Size: 29.8 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 2

dell-research-harvard/linktransformer

A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

Language: Python - Size: 1.81 MB - Last synced at: 10 days ago - Pushed at: 19 days ago - Stars: 118 - Forks: 10

beyond-tabs/prolog-matcher

A Prolog-based service for named entity matching and ranking, powered by SWI-Prolog and MySQL ODBC integration

Language: Prolog - Size: 8.79 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

ing-bank/EntityMatchingModel

Entity Matching Model solves the problem of matching company names between two possibly very large datasets.

Language: Python - Size: 290 KB - Last synced at: 18 days ago - Pushed at: about 2 months ago - Stars: 69 - Forks: 8

Gaglia88/gsm_repro

Reproducibility experiments for Generalized Supervised Meta-blocking

Language: Python - Size: 60.7 MB - Last synced at: 10 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

llm-db/llm-enhanced-entity-matching-comparative-analysis-of-traditional-and-modern-techniques

LLM-Enhanced Entity Matching: Comparative Analysis of traditional and modern techniques (Master Thesis, ETH Zürich, 2025)

Size: 1000 Bytes - Last synced at: 19 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

abcsys/libem

Compound AI toolchain for fast and accurate entity matching, powered by LLMs.

Language: Python - Size: 3.54 MB - Last synced at: 4 days ago - Pushed at: 29 days ago - Stars: 22 - Forks: 4

scify/JedAIToolkit

An open source, high scalability toolkit in Java for Entity Resolution.

Language: Java - Size: 278 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 218 - Forks: 47

pi-kappa-devel/py-neer-match

NEural-symbolic Entity Reasoning and Matching in Python

Language: Python - Size: 1.1 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 2

vintasoftware/entity-embed

PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.

Language: Jupyter Notebook - Size: 11.4 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 151 - Forks: 16

tshu-w/ComEM

Code for the paper "Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching" (COLING 2025)

Language: Python - Size: 158 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 11 - Forks: 2

magantoine/JobSkape

JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching

Language: Python - Size: 51.4 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 8 - Forks: 0

data61/clkhash

CLK hash: hash PII for entity matching

Language: Python - Size: 3.49 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 47 - Forks: 9

Evnsn/awsome-entity-resolution

A collection of awesome resources regarding Record Linkage.

Size: 13.7 KB - Last synced at: 12 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 0

INCATools/neoplasmer

Neoplasm Entity Recognition: matching disease names to ontology classes

Language: Prolog - Size: 88.9 KB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 0

aryanGupta-09/PaperTrail

An integrated research and patent information system.

Language: JavaScript - Size: 1.02 MB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

zentity-io/zentity

Entity resolution for Elasticsearch.

Language: Java - Size: 634 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 158 - Forks: 29

tteofili/certa

CERTA - Computing Entity Resolution explanations with TriAngles

Language: Python - Size: 26.8 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 3

rutgers-db/EntityBlockingBySimilarityJoins

An end-to-end entity matching system

Language: C++ - Size: 42.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

ivan-bilan/Entity-Matching-Tutorial

Language: Jupyter Notebook - Size: 45.9 KB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

abcsys/libem-sample-data

Libem sample datasets.

Language: Python - Size: 17.2 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

indisalsa/FairnessAIArticles

Performance Analysis of Entity Matching With Fuzzy Wuzzy on Fairness AI Articles

Language: Jupyter Notebook - Size: 708 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

indisalsa/FairnessInAIArticles

Research Article Matching Methods using Attention, Hybrid, RNN, and SIF

Language: Jupyter Notebook - Size: 150 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

abcsys/libem-notebook

Libem notebooks.

Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: 7 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

boscoj2008/AdapterEM

AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning

Language: Python - Size: 163 MB - Last synced at: 11 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ZJU-DAILY/PromptEM

Code for the paper "PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching". VLDB 2023.

Language: Python - Size: 35.6 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 3

ZJU-DAILY/CollaborEM

Code for the paper "CollaborEM: A Self-supervised Entity Matching Framework Using Multi-features Collaboration". TKDE 2021.

Language: Python - Size: 5.78 MB - Last synced at: 11 months ago - Pushed at: almost 3 years ago - Stars: 21 - Forks: 3

softlab-unimore/landmark

Entity Matching specific Explanation tool. Landmark generates reliable and coherent explanations through a perturbation analysis.

Language: Jupyter Notebook - Size: 34.1 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

catalyst-cooperative/ccai-entity-matching 📦

An exploration of generalizable approaches to unsupervised entity matching for use in linking tabular public energy data sources.

Language: Jupyter Notebook - Size: 12.2 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

phymbert/spark-search 📦

Spark Search - high performance advanced search features based on Apache Lucene

Language: Scala - Size: 785 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 2

VarunCode/Data_Science

4 stage data science project

Language: Jupyter Notebook - Size: 3.96 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

cyn0/book_worms

Entity-Match books from goodreads.com and bookdepository.com

Language: Python - Size: 544 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

chakshuahuja/CS839

Submissions for Data Science: Principles, Algorithms, and Applications (CS839) @ UW-Madison

Language: Jupyter Notebook - Size: 5.31 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

swatianand56/CS839

Submission Repository for Data Science Class Project

Language: Jupyter Notebook - Size: 5.19 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

enricopal/STEM

Language: Java - Size: 79.7 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 3

tteofili/er-utils

utilities for working with Entity Resolution models

Language: Python - Size: 35.2 KB - Last synced at: 25 days ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

kylegilde/Entity-Matching-in-Online-Retail

Master's Degree Final Project using Python & NLP

Language: Jupyter Notebook - Size: 24.3 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

achen353/dacon

Data Augmentation for Entity Matching using Consistency Learning

Language: Jupyter Notebook - Size: 105 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

marcderbauer/entity_matching

Entity matching ensemble algorithm

Language: Python - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

UIC-InDeXLab/fair_entity_matching

Fair Entity Matching: A Fairness Suite for Auditing Entity Matching Approaches

Language: Python - Size: 9.09 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

okgreece/Alignment

Alignment, a collaborative, system-aided, user-driven ontology/vocabulary matching and validation platform.

Language: JavaScript - Size: 41.2 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 12 - Forks: 1

diaspora-orm/diaspora 📦

☕ Multi-source ORM for JavaScript Client+Server

Language: TypeScript - Size: 26 MB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 2

wbsg-uni-mannheim-students/cross-lingual-product-matching

Language: Python - Size: 76.2 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 2

Gaglia88/ruler

Scalable record-level matching rules

Language: Scala - Size: 2.44 MB - Last synced at: 16 days ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

ArjitJ/DIAL

Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations"

Language: Python - Size: 4.88 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 2

wbsg-uni-mannheim/winter Fork of olehmberg/winter

WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.

Language: Java - Size: 18.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

Nikoletos-K/WinnER

A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems. [B. Sc. Thesis]

Language: Jupyter Notebook - Size: 154 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

sensasi-delight/deep-learning-using-deepmatcher

This repository is a supplementary resource for a research article entitled "Deep Learning Untuk Entity Matching Produk Kamera Antar Online Store Menggunakan DeepMatcher"

Language: Jupyter Notebook - Size: 202 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

delftdata/repro-di

Language: Jupyter Notebook - Size: 2.37 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

chinkitp/entity-matching-dataset

Language: Jupyter Notebook - Size: 103 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

escanillans/entity_matching

Performed entity matching on album music data across two different (extracted) tables from metacritic.com and Wikipedia.

Language: Jupyter Notebook - Size: 429 KB - Last synced at: 4 months ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

YuanTingHsieh/CS-839-Data-Science Fork of WenFuLee/CS-839-Data-Science

Course project for CS839 Spring18 at UW-Madison

Language: Jupyter Notebook - Size: 3.9 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0