GitHub topics: entity-matching
AI-team-UoA/pyJedAI
An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows.
Language: Python - Size: 139 MB - Last synced at: about 18 hours ago - Pushed at: about 19 hours ago - Stars: 76 - Forks: 12

deweylab/MetaSRA-pipeline
MetaSRA: normalized sample-specific metadata for the Sequence Read Archive
Language: Python - Size: 27.4 MB - Last synced at: about 19 hours ago - Pushed at: about 20 hours ago - Stars: 43 - Forks: 14

Senzing/awesome
Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.
Language: Python - Size: 244 KB - Last synced at: 6 days ago - Pushed at: 7 days ago - Stars: 57 - Forks: 2

tshu-w/EMBer
Code and data for the paper "Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction" (IJCAI 2022)
Language: Python - Size: 29.8 MB - Last synced at: 4 days ago - Pushed at: 11 days ago - Stars: 5 - Forks: 2

dell-research-harvard/linktransformer
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
Language: Python - Size: 1.81 MB - Last synced at: 10 days ago - Pushed at: 19 days ago - Stars: 118 - Forks: 10

beyond-tabs/prolog-matcher
A Prolog-based service for named entity matching and ranking, powered by SWI-Prolog and MySQL ODBC integration
Language: Prolog - Size: 8.79 KB - Last synced at: 21 days ago - Pushed at: 21 days ago - Stars: 0 - Forks: 0

ing-bank/EntityMatchingModel
Entity Matching Model solves the problem of matching company names between two possibly very large datasets.
Language: Python - Size: 290 KB - Last synced at: 18 days ago - Pushed at: about 2 months ago - Stars: 69 - Forks: 8

Gaglia88/gsm_repro
Reproducibility experiments for Generalized Supervised Meta-blocking
Language: Python - Size: 60.7 MB - Last synced at: 10 days ago - Pushed at: 25 days ago - Stars: 0 - Forks: 0

llm-db/llm-enhanced-entity-matching-comparative-analysis-of-traditional-and-modern-techniques
LLM-Enhanced Entity Matching: Comparative Analysis of traditional and modern techniques (Master Thesis, ETH Zürich, 2025)
Size: 1000 Bytes - Last synced at: 19 days ago - Pushed at: 27 days ago - Stars: 0 - Forks: 0

abcsys/libem
Compound AI toolchain for fast and accurate entity matching, powered by LLMs.
Language: Python - Size: 3.54 MB - Last synced at: 4 days ago - Pushed at: 29 days ago - Stars: 22 - Forks: 4

scify/JedAIToolkit
An open-source, highly scalable toolkit in Java for Entity Resolution.
Language: Java - Size: 278 MB - Last synced at: 17 days ago - Pushed at: about 1 year ago - Stars: 218 - Forks: 47

pi-kappa-devel/py-neer-match
NEural-symbolic Entity Reasoning and Matching in Python
Language: Python - Size: 1.1 MB - Last synced at: 10 days ago - Pushed at: about 1 month ago - Stars: 0 - Forks: 2

vintasoftware/entity-embed
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
Language: Jupyter Notebook - Size: 11.4 MB - Last synced at: 14 days ago - Pushed at: over 2 years ago - Stars: 151 - Forks: 16

tshu-w/ComEM
Code for the paper "Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching" (COLING 2025)
Language: Python - Size: 158 KB - Last synced at: 4 days ago - Pushed at: 3 months ago - Stars: 11 - Forks: 2

magantoine/JobSkape
JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching
Language: Python - Size: 51.4 MB - Last synced at: 12 days ago - Pushed at: 5 months ago - Stars: 8 - Forks: 0

data61/clkhash
CLK hash: hash PII for entity matching
Language: Python - Size: 3.49 MB - Last synced at: 2 days ago - Pushed at: about 1 month ago - Stars: 47 - Forks: 9

Evnsn/awsome-entity-resolution
A collection of awesome resources regarding Record Linkage.
Size: 13.7 KB - Last synced at: 12 days ago - Pushed at: 8 months ago - Stars: 7 - Forks: 0

INCATools/neoplasmer
Neoplasm Entity Recognition: matching disease names to ontology classes
Language: Prolog - Size: 88.9 KB - Last synced at: about 1 month ago - Pushed at: almost 6 years ago - Stars: 5 - Forks: 0

aryanGupta-09/PaperTrail
An integrated research and patent information system.
Language: JavaScript - Size: 1.02 MB - Last synced at: 27 days ago - Pushed at: 3 months ago - Stars: 0 - Forks: 0

zentity-io/zentity
Entity resolution for Elasticsearch.
Language: Java - Size: 634 KB - Last synced at: 3 months ago - Pushed at: 3 months ago - Stars: 158 - Forks: 29

tteofili/certa
CERTA - Computing Entity Resolution explanations with TriAngles
Language: Python - Size: 26.8 MB - Last synced at: 4 days ago - Pushed at: 4 months ago - Stars: 5 - Forks: 3

rutgers-db/EntityBlockingBySimilarityJoins
An end-to-end entity matching system
Language: C++ - Size: 42.1 MB - Last synced at: 4 months ago - Pushed at: 4 months ago - Stars: 4 - Forks: 1

ivan-bilan/Entity-Matching-Tutorial
Language: Jupyter Notebook - Size: 45.9 KB - Last synced at: 2 months ago - Pushed at: about 4 years ago - Stars: 2 - Forks: 0

abcsys/libem-sample-data
Libem sample datasets.
Language: Python - Size: 17.2 MB - Last synced at: 7 months ago - Pushed at: 9 months ago - Stars: 3 - Forks: 1

indisalsa/FairnessAIArticles
Performance Analysis of Entity Matching With Fuzzy Wuzzy on Fairness AI Articles
Language: Jupyter Notebook - Size: 708 KB - Last synced at: 3 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

indisalsa/FairnessInAIArticles
Research Article Matching Methods using Attention, Hybrid, RNN, and SIF
Language: Jupyter Notebook - Size: 150 KB - Last synced at: 9 months ago - Pushed at: 9 months ago - Stars: 0 - Forks: 0

abcsys/libem-notebook
Libem notebooks.
Language: Jupyter Notebook - Size: 2.32 MB - Last synced at: 7 months ago - Pushed at: 10 months ago - Stars: 1 - Forks: 0

boscoj2008/AdapterEM
AdapterEM: Pre-trained Language Model Adaptation for Generalized Entity Matching using Adapter-tuning
Language: Python - Size: 163 MB - Last synced at: 11 months ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 0

ZJU-DAILY/PromptEM
Code for the paper "PromptEM: Prompt-tuning for Low-resource Generalized Entity Matching". VLDB 2023.
Language: Python - Size: 35.6 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 12 - Forks: 3

ZJU-DAILY/CollaborEM
Code for the paper "CollaborEM: A Self-supervised Entity Matching Framework Using Multi-features Collaboration". TKDE 2021.
Language: Python - Size: 5.78 MB - Last synced at: 11 months ago - Pushed at: almost 3 years ago - Stars: 21 - Forks: 3

softlab-unimore/landmark
Entity Matching specific Explanation tool. Landmark generates reliable and coherent explanations through a perturbation analysis.
Language: Jupyter Notebook - Size: 34.1 MB - Last synced at: about 1 year ago - Pushed at: almost 2 years ago - Stars: 0 - Forks: 1

catalyst-cooperative/ccai-entity-matching 📦
An exploration of generalizable approaches to unsupervised entity matching for use in linking tabular public energy data sources.
Language: Jupyter Notebook - Size: 12.2 MB - Last synced at: 12 months ago - Pushed at: about 1 year ago - Stars: 1 - Forks: 1

phymbert/spark-search 📦
Spark Search - high performance advanced search features based on Apache Lucene
Language: Scala - Size: 785 KB - Last synced at: 5 months ago - Pushed at: over 2 years ago - Stars: 23 - Forks: 2

VarunCode/Data_Science
4-stage data science project
Language: Jupyter Notebook - Size: 3.96 MB - Last synced at: about 1 year ago - Pushed at: almost 7 years ago - Stars: 0 - Forks: 0

cyn0/book_worms
Entity-Match books from goodreads.com and bookdepository.com
Language: Python - Size: 544 KB - Last synced at: about 1 year ago - Pushed at: about 7 years ago - Stars: 0 - Forks: 0

chakshuahuja/CS839
Submissions for Data Science: Principles, Algorithms, and Applications (CS839) @ UW-Madison
Language: Jupyter Notebook - Size: 5.31 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

swatianand56/CS839
Submission Repository for Data Science Class Project
Language: Jupyter Notebook - Size: 5.19 MB - Last synced at: about 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

enricopal/STEM
Language: Java - Size: 79.7 MB - Last synced at: about 1 year ago - Pushed at: over 7 years ago - Stars: 7 - Forks: 3

tteofili/er-utils
Utilities for working with Entity Resolution models
Language: Python - Size: 35.2 KB - Last synced at: 25 days ago - Pushed at: about 3 years ago - Stars: 2 - Forks: 0

kylegilde/Entity-Matching-in-Online-Retail
Master's Degree Final Project using Python & NLP
Language: Jupyter Notebook - Size: 24.3 MB - Last synced at: over 1 year ago - Pushed at: almost 6 years ago - Stars: 0 - Forks: 0

achen353/dacon
Data Augmentation for Entity Matching using Consistency Learning
Language: Jupyter Notebook - Size: 105 MB - Last synced at: over 1 year ago - Pushed at: over 3 years ago - Stars: 0 - Forks: 1

marcderbauer/entity_matching
Entity matching ensemble algorithm
Language: Python - Size: 24.4 KB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 0 - Forks: 0

UIC-InDeXLab/fair_entity_matching
Fair Entity Matching: A Fairness Suite for Auditing Entity Matching Approaches
Language: Python - Size: 9.09 MB - Last synced at: over 1 year ago - Pushed at: over 1 year ago - Stars: 1 - Forks: 0

okgreece/Alignment
Alignment, a collaborative, system-aided, user-driven ontology/vocabulary matching and validation platform.
Language: JavaScript - Size: 41.2 MB - Last synced at: about 1 year ago - Pushed at: about 3 years ago - Stars: 12 - Forks: 1

diaspora-orm/diaspora 📦
☕ Multi-source ORM for JavaScript Client+Server
Language: TypeScript - Size: 26 MB - Last synced at: 7 days ago - Pushed at: about 3 years ago - Stars: 3 - Forks: 2

wbsg-uni-mannheim-students/cross-lingual-product-matching
Language: Python - Size: 76.2 KB - Last synced at: over 1 year ago - Pushed at: almost 4 years ago - Stars: 3 - Forks: 2

Gaglia88/ruler
Scalable record-level matching rules
Language: Scala - Size: 2.44 MB - Last synced at: 16 days ago - Pushed at: about 5 years ago - Stars: 6 - Forks: 0

ArjitJ/DIAL
Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations"
Language: Python - Size: 4.88 MB - Last synced at: almost 2 years ago - Pushed at: over 3 years ago - Stars: 13 - Forks: 2

wbsg-uni-mannheim/winter (fork of olehmberg/winter)
WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and result evaluation.
Language: Java - Size: 18.6 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 4 - Forks: 1

Nikoletos-K/WinnER
A Winner-Take-All Hashing-Based Unsupervised Model for Entity Resolution Problems. [B. Sc. Thesis]
Language: Jupyter Notebook - Size: 154 MB - Last synced at: about 2 years ago - Pushed at: over 2 years ago - Stars: 3 - Forks: 0

sensasi-delight/deep-learning-using-deepmatcher
This repository is a supplementary resource for the research article "Deep Learning Untuk Entity Matching Produk Kamera Antar Online Store Menggunakan DeepMatcher"
Language: Jupyter Notebook - Size: 202 KB - Last synced at: about 2 years ago - Pushed at: almost 3 years ago - Stars: 0 - Forks: 0

delftdata/repro-di
Language: Jupyter Notebook - Size: 2.37 MB - Last synced at: 11 months ago - Pushed at: over 2 years ago - Stars: 0 - Forks: 0

chinkitp/entity-matching-dataset
Language: Jupyter Notebook - Size: 103 MB - Last synced at: about 2 years ago - Pushed at: about 5 years ago - Stars: 0 - Forks: 0

escanillans/entity_matching
Entity matching on album data across two tables extracted from metacritic.com and Wikipedia.
Language: Jupyter Notebook - Size: 429 KB - Last synced at: 4 months ago - Pushed at: about 7 years ago - Stars: 1 - Forks: 0

YuanTingHsieh/CS-839-Data-Science (fork of WenFuLee/CS-839-Data-Science)
Course project for CS839 Spring18 at UW-Madison
Language: Jupyter Notebook - Size: 3.9 MB - Last synced at: about 2 years ago - Pushed at: over 6 years ago - Stars: 0 - Forks: 0
