An open API service providing repository metadata for many open source software ecosystems.

GitHub topics: deduplicate-data

moj-analytical-services/splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Language: Python - Size: 98.3 MB - Last synced at: 5 days ago - Pushed at: 6 days ago - Stars: 1,563 - Forks: 172

gochore/uniq

Sort and deduplicate data.

Language: Go - Size: 27.3 KB - Last synced at: 8 months ago - Pushed at: about 4 years ago - Stars: 0 - Forks: 0

bmiller1009/deduper

General deduping engine for JDBC sources with output to JDBC/CSV targets

Language: Kotlin - Size: 1.23 MB - Last synced at: 5 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0