An open API service providing repository metadata for many open source software ecosystems.

Topic: "deduplicate-data"

moj-analytical-services/splink

Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends

Language: Python - Size: 102 MB - Last synced at: 3 days ago - Pushed at: 3 days ago - Stars: 1,618 - Forks: 182

bmiller1009/deduper

General deduping engine for JDBC sources with output to JDBC/CSV targets

Language: Kotlin - Size: 1.23 MB - Last synced at: 7 months ago - Pushed at: over 4 years ago - Stars: 5 - Forks: 0

gochore/uniq

Sort and deduplicate data.

Language: Go - Size: 27.3 KB - Last synced at: 9 months ago - Pushed at: over 4 years ago - Stars: 0 - Forks: 0