Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/moj-analytical-services/splink
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/moj-analytical-services/splink
- Owner: moj-analytical-services
- License: MIT
- Created: 2019-11-22T14:27:33.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2024-04-12T11:04:53.000Z (5 months ago)
- Last Synced: 2024-04-14T10:19:08.688Z (5 months ago)
- Topics: data-matching, data-science, deduplicate-data, deduplication, duckdb, em-algorithm, entity-resolution, fuzzy-matching, record-linkage, spark, uk-gov-data-science
- Language: Python
- Homepage: https://moj-analytical-services.github.io/splink/
- Size: 89.1 MB
- Stars: 1,072
- Watchers: 16
- Forks: 126
- Open Issues: 202
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE