https://github.com/ihmeuw/easylink
A tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.
data-science entity-resolution record-linkage
- Host: GitHub
- URL: https://github.com/ihmeuw/easylink
- Owner: ihmeuw
- License: bsd-3-clause
- Created: 2023-08-02T18:00:54.000Z
- Default Branch: main
- Last Pushed: 2026-03-25T23:39:39.000Z
- Last Synced: 2026-03-26T21:53:05.522Z
- Topics: data-science, entity-resolution, record-linkage
- Language: Python
- Homepage: https://easylink.readthedocs.io/en/latest/
- Size: 5.16 MB
- Stars: 11
- Watchers: 8
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.rst
- Contributing: CONTRIBUTING.rst
- License: LICENSE
- Codeowners: .github/CODEOWNERS
README
========
EasyLink
========

.. _intro:

EasyLink is a tool that allows users to build and run highly configurable record linkage/entity resolution pipelines.
Its configurability enables users to "mix and match" different pieces of record
linkage software by ensuring that each piece of the pipeline conforms to standard patterns.
For example, users at the Census Bureau could easily evaluate whether using a more sophisticated "blocking"
method would improve results in a certain pipeline, without having to rewrite the entire pipeline.

In its current state, EasyLink provides only one or two implementations for each step, does not yet have documentation
to support users in creating their own implementations, and is not yet stable enough to be recommended as a tool for production pipelines.

.. _end_intro:

.. _python_support:

**Supported Python versions: 3.11, 3.12**

.. _end_python_support:

Installation
============

.. _installation:

**NOTE: This package requires AMD64 CPU architecture - it is not compatible with
Apple's ARM64 architecture (e.g. M1 and newer Macs).**
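
The two constraints stated above (supported Python versions and CPU architecture) can be checked before installing anything. The following is a minimal sketch using only the Python standard library. It is not part of EasyLink: the helper names ``python_is_supported`` and ``cpu_is_amd64`` are hypothetical, and the set of accepted architecture strings is an assumption about what ``platform.machine()`` commonly reports on 64-bit x86 systems.

```python
import platform
import sys
from typing import Optional

# Values taken from the README: supported interpreters and required CPU family.
SUPPORTED_PYTHON = {(3, 11), (3, 12)}
# Assumption: common spellings of 64-bit x86 reported by different operating systems.
AMD64_NAMES = {"x86_64", "amd64"}


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the major.minor version is one the README lists."""
    return (version_info[0], version_info[1]) in SUPPORTED_PYTHON


def cpu_is_amd64(machine: Optional[str] = None) -> bool:
    """Return True if the machine string names an AMD64 (x86-64) CPU."""
    if machine is None:
        machine = platform.machine()
    return machine.lower() in AMD64_NAMES


if __name__ == "__main__":
    major, minor = sys.version_info[0], sys.version_info[1]
    print(f"Python {major}.{minor} supported: {python_is_supported()}")
    print(f"CPU {platform.machine()!r} is AMD64: {cpu_is_amd64()}")
```

Run the script with the interpreter you plan to use for EasyLink. If either check reports ``False``, the installation steps that follow are unlikely to succeed on that machine.
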

There are a few things to install in order to use this package:

- Set up Linux.

  Singularity (and thus EasyLink) requires Linux to run. If you are not already
  using Linux, you will need to set up a virtual machine; refer to the
  Singularity documentation for installing on Windows or Mac.

- Install Singularity.

  First check if you already have Singularity installed by running the command
  ``singularity --version``. For an existing installation, your Singularity version
  number is printed.

  If Singularity is not yet installed, you will need to install it;
  refer to the Singularity docs for installing on Linux.
  Note that this requires administrator privileges; you may need to request installation
  from your system admin if you are working in a shared computing environment.

- Install conda.

  We recommend miniforge. You can
  check if you already have conda installed by running the command ``conda --version``.
  For an existing installation, a version will be displayed.

- Create a conda environment with python and graphviz installed.

  ::

    $ conda create --name easylink -c conda-forge python=3.12 graphviz 'gcc<14' -y
    $ conda activate easylink

- Install easylink in the environment.

  Option 1 - Install from PyPI with pip::

    $ pip install easylink

  Option 2 - Build from source with pip::

    $ pip install git+https://github.com/ihmeuw/easylink.git
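
Once the steps above are complete, a short script can confirm that the pieces are in place. This is a sketch under stated assumptions, not an official EasyLink utility: ``missing_commands`` and ``installed_version`` are hypothetical helper names, and the script only checks that the ``singularity`` and ``conda`` commands are on ``PATH`` and that a distribution named ``easylink`` is installed in the active environment.

```python
import shutil
from importlib import metadata

# Commands the installation steps above ask you to verify with --version.
REQUIRED_COMMANDS = ("singularity", "conda")


def missing_commands(commands=REQUIRED_COMMANDS, which=shutil.which):
    """Return the commands from `commands` that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


def installed_version(distribution="easylink"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    missing = missing_commands()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required commands found on PATH.")

    version = installed_version()
    if version is None:
        print("easylink is not installed in this environment.")
    else:
        print(f"easylink {version} is installed.")
```

Run it inside the activated ``easylink`` conda environment so that the package lookup inspects the right site-packages directory. The ``which`` parameter exists only so the lookup can be replaced in tests.
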

.. _end_installation:

Documentation
=============

You can view documentation at https://easylink.readthedocs.io/en/latest/