https://github.com/gaglia88/ruler
Scalable record-level matching rules
distributed-computing entity-matching entity-resolution similarity-join
- Host: GitHub
- URL: https://github.com/gaglia88/ruler
- Owner: Gaglia88
- License: mit
- Created: 2020-01-10T09:59:50.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-03-26T09:36:48.000Z (about 5 years ago)
- Last Synced: 2025-03-22T17:11:20.279Z (about 1 month ago)
- Topics: distributed-computing, entity-matching, entity-resolution, similarity-join
- Language: Scala
- Homepage:
- Size: 2.44 MB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# RulER
RulER is a tool for Apache Spark that uses a novel technique to find similar records by applying complex joining rules on one or more attributes.

---
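
To give a rough idea of what a record-level matching rule over more than one attribute can look like, here is a minimal sketch in plain Scala. It is not RulER's API and does not reflect how RulER evaluates rules on Spark: the `Record` type, the attribute names, the choice of Jaccard similarity, and the thresholds are all assumptions made for illustration, and the comparison is a naive all-pairs loop.

```scala
// Illustrative only: plain Scala, NOT RulER's API and NOT its algorithm.
object MatchingRuleSketch {
  // Hypothetical record type with two attributes.
  final case class Record(id: Int, title: String, authors: String)

  // Lowercased, whitespace-separated token set of a string.
  def tokens(s: String): Set[String] =
    s.toLowerCase.split("\\s+").filter(_.nonEmpty).toSet

  // Jaccard similarity between the token sets of two strings.
  def jaccard(a: String, b: String): Double = {
    val (x, y) = (tokens(a), tokens(b))
    if (x.isEmpty && y.isEmpty) 1.0
    else (x intersect y).size.toDouble / (x union y).size
  }

  // Hypothetical rule: titles AND authors must both be similar enough.
  // The thresholds (0.6 and 0.5) are arbitrary example values.
  def matches(r1: Record, r2: Record): Boolean =
    jaccard(r1.title, r2.title) >= 0.6 &&
      jaccard(r1.authors, r2.authors) >= 0.5

  // Naive all-pairs comparison; quadratic, so only suitable for tiny inputs.
  def matchingPairs(records: Seq[Record]): Seq[(Int, Int)] =
    for {
      (r1, i) <- records.zipWithIndex
      r2 <- records.drop(i + 1)
      if matches(r1, r2)
    } yield (r1.id, r2.id)
}
```

Calling `MatchingRuleSketch.matchingPairs` on a small `Seq[Record]` returns the id pairs that satisfy the rule. Avoiding this kind of exhaustive comparison on large datasets is exactly what a scalable tool is for.

---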
If you use this library, please cite:
- **Gagliardelli, L., Simonini, G., & Bergamaschi, S. (2020). RulER: Scaling Up Record-level Matching Rules. In EDBT 2020: 23rd International Conference on Extending Database Technology.**
---
A brief presentation about RulER is available [on YouTube](http://www.youtube.com/watch?v=ZuIre-WO3lY).

### Contacts
For any questions about RulER, write to us at [email protected]
* Luca Gagliardelli
* Giovanni Simonini