{"id":18412371,"url":"https://github.com/gaglia88/ruler","last_synced_at":"2025-04-07T11:31:58.236Z","repository":{"id":81519504,"uuid":"233018313","full_name":"Gaglia88/ruler","owner":"Gaglia88","description":"Scalable record-level matching rules","archived":false,"fork":false,"pushed_at":"2020-03-26T09:36:48.000Z","size":2557,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-22T17:11:20.279Z","etag":null,"topics":["distributed-computing","entity-matching","entity-resolution","similarity-join"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Gaglia88.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-01-10T09:59:50.000Z","updated_at":"2023-02-28T11:25:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"9b5ab10a-ebb0-415c-ae3a-5462ed1a1d32","html_url":"https://github.com/Gaglia88/ruler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gaglia88%2Fruler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gaglia88%2Fruler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gaglia88%2Fruler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Gaglia88%2Fruler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Gaglia88","download_url":"https://codeload.github
.com/Gaglia88/ruler/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247644333,"owners_count":20972267,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed-computing","entity-matching","entity-resolution","similarity-join"],"created_at":"2024-11-06T03:41:15.631Z","updated_at":"2025-04-07T11:31:58.230Z","avatar_url":"https://github.com/Gaglia88.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# RulER\nRulER is a tool for Apache Spark that uses a novel technique to find similar records by applying complex joining rules on one or more attributes.\n\n---\n\nIf you use this library, please cite:\n\n- **Gagliardelli, L., Simonini, G., \u0026 Bergamaschi, S. (2020). RulER: Scaling Up Record-level Matching Rules. In EDBT 2020: 23rd International Conference on Extending Database Technology.**\n\n---\n\n\nA brief presentation about RulER is available by clicking on the image below:\n[![](http://img.youtube.com/vi/ZuIre-WO3lY/0.jpg)](http://www.youtube.com/watch?v=ZuIre-WO3lY \"\")\n\n### Contacts\nFor any questions about RulER, write to us at name.surname@unimore.it:\n* Luca Gagliardelli\n* Giovanni Simonini\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaglia88%2Fruler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgaglia88%2Fruler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaglia88%2Fruler/lists"}