{"id":16708023,"url":"https://github.com/yvolk/personsmerger","last_synced_at":"2025-07-19T03:03:46.167Z","repository":{"id":77880348,"uuid":"343893340","full_name":"yvolk/PersonsMerger","owner":"yvolk","description":"Real life deep learning trained models to identify same person's records coming from different systems. Evolutionary algorithm to train models. With Test cases and sample trained models. Run tests to see learning and trained models in action. Written in Kotlin.","archived":false,"fork":false,"pushed_at":"2023-04-09T08:04:26.000Z","size":110,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-14T23:44:55.398Z","etag":null,"topics":["deep-learning","evolutionary-algorithm","kotlin","model-training"],"latest_commit_sha":null,"homepage":"","language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/yvolk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-02T19:46:18.000Z","updated_at":"2021-12-22T17:20:09.000Z","dependencies_parsed_at":null,"dependency_job_id":"f8639a44-7cf2-4f80-b695-c0926488c982","html_url":"https://github.com/yvolk/PersonsMerger","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/yvolk/PersonsMerger","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yvolk%2FPersonsMerger","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yvolk%2FPersonsMerger/tags","releases_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/yvolk%2FPersonsMerger/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yvolk%2FPersonsMerger/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/yvolk","download_url":"https://codeload.github.com/yvolk/PersonsMerger/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/yvolk%2FPersonsMerger/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265878930,"owners_count":23843038,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","evolutionary-algorithm","kotlin","model-training"],"created_at":"2024-10-12T19:41:48.470Z","updated_at":"2025-07-19T03:03:46.117Z","avatar_url":"https://github.com/yvolk.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"A real-life project that solves the Systems Integration task of identifying records\nof the same persons in data sets that come from different systems.\nThe number of identifiable attributes and even their values differ (e.g. 
due to errors and \nhistorical changes),\nand there are no strict rules for when to consider two persons\nfrom different sources the same person, when to treat them as different, \nand when to decide that the answer is \"unknown\" to the system and may need a manual decision step.\n\nThe automatic decisions are made using *trained models* that are obtained with an \n[Evolutionary algorithm](https://en.wikipedia.org/wiki/Evolutionary_algorithm) \nthat uses default values of *weights* for the initial models and then iteratively, \ngeneration by generation, breeds better models, applying mutation and selection \nat each iteration.\n\nThe **PersonsData** set of sample pairs of persons' records with expected outputs \n(more than two hundred pairs) is used to train the models.\n\n**ModelTest** test cases for the initial and trained models allow quick evaluation of\nthese models' fitness against the PersonsData.\n\n**Learning** test cases implement the Evolutionary algorithm. \n\nRun these tests to see learning and trained models in action! \n\nWritten in the Kotlin programming language.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyvolk%2Fpersonsmerger","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fyvolk%2Fpersonsmerger","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fyvolk%2Fpersonsmerger/lists"}