{"id":19640655,"url":"https://github.com/daniel-km/omeka-s-module-deduplicate","last_synced_at":"2025-02-26T22:45:17.323Z","repository":{"id":148008284,"uuid":"586876123","full_name":"Daniel-KM/Omeka-S-module-Deduplicate","owner":"Daniel-KM","description":"Module for Omeka S to deduplicate resources based on properties","archived":false,"fork":false,"pushed_at":"2024-04-24T10:13:27.000Z","size":54,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-09T18:57:58.974Z","etag":null,"topics":["curation","deduplication","omeka-s","omeka-s-module"],"latest_commit_sha":null,"homepage":"","language":"PHP","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Daniel-KM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-01-09T12:48:28.000Z","updated_at":"2024-04-24T10:13:30.000Z","dependencies_parsed_at":"2025-01-10T10:54:02.103Z","dependency_job_id":null,"html_url":"https://github.com/Daniel-KM/Omeka-S-module-Deduplicate","commit_stats":null,"previous_names":[],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daniel-KM%2FOmeka-S-module-Deduplicate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daniel-KM%2FOmeka-S-module-Deduplicate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daniel-KM%2FOmeka-S-module-Deduplicate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Daniel-KM%2FOmeka
-S-module-Deduplicate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Daniel-KM","download_url":"https://codeload.github.com/Daniel-KM/Omeka-S-module-Deduplicate/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240947648,"owners_count":19883030,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["curation","deduplication","omeka-s","omeka-s-module"],"created_at":"2024-11-11T14:06:22.405Z","updated_at":"2025-02-26T22:45:17.304Z","avatar_url":"https://github.com/Daniel-KM.png","language":"PHP","funding_links":[],"categories":[],"sub_categories":[],"readme":"Deduplicate (module for Omeka S)\n================================\n\n\u003e __New versions of this module and support for Omeka S version 3.0 and above\n\u003e are available on [GitLab], which seems to respect users and privacy better\n\u003e than the previous repository.__\n\nSee the [Lisez-moi] in French.\n\n[Deduplicate] is a module for [Omeka S] that lets you search for duplicate\nresources based on a value and merge them.\n\nThe search for duplicate resources can be strict or use one of these heuristics:\n\n- [Similar text]\n  The number of matching characters is calculated by finding the longest first\n  common substring, and then doing this for the prefixes and the suffixes,\n  recursively. 
The lengths of all found common substrings are added.\n- [Levenshtein distance]\n  The distance is the minimum number of single-character edits (insertions,\n  deletions or substitutions) required to change one word into the other.\n- [Soundex]\n  Phonetic algorithm for indexing names by sound, as pronounced in British\n  English.\n- [Metaphone]\n  Improved version of Soundex.\n\n\nInstallation\n------------\n\nSee the general end-user documentation for [installing a module].\n\nThis module requires the modules [Common] and [Advanced Search], which should be\ninstalled first.\n\n* From the zip\n\nDownload the latest release [Deduplicate.zip] from the list of releases, and\nuncompress it in the `modules` directory.\n\n* From the source and for development\n\nIf the module was installed from the source, rename the module folder to\n`Deduplicate`.\n\nThen install it like any other Omeka module and follow the config instructions.\n\n\nUsage\n-----\n\n- Click on \"Deduplicate\" in the left menu, under modules.\n- Select the property and fill in the value to check.\n- Click \"Submit\".\n- A new page lists the resources matching the query, if any, whose value is\n  the same or nearly the same.\n- Click the resource you want to keep and the ones you want to merge.\n- Click on \"Merge\".\n\nThe checked resources will be removed, and all resources linked to them will be\nattached to the selected resource.\n\nTo process resource types other than items, or to filter the resources on which\nthe search is done, go to a resource browse page and run a query, then either\nclick \"Deduplicate resources\" in the batch edit dropdown, or select some\nresources and choose \"Deduplicate selected resources\", and fill in the form.\n\n\nTODO\n----\n\n- [ ] Merge selected properties (by row) or selected value (by resource).\n\n\nWarning\n-------\n\nUse it at your own risk.\n\nIt’s always recommended to back up your files and your databases and to 
check\nyour archives regularly so you can roll back if needed.\n\n\nTroubleshooting\n---------------\n\nSee online issues on the [module issues] page.\n\n\nLicense\n-------\n\nThis module is published under the [CeCILL v2.1] license, compatible with\n[GNU/GPL] and approved by [FSF] and [OSI].\n\nThis software is governed by the CeCILL license under French law and abiding by\nthe rules of distribution of free software. You can use, modify and/ or\nredistribute the software under the terms of the CeCILL license as circulated by\nCEA, CNRS and INRIA at the following URL \"http://www.cecill.info\".\n\nAs a counterpart to the access to the source code and rights to copy, modify and\nredistribute granted by the license, users are provided only with a limited\nwarranty and the software’s author, the holder of the economic rights, and the\nsuccessive licensors have only limited liability.\n\nIn this respect, the user’s attention is drawn to the risks associated with\nloading, using, modifying and/or developing or reproducing the software by the\nuser in light of its specific status of free software, that may mean that it is\ncomplicated to manipulate, and that also therefore means that it is reserved for\ndevelopers and experienced professionals having in-depth computer knowledge.\nUsers are therefore encouraged to load and test the software’s suitability as\nregards their requirements in conditions enabling the security of their systems\nand/or data to be ensured and, more generally, to use and operate it in the same\nconditions as regards security.\n\nThe fact that you are presently reading this means that you have had knowledge\nof the CeCILL license and that you accept its terms.\n\n\nCopyright\n---------\n\n* Copyright Daniel Berthereau, 2022-2024\n\nThese features were built for the future digital library [Manioc] of the\nUniversité des Antilles and Université de la Guyane, currently managed with\n[Greenstone].\n\n\n[Deduplicate]: 
https://gitlab.com/Daniel-KM/Omeka-S-module-Deduplicate\n[Lisez-moi]: https://gitlab.com/Daniel-KM/Omeka-S-module-Deduplicate/-/blob/master/LISEZMOI.md\n[Omeka S]: https://omeka.org/s\n[Similar text]: https://www.php.net/manual/en/function.similar-text\n[Levenshtein distance]: https://en.wikipedia.org/wiki/Levenshtein_distance\n[Soundex]: https://en.wikipedia.org/wiki/Soundex\n[Metaphone]: https://en.wikipedia.org/wiki/Metaphone\n[installing a module]: https://omeka.org/s/docs/user-manual/modules/#installing-modules\n[Common]: https://gitlab.com/Daniel-KM/Omeka-S-module-Common\n[Advanced Search]: https://gitlab.com/Daniel-KM/Omeka-S-module-AdvancedSearch\n[module issues]: https://gitlab.com/Daniel-KM/Omeka-S-module-Deduplicate/-/issues\n[CeCILL v2.1]: https://www.cecill.info/licences/Licence_CeCILL_V2.1-en.html\n[GNU/GPL]: https://www.gnu.org/licenses/gpl-3.0.html\n[FSF]: https://www.fsf.org\n[OSI]: http://opensource.org\n[Manioc]: http://www.manioc.org\n[Greenstone]: http://www.greenstone.org\n[GitLab]: https://gitlab.com/Daniel-KM\n[Daniel-KM]: https://gitlab.com/Daniel-KM \"Daniel Berthereau\"\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaniel-km%2Fomeka-s-module-deduplicate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaniel-km%2Fomeka-s-module-deduplicate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaniel-km%2Fomeka-s-module-deduplicate/lists"}