{"id":25368310,"url":"https://github.com/discoverygarden/ri-solr-diff","last_synced_at":"2025-04-09T06:18:04.098Z","repository":{"id":19635739,"uuid":"22887767","full_name":"discoverygarden/ri-solr-diff","owner":"discoverygarden","description":null,"archived":false,"fork":false,"pushed_at":"2018-03-13T12:49:02.000Z","size":56,"stargazers_count":0,"open_issues_count":0,"forks_count":2,"subscribers_count":19,"default_branch":"1.x","last_synced_at":"2025-02-15T00:37:01.832Z","etag":null,"topics":["tool"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/discoverygarden.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-08-12T18:26:12.000Z","updated_at":"2021-07-28T20:02:09.000Z","dependencies_parsed_at":"2022-08-24T14:02:17.624Z","dependency_job_id":null,"html_url":"https://github.com/discoverygarden/ri-solr-diff","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/discoverygarden%2Fri-solr-diff","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/discoverygarden%2Fri-solr-diff/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/discoverygarden%2Fri-solr-diff/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/discoverygarden%2Fri-solr-diff/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/discoverygarden","download_url":"https://codeload.github.com/discoverygarden/ri-solr-diff/tar.gz/refs/heads/1.x","host":{"name":"GitHub","url":
"https://github.com","kind":"github","repositories_count":247987235,"owners_count":21028895,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["tool"],"created_at":"2025-02-15T00:37:06.244Z","updated_at":"2025-04-09T06:18:04.074Z","avatar_url":"https://github.com/discoverygarden.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ri-solr-diff\n\n## Introduction\n\nA utility to identify and resolve differences between a Fedora Commons installation and a Solr index of its contents. It also provides a utility to re-index a list of PIDs read from a file supplied on stdin.\n\n## Requirements\n\nThis program requires:\n\n* [requests](http://docs.python-requests.org/)\n* [python-dateutil](http://labix.org/python-dateutil)\n\n## Installation\n\nIt is recommended to install this utility inside a [virtualenv](http://virtualenv.readthedocs.org/en/latest/) virtual Python environment.\n\nThis program can be installed in either of two similar ways, both of which should also install the required dependencies:\n* If [pip](https://pypi.python.org/pypi/pip) is installed, one can run:\n```bash\npip install git+https://github.com/discoverygarden/ri-solr-diff\n```\n* If just [setuptools](https://pypi.python.org/pypi/setuptools) is installed, one can run:\n```bash\ngit clone https://github.com/discoverygarden/ri-solr-diff\ncd ri-solr-diff\npython setup.py install\n```\n\nIt is also possible (though more work) to resolve the dependencies (requests and python-dateutil) manually and make them available when running 
`ri_solr_diff.py` on its own (or directly through the interpreter, anyway).\n\n## Usage\n\nOutput of `ri_solr_diff.py --help`:\n```\nusage: ri_solr_diff.py [-h] [--ri RI] [--ri-user RI_USER] [--ri-pass RI_PASS]\n                       [--solr SOLR]\n                       [--solr-last-modified-field SOLR_LAST_MODIFIED_FIELD]\n                       [--keep-docs] [--gsearch GSEARCH]\n                       [--gsearch-user GSEARCH_USER]\n                       [--gsearch-pass GSEARCH_PASS]\n                       [--query-limit QUERY_LIMIT] [--dryrun]\n                       (--all | --last-n-days LAST_N_DAYS | --last-n-seconds LAST_N_SECONDS | --since SINCE | --config-file CONFIG_FILE)\n                       [--verbose | --quiet]\n\nIdentify and resolve differences between a Fedora Resource and Solr index.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --ri RI               URL of the resource index at the host. (default:\n                        http://localhost:8080/fedora/risearch)\n  --ri-user RI_USER     Username to communicate with resource index. (default:\n                        fedoraAdmin)\n  --ri-pass RI_PASS     Password to communicate with resource index. (default:\n                        islandora)\n  --solr SOLR           URL of the Solr end-point. (default:\n                        http://localhost:8080/solr)\n  --solr-last-modified-field SOLR_LAST_MODIFIED_FIELD\n                        The Solr field storing the last modified date of each\n                        object. (default: fgs_lastModifiedDate_dt)\n  --keep-docs           Keep docs in Solr which do not appear to have related\n                        objects in Fedora. The default is to delete Solr\n                        documents in this state.\n  --gsearch GSEARCH     URL of the GSearch end-point. 
(default:\n                        http://localhost:8080/fedoragsearch/rest)\n  --gsearch-user GSEARCH_USER\n                        Username to communicate with GSearch servelet.\n                        (default: fedoraAdmin)\n  --gsearch-pass GSEARCH_PASS\n                        Password to communicate with GSearch servelet.\n                        (default: islandora)\n  --query-limit QUERY_LIMIT\n                        The number of results which will be fetched from the\n                        RI and Solr at a time. (default: 10000)\n  --dryrun              Diff without making changes (default: False)\n  --all                 Compare all objects.\n  --last-n-days LAST_N_DAYS\n                        Compare objects modified in the last n days.\n  --last-n-seconds LAST_N_SECONDS\n                        Compare objects modified in the last n seconds.\n  --since SINCE         Compare objects modified since the given Unix\n                        timestamp.\n  --config-file CONFIG_FILE\n                        Provide a JSON configuration file of arguments to be\n                        used in place of the CLI.\n  --verbose, -v         Adjust verbosity of output. More times == more\n                        verbose.\n  --quiet, -q           Adjust verbosity of output. More times == less\n                        verbose.  \n\nExit code will be \"0\" if everything was up-to-date. If documents were updated,\nthe exit code will be \"1\" (though may also be \"1\" due to runtime errors). If\nconfig-file is specified and it does not exist \"-1\" will be exited with.\n```\nOutput of `solr_reindex.py --help`:\n```\nusage: solr_reindex.py [-h] [--gsearch GSEARCH] [--gsearch-user GSEARCH_USER]\n                       [--gsearch-pass GSEARCH_PASS]\n\nTrigger a Solr re-index for a list of PIDs parsed from CSV.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --gsearch GSEARCH     URL of the GSearch end-point. 
(default:\n                        http://localhost:8080/fedoragsearch/rest)\n  --gsearch-user GSEARCH_USER\n                        Username to communicate with GSearch servelet.\n                        (default: fedoraAdmin)\n  --gsearch-pass GSEARCH_PASS\n                        Password to communicate with GSearch servelet.\n                        (default: islandora)\n\nExit code will be \"1\" if re-index was succcesful, \"0\" otherwise.\n```\n\nConfiguration file:\n\nOptionally a JSON configuration file can be specified in place of command-line arguments using the `--config-file` argument. The configuration file will contain key/value pairs of any of the allowed arguments such as:\n```json\n{\n   \"ri\":\"http:\\/\\/localhost:8080\\/fedora\\/risearch\",\n   \"ri-user\":\"fedoraAdmin\",\n   \"ri-pass\":\"islandora\",\n   \"solr\":\"http:\\/\\/localhost:8080\\/solr\",\n   \"solr-last-modified-field\":\"fgs_lastModifiedDate_dt\",\n   \"keep-docs\":true,\n   \"gsearch\":\"http:\\/\\/localhost:8080\\/fedoragsearch\\/rest\",\n   \"gsearch-user\":\"fedoraAdmin\",\n   \"gsearch-pass\":\"islandora\",\n   \"query-limit\":10000,\n   \"all\":true,\n   \"verbose\":true\n}\n```\n\nExample of Solr re-indexing: `solr_reindex.py \u003c /mydirectory/file.txt`\n\n## Maintainers/Sponsors\n\nCurrent maintainers:\n\n* [discoverygarden Inc.](https://github.com/discoverygarden)\n\nSponsors:\n\n* [United States Department of Agriculture: National Agricultural Library](https://www.nal.usda.gov/)\n\n## License\n\n[GPLv3](http://www.gnu.org/licenses/gpl-3.0.txt)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiscoverygarden%2Fri-solr-diff","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdiscoverygarden%2Fri-solr-diff","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdiscoverygarden%2Fri-solr-diff/lists"}