{"id":20074743,"url":"https://github.com/greenelab/library-access","last_synced_at":"2026-03-11T16:03:27.073Z","repository":{"id":79359523,"uuid":"104778630","full_name":"greenelab/library-access","owner":"greenelab","description":"Collecting data on whether library access to scholarly literature","archived":false,"fork":false,"pushed_at":"2018-03-14T20:52:30.000Z","size":54,"stargazers_count":6,"open_issues_count":1,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-08-21T10:48:30.218Z","etag":null,"topics":["articles","catalog","database","dataset","doi","journals","library","literature","sci-hub","tool"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/greenelab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE-BSD-3-Clause.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-25T17:08:04.000Z","updated_at":"2025-03-14T13:12:38.000Z","dependencies_parsed_at":"2023-03-12T07:49:37.211Z","dependency_job_id":null,"html_url":"https://github.com/greenelab/library-access","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/greenelab/library-access","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/greenelab%2Flibrary-access","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/greenelab%2Flibrary-access/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/greenelab%2Flibrary-access/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/greenelab%2Flibrary-access/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/greenelab","download_url":"https://codeload.github.com/greenelab/library-access/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/greenelab%2Flibrary-access/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30386995,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T14:10:17.325Z","status":"ssl_error","status_checked_at":"2026-03-11T14:09:37.934Z","response_time":84,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["articles","catalog","database","dataset","doi","journals","library","literature","sci-hub","tool"],"created_at":"2024-11-13T14:54:08.756Z","updated_at":"2026-03-11T16:03:27.043Z","avatar_url":"https://github.com/greenelab.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data on Library Access to Scholarly Literature\n\n[![Build Status](https://travis-ci.org/greenelab/library-access.svg?branch=master)](https://travis-ci.org/greenelab/library-access)\n\nThis repository is cataloging University Library access to scholarly literature.\nScholarly articles are identified using their DOIs.\nThe impetus for this project was [this 
discussion](https://github.com/greenelab/scihub-manuscript/issues/21 \"Potential followup: comparison to authorized access\") on the Sci-Hub Coverage Study.\n\nThe code in this repository facilitates fetching indicators of full-text availability for a list of DOIs from an OpenURL resolver. In this way, it enables large-scale analysis of bibliographic holdings / availability.\n\n## Using the Code\n\n**The code files in this repository assume that your working directory is set to the top-level directory of this repository.**\n\n### Contents of this Repository, and the Order of Their Use\n\n- `LICENSE-*.md`: License text to accompany the [License](#License) section of this Readme below.\n- `environment.yml`: Conda environment file (see [Environment](#environment) below).\n- `.gitattributes`: File with information for tracking files using [Git Large File Storage (LFS)](https://git-lfs.github.com/).\n- `library_management_system_downloader` contains the following scripts, to be used in the following order:\n\t1. `downloader_configuration_file_TEMPLATE.py` should be copied to `downloader_configuration_file.py` and edited for your own institution's OpenURL resolver (These scripts were specifically tested using the OpenURL resolver that comes with Ex Libris' Alma management software).\n\t\t- Within `downloader_configuration_file.py`, the variable `api_base_url` will be based on the OpenURL resolver / vendor that your institution uses, and thus will be different from institution to institution. To find out what that base URL should be, it may be necessary to ask your local library technology team for help and/or documentation.\n\t\t- It is additionally the case that different OpenURL resolvers may return slightly different formats of data. Thus, it may be necessary to modify the function `fulltext_indication` in the file `evaluate_api_response_for_fulltext_indication.py` to look for an XML field that the data from your institution's OpenURL resolver contains.\n\t1. 
`run_api_download_and_parse_results.py`\n\t1. `copy_and_compress_database_and_extract_tsv.py`\n- `evaluate_library_access_from_output_tsv` contains the following scripts, to be used in the following order:\n\t1. `create_stratefied_sample_of_dois.R`\n\t1. `join_doi-200_dates_to_doi-500.R`\n\t1. \\[Run `facilitate_going_through_dois_manually.R` to help fill in the `.tsv` files created by the scripts above\\]\n\t1. `penntext-accuracy-200.ipynb`\n\t1. `penntext-accuracy-500.ipynb`\n\n- `data`: \\[This is where datasets will be saved by the above scripts.\\]\n\n### Environment\n\nThis repository uses [conda](http://conda.pydata.org/docs/) to manage its environment as specified in [`environment.yml`](environment.yml).\nInstall the environment with:\n\n```sh\nconda env create --file=environment.yml\n```\n\nThen use `source activate library-access` and `source deactivate` to activate or deactivate the environment.\nOn Windows, use `activate library-access` and `deactivate` instead.\n\n## License\n\nThe files in this repository are released under the CC0 1.0 public domain dedication ([`LICENSE-CC0.md`](LICENSE-CC0.md)), excepting those that match the glob patterns listed below.\nFiles matching the following glob patterns are instead released under a BSD 3-Clause license ([`LICENSE-BSD-3-Clause.md`](LICENSE-BSD-3-Clause.md)):\n\n- `*.py`\n- `*.md`\n- `.gitignore`\n- `*.r`\n- `*.sh`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgreenelab%2Flibrary-access","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgreenelab%2Flibrary-access","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgreenelab%2Flibrary-access/lists"}