{"id":26322740,"url":"https://github.com/ropensci/fulltext","last_synced_at":"2025-03-15T17:01:54.751Z","repository":{"id":17511247,"uuid":"20299174","full_name":"ropensci-archive/fulltext","owner":"ropensci-archive","description":":warning: ARCHIVED :warning: Search across and get full text for OA \u0026 closed journals","archived":true,"fork":false,"pushed_at":"2022-09-09T09:11:01.000Z","size":6396,"stargazers_count":270,"open_issues_count":0,"forks_count":46,"subscribers_count":18,"default_branch":"master","last_synced_at":"2024-05-21T22:11:09.646Z","etag":null,"topics":["crossref","extract-text","metadata","open-access","pdf","r","r-package","rstats","text-ming","xml"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ropensci-archive.png","metadata":{"files":{"readme":"README-not.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-05-29T15:51:10.000Z","updated_at":"2024-01-21T20:16:26.000Z","dependencies_parsed_at":"2023-01-11T20:26:44.755Z","dependency_job_id":null,"html_url":"https://github.com/ropensci-archive/fulltext","commit_stats":null,"previous_names":["ropensci/fulltext"],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci-archive%2Ffulltext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci-archive%2Ffulltext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci-archive%2Ffulltext/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ropensci-archive%2Ffulltext/manifests","owner_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ropensci-archive","download_url":"https://codeload.github.com/ropensci-archive/fulltext/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243762264,"owners_count":20343979,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crossref","extract-text","metadata","open-access","pdf","r","r-package","rstats","text-ming","xml"],"created_at":"2025-03-15T17:01:30.153Z","updated_at":"2025-03-15T17:01:54.698Z","avatar_url":"https://github.com/ropensci-archive.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"\n\nfulltext\n========\n\n[![cran checks](https://cranchecks.info/badges/flavor/release/fulltext)](https://cranchecks.info/pkgs/fulltext)\n[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)\n[![R-check](https://github.com/ropensci/fulltext/workflows/R-check/badge.svg)](https://github.com/ropensci/fulltext/actions/)\n[![codecov](https://codecov.io/gh/ropensci/fulltext/branch/master/graph/badge.svg)](https://codecov.io/gh/ropensci/fulltext)\n[![rstudio mirror downloads](https://cranlogs.r-pkg.org/badges/fulltext)](https://github.com/r-hub/cranlogs.app)\n[![cran version](https://www.r-pkg.org/badges/version/fulltext)](https://cran.r-project.org/package=fulltext)\n\n__Get full text research articles__\n\nCheckout the [package docs][docs] and the [fulltext manual][ftbook] to get 
started.\n\n-----\n\nrOpenSci has a number of R packages to get either full text, metadata, or both from various publishers. The goal of `fulltext` is to integrate these packages to create a single interface to many data sources.\n\n`fulltext` makes it easy to do text-mining by supporting the following steps:\n\n* Search for articles - `ft_search`\n* Fetch articles - `ft_get`\n* Get links for full text articles (xml, pdf) - `ft_links`\n* Extract text from articles / convert formats - `ft_extract`\n* Collect all texts into a data.frame - `ft_table`\n\nPreviously supported use cases, extracted out to other packages:\n\n* Collect bits of articles that you actually need - moved to package `pubchunks`\n* Supplementary data from papers - moved to package `suppdata`\n\n\nIt's easy to go from the outputs of `ft_get` to text-mining packages such as \n`tm` and `quanteda`.\n\nData sources in `fulltext` include:\n\n* Crossref - via the `rcrossref` package\n* Public Library of Science (PLOS) - via the `rplos` package\n* BioMed Central\n* arXiv - via the `aRxiv` package\n* bioRxiv - via the `biorxivr` package\n* PMC/PubMed via Entrez - via the `rentrez` package\n* Scopus - internal tooling\n* Semantic Scholar - internal tooling\n* Many more are supported via the above sources (e.g., _Royal Society Open Science_ is\navailable via PubMed)\n* We __will__ add more as publishers open up and as we have time. See the issues\n\nAuthentication: A number of publishers require authentication via an API key, and some use even more\ndraconian authentication processes that involve checking IP addresses. We are working on supporting\nthe authentication methods of the different publishers, but all the OA content\nis already easily available. See the **Authentication** section in `?fulltext-package` after \nloading the package.\n\nWe'd love your feedback. 
Let us know what you think in the issue tracker.\n\n\n## Installation\n\nStable version from CRAN\n\n\n```r\ninstall.packages(\"fulltext\")\n```\n\nDevelopment version from GitHub\n\n\n```r\nremotes::install_github(\"ropensci/fulltext\")\n```\n\nLoad the package\n\n\n```r\nlibrary('fulltext')\n```\n\n## Interoperability with other packages downstream\n\nNote: this example is not included in the vignettes, as that would require adding the two packages below to Suggests. For more examples and documentation, see the [package docs][docs] and the [fulltext manual][ftbook].\n\n\n```r\n# set the cache location for downloaded files\ncache_options_set(path = (td \u003c- 'foobar'))\n# fetch two eLife articles as PDFs\nres \u003c- ft_get(c('10.7554/eLife.03032', '10.7554/eLife.32763'), type = \"pdf\")\nlibrary(readtext)\n# read the text of every cached PDF\nx \u003c- readtext::readtext(file.path(cache_options_get()$path, \"*.pdf\"))\n```\n\n\n```r\nlibrary(quanteda)\n# build a quanteda corpus from the readtext output\nquanteda::corpus(x)\n```\n\n## Contributors\n\n* Scott Chamberlain\n* Will Pearse\n* Katrin Leinweber\n\n## Meta\n\n* Please [report any issues or bugs](https://github.com/ropensci/fulltext/issues).\n* License: MIT\n* Get citation information for `fulltext`: `citation(package = 'fulltext')`\n* Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms.\n\n\n[docs]: https://docs.ropensci.org/fulltext/\n[ftbook]: https://books.ropensci.org/fulltext/\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Ffulltext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fropensci%2Ffulltext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fropensci%2Ffulltext/lists"}