{"id":13464588,"url":"https://github.com/tidyverse/rvest","last_synced_at":"2025-12-16T13:56:06.149Z","repository":{"id":18958274,"uuid":"22178685","full_name":"tidyverse/rvest","owner":"tidyverse","description":"Simple web scraping for R","archived":false,"fork":false,"pushed_at":"2025-09-09T10:02:49.000Z","size":13418,"stargazers_count":1510,"open_issues_count":37,"forks_count":351,"subscribers_count":85,"default_branch":"main","last_synced_at":"2025-12-08T20:46:56.492Z","etag":null,"topics":["html","r","web-scraping"],"latest_commit_sha":null,"homepage":"https://rvest.tidyverse.org","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tidyverse.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":".github/SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2014-07-23T21:22:27.000Z","updated_at":"2025-12-03T14:45:36.000Z","dependencies_parsed_at":"2023-02-10T06:45:14.636Z","dependency_job_id":"abef9852-c51f-4294-9bb6-775521d0bd37","html_url":"https://github.com/tidyverse/rvest","commit_stats":{"total_commits":425,"total_committers":32,"mean_commits":13.28125,"dds":"0.12470588235294122","last_synced_commit":"c9be5b8dd9d672e84dd0dc515e3a37ab5c03111f"},"previous_names":["hadley/rvest"],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/tidyverse/rvest","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Frvest","tags_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Frvest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Frvest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Frvest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tidyverse","download_url":"https://codeload.github.com/tidyverse/rvest/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tidyverse%2Frvest/sbom","scorecard":{"id":884873,"data":{"date":"2025-08-11","repo":{"name":"github.com/tidyverse/rvest","commit":"78b3e1ff38cd55e217ca61ac1fe0cbf3b33b5b75"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.6,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":2,"reason":"Found 6/30 approved changesets -- score normalized to 2","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns 
detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: jobLevel 'contents' permission set to 'write': .github/workflows/pkgdown.yaml:23","Warn: no topLevel permission defined: .github/workflows/R-CMD-check.yaml:1","Warn: no topLevel permission defined: .github/workflows/pkgdown.yaml:1","Warn: no topLevel permission defined: .github/workflows/pr-commands.yaml:1","Warn: no topLevel permission defined: .github/workflows/test-coverage.yaml:1"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/R-CMD-check.yaml:45: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/R-CMD-check.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/R-CMD-check.yaml:47: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/R-CMD-check.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/R-CMD-check.yaml:49: update your workflow 
using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/R-CMD-check.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/R-CMD-check.yaml:55: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/R-CMD-check.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/R-CMD-check.yaml:60: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/R-CMD-check.yaml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pkgdown.yaml:25: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pkgdown.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pkgdown.yaml:27: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pkgdown.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pkgdown.yaml:29: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pkgdown.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pkgdown.yaml:33: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pkgdown.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pkgdown.yaml:44: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pkgdown.yaml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:17: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not 
pinned by hash: .github/workflows/pr-commands.yaml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:27: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:43: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:54: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:56: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:60: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pr-commands.yaml:77: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/pr-commands.yaml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/test-coverage.yaml:18: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/test-coverage.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/test-coverage.yaml:20: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/test-coverage.yaml/main?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/test-coverage.yaml:24: update your workflow using 
https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/test-coverage.yaml/main?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/test-coverage.yaml:47: update your workflow using https://app.stepsecurity.io/secureworkflow/tidyverse/rvest/test-coverage.yaml/main?enable=pin","Info:   0 out of   6 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of  17 third-party GitHubAction dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":9,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Warn: project license file does not contain an FSF or OSI license."],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 15 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-24T09:36:28.538Z","repository_id":18958274,"created_at":"2025-08-24T09:36:28.538Z","updated_at":"2025-08-24T09:36:28.538Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27765941,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-16T02:00:10.477Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["html","r","web-scraping"],"created_at":"2024-07-31T14:00:46.731Z","updated_at":"2025-12-16T13:56:06.142Z","avatar_url":"https://github.com/tidyverse.png","language":"R","funding_links":[],"categories":["All","R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE, \n  comment = \"#\u003e\",\n  fig.path = \"README-\"  \n)\n```\n\n# rvest \u003ca href=\"https://rvest.tidyverse.org\"\u003e\u003cimg src=\"man/figures/logo.png\" align=\"right\" height=\"138\" alt=\"rvest website\" /\u003e\u003c/a\u003e\n\n\u003c!-- badges: start --\u003e\n\n[![CRAN status](https://www.r-pkg.org/badges/version/rvest)](https://cran.r-project.org/package=rvest)\n[![R-CMD-check](https://github.com/tidyverse/rvest/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tidyverse/rvest/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/tidyverse/rvest/graph/badge.svg)](https://app.codecov.io/gh/tidyverse/rvest)\n\u003c!-- badges: end --\u003e\n\n## Overview\n\nrvest helps you scrape (or harvest) data from web pages.\nIt is designed to work with [magrittr](https://github.com/tidyverse/magrittr) to make it easy to express common web scraping tasks, inspired by libraries like [beautiful soup](https://www.crummy.com/software/BeautifulSoup/) and [RoboBrowser](http://robobrowser.readthedocs.io/en/latest/readme.html).\n\nIf you're scraping multiple pages, I highly recommend using rvest in concert with [polite](https://dmi3kno.github.io/polite/).\nThe polite package ensures that you're respecting the [robots.txt](https://en.wikipedia.org/wiki/Robots_exclusion_standard) and not hammering the site with too many requests.\n\n## Installation\n\n```{r, eval = FALSE}\n# The easiest way to get rvest is to install the whole tidyverse:\ninstall.packages(\"tidyverse\")\n\n# Alternatively, install just rvest:\ninstall.packages(\"rvest\")\n```\n\n## Usage\n\n```{r, message = FALSE}\nlibrary(rvest)\n\n# Start by reading a HTML page with read_html():\nstarwars \u003c- read_html(\"https://rvest.tidyverse.org/articles/starwars.html\")\n\n# Then find elements that match a css selector or XPath expression\n# using html_elements(). 
In this example, each \u003csection\u003e corresponds\n# to a different film\nfilms \u003c- starwars |\u003e html_elements(\"section\")\nfilms\n\n# Then use html_element() to extract one element per film. Here\n# the title is given by the text inside \u003ch2\u003e\ntitle \u003c- films |\u003e \n  html_element(\"h2\") |\u003e \n  html_text2()\ntitle\n\n# Or use html_attr() to get data out of attributes. html_attr() always\n# returns a string so we convert it to an integer using a readr function\nepisode \u003c- films |\u003e \n  html_element(\"h2\") |\u003e \n  html_attr(\"data-id\") |\u003e \n  readr::parse_integer()\nepisode\n```\n\nIf the page contains tabular data, you can convert it directly to a data frame with `html_table()`:\n\n```{r}\nhtml \u003c- read_html(\"https://en.wikipedia.org/w/index.php?title=The_Lego_Movie\u0026oldid=998422565\")\n\nhtml |\u003e \n  html_element(\".tracklist\") |\u003e \n  html_table()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Frvest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftidyverse%2Frvest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftidyverse%2Frvest/lists"}