{"id":27879378,"url":"https://github.com/gesistsa/webbotparser","last_synced_at":"2025-05-05T03:21:19.367Z","repository":{"id":146235288,"uuid":"614778415","full_name":"gesistsa/webbotparseR","owner":"gesistsa","description":":mag: R package to parse search engine results ","archived":false,"fork":false,"pushed_at":"2024-12-02T14:18:42.000Z","size":43139,"stargazers_count":8,"open_issues_count":2,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-03T10:02:23.743Z","etag":null,"topics":["browser-extension","rstats","rstats-package","search-engine"],"latest_commit_sha":null,"homepage":"https://gesistsa.github.io/webbotparseR/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gesistsa.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-03-16T09:54:15.000Z","updated_at":"2024-12-02T14:16:07.000Z","dependencies_parsed_at":"2023-09-16T12:27:23.012Z","dependency_job_id":"4d205fd8-fedd-4d14-9922-8b9439cd9e16","html_url":"https://github.com/gesistsa/webbotparseR","commit_stats":null,"previous_names":["gesistsa/webbotparser"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesistsa%2FwebbotparseR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesistsa%2FwebbotparseR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesistsa%2FwebbotparseR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesistsa%2FwebbotparseR/manifests","owner_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gesistsa","download_url":"https://codeload.github.com/gesistsa/webbotparseR/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252430272,"owners_count":21746629,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["browser-extension","rstats","rstats-package","search-engine"],"created_at":"2025-05-05T03:21:17.676Z","updated_at":"2025-05-05T03:21:19.329Z","avatar_url":"https://github.com/gesistsa.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n    collapse = TRUE,\n    comment = \"#\u003e\",\n    fig.path = \"man/figures/README-\",\n    out.width = \"100%\"\n)\n```\n\n# webbotparseR  \u003cimg src=\"man/figures/logo.png\" align=\"right\" height=\"139\" /\u003e\n\n\u003c!-- badges: start --\u003e\n[![Codecov test coverage](https://codecov.io/gh/schochastics/webbotparseR/branch/main/graph/badge.svg)](https://app.codecov.io/gh/gesistsa/webbotparseR?branch=main)\n[![R-CMD-check](https://github.com/schochastics/webbotparseR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/gesistsa/webbotparseR/actions/workflows/R-CMD-check.yaml)\n\u003c!-- badges: end --\u003e\n\nwebbotparseR parses search engine results that were scraped with the [WebBot](https://github.com/gesiscss/WebBot) browser extension. A similar Python library is [also available](https://github.com/gesiscss/WebBot-tutorials).\n\n## Installation\n\nYou can install the development version of webbotparseR like so:\n\n``` r\nremotes::install_github(\"schochastics/webbotparseR\")\n```\n\nThe package contains an example HTML file from a Google search on climate change.\n```{r ex_file}\nlibrary(webbotparseR)\nex_file \u003c- system.file(\"www.google.com_climatechange_text_2023-03-16_08_16_11.html\", package = \"webbotparseR\")\n```\n\nSuch search results can be parsed with the function `parse_search_results()`. The `engine` parameter specifies the\nsearch engine and the search type.\n\n```{r parse}\noutput \u003c- parse_search_results(path = ex_file, engine = \"google text\")\noutput\n```\n\nNote that images are always returned base64-encoded.\n```{r image}\noutput$image[1]\n```\n\nThe function `base64_to_img()` can be used to decode the image and save it in an appropriate format.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgesistsa%2Fwebbotparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgesistsa%2Fwebbotparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgesistsa%2Fwebbotparser/lists"}