{"id":32203997,"url":"https://github.com/mikkelvembye/aiscreenr","last_synced_at":"2025-10-22T04:49:51.096Z","repository":{"id":163650801,"uuid":"639108180","full_name":"MikkelVembye/AIscreenR","owner":"MikkelVembye","description":"AI screening tools in R for systematic reviewing","archived":false,"fork":false,"pushed_at":"2025-10-20T11:57:19.000Z","size":66643,"stargazers_count":14,"open_issues_count":5,"forks_count":5,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-10-22T04:49:45.979Z","etag":null,"topics":["gpt","openai","screening","systematic-review"],"latest_commit_sha":null,"homepage":"https://mikkelvembye.github.io/AIscreenR/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/MikkelVembye.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-05-10T19:24:30.000Z","updated_at":"2025-10-08T06:14:53.000Z","dependencies_parsed_at":"2024-11-26T12:21:28.636Z","dependency_job_id":"cc8d598e-7f1e-414d-b7e9-5d6cde7d67ca","html_url":"https://github.com/MikkelVembye/AIscreenR","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/MikkelVembye/AIscreenR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MikkelVembye%2FAIscreenR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MikkelVembye%2FAIscreenR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Mik
kelVembye%2FAIscreenR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MikkelVembye%2FAIscreenR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/MikkelVembye","download_url":"https://codeload.github.com/MikkelVembye/AIscreenR/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/MikkelVembye%2FAIscreenR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":280382976,"owners_count":26321423,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-22T02:00:06.515Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gpt","openai","screening","systematic-review"],"created_at":"2025-10-22T04:49:50.164Z","updated_at":"2025-10-22T04:49:51.081Z","avatar_url":"https://github.com/MikkelVembye.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n\u003ca href=\"https://mikkelvembye.github.io/AIscreenR/\"\u003e\u003cimg src=\"man/figures/AIscreenR_hex.png\" align=\"right\" width=\"180\" /\u003e\u003c/a\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\",\n  #eval = httr2::secret_has_key(\"AISCREENR_KEY\")\n  eval = FALSE\n)\n```\n\n# AIscreenR: AI screening tools in R for systematic reviewing \n\n\u003c!-- badges: start --\u003e\n[![R-CMD-check](https://github.com/MikkelVembye/AIscreenR/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/MikkelVembye/AIscreenR/actions/workflows/R-CMD-check.yaml)\n[![CRAN Version](https://www.r-pkg.org/badges/version/AIscreenR)](https://cran.r-project.org/package=AIscreenR)\n[![Monthly Downloads](https://cranlogs.r-pkg.org/badges/AIscreenR)](https://cranlogs.r-pkg.org/badges/AIscreenR)\n[![Total Downloads](https://cranlogs.r-pkg.org/badges/grand-total/AIscreenR)](https://cranlogs.r-pkg.org/badges/grand-total/AIscreenR)\n\u003c!-- badges: end --\u003e\n\nThe goal of AIscreenR is to use AI tools to support screening processes (including title and abstract screening) in systematic reviews and related literature reviews. At the current stage, the main aim of the `AIscreenR` package is to test and use OpenAI's GPT API models as second screeners of titles and abstracts or, alternatively, to reduce the number of references that humans need to screen. The package lets users access OpenAI's GPT API models through the [https://api.openai.com/v1/chat/completions](https://platform.openai.com/docs/models/model-endpoint-compatibility) endpoint. In future developments, we may add further LLMs, such as API models from Claude 2. 
For now, we invite other researchers to test this software, so that we, as a review community, can get a better understanding of the performance of OpenAI's GPT API models for title and abstract screening in high-quality reviews. For guidance on how to conduct reliable title and abstract screenings with GPT API models, see [Vembye et al. (2025)](https://psycnet.apa.org/record/2026-37236-001). \n\n## Installation\n\nInstall the latest release from CRAN:\n\n```{r, eval = FALSE}\ninstall.packages(\"AIscreenR\")\n```\n\nYou can install the development version of AIscreenR from [GitHub](https://github.com/) with:\n\n``` {r}\n# install.packages(\"devtools\")\ndevtools::install_github(\"MikkelVembye/AIscreenR\")\n```\n\nSetting the API key and checking rate limits:\n\n```{r, eval=FALSE}\n# Find your API key at https://platform.openai.com/account/api-keys \n# Thereafter, either encrypt it with the secret functions from the httr2 package\n# (see https://httr2.r-lib.org/reference/secrets.html) or run set_api_key() \n# and then enter your key.\nlibrary(AIscreenR)\nlibrary(synthesisr)\nlibrary(tibble)\nlibrary(dplyr)\nlibrary(future)\n\n# Setting the API key\nset_api_key()\n\n# Obtain rate limits info (Default is \"gpt-4o-mini\")\nrate_limits \u003c- rate_limits_per_minute()\nrate_limits\n#\u003e # A tibble: 1 × 3\n#\u003e   model       requests_per_minute tokens_per_minute\n#\u003e   \u003cchr\u003e                     \u003cdbl\u003e             \u003cdbl\u003e\n#\u003e 1 gpt-4o-mini               30000         150000000\n```\n\nHow to load RIS files. In this example, the RIS files were downloaded from \n[EPPI-Reviewer](https://eppi.ioe.ac.uk/cms/Default.aspx?tabid=2914). 
\n\n```{r, eval = FALSE}\n\nexcl_path \u003c- system.file(\"extdata\", \"excl_tutorial.ris\", package = \"AIscreenR\")\n\n# Loading RIS file data with synthesisr \nris_dat_excl \u003c- read_refs(excl_path) |\u003e \n  select(studyid = eppi_id, title, abstract) |\u003e \n  as_tibble() |\u003e \n  mutate(\n    human_code = 0 # Indicating exclusion\n  )\n\nincl_path \u003c- system.file(\"extdata\", \"incl_tutorial.ris\", package = \"AIscreenR\")\n\nris_dat_incl \u003c- read_refs(incl_path) |\u003e \n  select(studyid = eppi_id, title, abstract) |\u003e \n  as_tibble() |\u003e \n  mutate(\n    human_code = 1 # Indicating inclusion\n  )\n\n# Combine excluded and included references into one data set\nfilges2015_dat \u003c- bind_rows(ris_dat_excl, ris_dat_incl)\nhead(filges2015_dat, 10)\n#\u003e # A tibble: 10 × 4\n#\u003e    studyid title                                             abstract human_code\n#\u003e    \u003cchr\u003e   \u003cchr\u003e                                             \u003cchr\u003e         \u003cdbl\u003e\n#\u003e  1 9434957 Estimating and communicating prognosis in advanc… \"Progno…          0\n#\u003e  2 9433838 Self-Directed Behavioral Family Intervention: Do… \"Behavi…          0\n#\u003e  3 9431171 Frequency domain source localization shows state… \"The to…          0\n#\u003e  4 9433968 A Review of: 'Kearney, C. A. (2010). 
Helping Chi… \"The ar…          0\n#\u003e  5 9434460 Topographic differences in the adolescent matura… \"STUDY …          0\n#\u003e  6 9433554 BOOK REVIEW                                       \"The ar…          0\n#\u003e  7 9435130 Rapid improvement of depression and quality of l… \"Backgr…          0\n#\u003e  8 9432040 Pictorial cognitive task solving and dynamics of… \"AIMS: …          0\n#\u003e  9 9434093 Enhancing the Impact of Parent Training Through … \"New an…          0\n#\u003e 10 9431505 EEG spectrum as information carrier               \"Sponta…          0\n```\n```{r, echo=FALSE, include=FALSE}\nhead(filges2015_dat, 10)\n```\n\nExample of how to enter a prompt in R. Can also be done in Word (see vignette).\n\n```{r, eval = FALSE}\nprompt \u003c- \"Evaluate the following study based on the selection criteria\n for a systematic review on the effects of family-based interventions on drug abuse\n reduction for young people in treatment for non-opioid drug use.\n A family-based intervention (FFT) is equivalent to a behavior focused\n family therapy, where young people’s drug use is understood in relation to family\n behavior problems. Family-based interventions also includes manual-based family therapies as\n it targets young people and their families as a system throughout treatment, and thereby recognizes\n the important role of the family system in the development and treatment of young people’s drug use\n problems. FFT was developed in the late 1980s on request from the US National Institute on Drug Abuse\n (NIDA). The development of FFT was initially heavily inspired by the alcohol abuse program\n Community Reinforcement Approach (CRA), which was aimed at restructuring the environment\n to reinforce non-alcohol associated activities. 
FFT developed to have more emphasis on\n contingency contracting, impulse control strategies specific to drug use,\n and increased emphasis on involvement of family members in treatment.\n FFT is designed to accommodate diverse populations of youths with a variety of behavioral,\n cultural and individual preferences. FFT has evolved for use in severe behavioral disturbances\n known to co-exist with substance use and dependence, and the core interventions\n have been enhanced to address several mental health related problems commonly occurring\n as comorbid conditions in drug use treatment participant.  For each study,\n I would like you to assess:  1) Is the study about a family-based intervention,\n such as Functional Family Therapy, Multidimensional Family Therapy, or\n Behavioral Family Therapy? (Outpatient manual-based interventions of any\n duration delivered to young people and their families). If not, exclude study.\n 2) Are the participants in outpatient drug treatment primarily\n for non-opioid drug use? 
3) Are the participants within age 11–21?\"\n```\n\nApproximate the price of the screening before running it.\n\n```{r, eval = FALSE}\napp_obj \u003c- \n  approximate_price_gpt(\n    data = filges2015_dat,\n    prompt = prompt,\n    studyid = studyid, # indicate the variable with the studyid in the data\n    title = title, # indicate the variable with the titles in the data\n    abstract = abstract, # indicate the variable with the abstracts in the data\n    model = \"gpt-4o-mini\",\n    rep = 1 \n  )\n\napp_obj\n#\u003e The approximate price of the (simple) screening will be around $0.0476.\n\napp_obj$price_dollar\n#\u003e [1] 0.0476\napp_obj$price_data\n#\u003e # A tibble: 1 × 6\n#\u003e   prompt   model       iterations input_price_dollar output_price_dollar\n#\u003e   \u003cchr\u003e    \u003cchr\u003e            \u003cdbl\u003e              \u003cdbl\u003e               \u003cdbl\u003e\n#\u003e 1 Prompt 1 gpt-4o-mini          1             0.0458             0.00178\n#\u003e # ℹ 1 more variable: total_price_dollar \u003cdbl\u003e\n```\n\nExample of how to conduct a simple screening, which returns `1` if a reference should be included,\n`0` if excluded, and `1.1` if uncertain. 
\n\n```{r, warning=FALSE, eval = FALSE}\n# Subsetting the number of references to speed up the tutorial screening\nplan(multisession)\ntest_obj \u003c- \n  tabscreen_gpt(\n    data = filges2015_dat[c(1:5, 266:270),],\n    prompt = prompt, \n    studyid = studyid, # indicate the variable with the studyid in the data\n    title = title, # indicate the variable with the titles in the data\n    abstract = abstract, # indicate the variable with the abstracts in the data\n    model = \"gpt-4o-mini\",\n    reps = 1 # Number of times the same question is asked to ChatGPT\n  ) \n#\u003e * The approximate price of the current (simple) screening will be around $0.0017.\n#\u003e * Consider removing references without abstracts since these can distort the accuracy of the screening.\n#\u003e Progress: ───────────────────────────────────────────────────────────────────────────────────── 100%\nplan(sequential)\ntest_obj\n#\u003e \n#\u003e Find the final result dataset via result_object$answer_data\n\n# Data sets in object\nprice_dat \u003c- test_obj$price_data\nprice_dat\n#\u003e # A tibble: 1 × 6\n#\u003e   prompt model       iterations input_price_dollar output_price_dollar\n#\u003e    \u003cint\u003e \u003cchr\u003e            \u003cdbl\u003e              \u003cdbl\u003e               \u003cdbl\u003e\n#\u003e 1      1 gpt-4o-mini          1             0.0016           0.0000432\n#\u003e # ℹ 1 more variable: total_price_dollar \u003cdbl\u003e\n\nall_dat \u003c- test_obj$answer_data\nall_dat |\u003e select(human_code, decision_binary)\n#\u003e # A tibble: 10 × 2\n#\u003e    human_code decision_binary\n#\u003e         \u003cdbl\u003e           \u003cdbl\u003e\n#\u003e  1          0               0\n#\u003e  2          0               0\n#\u003e  3          0               0\n#\u003e  4          0               0\n#\u003e  5          0               0\n#\u003e  6          1               0\n#\u003e  7          1               1\n#\u003e  8          1               0\n#\u003e  9    
      1               1\n#\u003e 10          1               1\n```\n\n# References\n\nVembye, M. H., Christensen, J., Mølgaard, A. B., \u0026 Schytt, F. L. W. (2025). \nGenerative pretrained transformer models can function as highly reliable second screeners of titles and abstracts in systematic reviews: A proof of concept and common guidelines. \n*Psychological Methods*, Online first. \u003chttps://doi.org/10.1037/met0000769\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikkelvembye%2Faiscreenr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmikkelvembye%2Faiscreenr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikkelvembye%2Faiscreenr/lists"}