https://github.com/harmonydata/harmony_r
R library for Harmony. R package - open source tool using AI for psychology and mental health. Actively recruiting contributors.
https://github.com/harmonydata/harmony_r
ai cran data-science help-wanted multilingual-nlp natural-language-processing nlp psychology r
Last synced: about 1 month ago
JSON representation
R library for Harmony. R package - open source tool using AI for psychology and mental health. Actively recruiting contributors.
- Host: GitHub
- URL: https://github.com/harmonydata/harmony_r
- Owner: harmonydata
- License: other
- Created: 2023-07-21T11:34:26.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2025-03-13T12:48:39.000Z (2 months ago)
- Last Synced: 2025-03-24T01:22:46.873Z (about 2 months ago)
- Topics: ai, cran, data-science, help-wanted, multilingual-nlp, natural-language-processing, nlp, psychology, r
- Language: HTML
- Homepage: https://harmonydata.ac.uk
- Size: 1.1 MB
- Stars: 2
- Watchers: 2
- Forks: 3
- Open Issues: 2
-
Metadata Files:
- Readme: README.Rmd
- Changelog: NEWS.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
README
---
output: github_document
---```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# Harmony R library: harmonydata
You can also **join our [Discord server](https://discord.gg/harmonydata)!**
If you found Harmony helpful, you can [leave us a review](https://g.page/r/CaRWc2ViO653EBM/review)!## 🧠 What Does Harmony Do?
Psychologists and social scientists often have to **match items** in different questionnaires, such as:
> *"I often feel anxious"*
> *"Feeling nervous, anxious or afraid"*This process is called **harmonisation**.
### 🔹 The Problem
- Harmonisation is a **time-consuming** and **subjective** process.
- Researchers have to **manually go through long PDFs** of questionnaires.
- Extracting questions and putting them into **Excel** is tedious.### 🔹 The Solution: Harmony
🚀 **Harmony** uses **natural language processing (NLP)** and **generative AI models** to:
- Automatically **match similar questionnaire items**.
- Help researchers work across **multiple languages**.
- Save **time** and **effort** in data harmonisation.## 📂 Looking for Examples?
Check out our **[examples repository](https://github.com/harmonydata/harmony_examples)** for hands-on demonstrations. ## 🌍 The Harmony Project
**Harmony** is an **AI-powered tool** that helps researchers **compare items from questionnaires** and **identify similar content**.
🔹 **Try Harmony:** [Harmony Web App](https://harmonydata.ac.uk/app)
🔹 **Read our blog:** [Harmony Blog](https://harmonydata.ac.uk/blog/)## 📞 Who to Contact?
🔹 **Harmony Team:** [harmonydata.ac.uk](https://harmonydata.ac.uk/)
🔹 **Thomas Wood:** [fastdatascience.com](https://fastdatascience.com/)## Getting started with the Harmony R library
* Check out [this video walkthrough](https://www.youtube.com/watch?v=hFqg6T_BqZc&t=1s) installing and running R on Windows 10.
* You can run the walkthrough Python notebook in [Google Colab](https://colab.research.google.com/github/harmonydata/harmony/blob/main/Harmony_example_walkthrough.ipynb) with a single click:
* You can also download an R markdown notebook to run in R Studio:
* You can run the walkthrough R notebook in Google Colab with a single click:
* [View the PDF documentation of the R package on CRAN](https://cran.r-project.org/web/packages/harmonydata/harmonydata.pdf)
### Installing R library
You can install the development version of harmonydata from [GitHub](https://github.com/harmonydata/harmony_r) with:
```{r}
#install.packages("devtools") # If you don't have devtools installed already.
library(devtools)
devtools::install_github("harmonydata/harmony_r")
```or you can install it via [CRAN](https://cran.r-project.org/):
```{r, eval = FALSE}
install.packages("harmonydata")
```## Setting up domain
Before starting, you can set up the remote API endpoint for harmony using this function. By default it uses the remote Harmony API [https://api.harmonydata.ac.uk](https://api.harmonydata.ac.uk){.uri}
```{r}
harmonydata::set_url()
```For example, if you want to use Harmony locally, you can run the [Harmony API as a Docker container](https://github.com/harmonydata/harmonyapi). By default it runs on localhost at port 8000. In this case you can run this command to run it locally:
```{bash, eval=FALSE}
docker run -p 8000:8000 harmonydata/harmonylocal
```Now in R you can set the R library to point to your local Harmony on Docker.
```{r, eval = FALSE}
harmonydata::set_url("http://localhost:8000")
```## Parsing a raw file into an Instrument
If you want to read in a raw (unstructured) PDF or Excel file, you can do this via a POST request to the REST API. This will convert the file into an Instrument object in JSON. It returns the instrument as a list.
```{r example}
library(harmonydata)
instrument = load_instruments_from_file(path = "examples/GAD-7.pdf")
names(instrument[[1]])```
You can also input a url containing the questionnaire.
```{r}
instrument_2 = load_instruments_from_file("https://medfam.umontreal.ca/wp-content/uploads/sites/16/GAD-7-fran%C3%A7ais.pdf")
names(instrument_2[[1]])
```## Matching instruments
You can get a list containing the results of the match. Here we can see a list of similarity score for each question comapred to all the other questions in th other questionaire.
```{r}
instruments = append(instrument, instrument_2)
match = match_instruments(instruments)
names(match)
```Here is how the matches look like.
```{r}
match$matches
```# Running harmonydata locally from a docker image
To run harmonydata locally, first you need to pull the docker image using the terminal.
## 1. Pull docker image
```{bash, eval=FALSE}
docker pull harmonydata/harmonyapi
```## 2. Run docker image
```{bash, eval=FALSE}
docker run -p 8000:80 harmonyapi
```## 3. Configure harmonydata to run locally
Set url to use localhost. Don't forget to expose port 8000:
```{r, eval = FALSE}
set_url(harmony_url = "http://localhost:8000")
```## 📜 How do I cite Harmony?
You can cite our validation paper:
McElroy, Wood, Bond, Mulvenna, Shevlin, Ploubidis, Scopel Hoffmann, Moltrecht, [Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data](https://bmcpsychiatry.biomedcentral.com/articles/10.1186/s12888-024-05954-2#citeas). BMC Psychiatry 24, 530 (2024), https://doi.org/10.1186/s12888-024-05954-2
A BibTeX entry for LaTeX users is
```bibtex
@article{mcelroy2024using,
title={Using natural language processing to facilitate the harmonisation of mental health questionnaires: a validation study using real-world data},
author={McElroy, Eoin and Wood, Thomas and Bond, Raymond and Mulvenna, Maurice and Shevlin, Mark and Ploubidis, George B and Hoffmann, Mauricio Scopel and Moltrecht, Bettina},
journal={BMC Psychiatry},
volume={24},
number={1},
pages={530},
year={2024},
publisher={Springer}
}```