{"id":19152476,"url":"https://github.com/gesiscss/whatsr","last_synced_at":"2025-05-07T05:46:09.836Z","repository":{"id":46138980,"uuid":"200825111","full_name":"gesiscss/WhatsR","owner":"gesiscss","description":"R-package to parse exported WhatsApp chatlog files to use them for quantifying interpersonal relationships trough textmining and meta-data","archived":false,"fork":false,"pushed_at":"2025-01-29T11:05:06.000Z","size":7837,"stargazers_count":25,"open_issues_count":4,"forks_count":4,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-31T06:51:12.069Z","etag":null,"topics":["data-donation","textmining","whatsapp"],"latest_commit_sha":null,"homepage":"https://gesiscss.github.io/WhatsR/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gesiscss.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-06T09:58:42.000Z","updated_at":"2025-02-20T08:52:33.000Z","dependencies_parsed_at":"2023-12-19T19:14:28.930Z","dependency_job_id":"c4fcb163-f866-4120-ad55-501a3d4a5d47","html_url":"https://github.com/gesiscss/WhatsR","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesiscss%2FWhatsR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesiscss%2FWhatsR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesiscss%2FWhatsR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gesiscss%2FWhatsR/manif
ests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gesiscss","download_url":"https://codeload.github.com/gesiscss/WhatsR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249738800,"owners_count":21318504,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-donation","textmining","whatsapp"],"created_at":"2024-11-09T08:18:03.647Z","updated_at":"2025-04-19T16:32:43.964Z","avatar_url":"https://github.com/gesiscss.png","language":"R","readme":"# WhatsR\n[![DOI](https://zenodo.org/badge/633831271.svg)](https://zenodo.org/badge/latestdoi/633831271)\n[![R-CMD-check](https://github.com/gesiscss/WhatsR/actions/workflows/r-cmd-check.yml/badge.svg)](https://github.com/gesiscss/WhatsR/actions/workflows/r-cmd-check.yml)\n[![CRAN status](https://www.r-pkg.org/badges/version/WhatsR)](https://www.r-pkg.org/badges/version/WhatsR)\n[![Downloads](https://cranlogs.r-pkg.org/badges/WhatsR)](https://cran.r-project.org/package=WhatsR)\n[![TotalDownloads](https://cranlogs.r-pkg.org/badges/grand-total/WhatsR?color=orange)](https://CRAN.R-project.org/package=WhatsR)\n\n\u003cimg src=\"man/figures/WhatsR_sticker.png\" align=\"right\" alt=\"WhatsR Sticker\" width=\"120\"\u003e\n\nThis is an R-package to import exported WhatsApp chatlogs, parse them into a usable dataframe format and thereby enable further analysis. This parser was built with the goal to work with chat logs extracted on Android as well as iOS devices, run on Linux, Mac and Windows, and to be able to handle multiple languages. 
Currently, only English and German are supported, but in principle, other languages could be added relatively easily (see below). The repo also contains a function to scrape and update the emoji_dictionary, should new emoji be added to WhatsApp in the meantime.\n\n## How to set it up?\n \n### 1) Requirements\n \n - If on Windows, `RTools` needs to be installed and working\n - The `rJava` package needs to be installed and working\n - Plotting emoji requires the `ragg` package, and you need to set your graphics backend to `AGG` (in RStudio: Tools \u003e Global Options \u003e Graphics \u003e Backend)\n \n### 2) Installing it\n\n```R\n # for the most up-to-date GitHub version\n library(devtools)\n devtools::install_github(\"gesiscss/WhatsR\")\n \n # from CRAN\n install.packages(\"WhatsR\")\n \n```\n \n### 3) Testing it\n\n```R\n # creating simulated chatlog (saved in working directory)\n simulated_raw_chat \u003c- create_chatlog(language = \"english\")\n \n # parsing it\n simulated_parsed_chat \u003c- parse_chat(\"Simulated_WhatsR_chatlog.txt\")\n \n # plotting emojis contained in chat\n plot_emoji(simulated_parsed_chat, plot=\"bar\")\n \n```\n \n### 4) Using it with your own data\n \n#### Extract chat from your phone\n \n  - For Android phones: https://faq.whatsapp.com/en/android/23756533/?category=5245251\n \n  - For iPhones: https://faq.whatsapp.com/en/iphone/20888066/?category=5245251#email\n \n```R\n # parsing it\n simulated_parse_chat \u003c- parse_chat(\"PATH/TO/YOUR/EXPORTED/FILE.txt\")\n \n  # plotting it\n plot_emoji(simulated_parse_chat, plot=\"bar\")\n```\n\n## Scientific use\nIf you are using this package for your research, please cite the corresponding paper. You can get the citation as a BibTeX entry by running\n```R\ncitation(\"WhatsR\")\n```\n \n```\nTo cite package ‘WhatsR’ in publications use:\n\nKohne, J., Montag, C. ChatDashboard: A Framework to collect, link, and process donated WhatsApp Chat Log Data. Behav Res 56, 3658–3684 (2024). 
https://doi.org/10.3758/s13428-023-02276-1\n\nA BibTeX entry for LaTeX users is\n\n@article{kohne2024chatdashboard,\n  title={Chat{D}ashboard: {A} {F}ramework to collect, link, and process donated {W}hats{A}pp {C}hat {L}og {D}ata},\n  author={Kohne, Julian and Montag, Christian},\n  journal={Behavior Research Methods},\n  volume={56},\n  number = {4},\n  pages={3658--3684},\n  year={2024},\n  publisher={Springer},\n  doi={10.3758/s13428-023-02276-1}\n}\n\n```\n\n## Does this parser work with other languages too?\n\nCurrently, only chats exported from phones set to *German* or *English* are supported. Other languages can be added by extending the `languages.csv` file with the regular expressions needed to differentiate system messages from user-generated content. In addition, `parse_chat` would need to be adapted and additional tests would have to be added. If you would like to add a language, please consider doing so via a pull request in this repository.\n\n\n## Examples\n\nThe package also includes some functions to compute additional metrics and visualize them. We provide some basic examples here for chats with two participants and for group chats with multiple participants. For a complete overview, you can check the [documentation](man/) or the [figure section](man/figures/). The chat used here was parsed with the `anonimize = TRUE` parameter to exclude participant names. All plotting functions include multiple types of plots and additional parameters to restrict the range of the data.\n\n### Token Summary per Person\n\n```\nsummarize_tokens_per_person(data)\n```\n\n```\n$`WhatsApp System Message`\n$`WhatsApp System Message`$Timespan\n$`WhatsApp System Message`$Timespan$Start\n[1] \"2020-10-27 18:51:00 UTC\"\n\n$`WhatsApp System Message`$Timespan$End\n[1] \"2022-10-06 19:57:00 UTC\"\n\n\n$`WhatsApp System Message`$TokenStats\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
\n      1       1       1       1       1       1 \n\n\n$Person_1\n$Person_1$Timespan\n$Person_1$Timespan$Start\n[1] \"2020-10-27 18:51:00 UTC\"\n\n$Person_1$Timespan$End\n[1] \"2022-10-06 19:57:00 UTC\"\n\n\n$Person_1$TokenStats\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. \n  1.000   1.000   6.000   9.195  13.000 169.000 \n\n\n$Person_2\n$Person_2$Timespan\n$Person_2$Timespan$Start\n[1] \"2020-10-27 18:51:00 UTC\"\n\n$Person_2$Timespan$End\n[1] \"2022-10-06 19:57:00 UTC\"\n\n\n$Person_2$TokenStats\n   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. \n   1.00    1.00    6.00   10.75   14.00  407.00 \n\n```\n\n\n### Message Distribution\nDistribution of sent Messages.\n\n```\nplot_messages(data, plot = \"cumsum\", exclude_sm = TRUE)\n```\n\n![](man/figures/plot_messages()_cumsum.png)\n\n### Token Distribution\nDistribution of sent Tokens (words).\n\n```\nplot_tokens(data, plot = \"violin\", exclude_sm = TRUE)\n```\n\n![](man/figures/plot_tokens()_violin.png)\n\n\n### Tokens over Time\nDistribution of sent Tokens per Person over time\n\n```\nplot_tokens_over_time(data, plot = \"hours\", exclude_sm = TRUE)\n```\n\n![](man/figures/plot_tokens_over_time()_hours.png)\n\n### Wordcloud\nWordcloud of sent tokens, separately for each participant.\n\n```\nplot_wordcloud(data, comparison = TRUE, exclude_sm = TRUE, font_size=50, min_occur= 300)\n```\n![](man/figures/plot_wordcloud()_comparison.png)\n\n### Lexical Dispersion Plot\nOccurrences of keywords in the chat. 
Example keyword is \"Weihnachten\" (Christmas).\n```\nplot_lexical_dispersion(data,keywords = c(\"weihnachten\"), exclude_sm = TRUE)\n```\n![](man/figures/plot_lexical_dispersion().png)\n\n### Sent Links\nNumber of sent links per person and over time\n\n```\nplot_links(data, plot = \"heatmap\", exclude_sm = TRUE)\n```\n![](man/figures/plot_links()_heatmap.png)\n\n\n### Sent Smilies\nNumber of sent smilies over time\n```\nplot_smilies(data, plot = \"cumsum\", exclude_sm = TRUE)\n```\n\n![](man/figures/plot_smilies()_cumsum.png)\n\n\n### Sent Emoji\nNumber of sent emoji per person\n\n```\nplot_emoji(data, plot = \"splitbar\", min_occur = 50, exclude_sm = TRUE)\n```\n![](man/figures/plot_emoji()_splitbar.png)\n\n### Location Visualization\nPlotting locations mentioned by each person\n\n```\nplot_locations(data)\n```\n![](man/figures/plot_locations().png)\n\n### Replytimes\nPlotting the time it takes to respond\n```\nplot_replytimes(data, type = \"replytime\", exclude_sm = TRUE)\n```\n![](man/figures/plot_replytimes().png)\n\n### Sent Media\nNumber of sent media files per person and over time\n\n```\nplot_media(data, plot = \"bar\", exclude_sm = TRUE)\n```\n![](man/figures/plot_media()_bar.png)\n\n\n### Interactive Networks\nInteractive network of chat participants. A connection represents a response to a message. Each message is interpreted as a response to the previous message. Consecutive messages by the same chat participant are summarized into one \"session\". The plot shown is a static image; the actual output is an interactive HTML object, see [man folder](/man/).\n```\nplot_network(data)\n```\n![](man/figures/plot_network()_image.png)\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgesiscss%2Fwhatsr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgesiscss%2Fwhatsr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgesiscss%2Fwhatsr/lists"}