{"id":16274175,"url":"https://github.com/trinker/formality","last_synced_at":"2025-07-24T15:03:33.560Z","repository":{"id":146663518,"uuid":"44067280","full_name":"trinker/formality","owner":"trinker","description":null,"archived":false,"fork":false,"pushed_at":"2017-04-12T00:53:53.000Z","size":322,"stargazers_count":3,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-14T13:15:15.447Z","etag":null,"topics":["formality","r","text-measure","text-mining"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/trinker.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-10-11T19:56:33.000Z","updated_at":"2021-09-02T04:53:13.000Z","dependencies_parsed_at":"2023-06-25T22:51:06.841Z","dependency_job_id":null,"html_url":"https://github.com/trinker/formality","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fformality","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fformality/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fformality/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fformality/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/trinker","download_url":"https://codeload.github.com/trinker/formality/tar.gz/refs/heads/master","host":{"name":"GitH
ub","url":"https://github.com","kind":"github","repositories_count":247880701,"owners_count":21011712,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["formality","r","text-measure","text-mining"],"created_at":"2024-10-10T18:27:32.779Z","updated_at":"2025-04-08T16:27:29.885Z","avatar_url":"https://github.com/trinker.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\ntitle: \"formality\"\ndate: \"`r format(Sys.time(), '%d %B, %Y')`\"\noutput:\n  md_document:\n    toc: true      \n---\n\n```{r, echo=FALSE, message=FALSE, warning=FALSE}\nlibrary(knitr)\ndesc \u003c- suppressWarnings(readLines(\"DESCRIPTION\"))\nregex \u003c- \"(^Version:\\\\s+)(\\\\d+\\\\.\\\\d+\\\\.\\\\d+)\"\nloc \u003c- grep(regex, desc)\nver \u003c- gsub(regex, \"\\\\2\", desc[loc])\nverbadge \u003c- sprintf('\u003ca href=\"https://img.shields.io/badge/Version-%s-orange.svg\"\u003e\u003cimg src=\"https://img.shields.io/badge/Version-%s-orange.svg\" alt=\"Version\"/\u003e\u003c/a\u003e\u003c/p\u003e', ver, ver)\n````\n\n```{r, echo=FALSE}\nknit_hooks$set(htmlcap = function(before, options, envir) {\n  if(!before) {\n    paste('\u003cp class=\"caption\"\u003e\u003cb\u003e\u003cem\u003e',options$htmlcap,\"\u003c/em\u003e\u003c/b\u003e\u003c/p\u003e\",sep=\"\")\n    }\n    })\nknitr::opts_knit$set(self.contained = TRUE, cache = FALSE)\nknitr::opts_chunk$set(fig.path = \"tools/figure/\")\n```\n\n[![Project Status: Active - The project has reached a stable, usable state and is being actively 
developed.](http://www.repostatus.org/badges/0.1.0/active.svg)](http://www.repostatus.org/#active)\n[![Build Status](https://travis-ci.org/trinker/formality.svg?branch=master)](https://travis-ci.org/trinker/formality)\n[![Coverage Status](https://coveralls.io/repos/trinker/formality/badge.svg?branch=master)](https://coveralls.io/r/trinker/formality?branch=master)\n`r verbadge`\n\n![](tools/formality_logo/r_formality.png)\n\n\n**formality** utilizes the [**tagger**](https://github.com/trinker/tagger) package to conduct formality analysis.  Heylighen (1999) and Heylighen \u0026 Dewaele (1999, 2002) proposed the *F-measure* as a measure of how *contextual* or *formal* language is.  Language is considered more formal when much of the information is stated directly in the text, whereas contextual language relies on shared experience to communicate more efficiently.  \n\n# Formality Equation\n\nThe **formality** package's main function is also titled `formality` and uses Heylighen \u0026 Dewaele's (1999) *F-measure*.  The *F-measure* is defined formally as:\n\n$$F = 50(((n_f - n_c)/N) + 1)$$\n\nWhere:\n\n$$f = \\{noun, adjective, preposition, article\\}$$    \n$$c = \\{pronoun, verb, adverb, interjection\\}$$    \n$$N = n_f + n_c$$     \n\n\nThis yields an *F-measure* between $0$% and $100$%, with completely contextualized language on the $0$ end and completely formal language on the $100$ end.\n\nPlease see the following references for more details about formality and the *F-measure*:\n\n- Heylighen, F. (1999). Advantages and limitations of formal expression. Foundations of Science, 4, 25-56. \u003ca href=\"http://link.springer.com/article/10.1023%2FA%3A1009686703349\"\u003edoi:10.1023/A:1009686703349\u003c/a\u003e\n- Heylighen, F. \u0026 Dewaele, J.-M. (1999). Formality of language: Definition, measurement and behavioral determinants. Center \"Leo Apostel\", Free University of Brussels. 
Retrieved from [http://pespmc1.vub.ac.be/Papers/Formality.pdf](http://pespmc1.vub.ac.be/Papers/Formality.pdf)\n- Heylighen, F. \u0026 Dewaele, J.-M. (2002). Variation in the contextuality of language: An empirical measure. Foundations of Science, 7(3), 293-340. \u003ca href=\"http://link.springer.com/article/10.1023%2FA%3A1019661126744\"\u003edoi:10.1023/A:1019661126744\u003c/a\u003e\n\n\n\n\n# Installation\n\nTo download the development version of **formality**:\n\nDownload the [zip ball](https://github.com/trinker/formality/zipball/master) or [tar ball](https://github.com/trinker/formality/tarball/master), decompress and run `R CMD INSTALL` on it, or use the **pacman** package to install the development version:\n\n```r\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load_gh(c(\n    \"trinker/termco\", \n    \"trinker/tagger\", \n    \"trinker/formality\"\n))\n```\n\n# Contact\n\nYou are welcome to:\n* submit suggestions and bug-reports at: \u003chttps://github.com/trinker/formality/issues\u003e\n* send a pull request on: \u003chttps://github.com/trinker/formality/\u003e\n* compose a friendly e-mail to: \u003ctyler.rinker@gmail.com\u003e\n\n\n# Examples\n\nThe following examples demonstrate some of the functionality of **formality**.\n\n## Load the Tools/Data\n\n```{r}\nlibrary(formality)\ndata(presidential_debates_2012)\n```\n\n\n## Assessing Formality\n\n`formality` takes the text as `text.var` and any number of grouping variables as `grouping.var`.  Here we use the `presidential_debates_2012` data set and look at the formality of the people involved.  Note that for smaller texts Heylighen \u0026 Dewaele (2002) state: \n\n\u003e At present, a sample would probably need to contain a few hundred words for the measure to be minimally reliable. For single sentences, the F-value should only be computed for purposes of illustration (p. 
24).\n\n```{r}\nform1 \u003c- with(presidential_debates_2012, formality(dialogue, person))\nform1\n```\n\n## Recycling the First Run\n\nThe first run above takes ~20 seconds because of the part-of-speech tagging it must perform.  Its output can be reused as `text.var`, cutting the time to a fraction of the first run.\n\n\n```{r}\nwith(presidential_debates_2012, formality(form1, list(time, person)))\n```\n\n## Plotting\n\nThe generic `plot` function provides three views of the data:\n\n1. A filled bar plot of formal vs. contextual usage\n2. A dotplot of formality\\*\\*\n3. A heatmap of the usage of the parts of speech used to calculate the formality score\n\n\\*\\****Note*** *a red dot in the center warns of fewer than 300 words*\n\n```{r}\nplot(form1)\n```\n\n\nThe `plot` function uses **gridExtra** to stitch the plots together and draws the combined plot immediately.  However, the three subplots are also returned as a list, as seen below.\n\n```{r}\nnames(plot(form1, plot=FALSE))\n```\n\n\nEach of these is a **ggplot2** object that can be further manipulated with various scales, facets, and annotations.  
I demonstrate some of this functionality in the plots below.\n\n```{r, warning=FALSE, message=FALSE}\nlibrary(ggplot2)\nplot(form1, plot=FALSE)[[1]] +\n    scale_size(range= c(8, 45)) +\n    scale_x_continuous(limits = c(52, 63))\n\n\nplot(form1, plot=FALSE)[[2]] +\n    scale_fill_grey()\n\nplot(form1, plot=FALSE)[[2]] +\n    scale_fill_brewer(palette = \"Pastel1\") +\n    facet_grid(~type)\n\nplot(form1, plot=FALSE)[[3]] +\n    scale_fill_gradient(high = \"red\", low=\"white\") +\n    ggtitle(\"Participant's Use of Parts of Speech\")\n\n\nplot(form1, plot=FALSE)[[3]] +\n    scale_fill_gradient2(midpoint=.12, high = \"red\", low=\"blue\")\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fformality","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftrinker%2Fformality","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fformality/lists"}