{"id":16274048,"url":"https://github.com/trinker/syllable","last_synced_at":"2025-07-06T07:07:44.313Z","repository":{"id":35785326,"uuid":"40065959","full_name":"trinker/syllable","owner":"trinker","description":"A Small Collection of Syllable Counting Functions","archived":false,"fork":false,"pushed_at":"2019-02-17T18:34:43.000Z","size":1707,"stargazers_count":11,"open_issues_count":1,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-03T08:12:40.269Z","etag":null,"topics":["count-syllables","r","readability","syllable-counts","text-mining"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/trinker.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-08-02T02:10:21.000Z","updated_at":"2024-04-19T16:18:28.000Z","dependencies_parsed_at":"2022-09-04T01:40:20.391Z","dependency_job_id":null,"html_url":"https://github.com/trinker/syllable","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/trinker/syllable","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fsyllable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fsyllable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fsyllable/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fsyllable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/trinker","download_url":"https://codeload.github.com/trinker/syllable/tar.gz/refs/heads/master","sbom_url":"htt
ps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinker%2Fsyllable/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263861949,"owners_count":23521355,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["count-syllables","r","readability","syllable-counts","text-mining"],"created_at":"2024-10-10T18:26:52.467Z","updated_at":"2025-07-06T07:07:44.290Z","avatar_url":"https://github.com/trinker.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\ntitle: \"syllable\"\ndate: \"`r format(Sys.time(), '%d %B, %Y')`\"\noutput:\n  md_document:\n    toc: true      \n---\n\n```{r, echo=FALSE}\ndesc \u003c- suppressWarnings(readLines(\"DESCRIPTION\"))\nregex \u003c- \"(^Version:\\\\s+)(\\\\d+\\\\.\\\\d+\\\\.\\\\d+)\"\nloc \u003c- grep(regex, desc)\nver \u003c- gsub(regex, \"\\\\2\", desc[loc])\nverbadge \u003c- sprintf('\u003ca href=\"https://img.shields.io/badge/Version-%s-orange.svg\"\u003e\u003cimg src=\"https://img.shields.io/badge/Version-%s-orange.svg\" alt=\"Version\"/\u003e\u003c/a\u003e\u003c/p\u003e', ver, ver)\npacman::p_load(syllable, knitr)\n```\n\n```{r, echo=FALSE}\nknit_hooks$set(htmlcap = function(before, options, envir) {\n  if(!before) {\n    paste('\u003cp class=\"caption\"\u003e\u003cb\u003e\u003cem\u003e',options$htmlcap,\"\u003c/em\u003e\u003c/b\u003e\u003c/p\u003e\",sep=\"\")\n    }\n    })\nknitr::opts_knit$set(self.contained = TRUE, cache = FALSE)\nknitr::opts_chunk$set(fig.path = \"tools/figure/\")\n```\n\n\n[![Project Status: Inactive – The project has reached a stable, usable 
state but is no longer being actively developed; support/maintenance will be provided as time allows.](https://www.repostatus.org/badges/latest/inactive.svg)](https://www.repostatus.org/#inactive)\n[![Build Status](https://travis-ci.org/trinker/syllable.svg?branch=master)](https://travis-ci.org/trinker/syllable)\n[![Coverage Status](https://coveralls.io/repos/trinker/syllable/badge.svg?branch=master)](https://coveralls.io/r/trinker/syllable?branch=master)\n[![DOI](https://zenodo.org/badge/5398/trinker/syllable.svg)](https://zenodo.org/badge/latestdoi/5398/trinker/syllable)\n[![](http://cranlogs.r-pkg.org/badges/syllable)](https://cran.r-project.org/package=syllable)\n`r verbadge`\n\n![](tools/syllable_logo/r_syllable.png)\n\n\n**syllable** is a small collection of tools for counting syllables and polysyllables.  The tools rely \nprimarily on [**data.table**](https://CRAN.R-project.org/package=data.table) hash table lookups, resulting in fast syllable counting.\n\n# Main Functions \n\nThe main functions follow the format of `action_object`.  \n\n## Actions\n\nThe following table outlines the actions.  
The Example Output column corresponds to this string: `\"I like chicken sandwiches.\"`.\n\n\n| Action       | Description                | Returns               | Example Output              |\n|--------------|----------------------------|-----------------------|-----------------------------|\n| `count`      | One integer per word       | A vector per string   | 1, 1, 2, 3                  |\n| `sum`        | Sum of syllable counts     | An integer per string | 7                           |\n| `tally`\\*      | Sum of syllable attributes | An integer per string | polysyllable tallies = 1    |\n\n\\* The addition of `_mono`, `_di`, `_poly`, `_short` (monosyllabic + disyllabic), or `_both` (short \u0026 polysyllabic) to `tally` allows the user to specify which syllable attribute is tallied.\n\n## Objects \n\nThe following table outlines the objects acted upon:\n\n| Object       | Description                     | Example                        |\n|--------------|---------------------------------|--------------------------------|\n| `string`     | A character string              | `\"I like chicken sandwiches.\"` |\n| `vector`\\*   | A vector of character strings   | `c(\"I like it.\", \"Look out!\")` |\n\n\\* The addition of `_by` to `vector` allows the user to aggregate by one or more vectors of grouping variables.\n\n\n## Putting It Together\n\nThe function `count_vector` provides one integer vector of per-word syllable counts for each string in the input vector.  For this reason `count_vector` returns a `list` of integer vectors. \n\n```{r}\ncount_vector(c(\"I like it.\", \"Look out!\"))\n```\n\nEach of the main functions is optimized to do its task efficiently.  While one could use `sum(count_vector(x))` and achieve the same results as `sum_vector(x)`, it would be less efficient.  
\n\nThe available syllable functions that follow the format of `action_object` are:\n\n```{r, results='asis', echo=FALSE, comment=NA, warning=FALSE, htmlcap=\"Available Variable Functions\"}\np_load(pander, xtable, dplyr)\n\navaible_syllable_funs() %\u003e%\n    xtable() %\u003e%\n    print(type = 'html', include.colnames = FALSE, include.rownames = FALSE,\n        html.table.attributes = '')\n\n#matrix(c(sprintf(\"`%s`\", vect), blanks), ncol=4) %\u003e%\n#    pandoc.table(format = \"markdown\", caption = \"Available variable functions.\")\n```\n\n\n# Installation\n\nTo download the development version of **syllable**:\n\nDownload the [zip ball](https://github.com/trinker/syllable/zipball/master) or [tar ball](https://github.com/trinker/syllable/tarball/master), decompress and run `R CMD INSTALL` on it, or use the **pacman** package to install the development version:\n\n```r\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load_gh(\n    'trinker/lexicon',\n    'trinker/textclean',\n    'trinker/textshape',\n    'trinker/syllable'\n)\n```\n\n# Contact\n\nYou are welcome to:\n* submit suggestions and bug-reports at: \u003chttps://github.com/trinker/syllable/issues\u003e\n* send a pull request on: \u003chttps://github.com/trinker/syllable/\u003e\n* compose a friendly e-mail to: \u003ctyler.rinker@gmail.com\u003e\n\n# Examples\n\nThe following examples demonstrate the functionality of a select sample of **syllable** functions.\n\n## Count Syllables In a String\n\nCounts the number of syllables for each word in a string.\n\n\n```{r}\ncount_string(\"I like chicken and eggs for breakfast\")\n```\n\n\n## Count Syllables In a Vector of Strings\n\n\n```{r}\nsents \u003c- c(\"I like chicken.\", \"I want eggs benedict for breakfast.\")\ncount_vector(sents)\n\nMap(function(x, y) setNames(x, y),\n   count_vector(sents),\n   strsplit(gsub(\"[^a-z ]\", \"\", tolower(sents)), \"\\\\s+\")\n)\n```\n\n\n## Sum the Syllables In a Vector of Strings by Grouping 
Variable(s)\n\n\n```{r}\ndat \u003c- data.frame(\n   text = c(\"I like chicken.\", \"I want eggs benedict for breakfast.\", \"Really?\"),\n   group = c(\"A\", \"B\", \"A\")\n)\nsum_vector_by(dat$text, dat$group)\n```\n\n\n## Tally the Short/Poly-Syllabic Words by Group(s)\n\n```{r}\ndat \u003c- data.frame(\n   text = c(\"I like excellent chicken.\", \"I want eggs benedict now.\", \"Really?\"),\n   group = c(\"A\", \"B\", \"A\")\n)\ntally_both_vector_by(dat$text, dat$group)\n\nwith(presidential_debates_2012, tally_both_vector_by(dialogue, person))\n```\n\n\n## Readability Word Statistics by Grouping Variable(s)\n\n```{r}\nwith(presidential_debates_2012, readability_word_stats_by(dialogue, list(person, time)))\n```\n\n\n## Visualize Poly Syllable Distributions\n\n\n```{r}\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load(dplyr, ggplot2, scales)\n\ntally_both_vector(presidential_debates_2012$dialogue) %\u003e%\n    mutate(Duration = 1:length(poly)) %\u003e%\n    rowwise() %\u003e%\n    filter((short + poly) \u003e 4) %\u003e%\n    mutate(\n        short = short/(short+poly),\n        poly = 1 - short,\n        size = poly \u003e .3\n    ) %\u003e%\n    ggplot(aes(Duration, poly)) +\n        geom_text(aes(label = Duration, size = size, color = size)) +\n        coord_flip() +\n        scale_size_manual(values = c(1.5, 2.5), guide=FALSE) +\n        scale_color_manual(values = c(\"grey75\", \"black\"), guide=FALSE) +\n        scale_x_reverse() +\n        scale_y_continuous(label = scales::percent) +\n        ylab(\"Poly-syllabic\") +\n        xlab(\"Duration (sentences)\") +\n        theme_bw() \n```\n\n\n## Visualize Poly Syllable Distributions by Group \n\n\n```{r}\nif (!require(\"pacman\")) install.packages(\"pacman\")\npacman::p_load(dplyr, ggplot2, tidyr, scales)\n\nwith(presidential_debates_2012, tally_both_vector_by(dialogue, list(person, time))) %\u003e%\n    mutate(\n        person_time = paste(person, time, sep = \"-\"),\n        short = 
short/(short+poly),\n        poly = 1 - short\n    ) %\u003e%\n    arrange(poly) %\u003e%\n    mutate(person_time = factor(person_time, levels = person_time)) %\u003e%\n    gather(type, prop, c(short, poly)) %\u003e%\n    ggplot(aes(person_time, weight = prop, fill = type)) +\n        geom_bar() +\n        coord_flip() +        \n        scale_y_continuous(label = scales::percent) +\n        scale_fill_discrete(name=\"Syllable\\nType\") +\n        xlab(\"Person \u0026 Time\") +\n        ylab(\"Usage\") +\n        theme_bw()\n```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fsyllable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftrinker%2Fsyllable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinker%2Fsyllable/lists"}