{"id":13858199,"url":"https://github.com/friendly/vcdExtra","last_synced_at":"2025-07-13T23:31:46.323Z","repository":{"id":34088662,"uuid":"105277533","full_name":"friendly/vcdExtra","owner":"friendly","description":"Extensions and additions to vcd: Visualizing Categorical Data ","archived":false,"fork":false,"pushed_at":"2025-03-24T18:26:58.000Z","size":44152,"stargazers_count":24,"open_issues_count":5,"forks_count":7,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-07-13T10:47:15.966Z","etag":null,"topics":["categorical-data-visualization","generalized-linear-models","mosaic-plots","r-package"],"latest_commit_sha":null,"homepage":"https://friendly.github.io/vcdExtra/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/friendly.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2017-09-29T13:49:37.000Z","updated_at":"2025-03-24T18:27:02.000Z","dependencies_parsed_at":"2023-02-16T01:45:32.963Z","dependency_job_id":"86a77977-b84b-44f2-babf-8255ee280f6e","html_url":"https://github.com/friendly/vcdExtra","commit_stats":{"total_commits":459,"total_committers":7,"mean_commits":65.57142857142857,"dds":"0.40305010893246185","last_synced_commit":"9a898aa473d1812754d9411a39c5b05f436e2141"},"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/friendly/vcdExtra","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friendly%2FvcdExtra","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friendly%2FvcdExtra/tags
","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friendly%2FvcdExtra/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friendly%2FvcdExtra/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/friendly","download_url":"https://codeload.github.com/friendly/vcdExtra/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/friendly%2FvcdExtra/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265220530,"owners_count":23729837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["categorical-data-visualization","generalized-linear-models","mosaic-plots","r-package"],"created_at":"2024-08-05T03:02:00.230Z","updated_at":"2025-07-13T23:31:41.303Z","avatar_url":"https://github.com/friendly.png","language":"HTML","readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r setup, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  warning = FALSE,\n  comment = \"##\",\n  fig.path = \"man/figures/README-\",\n  fig.height = 5,\n  fig.width = 5\n#  out.width = \"100%\"\n)\n\nlibrary(vcdExtra)\n```\n\n\u003c!-- badges: start --\u003e\n\n[![CRAN_Status](http://www.r-pkg.org/badges/version/vcdExtra)](https://cran.r-project.org/package=vcdExtra)\n[![](http://cranlogs.r-pkg.org/badges/grand-total/vcdExtra)](https://cran.r-project.org/package=vcdExtra)\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![License](https://img.shields.io/badge/license-GPL%20%28%3E=%202%29-brightgreen.svg?style=flat)](https://www.gnu.org/licenses/gpl-2.0.html) \n\n\n\u003c!-- badges: end --\u003e\n\n# vcdExtra \u003cimg src=\"man/figures/logo.png\" style=\"float:right; height:200px;\" /\u003e\n## Extensions and additions to vcd: Visualizing Categorical Data \n\nVersion 0.8-4\n\nThis package provides additional data sets, documentation, and many\nfunctions designed to extend the [vcd](https://CRAN.R-project.org/package=vcd) package for *Visualizing Categorical Data*\nand the [gnm](https://CRAN.R-project.org/package=gnm) package for *Generalized Nonlinear Models*. \nIn particular, `vcdExtra` extends mosaic, assoc and sieve plots from vcd to handle `glm()` and \n`gnm()` models and\nadds a 3D version in `mosaic3d()`.\n\n`vcdExtra` is a support package for the book [*Discrete Data Analysis with R*](https://www.routledge.com/Discrete-Data-Analysis-with-R-Visualization-and-Modeling-Techniques-for/Friendly-Meyer/p/book/9781498725835) (DDAR) by Michael Friendly and David Meyer. 
There is also a\n[web site for DDAR](http://ddar.datavis.ca) with all figures and code samples from the book.\nIt is also used in my graduate course, [Psy 6136: Categorical Data Analysis](https://friendly.github.io/psy6136/).\n\n## Installation\n\nGet the released version from CRAN:\n\n     install.packages(\"vcdExtra\")\n\nThe development version can be installed to your R library directly from the [GitHub repo](https://github.com/friendly/vcdExtra) via:\n\n     if (!require(remotes)) install.packages(\"remotes\")\n     remotes::install_github(\"friendly/vcdExtra\", build_vignettes = TRUE)\n\n\n### Overview\n\nThe original purpose of this package was to serve as a sandbox for\nintroducing extensions of\nmosaic plots and related graphical methods\nthat apply to loglinear models fitted using `MASS::loglm()`,\ngeneralized linear models using\n`stats::glm()` and the related, generalized _nonlinear_ models fitted\nwith `gnm()` in the [gnm](https://CRAN.R-project.org/package=gnm) package.\n\nA related purpose was to fill in some holes in the analysis of\ncategorical data in R, not provided in base R, [vcd](https://CRAN.R-project.org/package=vcd), \nor other commonly used packages.\n\n##### See also:\n\u003ca href=\"https://www.routledge.com/Discrete-Data-Analysis-with-R-Visualization-and-Modeling-Techniques-for/Friendly-Meyer/p/book/9781498725835\"\u003e\u003cimg src=\"http://ddar.datavis.ca/images/ddar-cover.png\" style=\"height:70px;\"/\u003e\u003c/a\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \n\u003ca href=\"https://friendly.github.io/psy6136/\"\u003e\u003cimg src=\"https://friendly.github.io/psy6136/icons/psy6136-highres.png\" style=\"height:70px;\" /\u003e\u003c/a\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp; \n\u003ca href=\"https://friendly.github.io/nestedLogit/\"\u003e\u003cimg src=\"https://friendly.github.io/nestedLogit/logo.png\" style=\"height:70px;\" /\u003e\u003c/a\u003e\n\n\n* My book, [*Discrete Data Analysis with R: Visualization and Modeling Techniques for 
Categorical and Count Data*](https://www.routledge.com/Discrete-Data-Analysis-with-R-Visualization-and-Modeling-Techniques-for/Friendly-Meyer/p/book/9781498725835)\n\n* My graduate course, [Psy 6136: Categorical Data Analysis](https://friendly.github.io/psy6136/)\n\n* A companion package, [`nestedLogit`](https://friendly.github.io/nestedLogit/), for fitting nested dichotomy logistic regression models for a polytomous response.\n \n#### vcdExtra Highlights\n\n##### mosaic plot extensions\n* The method `mosaic.glm()` \nextends the `mosaic.loglm()` method in the vcd\npackage to this wider class of models, e.g., models for ordinal factors, which can't\nbe handled with `MASS::loglm()`.\nThis method also works for\nthe generalized _nonlinear_ models fit with the [gnm](https://CRAN.R-project.org/package=gnm) package,\nincluding models for square tables and models with multiplicative associations (RC models).\n\n* `mosaic3d()`\nintroduces a 3D generalization of mosaic displays using the\n[rgl](https://CRAN.R-project.org/package=rgl) package.\n\n##### model extensions\n* A new class, `glmlist`, is introduced for working with\ncollections of glm objects, e.g., `Kway()` for fitting\nall K-way models from a basic marginal model, and `LRstats()`\nfor brief statistical summaries of goodness-of-fit for a collection of\nmodels.\n\n* Similarly, for loglinear models fit using `MASS::loglm()`, the function `seq_loglm()`\n fits a series of sequential models to the 1-, 2-, ... _n_-way marginal tables, corresponding to a variety of types of models for joint, conditional, mutual, ... independence. It\n returns an object of class `loglmlist`, each element of which is a `loglm` object.\n The function `seq_mosaic()` generates the mosaic plots and other plots in the\n `vcd::strucplot()` framework. \n\n* For **square tables** with ordered factors, `Crossings()` supplements the \nspecification of terms in model formulas using\n`gnm::Symm()`,\n`gnm::Diag()`, \n`gnm::Topo()`, etc. 
in the [gnm](https://CRAN.R-project.org/package=gnm) package.\n\n#### Other additions\n\n* many new data sets; use `datasets(\"vcdExtra\")` to see a list with titles and descriptions.\nThe vignette, `vignette(\"datasets\", package=\"vcdExtra\")` provides a classification of these\naccording to methods of analysis.\n\n```{r vcdExtra-datasets}\nvcdExtra::datasets(\"vcdExtra\")[,1]\n```\n\n* a [collection of tutorial vignettes](https://cran.r-project.org/web/packages/vcdExtra/vignettes/). In the installed package, they can be viewed using `browseVignettes(package = \"vcdExtra\")`;\n\n```{r vignettes}\ntools::getVignetteInfo(\"vcdExtra\")[,c(\"File\", \"Title\")] |\u003e knitr::kable()\n```\n\n* a few useful utility functions for manipulating categorical data sets and working with models for\ncategorical data. \n\n\n## Examples\n\nThese `README` examples simply provide illustrations of using some of the package functions in the\ncontext of loglinear models for frequency tables fit using `glm()`, including\nmodels for _structured associations_ taking ordinality into account.\n\nThe dataset `Mental` is a data frame frequency table representing the cross-classification of mental health status (`mental`) of 1660 young New York residents by their parents' socioeconomic status (`ses`).\nBoth are _ordered_ factors.\n\n```{r ex-mental1}\ndata(Mental)\nstr(Mental)\n\n# show as frequency table\n(Mental.tab \u003c- xtabs(Freq ~ ses+mental, data=Mental))\n```\n\n\n#### Independence model\nFit the independence model, `Freq ~ mental + ses`, using `glm(..., family = poisson)`.\nThis model is equivalent to `chisq.test(Mental.tab)` for general association; it\ndoes not take ordinality into account. 
`LRstats()` provides a compact summary of\nfit statistics for one or more models.\n```{r ex-mental2}\nindep \u003c- glm(Freq ~ mental + ses,\n             family = poisson, data = Mental)\nLRstats(indep)\n```\n\n`mosaic.glm()` is the mosaic method for `glm` objects.\nThe default mosaic display for these data:\n```{r mental1}\nmosaic(indep)\n```\n\nIt is usually better to use _standardized residuals_ (`residuals_type=\"rstandard\"`) in mosaic displays, rather than the default Pearson residuals.\nHere we also add longer labels for the table factors (`set_varnames`)\nand display the\nvalues of residuals (`labeling=labeling_residuals`) in the cells. \n\nThe strucplot `formula` argument, `~ ses + mental`\nhere gives the order of the factors in the mosaic display,\nnot the statistical model for independence. That is, the\nunit square is first split by `ses`, then by `mental` within\neach level of `ses`.\n```{r mental2}\n# labels for table factors\nlong.labels \u003c- list(set_varnames = c(mental=\"Mental Health Status\", \n                                     ses=\"Parent SES\"))\n\nmosaic(indep, formula = ~ ses + mental,\n       residuals_type=\"rstandard\",\n       labeling_args = long.labels, \n       labeling=labeling_residuals)\n```\n\nThe **opposite-corner** pattern of the residuals clearly shows that association\nbetween Parent SES and mental health depends on the _ordered_ levels of the factors:\nhigher Parent SES is associated with better mental health status. 
A principal virtue\nof mosaic plots is to show the pattern of association that remains\nafter a model has been fit, and thus help suggest a better model.\n\n#### Ordinal models\nOrdinal models use **numeric** scores for the row and/or column variables.\nThese models typically use equally spaced _integer_ scores.\nThe test for association here is analogous to a test of the correlation\nbetween the frequency-weighted scores, carried out using `CMHtest()`.\n\nIn the data, `ses` and `mental` were declared to be ordered factors,\nso using `as.numeric(Mental$ses)` is sufficient to create a new `Cscore`\nvariable. Similarly for the numeric version of `mental`, giving `Rscore`.\n\n```{r mental-scores}\nCscore \u003c- as.numeric(Mental$ses)\nRscore \u003c- as.numeric(Mental$mental)\n```\n\n\nUsing these, the term `Rscore:Cscore` represents an association\nconstrained to be **linear x linear**; that is, the slopes for profiles of\nmental health status are assumed to vary linearly with those for Parent SES.\n(This model asserts that only one parameter (a local odds ratio)\nis sufficient to account for all association, and is also called the model of \"uniform association\".)\n\n\n```{r mental3}\n# fit linear x linear (uniform) association.  
Use integer scores for rows/cols \nCscore \u003c- as.numeric(Mental$ses)\nRscore \u003c- as.numeric(Mental$mental)\n\nlinlin \u003c- glm(Freq ~ mental + ses + Rscore:Cscore,\n              family = poisson, data = Mental)\nmosaic(linlin, ~ ses + mental,\n       residuals_type=\"rstandard\", \n       labeling_args = long.labels, \n       labeling=labeling_residuals, \n       suppress=1, \n       gp=shading_Friendly,\n       main=\"Lin x Lin model\")\n```\n\nNote that the test for linear x linear association consumes only 1 degree of freedom,\ncompared to the `(r-1)*(c-1) = 15` degrees of freedom for general association.\n```{r}\nanova(linlin, test=\"Chisq\")\n```\n\n\nOther models are possible between the independence model, `Freq ~ mental + ses`,\nand the saturated model `Freq ~ mental + ses + mental:ses`.\nThe `update.glm()` method makes these easy to specify, by adding terms to\nthe independence model.\n```{r}\n# use update.glm method to fit other models\n\nlinlin \u003c- update(indep, . ~ . + Rscore:Cscore)\nroweff \u003c- update(indep, . ~ . + mental:Cscore)\ncoleff \u003c- update(indep, . ~ . + Rscore:ses)\nrowcol \u003c- update(indep, . ~ . 
+ Rscore:ses + mental:Cscore)\n```\n\n**Compare the models**: \nFor `glm` objects, the `print` and `summary` methods give too much information if all one wants to see is a brief summary of model goodness of fit, and there is no easy way to display a compact comparison of model goodness of fit for a collection of models fit to the same data.\n\n`LRstats()` provides a brief summary for one or more models fit to the same dataset.\nThe likelihood ratio $\\chi^2$ values (`LR Chisq`) test lack of fit.\nBy these tests, none of the ordinal models shows significant lack of fit.\nBy the AIC and BIC statistics, the `linlin` model is the best, combining parsimony and goodness of fit.\n```{r}\nLRstats(indep, linlin, roweff, coleff, rowcol)\n```\nThe `anova.glm()` function gives tests of nested models.\n```{r}\nanova(indep, linlin, roweff, test = \"Chisq\")\n\n```\n\n\n## References\n\nFriendly, M. \u0026 Meyer, D. (2016). _Discrete Data Analysis with R: Visualization and Modeling Techniques for Categorical and Count Data_. Boca Raton, FL: Chapman \u0026 Hall/CRC.\n","funding_links":[],"categories":["R"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffriendly%2FvcdExtra","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffriendly%2FvcdExtra","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffriendly%2FvcdExtra/lists"}