{"id":13857620,"url":"https://github.com/elbersb/segregation","last_synced_at":"2025-07-12T14:31:06.000Z","repository":{"id":41207653,"uuid":"128066061","full_name":"elbersb/segregation","owner":"elbersb","description":"R package to calculate entropy-based segregation indices, with a focus on the Mutual Information Index (M) and Theil’s Information Index (H)","archived":false,"fork":false,"pushed_at":"2024-01-30T22:19:17.000Z","size":11795,"stargazers_count":35,"open_issues_count":1,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-10-13T21:37:21.853Z","etag":null,"topics":["entropy","r","r-package","rstats","segregation","statistics"],"latest_commit_sha":null,"homepage":"https://elbersb.com/segregation","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/elbersb.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2018-04-04T13:28:20.000Z","updated_at":"2024-09-06T14:57:45.000Z","dependencies_parsed_at":"2022-08-21T06:50:47.142Z","dependency_job_id":"252272d1-aa63-475c-9ed5-ffdf3e1c7ad9","html_url":"https://github.com/elbersb/segregation","commit_stats":{"total_commits":197,"total_committers":3,"mean_commits":65.66666666666667,"dds":"0.060913705583756306","last_synced_commit":"3babfdad71679d3ee0bb6ce01d6b11e4464fb880"},"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbersb%2Fsegregation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbersb%2Fsegregation/tags","releases_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbersb%2Fsegregation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/elbersb%2Fsegregation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/elbersb","download_url":"https://codeload.github.com/elbersb/segregation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225825268,"owners_count":17529905,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entropy","r","r-package","rstats","segregation","statistics"],"created_at":"2024-08-05T03:01:42.145Z","updated_at":"2024-11-22T00:34:38.119Z","avatar_url":"https://github.com/elbersb.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput:\n  md_document:\n    variant: gfm\neditor_options: \n  markdown: \n    wrap: 72\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, echo = FALSE}\nknitr::opts_chunk$set(\n    collapse = TRUE,\n    comment = \"#\u003e\",\n    fig.path = \"man/figures/README-\"\n)\noptions(scipen = 999)\noptions(digits = 3)\nset.seed(69839)\n```\n\n# segregation\n\n[![CRAN\nVersion](https://www.r-pkg.org/badges/version/segregation)](https://CRAN.R-project.org/package=segregation)\n[![R build\nstatus](https://github.com/elbersb/segregation/workflows/R-CMD-check/badge.svg)](https://github.com/elbersb/segregation/actions)\n[![Coverage\nstatus](https://codecov.io/gh/elbersb/segregation/branch/master/graph/badge.svg)](https://app.codecov.io/github/elbersb/segregation?branch=master)\n\nAn R package to calculate, visualize, and decompose various segregation indices. \nThe package currently supports\n\n-   the Mutual Information Index (M),\n-   Theil's Information Index (H),\n-   the Index of Dissimilarity (D),\n-   the isolation and exposure indices.\n\nFind more information in `vignette(\"segregation\")`\nand the [documentation](https://elbersb.de/segregation).\n\nThe package also supports\n\n-   [standard error and confidence interval estimation via bootstrapping](https://elbersb.com/public/posts/2021-11-24-segregation-bias/),\n    which also corrects for small-sample bias\n-   decomposition of the M and H indices (within/between, local segregation)\n-   decomposition of differences in total segregation over time (Elbers 2021)\n-   [segregation visualizations](https://elbersb.github.io/segregation/articles/plotting.html) (segregation curves and 'segplots')\n\nMost methods return [tidy](https://vita.had.co.nz/papers/tidy-data.html)\n[data.tables](https://rdatatable.gitlab.io/data.table/) for easy\npost-processing and plotting. 
For speed, the package uses the [`data.table`](https://rdatatable.gitlab.io/data.table/)\npackage internally, and implements some functions in C++.\n\nMost of the procedures implemented in this package are described in more\ndetail [in this SMR\npaper](https://journals.sagepub.com/doi/full/10.1177/0049124121986204)\n([Preprint](https://osf.io/preprints/socarxiv/ya7zs/)) and [in this\nworking paper](https://osf.io/preprints/socarxiv/ruw4g/).\n\n## Usage\n\nThe package provides an easy way to calculate segregation measures,\nbased on the Mutual Information Index (M) and Theil's Entropy Index (H).\n\n```{r}\nlibrary(segregation)\n\n# example dataset with fake data provided by the package\nmutual_total(schools00, \"race\", \"school\", weight = \"n\")\n```\n\nStandard errors in all functions can be estimated via bootstrapping. This\nwill also apply bias correction to the estimates:\n\n```{r}\nmutual_total(schools00, \"race\", \"school\",\n    weight = \"n\",\n    se = TRUE, CI = 0.90, n_bootstrap = 500\n)\n```\n\nDecompose segregation into a between-state and a within-state term (the\nsum of these equals total segregation):\n\n```{r}\n# between states\nmutual_total(schools00, \"race\", \"state\", weight = \"n\")\n\n# within states\nmutual_total(schools00, \"race\", \"school\", within = \"state\", weight = \"n\")\n```\n\nLocal segregation (`ls`) is a decomposition by units or groups (here\nracial groups). This function also supports standard error and CI\nestimation. 
The sum of the proportion-weighted local segregation scores\nequals M:\n\n```{r}\nlocal \u003c- mutual_local(schools00,\n    group = \"school\", unit = \"race\", weight = \"n\",\n    se = TRUE, CI = 0.90, n_bootstrap = 500, wide = TRUE\n)\nlocal[, c(\"race\", \"ls\", \"p\", \"ls_CI\")]\nsum(local$p * local$ls)\n```\n\nDecompose the difference in M between 2000 and 2005, using iterative\nproportional fitting (IPF) and the Shapley decomposition (see Elbers\n2021 for details):\n\n```{r}\nmutual_difference(schools00, schools05,\n    group = \"race\", unit = \"school\",\n    weight = \"n\", method = \"shapley\"\n)\n```\n\nShow a segplot:\n\n```{r segplot}\nsegplot(schools00, group = \"race\", unit = \"school\", weight = \"n\")\n```\n\nFind more information in the\n[documentation](https://elbersb.github.io/segregation/).\n\n## How to install\n\nTo install the package from CRAN, use\n\n```{r eval=FALSE}\ninstall.packages(\"segregation\")\n```\n\nTo install the development version, use\n\n```{r eval=FALSE}\ndevtools::install_github(\"elbersb/segregation\")\n```\n\n## Citation\n\nIf you use this package for your research, please cite one of the following papers:\n\n- Elbers, Benjamin (2021). A Method for Studying Differences in Segregation\nAcross Time and Space. Sociological Methods \u0026 Research.\n\u003chttps://doi.org/10.1177/0049124121986204\u003e\n\n- Elbers, Benjamin and Rob Gruijters (2023). Segplot: A New Method for Visualizing Patterns of Multi-Group Segregation.\n\u003chttps://doi.org/10.1016/j.rssm.2023.100860\u003e\n\n## Some additional resources\n\n-   The book *Analyzing US Census Data: Methods, Maps, and Models in R*\n    by Kyle E. Walker contains [a discussion of this\n    package](https://walker-data.com/census-r/modeling-us-census-data.html#indices-of-segregation-and-diversity),\n    and is a great resource for anyone working with spatial data,\n    especially U.S. 
Census data.\n-   A paper that makes use of this package: [Did Residential Racial\n    Segregation in the U.S. Really Increase? An Analysis Accounting for\n    Changes in Racial\n    Diversity](https://elbersb.com/public/posts/2021-07-23-segregation-increase/)\n    ([Code and Data](https://osf.io/mg9q4/))\n-   Some of the analyses [in this\n    article](https://multimedia.tijd.be/diversiteit/) by the Belgian\n    newspaper *De Tijd* used the package.\n-   The analyses of [this article in the Wall Street\n    Journal](https://www.wsj.com/articles/chicago-vs-dallas-why-the-north-lags-behind-the-south-and-west-in-racial-integration-11657936680)\n    were produced using this package.\n\n## References on entropy-based segregation indices\n\nDeutsch, J., Flückiger, Y. \u0026 Silber, J. (2009). Analyzing Changes in\nOccupational Segregation: The Case of Switzerland (1970--2000), in: Yves\nFlückiger, Sean F. Reardon, Jacques Silber (eds.) Occupational and\nResidential Segregation (Research on Economic Inequality, Volume 17),\n171--202.\n\nDiPrete, T. A., Eller, C. C., Bol, T., \u0026 van de Werfhorst, H. G. (2017).\nSchool-to-Work Linkages in the United States, Germany, and France.\nAmerican Journal of Sociology, 122(6), 1869-1938.\n\u003chttps://doi.org/10.1086/691327\u003e\n\nElbers, B. (2021). A Method for Studying Differences in Segregation\nAcross Time and Space. Sociological Methods \u0026 Research.\n\u003chttps://doi.org/10.1177/0049124121986204\u003e\n\nForster, A. G., \u0026 Bol, T. (2017). Vocational education and employment\nover the life course using a new measure of occupational specificity.\nSocial Science Research, 70, 176-197.\n\u003chttps://doi.org/10.1016/j.ssresearch.2017.11.004\u003e\n\nTheil, H. (1971). Principles of Econometrics. New York: Wiley.\n\nFrankel, D. M., \u0026 Volij, O. (2011). 
Measuring school segregation.\nJournal of Economic Theory, 146(1), 1-38.\n\u003chttps://doi.org/10.1016/j.jet.2010.10.008\u003e\n\nMora, R., \u0026 Ruiz-Castillo, J. (2003). Additively decomposable\nsegregation indexes. The case of gender segregation by occupations and\nhuman capital levels in Spain. The Journal of Economic Inequality, 1(2),\n147-179. \u003chttps://doi.org/10.1023/A:1026198429377\u003e\n\nMora, R., \u0026 Ruiz-Castillo, J. (2009). The Invariance Properties of the\nMutual Information Index of Multigroup Segregation, in: Yves Flückiger,\nSean F. Reardon, Jacques Silber (eds.) Occupational and Residential\nSegregation (Research on Economic Inequality, Volume 17), 33-53.\n\nMora, R., \u0026 Ruiz-Castillo, J. (2011). Entropy-based Segregation Indices.\nSociological Methodology, 41(1), 159--194.\n\u003chttps://doi.org/10.1111/j.1467-9531.2011.01237.x\u003e\n\nVan Puyenbroeck, T., De Bruyne, K., \u0026 Sels, L. (2012). More than 'Mutual\nInformation': Educational and sectoral gender segregation and their\ninteraction on the Flemish labor market. Labour Economics, 19(1), 1-8.\n\u003chttps://doi.org/10.1016/j.labeco.2011.05.002\u003e\n\nWatts, M. The Use and Abuse of Entropy Based Segregation Indices.\nWorking Paper. URL:\n\u003chttp://www.ecineq.org/ecineq_lux15/FILESx2015/CR2/p217.pdf\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felbersb%2Fsegregation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Felbersb%2Fsegregation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Felbersb%2Fsegregation/lists"}