{"id":24319394,"url":"https://github.com/btskinner/rmargins","last_synced_at":"2025-03-10T20:40:40.180Z","repository":{"id":84089699,"uuid":"218315372","full_name":"btskinner/rmargins","owner":"btskinner","description":"Stata-like margins in R by hand","archived":false,"fork":false,"pushed_at":"2019-10-29T15:41:03.000Z","size":32,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-17T15:44:51.646Z","etag":null,"topics":["margins","r","stata"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/btskinner.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-10-29T15:07:05.000Z","updated_at":"2019-10-29T15:41:05.000Z","dependencies_parsed_at":null,"dependency_job_id":"fb341438-bb85-4744-aa97-ed02755aae1a","html_url":"https://github.com/btskinner/rmargins","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/btskinner%2Frmargins","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/btskinner%2Frmargins/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/btskinner%2Frmargins/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/btskinner%2Frmargins/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/btskinner","download_url":"https://codeload.github.com/btskinner/rmargins/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242925855,"owners_count":20207752,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["margins","r","stata"],"created_at":"2025-01-17T15:32:53.497Z","updated_at":"2025-03-10T20:40:40.169Z","avatar_url":"https://github.com/btskinner.png","language":"R","readme":"Notes\n=====\n\nThe main file, `margins.R`, shows how to manually compute Stata-like\nmargins in R in the context of logistic regression. It’s mostly just to\nshow the intuition underlying Stata’s `-margins-` command, but you can\nuse the results to make nice margins figures with ggplot. Output from R\ncan be checked in Stata with `margins_check.do` and `fake_data.csv` can\nbe recreated with `make_fake_data.R`.\n\nFor a more complete suite of ready-to-go commands, there’s the\n[`margins`](https://cran.r-project.org/web/packages/margins/vignettes/Introduction.html)\nR package.\n\nSteps\n-----\n\n### Run logistic regression\n\n    ## read in fake data\n    df \u003c- read.csv('./fake_data.csv')\n\n    ## run logit\n    mod \u003c- glm(y ~ x1 + x2 + x3 + x4, data = df, family = binomial(link = 'logit'))\n    summary(mod)\n\n    ## \n    ## Call:\n    ## glm(formula = y ~ x1 + x2 + x3 + x4, family = binomial(link = \"logit\"), \n    ##     data = df)\n    ## \n    ## Deviance Residuals: \n    ##      Min        1Q    Median        3Q       Max  \n    ## -2.54441  -0.28561  -0.05508   0.15145   2.72244  \n    ## \n    ## Coefficients:\n    ##             Estimate Std. Error z value Pr(\u003e|z|)    \n    ## (Intercept)   0.7359     0.2081   3.536 0.000406 ***\n    ## x1            1.7795     0.1706  10.432  \u003c 2e-16 ***\n    ## x2           -3.5876     0.2732 -13.131  \u003c 2e-16 ***\n    ## x3           -4.6119     0.3689 -12.501  \u003c 2e-16 ***\n    ## x4            1.5467     0.2983   5.185 2.16e-07 ***\n    ## ---\n    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n    ## \n    ## (Dispersion parameter for binomial family taken to be 1)\n    ## \n    ##     Null deviance: 1275.33  on 999  degrees of freedom\n    ## Residual deviance:  479.14  on 995  degrees of freedom\n    ## AIC: 489.14\n    ## \n    ## Number of Fisher Scoring iterations: 7\n\n### Margins for unit change in binary variable (`x3`)\n\n    ## (1) get model matrix from glm() object\n    mm \u003c- model.matrix(mod)\n    head(mm)\n\n    ##   (Intercept)         x1         x2 x3 x4\n    ## 1           1  0.2839260  1.2895032  1  0\n    ## 2           1  1.3495918 -2.0475880  0  0\n    ## 3           1  0.4017083  0.8771911  1  0\n    ## 4           1 -2.0652666  0.7446761  1  0\n    ## 5           1  0.5624508  0.2748494  1  0\n    ## 6           1 -0.1020731 -1.6143429  0  1\n\n    ## (2) drop intercept column of ones b/c we don't need it\n    mm \u003c- mm[,-1]\n    head(mm)\n\n    ##           x1         x2 x3 x4\n    ## 1  0.2839260  1.2895032  1  0\n    ## 2  1.3495918 -2.0475880  0  0\n    ## 3  0.4017083  0.8771911  1  0\n    ## 4 -2.0652666  0.7446761  1  0\n    ## 5  0.5624508  0.2748494  1  0\n    ## 6 -0.1020731 -1.6143429  0  1\n\n    ## (3) convert to data.frame to make life easier\n    df_mm \u003c- as.data.frame(mm)\n\n### VERSION 1: all other variables `-atmeans-`\n\n**NB: this should be equivalent to Stata `margins x3, atmeans`**\n\n    ## (4) make \"new data\" where # rows == # margins for key var, averages elsewhere\n    new_df \u003c- data.frame(x1 = mean(df_mm$x1),\n                         x2 = mean(df_mm$x2),\n                         x3 = c(0,1),       # two margins, 0/1, for x3\n                         x4 = mean(df_mm$x4))\n\n    new_df\n\n    ##           x1          x2 x3    x4\n    ## 1 0.05914387 -0.03310865  0 0.193\n    ## 2 0.05914387 -0.03310865  1 0.193\n\n    ## (5) use predict() with new data, setting type to get probs\n    pp \u003c- predict(mod, newdata = new_df, se.fit = TRUE, type = 'response')\n    pp\n\n    ## $fit\n    ##          1          2 \n    ## 0.77876250 0.03378329 \n    ## \n    ## $se.fit\n    ##          1          2 \n    ## 0.03568211 0.00822396 \n    ## \n    ## $residual.scale\n    ## [1] 1\n\n    ## check difference (Stata: -margins, dydx(x3) atmeans-)\n    pp$fit[2] - pp$fit[1]\n\n    ##          2 \n    ## -0.7449792\n\n### VERSION 2: `x4 == 1`, others `-atmeans-`\n\n**NB: this should be equivalent to Stata\n`margins x3, at(x4 = 1) atmeans`**\n\n    ## (4) make \"new data\" where # rows == # margins for key var, averages elsewhere\n    new_df \u003c- data.frame(x1 = mean(df_mm$x1),\n                         x2 = mean(df_mm$x2),\n                         x3 = c(0,1),       # two margins, 0/1, for x3\n                         x4 = 1)            # x4 == 1\n\n    new_df\n\n    ##           x1          x2 x3 x4\n    ## 1 0.05914387 -0.03310865  0  1\n    ## 2 0.05914387 -0.03310865  1  1\n\n    ## (5) use predict() with new data, setting type to get probs\n    pp \u003c- predict(mod, newdata = new_df, se.fit = TRUE, type = 'response')\n    pp\n\n    ## $fit\n    ##         1         2 \n    ## 0.9246054 0.1085866 \n    ## \n    ## $se.fit\n    ##          1          2 \n    ## 0.02277638 0.02804731 \n    ## \n    ## $residual.scale\n    ## [1] 1\n\n### Margins for unit change in continuous variable (`x1`)\n\n**NB: this should be equivalent to Stata\n`margins, at(x1 = (-4(1)4)) atmeans`**\n\n    ## get idea of range\n    summary(df$x1)\n\n    ##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. \n    ## -2.69344 -0.57129  0.04494  0.05914  0.70844  3.21878\n\n    ## (4) make \"new data\" where # rows == # margins for key var, averages elsewhere\n    new_df \u003c- data.frame(x1 = seq(from = -4, to = 4, by = 1),\n                         x2 = mean(df_mm$x2),\n                         x3 = mean(df_mm$x3),\n                         x4 = mean(df_mm$x4))\n\n    new_df\n\n    ##   x1          x2    x3    x4\n    ## 1 -4 -0.03310865 0.714 0.193\n    ## 2 -3 -0.03310865 0.714 0.193\n    ## 3 -2 -0.03310865 0.714 0.193\n    ## 4 -1 -0.03310865 0.714 0.193\n    ## 5  0 -0.03310865 0.714 0.193\n    ## 6  1 -0.03310865 0.714 0.193\n    ## 7  2 -0.03310865 0.714 0.193\n    ## 8  3 -0.03310865 0.714 0.193\n    ## 9  4 -0.03310865 0.714 0.193\n\n    ## (5) use predict() with new data, setting type to get probs\n    pp \u003c- predict(mod, newdata = new_df, se.fit = TRUE, type = 'response')\n    pp\n\n    ## $fit\n    ##            1            2            3            4            5 \n    ## 9.538101e-05 5.650291e-04 3.339462e-03 1.947163e-02 1.053009e-01 \n    ##            6            7            8            9 \n    ## 4.109116e-01 8.052238e-01 9.607867e-01 9.931607e-01 \n    ## \n    ## $se.fit\n    ##            1            2            3            4            5 \n    ## 7.674447e-05 3.599678e-04 1.572884e-03 5.992871e-03 1.683050e-02 \n    ##            6            7            8            9 \n    ## 3.710458e-02 4.245440e-02 1.603947e-02 4.003603e-03 \n    ## \n    ## $residual.scale\n    ## [1] 1\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbtskinner%2Frmargins","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbtskinner%2Frmargins","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbtskinner%2Frmargins/lists"}