https://github.com/corymccartan/birdie

Bayesian Instrumental Regression for Disparity Estimation

Host: GitHub
URL: https://github.com/corymccartan/birdie
Owner: CoryMcCartan
License: gpl-3.0
Created: 2021-12-19T00:07:26.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2024-06-19T16:18:38.000Z (12 months ago)
Last Synced: 2025-03-17T22:13:23.436Z (3 months ago)
Language: R
Homepage: http://corymccartan.com/birdie/
Size: 25.4 MB
Stars: 5
Watchers: 4
Forks: 3
Open Issues: 3
Metadata Files:
- Readme: README.Rmd
- License: LICENSE.md

README

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
set.seed(5118)
```

# **BIRDiE**: Estimating disparities when race is not observed 

[![R-CMD-check](https://github.com/CoryMcCartan/birdie/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/CoryMcCartan/birdie/actions/workflows/R-CMD-check.yaml)
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version-last-release/birdie)](https://cran.r-project.org/package=birdie)
![CRAN downloads](http://cranlogs.r-pkg.org/badges/grand-total/birdie)

Bayesian Instrumental Regression for Disparity Estimation (BIRDiE) is a class of
Bayesian models for accurately estimating conditional distributions by race,
using Bayesian Improved Surname Geocoding (BISG) probability estimates of
individual race.

This package implements BIRDiE as described in [McCartan, Fisher, Goldin, Ho, and Imai (2024)](https://www.nber.org/papers/w32373).
It also implements standard BISG and an improved measurement-error BISG model as described
in [Imai, Olivella, and Rosenman (2022)](https://www.science.org/doi/full/10.1126/sciadv.adc9824).



## Installation

You can install the latest version of the package from CRAN with:

``` r
install.packages("birdie")
```

You can also install the development version with:

``` r
# install.packages("remotes")
remotes::install_github("CoryMcCartan/birdie")
```

## Basic Usage

A basic analysis has two steps.
First, you compute BISG probability estimates with the `bisg()` or `bisg_me()` functions (or using any other probabilistic race prediction tool).
Then, you estimate the distribution of an outcome variable by race using the `birdie()` function.

```{r}
library(birdie)
data(pseudo_vf)
head(pseudo_vf)
```

To compute BISG probabilities, you provide the last name and (optionally) geography variables as part of a formula.

```{r}
r_probs = bisg(~ nm(last_name) + zip(zip), data=pseudo_vf)
head(r_probs)
```
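
For intuition about what these probabilities represent, BISG combines a surname-based race distribution with geographic information through Bayes' rule, assuming surname and location are independent given race. The sketch below uses base R and invented numbers; `p_race_given_surname` and `p_zip_given_race` are hypothetical inputs for illustration, and this is not how `bisg()` is implemented internally.

```r
# Toy BISG calculation for one person (all numbers are invented).

# P(race | surname): a hypothetical surname-table entry
p_race_given_surname <- c(white = 0.60, black = 0.25, hisp = 0.05,
                          asian = 0.05, other = 0.05)

# P(ZIP | race): hypothetical share of each group living in this ZIP code
p_zip_given_race <- c(white = 0.001, black = 0.004, hisp = 0.002,
                      asian = 0.001, other = 0.002)

# Bayes' rule: P(race | surname, ZIP) is proportional to
# P(race | surname) * P(ZIP | race), assuming surname and ZIP are
# independent given race.
unnormalized <- p_race_given_surname * p_zip_given_race
p_bisg <- unnormalized / sum(unnormalized)

round(p_bisg, 3)
```

In this toy case the geographic information shifts most of the probability from the surname-based mode toward the group that is relatively concentrated in the ZIP code.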

Computing regression estimates requires specifying a model structure.
Here, we'll use a Categorical-Dirichlet regression model that lets the
relationship between turnout and race vary by ZIP code.
This is the "no-pooling" model from McCartan et al.
We'll use Gibbs sampling for inference, which will also let us capture the uncertainty in our estimates.

```{r}
fit = birdie(r_probs, turnout ~ proc_zip(zip), data=pseudo_vf,
             family=cat_dir(), algorithm="gibbs")
print(fit)
```
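
To give a sense of what Gibbs sampling does in a Categorical-Dirichlet model with unobserved group labels, here is a minimal base-R sketch. It alternates between imputing each individual's group and drawing each group's outcome distribution from its Dirichlet conditional posterior. Everything in it is an assumption made for illustration: two groups, simulated data, a single pooled outcome distribution per group (no ZIP-level variation), and the helper `rdirichlet1()`. It shows the general technique only and is not the algorithm `birdie()` runs.

```r
set.seed(1)

# Simulated toy data (all values invented): 2 groups, binary outcome.
n <- 500
true_theta <- matrix(c(0.7, 0.3,
                       0.4, 0.6),
                     ncol = 2, byrow = TRUE,
                     dimnames = list(c("a", "b"), c("no", "yes")))
p1 <- runif(n, 0.05, 0.95)
prior <- cbind(a = p1, b = 1 - p1)         # stand-in for BISG probabilities
race <- ifelse(runif(n) < p1, 1L, 2L)      # latent; never shown to the sampler
y <- ifelse(runif(n) < true_theta[race, "yes"], 2L, 1L)  # 1 = no, 2 = yes

# Hypothetical helper: one Dirichlet draw via normalized gamma variates.
rdirichlet1 <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha)
  g / sum(g)
}

alpha <- c(1, 1)                 # flat Dirichlet prior on each group's distribution
n_iter <- 200
theta <- matrix(0.5, 2, 2)       # current P(outcome | group); rows are groups
draws <- array(NA_real_, dim = c(n_iter, 2, 2))

for (it in seq_len(n_iter)) {
  # Step 1: sample each latent group from
  # P(group | y, theta), proportional to prior * theta[group, y].
  w <- prior * t(theta[, y])     # theta[, y] is 2 x n; transpose to n x 2
  w <- w / rowSums(w)
  r <- ifelse(runif(n) < w[, 1], 1L, 2L)

  # Step 2: sample each group's outcome distribution from
  # Dirichlet(alpha + outcome counts among individuals assigned to the group).
  for (k in 1:2) {
    counts <- tabulate(y[r == k], nbins = 2)
    theta[k, ] <- rdirichlet1(alpha + counts)
  }
  draws[it, , ] <- theta
}

# Posterior means after discarding the first 50 iterations as burn-in.
post_mean <- apply(draws[-(1:50), , ], c(2, 3), mean)
post_mean
```

Because the sampler keeps every draw, the spread of `draws` across iterations gives a direct measure of uncertainty, which is the property the paragraph above refers to.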

The `proc_zip()` function fills in missing ZIP codes, among other things.

We can extract the estimated conditional distributions with `coef()`.
We can also get updated BISG probabilities that additionally condition on turnout using `fitted()`.
Additional functions allow us to extract a tidy version of our estimates (`tidy()`)
and visualize the estimated distributions (`plot()`).

```{r}
coef(fit)
head(fitted(fit))
tidy(fit)
plot(fit)
```
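
The idea of probabilities that "additionally condition on turnout" can be illustrated with Bayes' rule: each person's prior race probabilities are reweighted by how likely their observed outcome is under each group, then renormalized. The base-R sketch below uses invented numbers and hypothetical object names (`p_bisg`, `p_y_given_race`, `p_updated`), and assumes the outcome depends on surname only through race. It shows the form of the calculation, not the internals of `fitted()`.

```r
# Invented prior probabilities for three people (rows) and two groups (columns).
p_bisg <- matrix(c(0.8, 0.2,
                   0.5, 0.5,
                   0.1, 0.9),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(NULL, c("a", "b")))

# Invented estimate of P(turnout | group): rows are groups, columns are outcomes.
p_y_given_race <- matrix(c(0.3, 0.7,
                           0.6, 0.4),
                         ncol = 2, byrow = TRUE,
                         dimnames = list(c("a", "b"), c("no", "yes")))

# Observed outcome for each person.
y <- c("yes", "yes", "no")

# Likelihood of each person's outcome under each group (3 x 2 matrix).
lik <- t(p_y_given_race[, y])

# Bayes' rule: prior times likelihood, renormalized within each row.
p_updated <- p_bisg * lik
p_updated <- p_updated / rowSums(p_updated)

round(p_updated, 3)
```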

A more detailed introduction to the method and software package can be found
on the [Get Started](https://corymccartan.com/birdie/articles/birdie.html) page.
