https://github.com/pslmodels/tax-microdata-benchmarking
A project to develop a benchmarked general-purpose dataset for tax reform impact analysis.
microsimulation tax-benefit validation
- Host: GitHub
- URL: https://github.com/pslmodels/tax-microdata-benchmarking
- Owner: PSLmodels
- Created: 2024-02-06T16:18:50.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2026-03-26T16:12:42.000Z (10 days ago)
- Last Synced: 2026-03-26T17:48:23.960Z (10 days ago)
- Topics: microsimulation, tax-benefit, validation
- Language: Python
- Homepage: https://pslmodels.github.io/tax-microdata-benchmarking/
- Size: 175 MB
- Stars: 3
- Watchers: 6
- Forks: 8
- Open Issues: 6
Metadata Files:
- Readme: README.md
# tax-microdata
This repository contains all working files for a project to develop
validated input files for use in
[Tax-Calculator](https://github.com/PSLmodels/Tax-Calculator).
The **current version is 2.0.0**, which was released on March 29, 2026,
and includes the following significant improvements:
- generate national, state, and Congressional district input files
for **2022**:
[#470](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/470)
[#471](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/471)
[#472](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/472)
[#473](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/473)
[#474](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/474)
- improve the selection of CPS tax units to represent nonfilers:
[#438](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/438)
- vastly improve the reweighting algorithm:
[#416](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/416)
- impute values for three variables used in new OBBBA deductions:
[#397](https://github.com/PSLmodels/tax-microdata-benchmarking/pull/397)
## Usage instructions
To use the code in this repository, you need to license the
2015 PUF from IRS/SOI. Once you have done that, you will have two
CSV-formatted files from IRS/SOI: `puf_2015.csv` and
`demographics_2015.csv`.
To generate the TMD files from the PUF files, do this:
1. Copy the two 2015 PUF files to the `tmd/storage/input` folder
2. Install the SIPP files described in `tmd/storage/input/SIPP24/README.md`
3. Install the CEX files described in `tmd/storage/input/CEX23/README.md`
4. Run `make clean` in the repository's top-level folder
5. Run `make data` in the repository's top-level folder
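Before running the `make` commands, it can help to confirm that step 1 is complete. The following is a minimal sketch, not part of the repository: the helper name `missing_puf_files` is hypothetical, and it assumes only the two file names and the `tmd/storage/input` folder named in the steps above. It does not check the SIPP or CEX files, whose names are given in their own README files.

```python
from pathlib import Path

# File names taken from the usage instructions above.
REQUIRED_PUF_FILES = ("puf_2015.csv", "demographics_2015.csv")


def missing_puf_files(repo_root):
    """Return the required PUF file names absent from tmd/storage/input.

    Hypothetical helper for illustration; repo_root is the path to the
    repository's top-level folder.
    """
    input_dir = Path(repo_root) / "tmd" / "storage" / "input"
    return [
        name
        for name in REQUIRED_PUF_FILES
        if not (input_dir / name).is_file()
    ]
```

Calling `missing_puf_files(".")` from the repository's top-level folder returns an empty list when both PUF files are in place.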
The `make data` command creates and tests the three national
`tmd*csv*` data files and writes them to the `tmd/storage/output`
folder. Read [this
documentation](https://taxcalc.pslmodels.org/usage/data.html#irs-public-use-data-tmd-csv)
to learn how to use these three files with Tax-Calculator. The tests
in this repository also contain Python code that uses the TMD files
with Tax-Calculator.
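After `make data` finishes, one quick way to see what it produced is to list the files in `tmd/storage/output` that match the `tmd*csv*` pattern. This is a sketch under stated assumptions: the helper name `list_tmd_outputs` is hypothetical, and it makes no claim about the exact output file names, only the pattern and folder given above.

```python
from pathlib import Path


def list_tmd_outputs(repo_root):
    """Return sorted names of files matching tmd*csv* in tmd/storage/output.

    Hypothetical helper for illustration; returns an empty list if the
    output folder does not exist yet.
    """
    output_dir = Path(repo_root) / "tmd" / "storage" / "output"
    if not output_dir.is_dir():
        return []
    return sorted(
        p.name for p in output_dir.glob("tmd*csv*") if p.is_file()
    )
```

Running `list_tmd_outputs(".")` from the repository's top-level folder returns the names of the matching files, which can then be passed to Tax-Calculator as described in the linked documentation.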