{"id":18941561,"url":"https://github.com/jrblevin/epl-replication","last_synced_at":"2026-03-23T21:30:19.276Z","repository":{"id":219485449,"uuid":"749168395","full_name":"jrblevin/epl-replication","owner":"jrblevin","description":"Dearing and Blevins (2024, Review of Economic Studies) Replication Files","archived":false,"fork":false,"pushed_at":"2024-03-04T19:13:52.000Z","size":19226,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-12-31T22:43:11.108Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"MATLAB","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jrblevin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-27T19:20:03.000Z","updated_at":"2024-11-23T19:35:22.000Z","dependencies_parsed_at":"2024-03-04T20:42:07.718Z","dependency_job_id":"a9854e3c-751a-47d6-99a2-7c9de47e53fb","html_url":"https://github.com/jrblevin/epl-replication","commit_stats":null,"previous_names":["jrblevin/epl-replication"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrblevin%2Fepl-replication","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrblevin%2Fepl-replication/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrblevin%2Fepl-replication/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jrblevin%2Fepl-replication/manifests","owner_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/owners/jrblevin","download_url":"https://codeload.github.com/jrblevin/epl-replication/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239942756,"owners_count":19722330,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T12:28:32.700Z","updated_at":"2026-03-23T21:30:19.215Z","avatar_url":"https://github.com/jrblevin.png","language":"MATLAB","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dearing and Blevins (2024) Replication Files\n\nDearing and Blevins (2024).\nEfficient and Convergent Sequential Pseudo-Likelihood Estimation of\nDynamic Discrete Games.\n_Review of Economic Studies_.\n\n## Overview\n\nThis repository contains the replication code for all Monte Carlo\nexperiments and the Application of Dearing and Blevins (2024).  The\nreplication files for each section are contained in separate\nsubdirectories.  Here we give a brief overview of the contents of the\npackage.\n\n-   The `mc-am2007` subdirectory contains the replication code for the\n    Monte Carlo experiments in Section 4 of the paper.  The\n    simulations are based on the dynamic game of entry and exit in\n    Example 1 of the paper, parameterized to match a model with five\n    heterogeneous firms from Aguirregabiria and Mira (2007).\n\n-   The `application` subdirectory contains the replication code for\n    the application to U.S.  wholesale club store competition in\n    Section 5 of the paper.  
The simulations are based on the dynamic\n    game of entry and exit in Example 1 of the paper.\n\n-   The `mc-psd2008` subdirectory contains the replication code for\n    the Monte Carlo experiments in Appendix B.1 of the paper.  The\n    simulations are based on the model of Pesendorfer and\n    Schmidt-Dengler (2008), a dynamic game with multiple equilibria.\n\n-   The `mc-psd2010` subdirectory contains the replication code for\n    the Monte Carlo experiments in Appendix B.2 of the paper.  The\n    simulations are based on the model of Pesendorfer and\n    Schmidt-Dengler (2010), a static game with an unstable equilibrium\n    leading to non-convergence of NPL.\n\n## Data Availability\n\nThe Monte Carlo experiments in this paper do not involve analysis of external\ndata (i.e., the only data used are generated via simulation in the code).\n\nThe application in the paper uses data from several sources:\n\n-   [IPUMS NHGIS Table CL8: Total Population (1990, 2000, 2010, 2020)][nhgis]\n-   [U.S. Department of Housing and Urban Development Zip Code to County Crosswalk Files][crosswalk]\n-   [Data Axle Historical Business Database][infogroup]\n\nWe have included in this replication package the NHGIS population data\nextract from Table CL8: Total Population (`application/nhgis0012_ts_geog2010_zcta.csv`)\nas well as the HUD crosswalk file (`application/ZIP_COUNTY_122021.xlsx`).\n\nThe source files from the Data Axle Historical Business Database\ncannot be made publicly available due to the terms of use.  While we\ncannot redistribute the original dataset, we have provided our\nanonymized analysis sample, which users can use to replicate the tables\nin the paper.\n\nThe full raw dataset can be obtained freely through certain university\nlibraries or directly from Data Axle for a fee.  
Researchers\ninterested in access to the data may contact Data Axle at\n\u003chttps://www.data-axle.com/contact-us/\u003e.\n\nThe data preparation programs are also provided and can be used to\ngenerate the analysis sample once certain data files have been\nobtained from Data Axle.  These files are named following the pattern\n`BUSINESS_HISTORICAL_YYYY.zip` for the years 1997 through 2020 and\n`YYYY_Business_Academic_QCQ.txt.gz` for the year 2021.\n\n   [nhgis]: https://www.nhgis.org/tabular-data-sources\n   [crosswalk]: https://www.huduser.gov/portal/datasets/usps_crosswalk.html\n   [infogroup]: https://www.data-axle.com/our-data/business-data/\n\n### Details on each Data Source\n\n| Data Name                    | Data Files                       | Location                     | Provided | Citation              |\n|------------------------------|----------------------------------|------------------------------|----------|-----------------------|\n| Zip Code Population          | `nhgis0012_ts_geog2010_zcta.csv` | `application/nhgis0012_csv/` | True     | Manson, et al. (2022) |\n| Zip Code to County Crosswalk | `ZIP_COUNTY_122021.xlsx`         | `application/`               | True     | HUD (2021)            |\n| Historical Business Database | Not available                    | `application/`               | False    | Data Axle (2022)      |\n\n## Software Requirements\n\nMatlab is required to replicate all Monte Carlo experiments and the\napplication.  All code has been verified to be compatible with Matlab\n2022b.\n\nPython and Stata were used to prepare the analysis sample for the\napplication.\n\nThe Python script `application/extract.py` imports the following\nPython modules: `zipfile`, `gzip`, and `pandas`.  Although `zipfile`\nand `gzip` are part of the standard library, Pandas is an external\npackage.  
Depending on your Python installation, it can likely be\ninstalled using the following command:\n\n```shell\npip install pandas\n```\n\nThe Stata scripts used for the application require two additional\npackages to be installed: `xttrans2` and `mat2txt`.  These can be\ninstalled using the following commands:\n\n```stata\nnet install xttrans2\nssc install mat2txt\n```\n\nThe analysis sample used for the application in the paper was produced\nusing Python 3.11.6, Pandas 2.0.3, Stata MP 16.1, xttrans 1.2.0, and\nmat2txt 1.1.2.\n\n## Hardware Requirements\n\nAll of the programs in this replication package can be completed on a\nmodern laptop (e.g., a circa 2020 MacBook Pro) without issue.\n\n-   The complete series of Monte Carlo experiments reported in Section\n    4 (`mc-am2007` subdirectory) can be completed in approximately\n    three to five hours.\n\n-   The estimation and counterfactual simulations for the application\n    (`application` subdirectory), including 250 bootstrap replications,\n    can be completed in approximately 30 minutes to one hour.\n\n-   The complete series of Monte Carlo experiments reported in\n    Appendix B.1 (`mc-psd2008` subdirectory) can be completed in\n    approximately one to two hours.\n\n-   The complete series of Monte Carlo experiments reported in\n    Appendix B.2 (`mc-psd2010` subdirectory) can be completed on a\n    modern laptop in approximately five minutes.\n\n## Description of Code\n\n### Monte Carlo Experiments (Section 4)\n\nBelow is a complete list of program and function files used in the\nMonte Carlo simulations reported in Section 4.  
These files are\ncontained in the `mc-am2007` subdirectory.\n\nMonte Carlo experiments:\n\n-   `mc_main.m`: Control program that runs all Monte Carlo simulations\n    (across all experiments and sample sizes) and saves the results.\n-   `mc_epl.m`: The primary function which carries out each NPL and\n    EPL Monte Carlo experiment for a fixed sample size and experiment\n    number.\n-   `Gfunc.m`: Evaluates the equilibrium conditions of the dynamic\n    entry/exit oligopoly model.\n-   `epldygam.m`: Estimates the structural parameters using EPL.\n-   `npldygam.m`: Estimates the structural parameters using NPL.\n-   `freqprob.m`: Frequency probability estimation.\n-   `milogit.m`: Maximum Likelihood Estimation of Logit Model.\n-   `clogit.m`: Maximum Likelihood estimation of McFadden's\n    Conditional Logit model.\n-   `loglogit.m`: Log likelihood and gradient functions for\n    multinomial logit.\n-   `simdygam.m`: Simulates data from the dynamic game.\n-   `mpestat.m`: Sample summary statistics.\n\nPost-estimation programs to generate figures:\n\n-   `iter_histograms.m`: Produces histograms of the number of\n    iterations taken by the EPL and NPL estimators across Monte Carlo\n    replications.\n-   `parameter_histograms.m`: Produces histograms of the parameter\n    estimates across Monte Carlo replications.\n-   `time_histograms.m`: Produces histograms of the runtimes of the\n    EPL and NPL estimators across Monte Carlo replications.\n\n### Application (Section 5)\n\nBelow is a complete list of data, program, and function files used for\nthe application described in Section 5 of the paper.\n\n#### Raw Data\n\n-   `ZIP_COUNTY_122021.xlsx` - U.S. 
Department of Housing and Urban\n    Development Zip Code to County Crosswalk File.\n\n-   `nhgis0012_csv/` - This subdirectory contains the Zip code\n    population data and codebook from IPUMS NHGIS.\n\n-   Data from the Data Axle Historical Business Database\n    (`BUSINESS_HISTORICAL_YYYY.zip` and `YYYY_Business_Academic_QCQ.txt.gz`)\n    could not be included in the replication package due to restrictions\n    on redistribution.\n\n#### Data Processing\n\nThe following Python and Stata scripts were used to process the raw\ndata and produce the included analysis sample `clubstore_county.csv`:\n\n1.   `extract.py` - Extracts records related to wholesale club stores\n     from the Data Axle Historical Business Database files.  Produces\n     `extract.dta`.\n\n2.   `population_zip.do` - Generate zip-code population data in Stata.\n     Produces `population_zip.dta`.\n\n3.   `population_county.do` - Generate county population data in\n     Stata.  Produces `zip_county_crosswalk.dta` and\n     `population_county.dta`.\n\n4.   `clubstore_county.do` - Generate final analysis dataset using Stata.\n     Produces the analysis sample `clubstore_county.csv` and\n     `ptrans.txt` (both included).\n\n#### Analysis Sample\n\n-   `clubstore_county.csv` - This file contains the county-level\n    analysis sample used in the application.  
This file was produced\n    following the steps in the previous Data Processing section.\n\n-   `ptrans.txt` - This file contains the state-to-state transition\n    matrix for market size used in the application.\n\n#### Estimation\n\n-   `main.m` - This is the main Matlab control file for the application.\n    This file produces Tables 7-11 in the paper.\n-   `Gfunc.m`: Evaluates the equilibrium conditions of the dynamic entry/exit\n    oligopoly model.\n-   `epldygam.m`: Estimates the structural parameters using EPL.\n-   `npldygam.m`: Estimates the structural parameters using NPL.\n-   `freqprob.m`: Frequency probability estimation.\n-   `clogit.m`: Maximum Likelihood estimation of McFadden's Conditional\n    Logit model.\n-   `milogit.m`: Maximum Likelihood Estimation of multinomial logit model.\n-   `loglogit.m`: Log likelihood and gradient functions for multinomial logit model.\n-   `mpestat.m`: Sample summary statistics.\n-   `forwardsim.m`: Simulates data on state transition and decisions, used\n    for counterfactual simulations.\n\n### Monte Carlo Experiments (Appendix B.1)\n\nBelow is a complete list of program and function files used in the\nMonte Carlo simulations reported in Appendix B.1 of the paper.  These\nfiles are contained in the `mc-psd2008` subdirectory.\n\nMain program:\n\n-   `main.m` - Main control program that executes all Monte Carlo\n    experiments for multiple sample sizes (`nobs`), multiple\n    equilibria (`eqm_dgp`), and multiple levels of noise in the\n    estimates (`c`).  Produces log files named according to the\n    pattern `psd2008_estimate_eqmN.log` where `N` is the equilibrium\n    index (`eqm_dgp` which can be 1, 2, or 3).  Results are also\n    stored in `.mat` files for further processing if desired.\n\nMain support functions:\n\n-   `psd2008_estimate.m` - Estimates the model using the generated\n    data and stores results.  Requires setting `nsims` (= 1000) before\n    running.  
Produces\n\n-   `generate_data.m` - Solves for equilibria and generates data for\n    the Pesendorfer and Schmidt-Dengler (2008) model.  Requires\n    setting `nsims` (number of simulations; `1000` in the paper),\n    `nobs` (number of observations; `250` and `1000` in the paper),\n    and `eqm_dgp` (equilibrium number; 1, 2, or 3) in the Matlab\n    session before running directly.  Produces `psd2008_data.mat`.\n\nOther support functions:\n\n-   `EqmDiff.m` - Equilibrium conditions for the model.\n-   `Gderiv.m` - Derivative of G function with respect to v.\n-   `LL_EPL.m` - Log likelihood for EPL estimation.\n-   `LL_NPL.m` - Log likelihood for NPL estimation.\n-   `normsurplus.m` - Social surplus function under the normal\n    distribution.\n\n### Monte Carlo Experiments (Appendix B.2)\n\nBelow is a complete list of program and function files used in the\nMonte Carlo simulations reported in Appendix B.2 of the paper.  These\nfiles are contained in the `mc-psd2010` subdirectory.\n\n-   `main.m` - Main Monte Carlo simulation and estimation program.\n-   `LL_MLE.m` - Log likelihood function for MLE.\n-   `LL_EPL.m` - Log likelihood function for EPL.\n-   `LL_NPL.m` - Log likelihood function for NPL.\n-   `myCDF.m` - Error CDF specified in Pesendorfer and Schmidt-Dengler (2010).\n\nTo replicate the Monte Carlo experiments, run `main` in Matlab.\n\n## List of Tables and Figures\n\n-   Tables 1-6 were produced by `mc-am2007/mc_main.m`.\n\n-   Tables 7-11 in the paper were produced by `application/main.m`.\n\n-   Tables 12-16 were produced by `mc-psd2008/psd2008_estimate_all.m`.\n    This script loops over all equilibria and sample sizes and\n    estimates the model under several scenarios:\n\n    -   Table 12 corresponds to the case where `eqm_dgp = 1`,\n        `c = 0.0`, `nstart = 1`, and `nobs = [ 250, 1000 ]`.\n\n    -   Table 13 corresponds to the case where `eqm_dgp = 2`,\n        `c = 0.0`, `nstart = 1`, and `nobs = [ 250, 1000 ]`.\n\n    -   Table 14 
corresponds to the case where `eqm_dgp = 3`,\n        `c = 0.0`, `nstart = 1`, and `nobs = 1000`.\n\n    -   Table 15 corresponds to the case where `eqm_dgp = [ 1, 2 ]`,\n        `c = 0.5`, `nstart = 1`, and `nobs = 250`.\n\n    -   Table 16 corresponds to the case where `eqm_dgp = [ 1, 2 ]`,\n        `c = 1.0`, `nstart = 5`, and `nobs = [ 250, 1000 ]`.\n\n-   Table 17 was produced by `mc-psd2010/main.m`.\n\n-   Figure 1 was produced by `mc-am2007/parameter_histograms.m`.\n    The subfigures shown correspond to the following PDF files:\n\n    -   `mc_epl_exper_1_6400_obs_param_7_hist.pdf`\n    -   `mc_epl_exper_2_6400_obs_param_7_hist.pdf`\n    -   `mc_epl_exper_3_6400_obs_param_7_hist.pdf`\n    -   `mc_epl_exper_1_6400_obs_param_8_hist.pdf`\n    -   `mc_epl_exper_2_6400_obs_param_8_hist.pdf`\n    -   `mc_epl_exper_3_6400_obs_param_8_hist.pdf`\n\n-   Figure 2 was produced by `mc-am2007/iter_histograms.m` and\n    `mc-am2007/time_histograms.m`.  The subfigures shown correspond to\n    the following PDF files:\n\n    -   `mc_epl_exper_1_6400_obs_iter_hist.pdf`\n    -   `mc_epl_exper_2_6400_obs_iter_hist.pdf`\n    -   `mc_epl_exper_3_6400_obs_iter_hist.pdf`\n    -   `mc_epl_exper_1_6400_obs_time_hist.pdf`\n    -   `mc_epl_exper_2_6400_obs_time_hist.pdf`\n    -   `mc_epl_exper_3_6400_obs_time_hist.pdf`\n\n## Results\n\n### Monte Carlo Experiments (Section 4)\n\nThe `mc-am2007/results` subdirectory contains the output of all Monte Carlo\nexperiments reported in Section 4 of the paper. Equilibria for each\nexperiment 1-3 are computed by the `mc_epl.m` program and stored in\nthe files `mc_epl_eq1.mat`, `mc_epl_eq2.mat`, and `mc_epl_eq3.mat`\nrespectively, if they do not already exist.  
If they exist, the\nequilibrium choice probabilities are loaded and reused for computational\nefficiency.\n\nEach of the log files named `mc_epl_exper_J_N_obs.log` contains the results of\n1000 Monte Carlo replications for experiment `J` (1, 2, or 3)\nand sample size `N` (1600 or 6400). These log files are large, so they have\nbeen gzip compressed to save space. The raw Matlab data files containing the\nparameter estimates, number of iterations, and runtimes from each replication\nare saved in .mat files named `mc_epl_exper_J_N_obs.mat`. The parameter, time,\nand iteration histograms are saved in PDF files with similar names.\n\n### Application (Section 5)\n\nThe results in Section 5 of the paper were produced by running\n`application/main.m` described above using Matlab R2022b.  The log\nfile and results are included in the replication package:\n\n-   `application/clubstore.log` - Contains the estimation log file for\n    the application, including estimation with the observed sample\n    (replication 1) as well as 250 bootstrap replications\n    (replications 2-251).\n\n-   `application/clubstore.mat` - The raw Matlab data file for the\n    application, including estimation with the observed sample and all\n    bootstrap replications.\n\n### Monte Carlo Experiments (Appendix B.1)\n\nThe results in this section were produced by running `mc-psd2008/main.m`\ndescribed above using Matlab 2018a.  The log files are included in the\nreplication package in the `mc-psd2008/results` subdirectory:\n\n-   `psd2008_estimate_eqm1.log`\n-   `psd2008_estimate_eqm2.log`\n-   `psd2008_estimate_eqm3.log`\n\n### Monte Carlo Experiments (Appendix B.2)\n\nThe results in the paper were produced by running `mc-psd2010/main.m`\ndescribed above.  A log file produced by Matlab 2018b is included in\nthe replication package: `mc-psd2010/mc-psd2010.log`.\n\n## References\n\n-   Aguirregabiria, V., and P. 
Mira (2007).\n    [Sequential Estimation of Dynamic Discrete Games](https://doi.org/10.1111/j.1468-0262.2007.00731.x).\n    _Econometrica_ 75, 1-53.\n\n-   Data Axle. U.S. Historical Businesses, 1997-2021 [Data set].\n    Retrieved November 15, 2022 from The Ohio State University Libraries\n    Research Commons.\n    \u003chttps://library.ohio-state.edu/record=e1002559~S7\u003e\n\n-   Manson, S., J. Schroeder, D. Van Riper, T. Kugler, and S. Ruggles.\n    IPUMS National Historical Geographic Information System: Version 17.0\n    [Data set]. Minneapolis, MN: IPUMS. 2022.\n    Retrieved March 9, 2023 from IPUMS NHGIS.\n    \u003chttp://doi.org/10.18128/D050.V17.0\u003e\n\n-   Pesendorfer, M. and P. Schmidt-Dengler (2008).\n    [Asymptotic least squares estimators for dynamic games](https://doi.org/10.1111/j.1467-937X.2008.00496.x).\n    _Review of Economic Studies_ 75, 901-928.\n\n-   Pesendorfer, M. and P. Schmidt-Dengler (2010).\n    [Sequential estimation of dynamic discrete games: A comment](https://doi.org/10.3982/ECTA7633).\n    _Econometrica_ 78, 833-842.\n\n-   U.S. Department of Housing and Urban Development (HUD).\n    HUD-USPS ZIP Code Crosswalk data 2021-Q4 [Data set].\n    Retrieved May 16, 2023 from the HUD Office of Policy Development and Research (PD\u0026R).\n    \u003chttps://www.huduser.gov/portal/datasets/usps_crosswalk.html\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrblevin%2Fepl-replication","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjrblevin%2Fepl-replication","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjrblevin%2Fepl-replication/lists"}