Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/StatsReporting/stargazer

Python implementation of the R stargazer multiple regression model creation tool
https://github.com/StatsReporting/stargazer

latex python regression-models stargazer

Last synced: 3 months ago
JSON representation

Python implementation of the R stargazer multiple regression model creation tool

Host: GitHub
URL: https://github.com/StatsReporting/stargazer
Owner: StatsReporting
License: other
Created: 2018-06-25T00:13:13.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2024-07-26T15:07:14.000Z (7 months ago)
Last Synced: 2024-09-26T19:17:46.209Z (5 months ago)
Topics: latex, python, regression-models, stargazer
Language: Jupyter Notebook
Size: 439 KB
Stars: 193
Watchers: 6
Forks: 49
Open Issues: 33
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # Stargazer

This is a python port of the R stargazer package that can be found [on CRAN](https://CRAN.R-project.org/package=stargazer). I was disappointed that there wasn't equivalent functionality in any python packages I was aware of so I'm re-implementing it here.

There is an experimental function in the [statsmodels.regression.linear_model.OLSResults.summary2](http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLSResults.summary2.html) that can report single regression model results in HTML/CSV/LaTeX/etc, but it still didn't quite fulfill what I was looking for.

The python package is object oriented now with chained commands to make changes to the rendering parameters, which is hopefully more pythonic and the user doesn't have to put a bunch of arguments in a single function.

## Installation

You can install this package through PyPi with `pip install stargazer` or just clone the repo and take the `stargazer.py` file since it's the only one in the package.

### Dependencies

It depends on `statsmodels`, which in turn depends on several other libraries like `pandas`, `numpy`, etc

## Editing Features

This library implements many of the customization features found in the original package. Examples of most can be found [in the examples jupyter notebook](https://github.com/StatsReporting/stargazer/blob/master/examples.ipynb) and a full list of the methods/features is here below:

* `title`: custom title

* `show_header`: display or hide model header data

* `show_model_numbers`: display or hide model numbers

* `custom_columns`: custom model names and model groupings

* `significance_levels`: change statistical significance thresholds

* `significant_digits`: change number of significant digits

* `show_confidence_intervals`: display confidence intervals instead of variance

* `dependent_variable_name`: rename dependent variable

* `rename_covariates`: rename covariates

* `covariate_order`: reorder covariates

* `reset_covariate_order`: reset covariate order to original ordering

* `show_degrees_of_freedom`: display or hide degrees of freedom

* `custom_note_label`: label notes section at bottom of table

* `add_custom_notes`: add custom notes to section at bottom of the table

* `add_line`: add a custom line to the table

* `append_notes`: display or hide statistical significance thresholds

These features are agnostic of the rendering type and will be applied whether the user outputs in HTML, LaTeX, etc

## Example

Here is an examples of how to quickly get started with the library. More examples can be found in the `examples.ipynb` file in the github repo. The examples all use the scikit-learn diabetes dataset, but it is not a dependency for the package.

### OLS Models Preparation

```python

import pandas as pd

from sklearn import datasets

import statsmodels.api as sm

from stargazer.stargazer import Stargazer

diabetes = datasets.load_diabetes()

df = pd.DataFrame(diabetes.data)

df.columns = ['Age', 'Sex', 'BMI', 'ABP', 'S1', 'S2', 'S3', 'S4', 'S5', 'S6']

df['target'] = diabetes.target

est = sm.OLS(endog=df['target'], exog=sm.add_constant(df[df.columns[0:4]])).fit()

est2 = sm.OLS(endog=df['target'], exog=sm.add_constant(df[df.columns[0:6]])).fit()

stargazer = Stargazer([est, est2])

```

### HTML Example

```python

stargazer.render_html()

```

Dependent variable: target(1)(2)

ABP416.674^***397.583^***

(69.495)(70.870)

Age37.24124.704

(64.117)(65.411)

BMI787.179^***789.742^***

(65.424)(66.887)

S1197.852

(143.812)

S2-169.251

(142.744)

Sex-106.578^*-82.862

(62.125)(64.851)

const152.133^***152.133^***

(2.853)(2.853)

Observations442442R²0.4000.403Adjusted R²0.3950.395Residual Std. Error59.976 (df=437)59.982 (df=435)F Statistic72.913^*** (df=4; 437)48.915^*** (df=6; 435)

Note:^*p<0.1; ^**p<0.05; ^***p<0.01

### LaTeX Example

```python

stargazer.render_latex()

```

![](https://raw.githubusercontent.com/StatsReporting/stargazer/master/latex_example.png)