https://github.com/estebanrucan/estyp
ESTYP: Extended Statistical Toolkit Library for Python
estyp glm hypothesis-testing linear-regression machine-learning model-selection python statistical-analysis var-test
- Host: GitHub
- URL: https://github.com/estebanrucan/estyp
- Owner: estebanrucan
- License: mit
- Created: 2023-07-19T00:14:36.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-09-17T02:49:27.000Z (over 2 years ago)
- Last Synced: 2025-09-25T04:37:04.836Z (8 months ago)
- Topics: estyp, glm, hypothesis-testing, linear-regression, machine-learning, model-selection, python, statistical-analysis, var-test
- Language: Python
- Homepage: http://estyp.readthedocs.io/
- Size: 8.07 MB
- Stars: 5
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# ESTYP: Extended Statistical Toolkit Yet Practical
## Description
ESTYP (Extended Statistical Toolkit Yet Practical) is a Python library for statistical analysis, organized into three modules. The `testing` module provides a range of statistical tests, including t-tests, chi-squared tests, and correlation tests, for comparing and validating data. The `linear_model` module extends traditional logistic regression analysis with variable selection techniques and methods for calculating confidence intervals and p-values. The `cluster` module assists in clustering analysis, offering tools to identify the optimal number of clusters using the elbow or silhouette methods.
The name actually comes from what my friends call me (Esti), plus "p", the initial of `python`.
## Installation
To install this library, you can use PyPI:
```bash
pip install estyp
```
Also, you can install it from the source code:
```bash
git clone https://github.com/estebanrucan/estyp.git
cd estyp
pip install -e .
```
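After either installation route, a quick way to confirm that the package is visible to the current interpreter is to query its installed version. The snippet below is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper name, not part of ESTYP.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if estyp is installed, otherwise None.
print(installed_version("estyp"))
```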
## Documentation
You can find a friendly introduction to this library in the [documentation](https://estyp.readthedocs.io/en/latest/).
## Changelog
You can see the full changelog [here](./CHANGELOG.md).
## Features
### `testing` module
* `testing.CheckModel()`: This class provides methods to test the assumptions of the linear regression model. It is inspired by the `performance::check_model()` function of the R software.
* `testing.t_test()`: Performs one and two sample t-tests on groups of data. This function is inspired by the `t.test()` function of the R software.
* `testing.var_test()`: Performs an F test to compare the variances of two samples from normal populations. This function is inspired by the `var.test()` function of the R software.
* `testing.prop_test()`: Tests the null hypothesis that the proportions (probabilities of success) in several groups are the same, or that they equal certain given values. This function is inspired by the `prop.test()` function of the R software.
* `testing.chisq_test()`: Performs a chi-squared test of independence of variables in a contingency table. This function is inspired by the `chisq.test()` function of the R software.
* `testing.cor_test()`: Performs a correlation test with Pearson, Spearman or Kendall method. This function is inspired by the `cor.test()` function of the R software.
* `testing.nested_models_test()`: Performs a nested models test to compare two nested models using the deviance criterion.
* `testing.dw_test()`: Performs the Durbin-Watson test for autocorrelation of disturbances (includes a p-value). Inspired by the `lmtest::dwtest()` function of the R software.
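To make the idea behind a variance comparison concrete, here is a minimal sketch of a two-sided F test for equal variances, the kind of test that `var.test()` in R performs. It does not use ESTYP's API: it relies on `statistics.variance` and `scipy.stats.f`, and the function name `variance_ratio_test` and the sample data are assumptions made for illustration.

```python
from statistics import variance

from scipy.stats import f


def variance_ratio_test(x, y):
    """Two-sided F test of H0: var(x) == var(y), assuming normal samples.

    Illustrative helper only; this is not ESTYP's `var_test` signature.
    """
    dfn, dfd = len(x) - 1, len(y) - 1
    # Ratio of the two sample variances (each with an n - 1 denominator).
    stat = variance(x) / variance(y)
    # Two-sided p-value: twice the smaller tail probability, capped at 1.
    p_value = min(1.0, 2 * min(f.cdf(stat, dfn, dfd), f.sf(stat, dfn, dfd)))
    return stat, p_value


x = [4.1, 5.3, 6.2, 5.8, 4.9, 5.5]  # made-up sample with low spread
y = [3.9, 7.4, 2.8, 8.1, 5.0, 6.6]  # made-up sample with high spread
stat, p_value = variance_ratio_test(x, y)
print(f"F = {stat:.3f}, p-value = {p_value:.3f}")
```

A ratio far from 1 together with a small p-value suggests that the two population variances differ.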
### `linear_model` module
* `linear_model.LogisticRegression()`: This class implements a logistic regression model. It is similar to the `LogisticRegression()` class from `scikit-learn`, but adds methods for calculating confidence intervals, p-values, and model summaries, like the `Logit` class in `statsmodels`.
* `linear_model.Stepwise()`: Provides an implementation of stepwise selection that adds or removes predictors based on their significance, AIC, or BIC in a model.
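As a rough illustration of what stepwise selection does, the sketch below runs forward selection by AIC on an ordinary least-squares model using only NumPy. It is not ESTYP's `Stepwise` implementation: the helpers `ols_aic` and `forward_select` and the simulated data are assumptions, and the AIC is computed only up to an additive constant.

```python
import numpy as np


def ols_aic(X, y):
    """AIC (up to an additive constant) of a least-squares fit with intercept."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + 2 * design.shape[1]


def forward_select(X, y):
    """Greedily add the column that lowers AIC most; stop when none does."""
    selected, remaining = [], list(range(X.shape[1]))
    best = ols_aic(X[:, selected], y)  # intercept-only model
    while remaining:
        scores = {j: ols_aic(X[:, selected + [j]], y) for j in remaining}
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best:
            break  # no candidate improves the criterion
        best = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)
    return selected


# Simulated data: only columns 0 and 2 actually drive the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=200)
print(forward_select(X, y))
```

Swapping the penalty `2 * k` for `log(n) * k` would give the BIC variant of the same procedure.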
### `cluster` module
* `cluster.NClusterSearch`: A helper class to identify the optimal number of clusters for clustering algorithms with the elbow or silhouette methods.
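The silhouette approach can be sketched with `scikit-learn` directly: fit the clustering algorithm for several candidate values of k and keep the one with the highest mean silhouette score. This is an illustration of the technique under stated assumptions, not ESTYP's `NClusterSearch` API; the helper `best_k_by_silhouette`, the use of `KMeans`, and the synthetic data are choices made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def best_k_by_silhouette(X, k_values):
    """Return (best_k, scores), where scores maps k to its mean silhouette."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores


# Synthetic data: three well-separated Gaussian blobs in two dimensions.
rng = np.random.default_rng(0)
centers = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(50, 2)) for c in centers])

best_k, scores = best_k_by_silhouette(X, range(2, 7))
print(best_k)
```

The elbow method follows the same loop but tracks the within-cluster sum of squares (`inertia_` in `KMeans`) and looks for the value of k where further increases stop paying off.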
## License
This library is under the MIT license.
## Contact
If you have any questions about this library, you can contact me on [LinkedIn](https://www.linkedin.com/in/estebanrucan/).