https://github.com/jose-jaen/bayesian-statistics
- Host: GitHub
- URL: https://github.com/jose-jaen/bayesian-statistics
- Owner: jose-jaen
- Created: 2022-10-04T11:13:31.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-01-30T20:26:24.000Z (over 3 years ago)
- Last Synced: 2025-03-06T17:49:22.074Z (over 1 year ago)
- Topics: bayesian-inference, bayesian-statistics, machine-learning, python, r, statistics
- Language: R
- Homepage:
- Size: 989 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Bayesian Statistics
Applying Bayesian inference to analytical problems.
# US Presidential Candidate Prediction
Given a tweet, we predict whether Trump or Clinton wrote it.
NLP techniques are used to prepare the unstructured tweet text for modeling. We compare how a Naive Bayes classifier performs under frequentist estimation and under Bayesian estimation (Laplace smoothing). We also apply a second text-analytics technique: TF-IDF weighting.
- [Paper](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Twitter%20US%20Candidate%20Prediction/Bayesian_Statistics__Tweet_Filter.pdf)
- [R Code](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Twitter%20US%20Candidate%20Prediction/NLP_Tweets.r)
- [Python Code](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Twitter%20US%20Candidate%20Prediction/NLP_tweets.py)
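
To illustrate the comparison described above, here is a minimal sketch, not the repository's code. It assumes a multinomial Naive Bayes over whitespace-split tokens, where `alpha=0` gives the frequentist relative-frequency estimates and `alpha=1` gives Laplace smoothing. The function names (`train_nb`, `predict_nb`, `tfidf`), the labels `author_a`/`author_b`, and the four toy documents are all hypothetical. The TF-IDF function uses one common variant, term frequency times `log(N / df)`; the paper may use a different weighting.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha):
    """Multinomial Naive Bayes. alpha=0: relative frequencies; alpha=1: Laplace smoothing."""
    vocab = sorted({w for d in docs for w in d.split()})
    classes = sorted(set(labels))
    prior = {c: labels.count(c) / len(labels) for c in classes}
    word_prob = {}
    for c in classes:
        counts = Counter(w for d, l in zip(docs, labels) if l == c for w in d.split())
        total = sum(counts.values())
        # Smoothed estimate: (count + alpha) / (total + alpha * |vocabulary|)
        word_prob[c] = {w: (counts[w] + alpha) / (total + alpha * len(vocab)) for w in vocab}
    return prior, word_prob


def predict_nb(model, doc):
    """Return (best class, per-class log scores). Words outside the vocabulary are skipped."""
    prior, word_prob = model
    scores = {}
    for c in prior:
        score = math.log(prior[c])
        for w in doc.split():
            if w not in word_prob[c]:
                continue
            p = word_prob[c][w]
            score += math.log(p) if p > 0 else float("-inf")
        scores[c] = score
    return max(scores, key=scores.get), scores


def tfidf(docs):
    """TF-IDF weights per document: (count / doc length) * log(N / document frequency)."""
    tokenized = [d.split() for d in docs]
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))
    return [
        {w: (c / len(toks)) * math.log(n / df[w]) for w, c in Counter(toks).items()}
        for toks in tokenized
    ]


# Made-up toy corpus, for illustration only (not taken from the tweet dataset).
docs = ["great wall deal", "great rally tonight",
        "health care families", "families together stronger"]
labels = ["author_a", "author_a", "author_b", "author_b"]
query = "great families tonight"

_, mle_scores = predict_nb(train_nb(docs, labels, alpha=0), query)
laplace_label, laplace_scores = predict_nb(train_nb(docs, labels, alpha=1), query)
weights = tfidf(docs)
```

With the unsmoothed estimates, a single word that never appeared under a class drives that class's score to negative infinity, so the query above cannot be scored for either author. Laplace smoothing keeps every probability strictly positive, and the classifier can then weigh the remaining evidence.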
# Conjugate Priors Simulation
Given a Gamma prior and exponentially distributed data points, we derive the marginal and predictive distributions of the data.
We also propose a mixture framework for combining prior beliefs.
- [Paper](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Bayesian%20Conjugate%20Priors/Bayesian_Statistics__Conjugate_Prior.pdf)
- [R Code](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Bayesian%20Conjugate%20Priors/Conjugate_priors.r)
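
The conjugate update described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's R code: the Gamma prior is taken to be on the exponential rate and parameterized by shape `a` and rate `b`, and the mixture prior is a finite weighted sum of Gamma components. All function names and numeric values are hypothetical.

```python
import math


def posterior(a, b, data):
    """Gamma(a, b) prior on the rate + exponential data -> Gamma(a + n, b + sum(x))."""
    return a + len(data), b + sum(data)


def log_marginal(a, b, data):
    """Log marginal density of the data with the rate integrated out:
    b^a * Gamma(a + n) / (Gamma(a) * (b + s)^(a + n)), where s = sum(x)."""
    n, s = len(data), sum(data)
    return a * math.log(b) + math.lgamma(a + n) - math.lgamma(a) - (a + n) * math.log(b + s)


def predictive_pdf(x, a, b):
    """Density of one new observation under Gamma(a, b) beliefs about the rate.
    This is a Lomax (Pareto type II) density: a * b^a / (b + x)^(a + 1)."""
    return a * b ** a / (b + x) ** (a + 1)


def mixture_posterior(components, data):
    """Update a mixture of Gamma priors given as (weight, a, b) tuples.
    Each weight is rescaled by its component's marginal likelihood, then normalized."""
    logs = [math.log(w) + log_marginal(a, b, data) for w, a, b in components]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    unnorm = [math.exp(v - top) for v in logs]
    z = sum(unnorm)
    return [(u / z, *posterior(a, b, data)) for u, (_, a, b) in zip(unnorm, components)]


# Made-up observations and prior settings, for illustration only.
data = [0.5, 1.5, 1.0]
a_post, b_post = posterior(2.0, 1.0, data)
density_at_one = predictive_pdf(1.0, a_post, b_post)  # posterior predictive at x = 1
updated = mixture_posterior([(0.5, 2.0, 1.0), (0.5, 20.0, 1.0)], data)
```

Calling `predictive_pdf` with the prior parameters gives the prior predictive density, and calling it with the output of `posterior` gives the posterior predictive density. In the mixture update, the component whose prior is more compatible with the observed data gains weight.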
# Predicting Heart Disease
A logistic regression, a type of Generalized Linear Model, is fitted to predict whether a patient has heart disease.
Features are selected with L1 regularization (LASSO), and frequentist and Bayesian inference are then compared on the resulting model.
- [Paper](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Generalized%20Linear%20Models/Bayesian_Statistics__Regression.pdf)
- [R Code](https://github.com/jose-jaen/Bayesian-Statistics/blob/main/Generalized%20Linear%20Models/Bayesian_GLS.r)
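
The workflow above (select features with an L1 penalty, then compare frequentist and Bayesian fits) can be sketched with the standard library alone. Everything below is an assumption made for illustration and may differ from the repository's R code: the L1 fit uses proximal gradient descent, the frequentist fit is an unpenalized refit, and the Bayesian fit uses a random-walk Metropolis sampler with independent Normal(0, `prior_sd`²) priors on the coefficients. The toy data is made up and is not the heart disease dataset.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear(beta, row):
    """Linear predictor; beta[0] is the intercept."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def log_lik(beta, X, y):
    """Bernoulli log-likelihood: sum of y*z - log(1 + exp(z))."""
    total = 0.0
    for row, label in zip(X, y):
        z = linear(beta, row)
        total += label * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return total


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def lasso_logistic(X, y, lam, step=0.1, iters=2000):
    """Proximal gradient descent on mean log-loss + lam * (L1 norm of the slopes)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for row, label in zip(X, y):
            err = (sigmoid(linear(beta, row)) - label) / n
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        beta[0] -= step * grad[0]  # the intercept is not penalized
        for j in range(1, p + 1):
            beta[j] = soft_threshold(beta[j] - step * grad[j], step * lam)
    return beta


def metropolis(X, y, prior_sd=2.0, scale=0.3, draws=3000, seed=0):
    """Random-walk Metropolis draws from the posterior under Normal(0, prior_sd^2) priors."""
    rng = random.Random(seed)

    def log_post(b):
        return log_lik(b, X, y) - sum(v * v for v in b) / (2 * prior_sd ** 2)

    current = [0.0] * (len(X[0]) + 1)
    current_lp = log_post(current)
    samples = []
    for _ in range(draws):
        proposal = [v + rng.gauss(0.0, scale) for v in current]
        lp = log_post(proposal)
        if rng.random() < math.exp(min(0.0, lp - current_lp)):
            current, current_lp = proposal, lp
        samples.append(current)
    return samples


# Made-up toy data: x1 carries the signal, x2 is alternating noise.
X = [[(i - 9.5) / 5.0, (-1.0) ** i] for i in range(20)]
y = [1 if i >= 10 else 0 for i in range(20)]
y[8], y[12] = 1, 0  # two flipped labels so the classes are not perfectly separable

beta_l1 = lasso_logistic(X, y, lam=0.2)
selected = [j for j, b in enumerate(beta_l1[1:]) if b != 0.0]
X_sel = [[row[j] for j in selected] for row in X]
beta_freq = lasso_logistic(X_sel, y, lam=0.0)  # unpenalized refit on the selected features
draws = metropolis(X_sel, y)[500:]  # discard the first 500 draws as burn-in
beta_bayes = [sum(d[k] for d in draws) / len(draws) for k in range(len(draws[0]))]
```

Here `beta_freq` is a point estimate, while `draws` approximates the full posterior, so credible intervals can be read off the sampled values. The prior standard deviation, proposal scale, and burn-in length are arbitrary choices that would need tuning on real data.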