https://github.com/yoyolicoris/ml_hw2

My implementation of homework 2 for the Machine Learning class in NCTU (course number 5088).
# Homework 2 for Machine Learning

## Environment

* Ubuntu 16.04 LTS
* Python 2.7.12 (using PyCharm 2017.2.4)
* Extra modules: numpy, scipy, pandas

## Usage of each file

### Bayesian Inference for Gaussian (HW 1-2)

For task 1-2, type the following command:

```
python bayesian_inference.py 1_data.mat
```
It will find the MAP solution of the covariance matrix and compare it to the true covariance (computed with `numpy.cov`).
The output looks like the following:
```
the true covariance is
[[ 0.30082961 0.39309777]
[ 0.39309777 0.89266987]]
< N = 10 >
the MAP solution of covariance is
[[ 0.72927715 1.06310276]
[ 1.06310276 2.47424115]]
error of MAP solution is 0.837507199995
< N = 100 >
the MAP solution of covariance is
[[ 0.36316943 0.45461111]
[ 0.45461111 1.03532019]]
error of MAP solution is 0.0820042028936
< N = 500 >
the MAP solution of covariance is
[[ 0.31936913 0.39865164]
[ 0.39865164 0.88775764]]
error of MAP solution is 0.00863987484804
```
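For reference, here is a minimal sketch of this kind of MAP estimate, assuming a conjugate inverse-Wishart prior; the prior parameters `psi` and `nu` below are illustrative defaults, not necessarily what the script uses:

```
import numpy as np

def map_covariance(X, psi=None, nu=None):
    """MAP estimate of a Gaussian covariance under an inverse-Wishart
    prior IW(psi, nu): the posterior mode is (psi + S) / (nu + N + p + 1),
    where S is the scatter matrix around the sample mean."""
    N, p = X.shape
    if psi is None:
        psi = np.eye(p)   # weak prior scale (illustrative assumption)
    if nu is None:
        nu = p + 2        # smallest df with a finite prior mean (assumption)
    centered = X - X.mean(axis=0)
    S = centered.T.dot(centered)   # scatter matrix
    return (psi + S) / (nu + N + p + 1)
```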
I also tried another approach, which randomly generates multiple covariance matrices, computes their posterior probabilities, and picks the one with the maximum.

```
python bayesian_inference_random_generate.py 1_data.mat

#the output format
----testing----
the approximated MAP solution of covariance is
[[ 0.31907513 0.39842093]
[ 0.39842093 0.88619125]]
the true covariance is
[[ 0.30082961 0.39309777]
[ 0.39309777 0.89266987]]
error of approximated MAP solution is 0.00884261234455
the posterior probability is 15.4547626093
```
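The random-generation idea can be sketched as follows; drawing candidates from a Wishart distribution centred on the sample covariance, and the inverse-Wishart prior, are my assumptions about plausible choices, not necessarily what the script does:

```
import numpy as np
from scipy.stats import wishart, invwishart, multivariate_normal

def random_search_map(X, n_candidates=1000, seed=0):
    """Randomly generate candidate covariance matrices and keep the one
    with the highest (unnormalized) log posterior."""
    rng = np.random.RandomState(seed)
    N, p = X.shape
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False)   # centre the proposals on the sample covariance
    best, best_logp = None, -np.inf
    for _ in range(n_candidates):
        cand = wishart.rvs(df=N, scale=S / N, random_state=rng)
        loglik = multivariate_normal.logpdf(X, mean=mu, cov=cand).sum()
        logprior = invwishart.logpdf(cand, df=p + 2, scale=np.eye(p))
        if loglik + logprior > best_logp:
            best, best_logp = cand, loglik + logprior
    return best, best_logp
```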

### Bayesian Linear Regression (HW 2-2, 2-3)

For task 2-2, type the following command:
```
python bayesian_linear_regression.py 2_data.mat
```
This will plot four images similar to Fig. 3.9 in the textbook for different N, like the following image:
![](images/blr_1.png)
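The closed-form posterior behind these plots is standard Bayesian linear regression; a minimal sketch follows, where the precision values `alpha` and `beta` are placeholders, not the repo's settings:

```
import numpy as np

def posterior_over_weights(Phi, t, alpha=2.0, beta=25.0):
    """Posterior over weights for Bayesian linear regression:
    prior w ~ N(0, alpha^-1 I), Gaussian noise with precision beta.
    S_N^-1 = alpha*I + beta*Phi^T Phi,  m_N = beta * S_N Phi^T t."""
    M = Phi.shape[1]
    S_N = np.linalg.inv(alpha * np.eye(M) + beta * Phi.T.dot(Phi))
    m_N = beta * S_N.dot(Phi.T).dot(t)
    return m_N, S_N
```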

For task 2-3, type the following command:
```
python bayesian_lin_regress_show_region.py 2_data.mat
```
This will plot four images similar to Fig. 3.8 in the textbook for different N, like the following image:
![](images/blr_2.png)
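The shaded region in these plots comes from the predictive distribution; continuing the sketch above, its mean and variance at a new feature vector are:

```
def predictive_distribution(phi_x, m_N, S_N, beta=25.0):
    """Predictive mean and variance at feature vector phi_x:
    mean = m_N^T phi(x),  var = 1/beta + phi(x)^T S_N phi(x).
    The Fig. 3.8-style region is mean +/- one standard deviation."""
    mean = m_N.dot(phi_x)
    var = 1.0 / beta + phi_x.dot(S_N).dot(phi_x)
    return mean, var
```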

### Logistic Regression (HW 3-1~3, 3-5, 3-6)

For tasks 3-1 and 3-2, type the following command:
```
python logistic_regression.py train.csv test.csv
```
This will first plot the cross entropy and accuracy versus the number of epochs, like the following image:
![](images/logistic_1.png)

And the classification result:
```
The test data classification result is :
[2 2 2 1 2 2 2 2 2 0 0 0 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0]
```
Each number is the predicted class, given as the column index of the target.
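A minimal sketch of multiclass logistic regression of this kind, using batch gradient descent on the cross entropy (the learning rate and epoch count are placeholders, not the repo's values):

```
import numpy as np

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)   # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def train_multiclass_logistic(X, T, lr=0.01, epochs=1000):
    """Batch gradient descent on the multiclass cross entropy.
    X: (N, D) features, T: (N, K) one-hot targets."""
    W = np.zeros((X.shape[1], T.shape[1]))
    for _ in range(epochs):
        Y = softmax(X.dot(W))                # class probabilities
        W -= lr * X.T.dot(Y - T) / len(X)    # cross-entropy gradient
    return W

# Predicted class per test row, as printed above:
# softmax(X_test.dot(W)).argmax(axis=1)
```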

For task 3-3, type the following command:
```
python plot_training_data.py train.csv
```

This will create a series of images plotting the distribution of each variable, each named with its column index:
![](images/hist_1.png)
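A sketch of such per-column histograms, assuming pandas for loading and matplotlib for plotting (matplotlib is not in the module list above, so that is an assumption):

```
import pandas as pd
import matplotlib.pyplot as plt

def plot_histograms(csv_path):
    """Save one histogram per feature column, named by column index."""
    df = pd.read_csv(csv_path)
    for i, col in enumerate(df.columns):
        plt.figure()
        df[col].hist(bins=30)
        plt.title('column %d' % i)
        plt.savefig('hist_%d.png' % i)
        plt.close()
```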

For tasks 3-5 and 3-6, type the following command:
```
python most_contributive_variable.py train.csv test.csv
```

It will first display the following message and plot the distribution of the two variables:
```
The most contributive variable pair is column 3 and 12
```
![](images/3&12.png)
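The README does not say how the pair is chosen; one plausible selection rule (an assumption, reusing `train_multiclass_logistic` and `softmax` from the sketch above) is to train a two-feature classifier on every column pair and keep the pair with the best training accuracy:

```
from itertools import combinations
import numpy as np

def most_contributive_pair(X, T, lr=0.01, epochs=500):
    """Exhaustively score every pair of feature columns by the training
    accuracy of a two-feature classifier; return the best pair."""
    labels = T.argmax(axis=1)
    best_pair, best_acc = None, -1.0
    for i, j in combinations(range(X.shape[1]), 2):
        W = train_multiclass_logistic(X[:, [i, j]], T, lr, epochs)
        acc = np.mean(softmax(X[:, [i, j]].dot(W)).argmax(axis=1) == labels)
        if acc > best_acc:
            best_pair, best_acc = (i, j), acc
    return best_pair
```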

It then redoes tasks 3-1 and 3-2 using only these two variables.
The test data classification result is:
```
[2 2 2 0 2 2 2 2 2 2 0 1 0 1 1 0 1 1 2 0 1 1 1 0 0 0 1 1 1]
```