https://github.com/coderham/data558-machinelearning
Polished python code required for one of the assignments for DATA558 - Statistical Machine Learning For Data Scientists at University of Washington.
- Host: GitHub
- URL: https://github.com/coderham/data558-machinelearning
- Owner: CoderHam
- Created: 2018-05-31T17:18:31.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-06-01T08:44:52.000Z (about 8 years ago)
- Last Synced: 2025-02-19T05:17:32.911Z (over 1 year ago)
- Language: Python
- Size: 15.9 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# DATA558 - Machine Learning
Polished python code required for one of the assignments for DATA558 - Statistical Machine Learning For Data Scientists at University of Washington.
## Linear Support Vector Machine with Squared Hinge Loss (Classification)
The model is implemented in [linear_svm.py](models/linear_svm.py). It is trained with fast gradient descent using a backtracking line search, and it exposes sklearn-style `.fit` and `.predict` functions. Cross-validation is used to choose the value of the regularization parameter.
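The training procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code in `models/linear_svm.py`: it assumes labels in {-1, +1}, an objective of the form mean squared hinge loss plus `lam * ||beta||^2`, and that "fast gradient" means the accelerated (momentum) variant of gradient descent. The function names and default constants are hypothetical.

```python
import numpy as np


def objective(beta, X, y, lam):
    """Mean squared hinge loss plus an L2 penalty (assumed form of the objective)."""
    margins = np.maximum(0.0, 1.0 - y * (X @ beta))
    return np.mean(margins ** 2) + lam * beta @ beta


def gradient(beta, X, y, lam):
    """Gradient of the objective above with respect to beta."""
    margins = np.maximum(0.0, 1.0 - y * (X @ beta))
    return -2.0 / len(y) * (X.T @ (y * margins)) + 2.0 * lam * beta


def backtracking(beta, X, y, lam, eta, alpha=0.5, shrink=0.8, max_iter=100):
    """Shrink the step size eta until the sufficient-decrease condition holds."""
    grad = gradient(beta, X, y, lam)
    f0 = objective(beta, X, y, lam)
    for _ in range(max_iter):
        if objective(beta - eta * grad, X, y, lam) <= f0 - alpha * eta * (grad @ grad):
            break
        eta *= shrink
    return eta


def fast_gradient(X, y, lam, eta_init=1.0, n_iter=200):
    """Accelerated gradient descent; the step size comes from backtracking."""
    beta = np.zeros(X.shape[1])
    theta = beta.copy()
    eta = eta_init
    for t in range(n_iter):
        eta = backtracking(theta, X, y, lam, eta)
        beta_next = theta - eta * gradient(theta, X, y, lam)
        # Momentum step: extrapolate along the direction of the last update.
        theta = beta_next + t / (t + 3) * (beta_next - beta)
        beta = beta_next
    return beta


def predict(beta, X):
    """Classify by the sign of the linear score."""
    return np.where(X @ beta >= 0, 1, -1)
```

The squared hinge loss is differentiable everywhere, which is why plain gradient-based methods like this one apply without subgradients.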
## Demos
To test on the Spam dataset (from the book *The Elements of Statistical Learning*), a binary classification problem:
```
python3 demo_spam.py
```
To test on the Vowel dataset (from the book *The Elements of Statistical Learning*), where the multiclass classifier is built from binary classifiers in one-vs-one style:
```
python3 demo_vowel.py
```
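The one-vs-one construction used for the Vowel demo can be sketched as below. This is an illustration only: `fit_binary` is a hypothetical least-squares stand-in for the binary learner, not the repository's SVM, and all function names are assumptions. With k classes the scheme trains k(k-1)/2 pairwise classifiers and predicts by majority vote.

```python
from itertools import combinations

import numpy as np


def fit_binary(X, y):
    """Stand-in binary learner (least squares on +/-1 labels), NOT the repo's SVM."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def fit_one_vs_one(X, labels, fit=fit_binary):
    """Train one binary classifier per unordered pair of classes."""
    models = {}
    for a, b in combinations(np.unique(labels), 2):
        mask = (labels == a) | (labels == b)
        y = np.where(labels[mask] == a, 1.0, -1.0)  # class a -> +1, class b -> -1
        models[(a, b)] = fit(X[mask], y)
    return models


def predict_one_vs_one(models, X):
    """Each pairwise classifier casts one vote per sample; the most-voted class wins."""
    classes = sorted({c for pair in models for c in pair})
    index = {c: i for i, c in enumerate(classes)}
    votes = np.zeros((X.shape[0], len(classes)), dtype=int)
    for (a, b), beta in models.items():
        wins_a = X @ beta >= 0
        votes[wins_a, index[a]] += 1
        votes[~wins_a, index[b]] += 1
    return np.array(classes)[votes.argmax(axis=1)]
```

Ties in the vote are broken here by `argmax`, which picks the lowest class index; other tie-breaking rules are equally plausible.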
To test on a simulated (custom-generated) dataset with a binary classifier. As a bonus, this demo also compares performance with sklearn:
```
python3 demo_simulated.py
```
To compare the custom implementation with sklearn on the Spam dataset (real-world data), again as a binary classifier:
```
python3 compare_spam.py
```
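For reference, a comparable scikit-learn model might be set up as in the sketch below. It is not taken from `compare_spam.py`. It assumes the custom objective is mean squared hinge loss plus `lam * ||beta||^2` with no separate intercept, in which case scikit-learn's `LinearSVC` penalty corresponds to `C = 1 / (2 * n * lam)`. The helper name `sklearn_squared_hinge` is hypothetical.

```python
import numpy as np
from sklearn.svm import LinearSVC


def sklearn_squared_hinge(X, y, lam):
    """Fit LinearSVC with squared hinge loss and a penalty matched to lam.

    Assumes the custom objective is mean(squared hinge) + lam * ||beta||^2.
    LinearSVC minimizes 0.5 * ||w||^2 + C * sum(squared hinge), so the two
    agree up to a constant factor when C = 1 / (2 * n * lam).
    """
    n = X.shape[0]
    clf = LinearSVC(
        loss="squared_hinge",
        penalty="l2",
        C=1.0 / (2.0 * n * lam),
        fit_intercept=False,  # assumption: the custom model has no separate intercept
        tol=1e-8,
        max_iter=10000,
    )
    return clf.fit(X, y)
```

If the custom model does include an intercept or scales its loss differently, the mapping between `lam` and `C` changes accordingly.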
## Usage
```python
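from models import LinearSVM

# train_features, train_labels and test_features are your own arrays
LSVM = LinearSVM()
# fit returns the learned weights, which predict takes as its first argument
weights = LSVM.fit(train_features, train_labels)
test_predictions = LSVM.predict(weights, test_features)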
```
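Choosing the regularization parameter by cross-validation with a fit/predict pair of this shape could look roughly like the sketch below. It is self-contained and does not use the repository's class: `fit_ridge` is a hypothetical stand-in learner, and the assumption that the fit function accepts the regularization strength as an argument is not confirmed by the source.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Stand-in learner (ridge regression on +/-1 labels), NOT the repo's LinearSVM."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def predict(weights, X):
    """Classify by the sign of the linear score."""
    return np.where(X @ weights >= 0, 1, -1)


def cross_validate(X, y, lambdas, fit=fit_ridge, k=5, seed=0):
    """Return the lambda with the lowest mean k-fold misclassification error."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    mean_errors = []
    for lam in lambdas:
        errors = []
        for i in range(k):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            weights = fit(X[train], y[train], lam)
            errors.append(np.mean(predict(weights, X[val]) != y[val]))
        mean_errors.append(float(np.mean(errors)))
    best = int(np.argmin(mean_errors))
    return lambdas[best], mean_errors
```

Any learner with the same `fit(X, y, lam)` signature can be passed in place of the stand-in.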
## Data
The data is present in the __data__ folder and can also be downloaded from https://web.stanford.edu/~hastie/ElemStatLearn/data.html. There are a few other datasets available there to play around with.
## Required Libraries (Python 3)
- numpy
- scikit-learn (imported as `sklearn`)
- scipy
- matplotlib
- pandas