https://github.com/mfl28/machinelearning
A collection of Jupyter notebooks and a Python library for Machine Learning projects.
detection jupyter-notebooks kaggle-competition machine-learning machine-learning-library nbviewer notebook python pytorch
- Host: GitHub
- URL: https://github.com/mfl28/machinelearning
- Owner: mfl28
- License: mit
- Created: 2019-07-30T06:25:11.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2022-12-27T08:28:34.000Z (almost 3 years ago)
- Last Synced: 2023-03-03T19:12:04.448Z (over 2 years ago)
- Topics: detection, jupyter-notebooks, kaggle-competition, machine-learning, machine-learning-library, nbviewer, notebook, python, pytorch
- Language: Jupyter Notebook
- Homepage:
- Size: 74.7 MB
- Stars: 6
- Watchers: 0
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Machine Learning
This repo contains a compilation of machine learning projects in the form of Jupyter notebooks. Some notebooks need additional data, such as bounding box annotation files; these files can be found in the *data* folder. [PyTorch](https://pytorch.org/) is used as the underlying library for projects involving deep learning.

## `mltools` Library
This is a Python library that contains useful classes and functions for machine learning and data science tasks, such as feature exploration, object detection, classification, and semantic segmentation using PyTorch.

## How to open notebooks using Docker
**Requirements:** [Docker](https://www.docker.com/get-started), [docker-compose](https://docs.docker.com/compose/install/)

The repo provides a [Dockerfile](Dockerfile) and [docker-compose.yml](docker-compose.yml) to create a Docker container that starts a Jupyter Notebook server (using [docker-stacks](https://github.com/jupyter/docker-stacks)) and allows you to open the notebooks without having to install the requirements on your system. The steps to do this are:

1. Clone the repo:
```bash
git clone https://github.com/mfl28/MachineLearning.git
cd MachineLearning
```
2. Build the image and start the container using `docker-compose up`.
3. Copy the URL shown in the terminal to your browser's address bar and replace the internal port (`8888`) with the mapped host port `10000`.
4. When you are done, you can shut down the server from the terminal using `CTRL-C` and remove the created Docker container using `docker-compose down`.

## Notebooks
### Semantic Segmentation
#### Kaggle Competition: Dstl Satellite Imagery Feature Detection ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Dstl_Satellite_Imagery_Feature_Detection.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Dstl_Satellite_Imagery_Feature_Detection.ipynb), [Colab](https://colab.research.google.com/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Dstl_Satellite_Imagery_Feature_Detection.ipynb))
A notebook showing how to perform semantic segmentation using a fully convolutional neural network. Our aim is to locate buildings in satellite images from the [Kaggle Dstl Satellite Imagery Feature Detection Challenge](https://www.kaggle.com/c/dstl-satellite-imagery-feature-detection).
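To illustrate the idea of a fully convolutional network, here is a minimal PyTorch sketch. It is not the architecture used in the notebook: the class name `TinyFCN`, the layer sizes, and the random input tensor are all assumptions made for illustration. The point is only that the network contains no fully connected layers, so it outputs one logit per input pixel.

```python
import torch
import torch.nn.functional as F
from torch import nn


class TinyFCN(nn.Module):
    """Hypothetical toy fully convolutional network (not the notebook's model)."""

    def __init__(self, in_channels=3, num_classes=1):
        super().__init__()
        # Two conv + pool stages reduce the spatial resolution by a factor of 4.
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
        )
        # A 1x1 convolution acts as a per-pixel classifier.
        self.classifier = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        logits = self.classifier(self.encoder(x))
        # Upsample the coarse logits back to the input resolution.
        return F.interpolate(
            logits, size=x.shape[-2:], mode="bilinear", align_corners=False
        )


model = TinyFCN().eval()
images = torch.randn(2, 3, 64, 64)  # made-up batch standing in for image tiles
with torch.no_grad():
    logits = model(images)
# Threshold the per-pixel probabilities to obtain a binary (e.g. building) mask.
masks = torch.sigmoid(logits) > 0.5
```

Because every layer is convolutional, the same model accepts inputs of other spatial sizes without modification.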
### Object Detection
#### Humpback Whale Fluke Detection ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/Humpback_Whale_Fluke_Detection.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/Humpback_Whale_Fluke_Detection.ipynb), [Colab](https://colab.research.google.com/github/mfl28/MachineLearning/blob/master/notebooks/Humpback_Whale_Fluke_Detection.ipynb))
A notebook showing how to perform object detection with a custom dataset using a pre-trained and subsequently fine-tuned neural network. Specifically, the aim is to detect and locate humpback whale flukes in images from the [Kaggle Humpback Whale Identification Challenge](https://www.kaggle.com/c/humpback-whale-identification). The ground truth bounding box labels for a selection of 800 images from the training dataset provided by the challenge were created using [Bounding Box Editor](https://github.com/mfl28/BoundingBoxEditor).
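A common way to compare a predicted bounding box with a ground truth label is intersection over union (IoU). The following plain-Python sketch is not taken from the notebook or from `mltools`; the function name `box_iou` and the `(xmin, ymin, xmax, ymax)` box layout are assumptions chosen for illustration.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle (if there is any overlap).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so that disjoint boxes yield an empty intersection.
    intersection = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Made-up example boxes: a ground truth label and a shifted prediction.
ground_truth = (10, 10, 50, 50)
prediction = (30, 30, 70, 70)
iou = box_iou(ground_truth, prediction)  # 400 / 2800, i.e. 1/7
```

An IoU of 1.0 means the boxes coincide, and 0.0 means they do not overlap at all.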
#### VOCXMLDataset Demo ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/VOCXMLDataset_Demo.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/VOCXMLDataset_Demo.ipynb))
A notebook showcasing the use of the `VOCXMLDataset` class from `mltools.detection.datasets` using images and annotations from the [VOC2012 dataset](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/) for demonstrations.
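For readers unfamiliar with the Pascal VOC annotation layout, the sketch below reads bounding boxes from such an XML file using only the Python standard library. It does not use or reproduce the `VOCXMLDataset` API; the helper `parse_voc_annotation` and the sample annotation (file name, labels, coordinates) are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC XML layout (all values are made up).
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>500</width><height>375</height><depth>3</depth></size>
    <object>
        <name>dog</name>
        <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
    </object>
    <object>
        <name>person</name>
        <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>370</ymax></bndbox>
    </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return the image file name and a list of labelled boxes."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text; some tools write them as floats.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append({"label": obj.findtext("name"), "box": coords})
    return {"filename": root.findtext("filename"), "boxes": boxes}


annotation = parse_voc_annotation(SAMPLE_XML)
for entry in annotation["boxes"]:
    print(annotation["filename"], entry["label"], entry["box"])
```

A dataset class for object detection typically pairs each parsed annotation with its image and converts the boxes to tensors.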
### Classification
#### Kaggle Competition: Humpback Whale Identification ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Whale_Identification.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Whale_Identification.ipynb), [Colab](https://colab.research.google.com/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Whale_Identification.ipynb))
In this notebook we'll train a classifier to identify humpback whales in images according to the [Kaggle Humpback Whale Identification Challenge](https://www.kaggle.com/c/humpback-whale-identification). We'll use the [fast.ai](https://github.com/fastai/fastai) deep learning library to perform this task.

#### Kaggle Competition: MNIST Digit Recognizer ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Mnist_Digit_Recognizer.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Mnist_Digit_Recognizer.ipynb))
A notebook showing how to train a convolutional neural network object classifier for the MNIST Dataset from the [Kaggle MNIST Digit Recognizer competition](https://www.kaggle.com/c/digit-recognizer). The aim is to predict hand-drawn digits in images as accurately as possible.
#### Kaggle Competition: Titanic - Machine Learning from Disaster ([notebook](https://github.com/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Titanic_Machine_Learning_From_Disaster.ipynb), [nbviewer](https://nbviewer.jupyter.org/github/mfl28/MachineLearning/blob/master/notebooks/Kaggle_Titanic_Machine_Learning_From_Disaster.ipynb))
The aim of this notebook is to build a model that can predict the survival of passengers of the Titanic. The problem and data come from the [Kaggle Titanic: Machine Learning from Disaster competition](https://www.kaggle.com/c/titanic). We start with an exploration and visualization of the provided features, then proceed to build a feature engineering [Pipeline](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) using [scikit-learn](https://scikit-learn.org/stable/index.html). Finally, we'll experiment with several machine learning approaches to solve the prediction problem.
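As a rough idea of what a scikit-learn feature engineering pipeline can look like, here is a small sketch. It is not the pipeline from the notebook: the chosen columns, the imputation strategies, the logistic regression classifier, and the six made-up passenger rows are assumptions for illustration. Only the column names (`Age`, `Fare`, `Sex`, `Embarked`) follow the Kaggle data set.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up rows with missing values, standing in for the real training data.
passengers = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0, 54.0, 2.0],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 21.08],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Embarked": ["S", "C", "S", np.nan, "S", "S"],
})
survived = [0, 1, 1, 1, 0, 0]

# Numeric columns: fill gaps with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", categorical, ["Sex", "Embarked"]),
])

# Preprocessing and classifier are fitted together as a single estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("classify", LogisticRegression()),
])
model.fit(passengers, survived)
predictions = model.predict(passengers)
```

Wrapping the preprocessing and the classifier in one `Pipeline` means the same transformations are applied at training and prediction time, and the final step can be swapped for another estimator when comparing approaches.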