https://github.com/szymonrucinski/x-ray

Image classification system for detection of pulmonary diseases.

Host: GitHub
URL: https://github.com/szymonrucinski/x-ray
Owner: szymonrucinski
Created: 2021-04-14T09:50:55.000Z (about 4 years ago)
Default Branch: master
Last Pushed: 2021-06-12T21:38:42.000Z (about 4 years ago)
Last Synced: 2025-01-08T21:22:12.628Z (5 months ago)
Language: Jupyter Notebook
Size: 1.82 MB
Stars: 0
Watchers: 1
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: Readme.md


Objective
===

The aim of this task is to prepare a program that classifies chest X-ray images into those of a healthy patient and those showing a patient with pneumonia.

Introduction
============

The aim of the implementation is to train a model using a convolutional neural network. The model is used to classify an X-ray (Roentgen) image and label the patient as either healthy or suffering from pneumonia.

Description of implementation
==================

The task was implemented in the Python 3 programming language. The model was trained using the PyTorch machine learning library. Training was based on the ready-made ResNet50 architecture, to which 2 new layers were added.

Data sets
=============

To train the model we used the publicly available collection [*Chest X-Ray Images (Pneumonia)*](https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia). The collection is already split into separate training, test and validation sets. There are 2 classes in the collection: diseased lungs and healthy lungs. The table below shows the sizes of the training, validation and test sets.

| Dataset        | Normal | Pneumonia |
|----------------|--------|-----------|
| Training set   | 1341   | 3875      |
| Test set       | 234    | 390       |
| Validation set | 8      | 8         |




Sample images from the data set are shown below:

![Fragments of the training set](images/Capture.PNG)

As you can see, the number of images in each class is not balanced. To compensate for this, each class was assigned an appropriate weight in the loss function.

```python
import torch
import torch.nn as nn

# Weight the smaller class more heavily in the loss.
weights = [3, 1]
class_weights = torch.FloatTensor(weights).cuda()
criterion = nn.CrossEntropyLoss(weight=class_weights)
```
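
The weights `[3, 1]` are roughly the ratio between the two class sizes. A minimal sketch of that arithmetic, assuming the training-set sizes 1341 and 3875 from the table with the smaller class listed first (the helper `inverse_frequency_weights` is hypothetical and not part of the repository):

```python
def inverse_frequency_weights(counts):
    """Return one weight per class, scaled so the largest class has weight 1."""
    largest = max(counts)
    return [largest / count for count in counts]


# Training-set sizes from the table, smaller class first (assumed order).
counts = [1341, 3875]
raw = inverse_frequency_weights(counts)
rounded = [round(w) for w in raw]

print(raw)      # first entry is about 2.89, second is 1.0
print(rounded)  # [3, 1]
```

Rounding to whole numbers is a simplification; the unrounded ratios could be passed to the loss as well.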

Training description
=============

I decided to choose an off-the-shelf architecture called ResNet50. It is very often used to solve image classification problems and is quite popular. The first step was to add two new layers to the existing ones: a dropout layer and a linear layer. Only these were trained; the rest of the layers were frozen.

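```python
# Assumed: num_features is the input size of the original ResNet50 head.
num_features = model.fc.in_features
# New head: dropout followed by a 2-class linear layer.
model.fc = nn.Sequential(
    nn.Dropout(0.5),
    nn.Linear(num_features, 2))
```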

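Next, we freeze all the layers except the new head:

```python
# Freeze every parameter of the network...
for param in model.parameters():
    param.requires_grad = False
# ...then re-enable gradients for the new head only.
for param in model.fc.parameters():
    param.requires_grad = True
```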

This method can significantly reduce the training time, because it reuses the previously trained layers of the model.

```python
23,512,130 total parameters. 4,098 training parameters.
```
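
The count of 4,098 trainable parameters can be checked by hand: a linear layer has one weight per input/output pair plus one bias per output. A small sketch of that arithmetic, assuming the ResNet50 head receives 2048 input features (a property of the standard architecture, not stated in this README):

```python
num_features = 2048   # assumption: input width of the standard ResNet50 head
num_classes = 2       # healthy vs. pneumonia

# Dropout has no parameters, so only the linear layer contributes.
trainable = num_features * num_classes + num_classes
total = 23_512_130    # total reported in the output above
frozen = total - trainable

print(f"{trainable:,} trainable, {frozen:,} frozen")
# prints "4,098 trainable, 23,508,032 frozen"
```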

The number of parameters to be trained has been significantly reduced. For the model prepared in this way we start the first training run. The selected training parameters are listed below:

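```python
learning_rate = 2e-3
num_epochs = 100
patience = 20
batch_size = 80
# Class-weighted loss to compensate for the imbalanced data set.
criterion = nn.CrossEntropyLoss(weight=class_weights)
# Frozen parameters receive no gradients, so only the new head is updated.
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
```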
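
The `patience` value suggests early stopping: training ends once the validation loss has not improved for that many epochs. How the repository implements this is not shown here, so the following is only a sketch of the general idea; the class `EarlyStopping` and the sample losses are hypothetical:

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without a lower validation loss."""

    def __init__(self, patience):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses; patience is shortened to 3 to keep the demo small.
stopper = EarlyStopping(patience=3)
losses = [0.60, 0.45, 0.40, 0.41, 0.43, 0.42, 0.39]
stopped_at = None
for epoch, loss in enumerate(losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_loss)  # 6 0.4
```

In this toy run the loss last improves in epoch 3, so the stop is signalled three epochs later and the final value of 0.39 is never reached.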

The graphs for the first training run show that the model reached its best result after 5 epochs and then stopped improving. In the last stage, training was carried out with the weights unfrozen. After 5 epochs no further progress was observed.

![Model training run](images/training.png)

Results
======

The graph shows the accuracy of the model over the whole training cycle, first with frozen and then with unfrozen weights. After training, the model was evaluated on the test set, where it reached a final accuracy of **87.77%**.
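
Accuracy here is the share of test images whose predicted class matches the label. A minimal sketch of that calculation on made-up labels (the function `accuracy` and the sample data are illustrative, not the repository's evaluation code):

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that equal their label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical outputs: 0 = healthy, 1 = pneumonia.
labels      = [0, 0, 1, 1, 1, 1, 0, 1]
predictions = [0, 1, 1, 1, 1, 0, 0, 1]

print(f"{accuracy(predictions, labels):.2%}")  # 75.00%
```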

Summary
============

In the end the model achieved an accuracy of **87.77%**. It is able to identify from the image whether the patient has healthy or diseased lungs. The use of transfer learning with pre-trained weights significantly shortened the training time of the model. I think that, in order to achieve a result closer to 99%, it would be necessary to enrich the set with more lung images of healthy patients, because there is a disproportion between the classes.
