https://github.com/jacekkala/food_classification_cnn

Food & Beverage Multiclass Image Classification with Convolutional Neural Network (CNN)

cnn-classification grad-cam keras-tensorflow python

# 🍔 Food & Beverage Classification using CNN

## 📌 Project Overview
This project classifies **food & beverage images** using a **Convolutional Neural Network (CNN)**. The dataset consists of **9323 training images** and **484 test images** across **61 classes** (e.g., water, pizza-margherita-baked, broccoli, salad, egg).

### 🔹 Objective
- Train a **CNN model** to classify food items from images.
- Improve **generalization** using **data augmentation**.
- Monitor **training progress** with validation curves.
- Provide **model interpretability** using Grad-CAM visualizations.

---

## 📂 Dataset Details
### 📍 **Directories**
```
data/
├── training_set_128/ # 9323 images (train + validation)
├── test_set_128/ # 484 images (unlabeled test data)
loss_accuracy_curves/ # images for some of the models tested
├── accuracy/
├── loss/
saved_models/ # best model trained
images/ # for README
```

### 📍 **Example Classes & Image Distribution**
| Class | Number of Images |
|-------|-----------------|
| Water | 863 |
| Bread-White | 595 |
| Salad-Leaf | 535 |
| Pickle | 28 |
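
A table like the one above can be produced by counting files per class. The sketch below assumes one subfolder per class inside `data/training_set_128/`, which this README does not confirm; the helper names and the accepted file suffixes are also hypothetical.

```python
from pathlib import Path

# Assumption: images are stored with one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {class_name: image_count}, assuming one subfolder per class."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for p in class_dir.iterdir()
                if p.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def imbalance_ratio(counts):
    """Ratio between the largest and the smallest class."""
    return max(counts.values()) / min(counts.values())


# Using two rows of the table above: Water (863) vs. Pickle (28)
print(round(imbalance_ratio({"water": 863, "pickle": 28}), 1))  # 30.8
```

With the assumed layout, `count_images_per_class("data/training_set_128")` would return the full distribution for all 61 classes.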

### 📍 **Example Training Images**
Below are some sample images from the training set. Some of them are hard to recognize even for a human (hard-cheese is hard indeed!):

![Training Images](images/training_images.png)

---

## 🔹 Data Preprocessing & Augmentation
To improve generalization, the following transformations are applied to the training images:
- **Rotation (±30°)**
- **Zooming (20%)**
- **Shifting (20%)**
- **Horizontal Flipping**
- **Rescaling (0-255 → 0-1)**
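
The settings listed above map fairly directly onto Keras' `ImageDataGenerator`. This is a minimal sketch, not necessarily the pipeline used in this project: applying the 20% shift to both width and height, the batch size, and the `make_train_generator` helper are assumptions. Note that `ImageDataGenerator` is marked as deprecated in recent TensorFlow releases.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings taken from the list above.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,        # 0-255 -> 0-1
    rotation_range=30,        # random rotation within +/-30 degrees
    zoom_range=0.2,           # random zoom up to 20%
    width_shift_range=0.2,    # assumption: shift applies horizontally...
    height_shift_range=0.2,   # ...and vertically
    horizontal_flip=True,
)


def make_train_generator(data_dir="data/training_set_128", batch_size=32):
    """Hypothetical helper: assumes one subfolder per class in data_dir."""
    return train_datagen.flow_from_directory(
        data_dir,
        target_size=(128, 128),
        batch_size=batch_size,
        class_mode="categorical",  # one-hot labels for categorical crossentropy
    )
```

Validation and test images would normally only be rescaled, not augmented.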

---

## 🏗️ Model Architecture
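The CNN consists of:
- **4 Convolutional Blocks** (3×3 convolution with **ReLU activation**, followed by 2×2 max pooling) with 32, 64, 128 and 256 filters
- A **Dense layer** with 128 units and ReLU activation
- **Softmax Activation** over the 61 classes for multi-class classification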

```python
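from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Block 1: 128x128 RGB input, 32 filters
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPooling2D((2, 2)),

    # Block 2: 64 filters
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Block 3: 128 filters
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Block 4: 256 filters
    Conv2D(256, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),

    # Classifier head: one softmax output per class
    Flatten(),
    Dense(128, activation='relu'),
    Dense(61, activation='softmax')
])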
```
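
To see what this stack does to a 128×128 input, the feature-map sizes and parameter counts can be traced by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults for `Conv2D` (`'valid'` padding, stride 1) and `MaxPooling2D` (stride equal to pool size); `conv_pool_trace` is a hypothetical helper, not part of the project.

```python
def conv_pool_trace(size=128, channels=3, filters=(32, 64, 128, 256),
                    kernel=3, pool=2, dense_units=128, num_classes=61):
    """Trace spatial size and trainable parameters through the network."""
    params = 0
    for f in filters:
        params += kernel * kernel * channels * f + f  # conv weights + biases
        size = size - kernel + 1  # 'valid' padding shrinks each side by 2
        size = size // pool       # max pooling floors odd sizes (61 -> 30)
        channels = f
    flat = size * size * channels                       # Flatten output
    params += flat * dense_units + dense_units          # Dense(128)
    params += dense_units * num_classes + num_classes   # Dense(61)
    return size, flat, params


final_size, flat_features, total_params = conv_pool_trace()
print(final_size, flat_features, total_params)  # 6 9216 1576061
```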
---

## 📊 Model Training & Evaluation
- **Optimizer:** Adam (`learning_rate=0.001`)
- **Loss Function:** Categorical Crossentropy
- **Metrics:** Accuracy, Precision, Recall
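
The settings above could be passed to `compile` roughly as follows. This is a sketch: `compile_classifier` is a hypothetical helper, and the tiny `demo` model is only a stand-in so the snippet runs on its own.

```python
from tensorflow.keras import layers, metrics, models, optimizers


def compile_classifier(model, learning_rate=0.001):
    """Compile a softmax classifier with the settings listed above."""
    model.compile(
        optimizer=optimizers.Adam(learning_rate=learning_rate),
        loss="categorical_crossentropy",  # expects one-hot encoded labels
        metrics=[
            "accuracy",
            # Precision/Recall threshold the softmax outputs at 0.5 by default
            metrics.Precision(name="precision"),
            metrics.Recall(name="recall"),
        ],
    )
    return model


# Stand-in model with the same input and output shapes as the CNN
demo = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.GlobalAveragePooling2D(),
    layers.Dense(61, activation="softmax"),
])
compile_classifier(demo)
```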

### 🖥️ **Training Curves**
Loss & Accuracy over epochs:

![Training Curves Placeholder](images/training_curves.jpg)
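
Curves like these can be drawn from the `history.history` dictionary that Keras returns from `model.fit`. A minimal sketch, assuming the default metric keys (`loss`, `val_loss`, `accuracy`, `val_accuracy`); `plot_history` and the output file name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # render to a file, no display needed
import matplotlib.pyplot as plt


def plot_history(history, out_path="training_curves.png"):
    """Plot train/validation loss and accuracy side by side."""
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(10, 4))

    ax_loss.plot(history["loss"], label="train")
    ax_loss.plot(history["val_loss"], label="validation")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("loss")
    ax_loss.legend()

    ax_acc.plot(history["accuracy"], label="train")
    ax_acc.plot(history["val_accuracy"], label="validation")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("accuracy")
    ax_acc.legend()

    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
    return out_path
```

Typical usage would be `plot_history(history.history)` after training. A validation loss that rises while the training loss keeps falling is the usual sign of overfitting.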

### 🔍 **Confusion Matrix**
True versus predicted classes, showing which classes the model confuses with each other:

![Confusion Matrix Placeholder](images/confusion_matrix.png)
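
For reference, a confusion matrix can be computed from integer class labels with NumPy alone. This sketch uses a toy 3-class example rather than the project's data; both helper functions are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def most_confused(cm, top_k=3):
    """Return up to top_k (true, predicted, count) off-diagonal entries."""
    off = cm.copy()
    np.fill_diagonal(off, 0)  # ignore correct predictions
    order = np.argsort(off, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(order, off.shape)
    return [(int(i), int(j), int(off[i, j]))
            for i, j in zip(rows, cols) if off[i, j] > 0]


y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(cm.tolist())           # [[1, 1, 0], [0, 2, 0], [2, 0, 1]]
print(most_confused(cm, 2))  # [(2, 0, 2), (0, 1, 1)]
```

With 61 classes, listing the most confused pairs is often easier to read than the full matrix.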

---

## 🎨 Model Interpretability
We use **Grad-CAM** to visualize the image regions that most influenced a prediction.
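
The core of Grad-CAM is short: average the gradients of the class score over each feature map to get per-channel weights, take the weighted sum of the feature maps, and keep only the positive part. The sketch below shows that step in NumPy. It assumes the last convolutional layer's activations and the gradients of the class score with respect to them have already been extracted (for example with `tf.GradientTape`); `grad_cam_heatmap` is a hypothetical helper, not the project's code.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Combine (H, W, K) activations and gradients into an (H, W) heatmap."""
    weights = gradients.mean(axis=(0, 1))  # one weight per channel, shape (K,)
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # weighted sum
    cam = np.maximum(cam, 0)               # ReLU: keep positive evidence only
    peak = cam.max()
    return cam / peak if peak > 0 else cam  # scale to the 0-1 range


# Toy example: 2x2 feature maps with 2 channels
activations = np.zeros((2, 2, 2))
activations[0, 0, 0] = 1.0   # channel 0 fires in the top-left corner
activations[1, 1, 1] = 2.0   # channel 1 fires in the bottom-right corner
gradients = np.stack([np.ones((2, 2)), -np.ones((2, 2))], axis=-1)

print(grad_cam_heatmap(activations, gradients).tolist())
# [[1.0, 0.0], [0.0, 0.0]]
```

For this network the heatmap would be 12×12 (the output of the fourth convolution) and would need to be resized to 128×128 before being overlaid on the input image.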

### Example Grad-CAM Visualization:

![Grad-CAM Placeholder](images/grad_cam.png)

---

## 📌 Future Improvements
- Use **Transfer Learning** (e.g., MobileNet, ResNet) for better accuracy
- Optimize hyperparameters using **KerasTuner**

---