https://github.com/llegomark/image-classification-resnet-50

This project utilizes the Hono framework to build a Cloudflare Worker that exposes an API endpoint for image classification. It integrates with Cloudflare AI to run the Microsoft Vision Model ResNet-50 and classify images based on either image URLs or file uploads.

cloudflare cloudflare-ai cloudflare-workers hono honojs resnet resnet-50 resnet50


Host: GitHub
URL: https://github.com/llegomark/image-classification-resnet-50
Owner: llegomark
License: mit
Created: 2024-04-14T08:59:41.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2024-06-19T01:06:03.000Z (about 1 year ago)
Last Synced: 2025-03-15T00:02:45.763Z (4 months ago)
Topics: cloudflare, cloudflare-ai, cloudflare-workers, hono, honojs, resnet, resnet-50, resnet50
Language: TypeScript
Homepage:
Size: 68.4 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE


# Image Classification with Microsoft Vision Model ResNet-50

The Microsoft Vision Model ResNet-50 is a pretrained vision model created by the Multimedia Group at Microsoft Bing. It is a 50-layer deep convolutional neural network (CNN) trained on more than 1 million images from ImageNet. By leveraging multi-task learning and optimizing separately for four datasets, including ImageNet-22k, Microsoft COCO, and two web-supervised datasets containing 40 million image-label pairs, the model is reported to achieve state-of-the-art performance in image classification tasks.

## Technologies Used

- **Hono**: A lightweight web framework for building fast and scalable applications on Cloudflare Workers.
- **Cloudflare Workers**: A serverless execution environment that allows running JavaScript and TypeScript code at the edge, close to users.
- **Cloudflare AI**: A set of APIs and tools provided by Cloudflare for integrating AI capabilities into applications.

## Features

- Accepts both image URLs and file uploads for classification.
- Validates input using Zod schema validation.
- Supports CORS and CSRF protection middleware.
- Implements JWT authentication middleware for secure access to the API.
- Handles errors gracefully and returns appropriate error responses.
- Provides an optional `model` parameter to specify the model for additional analysis.
  - Supported models: `llama` and `gemma`.
  - If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, only image classification is performed without additional analysis.
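
The fallback rule for the optional `model` parameter can be sketched as a small lookup. This is a minimal illustration, not the project's actual code: the names `AnalysisModel`, `SUPPORTED_MODELS`, and `resolveAnalysisModel` are hypothetical, and exact, case-sensitive matching is an assumption.

```typescript
// Hypothetical helper: these names are illustrative and not taken from the project.
type AnalysisModel = "llama" | "gemma";

const SUPPORTED_MODELS: readonly AnalysisModel[] = ["llama", "gemma"];

// Returns the analysis model to use, or undefined when only classification
// should run (parameter missing or not one of the supported values).
// Assumption: matching is exact and case-sensitive.
function resolveAnalysisModel(param?: string): AnalysisModel | undefined {
  return SUPPORTED_MODELS.find((m) => m === param);
}
```

With this sketch, `resolveAnalysisModel("gemma")` yields `"gemma"`, while a missing or unrecognized value yields `undefined`, which a handler could treat as "classification only".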

## API Endpoint

- **URL**: `/api/classify/:model?`
  - `:model` (optional): Specifies the model to use for additional analysis. Supported values: `llama` and `gemma`.
- **Method**: `POST`
- **Authentication**: JWT token required in the `Authorization` header.
- **Request Body**: JSON array of image objects, each containing either a `url` or `file` property.
  - `url`: The URL of the image to classify (optional).
  - `file`: The uploaded image file to classify (optional).
- **Response**: JSON object containing an array of responses for each image.
  - Each response includes:
    - `classification`: An array of classification results, each containing a `label` and a `score`.
    - `analysis` (optional): The analysis summary generated by the specified model, if a supported model is provided.
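
The request body shape described above can be written down as a TypeScript type with a plain runtime check. This is a rough sketch rather than the project's validation logic (the README says the project uses Zod). The names `ImageInput`, `isImageInput`, and `isClassifyRequest` are hypothetical. Requiring at least one of `url` or `file` per item, and a non-empty array, are assumptions.

```typescript
// Illustrative types inferred from the endpoint description;
// the project's real definitions may differ.
interface ImageInput {
  url?: string; // URL of the image to classify
  file?: unknown; // uploaded image; its exact representation is not specified here
}

// Assumption: an item is usable only if it carries at least one of `url` or `file`.
function isImageInput(value: unknown): value is ImageInput {
  if (typeof value !== "object" || value === null) return false;
  const item = value as Record<string, unknown>;
  const urlIsValidType = item.url === undefined || typeof item.url === "string";
  const hasUrl = typeof item.url === "string" && item.url.length > 0;
  const hasFile = item.file !== undefined && item.file !== null && item.file !== "";
  return urlIsValidType && (hasUrl || hasFile);
}

// Assumption: the body must be a non-empty array of usable items.
function isClassifyRequest(body: unknown): body is ImageInput[] {
  return Array.isArray(body) && body.length > 0 && body.every((item) => isImageInput(item));
}
```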

## Usage

1. Set up a Cloudflare Worker and configure the necessary environment variables:
   - `AI`: Your Cloudflare AI API token.
   - `JWT_SECRET`: The secret key used for JWT authentication.
2. Deploy the worker code to your Cloudflare Worker.
3. Make a POST request to the `/api/classify` endpoint with the following payload:

```json
[
  {
    "url": "https://example.com/image1.jpg"
  },
  {
    "file": ""
  }
]
```

Replace the empty `file` value with the actual file upload.

You can also specify an optional `model` parameter in the URL to use a specific model for analysis. The available models are `llama` and `gemma`. If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, only image classification will be performed without additional analysis.

Here are example cURL commands to classify images:

- Classify an image using a URL:

```bash
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer " -d '[{"url": "https://example.com/image1.jpg"}]' https://your-worker-url.com/api/classify
```

- Classify an image using a file upload:

```bash
curl -X POST -H "Content-Type: multipart/form-data" -H "Authorization: Bearer " -F "file=@/path/to/image.jpg" https://your-worker-url.com/api/classify
```

- Classify an image using a URL with the `llama` model for analysis:

```bash
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer " -d '[{"url": "https://example.com/image1.jpg"}]' https://your-worker-url.com/api/classify/llama
```

- Classify an image using a file upload with the `gemma` model for analysis:

```bash
curl -X POST -H "Content-Type: multipart/form-data" -H "Authorization: Bearer " -F "file=@/path/to/image.jpg" https://your-worker-url.com/api/classify/gemma
```

Add your actual JWT token after `Bearer ` in the `Authorization` header, and replace `https://your-worker-url.com` with the URL of your deployed Cloudflare Worker.

4. The API will return a JSON response with the classification results and analysis (if applicable) for each image:

```json
{
  "responses": [
    {
      "classification": [
        {
          "label": "dog",
          "score": 0.9
        },
        {
          "label": "animal",
          "score": 0.8
        }
      ],
      "analysis": "The image contains a dog, which is a type of animal. The classification scores indicate a high confidence in the presence of a dog in the image."
    },
    {
      "classification": [
        {
          "label": "cat",
          "score": 0.95
        },
        {
          "label": "animal",
          "score": 0.85
        }
      ],
      "analysis": "The image depicts a cat, which belongs to the animal category. The high classification scores suggest a strong likelihood of a cat being present in the image."
    }
  ]
}
```

If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, the `analysis` field will be absent in the response.
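
From the client side, steps 3 and 4 can be sketched in TypeScript as one helper that builds the JSON request and another that reads the response. This is a minimal sketch under stated assumptions: `buildClassifyRequest`, `topLabels`, and the interface names are hypothetical and not part of the project. The response types are inferred from the example response in step 4.

```typescript
// Illustrative client-side helpers; all names are hypothetical.
interface ClassificationResult {
  label: string;
  score: number;
}

interface ImageResponse {
  classification: ClassificationResult[];
  analysis?: string; // absent when no supported model was requested
}

interface ClassifyResponse {
  responses: ImageResponse[];
}

interface ClassifyRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the URL and fetch options for a JSON request that classifies image URLs.
function buildClassifyRequest(
  baseUrl: string,
  token: string,
  imageUrls: string[],
  model?: "llama" | "gemma",
): ClassifyRequest {
  const path = model ? `/api/classify/${model}` : "/api/classify";
  return {
    url: baseUrl.replace(/\/+$/, "") + path, // drop trailing slashes before joining
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(imageUrls.map((url) => ({ url }))),
    },
  };
}

// Returns the highest-scoring label for each image, or undefined
// when an image has no classification results.
function topLabels(response: ClassifyResponse): (string | undefined)[] {
  return response.responses.map((r) => {
    let best: ClassificationResult | undefined;
    for (const c of r.classification) {
      if (best === undefined || c.score > best.score) best = c;
    }
    return best?.label;
  });
}
```

The returned `url` and `init` can be passed directly to `fetch(url, init)`, and the parsed JSON body can then be handed to `topLabels`.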

## Limitations

- The Microsoft Vision Model ResNet-50 is pretrained on a specific set of image categories. It may not perform well on images outside its training domain.
- The model accepts only certain image formats, such as JPEG, PNG, and GIF. Other formats may not be supported.
- The performance of the model may vary depending on the quality and resolution of the input images.
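
Because of the format limitation, a caller may want to check a file before uploading it. The sketch below identifies JPEG, PNG, and GIF data from their well-known leading bytes. It is an illustrative pre-check only, and `sniffImageFormat` and `startsWith` are hypothetical names. The set of formats the Worker actually accepts may be wider or narrower than the three listed above.

```typescript
// Illustrative pre-check, not part of the project: identifies an image format
// from its leading "magic" bytes so other files can be rejected early.
type ImageFormat = "jpeg" | "png" | "gif";

// True when `bytes` begins with the given signature; false if it is too short.
function startsWith(bytes: Uint8Array, signature: number[]): boolean {
  return signature.every((b, i) => bytes[i] === b);
}

function sniffImageFormat(bytes: Uint8Array): ImageFormat | undefined {
  if (startsWith(bytes, [0xff, 0xd8, 0xff])) return "jpeg";
  if (startsWith(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (startsWith(bytes, [0x47, 0x49, 0x46, 0x38])) return "gif"; // ASCII "GIF8"
  return undefined; // not one of the three formats named in this README
}
```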

## Contributing

Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.

## License

This project is licensed under the [MIT License](LICENSE).
