https://github.com/llegomark/image-classification-resnet-50
This project utilizes the Hono framework to build a Cloudflare Worker that exposes an API endpoint for image classification. It integrates with Cloudflare AI to run the Microsoft Vision Model ResNet-50 and classify images based on either image URLs or file uploads.
cloudflare cloudflare-ai cloudflare-workers hono honojs resnet resnet-50 resnet50
- Host: GitHub
- URL: https://github.com/llegomark/image-classification-resnet-50
- Owner: llegomark
- License: mit
- Created: 2024-04-14T08:59:41.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-06-19T01:06:03.000Z (8 months ago)
- Last Synced: 2025-01-20T02:10:41.638Z (15 days ago)
- Topics: cloudflare, cloudflare-ai, cloudflare-workers, hono, honojs, resnet, resnet-50, resnet50
- Language: TypeScript
- Homepage:
- Size: 68.4 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
# Image Classification with Microsoft Vision Model ResNet-50
The Microsoft Vision Model ResNet-50 is a pretrained vision model created by the Multimedia Group at Microsoft Bing. It is a 50-layer deep convolutional neural network (CNN) trained on more than 1 million images from ImageNet. By leveraging multi-task learning and optimizing separately for four datasets, including ImageNet-22k, Microsoft COCO, and two web-supervised datasets containing 40 million image-label pairs, the model is reported to achieve state-of-the-art performance in image classification tasks.
This project utilizes the Hono framework to build a Cloudflare Worker that exposes an API endpoint for image classification. It integrates with Cloudflare AI to run the Microsoft Vision Model ResNet-50 and classify images based on either image URLs or file uploads.
## Technologies Used
- **Hono**: A lightweight web framework for building fast and scalable applications on Cloudflare Workers.
- **Cloudflare Workers**: A serverless execution environment that allows running JavaScript and TypeScript code at the edge, close to users.
- **Cloudflare AI**: A set of APIs and tools provided by Cloudflare for integrating AI capabilities into applications.
## Features
- Accepts both image URLs and file uploads for classification.
- Validates input using Zod schema validation.
- Supports CORS and CSRF protection middleware.
- Implements JWT authentication middleware for secure access to the API.
- Handles errors gracefully and returns appropriate error responses.
- Provides an optional `model` parameter to specify the model for additional analysis.
  - Supported models: `llama` and `gemma`.
  - If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, only image classification is performed without additional analysis.
## API Endpoint
- **URL**: `/api/classify/:model?`
  - `:model` (optional): Specifies the model to use for additional analysis. Supported values: `llama` and `gemma`.
- **Method**: `POST`
- **Authentication**: JWT token required in the `Authorization` header.
- **Request Body**: JSON array of image objects, each containing either a `url` or a `file` property. As the cURL examples under Usage show, a file can also be uploaded as `multipart/form-data` in a `file` form field.
  - `url`: The URL of the image to classify (optional).
  - `file`: The uploaded image file to classify (optional).
- **Response**: JSON object containing a `responses` array with one entry per image.
  - Each response includes:
    - `classification`: An array of classification results, each containing a `label` and a `score`.
    - `analysis` (optional): The analysis summary generated by the specified model, if a supported model is provided.
## Usage
1. Set up a Cloudflare Worker and configure the necessary environment variables:
   - `AI`: Your Cloudflare AI API token.
   - `JWT_SECRET`: The secret key used for JWT authentication.
2. Deploy the worker code to your Cloudflare Worker.
3. Make a POST request to the `/api/classify` endpoint with the following payload:
```json
[
  {
    "url": "https://example.com/image1.jpg"
  },
  {
    "file": ""
  }
]
```
Replace the empty `file` value with the actual file upload.
You can also specify an optional `model` parameter in the URL to use a specific model for analysis. The available models are `llama` and `gemma`. If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, only image classification will be performed without additional analysis.
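The fallback rule described above can be sketched in TypeScript as follows. This is only an illustration, not the project's actual code: `ANALYSIS_MODELS`, `resolveAnalysisModel`, and `classifyPath` are hypothetical names, and the exact, case-sensitive match is an assumption.

```typescript
// Hypothetical helpers; names are illustrative and not taken from the project source.
const ANALYSIS_MODELS = ["llama", "gemma"] as const;
type AnalysisModel = (typeof ANALYSIS_MODELS)[number];

// Returns the analysis model for a route parameter, or undefined when the
// parameter is missing or unsupported (classification only, no analysis).
// Assumption: matching is exact and case-sensitive.
function resolveAnalysisModel(param?: string): AnalysisModel | undefined {
  return ANALYSIS_MODELS.find((m) => m === param);
}

// Builds the request path, dropping the model segment when it is unsupported.
function classifyPath(model?: string): string {
  const resolved = resolveAnalysisModel(model);
  return resolved ? `/api/classify/${resolved}` : "/api/classify";
}
```

With this sketch, `classifyPath("gemma")` yields `/api/classify/gemma`, while `classifyPath()` and `classifyPath("other")` both yield `/api/classify`.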
Here are example cURL commands to classify images:
- Classify an image using a URL:
```bash
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer " -d '[{"url": "https://example.com/image1.jpg"}]' https://your-worker-url.com/api/classify
```
- Classify an image using a file upload:
```bash
curl -X POST -H "Content-Type: multipart/form-data" -H "Authorization: Bearer " -F "file=@/path/to/image.jpg" https://your-worker-url.com/api/classify
```
- Classify an image using a URL with the `llama` model for analysis:
```bash
curl -X POST -H "Content-Type: application/json" -H "Authorization: Bearer " -d '[{"url": "https://example.com/image1.jpg"}]' https://your-worker-url.com/api/classify/llama
```
- Classify an image using a file upload with the `gemma` model for analysis:
```bash
curl -X POST -H "Content-Type: multipart/form-data" -H "Authorization: Bearer " -F "file=@/path/to/image.jpg" https://your-worker-url.com/api/classify/gemma
```
Insert your actual JWT token after `Bearer ` in the `Authorization` header, and replace `https://your-worker-url.com` with the URL of your deployed Cloudflare Worker.
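For callers who prefer TypeScript over cURL, the JSON variant of the request might be assembled as in the sketch below. `ClassifyRequest` and `buildClassifyRequest` are hypothetical names introduced here for illustration; only the path, the `Authorization: Bearer` header, and the body shape come from the description above.

```typescript
// Hypothetical request descriptor; not part of the project's API.
interface ClassifyRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assembles a JSON classification request for one or more image URLs.
// `model` is optional; when omitted, the plain /api/classify path is used.
function buildClassifyRequest(
  baseUrl: string,
  token: string,
  imageUrls: string[],
  model?: string
): ClassifyRequest {
  const path = model
    ? `/api/classify/${encodeURIComponent(model)}`
    : "/api/classify";
  return {
    // Strip trailing slashes so the base URL and path join cleanly.
    url: baseUrl.replace(/\/+$/, "") + path,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    // The endpoint expects a JSON array of objects with a `url` property.
    body: JSON.stringify(imageUrls.map((url) => ({ url }))),
  };
}
```

The result can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. File uploads are left out of this sketch because they use `multipart/form-data` rather than JSON.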
4. The API will return a JSON response with the classification results and analysis (if applicable) for each image:
```json
{
  "responses": [
    {
      "classification": [
        {
          "label": "dog",
          "score": 0.9
        },
        {
          "label": "animal",
          "score": 0.8
        }
      ],
      "analysis": "The image contains a dog, which is a type of animal. The classification scores indicate a high confidence in the presence of a dog in the image."
    },
    {
      "classification": [
        {
          "label": "cat",
          "score": 0.95
        },
        {
          "label": "animal",
          "score": 0.85
        }
      ],
      "analysis": "The image depicts a cat, which belongs to the animal category. The high classification scores suggest a strong likelihood of a cat being present in the image."
    }
  ]
}
```
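A client consuming this response might model it with types like the ones below. This is a sketch based only on the JSON shape shown here: the interface names and the `topLabel` and `summarize` helpers are hypothetical and do not come from the project.

```typescript
// Hypothetical types mirroring the JSON response shape shown above.
interface ClassificationResult {
  label: string;
  score: number;
}

interface ImageResponse {
  classification: ClassificationResult[];
  analysis?: string; // present only when a supported model was requested
}

interface ClassifyResponse {
  responses: ImageResponse[];
}

// Returns the highest-scoring result, or undefined for an empty list.
function topLabel(image: ImageResponse): ClassificationResult | undefined {
  return image.classification.reduce<ClassificationResult | undefined>(
    (best, cur) => (best === undefined || cur.score > best.score ? cur : best),
    undefined
  );
}

// Produces one human-readable line per classified image.
function summarize(response: ClassifyResponse): string[] {
  return response.responses.map((image, i) => {
    const top = topLabel(image);
    const label = top
      ? `${top.label} (${top.score.toFixed(2)})`
      : "no classification";
    return `image ${i + 1}: ${label}`;
  });
}
```

Marking `analysis` as optional lets the same types cover requests made with and without a supported model.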
If the `model` parameter is not provided or is set to a value other than `llama` or `gemma`, the `analysis` field will be absent in the response.
## Limitations
- The Microsoft Vision Model ResNet-50 is pretrained on a specific set of image categories. It may not perform well on images outside its training domain.
- The model accepts only certain image formats, such as JPEG, PNG, and GIF. Other formats may not be supported.
- The performance of the model may vary depending on the quality and resolution of the input images.
## Contributing
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
## License
This project is licensed under the [MIT License](LICENSE).