https://github.com/dmdhrumilmistry/file-validation
Validate File Content Type using AI/ML models for S3 file uploads using AWS lambda
- Host: GitHub
- URL: https://github.com/dmdhrumilmistry/file-validation
- Owner: dmdhrumilmistry
- License: mit
- Created: 2024-02-24T09:38:22.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-24T22:25:13.000Z (over 2 years ago)
- Last Synced: 2025-04-02T16:15:10.372Z (about 1 year ago)
- Topics: aws-lambda, aws-security, file-upload, hacking, security
- Language: Python
- Homepage: https://dmdhrumilmistry.gitbook.io/home/blog/secure-software-development/validating-file-content-types-to-avoid-malicious-file-hosting-using-ml-model
- Size: 73.2 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# File Validation
Validates the content type of files uploaded to AWS S3 with a Lambda function that uses the AI/ML model from Google's Magika project.
For more details, read the [post](https://dmdhrumilmistry.gitbook.io/home/blog/secure-software-development/validating-file-content-types-to-avoid-malicious-file-hosting-using-ml-model).
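
To make the idea concrete, here is a minimal sketch of the keep-or-delete decision, not the actual code from `file_validation.py`. The names `BUCKET_POLICY`, `should_delete`, and `fake_detect` are hypothetical, and the shape of the policy is an assumption. In the real function the detector would be the Magika model; a tiny magic-bytes stand-in is used here so the sketch runs without extra dependencies.

```python
from typing import Callable

# Hypothetical allowlist: bucket name -> content-type labels accepted there.
# The real `bucket_policy` in `file_validation.py` may be shaped differently.
BUCKET_POLICY: dict[str, set[str]] = {
    "my-aws-buckkett": {"pdf", "png", "jpeg"},
}


def should_delete(
    bucket: str,
    data: bytes,
    detect: Callable[[bytes], str],
    policy: dict[str, set[str]] = BUCKET_POLICY,
) -> bool:
    """Return True when the detected content type is not allowed for the bucket."""
    allowed = policy.get(bucket, set())  # unknown bucket: nothing is allowed
    label = detect(data)  # decision is based on content, not on the file name
    return label not in allowed


def fake_detect(data: bytes) -> str:
    """Stand-in detector keyed on magic bytes (assumption, demo only)."""
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    return "unknown"


if __name__ == "__main__":
    # A real PDF header is kept; a PHP script is flagged for deletion.
    print(should_delete("my-aws-buckkett", b"%PDF-1.7 ...", fake_detect))
    print(should_delete("my-aws-buckkett", b"<?php system($_GET['c']); ?>", fake_detect))
```

Passing the detector in as a callable keeps the decision logic testable on its own; swapping in a model-backed detector should not require changes to `should_delete`.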

## Usage
### Using Amazon ECR
> Note: use the x86_64 architecture instead of arm64, since arm64 machines don't completely support
> the environment required by ONNX Runtime.
>
> Reference: https://github.com/microsoft/onnxruntime/issues/10038
#### Installation
* Star (⭐️) and Fork (⑂) this Repo 😎
* Update `bucket_policy` in `file_validation.py` according to your needs.
* Create an ECR private registry and a new container repository (for example, `file-validation`).
* Create a new IAM role policy with restricted permissions that lets the `aws-s3-file-upload-validation` Lambda function (created in a later step) read objects from the bucket (`my-aws-buckkett`) and delete malicious ones:
```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "GetAndDeleteBucketObject",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-aws-buckkett/*",
                "arn:aws:s3:::my-aws-buckkett/",
                "arn:aws:s3:::my-aws-buckkett"
            ]
        },
        {
            "Sid": "CreateLogGroupActionForLambda",
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:us-east-1:aws-account-number:*"
        },
        {
            "Sid": "CreateAndPushLogsFromLambda",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": [
                "arn:aws:logs:us-east-1:aws-account-number:log-group:/aws/lambda/aws-s3-file-upload-validation:*"
            ]
        }
    ]
}
```
* Log in to the ECR registry with Docker:
```bash
aws ecr get-login-password --region us-east-1 --profile profile-name | docker login --username AWS --password-stdin aws-acc-number.dkr.ecr.us-east-1.amazonaws.com
```
* Build the Docker image and push it to AWS ECR using the commands below, or use the [GitHub Action](./.github/workflows/build-ecr-image.yml):
```bash
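# The trailing "." is the build context; run this from the directory that contains the Dockerfile
docker buildx build -t aws-acc-number.dkr.ecr.us-east-1.amazonaws.com/file-validation:latest .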
docker push aws-acc-number.dkr.ecr.us-east-1.amazonaws.com/file-validation:latest
```
* Create the `aws-s3-file-upload-validation` Lambda function and configure its ECR image, IAM role policy, memory, and timeout.
* Create an S3 trigger event for object creation and link it to the Lambda function.
* Test the Lambda function by uploading files with valid and invalid content types.
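
When testing the trigger, it can help to see what the function receives. The sketch below shows one way to pull the bucket and object key out of an S3 object-created notification. The helpers `objects_from_event` and `sample_event` are hypothetical names and are not taken from `file_validation.py`; the sample event is trimmed to the fields used here. Object keys arrive URL-encoded in S3 notifications, so the key is decoded before use.

```python
from urllib.parse import unquote_plus


def objects_from_event(event: dict) -> list[tuple[str, str]]:
    """Extract (bucket, key) pairs from an S3 object-created notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # S3 URL-encodes keys in notifications (spaces arrive as '+').
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def sample_event(bucket: str, encoded_key: str) -> dict:
    """Build a trimmed-down test event; real events carry many more fields."""
    return {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": bucket},
                    "object": {"key": encoded_key},
                },
            }
        ]
    }


if __name__ == "__main__":
    event = sample_event("my-aws-buckkett", "uploads/my+report%281%29.pdf")
    print(objects_from_event(event))
```

A similar trimmed event can be pasted into the Lambda console's test tab to exercise the function without uploading a file, provided the referenced object already exists in the bucket.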
### Using Zip (Might Not Work Properly)
* Build the zip archive:
```bash
make all
```
* Upload the zip archive to the Lambda function.