https://github.com/zxkane/serverless-docker-images-analytics
Serverless Analytics app for analyzing docker image layers
analytics aws aws-athena aws-cdk aws-glue big-data data-lake
- Host: GitHub
- URL: https://github.com/zxkane/serverless-docker-images-analytics
- Owner: zxkane
- Created: 2020-04-29T10:26:30.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2023-03-04T15:11:50.000Z (over 2 years ago)
- Last Synced: 2025-02-02T02:43:08.203Z (8 months ago)
- Topics: analytics, aws, aws-athena, aws-cdk, aws-glue, big-data, data-lake
- Language: TypeScript
- Homepage: https://kane.mx/posts/2020/serverless-docker-images-analytics/
- Size: 8.92 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 17
Metadata Files:
- Readme: README.md
README
# Serverless Docker Image Layers Analytics
It's a fun project that sets up a serverless analytics app on AWS to analyze the layers of Docker images.
The app is powered by Amazon S3, AWS Glue and Amazon Athena.
## How to deploy
### Prerequisites
- Install Node.js LTS (such as 12.x)
- Configure your AWS account credentials for the [AWS CLI](https://docs.aws.amazon.com/polly/latest/dg/setup-aws-cli.html)
### Deploy it
```shell
# install dependencies & init cdk toolkit
# only needs to be run once
npm run init

# deploy
npm run deploy
```
### Cleanup
```shell
npm run cleanup
```
## How to analyze the data
- Log in to the AWS console with your account and go to [Athena][athena]
- In the Query Editor, select the database `docker_image_db`, then choose **Load Partitions** from the context menu of the table `layers`
- Click **Saved Queries** to find the built-in analysis queries whose names start with `Docker_Layers_Stats`

Enjoy it!
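
If you prefer the command line to the console, the **Load Partitions** step can probably be scripted. The sketch below assumes that **Load Partitions** corresponds to Athena's `MSCK REPAIR TABLE` statement, and that you have an S3 location of your own for Athena query results. The `load_partitions` helper name and the bucket path are hypothetical and not part of this project.

```shell
# Hypothetical helper (not part of this project): submit the partition-loading
# statement for the `layers` table through the AWS CLI.
# $1 is an S3 location you own where Athena may write query results (assumption).
load_partitions() {
  aws athena start-query-execution \
    --query-string "MSCK REPAIR TABLE layers" \
    --query-execution-context "Database=docker_image_db" \
    --result-configuration "OutputLocation=$1"
}
```

For example, `load_partitions s3://your-athena-results-bucket/prefix/` submits the query and prints a query execution ID. The built-in saved queries can then be run from the console as described above.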
## Disclaimer -- about image layers data
This project provides a small pilot set of layer data for some official Docker images on [Docker Hub][docker-hub]; the data was fetched by a [simple script][image-layer-fetching-script]. This project does **NOT** guarantee the integrity of the layer data, nor does it provide continuous maintenance.
You are free to use this project and the script, but make sure you do not violate the user agreements of Docker Hub.
[athena]: https://aws.amazon.com/athena/?nc2=h_ql_prod_an_ath
[docker-hub]: https://hub.docker.com/
[image-layer-fetching-script]: https://gist.github.com/zxkane/23de226fee8806ee0ed8c05136972ce0