https://github.com/narius2030/datalake-solution-imcp
This project involved the development and implementation of a Data Lake architecture to support an AI model capable of generating image captions. The architecture was designed to efficiently ingest, process, and centrally store large volumes of image and text data.
- Host: GitHub
- URL: https://github.com/narius2030/datalake-solution-imcp
- Owner: Narius2030
- Created: 2024-08-24T15:53:17.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-12-01T14:10:38.000Z (about 1 month ago)
- Last Synced: 2024-12-01T15:23:15.909Z (about 1 month ago)
- Topics: data-lake, docker-container, etl-pipeline, fastapi, medallion-architecture, mlops, nosql-database, object-storage
- Language: Python
- Homepage:
- Size: 192 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Overall Architecture

![image](https://github.com/user-attachments/assets/e7fc0152-fe7c-4c00-8d83-c268d4fee4a9)

## Detailed Architecture

![image](https://github.com/user-attachments/assets/13726b7e-6c91-4453-a291-1dda31684cd1)

## Overall Data Pipeline

![image](https://github.com/user-attachments/assets/0f0e0040-8681-4b8f-9ba0-ec1eea828972)

## Practical Data Pipeline

![image](https://github.com/user-attachments/assets/8ba59a0d-701b-4007-b248-0db1138f9263)
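The diagrams above are images only, so as a rough illustration of what the `medallion-architecture` topic usually implies, here is a minimal in-memory sketch of a bronze/silver/gold flow for image-caption records. It is not taken from this repository: the `RawRecord` type, the `to_bronze`, `to_silver`, and `to_gold` functions, and the `s3://bucket/...` URIs are all hypothetical names chosen for the example, and a real pipeline would likely read from and write to object storage and a NoSQL database rather than Python lists.

```python
from dataclasses import dataclass


@dataclass
class RawRecord:
    # Hypothetical record shape: one image location plus one caption.
    image_uri: str
    caption: str


def to_bronze(records):
    """Bronze layer: keep records exactly as ingested."""
    return list(records)


def to_silver(bronze):
    """Silver layer: normalize captions, drop incomplete rows, de-duplicate."""
    seen = set()
    silver = []
    for rec in bronze:
        # Collapse repeated whitespace and lowercase the caption.
        caption = " ".join(rec.caption.split()).lower()
        if not rec.image_uri or not caption:
            continue  # skip rows missing a URI or a usable caption
        key = (rec.image_uri, caption)
        if key in seen:
            continue  # skip exact duplicates after normalization
        seen.add(key)
        silver.append(RawRecord(rec.image_uri, caption))
    return silver


def to_gold(silver):
    """Gold layer: group cleaned captions per image for model training."""
    gold = {}
    for rec in silver:
        gold.setdefault(rec.image_uri, []).append(rec.caption)
    return gold


# Assumed sample input, for illustration only.
raw = [
    RawRecord("s3://bucket/a.jpg", "  A dog  runs "),
    RawRecord("s3://bucket/a.jpg", "a dog runs"),
    RawRecord("s3://bucket/a.jpg", "A cat"),
    RawRecord("", "orphan caption"),
    RawRecord("s3://bucket/b.jpg", "   "),
]

bronze = to_bronze(raw)
silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping each layer as a separate pure function makes it easy to re-run a single stage when cleaning rules change, which is the main reason the layered layout is commonly used.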