{"id":32353173,"url":"https://github.com/monzim/mini-etl","last_synced_at":"2026-04-10T12:32:20.445Z","repository":{"id":247020770,"uuid":"822412688","full_name":"monzim/mini-etl","owner":"monzim","description":"This is a mini-ETL project like Fiber (YC) but smaller version. It's allows users to extract data from a source, transform it, and load it into a destination. The project is built using next.js (Frontend) and NestJS (Backend).","archived":false,"fork":false,"pushed_at":"2024-07-07T07:13:36.000Z","size":333,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-07-08T05:57:52.971Z","etag":null,"topics":["digitalocean","docker","drizzle-orm","github-actions","microservice","nestjs","nextjs","pg","postgres","prisma-orm","rabbitmq","shadcn-ui","typescript","vercel"],"latest_commit_sha":null,"homepage":"https://mini-etl.vercel.app","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/monzim.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-01T05:34:40.000Z","updated_at":"2024-07-07T07:13:39.000Z","dependencies_parsed_at":"2024-07-06T06:02:42.853Z","dependency_job_id":null,"html_url":"https://github.com/monzim/mini-etl","commit_stats":null,"previous_names":["monzim/mini-etl"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/monzim/mini-etl","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monzim%2Fmini-etl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/m
onzim%2Fmini-etl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monzim%2Fmini-etl/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monzim%2Fmini-etl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/monzim","download_url":"https://codeload.github.com/monzim/mini-etl/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/monzim%2Fmini-etl/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31642796,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T07:40:12.752Z","status":"ssl_error","status_checked_at":"2026-04-10T07:40:11.664Z","response_time":98,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["digitalocean","docker","drizzle-orm","github-actions","microservice","nestjs","nextjs","pg","postgres","prisma-orm","rabbitmq","shadcn-ui","typescript","vercel"],"created_at":"2025-10-24T10:05:51.856Z","updated_at":"2026-04-10T12:32:20.437Z","avatar_url":"https://github.com/monzim.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# [Mini-ETL (Extract, transform, load)](https://mini-etl.vercel.app)\n\n**This project is inspired by [Fiber.Dev](https://fiber.dev)**\n\nLive: [mini-etl.vercel.app](https://mini-etl.vercel.app)\nGitHub: 
[mini-etl](https://github.com/monzim/mini-etl)\n\n## Introduction\n\nThis is a mini-ETL project, similar to Fiber (YC) but smaller. It allows users to extract data from a source, transform it, and load it into a destination. The project is built using Next.js (Frontend) and NestJS (Backend).\n\n- CURRENTLY IT SUPPORTS ONLY GITHUB AS A SOURCE\n\nThis is a demonstration project, and it can be extended to support other data sources such as GitLab and Bitbucket.\n\n## Features\n\n- Load the data into a destination (supports PostgreSQL and S3)\n- GitHub OAuth Authentication: Users can log in using their GitHub accounts.\n- Extract data from GitHub (public repositories, issues, pull requests)\n- Transform the data\n- Data Source Management: Users can add and manage data sources such as S3 buckets and PostgreSQL databases.\n- Automatic and Manual Data Synchronization: Data is synced automatically at regular intervals, with an option for manual synchronization.\n- Data Viewing: Users can view their synchronized data in a user-friendly interface.\n\n# Backend (NestJS)\n\n- [ApiGateway](api_gateway): Built with NestJS, it handles all incoming REST API calls and routes them to the appropriate microservices.\n- [SyncService (Microservice)](sync_service): A dedicated microservice for handling data synchronization tasks.\n\n## Stack\n\n- NestJS (Node.js Framework)\n- PostgreSQL (Database)\n- pg, Prisma, and Drizzle (Database driver and ORMs)\n- Docker (Containerization)\n- RabbitMQ (Message Broker)\n- DigitalOcean (Deployment)\n\n![image](https://github.com/monzim/public-assets/blob/main/mini-etl/stack-overview.png?raw=true)\n\n# How to Run the Project?\n\nTo run this project, you need to have Node.js installed on your machine. You can download it from [here](https://nodejs.org/en/). This project has two parts:\n\n1. Frontend (Next.js) - `cd frontend`\n2. 
Backend (NestJS)\n\n   - ApiGateway - `cd api_gateway`\n   - SyncService - `cd sync_service`\n\n### Backend (NestJS - SyncService)\n\nStart with the sync service. It needs RabbitMQ and PostgreSQL connection strings. Create a `.env` file in the `sync_service` directory, modeled on the `.env.example` file.\n\n```bash\nRABBITMQ_QUEUE=\"\"\nRABBITMQ_URL=\"\"\nDATABASE_URL=\"\"\n# DATABASE_URL_DRIZZLE is not required\nDATABASE_URL_DRIZZLE=\"\"\n```\n\nAfter creating the `.env` file, run the following commands:\n\n```bash\n# install dependencies, then generate the Prisma client and push the schema to the database\npnpm install\nnpx prisma generate \u0026\u0026 npx prisma db push\npnpm start:dev\n```\n\n### Backend (NestJS - ApiGateway)\n\nNext, run the API gateway. It also needs RabbitMQ and PostgreSQL connection strings. Create a `.env` file in the `api_gateway` directory, modeled on the `.env.example` file.\n\n```bash\nDATABASE_URL=\nGITHUB_CALLBACK_URL=http://localhost:3000/auth/callback/github\nGITHUB_CLIENT_ID=\nGITHUB_CLIENT_SECRET=\nJWT_SECRET=\nRABBITMQ_QUEUE=\nRABBITMQ_URL=\nAUTH_FRONTEND_REDIRECT_URL=\"\"\nFRONTEND_URL=\"\"\n```\n\nAfter creating the `.env` file, run the following commands:\n\n```bash\npnpm install\nnpx prisma generate \u0026\u0026 npx prisma db push\npnpm start:dev\n```\n\n### Frontend (Next.js)\n\nThe frontend needs the following environment variable. Create a `.env.local` file in the `frontend` directory, modeled on the `.env.example` file.\n\n```bash\nNEXT_PUBLIC_API_URL=http://localhost:3000/api\n```\n\nAfter creating the `.env.local` file, run the following commands:\n\n```bash\npnpm install\npnpm dev\n```\n\n## Mini ETL Workflow\n\nHere's how Mini-ETL works.\n\n1. **User Authentication**:\n\n   - Users log in using GitHub OAuth.\n   - Upon successful login, a JWT is generated and stored in the user's cookies.\n\n2. 
**Adding Data Sources**:\n\n   - Users can add data sources by providing the credentials for each one.\n   - Supported destinations include S3 buckets (optionally Cloudflare R2) and PostgreSQL databases.\n   - The API gateway validates these credentials via the SyncService.\n\n3. **Data Source Validation**:\n\n   - If the data source credentials are valid, the data source is marked as valid.\n   - Users can then connect their GitHub provider to this valid data source.\n\n4. **Data Synchronization**:\n\n   - The SyncService automatically synchronizes data (public repositories, issues, and pull requests) from GitHub to the specified destination every ten minutes.\n   - Users can also manually trigger synchronization via a button in the app console.\n\n5. **Viewing Data**:\n\n   - In the app console, users can see all connected providers and data sources.\n   - Synced data is displayed in a formatted table.\n\n![image](https://github.com/monzim/public-assets/blob/main/mini-etl/app-workflow.png?raw=true)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonzim%2Fmini-etl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmonzim%2Fmini-etl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmonzim%2Fmini-etl/lists"}