https://github.com/leobitto/dataforge
DataForge is a customizable Django-based template for building data-driven applications, enabling SMEs to easily manage, process, and visualize business data with modular workflows and scalable infrastructure
airflow business-intelligence django docker grafana helm kubernetes postgres prometheus python sql
Last synced: 19 days ago
- Host: GitHub
- URL: https://github.com/leobitto/dataforge
- Owner: leoBitto
- License: gpl-3.0
- Created: 2024-06-19T13:44:05.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-11-17T20:47:21.000Z (about 2 months ago)
- Last Synced: 2024-11-17T21:33:23.457Z (about 2 months ago)
- Topics: airflow, business-intelligence, django, docker, grafana, helm, kubernetes, postgres, prometheus, python, sql
- Language: Python
- Homepage: https://leobitto.github.io/DataForge/
- Size: 3.16 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# DataForge
![DataForge Logo](docs/assets/img/DataForge_logo.png)
**DataForge** is a comprehensive data management and processing platform designed for small and medium-sized enterprises (SMEs) looking for a simple and cost-effective solution for their data infrastructure. DataForge provides powerful tools to manage databases, automate workflows, and monitor the infrastructure.
## Key Features
- **Django Application**: A web application for data entry and visualization, giving SMEs direct access to their data.
- **"Silver" Database**: A database that stores structured data, ready for preliminary analysis and integration.
- **"Gold" Database**: A database that holds optimized data, ready for deeper analysis and machine learning.
- **Airflow**: A workflow automation tool for managing tasks and scheduling data pipelines.
- **Monitoring and Observability**: Grafana and Prometheus are integrated to monitor infrastructure and Kubernetes clusters, ensuring high availability and optimal performance.

## Project Goal
DataForge aims to provide an affordable and scalable data management solution based on open-source technologies and tailored to the needs of SMEs. With a microservices architecture supported by Kubernetes, the platform can be easily extended and customized.

## Infrastructure and Technologies Used
- **Django**: Main backend for data management and API support.
- **PostgreSQL**: Main databases (silver and gold) for data management.
- **Airflow**: Scheduler for orchestrating and automating complex data workflows.
- **Kubernetes**: For scalable deployment and container management.
- **Helm**: Kubernetes configuration and package management.
- **Grafana and Prometheus**: Infrastructure monitoring and metrics visualization.

## Getting Started
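The silver and gold PostgreSQL databases listed above could be wired into Django through its multi-database support. The following is only a minimal sketch and is not taken from the DataForge repository: the aliases `default` and `gold`, the `analytics` app label, the `postgres_config` helper, and the environment variable names are all assumptions made for illustration. The guides linked below remain the authoritative setup instructions.

```python
import os


def postgres_config(name):
    """Build one Django DATABASES entry (hypothetical helper)."""
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": name,
        # Environment variable names are assumptions, not DataForge settings.
        "USER": os.environ.get("POSTGRES_USER", "dataforge"),
        "PASSWORD": os.environ.get("POSTGRES_PASSWORD", ""),
        "HOST": os.environ.get("POSTGRES_HOST", "localhost"),
        "PORT": os.environ.get("POSTGRES_PORT", "5432"),
    }


# Assumption: the silver database acts as Django's default connection.
DATABASES = {
    "default": postgres_config("silver"),
    "gold": postgres_config("gold"),
}


class GoldRouter:
    """Route models of the hypothetical `analytics` app to the gold database."""

    gold_apps = {"analytics"}

    def db_for_read(self, model, **hints):
        # Returning None lets Django fall back to the default database.
        return "gold" if model._meta.app_label in self.gold_apps else None

    def db_for_write(self, model, **hints):
        return "gold" if model._meta.app_label in self.gold_apps else None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in self.gold_apps:
            # Gold apps migrate only into the gold database.
            return db == "gold"
        # Keep every other app's tables out of the gold database.
        return False if db == "gold" else None
```

In a real project the router would be activated by adding its dotted path to the `DATABASE_ROUTERS` setting, and migrations for the second database would be run with `python manage.py migrate --database=gold`.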
1. Follow the [Installation Guide](docs/installation.md) for setup and installation instructions.
2. Use the CI/CD workflows (see `.github/workflows`) to automate the build and deployment processes.
3. Configure the databases and the Django application as outlined in the [Deployment Guide](docs/deployment.md).

## Contributing
Contributions and feedback are welcome! Please check out [CONTRIBUTING.md](CONTRIBUTING.md) and [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) for more information on how to contribute.