https://github.com/opendata-lab/opendataworks
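opendataworks is a unified data portal for big data platforms. Built on open-source projects such as DolphinScheduler and Doris, it aims to give enterprises a one-stop solution for data asset management, task scheduling and orchestration, and data lineage tracking.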
- Host: GitHub
- URL: https://github.com/opendata-lab/opendataworks
- Owner: opendata-lab
- License: gpl-3.0
- Created: 2025-10-18T14:39:52.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-05-27T07:35:28.000Z (7 days ago)
- Last Synced: 2026-05-27T08:26:26.197Z (7 days ago)
- Topics: data-engineering, datax, dolphinscheduler, doris
- Language: Java
- Homepage: https://opendataworks.vercel.app
- Size: 5.7 MB
- Stars: 36
- Watchers: 5
- Forks: 12
- Open Issues: 3
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
- Agents: AGENTS.md
Awesome Lists containing this project
- awesome-java - OpenDataWorks
README
# OpenDataWorks
**A unified data portal for workflow orchestration, intelligent query, and data lineage visualization.**
English | [简体中文](README_zh-CN.md)
[Website](https://opendataworks.vercel.app/) · [Quick Start](https://opendataworks.vercel.app/guide/quick-start.html) · [Features](https://opendataworks.vercel.app/guide/features.html) · [Architecture](https://opendataworks.vercel.app/architecture/overview.html) · [Configuration](https://opendataworks.vercel.app/guide/configuration.html) · [Contributing](https://opendataworks.vercel.app/guide/contribution.html) · [Slack](https://opendataworkshq.slack.com/)
---
## Overview
OpenDataWorks is an open-source data platform portal for teams that need one place to manage metadata, orchestrate data workflows, analyze lineage, and ask data questions with natural language.
It brings the core pieces of a modern data platform into a deployable full-stack application: a Java backend, a Vue frontend, a Python DataAgent service for intelligent query, and Docker Compose assets for local and production environments.
## Why OpenDataWorks
- **Unified data asset management**: organize table metadata, data domains, business domains, and layered data models.
- **Workflow orchestration**: configure batch and streaming jobs visually, with deep DolphinScheduler integration.
- **Lineage analysis**: parse SQL lineage automatically and explore upstream/downstream relationships in an interactive graph.
- **Intelligent query**: use natural language to generate SQL, execute analysis, and review results from the main portal.
- **Ready to deploy**: run the frontend, backend, DataAgent backend, Redis, MySQL, and Portal MCP from the provided Docker Compose setup.
## Feature Highlights
- Metadata management for ODS, DWD, DIM, DWS, and ADS layers
- Workflow authoring, publishing, scheduling, and execution monitoring
- SQL and Shell task support
- Data lineage visualization with ECharts
- Data Studio with catalog browsing, SQL editing, and table-level metadata context
- Built-in NL2SQL intelligent-query entrypoint
- Runtime logs, execution history, and operational statistics
## Demo
[https://opendataworks-demo.vercel.app](https://opendataworks-demo.vercel.app)
## Screenshots
### Workflow Orchestration

Manage workflow lists, publishing status, and common workflow actions.
### Data Lineage

Explore upstream and downstream table relationships around a selected table.
### Data Studio

Browse catalogs, write SQL, and inspect table metadata in one workspace.
## Docker Deployment
### Start the Development Environment
Use the development Docker Compose file (`deploy/docker-compose.dev.yml`) to start the frontend, backend, DataAgent backend, Redis, MySQL, and Portal MCP together:
```bash
# 1. Prepare configuration
cp deploy/.env.example deploy/.env
# 2. Pull the latest images
docker compose -f deploy/docker-compose.dev.yml pull
# 3. Start services
docker compose -f deploy/docker-compose.dev.yml up -d
# Access points
# Frontend: http://localhost:8081
# Backend: http://localhost:8080/api
# DataAgent Backend: http://localhost:8900
# Portal MCP: http://localhost:8801/mcp
```
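Once the containers are up, a quick reachability check can confirm that each access point listed above answers on its port. The following is a minimal sketch and not part of the project: the function names `service_urls` and `check_services` are hypothetical, and it assumes `curl` is installed. Because the services' actual health-check paths are not documented here, any HTTP response (including an error status such as 404) is treated as "reachable".

```bash
#!/usr/bin/env bash
# Reachability sketch for the development stack (hypothetical helper, not shipped
# with OpenDataWorks). The URLs are the access points listed in this README.

# Print one "name|url" pair per line.
service_urls() {
  cat <<'EOF'
Frontend|http://localhost:8081
Backend|http://localhost:8080/api
DataAgent Backend|http://localhost:8900
Portal MCP|http://localhost:8801/mcp
EOF
}

# Probe every endpoint and report whether it returned any HTTP status.
# curl writes "000" as the status code when no HTTP response was received.
check_services() {
  failed=0
  while IFS='|' read -r name url; do
    code=$(curl --silent --output /dev/null --max-time 5 \
      --write-out '%{http_code}' "$url")
    if [ -n "$code" ] && [ "$code" != "000" ]; then
      echo "OK    $name ($url) HTTP $code"
    else
      echo "DOWN  $name ($url)"
      failed=1
    fi
  done <<EOF
$(service_urls)
EOF
  return "$failed"
}
```

To use it, save the snippet to a file of your choice, source it, and run `check_services` after `docker compose ... up -d` has finished. The function returns a non-zero status if any endpoint did not respond, so it can also gate a later script step. Services may need some time after startup before they accept connections.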
### Production and Offline Deployment
See the [deployment guide](deploy/README.md) for production deployment and offline package instructions.
## Quick Start
Follow the [quick start guide](https://opendataworks.vercel.app/guide/quick-start.html) to deploy and run OpenDataWorks locally.
## Documentation
Full documentation is available at: **https://opendataworks.vercel.app/**
- [Quick Start](https://opendataworks.vercel.app/guide/quick-start.html)
- [Architecture](https://opendataworks.vercel.app/architecture/overview.html)
- [Configuration](https://opendataworks.vercel.app/guide/configuration.html)
- [FAQ](https://opendataworks.vercel.app/guide/faq.html)
## Community
- Join the [OpenDataWorks Slack community](https://opendataworkshq.slack.com/) to discuss usage, deployment, roadmap ideas, and contributions.
- Open a [GitHub Issue](https://github.com/opendata-lab/opendataworks/issues) for bugs, feature requests, or documentation feedback.
## Contributing
Contributions are welcome. Please read the [contribution guide](https://opendataworks.vercel.app/guide/contribution.html) before opening a pull request.
## License
OpenDataWorks is licensed under the [GNU General Public License v3.0 only](LICENSE).