Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/mage-ai/mage-ai
🧙 Build, run, and manage data pipelines for integrating and transforming data.
Last synced: 6 days ago
- Host: GitHub
- URL: https://github.com/mage-ai/mage-ai
- Owner: mage-ai
- License: apache-2.0
- Created: 2022-05-16T22:11:39.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-06-13T07:23:07.000Z (7 months ago)
- Last Synced: 2024-06-13T11:17:02.630Z (7 months ago)
- Topics: artificial-intelligence, data, data-engineering, data-integration, data-pipelines, data-science, dbt, elt, etl, machine-learning, orchestration, pipeline, pipelines, python, reverse-etl, spark, sql, transformation
- Language: Python
- Homepage: https://www.mage.ai/
- Size: 189 MB
- Stars: 7,298
- Watchers: 63
- Forks: 667
- Open Issues: 331
Metadata Files:
- Readme: README.md
- Contributing: docs/contributing/backend/io/adding-a-class.mdx
- License: LICENSE
Awesome Lists containing this project
- project-awesome - mage-ai/mage-ai - 🧙 Build, run, and manage data pipelines for integrating and transforming data. (Python)
- awesome-starred - mage-ai/mage-ai - 🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data. (artificial-intelligence)
README
Mage is a hybrid framework for transforming and integrating data. It combines the best of both worlds: the flexibility of notebooks with the rigor of modular code.
- Extract and synchronize data from 3rd party sources.
- Transform data with real-time and batch pipelines using Python, SQL, and R.
- Load data into your data warehouse or data lake using our pre-built connectors.
- Run, monitor, and orchestrate thousands of pipelines without losing sleep.
Plus hundreds of enterprise-class features, infrastructure innovations, and magical surprises.
#### Available in two spellbinding versions
- For teams: a fully managed platform for integrating and transforming data.
- Self-hosted: a system to build, run, and manage data pipelines.
# It's magic.
For documentation on getting started, how to develop, and how to deploy to production, check out the live Developer documentation portal.
## 🏃‍♀️ Install
The recommended way to install the latest version of Mage is through Docker with the following command:
```bash
docker pull mageai/mageai:latest
```

You can also install Mage using pip or conda, though this may cause dependency issues without the proper environment.
```bash
pip install mage-ai
```
```bash
conda install -c conda-forge mage-ai
```

Looking for help? The _fastest_ way to get started is by checking out our documentation [here](https://docs.mage.ai/getting-started/setup).
Looking for quick examples? Open a [demo](https://demo.mage.ai/) project right in your browser or check out our [guides](https://docs.mage.ai/guides/overview).
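After a pip or conda install, it can help to confirm which version actually landed in the active environment. The following is a minimal sketch that uses only Python's standard library. The helper name `installed_version` is made up for this illustration and is not part of Mage.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "mage-ai") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper for this example, not a Mage API.
    """
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("mage-ai is not installed in this environment")
    else:
        print(f"mage-ai {found} is installed")
```

Running this inside the virtual environment or conda environment you installed into shows whether that environment, rather than a system-wide Python, holds the package.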
## 🔮 Demo
### Live demo
Build and run a data pipeline with our [demo app](https://demo.mage.ai/).
> WARNING
>
> The live demo is public to everyone. Please don't save anything sensitive (e.g. passwords or secrets).
### Demo video (5 min)

[![Mage quick start demo](https://github.com/mage-ai/assets/blob/main/overview/overview-video.png?raw=True)](https://youtu.be/GswOdShLGmg)

Click the image to play video
## 🔮 [Features](https://docs.mage.ai/about/features)
| | | |
| --- | --- | --- |
| 🎶 | [Orchestration](https://docs.mage.ai/design/data-pipeline-management) | Schedule and manage data pipelines with observability. |
| 📓 | [Notebook](https://docs.mage.ai/about/features#notebook-for-building-data-pipelines) | Interactive Python, SQL, & R editor for coding data pipelines. |
| 🏗️ | [Data integrations](https://docs.mage.ai/data-integrations/overview) | Synchronize data from 3rd party sources to your internal destinations. |
| 🚰 | [Streaming pipelines](https://docs.mage.ai/guides/streaming-pipeline) | Ingest and transform real-time data. |
| ❌ | [dbt](https://docs.mage.ai/dbt/overview) | Build, run, and manage your dbt models with Mage. |
A sample data pipeline defined across 3 files:
1. Load data:
```python
import polars as pl
from mage_ai.data_preparation.decorators import data_loader

@data_loader
def load_csv_from_file() -> pl.DataFrame:
    return pl.read_csv('default_repo/titanic.csv')
```
2. Transform data:
```python
import polars as pl
from mage_ai.data_preparation.decorators import transformer

@transformer
def select_columns_from_df(df: pl.DataFrame, *args) -> pl.DataFrame:
    return df[['Age', 'Fare', 'Survived']]
```
3. Export data:
```python
import polars as pl
from mage_ai.data_preparation.decorators import data_exporter

@data_exporter
def export_titanic_data_to_disk(df: pl.DataFrame) -> None:
    df.write_csv('default_repo/titanic_transformed.csv')
```
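
To make the data flow between the three blocks concrete, here is a rough sketch of the same load, transform, export chain in plain Python, using only the standard library. It does not use Mage or Polars: the `block` decorator is a do-nothing stand-in, `run_pipeline` is a hypothetical driver written for this example, and the sample rows are invented rather than taken from the Titanic file. In Mage, the framework itself passes each block's output to the next block.

```python
import csv
import io
from typing import Callable, Dict, List

Row = Dict[str, str]


def block(func: Callable) -> Callable:
    """Hypothetical stand-in for Mage's block decorators; it does nothing."""
    return func


@block
def load_rows(source: io.StringIO) -> List[Row]:
    # Loader: parse CSV text into a list of row dictionaries.
    return list(csv.DictReader(source))


@block
def select_columns(rows: List[Row], columns: List[str]) -> List[Row]:
    # Transformer: keep only the requested columns, in the given order.
    return [{name: row[name] for name in columns} for row in rows]


@block
def export_rows(rows: List[Row], columns: List[str], target: io.StringIO) -> None:
    # Exporter: write the rows back out as CSV with a header line.
    writer = csv.DictWriter(target, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


def run_pipeline(csv_text: str, columns: List[str]) -> str:
    # Hand each block's output to the next one (Mage does this wiring itself).
    out = io.StringIO()
    rows = load_rows(io.StringIO(csv_text))
    export_rows(select_columns(rows, columns), columns, out)
    return out.getvalue()


if __name__ == "__main__":
    # Invented sample rows, used only to exercise the sketch.
    sample = "Name,Age,Fare,Survived\nAda,29,7.25,1\nBob,40,8.05,0\n"
    print(run_pipeline(sample, ["Age", "Fare", "Survived"]))
```

Keeping each step as a small function with a single input and output is what lets the steps live in separate files and be tested on their own.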