https://github.com/naruaika/eruo-data-studio
A powerful yet friendly ETL tool powered by Polars backend
- Host: GitHub
- URL: https://github.com/naruaika/eruo-data-studio
- Owner: naruaika
- License: gpl-3.0
- Created: 2025-06-26T11:19:24.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2025-07-13T22:41:09.000Z (11 months ago)
- Last Synced: 2025-07-14T00:28:05.380Z (11 months ago)
- Topics: data-analysis, data-science, desktop-app, gnome-desktop, gtk4, proof-of-concept, python, spreadsheet
- Language: Python
- Homepage:
- Size: 1.16 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING
# Eruo Data Studio
A powerful yet friendly ETL (Extract, Transform, Load) tool powered by a [Polars](https://pola.rs/) backend, aimed at the large community of data scientists who use Python.

## Status
Currently in early development. Stay tuned for updates.
## Use Cases
[TODO]
## Features
[TODO]
## Background
[TODO]
## Limitations
Currently we only support x86_64 architectures and Linux distributions using `glibc` (GNU C Library), because we have not yet sorted out dependency management for other targets. Building [Polars](https://pola.rs/) from source doesn't seem too complicated, though, so we'll try again in the near future.
Since we started developing the proof-of-concept with Libadwaita, the building blocks for GNOME applications, it's supposed to be compatible only with the GNOME desktop environment. I think the application may still look correct and good on other desktop environments. In any case, we'll add support for Windows in the future, and hopefully for macOS and web platforms as well!
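The platform constraint above (Linux, x86_64, `glibc`) can be checked from Python with the standard-library `platform` module. The following is only a minimal sketch, not code from this project: the function names `is_supported_platform` and `detect_platform` are made up for illustration.

```python
import platform


def is_supported_platform(system: str, machine: str, libc: str) -> bool:
    """Return True only for the combination this README lists as supported."""
    return system == "Linux" and machine == "x86_64" and libc == "glibc"


def detect_platform() -> tuple[str, str, str]:
    """Collect (OS name, CPU architecture, libc name) for the running interpreter."""
    # libc_ver() returns an empty name when it cannot identify glibc,
    # which is typically the case on musl-based distributions.
    libc_name, _libc_version = platform.libc_ver()
    return platform.system(), platform.machine(), libc_name


if __name__ == "__main__":
    system, machine, libc = detect_platform()
    if is_supported_platform(system, machine, libc):
        print("This platform matches the supported configuration.")
    else:
        print(f"Unsupported platform: {system} / {machine} / {libc or 'unknown libc'}")
```

Keeping the decision in a pure function makes it easy to extend once other architectures or C libraries are supported.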
## Designing
The following are some of the resources used in decision making and development planning:
- [The ONLY Data Cleaning Framework You Need | Ep. 3](https://www.youtube.com/watch?v=y9wFFD2bXQM)
- [How to NAIL Exploratory Data Analysis (Lead Analyst Demo)](https://www.youtube.com/watch?v=deS6lETubdU)
We've also been researching similar applications:
- [Microsoft Excel](https://www.microsoft.com/en-us/microsoft-365/excel)
- [Google Sheets](https://workspace.google.com/intl/en_id/products/sheets/)
- [Power BI](https://www.microsoft.com/en-us/power-platform/products/power-bi)
- [Tableau](https://www.tableau.com/products/desktop)
- [Alteryx Designer](https://www.alteryx.com/products/alteryx-designer)
- [SmoothCSV](https://smoothcsv.com/)
## Planning
[TODO]
## Development
The recommended way to build and run this project is using [GNOME Builder](https://apps.gnome.org/Builder/).
I personally use [Visual Studio Code](https://code.visualstudio.com/), but you can use whatever editor you prefer. To build and run with Flatpak from VS Code, consider installing the [Flatpak](https://marketplace.visualstudio.com/items?itemName=bilelmoussaoui.flatpak-vscode) extension. Run the following commands in the terminal to install the dependencies (on Fedora):
```sh
sudo dnf install flatpak flatpak-builder --assumeyes
flatpak remote-add --if-not-exists gnome-nightly https://nightly.gnome.org/gnome-nightly.flatpakrepo
flatpak install gnome-nightly org.gnome.Platform//master
flatpak install gnome-nightly org.gnome.Sdk//master
```
Type and run `Flatpak: Select or Change Active Manifest` in the command palette (F1 or Ctrl+Shift+P) and select the `com.macipra.eruo.Devel.json` manifest file. Finally, type and run `Flatpak: Build and Run` in the command palette or simply hit Ctrl+Alt+B.
If you're using a Python language server, you may want to install the requirements. For better dependency management, it's recommended to create a virtual environment rather than installing packages globally:
```sh
python -m venv .pyenv
source .pyenv/bin/activate
pip install -r requirements-devel.txt
```
To add new [`pip`](https://packaging.python.org/en/latest/key_projects/#pip) dependencies to the [`flatpak-builder`](https://docs.flatpak.org/en/latest/flatpak-builder.html) manifest JSON file, you can use [`flatpak-pip-generator`](https://github.com/flatpak/flatpak-builder-tools/tree/master/pip). Then either add a reference to the generated file in the `com.macipra.eruo*.json` files, or copy its content directly into the manifest files and delete the generated file. Do not forget to update the `requirements*.txt` files as well.
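For the first option, referencing the generated file from the manifest, a `flatpak-builder` manifest accepts a file path as an entry in its `modules` list. The sketch below shows the idea on an in-memory dictionary. It makes several assumptions: the manifest content, the module name `eruo`, the `meson` build system, and the generated file name `python3-requirements.json` are all placeholders, since the real manifest is not shown here. The helper `add_module_reference` is likewise hypothetical.

```python
import json


def add_module_reference(manifest: dict, module_file: str) -> dict:
    """Return a copy of the manifest whose module list references module_file."""
    modules = list(manifest.get("modules", []))
    if module_file not in modules:
        # flatpak-builder builds modules in the order listed, so the Python
        # dependencies go before the application module that needs them.
        modules.insert(0, module_file)
    return {**manifest, "modules": modules}


# Placeholder manifest; the real com.macipra.eruo.Devel.json will differ.
manifest = {
    "id": "com.macipra.eruo.Devel",
    "modules": [{"name": "eruo", "buildsystem": "meson"}],
}

# Assumed name of the file produced by flatpak-pip-generator.
updated = add_module_reference(manifest, "python3-requirements.json")
print(json.dumps(updated, indent=2))
```

If the real manifest contains comments, Python's `json` module will refuse to parse it, so editing the file by hand may be the simpler route.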
Here are some useful references for the project development:
- Flatpak: https://docs.flatpak.org/en/latest/index.html
- Flathub: https://docs.flathub.org/docs/category/for-app-authors
- GNOME developer: https://developer.gnome.org/documentation/index.html
- GNOME Python API: https://api.pygobject.gnome.org/index.html
- GTK4: https://docs.gtk.org/gtk4/index.html
- PyGObject: https://gnome.pages.gitlab.gnome.org/pygobject/index.html
- Pycairo: https://pycairo.readthedocs.io/en/latest/index.html
- Libadwaita: https://gnome.pages.gitlab.gnome.org/libadwaita/doc/1.4/index.html
- Polars: https://docs.pola.rs/api/python/stable/reference/index.html
Please bear with us: most of the docstrings are AI-generated, and only some of them were written under my supervision. Your help will be greatly appreciated.
## Licenses
This project is distributed under the [GNU General Public License Version 3](https://www.gnu.org/licenses/gpl-3.0.en.html). We use GTK and [Libadwaita](https://gitlab.gnome.org/GNOME/libadwaita) to build the user interface, both of which are licensed under the [GNU Lesser General Public License Version 2.1](https://www.gnu.org/licenses/lgpl-2.1.en.html). The backend for data manipulation uses [Polars](https://pola.rs/), which is distributed under the [MIT License](https://opensource.org/license/mit). For other dependencies, see the `requirements.txt` file.