{"id":35076530,"url":"https://github.com/flyingriverhorse/skyulf","last_synced_at":"2026-04-25T18:01:43.473Z","repository":{"id":323298881,"uuid":"1087915899","full_name":"flyingriverhorse/Skyulf","owner":"flyingriverhorse","description":"Build and ship production ML pipelines faster: a pipeline library with an optional self-hosted visual layer for modular, reproducible workflows, local testing, and experiment tracking.","archived":false,"fork":false,"pushed_at":"2026-04-23T16:11:26.000Z","size":107491,"stargazers_count":41,"open_issues_count":9,"forks_count":4,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-04-23T18:15:35.966Z","etag":null,"topics":["celery","data-orchestration","deep-learning","docker-compose","experiment-tracking","feature-engineering-python","local-first","low-code","machine-learning","ml-pipeline","ml-platform-workflow","mlops","mlops-workflow","model-deployment","model-registry","privacy-first","react","redis","self-hosted","visual-programming"],"latest_commit_sha":null,"homepage":"https://www.skyulf.com/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/flyingriverhorse.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":".github/SUPPORT.md","governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":"COPYRIGHT.md","agents":null,"dco":null,"cla":"CLA.md"},"funding":{"github":["flyingriverhorse"],"ko_fi":"flyingriverhorse","custom":["https://buymeacoffee.com/flyingriverhorse"]}},"created_at":"2025-11-
01T22:49:48.000Z","updated_at":"2026-04-21T05:56:12.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/flyingriverhorse/Skyulf","commit_stats":null,"previous_names":["flyingriverhorse/skyulf"],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/flyingriverhorse/Skyulf","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flyingriverhorse%2FSkyulf","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flyingriverhorse%2FSkyulf/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flyingriverhorse%2FSkyulf/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flyingriverhorse%2FSkyulf/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/flyingriverhorse","download_url":"https://codeload.github.com/flyingriverhorse/Skyulf/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/flyingriverhorse%2FSkyulf/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32271243,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T09:15:33.318Z","status":"ssl_error","status_checked_at":"2026-04-25T09:15:31.997Z","response_time":59,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["celery","data-orchestration","deep-learning","docker-compose","experiment-tracking","feature-engineering-python","local-first","low-code","machine-learning","ml-pipeline","ml-platform-workflow","mlops","mlops-workflow","model-deployment","model-registry","privacy-first","react","redis","self-hosted","visual-programming"],"created_at":"2025-12-27T12:08:59.891Z","updated_at":"2026-04-25T18:01:43.467Z","avatar_url":"https://github.com/flyingriverhorse.png","language":"HTML","funding_links":["https://github.com/sponsors/flyingriverhorse","https://ko-fi.com/flyingriverhorse","https://buymeacoffee.com/flyingriverhorse"],"categories":[],"sub_categories":[],"readme":"# Skyulf\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"static/img/logo.png\" alt=\"Skyulf Logo\" 
width=\"200\"\u003e\n\u003c/p\u003e\n\n[![License](https://img.shields.io/badge/license-Apache--2.0-blue)](LICENSE)\n[![Commercial](https://img.shields.io/badge/enterprise-support-blueviolet)](COMMERCIAL-LICENSE.md)\n[![Python](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12-blue)](#quick-start)\n[![CI](https://github.com/flyingriverhorse/Skyulf/actions/workflows/ci.yml/badge.svg)](https://github.com/flyingriverhorse/Skyulf/actions/workflows/ci.yml)\n[![Docs](https://github.com/flyingriverhorse/Skyulf/actions/workflows/docs.yml/badge.svg)](https://github.com/flyingriverhorse/Skyulf/actions/workflows/docs.yml)\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit\u0026logoColor=white)](.pre-commit-config.yaml)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![Checked with mypy](http://www.mypy-lang.org/static/mypy_badge.svg)](http://mypy-lang.org/)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)\n[![Skyulf](https://img.shields.io/badge/Skyulf-Privacy--First_MLOps_Hub-blueviolet)](#key-features)\n[![codecov](https://codecov.io/github/flyingriverhorse/Skyulf/graph/badge.svg?token=47ED2R6ZHC)](https://codecov.io/github/flyingriverhorse/Skyulf)\n[![Codacy Badge](https://app.codacy.com/project/badge/Grade/51e3ad3ce18e41b2922cf62a6dd6ce99)](https://app.codacy.com/gh/flyingriverhorse/Skyulf/dashboard?utm_source=gh\u0026utm_medium=referral\u0026utm_content=\u0026utm_campaign=Badge_grade)\n[![Downloads](https://img.shields.io/pypi/dm/skyulf-core.svg)](https://pypi.org/project/skyulf-core)\n[![issues](https://img.shields.io/github/issues/flyingriverhorse/Skyulf.svg)](https://github.com/flyingriverhorse/Skyulf/issues) \n[![contributors](https://img.shields.io/github/contributors/flyingriverhorse/Skyulf.svg)](https://github.com/flyingriverhorse/Skyulf/graphs/contributors)\n\n\u003e **Skyulf:** The Visual MLOps 
Builder\n\nSkyulf is a self-hosted, privacy-first MLOps hub. It is designed to be the \"glue\" that holds your data science workflow together (a project export option is coming soon). Bring your data, clean it visually, engineer features with a node-based canvas, and train models, all in one place.\n\n## What is the meaning of Skyulf?\n\nI named it Skyulf after two ideas. Sky is the open space above Earth, where the sun, moon, stars, and clouds live. Ulf means “wolf,” with Nordic roots, and the wolf is also a strong symbol in Turkic tradition. Together they fit the project: independent and helpful to the community.\n\n## Table of Contents\n\n- [Quick Start](#quick-start)\n- [Using Skyulf as a Library](#using-skyulf-as-a-library)\n- [Key Features](#key-features)\n- [Roadmap](#roadmap)\n- [Version History](#version-history)\n- [Workflow Overview](#workflow-overview)\n- [Development](#development)\n- [Contributing](#contributing)\n- [License](#license)\n\n## Quick Start\n\nPrerequisites: **Python 3.10+**\n\n### Fastest Path (One Command)\n\n**Windows:** Double-click `start.bat`  \n**macOS/Linux:** Run `./start.sh`\n\nThese scripts auto-create a virtualenv, install dependencies, generate a `.env` with safe defaults (SQLite, no Redis), and launch the server. Open the app in your browser when ready.\n\n### On Windows PowerShell (Manual)\n\n**Using pip:**\n```powershell\npython -m venv .venv\n.\\.venv\\Scripts\\Activate.ps1\npip install --upgrade pip\npip install -r requirements-fastapi.txt\npython run_skyulf.py\n```\n\n**Using uv (Faster):**\n```powershell\nuv venv\n.\\.venv\\Scripts\\Activate.ps1\nuv pip install -r requirements-fastapi.txt\npython run_skyulf.py\n```\n\nThe `run_skyulf.py` script will automatically start the FastAPI server.\n\n**Optional: Celery \u0026 Redis**\nBy default, Skyulf uses Celery and Redis for robust background task management. 
However, for simple local testing or environments where you cannot run Redis, you can disable this dependency.\n\nAdd this to your `.env` file:\n```ini\nUSE_CELERY=false\n```\nWhen disabled, background tasks (training, ingestion) will run in background threads within the main application process instead of a separate worker.\n\n### S3 Configuration (Optional)\nTo enable S3 integration for data and artifacts, add these to your `.env`:\n```ini\nAWS_ACCESS_KEY_ID=your_key\nAWS_SECRET_ACCESS_KEY=your_secret\nAWS_REGION=us-east-1\nS3_BUCKET_NAME=your-bucket\n# Optional: Upload local training artifacts to S3\nUPLOAD_TO_S3_FOR_LOCAL_FILES=true\n# Optional: Force local storage even for S3 data\nSAVE_S3_ARTIFACTS_LOCALLY=false\n```\n\n### With Docker Compose (Recommended)\n\n```powershell\ndocker compose up --pull=always --build\n```\n\nThis will start the full stack:\n- **FastAPI Backend** (Port 8000)\n- **Redis** (Port 6379)\n- **Celery Worker** (Background jobs)\n\n**Open:**\n- API health — http://127.0.0.1:8000/health\n- Docs (dev mode) — http://127.0.0.1:8000/docs\n\n## Skyulf Core Library\n\nThe core machine learning logic of Skyulf (preprocessing, modeling, tuning) is available as a standalone library on PyPI. You can use it to build reproducible pipelines in your own scripts or notebooks, independent of the web platform.\n\n```bash\n# Base (lightweight)\npip install skyulf-core\n\n# EDA-focused (recommended for profiling + charts)\npip install skyulf-core[eda,viz]\n\n# Full optional feature set\npip install skyulf-core[eda,viz,tuning,preprocessing-imbalanced,modeling-xgboost]\n\n# or\nuv add skyulf-core\n\n# EDA-focused with uv\nuv add \"skyulf-core[eda,viz]\"\n\n# Full optional feature set with uv\nuv add \"skyulf-core[eda,viz,tuning,preprocessing-imbalanced,modeling-xgboost]\"\n```\n\n## Using Skyulf as a Library\n\nSkyulf isn't just a web application; its core logic is available as a standalone Python library (`skyulf-core`). 
You can use it in your own scripts or Jupyter notebooks for powerful EDA and pipeline building. Using EDA is a great way to get started and it is really easy to use.\n\n### Example: Automated EDA\n\n```python\nimport polars as pl\nfrom skyulf.profiling.analyzer import EDAAnalyzer\nfrom skyulf.profiling.visualizer import EDAVisualizer\n\n# 1. Load Data\ndf = pl.read_csv(\"your_dataset.csv\")\n\n# 2. Run Analysis\nanalyzer = EDAAnalyzer(df)\nprofile = analyzer.analyze(\n    target_col=\"target\",\n    task_type=\"Classification\", # Optional: Force \"Classification\" or \"Regression\"\n    date_col=\"timestamp\",       # Optional: Manually specify if auto-detection fails\n    lat_col=\"latitude\",         # Optional\n    lon_col=\"longitude\"         # Optional\n)\n\n# 3. Visualize Results (The Easy Way)\n# This single class handles all the rich terminal output and matplotlib plots\nviz = EDAVisualizer(profile, df)\n\n# Print the dashboard\nviz.summary()\n\n# Show the plots\nviz.plot()\n```\n\nFor detailed examples including **Time Series**, **Geospatial Analysis**, and **Causal Inference**, see the [EDA User Guide](docs/user_guide/eda_profiling.md).\n\n## Key Features\n\n*   **🎨 Visual Feature Canvas:** A node-based editor to clean, transform, and engineer features without writing spaghetti code. 
(25+ built-in nodes).\n*   **Automated EDA:** Professional-grade Exploratory Data Analysis with interactive charts, causal discovery (DAGs), decision trees for rule extraction, segmentation, outlier detection, and statistical alerts.\n*   **Drift Analysis:** Built on the EDA engine to monitor data and model drift over time with statistical tests and visualizations.\n*   **High-Performance Engine:** Built on **FastAPI** and **Polars** for lightning-fast data processing and easy API extension.\n*   **⚡ Async by Default:** Heavy training jobs run in the background via Celery \u0026 Redis (or background threads), so your UI never freezes.\n*   **💾 Flexible Data:** Ingest CSV, Excel, JSON, Parquet, or SQL. Start with SQLite (zero-config) and scale to PostgreSQL.\n*   **☁️ S3 Integration:** Full support for S3-compatible storage (AWS, MinIO) for data ingestion, artifact storage, and model registry.\n*   **🧠 Model Training:** Built-in support for Scikit-Learn models with hyperparameter search (Grid/Random/Halving) and optional Optuna integration.\n*   **📦 Model Registry \u0026 Deployment:** Version control your models, track metrics, and deploy them to a live inference API with a single click.\n*   **📊 Experiment Tracking:** Compare multiple runs side-by-side with interactive charts, confusion matrices, and ROC curves.\n\n## Roadmap\n\nWe have a clear vision to turn Skyulf into a complete **App Hub** for AI.\n\n*   **Phase 1: Polish \u0026 Stability** (Done) - Architecture work, type safety, and documentation.\n*   **Phase 2: Deepening Data Science** (Current Focus) - Advanced EDA, Ethics/Fairness checks, Synthetic Data, Public Data Hubs, more models, NLP, and more.\n*   **Phase 3: The \"App Hub\" Vision** - Plugin system, GenAI/LLM Builders, and Deployment.\n*   **Phase 4: Expansion** - Real-time collaboration, Edge/IoT export, and Audio support.\n\n👉 **[View the full ROADMAP.md](./ROADMAP.md)** for details.\n\n## Version History\n\nWe maintain a detailed changelog of all 
major updates, new features, and architectural changes.\n\n👉 **[View the full CHANGELOG.md](./CHANGELOG.md)** for the version index, or browse the detailed series files in [`changelog/`](./changelog/).\n\n## Workflow Overview\n\nThe high-level flow from dataset to model training inside Skyulf:\n\n\u003cp align=\"center\"\u003e\n\t\u003cimg src=\"static/img/image.png\" alt=\"Dataset → Train/Val/Test split → Celery-driven model trainer\" width=\"520\"\u003e\n\t\u003cbr /\u003e\n\t\u003cem\u003eDataset source → train/val/test split → background model training (Celery)\u003c/em\u003e\n\u003c/p\u003e\n\n## Development\n- Configuration via `backend/config/` package with domain mixins (SQLite, dev CORS)\n- Lifespan hooks initialize the async DB engine automatically\n- Tests under `tests/` cover core feature engineering and training helpers\n- `docker-compose.yml` to run API + Redis (+ Celery worker)\n\n## Contributing\n\nWe welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for setup and workflow guidance, and read our [Code of Conduct](CODE_OF_CONDUCT.md).\n\n## License\n\nSkyulf uses a split licensing model to balance open standards with sustainable development:\n\n*   **Backend \u0026 Core:** [Apache 2.0](LICENSE) (Permissive) - Ideal for integration and enterprise use.\n*   **Frontend (ML Canvas):** [GNU AGPLv3](frontend/ml-canvas/LICENSE) (Copyleft) - Ensures UI improvements are shared back to the community.\n\n**Commercial Use:**\nNo separate commercial license is required for internal use or building proprietary plugins on the backend.\nHowever, if you are building a proprietary SaaS that modifies the frontend and cannot comply with AGPLv3, please see [`COMMERCIAL-LICENSE.md`](COMMERCIAL-LICENSE.md) for partnership options.\n\n---\n\nIf you'd like to contribute, sponsor, or request a commercial license, please star the repo, open a Discussion or issue, or see `.github/FUNDING.yml` for sponsorship options.\n\n---\n\n## 🤝 Join the Journey\n\nI'm building 
this because I love it, but I can't do it alone forever.\n*   **Try it out:** Clone the repo, run it, break it.\n*   **Give Feedback:** Tell me what sucks. Tell me what you love.\n*   **Contribute:** Even a typo fix in the README helps.\n\nLet's build the simplest, most powerful MLOps hub together.\n\n\u003e \"Not all those who wander are lost.\" — J.R.R. Tolkien \u003cimg src=\"static/images/lotr-ring.svg\" alt=\"ring\" width=\"20\" height=\"20\" style=\"vertical-align:middle;margin-left:6px;\"\u003e\n\n---\n\n© 2025 Murat Unsal — Skyulf Project  \nSPDX-License-Identifier: Apache-2.0\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflyingriverhorse%2Fskyulf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fflyingriverhorse%2Fskyulf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fflyingriverhorse%2Fskyulf/lists"}