https://github.com/yuummmer/fairy-lab

a FAIRy prototype that validates and guides datasets through repository-ready, standards-compliant metadata.

bioinformatics data-validation fair-data metadata reproducibility


Host: GitHub
URL: https://github.com/yuummmer/fairy-lab
Owner: yuummmer
License: mit
Created: 2025-09-24T01:12:30.000Z (9 months ago)
Default Branch: main
Last Pushed: 2025-12-03T17:33:03.000Z (7 months ago)
Last Synced: 2025-12-06T22:30:10.758Z (7 months ago)
Topics: bioinformatics, data-validation, fair-data, metadata, reproducibility
Language: Python
Homepage:
Size: 51.7 MB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 63
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Citation: CITATION.cff


# FAIRy Lab

> **External name:** FAIRy Lab (GitHub: `FAIRy-lab`) • **Internal name:** `fairy-skeleton`
> Web + CLI demo runner for [FAIRy Core](https://github.com/yuummmer/fairy-core)

FAIRy Lab is the self-hosted demo environment for **FAIRy** — a local-first validator and packager for FAIR-friendly, repository-ready datasets.

This repo gives you:

- A small CLI demo runner (`fairy-skel`) that calls the real engine in **FAIRy Core**
- Toy datasets and configs that show PASS / WARN / FAIL workflows
- Example outputs you can use in demos, talks, and screenshots
- (Experimental) a Streamlit app entry point (`app.py`) for a lightweight web UI

**FAIRy Lab is about workflows and demos.**
**FAIRy Core is where the validator engine and rulepacks live.**

---

## What FAIRy Lab is / isn’t

**FAIRy Lab _is_:**

- A reference, self-hosted “lab” for exploring FAIRy:
  - Upload / point at sample TSVs
  - Run rulepacks via `fairy-skel`
  - Inspect findings, PASS/WARN/FAIL, and reports
- A place to keep:
  - Demo configs
  - Toy datasets
  - Example outputs (for screenshots, grants, and onboarding)

**FAIRy Lab _is not_:**

- The validator engine (that’s **FAIRy Core**)
- A multi-tenant hosted service (that would be a future “FAIRy Preflight+ / Teams” product)

All actual validation logic, rulepacks, and the `fairy` CLI live in
👉 **[FAIRy Core](https://github.com/yuummmer/fairy-core)**

---

## TL;DR (quick start)

Assuming `fairy-core` and `fairy-skeleton` are cloned as sibling directories:

```bash
# 1) Create a virtualenv
python -m venv .venv
source .venv/bin/activate

# 2) Install FAIRy Core (engine)
cd ../fairy-core
pip install -e .

# 3) Install FAIRy Lab (this repo)
cd ../fairy-skeleton
pip install -e .

# 4) List and run demos
fairy-skel demos
fairy-skel run bulk_rnaseq_min # intentionally FAILS (shows findings)
fairy-skel run bulk_rnaseq_pass # clean PASS

```
Each demo writes its output to the `out` path set in its `config.yaml`.

---
## Requirements
- Python 3.10+ (FAIRy-core polyfills datetime.UTC for 3.10)
- Unix-like shell (Linux/macOS/WSL)
- pip and venv (or conda/mamba equivalent)
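
For context on the Python 3.10 note above: `datetime.UTC` was added in Python 3.11, so code that also targets 3.10 usually falls back to `timezone.utc`. The snippet below is a sketch of that common pattern, not a copy of FAIRy-core's actual polyfill.

```python
from datetime import datetime, timezone

try:
    # Python 3.11+ ships datetime.UTC as an alias for timezone.utc
    from datetime import UTC
except ImportError:
    # Python 3.10 fallback: define the same alias ourselves
    UTC = timezone.utc

# Timezone-aware "now" that behaves the same on 3.10 and 3.11+
now = datetime.now(UTC)
print(now.isoformat())
```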

---
## Getting Started
1. Clone both repos side-by-side (recommended layout):
   - `projects/`
     - `fairy-core/`
     - `fairy-skeleton/`
2. Install:

```bash
cd projects/fairy-core
python -m venv .venv && source .venv/bin/activate
pip install -e .

cd ../fairy-skeleton
pip install -e .

```
3. Run a demo:
```bash
fairy-skel demos
fairy-skel run bulk_rnaseq_min # FAIL + WARN (shows findings)
fairy-skel run bulk_rnaseq_pass # PASS (submission_ready: True)

```
Or call the engine directly:
```bash
fairy preflight \
  --rulepack /absolute/path/to/fairy-core/src/fairy/rulepacks/GEO-SEQ-BULK/v0_1_0.json \
  --samples /absolute/path/to/samples.tsv \
  --files /absolute/path/to/files.tsv \
  --out /path/to/out/report.json

```
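If you drive the engine from a script rather than a shell, the same call can be assembled in Python. This is a minimal sketch: the four flags come from the command above, while the helper names (`preflight_argv`, `run_preflight`) are hypothetical. Creating the output directory up front is an assumption about what is convenient, not documented engine behavior, and the meaning of the exit code is not assumed.

```python
import subprocess
from pathlib import Path


def preflight_argv(rulepack, samples, files, out):
    """Build the argv for `fairy preflight`, resolving every path to absolute."""
    def absolute(p):
        return str(Path(p).expanduser().resolve())

    return [
        "fairy", "preflight",
        "--rulepack", absolute(rulepack),
        "--samples", absolute(samples),
        "--files", absolute(files),
        "--out", absolute(out),
    ]


def run_preflight(rulepack, samples, files, out):
    """Run the engine and return its exit code without raising on non-zero."""
    # ASSUMPTION: pre-creating the report directory is harmless and convenient.
    Path(out).expanduser().resolve().parent.mkdir(parents=True, exist_ok=True)
    return subprocess.run(preflight_argv(rulepack, samples, files, out)).returncode
```

Passing a list to `subprocess.run` avoids shell quoting problems when paths contain spaces.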

## 🧪 Demos
Each demo is a folder under demos/ with a config.yaml:
```yaml
rulepack: /abs/path/to/fairy-core/src/fairy/rulepacks/GEO-SEQ-BULK/v0_1_0.json
inputs:
  samples: demos/<demo_name>/inputs/samples.tsv
  files: demos/<demo_name>/inputs/files.tsv
out: demos/<demo_name>/out/report.json
```
Current demos:
- bulk_rnaseq_min - intentionally FAIL + WARN to show findings
- bulk_rnaseq_pass - clean PASS

List all the demos:
```bash
fairy-skel demos
```
Run one:
```bash
fairy-skel run bulk_rnaseq_pass   # or any other name printed by "fairy-skel demos"
```
---
## 🗺️ Roadmap (v0.1 scope)
- Streamlit Export & Validate tab wired to backend (warn-mode).
- Deterministic `report.json` writer validated by JSON Schema.
- Golden fixture test for `bad.csv`.
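
To illustrate what a deterministic writer can look like, here is a minimal sketch using only the standard library. It is not the planned implementation: the function name and the report fields are hypothetical, and JSON Schema validation (which needs a third-party library) is left out. A real writer would also need a stable order for lists and would have to exclude or pin volatile values such as timestamps.

```python
import json


def dumps_deterministic(report):
    """Serialize so that equal content always yields identical bytes."""
    text = json.dumps(
        report,
        sort_keys=True,       # key order no longer depends on insertion order
        indent=2,
        ensure_ascii=False,   # keep non-ASCII text readable and stable
    )
    return (text + "\n").encode("utf-8")


# Same content, different insertion order: the bytes still match.
a = dumps_deterministic({"status": "FAIL", "findings": [{"code": "X1", "row": 2}]})
b = dumps_deterministic({"findings": [{"row": 2, "code": "X1"}], "status": "FAIL"})
assert a == b
```

Byte-identical output is what makes a golden fixture test practical, because the expected file can be compared directly.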

(See GitHub issues for v0.2 items like bundles, manifests, ZIP export, and provenance.)

---
## Create your own demo
```bash
mkdir -p demos/my_demo/inputs

# Inputs must be TAB-separated. printf reuses its format for each group of
# arguments and expands \t to a real TAB, so copy/paste cannot break columns.
# samples.tsv
printf '%s\t%s\t%s\n' \
  sample_id organism collection_date \
  S1 'Homo sapiens' 2025-01-01 \
  S2 'Homo sapiens' 2025-01-02 \
  > demos/my_demo/inputs/samples.tsv

# files.tsv
printf '%s\t%s\n' \
  sample_id path \
  S1 reads/S1_R1.fastq.gz \
  S2 reads/S2_R1.fastq.gz \
  > demos/my_demo/inputs/files.tsv

# config.yaml (samples and files are nested under inputs)
cat > demos/my_demo/config.yaml <<'YAML'
rulepack: /absolute/path/to/fairy-core/src/fairy/rulepacks/GEO-SEQ-BULK/v0_1_0.json
inputs:
  samples: demos/my_demo/inputs/samples.tsv
  files: demos/my_demo/inputs/files.tsv
out: demos/my_demo/out/report.json
YAML

# Try it
fairy-skel run my_demo

```
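The TSV inputs above can also be generated from Python, which guarantees real TAB separators and catches rows with the wrong number of columns. This is a minimal sketch using the standard `csv` module; `write_tsv` is a hypothetical helper, not part of `fairy-skel`.

```python
import csv
from pathlib import Path


def write_tsv(path, header, rows):
    """Write rows as TAB-separated text, rejecting rows with the wrong width."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        for row in rows:
            if len(row) != len(header):
                raise ValueError(
                    f"expected {len(header)} columns, got {len(row)}: {row!r}"
                )
            writer.writerow(row)


def main(root="demos/my_demo/inputs"):
    """Recreate the two demo inputs from the shell example."""
    root = Path(root)
    write_tsv(root / "samples.tsv", ["sample_id", "organism", "collection_date"],
              [["S1", "Homo sapiens", "2025-01-01"],
               ["S2", "Homo sapiens", "2025-01-02"]])
    write_tsv(root / "files.tsv", ["sample_id", "path"],
              [["S1", "reads/S1_R1.fastq.gz"], ["S2", "reads/S2_R1.fastq.gz"]])
```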
Tip: `inputs:` can also point to files inside `fairy-core/demos/...` via absolute paths or symlinks.

---
## Repo structure & legacy note
- `fairy_skeleton/`: small demo runner CLI (`fairy-skel`)
- `demos/`: demo configs, inputs, and outputs
- `scripts/run_demo.sh`: helper invoked by the runner
- `_legacy/`: archived code moved out of the package; not maintained.
  A `.gitattributes` rule keeps our branch's version of this folder during merges.

Packaging: only `fairy_skeleton*` is packaged. The validator engine lives in FAIRy-core.

---
## FAIRy-core + versions
- Engine repo: https://github.com/yuummmer/fairy-core
- CLI commands available there: fairy validate, fairy preflight
- This skeleton requires FAIRy-core ≥ 0.1.0
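
A quick way to check the minimum-version requirement from Python is sketched below. The distribution name `fairy-core` is an assumption (check the `name` field in the core repo's packaging metadata), and the helper names are hypothetical. The comparison is deliberately simple: it ignores pre-release suffixes, so `0.1.0rc1` is treated like `0.1.0`.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.1.0' into (0, 1, 0), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    return tuple(parts)


def meets_minimum(installed, minimum="0.1.0"):
    """Compare dotted versions numerically, padding so '0.1' equals '0.1.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_core_version(dist_name="fairy-core"):
    """Return the installed version string, or None if it is not installed.

    ASSUMPTION: "fairy-core" is a guess at the distribution name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```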

(Coming soon: a tiny matrix mapping skeleton tags → minimum core version.)

---
## Development & tests
This repo has only a minimal smoke-test setup; most tests live in FAIRy-core.
```bash
pytest -q
```
Suggested (optional) GitHub Actions “smoke” workflow:

- Run `pip install -e .` for both core and skeleton
- Run `fairy-skel run bulk_rnaseq_min` and assert that an output file exists
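
The second step could be scripted roughly as follows. This is a sketch, not an existing script in this repo: `smoke` is a hypothetical helper, and the report path in the usage comment is an assumption based on the demo config layout. The exit code is intentionally not checked, because `bulk_rnaseq_min` is meant to FAIL and this README does not say how the runner signals that.

```python
import json
import subprocess
import sys
from pathlib import Path


def smoke(command, report_path):
    """Run `command`, then check that `report_path` exists and parses as JSON."""
    report_path = Path(report_path)
    result = subprocess.run(command, capture_output=True, text=True)
    assert report_path.is_file(), f"no report at {report_path}\n{result.stderr}"
    with report_path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__" and len(sys.argv) == 3:
    # ASSUMED usage:
    #   python smoke.py bulk_rnaseq_min demos/bulk_rnaseq_min/out/report.json
    smoke(["fairy-skel", "run", sys.argv[1]], sys.argv[2])
```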
---
## Issues & support
- Engine/validator bugs → FAIRy-core issues
- Demo runner or example configs here → this repo’s issues

Security reports should target FAIRy-core.

---
## 📜 License

- **FAIRy Lab** (this repo, internally `fairy-skeleton`):
  Application / UI code in this repository is licensed under the **MIT License**.
  See [`LICENSE`](./LICENSE).

- **FAIRy-core (engine)**:
  Licensed under **AGPL-3.0-only** in the core repository.

  When this Lab UI is used together with FAIRy-core, the AGPL terms apply to
  FAIRy-core and any modifications to it. The Lab code in this repo remains MIT,
  and may also be adapted to work with other backends.

- **Third-party components**:
  See [`THIRD_PARTY_LICENSES.md`](./THIRD_PARTY_LICENSES.md) if present.

The underlying engine, **FAIRy-core**, is licensed under **AGPL-3.0-only**
(see the core repository for details). If you embed or call FAIRy-core as part
of a product or service, the AGPL terms for FAIRy-core still apply unless you
have a separate commercial license for the core. For commercial licensing
questions around FAIRy-core, contact **hello@datadabra.com**.

---

## 📸 Screenshot

### Dashboard view
![FAIRy Dashboard](FAIRy_Dash.png)

---

## Citation

If you use FAIRy in demos, talks, or publications, please cite:

FAIRy (v0.1). Local-first validator for FAIR, AI-ready research data.
FAIRy-core (engine): https://github.com/yuummmer/fairy-core
FAIRy Lab (UI & labs): https://github.com/yuummmer/FAIRy-lab

For more detailed citation metadata (authors, version, DOI if applicable),
see [`CITATION.cff`](./CITATION.cff).
