https://github.com/tiledb-inc/scverse-ml-workshop-2024
Scripts/Notebooks for "Training models on atlas-scale single-cell datasets" at scverse Conference 2024
scverse single-cell single-cell-rna-seq tiledb
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/tiledb-inc/scverse-ml-workshop-2024
- Owner: TileDB-Inc
- Created: 2024-08-29T15:49:57.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-03-08T03:45:26.000Z (over 1 year ago)
- Last Synced: 2025-04-10T04:58:18.322Z (about 1 year ago)
- Topics: scverse, single-cell, single-cell-rna-seq, tiledb
- Language: Jupyter Notebook
- Homepage: https://cfp.scverse.org/2024/talk/GQHNYE/
- Size: 9.4 MB
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Training models on atlas-scale single-cell datasets
Workshop at [scverse Conference 2024]:
- **Conference page:** [Training models on atlas-scale single-cell datasets]
- **Slides:** [Google Slides][slides], [PDF]
- **Notebook:** [workshop.ipynb] is synced to the TileDB-Cloud namespace, and a copy is created for each workshop participant (by a [GitHub Action]).
---
## `scverse`: TileDB-Cloud CLI
We used this command-line tool to manage signups to the TileDB-Cloud namespace where live notebooks were hosted during the workshop. The main entry point was `scverse user add [emails...]`, which:
- Invites a user to the TileDB-Cloud namespace used in the workshop (`scverse invite send [emails...]`)
- Creates a copy of [workshop.ipynb] in the namespace, renamed after the user's email address (`scverse nb cp [emails...]`)
- Sets relevant defaults on their copy of the notebook ("Genomics" image, "Large" size, "us-west-2" region; `scverse nb set-defaults [notebook names...]`)
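The three steps above can be sketched as the equivalent sequence of subcommand invocations. This is only a sketch: it builds argument lists and does not run anything or contact TileDB-Cloud. The helper name `user_add_commands` is hypothetical, and two details are assumptions rather than documented behavior: that each notebook copy is named exactly after the email address, and that the literal strings "Genomics", "Large", and "us-west-2" are the values the `-i`/`-z`/`-r` options accept.

```python
import shlex
from typing import List


def user_add_commands(emails: List[str]) -> List[List[str]]:
    """Build the invocations that `scverse user add` is described as wrapping.

    Hypothetical helper: it only constructs argv lists, nothing is executed.
    """
    if not emails:
        return []
    cmds = [
        ["scverse", "invite", "send", *emails],  # invite users to the namespace
        ["scverse", "nb", "cp", *emails],        # one notebook copy per email
    ]
    for email in emails:
        # ASSUMPTION: the copy is named exactly after the email address.
        nb_name = email
        # ASSUMPTION: these literal values are what the CLI expects.
        cmds.append([
            "scverse", "nb", "set-defaults",
            "-i", "Genomics", "-z", "Large", "-r", "us-west-2",
            nb_name,
        ])
    return cmds


if __name__ == "__main__":
    for cmd in user_add_commands(["alice@example.com"]):
        print(shlex.join(cmd))
```

Emitting one `set-defaults` call per notebook keeps the sketch valid whether the subcommand accepts one notebook name or several.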
### Install
```bash
git clone https://github.com/ryan-williams/scverse-workshop
pip install -e scverse-workshop
```
```bash
scverse
# Usage: scverse [OPTIONS] COMMAND [ARGS]...
#
# TileDB-Cloud CLI for the scverse project.
#
# Options:
# --help Show this message and exit.
#
# Commands:
# invite Create/List invitations to a TileDB-Cloud namespace.
# nb Notebook-CRUD TileDB-Cloud commands for the...
# org Manage a TileDB-Cloud organization / namespace.
# user Manage a scverse TileDB-Cloud user.
```
#### Auth
Put a TileDB-Cloud auth token in `.tiledb-cloud-token`; the CLI commands below pick it up automatically. If `$TILEDB_REST_TOKEN` is set, it takes precedence over the file.
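The lookup order stated in the `--cloud-token-path` help text (environment variable first, then the token file) can be sketched as follows. This is an illustrative sketch, not the CLI's actual code: the function name `resolve_token`, the whitespace stripping, and returning `None` when nothing is found are all assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_token(
    token_path: str = ".tiledb-cloud-token",
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return an auth token, preferring $TILEDB_REST_TOKEN over the token file."""
    env = os.environ if env is None else env
    token = env.get("TILEDB_REST_TOKEN")
    if token:
        # The environment variable takes precedence, if set.
        return token.strip()
    path = Path(token_path)
    if path.is_file():
        # ASSUMPTION: the file holds just the token, possibly with a trailing newline.
        return path.read_text().strip()
    # ASSUMPTION: the real CLI may raise an error here instead.
    return None
```

Passing `env` explicitly makes the precedence rule easy to check without touching the real environment.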
### `scverse org`: Show organization/user info
```bash
scverse org
# Usage: scverse org [OPTIONS] COMMAND [ARGS]...
#
# Manage a TileDB-Cloud organization / namespace.
#
# Options:
# --help Show this message and exit.
#
# Commands:
# show (info) Print a TileDB-Cloud organization.
```
```bash
scverse org show --help
# Usage: scverse org show [OPTIONS]
#
# Print a TileDB-Cloud organization.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# --help Show this message and exit.
```
### `scverse invite`: Send/List/Revoke TileDB-Cloud workspace invitations
```bash
scverse invite
# Usage: scverse invite [OPTIONS] COMMAND [ARGS]...
#
# Create/List invitations to a TileDB-Cloud namespace.
#
# Options:
# --help Show this message and exit.
#
# Commands:
# do (send) Invite a user to a TileDB-Cloud namespace.
# ls (list) List a TileDB-Cloud namespace's outstanding invitations
# rm Revoke one or more outstanding invitations to a TileDB-Cloud
# namespace
```
```bash
scverse invite send --help
# Usage: scverse invite send [OPTIONS] [EMAILS]...
#
# Invite a user to a TileDB-Cloud namespace.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -R, --role [owner|admin|read_write|read_only]
# Role to invite new user as (options:
# ['owner', 'admin', 'read_write',
# 'read_only']; default: "read_write")
# --help Show this message and exit.
```
```bash
scverse invite list --help
# Usage: scverse invite list [OPTIONS]
#
# List a TileDB-Cloud namespace's outstanding invitations
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -C, --compact Print compact JSON
# --help Show this message and exit.
```
```bash
scverse invite rm --help
# Usage: scverse invite rm [OPTIONS] [EMAILS]...
#
# Revoke one or more outstanding invitations to a TileDB-Cloud namespace
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -n, --dry-run Print commands that would be run, but don't run
# them
# -S, --no-strict Raise and exit if any email is not found,
# without revoking any invites
# --help Show this message and exit.
```
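The strictness behavior described in the `scverse invite rm` help text ("Raise and exit if any email is not found, without revoking any invites") amounts to validating every email before revoking anything. A minimal sketch of that reading follows. The names `plan_revocations` and `outstanding` are hypothetical, no TileDB-Cloud calls are made, and the sketch assumes strict checking is the default that `--no-strict` switches off.

```python
from typing import Iterable, List, Set


def plan_revocations(
    emails: Iterable[str],
    outstanding: Set[str],
    strict: bool = True,
) -> List[str]:
    """Return the emails whose invitations would be revoked.

    In strict mode, any email without an outstanding invitation aborts the
    whole operation before a single invitation is revoked.
    """
    emails = list(emails)
    missing = [e for e in emails if e not in outstanding]
    if strict and missing:
        raise ValueError(f"No outstanding invitation for: {', '.join(missing)}")
    # ASSUMPTION: in non-strict mode, unknown emails are silently skipped.
    return [e for e in emails if e in outstanding]
```

Checking first and acting second means a typo in one address cannot leave the invitation list half-modified.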
### `scverse nb`: Upload, list, and delete notebooks
```bash
scverse nb
# Usage: scverse nb [OPTIONS] COMMAND [ARGS]...
#
# Notebook-CRUD TileDB-Cloud commands for the `scverse-ml-workshop-2024`
# namespace.
#
# Options:
# --help Show this message and exit.
#
# Commands:
# cp (copy) Create one or more copies of a "template" notebook, with
# names...
# get (download) Download a TileDB-Cloud notebook; [DST] of "-" prints to
# stdout.
# ls (list) List TileDB-Cloud notebooks.
# md (metadata) Show a TileDB-Cloud notebook's metadata.
# put (upload) Upload a notebook to TileDB-Cloud.
# rm (delete) Delete a notebook from TileDB-Cloud, by namespace and
# name.
# sd (set-defaults) Set default image, region, and size for a notebook.
```
```bash
scverse nb ls --help
# Usage: scverse nb ls [OPTIONS]
#
# List TileDB-Cloud notebooks.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# --help Show this message and exit.
```
```bash
scverse nb cp --help
# Usage: scverse nb cp [OPTIONS] [EMAILS]...
#
# Create one or more copies of a "template" notebook, with names corresponding
# to provided email addresses.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -c, --credential-name TEXT Storage credential name; default: "scverse-ml-
# workshop-2024"
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -n, --dry-run Print commands that would be run, but don't
# run them
# -s, --src-notebook-name TEXT "Read-only" notebook name, to be copied and
# renamed for each user (default:
# "instructor_scverse-ml-workshop-2024").
# --help Show this message and exit.
```
```bash
scverse nb get --help
# Usage: scverse nb get [OPTIONS] NB_NAME [DST]
#
# Download a TileDB-Cloud notebook; [DST] of "-" prints to stdout.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# --help Show this message and exit.
```
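The `[DST]` of "-" convention in `scverse nb get` follows the common Unix pattern of treating `-` as standard output. A minimal sketch of that dispatch is shown here. The function `write_notebook` is hypothetical, the download step is left out entirely, and the fallback filename used when `DST` is omitted (`<NB_NAME>.ipynb`) is an assumption, not documented behavior.

```python
import sys
from typing import BinaryIO, Optional


def write_notebook(
    content: bytes,
    nb_name: str,
    dst: Optional[str] = None,
    stdout: Optional[BinaryIO] = None,
) -> str:
    """Write already-downloaded notebook bytes to `dst`; "-" means stdout."""
    if dst == "-":
        out = stdout if stdout is not None else sys.stdout.buffer
        out.write(content)
        return "-"
    # ASSUMPTION: with no DST, fall back to a file named after the notebook.
    path = dst if dst is not None else f"{nb_name}.ipynb"
    with open(path, "wb") as f:
        f.write(content)
    return path
```

Writing to stdout makes it possible to pipe a notebook into other tools without creating a temporary file.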
```bash
scverse nb md --help
# Usage: scverse nb md [OPTIONS] NB_NAME
#
# Show a TileDB-Cloud notebook's metadata.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -C, --compact Print compact JSON
# --help Show this message and exit.
```
```bash
scverse nb sd --help
# Usage: scverse nb sd [OPTIONS] NB_NAME
#
# Set default image, region, and size for a notebook.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -i, --image TEXT Default image
# -r, --region TEXT Default region
# -z, --size TEXT Default server size
# --help Show this message and exit.
```
```bash
scverse nb put --help
# Usage: scverse nb put [OPTIONS] SRC [DST_NAME]
#
# Upload a notebook to TileDB-Cloud.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -c, --credential-name TEXT Storage credential name; default: "scverse-ml-
# workshop-2024"
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -S, --storage-path TEXT Storage path; default: "s3://tiledb-
# conferences-us-west-2/scverse-ml-workshop-2024"
# -d, --delete If True, delete the notebook after uploading
# (e.g. for testing uploading/deleting)
# --help Show this message and exit.
```
```bash
scverse nb rm --help
# Usage: scverse nb rm [OPTIONS] [NB_NAMES]...
#
# Delete a notebook from TileDB-Cloud, by namespace and name.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -n, --dry-run Print commands that would be run, but don't run
# them
# --help Show this message and exit.
```
### `scverse user`: Add/Show/List organization users
```bash
scverse user
# Usage: scverse user [OPTIONS] COMMAND [ARGS]...
#
# Manage a scverse TileDB-Cloud user.
#
# Options:
# --help Show this message and exit.
#
# Commands:
# add Add and initialize (notebook copy + defaults) one or more users...
# ls List users in a TileDB-Cloud organization.
# show Show a TileDB-Cloud user.
```
```bash
scverse user show --help
# Usage: scverse user show [OPTIONS] [USERNAME]
#
# Show a TileDB-Cloud user.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -C, --compact Print compact JSON
# --help Show this message and exit.
```
```bash
scverse user ls --help
# Usage: scverse user ls [OPTIONS]
#
# List users in a TileDB-Cloud organization.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -C, --compact Print compact JSON
# --help Show this message and exit.
```
```bash
scverse user add --help
# Usage: scverse user add [OPTIONS] [EMAILS]...
#
# Add and initialize (notebook copy + defaults) one or more users to a TileDB-
# Cloud namespace.
#
# Options:
# -t, --cloud-token-path TEXT Path to file containing TileDB-Cloud auth
# token; default: ".tiledb-cloud-token".
# $TILEDB_REST_TOKEN takes precedence, if set.
# -c, --credential-name TEXT Storage credential name; default: "scverse-
# ml-workshop-2024"
# -N, --namespace TEXT TileDB-Cloud namespace to work in; default:
# "scverse-ml-workshop-2024"
# -R, --role [owner|admin|read_write|read_only]
# Role to invite new user as (options:
# ['owner', 'admin', 'read_write',
# 'read_only']; default: "read_write")
# -s, --src-notebook-name TEXT "Read-only" notebook name, to be copied and
# renamed for each user (default:
# "instructor_scverse-ml-workshop-2024").
# -i, --image TEXT Default image
# -r, --region TEXT Default region
# -z, --size TEXT Default server size
# --help Show this message and exit.
```
## Example notebooks/tutorials
See [examples/](examples/):
- [pytorch.ipynb]: copy of [Training a PyTorch Model][pytorch.html] (CELLxGENE Census tutorial)
- [cshl.ipynb]: copy of [CELLxGENE Discover Census Workshop - CSHL Single-Cell Analysis 2023][cshl-2023] (Python)
- [cshl-R.ipynb]: copy of [CELLxGENE Discover Census Workshop - CSHL Single-Cell Analysis 2023][cshl-2023 R] (R)
[Training models on atlas-scale single-cell datasets]: https://cfp.scverse.org/2024/talk/GQHNYE/
[scverse Conference 2024]: https://scverse.org/conference2024
[slides]: https://docs.google.com/presentation/d/1VnAKyOUUdzTZkgcYjoavtDU5_drFu5flC5oG6I7RnP0/edit
[PDF]: census-tiledb-atlas-scale-models_r1.pdf
[workshop.ipynb]: workshop.ipynb
[pytorch.ipynb]: examples/pytorch.ipynb
[pytorch.html]: https://chanzuckerberg.github.io/cellxgene-census/notebooks/experimental/pytorch.html
[cshl.ipynb]: examples/cshl.ipynb
[cshl-R.ipynb]: examples/cshl-R.ipynb
[Papermill]: https://papermill.readthedocs.io/en/latest/
[cshl-2023]: https://colab.research.google.com/drive/1QgZQRF_ZM9q5oKbynnD9ToklVFdui7pq
[cshl-2023 R]: https://colab.research.google.com/drive/158f6Ggl5MRxtnxC9Q01TjJMbkIPQxcim
[GitHub Action]: .github/workflows/cp-template.yml