https://github.com/fal-ai/lavender-data
Load & manage evolving datasets efficiently
- Host: GitHub
- URL: https://github.com/fal-ai/lavender-data
- Owner: fal-ai
- License: MIT
- Created: 2025-03-24T08:19:38.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-06-16T02:08:53.000Z (8 months ago)
- Last Synced: 2025-06-16T03:38:28.432Z (8 months ago)
- Topics: data, dataloader, ml, torch
- Language: Python
- Homepage: https://docs.lavenderdata.com/
- Size: 11 MB
- Stars: 14
- Watchers: 6
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Load & evolve datasets efficiently
Please visit our docs for more information.
docs.lavenderdata.com
## Quick Start
### Installation
```bash
pip install lavender-data
```
#### Start the server
```bash
lavender-data server start --init
```
```
lavender-data is running on 0.0.0.0:8000
UI is running on http://localhost:3000
API key created: la-...
```
Save the API key to use it in the next steps.
```bash
export LAVENDER_API_URL=http://0.0.0.0:8000
export LAVENDER_API_KEY=la-...
```
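Before moving on, it can help to confirm that both variables are visible to the Python process. The following is a minimal sketch using only the standard library. The helper name `read_lavender_env` is hypothetical and not part of lavender-data; it only checks the two variables exported above.

```python
import os
from urllib.parse import urlparse


def read_lavender_env(environ=None):
    """Return (api_url, api_key) from the environment, or raise ValueError.

    Hypothetical helper for illustration; not part of lavender-data.
    """
    env = os.environ if environ is None else environ
    api_url = env.get("LAVENDER_API_URL", "").strip()
    api_key = env.get("LAVENDER_API_KEY", "").strip()

    if not api_url:
        raise ValueError("LAVENDER_API_URL is not set")
    if not api_key:
        raise ValueError("LAVENDER_API_KEY is not set")

    # Catch a common slip: exporting "host:port" without the scheme.
    parsed = urlparse(api_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"LAVENDER_API_URL is not an http(s) URL: {api_url!r}")

    return api_url, api_key
```

Calling `read_lavender_env()` with no arguments reads from `os.environ`; passing a dict makes the check easy to try without touching the real environment.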
### Create an example dataset
```bash
lavender-data client \
  datasets create \
  --name my_dataset \
  --uid-column-name id \
  --shardset-location https://docs.lavenderdata.com/example-dataset/images/
```
### Iterate over the dataset
```python
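import lavender_data.client as lavender

# Initialize the client (the API URL and key were exported above).
lavender.init()

iteration = lavender.LavenderDataLoader(
    dataset_name="my_dataset",
    shuffle=True,
    shuffle_block_size=10,
)

# Each item is one sample; print its unique ID column.
for i in iteration:
    print(i["id"])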
```
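When trying out a new dataset, it is often enough to look at the first few samples rather than run the whole iteration. Here is a minimal sketch that assumes only what the loop above shows: the loader is iterable and each sample supports `sample["id"]`. The name `first_ids` is a hypothetical helper, and the list of dicts below stands in for a real `LavenderDataLoader`.

```python
from itertools import islice


def first_ids(samples, n=5, key="id"):
    """Collect `key` from the first `n` items of any iterable of mappings."""
    return [sample[key] for sample in islice(samples, n)]


# Stand-in data for illustration; with a running server you would pass
# the `iteration` object from the example above instead.
fake_samples = [{"id": i} for i in range(100)]
print(first_ids(fake_samples, n=3))  # prints [0, 1, 2]
```

Because `islice` stops pulling items after `n` samples, the helper does not consume the rest of the iterable.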