https://github.com/zeiss/things_eeg2_dataset

🧠⚡A CLI to access and process the THINGS-EEG2 dataset by Gifford et al. (2022)

cli eeg electroencephalography mne neuroscience


Host: GitHub
URL: https://github.com/zeiss/things_eeg2_dataset
Owner: ZEISS
License: other
Created: 2025-11-26T07:38:43.000Z (7 months ago)
Default Branch: main
Last Pushed: 2026-03-07T15:02:10.000Z (4 months ago)
Last Synced: 2026-03-07T17:44:03.632Z (4 months ago)
Topics: cli, eeg, electroencephalography, mne, neuroscience
Language: Python
Homepage:
Size: 14.2 MB
Stars: 8
Watchers: 0
Forks: 0
Open Issues: 8
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: .github/CODEOWNERS

[![PyPI][pypi-badge]][pypi]

[![Conda Platform][conda-badge]][conda-url]

[![License][license-badge]][license-url]

[![CI Status][ci-badge]][ci-url]

[pypi-badge]: https://img.shields.io/pypi/v/things_eeg2_dataset?style=flat-square&label=PyPI

[pypi]: https://pypi.org/project/things-eeg2-dataset/

[license-badge]: https://img.shields.io/badge/License-CC%20BY--NC%204.0-yellow.svg?style=flat-square

[license-url]: LICENSE

[ci-badge]: https://img.shields.io/github/actions/workflow/status/zeiss/things_eeg2_dataset/ci.yml?branch=main&style=flat-square&label=CI

[ci-url]: https://github.com/zeiss/things_eeg2_dataset/actions/workflows/ci.yml

[conda-badge]: https://img.shields.io/conda/vn/conda-forge/things_eeg2_dataset?style=flat-square

[conda-url]: https://prefix.dev/channels/conda-forge/packages/things_eeg2_dataset



# Introduction

This package provides tools for downloading and preprocessing the raw THINGS-EEG2 data, and for generating image embeddings with several vision models.

> [!WARNING]
> This repository builds upon the original data processing by [Gifford et al. (2022)](https://github.com/gifale95/eeg_encoding).
> Please check out their original code and the [corresponding paper](https://www.sciencedirect.com/science/article/pii/S1053811922008758?via%3Dihub).
>
> We are in no way associated with the authors.
> Nonetheless, we hope this makes things easier (pun intended) to use.

## Installation

### CLI-only

If you only need the CLI, you can run it with a single command, without installing the package into a project:

#### Using the PyPI package (with uv)

```bash
# uvx is shorthand for `uv tool run`, so no extra `run` subcommand is needed
uvx --from things_eeg2_dataset things-eeg2
```

#### Using the conda package (with pixi)

```bash
pixi exec --with things_eeg2_dataset things-eeg2
```

### From GitHub

```bash
git clone git@github.com:ZEISS/things_eeg2_dataset.git
cd things_eeg2_dataset
uv sync
uv pip install --editable .
source .venv/bin/activate

things-eeg2 --help
things-eeg2 --install-completion

# Then restart your shell
# Example for zsh:
source ~/.zshrc
```

### From PyPI

```bash
# Using uv
uv init
uv add things_eeg2_dataset
source .venv/bin/activate

things-eeg2 --help
things-eeg2 --install-completion

# Then restart your shell
# Example for zsh:
source ~/.zshrc
```

### Using the conda package

```bash
# Using pixi
pixi init
pixi add things_eeg2_dataset
pixi shell

things-eeg2 --help
things-eeg2 --install-completion

# Then restart your shell
# Example for zsh:
source ~/.zshrc
```

## Usage

![things_eeg2_dataset demo](https://raw.githubusercontent.com/ZEISS/things_eeg2_dataset/refs/heads/main/.github/assets/demo/demo-light.gif#gh-light-mode-only)

![things_eeg2_dataset demo](https://raw.githubusercontent.com/ZEISS/things_eeg2_dataset/refs/heads/main/.github/assets/demo/demo-dark.gif#gh-dark-mode-only)

## Data Structure

The directory layout created by the CLI is defined in [paths.py](src/things_eeg2_dataset/paths.py).

This file is the ground truth for the data structure used throughout the project.

### Embedding Generation (`embedding_processing/`)

The package can generate image embeddings with the following vision models:

| Model | Embedder Class | Description |
|-------|----------------|-------------|
| `open-clip-vit-h-14` | `OpenClipViTH14Embedder` | OpenCLIP ViT-H/14 (SDXL image encoder) |
| `openai-clip-vit-l-14` | `OpenAIClipVitL14Embedder` | OpenAI CLIP ViT-L/14 |
| `dinov2` | `DinoV2Embedder` | DINOv2 with registers (self-supervised) |
| `ip-adapter` | `IPAdapterEmbedder` | IP-Adapter Plus projections |

Each embedder generates:

- **Pooled embeddings**: Single vector per image (e.g., `(1024,)` for ViT-H-14)

- **Full sequence embeddings**: All tokens (e.g., `(257, 1280)` for ViT-H-14)

- **Text embeddings**: Corresponding text features from image captions
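
The example shapes above can be sanity-checked with a little arithmetic. The sketch below assumes a 224×224 input image and a single class token, which is consistent with the `(257, 1280)` shape quoted for ViT-H-14 but is not stated explicitly in this README. The helper name `vit_token_count` is made up for illustration and is not part of the package.

```python
def vit_token_count(image_size: int, patch_size: int, extra_tokens: int = 1) -> int:
    """Number of tokens a ViT emits: one per image patch plus extra tokens (e.g. CLS)."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side**2 + extra_tokens


# ViT-H/14 with an assumed 224x224 input: 16 x 16 patches + 1 class token.
print(vit_token_count(224, 14))  # 257
```

The second dimension of the full sequence (1280) is the transformer width, while the pooled vector (1024) is the projected embedding, which is why the two sizes differ.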

**Output Files:**

```text
embeddings/
├── ViT-H-14_features_training.safetensors           # Pooled embeddings
├── ViT-H-14_features_training_full.safetensors      # Full token sequences
├── ViT-H-14_features_test.safetensors
└── ViT-H-14_features_test_full.safetensors
```
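
To check what one of these files contains without loading the tensors, you can read the safetensors header directly. The sketch below uses only the standard library and relies on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header). The tensor names stored inside the files are not documented in this README, so the example simply lists whatever it finds; the path is a placeholder following the naming above.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return {tensor_name: (dtype, shape)} without loading any tensor data."""
    with Path(path).open("rb") as f:
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: (info["dtype"], tuple(info["shape"]))
        for name, info in header.items()
        if name != "__metadata__"
    }


if __name__ == "__main__":
    # Placeholder path; adjust to wherever the CLI wrote your embeddings.
    path = Path("embeddings/ViT-H-14_features_training.safetensors")
    if path.exists():
        for name, (dtype, shape) in read_safetensors_header(path).items():
            print(name, dtype, shape)
```

To load the actual tensors, the `safetensors` library provides `load_file` in both `safetensors.torch` and `safetensors.numpy`.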

### Using the dataloader

```python
from things_eeg2_dataset.dataloader import ThingsEEGDataset

dataset = ThingsEEGDataset(
    image_model="ViT-H-14",
    data_path="/path/to/processed_data",
    img_directory_training="/path/to/images/train",
    img_directory_test="/path/to/images/test",
    embeddings_dir="/path/to/embeddings",
    train=True,
    time_window=(0.0, 1.0),
)
```

See `things_eeg2_dataloader/README.md` for detailed usage.
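
As a rough illustration of what a `time_window` argument typically means for epoched EEG, the sketch below converts a window in seconds into a slice over the sample axis. This is not the package's implementation: the helper `window_to_slice`, the sampling rate, and the epoch start time are all hypothetical values chosen for the example, and the README does not say how `ThingsEEGDataset` interprets the window.

```python
def window_to_slice(time_window, sfreq, epoch_tmin):
    """Map a (start, stop) window in seconds to a slice of sample indices.

    sfreq is the sampling rate in Hz and epoch_tmin is the time (in seconds)
    of the first sample in each epoch. Both are assumptions of this sketch.
    """
    start_s, stop_s = time_window
    if stop_s <= start_s:
        raise ValueError("time_window must satisfy start < stop")
    start = round((start_s - epoch_tmin) * sfreq)
    stop = round((stop_s - epoch_tmin) * sfreq)
    if start < 0:
        raise ValueError("time_window starts before the epoch does")
    return slice(start, stop)


# Hypothetical numbers: 250 Hz data whose epochs begin 0.2 s before stimulus onset.
print(window_to_slice((0.0, 1.0), sfreq=250.0, epoch_tmin=-0.2))
```

With an array of shape `(channels, samples)`, the returned slice would be applied to the last axis, for example `eeg[:, window]`.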

## References & Citation

We are happy users of the [THINGS-EEG2 dataset](https://things-initiative.org/), but not associated with the original authors.

If you use this code, please cite the [THINGS-EEG2 paper](https://www.sciencedirect.com/science/article/pii/S1053811922008758?via%3Dihub):

> Gifford, A. T., Dwivedi, K., Roig, G., & Cichy, R. M. (2022). A large and rich EEG dataset for modeling human visual object recognition. NeuroImage, 264, 119754.
