https://github.com/codesignal/learn_simulation-transformers

# Travel Through Transformers

An interactive web-based simulation that lets learners follow a single token step-by-step through every component of a Transformer encoder/decoder stack.

## Features

- **Component-focused visualization**: Click through different transformer components to see detailed internals
- **Interactive parameters**: Adjust layers, model dimensions, attention heads, and sequence length in real-time
- **Dual visualization modes**: Abstract shape view for understanding flow, or detailed numerical values
- **Multi-head attention visualization**: See how different attention heads process information
- **Event logging**: All interactions are logged for analytics

## Quick Start

### Prerequisites

- Node.js 16+ and npm
- Python 3.7+

### Installation

1. **Install dependencies**:
```bash
npm install
```

2. **Build the application**:
```bash
npm run build
```

3. **Start the server**:
```bash
python server/server.py
```

4. **Open your browser**:
Navigate to `http://localhost:3000`

### Development Mode

For development with hot reloading:

```bash
# Terminal 1 - Start the development server
npm run dev

# Terminal 2 - Start the logging server
python server/server.py
```

Then open `http://localhost:5173` (development) or `http://localhost:3000` (production).

## Usage

### Controls

1. **Component Selection**: Click on transformer components in the diagram to explore their internals
2. **Show Values Toggle**: Switch between abstract block view and actual numerical matrices
3. **Model Parameters**:
   - Adjust the number of layers (1-6)
   - Change the model dimension (32-512)
   - Set attention heads (1-8; must evenly divide the model dimension)
   - Modify the sequence length (3-10)
   - Choose the positional encoding type
   - Enable dropout visualization
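The parameter ranges and the divisibility constraint above can be expressed as a small validation helper. This is an illustrative Python sketch, not the simulation's actual code; the function name and error messages are invented:

```python
def validate_params(num_layers: int, d_model: int, num_heads: int, seq_len: int) -> None:
    """Validate simulation parameters against the documented ranges (sketch)."""
    if not 1 <= num_layers <= 6:
        raise ValueError("layers must be between 1 and 6")
    if not 32 <= d_model <= 512:
        raise ValueError("model dimension must be between 32 and 512")
    if not 1 <= num_heads <= 8:
        raise ValueError("attention heads must be between 1 and 8")
    if d_model % num_heads != 0:
        # Each head gets d_model / num_heads dimensions, so this must divide evenly
        raise ValueError("attention heads must evenly divide the model dimension")
    if not 3 <= seq_len <= 10:
        raise ValueError("sequence length must be between 3 and 10")

validate_params(num_layers=2, d_model=64, num_heads=8, seq_len=5)  # OK: 64 / 8 = 8
```

For example, 8 heads with a model dimension of 64 is valid (each head works in 8 dimensions), while 5 heads with 64 is rejected.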

### Understanding the Visualization

#### Abstract Mode (Default)
- Colored blocks represent matrices with dimensions shown
- Different colors indicate different types of operations
- Active components highlighted with distinctive colors
- Attention heads shown in different colors

#### Values Mode
- Heat maps show actual numerical values
- Color intensity represents magnitude
- Hover for precise values
- Active components highlighted
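The magnitude-to-intensity mapping behind the heat maps can be sketched as a linear normalization. This is a hedged illustration, not the repo's D3 rendering code:

```python
def heat_intensity(value: float, vmin: float, vmax: float) -> float:
    """Map a matrix value to a 0-1 color intensity, heat-map style (sketch)."""
    if vmax == vmin:
        return 0.0
    # Clamp out-of-range values, then normalize linearly into [0, 1]
    clamped = max(vmin, min(vmax, value))
    return (clamped - vmin) / (vmax - vmin)
```

A value halfway between the matrix minimum and maximum would render at half intensity.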

## Architecture

```
travel-through-transformers/
├── src/
│   ├── components/                      # React UI components
│   │   ├── MatrixVisualization.tsx      # D3-powered matrix visualization
│   │   ├── TokenVisualization.tsx       # Token display and interaction
│   │   ├── SettingsMenu.tsx             # Parameter controls
│   │   ├── ComponentDetailsPanel.tsx    # Component detail exploration
│   │   ├── HelpMenu.tsx                 # Help system
│   │   └── TransformerDiagram.tsx       # Main architecture diagram
│   ├── hooks/                           # Custom React hooks
│   │   ├── useTransformerMachine.ts     # Main state management (XState)
│   │   ├── useTransformerDiagram.ts     # Diagram interaction logic
│   │   └── useEventLogger.ts            # Analytics logging
│   ├── utils/                           # Utility functions
│   │   ├── math.ts                      # Matrix operations
│   │   ├── randomWeights.ts             # Seeded random generation
│   │   ├── constants.ts                 # Configuration and steps
│   │   ├── data.ts                      # Sample data generation
│   │   ├── componentDataGenerator.ts    # Component data creation
│   │   └── componentTransformations.ts  # Math transformations
│   ├── state/                           # State management
│   │   └── transformerMachine.ts        # XState machine definition
│   └── types/                           # TypeScript definitions
│       └── events.d.ts                  # Event and parameter types
├── server/
│   └── server.py                        # Python logging server
└── logs/                                # Event logs (generated)
```

## Educational Goals

This simulation helps learners understand:

1. **Component Architecture**: How transformer components are organized and connected
2. **Attention Mechanism**: How queries, keys, and values interact
3. **Multi-Head Attention**: How different heads capture different patterns
4. **Residual Connections**: How information flows around attention blocks
5. **Layer Normalization**: How activations are normalized
6. **Feed-Forward Networks**: How information is processed after attention
7. **Positional Encoding**: How position information is added to tokens
8. **Cross-Attention**: How decoder attends to encoder representations
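The core of points 2 and 3 is scaled dot-product attention: each query is compared against every key, the scores are softmax-normalized, and the result weights a sum of the values. A minimal pure-Python sketch (illustrative only; the simulation's own math lives in `math.ts`):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention on small list-of-lists matrices."""
    d_k = len(K[0])  # key dimension, used for the 1/sqrt(d_k) scaling
    out = []
    for q in Q:
        # Similarity of this query with every key, scaled down by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output row = attention-weighted average of the value vectors
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out
```

Because the weights sum to 1, each output row is a convex combination of the value rows, which is why attention outputs always stay inside the range spanned by the values.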

## Technical Details

- **Frontend**: React + TypeScript + Vite
- **State Management**: XState for complex state transitions
- **Visualization**: D3.js for interactive SVG graphics
- **Styling**: TailwindCSS with CodeSignal brand colors
- **Math**: Custom lightweight tensor operations (no external ML libraries)
- **Backend**: Simple Python HTTP server for logging
- **Data**: Seeded random weights for reproducible results
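Seeded weight generation can be sketched in a few lines. This Python version uses the stdlib `random.Random` purely for illustration; the repo's actual implementation is `randomWeights.ts`:

```python
import random

def seeded_weights(rows, cols, seed=42):
    """Generate a reproducible rows x cols weight matrix from a fixed seed."""
    rng = random.Random(seed)  # dedicated generator, so global state is untouched
    # Small uniform weights in [-0.5, 0.5); the same seed always yields the same matrix
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
```

Pinning the seed is what makes every session of the simulation show identical matrices, so learners can compare notes against the same numbers.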

## Customization

### Adding New Components

1. Add component definition to transformer machine states
2. Implement component logic in `componentDataGenerator.ts`
3. Add appropriate visualizations in component files
4. Update component transformations in `componentTransformations.ts`

### Modifying Visualization

- Matrix colors: Edit `COLORS` in `constants.ts`
- D3 rendering: Modify `MatrixVisualization.tsx`
- Component descriptions: Update transformer machine configuration

### Analytics

Event logs are stored in `logs/simulation_log.jsonl` with schema:
```json
{
  "timestamp": 1625239200,
  "event_type": "param_change" | "component_select" | "toggle" | "zoom_change",
  "payload": { /* event-specific data */ }
}
```
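Since the log is JSONL (one JSON object per line), it can be consumed with a few lines of Python. A sketch, assuming only the field names in the schema above; the sample line is invented:

```python
import json

def parse_log(text):
    """Parse JSONL event-log text into a list of event dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical log line following the documented schema
sample = '{"timestamp": 1625239200, "event_type": "param_change", "payload": {"num_layers": 3}}\n'
events = parse_log(sample)
param_changes = [e for e in events if e["event_type"] == "param_change"]
```

In practice you would read the text from `logs/simulation_log.jsonl` and filter by `event_type` the same way.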

## Browser Support

- Chrome 90+
- Firefox 88+
- Safari 14+
- Edge 90+

## Performance

Optimized for:
- 60 FPS animations
- Sequence length ≤ 10
- Attention heads ≤ 8
- Model dimension ≤ 512

## License

MIT License - see [LICENSE](LICENSE) for details.

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

## Troubleshooting

### Common Issues

**"Cannot find module" errors**: Run `npm install`

**Server won't start**: Check that port 3000 is available, or specify a different port: `python server/server.py 3001`

**Visualization not updating**: Try refreshing the page or clearing browser cache

**Performance issues**: Reduce model parameters (fewer layers, smaller dimensions)

### Debug Mode

Enable debug logging:
```bash
DEBUG=1 python server/server.py
```

## Educational Extensions

Future enhancements could include:

- Real model weights from Hugging Face
- Attention pattern analysis
- Interactive quizzes between steps
- Comparison with other architectures
- Custom text input
- Export/import configurations