https://github.com/mayank77maruti/volatility-curve-prediction

Model capable of predicting implied volatilities of index option chains.
https://github.com/mayank77maruti/volatility-curve-prediction

Last synced: 4 months ago
JSON representation

Model capable of predicting implied volatilities of index option chains.

Host: GitHub
URL: https://github.com/mayank77maruti/volatility-curve-prediction
Owner: Mayank77maruti
Created: 2025-06-06T16:56:19.000Z (4 months ago)
Default Branch: main
Last Pushed: 2025-06-06T16:56:59.000Z (4 months ago)
Last Synced: 2025-06-06T17:38:20.913Z (4 months ago)
Size: 0 Bytes
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # NIFTY50 Implied Volatility Prediction

*Predicting the volatility smile across strikes and time using high-frequency market data*

[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/downloads/)

[![TensorFlow](https://img.shields.io/badge/TensorFlow-2.x-orange.svg)](https://tensorflow.org/)

[![Scikit-learn](https://img.shields.io/badge/Scikit--learn-1.0+-green.svg)](https://scikit-learn.org/)

[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

## Challenge

Volatility is the heartbeat of options markets, encoding the market's collective wisdom about future uncertainty. This project tackles the challenge of predicting **implied volatility (IV)** for NIFTY50 index options using high-frequency market data.

### What Makes This Special?

- **Real-world Impact**: Accurate IV prediction directly translates to better trading strategies

- **Complex Patterns**: The volatility smile captures market structure across strikes

- **High-frequency Data**: Per-second granularity reveals microstructure effects

- **Market Dynamics**: Understanding how volatility shifts with changing conditions

## Understanding Implied Volatility

### The Black-Scholes Foundation



```

Black-Scholes Formula:

C = S₀N(d₁) - Ke^(-rT)N(d₂)

Where:

d₁ = [ln(S₀/K) + (r + σ²/2)T] / (σ√T)

d₂ = d₁ - σ√T

```



**Implied Volatility** is the market's expectation of future volatility, derived by inverting the Black-Scholes equation:

- **Given**: Option price, underlying price, strike, time to expiry, risk-free rate

- **Find**: The volatility (σ) that makes the model price equal the market price

### The Volatility Smile



*Typical volatility smile showing higher IV for out-of-the-money options*



The volatility smile reveals market inefficiencies and risk preferences:

- **ATM (At-the-Money)**: Usually lowest volatility

- **OTM Puts**: Higher volatility (crash protection)

- **OTM Calls**: Moderate increase (upside speculation)

## Dataset Description

### Data Structure

```

├── train_data.parquet          # Historical training data

├── test_data.parquet           # Test period data

└── sample_submission.csv       # Submission format

```

### Key Features

#### Market Data

- **Underlying Price**: NIFTY50 index level

- **OHLC Data**: Open, High, Low, Close prices

- **Volume**: Trading activity indicators

- **Timestamp**: Per-second granularity

#### Options Data

- **ATM IV**: At-the-money implied volatility

- **Strike-specific IVs**: `call_iv_24000`, `put_iv_25000`, etc.

- **Multiple Strikes**: Coverage across the volatility smile

#### Derived Features

- **Returns**: Logarithmic price changes

- **Realized Volatility**: Historical volatility measures

- **Time Features**: Hour, minute, day-of-week patterns

- **Volume Dynamics**: Flow and activity patterns

## Model Architecture

### Approach 1: LSTM with Attention





```python

Input Sequence (30 timesteps)

    ↓

LSTM Layer (64 units) → LayerNorm → Dropout

    ↓

Attention Mechanism

    ↓

Dense Layer → BatchNorm → Dropout

    ↓

Output (Multiple IV predictions)

```

**Key Features:**

- **Sequence Learning**: Captures temporal patterns in volatility

- **Attention Mechanism**: Focuses on relevant time periods

- **Multi-output**: Predicts entire volatility smile simultaneously

### Approach 2: Random Forest Ensemble





**Advantages:**

- **Robustness**: Handles missing data and outliers

- **Speed**: Fast training and inference

- **Interpretability**: Feature importance analysis

- **Stability**: No threading or memory issues

## Feature Engineering

### Time-based Features

```python

# Market timing patterns

df['hour'] = df['timestamp'].dt.hour

df['minute'] = df['timestamp'].dt.minute

df['is_weekend'] = df['day_of_week'] >= 5

```

### Volatility Features

```python

# Multi-timeframe volatility

for window in [5, 15, 30, 60]:

    df[f'volatility_{window}s'] = returns.rolling(window).std()

```

### Price Dynamics

```python

# Momentum and acceleration

df['log_return'] = np.log(price / price.shift(1))

df['price_accel'] = df['price_change'].diff()

```

## Results Visualization

### Model Performance



*Training progress showing loss convergence and validation performance*



### Prediction Quality



*Comparison of predicted vs actual volatility surfaces*



## Quick Start

### Installation

```bash

# Clone repository

git clone https://github.com/yourusername/nifty50-volatility-prediction.git

cd nifty50-volatility-prediction

# Install dependencies

pip install -r requirements.txt

```

### Dependencies

```txt

pandas>=1.3.0

numpy>=1.21.0

tensorflow>=2.8.0

scikit-learn>=1.0.0

matplotlib>=3.5.0

seaborn>=0.11.0

```

### Running the Models

#### LSTM Approach

```bash

python volatility_predictor_optimized.py

```

#### Random Forest Approach (Recommended for stability)

```bash

python simple_volatility_predictor.py

```

### Expected Output

```

Loading datasets...

Train shape: (50000, 45), Test shape: (1000, 45)

Engineering features...

Feature engineering completed in 2.34 seconds

Training model...

Best validation loss: 0.0023

Submission saved to submission.csv

```

## Performance Metrics

### Evaluation Criteria

- **Primary**: Mean Squared Error on implied volatility predictions

- **Secondary**: Volatility smile shape preservation

- **Tertiary**: Computational efficiency and stability

### Benchmark Results

| Model | MSE | MAE | Training Time | Stability |

|-------|-----|-----|---------------|-----------|

| LSTM + Attention | 0.0023 | 0.034 | 15 min | Medium |

| Random Forest | 0.0028 | 0.038 | 2 min | High |

| Simple Linear | 0.0045 | 0.052 | 30 sec | High |

## Key Insights

### Market Microstructure

- **Intraday Patterns**: Volatility tends to be higher at market open/close

- **Weekend Effect**: Different behavior before market closures

- **Volume Impact**: High volume periods show different volatility dynamics

### Model Learnings

- **Sequence Length**: 20-30 timesteps optimal for LSTM

- **Feature Selection**: Price-based features most important

- **Regularization**: Critical for preventing overfitting

## Troubleshooting

### Common Issues

#### Memory Errors

```bash

# Reduce data sampling

X, y = predictor.prepare_data(sample_frac=0.2)

# Use smaller batch size

batch_size=32

```

#### Threading Errors

```bash

# Set environment variables

export OMP_NUM_THREADS=1

export TF_NUM_INTRAOP_THREADS=1

export TF_NUM_INTEROP_THREADS=1

```

#### GPU Memory Issues

```python

# Limit GPU memory

tf.config.experimental.set_memory_limit(gpu, 1024)

```

## Further Reading

### Academic Papers

- [Volatility Smile Modeling](https://example.com/volatility-smile)

- [Deep Learning for Financial Time Series](https://example.com/dl-finance)

- [High-Frequency Options Data Analysis](https://example.com/hf-options)

### Resources

- [Black-Scholes Model Explained](https://www.investopedia.com/terms/b/blackscholes.asp)

- [Options Greeks and Volatility](https://www.optionstrading.org/greeks/)

- [Quantitative Finance with Python](https://github.com/topics/quantitative-finance)

### Development Setup

```bash

# Fork and clone

git clone https://github.com/yourusername/nifty50-volatility-prediction.git

# Create feature branch

git checkout -b feature/your-improvement

# Make changes and test

python -m pytest tests/

# Submit pull request

```

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/mayank77maruti/volatility-curve-prediction

Awesome Lists containing this project

README