# *Nonlinear Fore(Back)casting and Innovation Filtering for Causal-Noncausal VAR Models* Implementation
**Repository:** https://github.com/chirindaopensource/non_linear_forecasting_backcasting
**Owner:** Craig Chirinda (Open Source Projects), 2025
This repository contains an **independent**, professional-grade Python implementation of the research methodology from the 2025 paper entitled **"Nonlinear Fore(Back)casting and Innovation Filtering for Causal-Noncausal VAR Models"** by:
* Christian Gourieroux
* Joann Jasiak
The project provides a complete, computationally tractable system for the quantitative analysis of dynamic systems prone to speculative bubbles and other forms of locally explosive behavior. It enables robust, state-dependent risk assessment, probabilistic forecasting, and structural "what-if" scenario analysis that accounts for both nonlinear dynamics and model estimation uncertainty.
## Table of Contents
- [Introduction](#introduction)
- [Theoretical Background](#theoretical-background)
- [Features](#features)
- [Methodology Implemented](#methodology-implemented)
- [Core Components (Notebook Structure)](#core-components-notebook-structure)
- [Key Callable: run_full_research_pipeline](#key-callable-run_full_research_pipeline)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Input Data Structure](#input-data-structure)
- [Usage](#usage)
- [Output Structure](#output-structure)
- [Project Structure](#project-structure)
- [Customization](#customization)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)
- [Acknowledgments](#acknowledgments)
## Introduction
This project provides a Python implementation of the advanced econometric framework presented in Gourieroux and Jasiak (2025). The core of this repository is the Jupyter notebook `non_linear_forecasting_backcasting_draft.ipynb`, which contains a comprehensive suite of functions to estimate, forecast, and analyze mixed causal-noncausal Vector Autoregressive (VAR) models.
Standard linear VAR models are purely causal and assume Gaussian errors, making them ill-suited for capturing the dynamics of financial and economic time series that exhibit bubbles, sudden crashes, or other forms of explosive behavior. The mixed causal-noncausal framework addresses this by allowing for roots of the VAR characteristic polynomial to lie both inside and outside the unit circle, generating a strictly stationary process with highly nonlinear, state-dependent dynamics.
This codebase enables researchers, quantitative analysts, and macroeconomists to:
- Rigorously estimate mixed VAR models using the semi-parametric Generalized Covariance (GCov) method.
- Generate full, non-Gaussian predictive densities for probabilistic forecasting.
- Quantify estimation uncertainty using a novel backward-bootstrap procedure to create confidence sets for prediction intervals.
- Filter the underlying nonlinear, structural innovations of the system.
- Conduct state-dependent Impulse Response Function (IRF) analysis to understand how the system responds to shocks in "on-bubble" versus "off-bubble" states.
## Theoretical Background
The methodology implemented in this project is a direct translation of the unified framework presented in the source paper. It leverages the state-space representation of a VAR(p) process to separate its dynamics into stable (causal) and unstable (non-causal) components.
### 1. The Mixed Causal-Noncausal VAR Model
The model is defined by the standard VAR(p) equation, but with a critical difference in its assumptions:
$Y_t = \Phi_1 Y_{t-1} + \dots + \Phi_p Y_{t-p} + \epsilon_t$
The roots of the characteristic polynomial $\det(I - \Phi_1 \lambda - \dots - \Phi_p \lambda^p) = 0$ may lie on either side of the unit circle, so the companion matrix has eigenvalues both inside the unit circle (stable, causal directions) and outside it (explosive, noncausal directions). The errors $\epsilon_t$ are assumed to be i.i.d. and non-Gaussian.
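As a concrete illustration of this mix of stable and explosive directions, the minimal sketch below builds a VAR(p) companion matrix from arbitrary coefficient matrices and inspects its eigenvalue moduli; the coefficients are illustrative, not estimates from the paper or the notebook.
```python
# Build a VAR(p) companion matrix and inspect eigenvalue moduli (toy coefficients).
import numpy as np

def companion_matrix(phis):
    """Stack [Phi_1, ..., Phi_p] (each n x n) into the (n*p x n*p) companion form."""
    n, p = phis[0].shape[0], len(phis)
    top = np.hstack(phis)
    bottom = np.hstack([np.eye(n * (p - 1)), np.zeros((n * (p - 1), n))])
    return np.vstack([top, bottom])

phi1 = np.array([[1.1, 0.0],
                 [0.2, 0.5]])
phi2 = np.array([[-0.1, 0.0],
                 [0.0, 0.1]])
psi = companion_matrix([phi1, phi2])
moduli = np.abs(np.linalg.eigvals(psi))
print(np.round(np.sort(moduli), 3))   # moduli above 1 flag noncausal (explosive) directions
```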
### 2. State-Space Decomposition and Predictive Density
The VAR(p) process is transformed into a VAR(1) in state-space form using the companion matrix $\Psi$. A **Jordan Decomposition** ($\Psi = A J A^{-1}$) separates the system into latent causal ($Z_1$) and noncausal ($Z_2$) states. This separation is the key to the paper's central theoretical result: a closed-form expression for the one-step-ahead predictive density, given in **Equation 3.1**:
$l(y | Y_T) = \frac{l_2(A^2 \tilde{y}_{T+1})}{l_2(A^2 \tilde{Y}_T)} |\det J_2| g(y - \sum \Phi_i Y_{T-i+1})$
- $g$ is the density of the error $\epsilon_t$.
- $l_2$ is the stationary density of the noncausal state $Z_2$.
This density is nonlinear and state-dependent, allowing it to capture complex dynamics.
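To convey the mechanics of Equation 3.1, here is a minimal scalar sketch that approximates the density ratio with kernel density estimates; it omits the Jacobian term $|\det J_2|$ and the full state mapping, and all numbers (the coefficient, the proxy samples) are synthetic assumptions rather than outputs of the notebook's `compute_predictive_density`.
```python
# Scalar toy version of the predictive-density weighting in Eq. 3.1.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
l2_hat = gaussian_kde(rng.standard_t(df=4, size=2000))   # proxy for the Z_2 stationary density
g_hat = gaussian_kde(rng.standard_t(df=4, size=2000))    # proxy for the error density g

phi, a2, y_T, z2_T = 0.8, 0.9, 1.5, 1.2                  # illustrative scalars, not estimates

grid = np.linspace(-5.0, 5.0, 401)                       # candidate values for Y_{T+1}
dens = (l2_hat(a2 * grid) / l2_hat(z2_T)) * g_hat(grid - phi * y_T)
print(f"modal point forecast on the grid: {grid[np.argmax(dens)]:.2f}")
```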
### 3. Uncertainty Quantification via Backward Bootstrap
To account for estimation uncertainty, the framework uses a novel "backward bootstrap" procedure. Since the model is Markovian in both forward and reverse time, one can generate synthetic data paths by **backcasting** from the terminal observation `Y_T`. By re-estimating the model on many such paths, the sampling distribution of the prediction interval is obtained, which is then used to construct a robust **Confidence Set for the Prediction Interval (CSPI)**, as defined in **Equation 4.10**.
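The skeleton below sketches this bootstrap loop under stated assumptions: `backcast_path` and `fit_and_interval` are hypothetical placeholders for the notebook's backcasting simulator and re-estimation step, and the closing quantile bands are one simple way to summarize the bootstrap distribution of interval endpoints, not necessarily the exact CSPI construction of Equation 4.10.
```python
# Skeleton of the backward-bootstrap loop; the two callables are placeholders.
from typing import Callable, Tuple
import numpy as np

def backward_bootstrap_bands(
    y_terminal: np.ndarray,
    backcast_path: Callable[[np.ndarray, int], np.ndarray],        # draws one synthetic history ending at Y_T
    fit_and_interval: Callable[[np.ndarray], Tuple[float, float]],  # re-estimates and returns (lower, upper)
    T: int,
    S: int = 200,
    coverage: float = 0.90,
) -> Tuple[Tuple[float, float], Tuple[float, float]]:
    """Summarize the bootstrap distribution of prediction-interval endpoints."""
    lowers, uppers = np.empty(S), np.empty(S)
    for s in range(S):
        path = backcast_path(y_terminal, T)        # backcast a synthetic path from the terminal state
        lowers[s], uppers[s] = fit_and_interval(path)
    alpha = 1.0 - coverage
    lower_band = (float(np.quantile(lowers, alpha / 2)), float(np.quantile(lowers, 1 - alpha / 2)))
    upper_band = (float(np.quantile(uppers, alpha / 2)), float(np.quantile(uppers, 1 - alpha / 2)))
    return lower_band, upper_band
```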
### 4. Nonlinear Innovation Filtering and State-Dependent IRFs
Standard VAR shocks are not meaningful in this context. The paper defines true, past-independent structural innovations $v_t$ via the **Probability Integral Transform (PIT)**. This involves estimating the conditional CDF of the latent states and transforming it to a standard normal distribution.
**Equation 5.5:** $v_{2,t} = \Phi^{-1}[F_2(Z_{2,t}|Z_{t-1})]$
Simulating the model forward using the inverse of this transformation allows for the computation of **state-dependent Impulse Response Functions (IRFs)**, which show how the system's response to a shock $\delta$ changes depending on its initial state (e.g., during a bubble).
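A minimal sketch of the PIT step, assuming the conditional CDF $F_2$ is approximated by a Nadaraya-Watson estimator on a toy scalar state; the function and variable names are illustrative and do not mirror the notebook's `filter_nonlinear_innovations` API.
```python
# Toy PIT filtering of a single noncausal innovation (Eq. 5.5 in spirit).
import numpy as np
from scipy.stats import norm

def nw_conditional_cdf(z_value, z_lag_value, z2_sample, z_lag_sample, bandwidth=0.5):
    """Nadaraya-Watson estimate of P(Z_{2,t} <= z_value | Z_{t-1} = z_lag_value)."""
    weights = np.exp(-0.5 * ((z_lag_sample - z_lag_value) / bandwidth) ** 2)
    indicator = (z2_sample <= z_value).astype(float)
    return float(np.sum(weights * indicator) / np.sum(weights))

rng = np.random.default_rng(0)
z2 = rng.standard_t(df=4, size=500)     # toy sample of the noncausal state Z_2
z_lag = np.roll(z2, 1)                  # toy scalar stand-in for the lagged state vector

# Probability integral transform followed by the standard normal quantile
u = nw_conditional_cdf(z2[-1], z_lag[-1], z2[:-1], z_lag[:-1])
u = np.clip(u, 1e-6, 1 - 1e-6)          # guard against 0/1 before the quantile map
v2_T = norm.ppf(u)
print(f"filtered innovation v_2,T = {v2_T:.3f}")
```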
## Features
The `non_linear_forecasting_backcasting_draft.ipynb` notebook implements the full research pipeline:
- **Robust Data Pipeline:** Validation, cleaning, and preparation of time series data.
- **Advanced Estimator:** A complete implementation of the semi-parametric GCov estimator for VAR parameters.
- **Probabilistic Forecasting:** Functions to compute the full predictive density, point forecasts (mode), and prediction intervals.
- **Advanced Uncertainty Quantification:** A parallelized implementation of the backward bootstrap with SIR sampling to generate confidence sets.
- **Structural Analysis:** Functions to filter nonlinear innovations and simulate state-dependent IRFs.
- **Model Validation:** A full simulation study framework to assess the finite-sample properties of the pipeline.
- **Sensitivity Analysis:** Tools to conduct robustness checks on key model parameters.
- **Integrated Visualization:** A dedicated class for generating all key publication-quality plots.
## Methodology Implemented
The codebase is a direct, one-to-one implementation of the paper's methodology:
1. **Data Preparation (Tasks 1-2):** Ingests and prepares data as per the paper's empirical application.
2. **Estimation (Tasks 3-5):** Implements the GCov estimator (a simplified sketch of the idea follows this list), Jordan decomposition, and non-parametric density estimation.
3. **Forecasting (Tasks 6-8):** Implements the predictive density formula and extracts point and interval forecasts.
4. **Uncertainty (Task 9):** Implements the full backward bootstrap with SIR sampling to compute confidence sets.
5. **Structural Analysis (Tasks 10-11):** Implements the Nadaraya-Watson estimator for innovation filtering and the inverse for IRF simulation.
6. **Validation & Orchestration (Tasks 12-17):** Provides high-level orchestrators for empirical analysis, simulation studies, robustness checks, and comparative analysis.
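As referenced in the estimation step above, the following simplified sketch conveys the GCov idea on a toy bivariate VAR(1): choose the coefficients that remove serial dependence in nonlinear transforms (powers) of the residuals. The unweighted objective and the synthetic data are simplifying assumptions; the notebook's `estimate_gcov_var` may standardize and weight the moment conditions differently.
```python
# Simplified GCov-style objective: penalize lag-1..H autocovariances of
# nonlinear transforms (levels and squares) of the VAR(1) residuals.
import numpy as np
from scipy.optimize import minimize

def gcov_objective(phi_flat: np.ndarray, Y: np.ndarray, H: int = 2) -> float:
    n = Y.shape[1]
    Phi = phi_flat.reshape(n, n)
    eps = Y[1:] - Y[:-1] @ Phi.T                  # VAR(1) residuals
    W = np.column_stack([eps, eps ** 2])          # nonlinear transforms of the residuals
    W = W - W.mean(axis=0)
    obj = 0.0
    for h in range(1, H + 1):
        gamma_h = W[h:].T @ W[:-h] / len(W)       # lag-h sample autocovariance matrix
        obj += np.sum(gamma_h ** 2)               # unweighted here; the paper standardizes these terms
    return obj

# Toy data from a purely causal VAR(1) with Student-t errors (illustrative only)
rng = np.random.default_rng(7)
T, Phi_true = 400, np.array([[0.5, 0.1], [0.0, 0.3]])
Y = np.zeros((T, 2))
for t in range(1, T):
    Y[t] = Phi_true @ Y[t - 1] + rng.standard_t(df=5, size=2)

res = minimize(gcov_objective, x0=np.zeros(4), args=(Y,), method="Nelder-Mead")
print(np.round(res.x.reshape(2, 2), 2))           # one minimiser of the toy objective
```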
## Core Components (Notebook Structure)
The `non_linear_forecasting_backcasting_draft.ipynb` notebook is structured as a series of modular, professional-grade functions, each corresponding to a specific task in the pipeline. Key functions include:
- **`validate_and_cleanse_data`**: The initial data quality gate.
- **`prepare_var_data`**: Transforms data to be stationary and demeaned.
- **`estimate_gcov_var`**: The core GCov estimation engine.
- **`compute_jordan_decomposition`**: Separates causal/non-causal dynamics.
- **`estimate_functional_components`**: Fits the non-parametric KDEs.
- **`compute_predictive_density`**: The engine for probabilistic forecasting.
- **`compute_point_forecast` & `compute_prediction_interval`**: Extract point and interval forecast products.
- **`compute_bootstrap_confidence_set`**: The advanced uncertainty quantification engine.
- **`filter_nonlinear_innovations`**: Extracts structural shocks.
- **`simulate_irf`**: Simulates state-dependent IRFs.
- **`run_empirical_analysis`**: Orchestrates a full analysis of a single dataset.
- **`run_full_research_pipeline`**: The single, top-level entry point to the entire project.
## Key Callable: run_full_research_pipeline
The central function in this project is `run_full_research_pipeline`. It orchestrates the entire analytical workflow from raw data to a final, comprehensive results dictionary.
```python
def run_full_research_pipeline(
    raw_df: pd.DataFrame,
    study_params: Dict[str, Any]
) -> Dict[str, Any]:
    """
    Executes a complete, end-to-end research pipeline for the mixed
    causal-noncausal VAR model.

    ... (full docstring is in the notebook)
    """
    # ... (implementation is in the notebook)
```
## Prerequisites
- Python 3.9+
- Core dependencies: `pandas`, `numpy`, `scipy`, `matplotlib`, `seaborn`, `statsmodels`, `joblib`.
## Installation
1. **Clone the repository:**
```sh
git clone https://github.com/chirindaopensource/non_linear_forecasting_backcasting.git
cd non_linear_forecasting_backcasting
```
2. **Create and activate a virtual environment (recommended):**
```sh
python -m venv venv
source venv/bin/activate # On Windows, use `venv\Scripts\activate`
```
3. **Install Python dependencies from `requirements.txt`:**
```sh
pip install -r requirements.txt
```
## Input Data Structure
The primary input is a `pandas.DataFrame` with a monthly `DatetimeIndex` and two columns: `"real_oil_price"` and `"real_gdp"`.
**Example:**
```
real_gdp real_oil_price
2019-04-30 18958.789123 63.870000
2019-05-31 19002.256789 60.210000
2019-06-30 19045.724455 57.430000
... ... ...
```
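If you need a quick placeholder input while wiring up the pipeline, the following sketch builds a DataFrame with the expected index frequency and column names; the values are synthetic and carry no economic meaning.
```python
# Synthetic placeholder input: correct index frequency and column names only.
import numpy as np
import pandas as pd

idx = pd.date_range(start="1986-01-31", end="2019-06-30", freq="M")
rng = np.random.default_rng(42)
raw_data_df = pd.DataFrame(
    {
        "real_oil_price": 40 + np.cumsum(rng.normal(0.0, 1.5, len(idx))),
        "real_gdp": 8000 * np.exp(np.linspace(0.0, 0.85, len(idx))) + rng.normal(0.0, 25.0, len(idx)),
    },
    index=idx,
)
print(raw_data_df.tail(3))
```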
## Usage
The entire pipeline is executed through the `run_full_research_pipeline` function. The user must provide the raw data and a comprehensive `study_params` dictionary that controls which analyses are run.
```python
import pandas as pd
import numpy as np

# 1. Load your data
# raw_data_df = pd.read_csv(...)
# For this example, synthetic data can be used (see the Input Data Structure section).
date_rng = pd.date_range(start='1986-01-01', end='2019-06-30', freq='M')
# ... (data generation code) ...
# raw_data_df = pd.DataFrame(...)

# 2. Define your configurations (see the notebook for the full example)
study_params = {
    "run_empirical": {"enabled": True},    # ... remaining options elided ...
    "run_simulation": {"enabled": False},  # ... remaining options elided ...
    # ... other sections ...
}

# 3. Run the master pipeline
# from non_linear_forecasting_backcasting_draft import run_full_research_pipeline
# final_results = run_full_research_pipeline(
#     raw_df=raw_data_df,
#     study_params=study_params
# )

# 4. Instantiate the visualizer and plot results
# from non_linear_forecasting_backcasting_draft import ModelVisualizer
# visualizer = ModelVisualizer(final_results['empirical_analysis'])
# visualizer.plot_diagnostics()
# visualizer.plot_irf(irf_date=pd.Timestamp('2008-06-30'))
```
## Output Structure
The `run_full_research_pipeline` function returns a deeply nested dictionary containing all data artifacts. Top-level keys include (a minimal access sketch follows the list):
- `pipeline_configuration`: A copy of the input `study_params`.
- `empirical_analysis`: Results from the core analysis on the provided data.
- `simulation_study`: A DataFrame summarizing the Monte Carlo results.
- `robustness_checks`: DataFrames detailing the sensitivity analysis.
- `comparative_analysis`: A dictionary with forecast and metric DataFrames from the horse race.
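As noted above, here is a minimal access sketch using the documented top-level keys; the nested key `prediction_interval` and the values shown are illustrative assumptions, so consult the notebook's docstrings for the actual nested schema.
```python
# Hypothetical nested key ('prediction_interval') and values for illustration only.
final_results = {
    "pipeline_configuration": {"run_empirical": {"enabled": True}},
    "empirical_analysis": {"prediction_interval": (55.2, 78.9)},
    "simulation_study": None,
    "robustness_checks": None,
    "comparative_analysis": None,
}

if final_results["pipeline_configuration"]["run_empirical"]["enabled"]:
    lower, upper = final_results["empirical_analysis"]["prediction_interval"]
    print(f"1-step-ahead prediction interval: [{lower:.1f}, {upper:.1f}]")
```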
## Project Structure
```
non_linear_forecasting_backcasting/
│
├── non_linear_forecasting_backcasting_draft.ipynb # Main implementation notebook
├── requirements.txt # Python package dependencies
├── LICENSE # MIT license file
└── README.md # This documentation file
```
## Customization
The pipeline is highly customizable via the `study_params` dictionary (an illustrative configuration sketch follows this list). Users can easily modify:
- The control flags (`run_empirical`, etc.) to enable or disable parts of the analysis.
- The VAR lag order `p_lags`.
- The GCov moment specifications `H_moment_lags` and `error_powers`.
- All simulation parameters (`S_bootstrap_replications`, `n_baseline_sims`, etc.).
- The specific dates for targeted forecasting and IRF analysis.
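As referenced above, here is an illustrative configuration sketch assembled only from parameter names mentioned in this README; any other key names, nesting, and default values shown are assumptions, and the notebook's docstrings remain the authoritative schema.
```python
# Illustrative study_params layout. Keys beyond those quoted in this README
# (control flags, p_lags, H_moment_lags, error_powers, S_bootstrap_replications,
# n_baseline_sims) are assumptions; see the notebook for the authoritative schema.
study_params = {
    "run_empirical": {
        "enabled": True,
        "p_lags": 2,                        # VAR lag order
        "H_moment_lags": 2,                 # GCov moment lags
        "error_powers": [1, 2],             # GCov nonlinear transforms of the errors
        "S_bootstrap_replications": 500,    # backward-bootstrap draws
        "forecast_dates": ["2019-06-30"],   # hypothetical key: targeted forecast dates
        "irf_dates": ["2008-06-30"],        # hypothetical key: state-dependent IRF dates
    },
    "run_simulation": {
        "enabled": False,
        "n_baseline_sims": 200,             # Monte Carlo replications
    },
    "run_robustness": {"enabled": False},   # hypothetical flag name
    "run_comparative": {"enabled": False},  # hypothetical flag name
}
```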
## Contributing
Contributions are welcome. Please fork the repository, create a feature branch, and submit a pull request with a clear description of your changes. Adherence to PEP 8, type hinting, and comprehensive docstrings is required.
## License
This project is licensed under the MIT License. See the `LICENSE` file for details.
## Citation
If you use this code or the methodology in your research, please cite the original paper:
```bibtex
@article{gourieroux2022nonlinear,
title={Nonlinear Fore(Back)casting and Innovation Filtering for Causal-Noncausal VAR Models},
author={Gourieroux, Christian and Jasiak, Joann},
journal={arXiv preprint arXiv:2205.09922},
year={2022}
}
```
For the implementation itself, you may cite this repository:
```
Chirinda, C. (2025). A Python Implementation of the Gourieroux-Jasiak (2025) Framework for Causal-Noncausal VAR Models.
GitHub repository: https://github.com/chirindaopensource/non_linear_forecasting_backcasting
```
## Acknowledgments
- Credit to Christian Gourieroux and Joann Jasiak for their foundational theoretical and empirical work.
- Thanks to the developers of the `pandas`, `numpy`, `scipy`, `matplotlib`, `statsmodels`, and `joblib` libraries, which provide the essential toolkit for this implementation.
---
This README was generated based on the structure and content of non_linear_forecasting_backcasting_draft.ipynb and follows best practices for research software documentation.