https://github.com/ai-ahmed/ppcax
Probabilistic PCA and PKPCA for Stochastic Feature Extraction and Missing Data Reconstruction
bayesian chex distrax finance jax pkpca ppca probabilistic probabilistic-models python quantitative-finance stochastic
Last synced: 12 days ago
- Host: GitHub
- URL: https://github.com/ai-ahmed/ppcax
- Owner: AI-Ahmed
- License: apache-2.0
- Created: 2024-04-08T21:00:52.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-11-03T15:26:36.000Z (about 2 months ago)
- Last Synced: 2024-11-03T15:32:29.968Z (about 2 months ago)
- Topics: bayesian, chex, distrax, finance, jax, pkpca, ppca, probabilistic, probabilistic-models, python, quantitative-finance, stochastic
- Language: Python
- Homepage:
- Size: 107 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# PPCAx: Probabilistic PCA with JAX
## Overview
This repository implements a Probabilistic Principal Component Analysis (PPCA) model using the JAX library. The model is a robust feature extraction and dimensionality reduction technique for high-dimensional, sparse multivariate data.
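For readers new to PPCA, the following is a minimal NumPy sketch of the closed-form maximum-likelihood solution described by Tipping and Bishop (1999). It is an illustration only and not the `ppcax` implementation: the function names `fit_ppca_closed_form` and `posterior_mean` are hypothetical, and the synthetic data is made up for the demonstration.

```python
import numpy as np


def fit_ppca_closed_form(X, q):
    """Closed-form maximum-likelihood PPCA (illustrative, not the ppcax API)."""
    mu = X.mean(axis=0)
    Xc = X - mu
    S = Xc.T @ Xc / X.shape[0]                 # sample covariance, shape (d, d)
    eigvals, eigvecs = np.linalg.eigh(S)       # eigenvalues in ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    sigma2 = eigvals[q:].mean()                # noise variance: mean of discarded eigenvalues
    # Loading matrix W = U_q (Lambda_q - sigma2 I)^(1/2), up to a rotation
    W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return W, mu, sigma2


def posterior_mean(X, W, mu, sigma2):
    """Latent features E[z | x] = M^{-1} W^T (x - mu), with M = W^T W + sigma2 I."""
    M = W.T @ W + sigma2 * np.eye(W.shape[1])
    return np.linalg.solve(M, W.T @ (X - mu).T).T


# Synthetic data: 2 latent factors observed through 10 noisy features
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(500, 10))

W, mu, sigma2 = fit_ppca_closed_form(X, q=2)
Z_hat = posterior_mean(X, W, mu, sigma2)
print("Latent shape:", Z_hat.shape, "estimated noise variance:", sigma2)
```

The estimated noise variance should land near the true value of 0.01 used to generate the data, which is a quick sanity check on the fit.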
PPCA is a probabilistic approach to Principal Component Analysis (PCA), which allows for imputing missing values and estimating latent features in the data. By leveraging the power of JAX, this implementation ensures efficient and scalable computation, making it suitable for large-scale financial datasets.
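As a rough illustration of how a PPCA model can impute missing values, the sketch below alternates a closed-form PPCA fit with reconstruction of the missing entries. This is a simple heuristic written in NumPy under our own assumptions, not the EM procedure used by `ppcax`; the function name `ppca_impute` and the synthetic data are hypothetical.

```python
import numpy as np


def ppca_impute(X, q, n_iter=50):
    """Fill NaNs by alternating a PPCA fit with reconstruction (illustrative heuristic)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column-mean imputation
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        Xc = filled - mu
        S = Xc.T @ Xc / len(filled)
        eigvals, eigvecs = np.linalg.eigh(S)
        eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
        sigma2 = eigvals[q:].mean()
        W = eigvecs[:, :q] * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
        M = W.T @ W + sigma2 * np.eye(q)
        Z = np.linalg.solve(M, W.T @ Xc.T).T     # posterior mean of the latents
        recon = Z @ W.T + mu                     # reconstruction in data space
        filled = np.where(missing, recon, X)     # overwrite only the missing entries
    return filled


# Synthetic low-rank data with about 15% of the entries removed
rng = np.random.default_rng(0)
X_true = rng.normal(size=(300, 2)) @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(300, 8))
mask = rng.random(X_true.shape) < 0.15
X_obs = np.where(mask, np.nan, X_true)

X_filled = ppca_impute(X_obs, q=2)
rmse_ppca = np.sqrt(np.mean((X_filled[mask] - X_true[mask]) ** 2))
col_means = np.nanmean(X_obs, axis=0)
rmse_mean = np.sqrt(np.mean((np.where(mask, col_means, X_true)[mask] - X_true[mask]) ** 2))
print("RMSE (PPCA):", rmse_ppca, "RMSE (column mean):", rmse_mean)
```

On data with strong low-rank structure like this, the reconstruction-based fill should beat plain column-mean imputation, because it uses the correlations between features.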
The methodology used in this project was initially proposed in our research manuscript titled *"Probabilistic PCA in High Dimensions: Stochastic Dimensionality Reduction on Sparse Multivariate Assets' Bars at High-Risk Regimes"*. This work presents a novel approach for analyzing portfolio behavior during periods of high market turbulence and risk by:
1. Using information-driven bar techniques to synchronize and sample imbalanced sequence volumes.
2. Applying an event-based sampling technique, the CUSUM filtering method, to create strategic trading plans based on volatility.
3. Employing an improved version of the Gaussian Linear System called PPCA for feature extraction from the latent space.

Our findings suggest that PPCA is highly effective in estimating sparse data and forecasting the effects of individual assets within a portfolio under varying market conditions. This repository contains the core implementation of the PPCA model, demonstrating its capability to establish significant relationships among correlated assets during high-risk regimes.
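The event-based filtering step in the list above is commonly implemented as a symmetric CUSUM filter, which flags an event whenever the cumulative up-move or down-move since the last event exceeds a threshold. The sketch below is a generic pure-Python version under that assumption; the function name `cusum_filter`, the threshold, and the toy price series are hypothetical and are not taken from this repository or the manuscript.

```python
def cusum_filter(values, threshold):
    """Return indices where the cumulative move since the last reset exceeds the threshold."""
    events = []
    s_pos, s_neg = 0.0, 0.0
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        s_pos = max(0.0, s_pos + diff)   # running sum of upward moves
        s_neg = min(0.0, s_neg + diff)   # running sum of downward moves
        if s_neg < -threshold:
            s_neg = 0.0                  # reset after a downward event
            events.append(i)
        elif s_pos > threshold:
            s_pos = 0.0                  # reset after an upward event
            events.append(i)
    return events


prices = [100.0, 100.5, 101.2, 101.0, 99.5, 98.9, 99.0, 100.8]
events = cusum_filter(prices, threshold=1.0)
print(events)  # [2, 4, 7]
```

In practice the threshold is often tied to a volatility estimate rather than fixed, so that the filter samples more often in turbulent periods.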
## Directory Structure
```bash
.
├── LICENSE
├── README.md
├── config
├── data
│   ├── bars
│   ├── metadata
│   ├── sample
│   │   ├── r1
│   │   └── r2
│   └── tickers
├── models
├── notebooks
├── pyproject.toml
├── reports
│   ├── docs
│   ├── eval
│   ├── figures
│   └── train
├── src
│   ├── __init__.py
│   ├── eval
│   ├── ft_eng
│   ├── ppcax
│   │   ├── __init__.py
│   │   └── _ppcax.py
│   ├── preprocessing
│   └── utils
└── tests
    ├── __init__.py
    ├── gen_data.py
    └── test_ppcax.py

25 directories, 17 files
```

## Installation and Setup Instructions
### Prerequisites
- **Python**: Ensure you have Python **3.10** or newer installed on your system.
### Installation Steps
1. **Clone the Repository**
```shell
git clone https://github.com/AI-Ahmed/ppcax.git
cd ppcax
```

2. **Install Flit**
If you don't already have Flit installed, install it using `pip`:
```shell
pip install flit
```

3. **Install the Package and Dependencies**
Install the package along with its dependencies using Flit:
```shell
flit install --deps develop
```

This command installs the `ppcax` package along with all required dependencies, including development and testing tools like `pytest` and `flake8`.
### Alternative: Install Directly from GitHub
If you prefer to install the package directly from GitHub without cloning the repository:
```shell
pip install git+https://github.com/AI-Ahmed/ppcax
```

This command installs the latest version of `ppcax` from the main branch.
### Importing the Package
After installation, you can import the PPCA model in your Python code:
```python
from ppcax import PPCA
```

## Running Tests
To run the unit tests and ensure everything is working correctly:
1. **Navigate to the Project Directory**
If you haven't already, navigate to the project's root directory:
```shell
cd ppcax
```

2. **Run Tests Using pytest**
```shell
pytest tests/test_ppcax.py
```

## Usage Example
Here's a simple example of how to use the `PPCA` class:
```python
import numpy as np
from ppcax import PPCA

# Generate some sample data
data = np.random.rand(100, 1000)

# Create a PPCA model instance
ppca_model = PPCA(q=150)

# Fit the model to the data
ppca_model.fit(data, use_em=True)

# Transform the data to the lower-dimensional space
transformed_data = ppca_model.transform(lower_dim_only=True)

print("Transformed Data Shape:", transformed_data.shape)
```

## License
This project is licensed under the [Apache License 2.0](LICENSE), which is a permissive open-source license that grants users extensive rights to use, modify, and distribute the software. See the [LICENSE](LICENSE) file for more details.
## Cite Our Work
If you find this work useful in your research, please consider citing:
```bibtex
@article{Atwa2024,
author = {Ahmed Atwa and Ahmed Sedky and Mohamed Kholief},
title = {Probabilistic PCA in High Dimensions: Stochastic Dimensionality Reduction on Sparse Multivariate Assets' Bars at High-Risk Regimes},
journal = {SSRN Electronic Journal},
year = {2024},
note = {Available at SSRN: \url{https://ssrn.com/abstract=4874874} or \url{http://dx.doi.org/10.2139/ssrn.4874874}}
}
```

---
## Development Setup
If you're planning to contribute to the project or modify the code, follow these steps to set up your development environment:
1. **Clone the Repository**
```shell
git clone https://github.com/AI-Ahmed/ppcax.git
cd ppcax
```

2. **Create a Virtual Environment**
It's recommended to use a virtual environment to manage dependencies:
```shell
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
```

3. **Install Flit**
```shell
pip install flit
```

4. **Install the Package in Editable Mode**
- For development and testing, install the package with the `test` extras:
```shell
flit install --deps develop --extras test --symlink
```

The `--symlink` option installs the package in editable mode, so changes to the code are immediately reflected without reinstallation.
5. **Install Pre-commit Hooks (Optional)**
If you use `pre-commit` for code formatting and linting:
```shell
pip install pre-commit
pre-commit install
```

6. **Run Tests**
```shell
pytest tests/test_ppcax.py
```

## Contributing
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
---
## Contact
For any questions or inquiries, please contact [Ahmed Nabil Atwa](mailto:[email protected]).
---
## Changelog
Refer to the [CHANGELOG](CHANGELOG.md) for details on updates and changes to the project.
---
## Publishing to PyPI (Maintainers Only)
To publish a new version of the package to PyPI:
1. **Update the Version Number**
Increment the version number in `pyproject.toml`.
2. **Build the Package**
```shell
flit build
```

3. **Publish to PyPI**
```shell
flit publish
```

---
## Links
- **Documentation**: [GitHub package documentation](https://github.com/AI-Ahmed/ppcax/blob/main/README.md)
- **Issue Tracker**: [GitHub Issues](https://github.com/AI-Ahmed/ppcax/issues)
- **Source Code**: [GitHub Repository](https://github.com/AI-Ahmed/ppcax)