https://github.com/farukalpay/psd

Perturbed Saddle-escape Descent (PSD): a first-order optimizer that escapes strict saddle points in nonconvex problems.
first-order-methods gradient-descent machine-learning nonconvex-optimization numerical-methods optimization python saddle-points

Host: GitHub
URL: https://github.com/farukalpay/psd
Owner: farukalpay
License: mit
Created: 2025-08-22T13:30:43.000Z (11 months ago)
Default Branch: main
Last Pushed: 2025-08-22T15:28:00.000Z (11 months ago)
Last Synced: 2025-08-22T15:37:53.678Z (11 months ago)
Topics: first-order-methods, gradient-descent, machine-learning, nonconvex-optimization, numerical-methods, optimization, python, saddle-points
Language: Python
Homepage:
Size: 25.4 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Perturbed Saddle-escape Descent (PSD)

[![CI](https://github.com/farukalpay/PSD/actions/workflows/ci.yml/badge.svg)](https://github.com/farukalpay/PSD/actions/workflows/ci.yml)

[![Coverage](https://img.shields.io/badge/coverage-90%25-brightgreen)](./)

[![Docs](https://img.shields.io/badge/docs-latest-blue)](./docs/index.md)

[![Python](https://img.shields.io/badge/python-3.8%2B-blue)](https://www.python.org/)

## Project Summary

This repository implements the **Perturbed Saddle-escape Descent (PSD)**
algorithm for escaping saddle points in non-convex optimisation problems, as described in [Alpay and Alakkad (2025)](https://arxiv.org/abs/2508.16540).
It contains reference NumPy implementations, framework-specific optimisers
for PyTorch and TensorFlow, and utilities for reproducing the synthetic
experiments reported in the accompanying manuscript.
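
The sketch below illustrates the general perturbed-gradient-descent idea on a toy two-dimensional saddle. It is a minimal illustration under stated assumptions, not the PSD algorithm from the paper: the function `perturbed_gd`, its parameters (`g_thresh`, `radius`, `escape_steps`, `f_thresh`) and the toy objective are hypothetical, and the curvature-calibrated constants analysed in the manuscript are not reproduced.

```python
import numpy as np

# Toy objective (an assumption for illustration): strict saddle at (0, 0),
# minimisers at (0, +1) and (0, -1) with value -0.25.
def f(z):
    x, y = z
    return 0.5 * x**2 - 0.5 * y**2 + 0.25 * y**4

def grad(z):
    x, y = z
    return np.array([x, -y + y**3])

def perturbed_gd(f, grad, x0, lr=0.1, g_thresh=1e-3, radius=1e-2,
                 escape_steps=100, f_thresh=1e-4, max_iter=10_000, seed=0):
    """Generic perturbed gradient descent (hypothetical helper, not the repo API)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    t = 0
    while t < max_iter:
        g = grad(x)
        if np.linalg.norm(g) > g_thresh:
            # Ordinary gradient step while the gradient is informative.
            x = x - lr * g
            t += 1
            continue
        # Near-stationary point: kick the iterate in a random direction
        # and run a short escape phase of plain gradient steps.
        x_before = x.copy()
        d = rng.normal(size=x.shape)
        x = x + radius * d / np.linalg.norm(d)
        for _ in range(escape_steps):
            x = x - lr * grad(x)
        t += escape_steps
        # No meaningful decrease: treat the pre-perturbation point as a minimum.
        if f(x_before) - f(x) < f_thresh:
            return x_before
    return x

# Plain gradient descent started on the x-axis for comparison.
x_gd = np.array([1.0, 0.0])
for _ in range(500):
    x_gd = x_gd - 0.1 * grad(x_gd)

x_pgd = perturbed_gd(f, grad, [1.0, 0.0])
print("plain GD:    ", x_gd, f(x_gd))
print("perturbed GD:", x_pgd, f(x_pgd))
```

Started on the x-axis, plain gradient descent never leaves y = 0 and stalls at the saddle, whereas the perturbed variant should end near one of the two minimisers (0, ±1).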

## Features

* Reference implementations of PSD, PSD-Probe and baseline gradient descent
  variants in pure NumPy.
* Suite of analytic test functions with gradients and Hessians.
* Synthetic data generator producing the tables and figures used in the
  paper (`experiments.py`).
* Framework-specific optimisers: `PSDTorch`, `PSDTensorFlow` and a
  `PSDOptimizer`/`PerturbedAdam` package for PyTorch.
* Example training scripts for MNIST and CIFAR-10.
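
As a rough picture of what an analytic test function with a gradient and Hessian can look like, here is a small self-contained sketch. The `AnalyticFunction` container, the `QUARTIC_EXAMPLE` instance and its formula are assumptions made for illustration; they are not the definitions in `psd.functions`, which may differ.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

@dataclass(frozen=True)
class AnalyticFunction:
    """Hypothetical container bundling a function with its derivatives."""
    name: str
    f: Callable[[np.ndarray], float]
    grad: Callable[[np.ndarray], np.ndarray]
    hess: Callable[[np.ndarray], np.ndarray]

# Assumed example: f(x) = sum(x_i^4 / 4 - x_i^2 / 2), separable in each coordinate.
QUARTIC_EXAMPLE = AnalyticFunction(
    name="quartic_example",
    f=lambda x: float(np.sum(0.25 * x**4 - 0.5 * x**2)),
    grad=lambda x: x**3 - x,
    hess=lambda x: np.diag(3.0 * x**2 - 1.0),
)

def finite_diff_grad(f, x, h=1e-6):
    """Central finite-difference gradient, used only to sanity-check `grad`."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x = np.array([0.3, -1.2, 0.7])
grad_ok = np.allclose(QUARTIC_EXAMPLE.grad(x),
                      finite_diff_grad(QUARTIC_EXAMPLE.f, x), atol=1e-6)

# Smallest Hessian eigenvalue at the origin; a negative value means
# there is a direction of negative curvature at that stationary point.
lam_min = np.linalg.eigvalsh(QUARTIC_EXAMPLE.hess(np.zeros(3))).min()
print(grad_ok, lam_min)
```

Shipping the Hessian alongside the gradient makes it cheap to inspect the curvature at a stationary point, which is the property a saddle-escaping method relies on.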

## Technology Stack

The core project depends on the following libraries:

| Library | Purpose |
| ------- | ------- |
| `numpy` | numerical routines for reference implementations |
| `torch`, `torchvision` | deep-learning framework and datasets |
| `optuna` | hyper-parameter search utilities |
| `matplotlib` | visualisation in notebooks |

Python 3.8 or later is required.

## Installation

Install the published optimiser package:

```bash
pip install psd-optimizer
```

Or install the repository in editable mode for development:

```bash
git clone https://github.com/farukalpay/PSD.git
cd PSD
pip install -e ".[dev]"
```

## Quick Start

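```python
import numpy as np
from psd import algorithms, functions

# Starting point for the optimisation.
x0 = np.array([1.0, -1.0])

# Run the baseline gradient descent using the test function's analytic gradient.
x_star, _ = algorithms.gradient_descent(x0, functions.SEPARABLE_QUARTIC.grad)
```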

Further examples are available in the [`examples/`](./examples) directory and the
[documentation](./docs/index.md).

## Usage

### Using the Reference Algorithms

The core PSD routines and test functions can be imported from the
``psd`` package:

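```python
import numpy as np
from psd import algorithms, functions

# Starting point for the optimisation.
x0 = np.array([1.0, -1.0])

# Run the baseline gradient descent using the test function's analytic gradient.
x_star, _ = algorithms.gradient_descent(x0, functions.SEPARABLE_QUARTIC.grad)
```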

This structure allows you to experiment with the reference NumPy
implementations directly in your projects.

The PyTorch optimisers ``PSDOptimizer`` and ``PerturbedAdam`` are also
available directly via ``from psd import ...``.

### All-in-One "Monster" Interface

For rapid experimentation without navigating submodules, import the aggregated
``psd.monster`` module. It re-exports the core algorithms, analytic test
functions and framework-specific optimisers in a single namespace:

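```python
import numpy as np
from psd import monster

# Starting point for the optimisation.
x0 = np.array([1.0, -1.0])

# Same call as in the Quick Start, but through the single aggregated namespace.
x_star, _ = monster.gradient_descent(x0, monster.SEPARABLE_QUARTIC.grad)
```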

This unified view aims to be approachable for both humans and language models
exploring the project.

### Generating Synthetic Data

```bash
python experiments.py
```

The command writes CSV summaries to `results/` and training curves to
`data/`.

## Performance

Profiling identified `rosenbrock_hess`, which computes the Rosenbrock
Hessian, as a hot path. Vectorising the computation removed the explicit
Python loops and cut the mean run time by roughly 3.5× at dimension 1000,
with peak memory essentially unchanged:

| Version | Mean time (ms) | Peak memory (MB) |
|---------|----------------|------------------|
| Before  | 3.52           | 8.00             |
| After   | 1.01           | 8.04             |

Benchmarking is automated via `pytest-benchmark` using a fixed NumPy seed.
Hard time and memory thresholds guard against major regressions.
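
To show the kind of change vectorisation involves, the sketch below builds the Hessian of the standard Rosenbrock function, f(x) = Σ 100 (x[i+1] - x[i]²)² + (1 - x[i])², once with an explicit loop and once with NumPy slicing. The names `rosenbrock_hess_loop` and `rosenbrock_hess_vec` are hypothetical; the repository's `rosenbrock_hess` may be organised differently.

```python
import numpy as np

def rosenbrock_hess_loop(x):
    """Loop-based Hessian of the Rosenbrock function (illustrative baseline)."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n - 1):
        # Second derivatives of the i-th summand with respect to x[i], x[i+1].
        H[i, i] += 1200.0 * x[i] ** 2 - 400.0 * x[i + 1] + 2.0
        H[i, i + 1] = H[i + 1, i] = -400.0 * x[i]
        H[i + 1, i + 1] += 200.0
    return H

def rosenbrock_hess_vec(x):
    """Same Hessian assembled from whole-array operations, no Python loop."""
    n = x.size
    diag = np.zeros(n)
    diag[:-1] = 1200.0 * x[:-1] ** 2 - 400.0 * x[1:] + 2.0
    diag[1:] += 200.0
    off = -400.0 * x[:-1]
    # The Hessian is tridiagonal: main diagonal plus one band on each side.
    return np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)

rng = np.random.default_rng(0)  # fixed seed, as in the benchmark setup
x = rng.normal(size=50)
same = np.allclose(rosenbrock_hess_loop(x), rosenbrock_hess_vec(x))
print(same)
```

Both versions allocate the same dense n × n matrix, which is consistent with peak memory staying flat while the run time drops.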

### Training with the PyTorch Optimiser

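```python
from psd_optimizer import PSDOptimizer

# `model`, `criterion`, `x` and `y` are placeholders for your own
# torch.nn.Module, loss function, input batch and target batch.
model = ...
opt = PSDOptimizer(model.parameters(), lr=1e-3)

def closure():
    # Recompute the loss and gradients; passed to step() so the
    # optimiser can re-evaluate the objective when it needs to.
    opt.zero_grad()
    output = model(x)
    loss = criterion(output, y)
    loss.backward()
    return loss

opt.step(closure)
```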

Example scripts using this API are available in the `notebooks/`
directory.

### Training a Small Language Model

An illustrative example for fine-tuning a compact transformer with
``PSDOptimizer`` is provided in ``scripts/train_small_language_model.py``.
The script downloads a tiny GPT-style model from the Hugging Face Hub and
optimises it on a short dummy corpus.

Run the example with default settings:

```bash
python scripts/train_small_language_model.py
```

Specify a different pretrained model and number of epochs:

```bash
python scripts/train_small_language_model.py --model distilgpt2 --epochs 5
```

## Documentation

Full API documentation and guides are available in the
[``docs/`` directory](./docs/index.md).

Additional materials include:

* `notebooks/10_minute_start.ipynb` – an interactive notebook showcasing the optimiser.
* `docs/section_1_5_extension.md` – theoretical notes on extending PSD to stochastic settings.
* `notebooks/navigation.ipynb` – links to all example notebooks including `advanced_usage.ipynb`.

## Testing

After installing the repository in editable mode, run the test suite to
verify that everything works:

```bash
pytest
```

The current suite is small but helps prevent regressions.

## Repository Structure

```
psd/              # Reference implementations and framework-specific optimisers
    algorithms.py # PSD and baseline algorithms
    functions.py  # Analytic test functions and registry
psd_optimizer/    # PyTorch optimiser package
experiments.py    # Synthetic data generation
```

## Contributing

Contributions are welcome! Please open an issue or pull request on GitHub
and see `CONTRIBUTING.md` for guidelines. By participating you agree to
abide by the `CODE_OF_CONDUCT.md`.

## Citation

If you use PSD in your research, please cite the following:

```bibtex
@misc{alpay2025escapingsaddlepointscurvaturecalibrated,
      title={Escaping Saddle Points via Curvature-Calibrated Perturbations: A Complete Analysis with Explicit Constants and Empirical Validation},
      author={Faruk Alpay and Hamdi Alakkad},
      year={2025},
      eprint={2508.16540},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2508.16540},
}
```

## License

This project is released under the MIT License.  See `LICENSE` for details.
