https://github.com/erwallace/neural-optimiser
Batched optimisation algorithms for neural network potential driven molecular dynamics.
https://github.com/erwallace/neural-optimiser
molecular-dynamics molecular-optimization neural-network-potentials torch-geometric
Last synced: 3 months ago
JSON representation
Batched optimisation algorithms for neural network potential driven molecular dynamics.
- Host: GitHub
- URL: https://github.com/erwallace/neural-optimiser
- Owner: erwallace
- License: apache-2.0
- Created: 2025-10-04T13:21:25.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-11-27T10:35:26.000Z (6 months ago)
- Last Synced: 2026-02-03T08:48:36.714Z (4 months ago)
- Topics: molecular-dynamics, molecular-optimization, neural-network-potentials, torch-geometric
- Language: Python
- Homepage:
- Size: 5.5 MB
- Stars: 17
- Watchers: 0
- Forks: 3
- Open Issues: 5
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# neural-optimiser
Batched optimisation algorithms for neural network potential–driven molecular structure relaxation on top of PyTorch Geometric.
### Key features
- Batched per-conformer BFGS with per-atom max-step control.
- Early exit on convergence (`fmax`), explosion (`fexit`), or step cap (`steps`).
- Trajectory collection per step (`batch.pos_dt`, `batch.forces_dt`, `batch.energies_dt`) and converged properties (`batch.pos`, `batch.energies`, `batch.forces`).
- IO methods for RDkit molecules and ASE atoms objects.
## Installation
Pre-requisities: Python 3.11, PyTorch and PyTorch Geometric compatible with your environment
```bash
# For example
uv pip install torch==2.8.0 -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
uv pip install torch-geometric==2.7.0 torch-cluster -f https://data.pyg.org/whl/torch-2.8.0+cu128.html
```
### Install from PyPi
```bash
pip install neural-optimiser
```
### Install from source (uv)
```bash
uv sync
```
or create a virtual environment and install the packages manually:
```bash
uv venv .venv --python 3.11
source .venv/bin/activate
# [install torch and torch-geometric as above]
uv pip install -e .
```
Optional dev tools:
```bash
uv pip install -e ".[dev]"
uv run pre-commit install
```
## Quick Start
All of the code below and more is available in the [tutorial](notebooks/tutorial.ipynb).
### Run a Simple Batched BFGS Optimisation
This example uses `neural_optimiser.optimise._bfgs.BFGS`, and a dummy calculator, `neural_optimiser.calculators._rand.RandomCalculator`.
```python
from ase.build import molecule
from neural_optimiser.optimisers import BFGS
from neural_optimiser.calculators import RandomCalculator
from neural_optimiser.conformers import ConformerBatch
# Create a batch of molecules (each becomes a conformer)
atoms_list = [molecule("H2O"), molecule("NH3"), molecule("CH4")]
batch = ConformerBatch.from_ase(atoms_list)
batch.to("cuda") # if available
# Configure optimiser and attach a calculator that provides forces
optimiser = BFGS(steps=10, fmax=0.05, fexit=500.0, max_step=0.04)
optimiser.calculator = RandomCalculator()
# Run optimisation
converged = optimiser.run(batch)
print("All Converged:", converged)
for i, (conv, nsteps) in enumerate(zip(batch.converged, batch.converged_step)):
print(f"Conformer {i}: Converged: {conv}, On step {nsteps}")
# Trajectory [T, N, 3] and converged coordinates [N, 3]
print("pos_dt shape:", tuple(batch.pos_dt.shape))
print("pos shape:", tuple(batch.pos.shape))
```
**Notes:**
- `fmax` triggers convergence per conformer using the maximum per-atom force norm,. Ether `fmax` or `steps` must be specified.
- `fexit` triggers early exit if all non-converged conformers exceed the threshold.
- Trajectories are accumulated in memory as `batch.pos_dt` along with their energies and forces (`batch.forces_dt`, `batch.energies_dt`); converged geometries are indexed into `batch.pos`, `batch.energies`, `batch.forces` (final positions are returned for non-converged conformers). See `neural_optimiser.optimise.base.Optimiser` for more details.
### Run a Larger BFGS Optimisation using the ConformerDataLoader
For large datasets all your conformers may not fit in memory at once. In this case you can use the `neural_optimiser.datasets.base.ConformerDataLoader` to stream conformers in mini-batches.
```python
from rdkit import Chem
from rdkit.Chem import AllChem
from neural_optimiser.conformers import Conformer, ConformerBatch
from neural_optimiser.datasets.base import ConformerDataset, ConformerDataLoader
from neural_optimiser.optimisers import BFGS
from neural_optimiser.calculators import RandomCalculator
# Build a pool of conformers from RDKit molecules
smiles_list = ["CCO", "CC", "CCN"]
mols = []
for smiles in smiles_list:
m = Chem.AddHs(Chem.MolFromSmiles(smiles))
AllChem.EmbedMultipleConfs(m, numConfs=10, useExpTorsionAnglePrefs=True, useBasicKnowledge=True)
mols.append(m)
big_batch = ConformerBatch.from_rdkit(mols) # creates one Conformer per RDKit conformer
big_batch.to("cuda") # if available
# Dataset/DataLoader -> yields ConformerBatch
dataset = ConformerDataset(big_batch.to_data_list())
dataloader = ConformerDataLoader(dataset, batch_size=8, shuffle=True, num_workers=0)
# Configure optimiser and attach a calculator that provides forces
optimiser = BFGS(steps=10, fmax=0.05, fexit=500.0, max_step=0.04)
optimiser.calculator = RandomCalculator()
for batch in dataloader:
optimiser.run(batch)
```
### Using Your Own Calculator
Implement a calculator by subclassing `neural_optimiser.calculators.base.Calculator` and returning energies and forces for the full batch.
```python
import torch
from torch_geometric.data import Batch, Data
from neural_optimiser.calculators.base import Calculator
class MyCalculator(Calculator):
def _calculate(self, batch: Data | Batch) -> tuple[torch.Tensor, torch.Tensor]:
# energies: required shape [N_atoms]
energies = torch.zeros(batch.n_conformers, device=self.device, dtype=torch.float32)
# forces: required shape [N_atoms, 3] matching batch.pos
forces = torch.zeros_like(batch.pos, device=self.device)
# ... fill forces from your model ...
return energies, forces
```
### Data Containers
**Conformer**
Molecules with 3D geometries are stored as `neural_optimiser.conformers._conformer.Conformer` objects.
```python
from ase.build import molecule
from rdkit import Chem
from rdkit.Chem import AllChem
from neural_optimiser.conformers import Conformer
# From ASE
atoms = molecule("H2O")
conf1 = Conformer.from_ase(atoms, smiles="O")
print(type(conf1).__name__)
print("atom_types:", conf1.atom_types.shape) # [n_atoms]
print("pos:", conf1.pos.shape) # [n_atoms, 3]
print("smiles:", conf1.smiles)
# Convert back to ASE
atoms2 = conf1.to_ase()
print("ASE atoms:", atoms2.get_chemical_formula(), atoms2.positions.shape)
# From RDKit
mol = Chem.MolFromSmiles("CCO")
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol, AllChem.ETKDG())
conf2 = Conformer.from_rdkit(mol)
# Convert back to RDKit (returns a Mol with one 3D conformer)
mol2 = conf2.to_rdkit()
print("RDKit confs:", mol2.GetNumConformers())
```
**ConformerBatch**
Different molecules, or conformers of the same molecule (or both) can be stored in a `neural_optimiser.conformers._conformer_batch.ConformerBatch` object.
```python
from ase.build import molecule
from rdkit import Chem
from rdkit.Chem import AllChem
from neural_optimiser.conformers import Conformer, ConformerBatch
# Build from ASE
atoms_list = [molecule("H2O"), molecule("NH3"), molecule("CH4")]
batch_ase = ConformerBatch.from_ase(atoms_list)
print("ASE batch:", batch_ase.n_molecules, batch_ase.n_conformers, batch_ase.n_atoms)
# Slice a single conformer view
conf0 = batch_ase.conformer(0)
print("conf0 pos:", conf0.pos.shape)
# Indices available on the batch (per-atom)
print("batch index shape:", batch_ase.batch.shape)
print("molecule_idxs shape:", batch_ase.molecule_idxs.shape)
# Build from RDKit (multiple conformers per molecule also supported)
def rdkit_with_coords(smiles: str):
m = Chem.MolFromSmiles(smiles)
m = Chem.AddHs(m)
AllChem.EmbedMolecule(m, AllChem.ETKDG())
return m
mol_list = [rdkit_with_coords("O"), rdkit_with_coords("CCO")]
batch_rd = ConformerBatch.from_rdkit(mol_list)
print("RDKit batch:", batch_rd.n_molecules, batch_rd.n_conformers, batch_rd.n_atoms)
# Build from a list of Conformer objects
c1 = Conformer.from_ase(molecule("H2O"))
c2 = Conformer.from_ase(molecule("NH3"))
batch_list = ConformerBatch.from_data_list([c1, c2])
print("Data list batch:", batch_list.n_molecules, batch_list.n_conformers, batch_list.n_atoms)
```
## Testing
```bash
uv run pytest tests/
```
## License
Apache 2.0 - see [LICENSE](LICENSE).