https://github.com/night-fury-me/digital-alchemy
- Host: GitHub
- URL: https://github.com/night-fury-me/digital-alchemy
- Owner: night-fury-me
- Created: 2024-11-21T18:33:22.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-03-06T17:50:33.000Z (3 months ago)
- Last Synced: 2025-03-06T18:37:28.702Z (3 months ago)
- Language: Python
- Size: 4.23 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
> _This repository contains the Digital Alchemy project (WiSe24/25 at FAU), which uses [SchNetPack](https://github.com/atomistic-machine-learning/schnetpack) with the [MD17](https://www.sgdml.org/#datasets) and [QM7-X](https://zenodo.org/records/4288677) datasets._
### Project Title
**Geometric Deep Learning for Quantum-Accurate Force Prediction**: _A Comparative Study of SchNet, Field SchNet, PaiNN, and SO(3)Net on Energy and Molecular Dynamics_
---
### **Overview**
This project trains deep learning models (`SchNet`, `FieldSchNet`, `PaiNN`, and `SO(3)Net`) on the `MD17` and `QM7-X` datasets to predict `molecular energies and forces`. The trained models are then used in `Molecular Dynamics (MD)` simulations to study atomic motion under various conditions.
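
Energies and forces are linked: in an energy-conserving force field, the force along each coordinate is the negative derivative of the energy with respect to that coordinate. The sketch below illustrates only that relation, using central finite differences on a toy harmonic energy. `toy_energy` and `numerical_forces` are hypothetical names for illustration and are not part of this repository or of SchNetPack.

```python
def toy_energy(positions, k=1.0):
    """Toy harmonic energy E = 0.5 * k * sum(x^2); a stand-in for a trained model."""
    return 0.5 * k * sum(x * x for x in positions)


def numerical_forces(energy_fn, positions, h=1e-5):
    """Approximate F_i = -dE/dx_i with central finite differences."""
    forces = []
    for i in range(len(positions)):
        plus = list(positions)
        minus = list(positions)
        plus[i] += h   # displace coordinate i forward
        minus[i] -= h  # displace coordinate i backward
        forces.append(-(energy_fn(plus) - energy_fn(minus)) / (2.0 * h))
    return forces


# For the harmonic toy energy, each force should be close to -k * x_i.
print(numerical_forces(toy_energy, [0.5, -1.0, 2.0]))
```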
---
### **Prerequisites**
Ensure that you have the following installed on your system:
- [Docker](https://docs.docker.com/get-docker/)
- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) (for GPU support)
To verify that Docker is installed, run:
```bash
docker --version
```
If you want to use GPU support, ensure the NVIDIA runtime is set up by running:
```bash
docker run --rm --gpus all nvidia/cuda:11.7.1-base nvidia-smi
```
This should display information about your GPU.
---
### **Building the Docker Image**
Clone the repository and navigate to the project directory:
```bash
git clone https://github.com/night-fury-me/digital-alchemy.git
cd digital-alchemy
```
Then, build the Docker image using:
```bash
docker build -f Dockerfile.custom -t digital_alchemy_img .
```
This will create an image named `digital_alchemy_img`.
---
### **Dataset Preparation**
Run the dataset preparation script:
```bash
docker run --rm -it --gpus all -v $(pwd):/workspace -w /workspace digital_alchemy_img bash data/prepare-dataset.sh
```
This script will:
- Run `download.py` to download and convert the dataset to an HDF5 file.
- Run `create-db.py` to create a database (.db) file from the dataset.
---
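A quick way to confirm that the preparation step finished is to check for the files it should have produced. The sketch below assumes the file names shown in this README's directory listing (`1000.hdf5` through `8000.hdf5` and `QM7X.db` under `data/QM7X_Dataset`). The helper `missing_dataset_files` is a hypothetical name and is not part of the repository.

```python
from pathlib import Path


def missing_dataset_files(root="data/QM7X_Dataset"):
    """Return the expected dataset files that are absent under `root`."""
    root = Path(root)
    # Assumed file names, taken from the directory listing in this README.
    expected = [f"{n}.hdf5" for n in range(1000, 9000, 1000)] + ["QM7X.db"]
    return [name for name in expected if not (root / name).is_file()]


missing = missing_dataset_files()
print("All dataset files present." if not missing else f"Missing: {missing}")
```

Run it from the project root after `prepare-dataset.sh` completes; an empty result means every expected file was found.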
### **Directory Structure**
After running the scripts, the project directory will look like this:
```bash
digital-alchemy/
├── ckpts
│   └── best_model
├── data
│   ├── create-db.py
│   ├── download.py
│   ├── prepare-dataset.sh
│   └── QM7X_Dataset
│       ├── 1000.hdf5
│       ├── 2000.hdf5
│       ├── 3000.hdf5
│       ├── 4000.hdf5
│       ├── 5000.hdf5
│       ├── 6000.hdf5
│       ├── 7000.hdf5
│       ├── 8000.hdf5
│       └── QM7X.db
├── energy-vs-time.py
├── environment.yml
├── evaluate.py
├── lightning_logs
│   └── version_1
│       ├── checkpoints
│       │   └── epoch=83-step=5292.ckpt
│       ├── events.out.tfevents.1739748819.redStation.1030311.0
│       └── hparams.yaml
├── mlruns
│   ├── 0
│   │   └── meta.yaml
│   ├── 595388344762858645
│   └── models
├── paper-presentation
│   ├── BiM-Network.ipynb
│   ├── dataset-explore.ipynb
│   └── README.md
├── qm7x_ase_calculations
│   └── test_molecule.xyz
├── README.md
├── requirements.txt
├── Dockerfile
├── Dockerfile.custom
├── simulate_md.py
├── simulation
│   └── trajectory.traj
├── split.npz
├── splitting.lock
├── train.py
└── __version__.py
```
### **Running the Training Script (`train.py`) in a Docker Container**
After the MLflow server is running, open another terminal and execute:
```bash
docker run --rm -it --gpus all -v $(pwd):/workspace -w /workspace --network="host" digital_alchemy_img python train.py
```
---
### **Training Details**
- Experimented with `SchNet`, `FieldSchNet`, `PaiNN`, `SO(3)Net`.
- Optimized with the `AdamW` optimizer.
- The loss function balances energy (MAE loss) and force predictions.
- Training results are logged to TensorBoard.
---
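The balance between energy and force terms listed under Training Details can be sketched as a weighted sum. This is a minimal illustration and not the code in `train.py`: the weights `w_energy` and `w_force` and the choice of MSE for the force term are assumptions, since this README does not state them.

```python
def mae(pred, target):
    """Mean absolute error over paired values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse(pred, target):
    """Mean squared error over paired values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(e_pred, e_true, f_pred, f_true, w_energy=0.01, w_force=0.99):
    """Weighted sum of an energy MAE term and a force MSE term (weights are assumed)."""
    return w_energy * mae(e_pred, e_true) + w_force * mse(f_pred, f_true)


# Two molecules' energies and three flattened force components, all made up.
loss = combined_loss([1.0, 2.0], [1.5, 2.5], [0.1, -0.2, 0.3], [0.0, 0.0, 0.0])
print(loss)
```

Weighting the force term more heavily is a common choice when the model will drive MD, because the dynamics depend on forces and not on absolute energies.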
### **Evaluating the Model**
Once training is complete, evaluate the model on unseen test molecules:
```bash
docker run --rm -it --gpus all -v $(pwd):/workspace -w /workspace --network="host" digital_alchemy_img python evaluate.py
```
---
### **Running Molecular Dynamics (MD) Simulations**
To run MD simulations:
```bash
docker run --rm -it --gpus all -v $(pwd):/workspace -w /workspace --network="host" digital_alchemy_img python simulate_md.py
```
#### **Debugging MD Issues:**
If energy remains **constant instead of fluctuating**, try:
- Reducing the timestep (`0.5 fs` instead of `1 fs`).
- Increasing Langevin friction (`0.1` instead of `0.02`).
- Printing force values:
```python
print("Forces at Step 100:", atoms.get_forces())
```
- Printing temperature:
```python
print("Temperature at Step 100:", atoms.get_temperature())
```
---
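Before changing the timestep or friction, it can help to confirm numerically that the energy series really is flat. The sketch below is a hypothetical helper, not part of `simulate_md.py`; the tolerance `tol` (taken to be in eV) is an assumed value to adjust for the system at hand.

```python
def energy_spread(energies):
    """Difference between the largest and smallest energy in the series."""
    return max(energies) - min(energies)


def looks_constant(energies, tol=1e-6):
    """True if the series varies by less than `tol` (assumed to be in eV)."""
    return energy_spread(energies) < tol


# Made-up series: the first is flat, the second fluctuates as expected in MD.
print(looks_constant([-1.0, -1.0, -1.0, -1.0]))
print(looks_constant([-1.0, -0.98, -1.01, -0.99]))
```

If the series is flat, the force printout above is the natural next check, since forces near zero would leave the atoms stationary.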
### **Visualizing MD Results**
#### **Install ASE (Atomic Simulation Environment)**
```bash
pip install ase
```
#### **View Atomic Motion in ASE GUI**
```bash
ase gui trajectory.traj
```
#### **Plot Energy vs. Time**
```bash
python energy-vs-time.py
```
#### **Extract MD Energies from Trajectory**
```python
from ase.io import Trajectory
traj = Trajectory("simulation/trajectory.traj")
for step, atoms in enumerate(traj):
    # Print the potential energy stored with each saved frame
    print(f"Step {step}: Energy = {atoms.get_potential_energy()} eV")
```
---
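The frame index printed above is not yet a time. To plot energy against time, each index needs to be scaled by the MD timestep and by how often frames were written. The sketch below shows that conversion and writes the result as CSV; `timestep_fs` and `log_interval` are assumed values that should match the settings used in `simulate_md.py`, and both function names are hypothetical.

```python
import csv
import io


def energy_time_rows(energies, timestep_fs=0.5, log_interval=1):
    """Pair each stored energy with its simulation time in femtoseconds."""
    return [(i * log_interval * timestep_fs, e) for i, e in enumerate(energies)]


def write_csv(rows, handle):
    """Write (time, energy) rows with a header line to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["time_fs", "energy_eV"])
    writer.writerows(rows)


# Made-up energies, with one frame written every 10 steps of 0.5 fs.
rows = energy_time_rows([-1.0, -1.1, -0.9], timestep_fs=0.5, log_interval=10)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```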
### **Troubleshooting**
#### **1. Permission Denied for `prepare-dataset.sh`**
```bash
chmod +x data/prepare-dataset.sh
```
---
### **References**
- Schütt et al., (2023). SchNetPack 2.0: _A neural network toolbox for atomistic machine learning._ https://doi.org/10.1063/5.0138367
- Schütt, K. T., Kessel, P., Gastegger, M., Nicoli, K. A., Tkatchenko, A., & Müller, K.-R. (2019). _"SchNetPack: A Deep Learning Toolbox For Atomistic Systems."_ _Journal of Chemical Theory and Computation_, 15(1), 448–455. https://doi.org/10.1021/acs.jctc.8b00908
GitHub repository: [SchNetPack](https://github.com/atomistic-machine-learning/schnetpack)
- Johannes Hoja, Leonardo Medrano Sandonas, Brian G. Ernst, Alvaro Vazquez-Mayagoitia, Robert A. DiStasio Jr., Alexandre Tkatchenko. QM7-X, a comprehensive dataset. _Scientific Data_, 7(1), 1-7. https://arxiv.org/abs/2006.15139
- Chmiela, S., Sauceda, H. E., Müller, K. R., & Tkatchenko, A. (2017). _"Machine learning of accurate energy-conserving molecular force fields."_ _Science Advances_, 3(5), e1603015. https://doi.org/10.1126/sciadv.1603015