{"id":34833003,"url":"https://github.com/openforcefield/yammbs","last_synced_at":"2026-01-16T21:00:58.222Z","repository":{"id":84447980,"uuid":"539192850","full_name":"openforcefield/yammbs","owner":"openforcefield","description":"Internal tool for benchmarking force fields","archived":false,"fork":false,"pushed_at":"2026-01-13T01:47:35.000Z","size":20291,"stargazers_count":6,"open_issues_count":32,"forks_count":2,"subscribers_count":15,"default_branch":"main","last_synced_at":"2026-01-13T03:17:37.852Z","etag":null,"topics":["force-fields","molecular-modeling"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openforcefield.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2022-09-20T21:08:39.000Z","updated_at":"2026-01-12T23:47:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"efe53b1d-e9ae-465b-a676-ac4ff953985d","html_url":"https://github.com/openforcefield/yammbs","commit_stats":null,"previous_names":[],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/openforcefield/yammbs","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openforcefield%2Fyammbs","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openforcefield%2Fyammbs/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openforcefield%2Fyammbs/releases","manifests_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/openforcefield%2Fyammbs/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openforcefield","download_url":"https://codeload.github.com/openforcefield/yammbs/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openforcefield%2Fyammbs/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28482472,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-16T11:59:17.896Z","status":"ssl_error","status_checked_at":"2026-01-16T11:55:55.838Z","response_time":107,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["force-fields","molecular-modeling"],"created_at":"2025-12-25T15:57:14.776Z","updated_at":"2026-01-16T21:00:58.209Z","avatar_url":"https://github.com/openforcefield.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# YAMMBS\n\nYet Another Molecular Mechanics Benchmarking Suite (YAMMBS, pronounced like \"yams\") is a tool for\nbenchmarking force fields.\n\nYAMMBS is currently developed for internal use at Open Force Field. It is not currently recommended for external use. No guarantees are made about the stability of the API or the accuracy of any results. 
Feedback and contributions are welcome [on GitHub](https://github.com/openforcefield/yammbs).\n\n## Installation\n\nCreate a conda environment from `./devtools/conda-envs/dev.yaml`, then install `yammbs` into it with something like `python -m pip install -e .`.\n\n## Getting started\n\nSee the file [run.py](run.py) for a start-to-finish example. Note that the pattern in the script\n\n```python\nfrom multiprocessing import freeze_support\n\ndef main():\n    # Your code here\n    ...\n\nif __name__ == \"__main__\":\n    freeze_support()\n    main()\n```\n\nmust be used for Python's `multiprocessing` module to behave well.\n\n### Data sources\n\nIt is assumed that the input molecules are stored in an `openff-qcsubmit` model like `OptimizationResultCollection` or YAMMBS's own input models.\n\n### Preparing an input dataset\n\nYAMMBS relies on QCSubmit to provide datasets from QCArchive. See [their docs](https://docs.openforcefield.org/projects/qcsubmit/en/stable/), particularly the [dataset retrieval example](https://docs.openforcefield.org/projects/qcsubmit/en/stable/examples/retrieving-results.html), for more.\n\nCurrently only optimization datasets (`OptimizationResultCollection` in QCSubmit) are supported.\n\nFirst, retrieve a dataset from QCArchive:\n\n```python\nfrom qcportal import PortalClient\n\nfrom openff.qcsubmit.results import OptimizationResultCollection\n\n\nclient = PortalClient(\"https://api.qcarchive.molssi.org:443\", cache_dir=\".\")\n\nseason1_dataset = OptimizationResultCollection.from_server(\n    client=client,\n    datasets=\"OpenFF Industry Benchmark Season 1 v1.1\",\n    spec_name=\"default\",\n)\n```\n\nAfter retrieving it - and after applying filters to remove problematic records - you can dump it to disk to avoid pulling down all of the data from the server again.\n\n```python\nwith open(\"qcsubmit.json\", \"w\") as f:\n    f.write(season1_dataset.json())\n```\n\nOnce an `OptimizationResultCollection` is in memory, either by pulling it down from QCArchive or loading it 
from disk, convert it to a \"YAMMBS input\" model using the API:\n\n```python\nfrom yammbs.inputs import QCArchiveDataset\n\n\nseason1_dataset = OptimizationResultCollection.parse_file(\"qcsubmit.json\")\n\ndataset = QCArchiveDataset.from_qcsubmit_collection(season1_dataset)\n\nwith open(\"input.json\", \"w\") as f:\n    f.write(dataset.model_dump_json())\n```\n\nThis input model (`QCArchiveDataset`) stores a minimum amount of information to use these QM geometries as reference structures. The dataset has fields for tagging the name and model version, but mostly stores a list of structures. Each QM-optimized structure is stored as a `QCArchiveMolecule` object which stores:\n\n* (Mapped) SMILES which can be used to regenerate the `openff.toolkit.Molecule` and similar objects\n* QM-optimized geometry\n* Final energy from QM optimization\n* An ID uniquely defining this structure within the datasets\n\nIf running many benchmarks, we recommend using this file as a starting point.\n\nNote: This JSON file (\"input.json\") is from a different model than the JSON file written from QCSubmit - they are not interchangeable.\n\nNote: Both QCSubmit and YAMMBS rely on Pydantic for model validation and serialization. Even though both use V2 in packaging, YAMMBS uses the V2 API and (as of October 2024) QCSubmit still uses the V1 API. 
Usage like above should work fine; only esoteric use cases (in particular, defining a new model that has both YAMMBS and QCSubmit models as fields) should be unsupported.\n\n### Run a benchmark\n\nWith the input prepared, create a `MoleculeStore` object:\n\n```python\nfrom yammbs import MoleculeStore\n\nstore = MoleculeStore.from_qcarchive_dataset(dataset)\n```\n\nThis object is the focal point of running benchmarks; it stores the inputs (QM structures), runs minimizations with force field(s) of interest, stores the results (MM structures), and provides helper methods for use in analysis.\n\nRun MM optimizations of all molecules using one or more force fields with `optimize_mm`:\n\n```python\nstore.optimize_mm(force_field=\"openff-2.1.0.offxml\")\n\n# can also iterate over multiple force fields, and use more processors\nfor force_field in [\n    \"openff-1.0.0.offxml\",\n    \"openff-1.3.0.offxml\",\n    \"openff-2.0.0.offxml\",\n    \"openff-2.1.0.offxml\",\n    \"openff-2.2.1.offxml\",\n    \"gaff-2.11\",\n    \"de-force-1.0.1.offxml\",\n]:\n    store.optimize_mm(force_field=force_field, n_processes=16)\n```\n\nThis method short-circuits (i.e. does not run minimizations) if a force field's results are already stored. For example, the Sage 2.1 (`openff-2.1.0.offxml`) optimizations in the loop will be skipped.\n\nThere are \"output\" models that mirror the input models, basically storing MM-minimized geometries without needing to re-load or re-optimize the QM geometries. These can again be saved to disk as JSON:\n\n```python\nwith open(\"output.json\", \"w\") as f:\n    f.write(store.get_outputs().model_dump_json())\n```\n\nSummary metrics (including DDE, RMSD, TFD, and internal coordinate RMSDs) are available separately (in order to reduce file size when only summary statistics, and not whole molecular descriptions and geometries, are sought):\n\n```python\nwith open(\"metrics.json\", \"w\") as f:\n    f.write(store.get_metrics().model_dump_json())\n```\n\nThe basic structure of the metrics is a hierarchical dictionary. It is keyed by force field tag (e.g. 
\"openff-2.2.1\") mapping on to a dict of per-molecule summary metrics. Each of these dicts is keyed by QCArchive ID (the same ID used to distinguish structures in the input and output models) mapping onto a dict of string keys to float values that store the actual metrics (i.e. the DDE, RMSD, etc. of this particular structure optimized with the force field used as its high-level key). Access to these data is similar in memory (on the Pydantic models) and on disk (in JSON). Visually:\n\n```json\n{\n    \"metrics\": {\n        \"openff-1.0.0\": {\n            \"37016854\": {\n                \"dde\": 0.5890449032115157,\n                \"rmsd\": 0.011969891530473157,\n                \"tfd\": 0.001592046369769131,\n                \"icrmsd\": {\n                    \"Bond\": 0.0033974261816308144,\n                    \"Angle\": 0.9483605366613115,\n                    \"Dihedral\": 1.353163675708829,\n                    \"Improper\": 0.2922040744956022,\n                },\n            },\n            \"37016855\": {\"this molecule's metrics ...\"},\n        },\n        \"openff-2.0.0\": {\n            \"37016855\": {\"this force field's molecules ...\"},\n        }\n    }\n}\n```\n\nThis data can be transformed for plotting, summary statistics, etc. which compare the metrics of each force field (for this molecule dataset).\n\n### Run a TorsionDrive benchmark\n\n`YAMMBS` contains functionality for running TorsionDrive benchmarks in `yammbs.torsion`. Also, a convenience script for torsion analysis is provided in `yammbs/scripts/run_torsion_comparisons.py`. 
This can be run from anywhere using `yammbs_analyse_torsions`:\n\n```bash\nyammbs_analyse_torsions --help\n```\n\nFor example, to run the [OpenFF Rowley Biaryl v1.0 TorsionDrive dataset](https://github.com/openforcefield/qca-dataset-submission/tree/master/submissions/2020-06-17-OpenFF-Biaryl-set), first download it with\n\n```python\nfrom openff.qcsubmit.results import TorsionDriveResultCollection\nfrom qcportal import PortalClient\n\nfrom yammbs.torsion.inputs import QCArchiveTorsionDataset\n\nclient = PortalClient(\"https://api.qcarchive.molssi.org:443\", cache_dir=\".\")\n\nrowley_torsion_dataset = TorsionDriveResultCollection.from_server(\n    client=client,\n    datasets=\"OpenFF Rowley Biaryl v1.0\",\n    spec_name=\"default\",\n)\n\ndataset = QCArchiveTorsionDataset.from_qcsubmit_collection(rowley_torsion_dataset)\n\nwith open(\"input.json\", \"w\") as f:\n    f.write(dataset.model_dump_json())\n```\n\nThen, to run the benchmark with `openff-1.0.0.offxml` and `openff-2.2.1.offxml`:\n\n```bash\nyammbs_analyse_torsions --qcarchive-torsion-data input.json \\\n    --base-force-fields openff-1.0.0 \\\n    --base-force-fields openff-2.2.1\n```\n\nThis takes a bit over 10 minutes on a 32-core machine with 125 GB RAM. Note that when supplying your own force fields, make sure that these are the unconstrained versions (this is done automatically for e.g. `openff-1.0.0`), for example:\n\n```bash\nyammbs_analyse_torsions --qcarchive-torsion-data input.json \\\n    --extra-force-fields my_unconstrained_ff.offxml\n```\n\nA range of OpenFF force fields will be run for comparison if no `--base-force-fields` are specified.\n\n## Custom analyses\n\nSee [examples.ipynb](examples.ipynb) for some examples of interacting with benchmarking results and a starting point for custom analyses.\n\n### License\n\nYAMMBS is open-source software distributed under the MIT license (see LICENSE). 
It derives from\nother open-source work that may be distributed under other licenses (see LICENSE-3RD-PARTY).\n\n### Copyright\n\nCopyright (c) 2022, Open Force Field Initiative\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenforcefield%2Fyammbs","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenforcefield%2Fyammbs","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenforcefield%2Fyammbs/lists"}