{"id":46952251,"url":"https://github.com/asad/smsd","last_synced_at":"2026-04-12T04:02:23.887Z","repository":{"id":55650931,"uuid":"1520524","full_name":"asad/SMSD","owner":"asad","description":"SMSD — exact substructure \u0026 MCS search for chemical graphs.","archived":false,"fork":false,"pushed_at":"2026-04-03T02:14:45.000Z","size":2839,"stargazers_count":48,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2026-04-03T02:19:58.655Z","etag":null,"topics":["bron-kerbosch","cdk","cheminformatics","chemistry","graph-matching","java","maven-central","mcs","rascal","rrsplit","smarts","smsd","stereo-aware","substructure","substructure-search","vf2","vf2pp"],"latest_commit_sha":null,"homepage":"https://central.sonatype.com/artifact/com.bioinceptionlabs/smsd/3.0.0","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/asad.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":"NOTICE","maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2011-03-24T11:18:56.000Z","updated_at":"2026-04-03T02:14:39.000Z","dependencies_parsed_at":"2025-09-17T16:28:17.388Z","dependency_job_id":"0459298e-ef61-4be9-9b0d-9594abbcd8f0","html_url":"https://github.com/asad/SMSD","commit_stats":null,"previous_names":[],"tags_count":60,"template":false,"template_full_name":null,"purl":"pkg:github/asad/SMSD","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asad%2FSMSD","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asad%2FSMSD/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asad%2FSMSD/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asad%2FSMSD/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/asad","download_url":"https://codeload.github.com/asad/SMSD/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/asad%2FSMSD/sbom","scorecard":{"id":210659,"data":{"date":"2025-08-11","repo":{"name":"github.com/asad/SMSD","commit":"173a3fc735ada4b35133e37ccea2ab81d5a380f1"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.5,"checks":[{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Code-Review","score":0,"reason":"Found 0/26 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":9,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Warn: project license file does not contain an FSF or OSI license."],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":0,"reason":"Project has not signed or included provenance with any releases.","details":["Warn: release artifact 2.2.0 not signed: https://api.github.com/repos/asad/SMSD/releases/13410237","Warn: release artifact smsd-2.0.0 not signed: https://api.github.com/repos/asad/SMSD/releases/3425878","Warn: release artifact SMSD-v1.8 not signed: https://api.github.com/repos/asad/SMSD/releases/3342044","Warn: release artifact v1.7 not signed: https://api.github.com/repos/asad/SMSD/releases/924762","Warn: release artifact 2.2.0 does not have provenance: https://api.github.com/repos/asad/SMSD/releases/13410237","Warn: release artifact smsd-2.0.0 does not have provenance: https://api.github.com/repos/asad/SMSD/releases/3425878","Warn: release artifact SMSD-v1.8 does not have provenance: https://api.github.com/repos/asad/SMSD/releases/3342044","Warn: release artifact v1.7 does not have provenance: https://api.github.com/repos/asad/SMSD/releases/924762"],"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 4 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Vulnerabilities","score":0,"reason":"19 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: GHSA-5mg8-w23w-74h3","Warn: Project is vulnerable to: GHSA-7g45-4rm6-3mm3","Warn: Project is vulnerable to: GHSA-mvr2-9pj6-7w5j","Warn: Project is vulnerable to: GHSA-78wr-2p64-hpwj","Warn: Project is vulnerable to: GHSA-gwrp-pvrq-jmwv","Warn: Project is vulnerable to: GHSA-269g-pwp5-87pp","Warn: Project is vulnerable to: GHSA-2qrg-x229-3v8q","Warn: Project is vulnerable to: GHSA-65fg-84f6-3jq3","Warn: Project is vulnerable to: GHSA-f7vh-qwp3-x37m","Warn: Project is vulnerable to: GHSA-fp5r-v3w9-4333","Warn: Project is vulnerable to: GHSA-w9p3-5cr8-m3jj","Warn: Project is vulnerable to: GHSA-7rp6-w7mg-h8rw","Warn: Project is vulnerable to: GHSA-9339-86wc-4qgf","Warn: Project is vulnerable to: GHSA-rc2w-r4jq-7pfx","Warn: Project is vulnerable to: GHSA-334p-wv2m-w3vp","Warn: Project is vulnerable to: GHSA-7j4h-8wpf-rqfh","Warn: Project is vulnerable to: GHSA-h65f-jvqw-m9fj","Warn: Project is vulnerable to: GHSA-vmqm-g3vh-847m","Warn: Project is vulnerable to: GHSA-w4jq-qh47-hvjq"],"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-17T00:43:11.245Z","repository_id":55650931,"created_at":"2025-08-17T00:43:11.245Z","updated_at":"2025-08-17T00:43:11.245Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31422898,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T02:22:46.605Z","status":"ssl_error","status_checked_at":"2026-04-05T02:22:33.263Z","response_time":75,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bron-kerbosch","cdk","cheminformatics","chemistry","graph-matching","java","maven-central","mcs","rascal","rrsplit","smarts","smsd","stereo-aware","substructure","substructure-search","vf2","vf2pp"],"created_at":"2026-03-11T08:36:00.872Z","updated_at":"2026-04-12T04:02:23.866Z","avatar_url":"https://github.com/asad.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://github.com/asad/SMSD\" aria-label=\"SMSD Pro\"\u003e\n    \u003cimg src=\"icons/icon.svg\" alt=\"SMSD Pro\" width=\"180\"/\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003eSMSD Pro\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cstrong\u003eSubstructure \u0026amp; MCS Search for Chemical Graphs\u003c/strong\u003e\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://central.sonatype.com/artifact/com.bioinceptionlabs/smsd\"\u003e\u003cimg src=\"https://img.shields.io/maven-central/v/com.bioinceptionlabs/smsd\" alt=\"Maven Central\"/\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/smsd/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/v/smsd\" alt=\"PyPI\"/\u003e\u003c/a\u003e\n  \u003ca href=\"https://pypi.org/project/smsd/\"\u003e\u003cimg src=\"https://img.shields.io/pypi/dm/smsd\" alt=\"Downloads\"/\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-Apache%202.0-blue.svg\" alt=\"License\"/\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/asad/SMSD/releases\"\u003e\u003cimg src=\"https://img.shields.io/github/v/release/asad/SMSD\" alt=\"Release\"/\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\nSMSD Pro provides exact substructure search and maximum common substructure\n(MCS) search for chemical graphs. It is available for **Java**, **C++**\n(header-only), and **Python**. Optional GPU paths are available for CUDA and\nApple Metal builds.\n\nVersion `6.12.2` adds a lightweight clique-based MCS solver, standalone\nfingerprint modules (ECFP, path, pharmacophore, torsion, MCS-FP),\nSmallExactMCSExplorer for small molecule pairs, FixedSizeBondMaximizer,\nglobal reaction deadline, Hungarian algorithm, scaffold library, and\nperiodic table headers. Also includes the 6.11.2 fixes: CIP Rule 3 (Z \u003e E),\nmemory leak and BK color-bound overflow fixes, corrected tautomer weights,\nand standardised API naming (`MCSResult`, `MCSOptions`, `overlapCoefficient`).\n\n### Guides and References\n\n| Document | Description |\n|----------|-------------|\n| **[Examples, How-To, and Cautions](docs/EXAMPLES.md)** | Worked examples for every feature with cautions and performance tips |\n| [Python API Guide](docs/PYTHON.md) | Full Python API reference with code examples |\n| [Java Guide](docs/JAVA.md) | Java API and CLI usage |\n| [C++ Guide](docs/CPP.md) | Header-only C++ integration |\n| [Release Notes 6.12.2](docs/RELEASE_NOTES_6.12.2.md) | What's new in this release |\n| [Whitepaper](docs/WHITEPAPER.md) | Algorithm design (11-level MCS, VF2++, ring perception) |\n| [How to Install](docs/HOWTO-INSTALL.md) | Build from source on all platforms |\n| [Changelog](CHANGELOG.md) | Full versioned change history |\n\n### Molfile Support\n\nV2000 and V3000 core graph round-trip, names/comments, SDF properties, charges,\nisotopes, atom classes/maps, `R#` plus `M  RGP`, and basic stereo flags.\n\n**Copyright (c) 2018-2026 Syed Asad Rahman — BioInception PVT LTD**\n\n---\n\n## Install\n\n### Java (Maven)\n\n```xml\n\u003cdependency\u003e\n  \u003cgroupId\u003ecom.bioinceptionlabs\u003c/groupId\u003e\n  \u003cartifactId\u003esmsd\u003c/artifactId\u003e\n  \u003cversion\u003e6.12.2\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n### Java (Download JAR)\n\n```bash\ncurl -LO https://github.com/asad/SMSD/releases/download/v6.12.2/smsd-6.12.2-jar-with-dependencies.jar\n\njava -jar smsd-6.12.2-jar-with-dependencies.jar \\\n  --Q SMI --q \"c1ccccc1\" --T SMI --t \"c1ccc(O)cc1\" --json -\n```\n\n### Python (pip)\n\n```bash\npip install smsd\n```\n\nSupported CPython versions: `3.9` through the latest stable release series.\nCurrent default test target: `Python 3.12`.\nCPU execution is the default path. CUDA and Metal acceleration are optional.\nRDKit and Open Babel are optional interop layers.\n\n```python\nimport smsd\n\nresult = smsd.substructure_search(\"c1ccccc1\", \"c1ccc(O)cc1\")\nmcs    = smsd.mcs(\"c1ccccc1\", \"c1ccc2ccccc2c1\")\n\n# Tautomer-aware MCS\nmcs    = smsd.mcs(\"CC(=O)C\", \"CC(O)=C\", tautomer_aware=True)\n\n# Prefer rare heteroatoms (S, P, Se) for reaction mapping\nmcs    = smsd.mcs(\"C[S+](C)CCC(N)C(=O)O\", \"SCCC(N)C(=O)O\",\n                   prefer_rare_heteroatoms=True)\n\n# Reaction-aware atom mapping\naam    = smsd.map_reaction_aware(\"CC(=O)O\", \"CCO\")\n\n# Similarity upper bound (fast pre-filter)\nsim    = smsd.similarity(\"c1ccccc1\", \"c1ccc(O)cc1\")\n\nfp     = smsd.fingerprint(\"c1ccccc1\", kind=\"mcs\")\n\n# Circular fingerprint (ECFP4 equivalent, tautomer-aware)\necfp4 = smsd.circular_fingerprint(\"c1ccccc1\", radius=2, fp_size=2048)\n```\n\n### Java API\n\n```java\nimport com.bioinception.smsd.core.*;\n\nSMSD smsd = new SMSD(mol1, mol2, new ChemOptions());\nboolean isSub = smsd.isSubstructure();\nvar mcs = smsd.findMCS();\n\n// Reaction-aware with bond-change scoring\nSearchEngine.MCSOptions opts = new SearchEngine.MCSOptions();\nopts.reactionAware = true;\nopts.bondChangeAware = true;  // penalise implausible bond transformations\nvar rxnMcs = SearchEngine.reactionAwareMCS(g1, g2, new ChemOptions(), opts);\n\n// CIP stereo assignment (Rules 1-5, including pseudoasymmetric r/s)\nMap\u003cInteger, Character\u003e stereo = CIPAssigner.assignRS(g);\nMap\u003cLong, Character\u003e ez = CIPAssigner.assignEZ(g);\n\n// Batch MCS with non-overlap constraints\nvar mappings = SearchEngine.batchMCSConstrained(queries, targets, new ChemOptions(), 10_000);\n```\n\n### Python — Advanced Features\n\n```python\nimport smsd\n\n# --- Reaction-Aware MCS ---\n# Prefer heteroatom-containing mappings for reaction center identification\nmapping = smsd.map_reaction_aware(\n    \"C[S+](CCC(N)C(=O)O)CC1OC(n2cnc3c(N)ncnc32)C(O)C1O\",  # SAM\n    \"SCCC(N)C(=O)OCC1OC(n2cnc3c(N)ncnc32)C(O)C1O\"           # SAH\n)\n\n# --- Structured MCS Result ---\nresult = smsd.mcs_result(\"c1ccccc1\", \"c1ccc(O)cc1\")\nprint(result.size)          # 6\nprint(result.overlapCoefficient)  # 0.857 (overlap coefficient)\nprint(result.mcs_smiles)    # \"c1ccccc1\"\nprint(result.mapping)       # {0: 0, 1: 1, ...}\n\n# --- Works with any input type ---\n# SMILES strings\nmcs = smsd.mcs(\"c1ccccc1\", \"c1ccc(O)cc1\")\n\n# MolGraph objects (pre-parsed, fastest for batch)\ng1 = smsd.parse_smiles(\"c1ccccc1\")\ng2 = smsd.parse_smiles(\"c1ccc(O)cc1\")\nmcs = smsd.mcs(g1, g2)\n\n# Native Mol objects (auto-detected, indices returned in native ordering)\n# from rdkit import Chem\n# mcs = smsd.mcs(Chem.MolFromSmiles(\"c1ccccc1\"), Chem.MolFromSmiles(\"c1ccc(O)cc1\"))\n\n# --- Fingerprints ---\necfp4  = smsd.circular_fingerprint(\"c1ccccc1\", radius=2, fp_size=2048)\nfcfp4  = smsd.circular_fingerprint(\"c1ccccc1\", radius=2, fp_size=2048, mode=\"fcfp\")\ncounts = smsd.ecfp_counts(\"c1ccccc1\", radius=2, fp_size=2048)\ntorsion = smsd.topological_torsion(\"c1ccccc1\", fp_size=2048)\ntan    = smsd.overlapCoefficient(ecfp4, ecfp4)\n\n# --- 2D Layout ---\ng = smsd.parse_smiles(\"c1ccc2c(c1)cc1ccccc1c2\")  # phenanthrene\ncoords = smsd.force_directed_layout(g, max_iter=500, target_bond_length=1.5)\ncoords = smsd.stress_majorisation(g, max_iter=300)\ncrossings = smsd.reduce_crossings(g, coords, max_iter=2000)\n```\n\n### Python — MCS Variants \u0026 Batch Operations\n\n```python\nimport smsd\n\n# --- All MCS variants ---\nmcs = smsd.mcs(\"c1ccccc1\", \"c1ccc(O)cc1\")                     # Connected MCS (default)\nmcs = smsd.mcs(\"c1ccccc1\", \"c1ccc(O)cc1\", connected_only=False) # Disconnected MCS\nmcs = smsd.mcs(\"c1ccccc1\", \"c1ccc(O)cc1\", induced=True)         # Induced MCS\nmcs = smsd.mcs(\"c1ccccc1\", \"c1ccc(O)cc1\", maximize_bonds=True)  # Edge MCS (MCES)\n\n# Find top-N distinct MCS solutions\nall_mcs = smsd.find_all_mcs(\"c1ccccc1\", \"c1ccc(O)cc1\", max_results=5)\n\n# SMARTS-based MCS\nmcs = smsd.find_mcs_smarts(\"[#6]~[#7]\", \"c1ccc(N)cc1\")\n\n# Scaffold MCS (Murcko framework)\nscaffold = smsd.find_scaffold_mcs(\"CC(=O)Oc1ccccc1C(=O)O\", \"Oc1ccccc1C(=O)O\")\n\n# R-group decomposition\nrgroups = smsd.decompose_r_groups(\"c1ccccc1\", [\"c1ccc(O)cc1\", \"c1ccc(N)cc1\"])\n\n# --- Substructure Search ---\nhit = smsd.substructure_search(\"c1ccccc1\", \"c1ccc(O)cc1\")\nall_matches = smsd.find_all_substructures(\"c1ccccc1\", \"c1ccc(O)cc1\", max_matches=10)\n\n# SMARTS pattern matching\nmatches = smsd.smarts_search(\"[OH]\", \"c1ccc(O)cc1\")\n\n# --- Similarity \u0026 Screening ---\nsim = smsd.overlapCoefficient(\n    smsd.circular_fingerprint(\"CCO\", radius=2),\n    smsd.circular_fingerprint(\"CCCO\", radius=2)\n)\ndice = smsd.dice_similarity(\n    smsd.ecfp_counts(\"CCO\", radius=2),\n    smsd.ecfp_counts(\"CCCO\", radius=2)\n)\n\n# --- Chemistry Options ---\n# Tautomer-aware with solvent and pH\nmcs = smsd.mcs(\"CC(=O)C\", \"CC(O)=C\",\n               tautomer_aware=True, solvent=\"DMSO\", pH=5.0)\n\n# Loose bond matching (FMCS-style)\nmcs = smsd.mcs(\"c1ccccc1\", \"C1CCCCC1\", bond_order_mode=\"loose\")\n\n# --- Canonical SMILES ---\nsmi = smsd.canonical_smiles(\"OC(=O)c1ccccc1\")   # deterministic canonical form\nmcs_smi = smsd.mcs_to_smiles(g1, mapping)        # extract MCS as SMILES\n\n# --- CIP Stereo Assignment ---\ng = smsd.parse_smiles(\"N[C@@H](C)C(=O)O\")  # L-alanine\nstereo = smsd.assign_rs(g)                   # {1: 'S'}\nez = smsd.assign_ez(smsd.parse_smiles(\"C/C=C/C\"))  # E-2-butene\n\n# --- Native MolGraph I/O ---\ng = smsd.parse_smiles(\"c1ccccc1\")\ng = smsd.read_molfile(\"molecule.mol\")\nmol_block = smsd.write_mol_block(g)\nv3000 = smsd.write_mol_block_v3000(g)\nsmsd.write_molfile(g, \"molecule_out.mol\", v3000=True)\nsmsd.export_sdf([g1, g2], \"output.sdf\")\n```\n\n### Publication-Quality Depiction (ACS 1996 Standard)\n\nZero-dependency SVG renderer — the same specification used by Nature, Science,\nJACS, and Springer journals. See [Examples](docs/EXAMPLES.md#7-depiction-svg)\nfor full usage guide.\n\n```python\nimport smsd\n\n# Render any molecule as publication-quality SVG\nsvg = smsd.depict_svg(\"CC(=O)Oc1ccccc1C(=O)O\")  # aspirin\nsmsd.save_svg(svg, \"aspirin.svg\")\n\n# MCS comparison — side-by-side with highlighted matching atoms\nmol1 = smsd.parse_smiles(\"c1ccccc1\")\nmol2 = smsd.parse_smiles(\"c1ccc(O)cc1\")\nmapping = smsd.find_mcs(mol1, mol2)\nsvg = smsd.depict_pair(mol1, mol2, mapping)\nsmsd.save_svg(svg, \"mcs_comparison.svg\")\n\n# Substructure highlighting\nsvg = smsd.depict_mapping(mol2, mapping)\n\n# Custom styling (all ACS proportions auto-scale from bond_length)\nsvg = smsd.depict_svg(\"Cn1cnc2c1c(=O)n(c(=O)n2C)C\",  # caffeine\n    bond_length=50, width=600, height=400)\n\n# Export to SDF file\nmols = [smsd.parse_smiles(s) for s in [\"CCO\", \"c1ccccc1\", \"CC(=O)O\"]]\nsmsd.export_sdf(mols, \"output.sdf\")\n```\n\nFeatures: skeletal formula, Jmol/CPK element colors, asymmetric double bonds,\nwedge/dash stereo, H-count subscripts, charge superscripts, bond-to-label\nclipping, aromatic inner circles, atom map numbers.\n\n### C++ (Header-Only)\n\n```bash\ngit clone https://github.com/asad/SMSD.git\n# Add SMSD/cpp/include to your include path — no other dependencies needed\n```\n\n```cpp\n#include \"smsd/smsd.hpp\"\n\nauto mol1 = smsd::parseSMILES(\"c1ccccc1\");\nauto mol2 = smsd::parseSMILES(\"c1ccc(O)cc1\");\n\nbool isSub = smsd::isSubstructure(mol1, mol2, smsd::ChemOptions{});\nauto mcs   = smsd::findMCS(mol1, mol2, smsd::ChemOptions{}, smsd::MCSOptions{});\n\n// Bond-change-aware MCS for reaction mapping\nauto opts = smsd::MCSOptions{};\nopts.reactionAware = true;\nopts.bondChangeAware = true;\nauto rxnMcs = smsd::reactionAwareMCS(mol1, mol2, smsd::ChemOptions{}, opts);\n\n// Batch MCS with non-overlap constraints (multi-fragment reactions)\nauto mappings = smsd::batchMCSConstrained(queries, targets, smsd::ChemOptions{});\n```\n\n### Build from Source\n\n```bash\ngit clone https://github.com/asad/SMSD.git\ncd SMSD\n\n# Java\nmvn -U clean package\n\n# C++\nmkdir cpp/build \u0026\u0026 cd cpp/build\ncmake .. -DCMAKE_BUILD_TYPE=Release\nmake -j$(nproc)\n\n# Python\ncd python \u0026\u0026 pip install -e .\n```\n\n### Docker\n\n```bash\ndocker build -t smsd .\ndocker run --rm smsd --Q SMI --q \"c1ccccc1\" --T SMI --t \"c1ccc(O)cc1\" --json -\n```\n\n---\n\n## Benchmarks\n\n### MCS Performance (Python)\n\nRepresentative pairs from the checked-in Python benchmark results on the same\nmachine and in the same Python process.\nFull data: [`benchmarks/results_python.tsv`](benchmarks/results_python.tsv)\nFor the maintained local core leaderboard, run `python3 benchmarks/benchmark_leaderboard.py --mode core --compare-mode strict`.\nUse the mode-matched core leaderboard for current cross-tool comparisons.\n\n| Pair | Category | SMSD (ms) | MCS Size |\n|---|---|---:|---:|\n| Cubane (self) | Cage | 0.003 | 8 |\n| Coronene (self) | PAH | 0.006 | 24 |\n| NAD / NADH | Cofactor | 0.012 | 44 |\n| Caffeine / Theophylline | N-methyl diff | 0.017 | 13 |\n| Morphine / Codeine | Alkaloid | 0.079 | 20 |\n| Ibuprofen / Naproxen | NSAID | 0.070 | 15 |\n| ATP / ADP | Nucleotide | 0.148 | 27 |\n| PEG-12 / PEG-16 | Polymer | 0.039 | 40 |\n| Paclitaxel / Docetaxel | Taxane | 1,691 | 56 |\n\n### Substructure Performance (Java)\n\nCurrent maintained cached Java core summary:\n**28/28 hit agreement** and **28/28 speed wins** on the local curated corpus.\n\nRun `python3 benchmarks/benchmark_leaderboard.py --mode core --compare-mode strict`\nto refresh the maintained local summary.\n\n### External Benchmark Datasets\n\nCommunity-standard datasets for reproducible evaluation, stored in [`benchmarks/data/`](benchmarks/data/):\n\n| Dataset | Pairs/Patterns | Source | Purpose |\n|---------|---------------|--------|---------|\n| Tautobase (Chodera subset) | 468 tautomer pairs | [Wahl \u0026 Sander 2020](https://doi.org/10.1021/acs.jcim.0c00035) | Tautomer-aware MCS validation |\n| Tautobase (full SMIRKS) | 1,680 pairs | [Wahl \u0026 Sander 2020](https://doi.org/10.1021/acs.jcim.0c00035) | Tautomer transform coverage |\n| Ehrlich-Rarey SMARTS v2.0 | 1,400 patterns | [Ehrlich \u0026 Rarey 2012](https://doi.org/10.1186/1758-2946-4-13) | Substructure search validation |\n| Dalke-style random pairs | 1,000 pairs | MoleculeNet drug collections | Low-similarity MCS scaling |\n| Dalke-style NN pairs | 1,000 pairs | MoleculeNet drug collections | High-similarity MCS quality |\n| Stress pairs | 12 pairs | Duesbury et al. 2017 | Timeout/robustness |\n| Molecule pool | 5,590 SMILES | MoleculeNet (BBBP, SIDER, ClinTox, BACE) | Pair generation source |\n\n```bash\n# Run external benchmarks (Java)\nmvn test -Dtest=ExternalBenchmarkTest -Dbenchmark=true\n\n# Run external benchmarks (Python)\nSMSD_BENCHMARK=1 pytest python/tests/test_external_benchmarks.py -v -s\n\n# Regenerate Dalke-style pairs (requires RDKit)\npython benchmarks/generate_dalke_pairs.py\n```\n\n---\n\n## Algorithms\n\n### MCS Pipeline (11-level funnel)\n\n| Level | Algorithm | Based on |\n|---|---|---|\n| L0 | Label-frequency upper bound | Degree-aware coverage-driven termination |\n| L0.25 | Chain fast-path | O(n*m) DP for linear polymers (PEG, lipids) |\n| L0.5 | Tree fast-path | Kilpelainen-Mannila DP for branched polymers (dendrimers, glycogen) |\n| L0.75 | Greedy probe | O(N) fast path for near-identical molecules |\n| L1 | Substructure containment | VF2++ check if smaller molecule is subgraph |\n| L1.25 | Augmenting path extension | Forced-extension bond growth from substructure seed |\n| L1.5 | Seed-and-extend | Bond-growth from rare-label seeds |\n| L2 | McSplit + RRSplit | Partition refinement (McCreesh 2017) with maximality pruning |\n| L3 | Bron-Kerbosch | Product-graph clique with Tomita pivoting + k-core + orbit pruning |\n| L4 | McGregor extension | Forced-assignment bond-grow frontier (McGregor 1982) |\n| L5 | Extra seeds | Ring skeleton, heavy-atom core, label-degree anchor seeds |\n\n### MCS Variants\n\n| Variant | Flag |\n|---|---|\n| MCIS (induced) | `induced=true` |\n| MCCS (connected) | default |\n| MCES (edge subgraph) | `maximizeBonds=true` |\n| dMCS (disconnected) | `disconnectedMCS=true` |\n| N-MCS (multi-molecule) | `findNMCS()` |\n| Weighted MCS | `atomWeights` |\n| Scaffold MCS | `findScaffoldMCS()` |\n| Tautomer-aware MCS | `ChemOptions.tautomerProfile()` |\n\n### Substructure Search (VF2++)\n\nVF2++ (Juttner \u0026 Madarasi 2018) with FASTiso/VF3-Light matching order, 3-level NLF pruning, bit-parallel candidate domains, and GPU-accelerated domain initialization (CUDA + Metal).\n\n### Ring Perception\n\nHorton's candidate generation + 2-phase GF(2) elimination (Vismara 1997) for relevant cycles, orbit-based grouping for Unique Ring Families (URFs).\n\n| Output | Description |\n|---|---|\n| SSSR / MCB | Smallest Set of Smallest Rings |\n| RCB | Relevant Cycle Basis |\n| URF | Unique Ring Families (automorphism orbit grouping) |\n\n---\n\n## Chemistry Options\n\n| Option | Values |\n|---|---|\n| Chirality | R/S tetrahedral, E/Z double bond |\n| Isotope | `matchIsotope=true` |\n| Tautomers | 30 transforms with pKa-informed weights (Sitzmann 2010, Dhaked \u0026 Nicklaus 2024) |\n| Solvent | AQUEOUS, DMSO, METHANOL, CHLOROFORM, ACETONITRILE, DIETHYL_ETHER |\n| Ring fusion | IGNORE / PERMISSIVE / STRICT |\n| Bond order | STRICT / LOOSE / ANY |\n| Aromaticity | STRICT / FLEXIBLE |\n| Lenient SMILES | `ParseOptions{.lenient=true}` (C++) / `ChemOptions.lenientSmiles` (Java) |\n\n**Preset profiles**: `ChemOptions()` (default), `.tautomerProfile()`, `.fmcsProfile()`\n\nWith the default chemistry profile, `ringMatchesRingOnly=true` enforces ring/non-ring\nparity for matched atoms and bonds in both directions. Use `.fmcsProfile()` when you\nexplicitly want loose FMCS-style topology where ring atoms may map to chain atoms and\npartial ring fragments are accepted.\n\n**Solvent-aware tautomers** (Tier 2 pKa): `opts.withSolvent(Solvent.DMSO)` adjusts tautomer equilibrium weights for non-aqueous environments.\n\n---\n\n## Platform \u0026 GPU Support\n\n| Platform | CPU | GPU |\n|---|---|---|\n| macOS (Apple Silicon) | OpenMP | Metal (zero-copy unified memory) |\n| Linux | OpenMP | CUDA |\n| Windows | OpenMP | CUDA |\n| Any (no GPU) | OpenMP | Automatic CPU fallback |\n\nGPU acceleration covers RASCAL batch screening and domain initialization. Recursive backtracking (VF2++, BK, McSplit) runs on CPU. Dispatch: `CUDA -\u003e Metal -\u003e OpenMP -\u003e sequential`.\n\n### Performance Caching\n\nSMSD employs multi-level caching to eliminate redundant computation in batch and reaction workloads:\n\n| Cache | Target | Benefit |\n|---|---|---|\n| MolGraph identity cache | Molecule object conversion | Same molecule reused across 6-18 calls per reaction pair |\n| Domain space cache | VF2++ atom compatibility matrix | Avoids O(Nq*Nt) rebuild on repeated queries |\n| ECFP/FCFP fingerprint cache | Default-parameter fingerprints | 337x speedup on repeated fingerprint calls |\n| Pharmacophore features cache | FCFP atom invariants | Eliminates O(n*degree^2) per FCFP call |\n| C++ GraphBuilder compat matrix | Seed-extend/McSplit/BK stages | Pre-computed once, shared across algorithms |\n\nCall `SearchEngine.clearMolGraphCache()` (Java) or reuse `MolGraph` instances (C++/Python) between batches.\n\n---\n\n## Additional Tools\n\n| Tool | Description |\n|---|---|\n| **CIP R/S/E/Z assignment** | Full digraph-based stereo descriptors (IUPAC 2013 Rules 1-5) including Rule 3 (Z \u003e E), like/unlike pairing, and pseudoasymmetric r/s |\n| Circular fingerprint (ECFP/FCFP) | Tautomer-aware Morgan/ECFP with configurable radius (-1 = whole molecule) |\n| Count-based ECFP/FCFP | `ecfpCounts()` / `fcfpCounts()` — superior to binary for ML |\n| Topological Torsion fingerprint | 4-atom path with atom typing (SOTA on peptide benchmarks) |\n| Path fingerprint | Graph-aware, tautomer-invariant path enumeration |\n| MCS fingerprint | MCS-aware, auto-sized |\n| Similarity metrics | Tanimoto, Dice, Cosine, Soergel (binary + count-vector) |\n| Fingerprint formats | `toBitSet()`, `toHex()`, `toBinaryString()`, `fromBitSet()`, `fromHex()` |\n| **MCS SMILES extraction** | `findMcsSmiles()` — extract MCS as canonical SMILES |\n| **findAllMCS** | Top-N MCS enumeration with canonical SMILES dedup |\n| **SMARTS-based MCS** | `findMcsSmarts()` — largest substructure matching a SMARTS pattern |\n| R-group decomposition | `decomposeRGroups()` |\n| **MatchResult** | Structured result: size, mapping, overlap coefficient, query/target atom counts |\n| RASCAL screening | O(V+E) similarity upper bound |\n| Canonical SMILES / SMARTS | deterministic, toolkit-independent (including `X` total connectivity) |\n| Reaction atom mapping | `mapReaction()` |\n| **Publication-quality SVG depiction** | ACS 1996 standard renderer: skeletal formulas, Jmol colors, stereo wedges, MCS highlighting, side-by-side pair rendering |\n| Lenient SMILES parser | Best-effort recovery from malformed SMILES |\n| N-MCS | Multi-molecule MCS with provenance tracking |\n| Tautomer validation | `validateTautomerConsistency()` — proton conservation check |\n| 30 tautomer transforms | pKa-informed weights, 6 solvents, pH-sensitive, ring-chain tautomerism |\n| **8-phase 2D layout pipeline** | Template match, ring-first, chain zig-zag, force-directed, overlap resolution, crossing reduction, canonical orientation, bond normalisation |\n| **Distance geometry 3D** | Bounds matrix, double-centering, power iteration, force-field refinement |\n| **40+ scaffold templates** | Pharmaceutical scaffolds, PAH, spiro, bridged (norbornane, adamantane) |\n| **Coordinate transforms** | translate, rotate, scale, mirror, center, align, bounding box, RMSD |\n| **Force-directed layout** | `forceDirectedLayout()` for bond-crossing minimisation |\n| **SMACOF stress majorisation** | `stressMajorisation()` for optimal 2D embedding |\n| **Reaction-aware MCS** | `reactionAwareMCS()` post-filter for reaction mapping |\n| **Bond-change-aware MCS** | `BondChangeScorer` re-ranks candidates by bond transformation plausibility (C-C breaks=3.0, heteroatom=0.5) |\n| **Batch constrained MCS** | `batchMCSConstrained()` multi-pair MCS with non-overlap atom exclusion for multi-fragment reactions |\n| **Two-phase crossing reduction** | `reduceCrossings()` Phase 1: system-level flipping, Phase 2: individual ring flipping with fusion-atom pivots |\n| **computeSSSR / layoutSSSR** | Clean SSSR APIs: minimum cycle basis and layout-ordered ring perception |\n\n---\n\n## File Formats\n\n| Format | Read | Write |\n|---|---|---|\n| SMILES | Java, C++ | Java, C++ |\n| SMARTS | Java, C++ | C++ |\n| MOL V2000 | Java, C++ | C++ |\n| SDF | Java, C++ | — |\n| Mol2, PDB, CML | Java | — |\n\n---\n\n## Release Downloads\n\nEvery release includes all platforms:\n\n| Download | Description |\n|----------|-------------|\n| `SMSD.Pro-6.12.2.dmg` | macOS installer (Apple Silicon) — drag to Applications |\n| `SMSD.Pro-6.12.2.msi` | Windows installer — next, next, finish |\n| `smsd-pro_6.12.2_amd64.deb` | Linux installer — `sudo dpkg -i` |\n| `smsd-6.12.2.jar` | Pure library JAR (Maven/Gradle dependency) |\n| `smsd-6.12.2-jar-with-dependencies.jar` | Standalone CLI (just `java -jar`) |\n| `smsd-cpp-6.12.2-headers.tar.gz` | C++ header-only library (unpack, `#include \"smsd/smsd.hpp\"`) |\n| `pip install smsd` | Python package (PyPI) |\n\n```bash\n# Native installer — download .dmg / .msi / .deb, double-click, done\n\n# CLI\njava -jar smsd-6.12.2-jar-with-dependencies.jar --Q SMI --q \"c1ccccc1\" --T SMI --t \"c1ccc(O)cc1\" --json -\n\n# Docker CLI\ndocker build -t smsd .\ndocker run --rm smsd --Q SMI --q \"c1ccccc1\" --T SMI --t \"c1ccc(O)cc1\" --json -\n\n# Python\npip install smsd\n```\n\n---\n\n## Tests\n\n**1,512 tests passed** across all platforms:\n\n| Suite | Tests | Coverage |\n|-------|------:|----------|\n| Java | 602 | MCS, substructure, reactions, tautomers, stereochemistry, ring perception, hydrogen handling |\n| C++ core | 114 | MCS, substructure, precision chemistry, kekulisation, implicit H |\n| C++ parser | 542 | SMILES, SMARTS, 1,003 diverse molecules, edge cases |\n| C++ layout | 42 | 2D/3D generation, transforms, overlap resolution, templates |\n| C++ CIP | 42 | R/S, E/Z, pseudoasymmetric, sequence rules |\n| Python | 170 | Full API coverage, hydrogen handling, charged species |\n\nAddressSanitizer: zero memory errors.\n\n---\n\n## Documentation\n\n| Document | Description |\n|---|---|\n| **[Examples, How-To, and Cautions](docs/EXAMPLES.md)** | Worked examples for every feature with cautions and performance tips |\n| [Python API Guide](docs/PYTHON.md) | Full Python API reference |\n| [Java Guide](docs/JAVA.md) | Java API and CLI usage |\n| [C++ Guide](docs/CPP.md) | Header-only C++ integration |\n| [Release Notes 6.12.2](docs/RELEASE_NOTES_6.12.2.md) | Previous release |\n| [Whitepaper](docs/WHITEPAPER.md) | Algorithms and design (11-level MCS, VF2++, ring perception) |\n| [How to Install](docs/HOWTO-INSTALL.md) | Build from source on all platforms |\n| [Changelog](CHANGELOG.md) | Full versioned change history |\n| [NOTICE](NOTICE) | Attribution, trademark, and novel algorithm terms |\n\n---\n\n## License and Commercial Use\n\nSMSD Pro is released under the **Apache License 2.0** — free for any use,\nincluding commercial, with no fee, registration, or approval required.\n\n| Use Case | Permitted |\n|----------|-----------|\n| Commercial products and services | Yes |\n| Proprietary / closed-source software | Yes |\n| SaaS platforms and cloud services | Yes |\n| Pharmaceutical, biotech, agrochemical pipelines | Yes |\n| Academic research and teaching | Yes |\n| Internal corporate tools | Yes |\n| Modify and redistribute | Yes |\n\n**What you must do** (Apache 2.0 Section 4): include the [LICENSE](LICENSE) and\n[NOTICE](NOTICE) files in your distribution, retain copyright notices, and state\nany changes you made to source files.\n\n**What you must not do**: use \"SMSD\", \"SMSD Pro\", or BioInception trademarks to\nendorse your product without permission (see [NOTICE](NOTICE) for trademark terms).\n\nFull details: [LICENSE](LICENSE) | [NOTICE](NOTICE)\n\n---\n\n## Citation\n\nIf you use SMSD Pro in your research, please cite the following paper describing\nthe coverage-driven, tautomer-aware MCS algorithm:\n\n\u003e Rahman SA.\n\u003e *SMSD Pro: Coverage-Driven, Tautomer-Aware Maximum Common Substructure Search.*\n\u003e ChemRxiv, 2025.\n\u003e DOI: [10.26434/chemrxiv.15001534](https://doi.org/10.26434/chemrxiv.15001534/v1)\n\nFor the original SMSD toolkit, please also cite:\n\n\u003e Rahman SA, Bashton M, Holliday GL, Schrader R, Thornton JM.\n\u003e *Small Molecule Subgraph Detector (SMSD) toolkit.*\n\u003e Journal of Cheminformatics, 1:12, 2009.\n\u003e DOI: [10.1186/1758-2946-1-12](https://doi.org/10.1186/1758-2946-1-12)\n\nGitHub renders a **\"Cite this repository\"** button from [CITATION.cff](CITATION.cff).\n\n---\n\n## Author\n\n**Syed Asad Rahman** — [BioInception PVT LTD](https://github.com/asad)\n\nCopyright (c) 2018-2026 BioInception PVT LTD. Algorithm Copyright (c) 2009-2026 Syed Asad Rahman.\n\n## License\n\nApache License 2.0 — see [LICENSE](LICENSE) and [NOTICE](NOTICE)\n\nSMSD Pro is developed at BioInception and distributed under Apache License 2.0.\nCommercial use and redistribution are allowed, subject to the license and\nnotice requirements.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasad%2Fsmsd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fasad%2Fsmsd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fasad%2Fsmsd/lists"}