https://github.com/lucinamay/biosynfoni
a *biosynformatic* fingerprint to explore natural product distance and diversity
- Host: GitHub
- URL: https://github.com/lucinamay/biosynfoni
- Owner: lucinamay
- License: mit
- Created: 2023-07-17T07:03:38.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2025-08-30T23:38:04.000Z (10 months ago)
- Last Synced: 2026-01-03T15:23:56.943Z (5 months ago)
- Topics: bioinformatics, biosynformatic-fingerprint, biosynformatics, cheminformatics, metabolites, metabolomics, molecular-fingerprints, natural-products
- Language: Jupyter Notebook
- Homepage: https://moltools.bioinformatics.nl/biosynfoni
- Size: 73.5 MB
- Stars: 20
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
🌿 *a biosynformatic molecular fingerprint tailored to natural product chem- and bioinformatic research* 🌿
\________________________________________________________________________________________
**bi·o·syn·for·ma·tic**\
/ˌbaɪ oʊ sɪn fərˈ mæt ɪk/\
*adjective Computers, Biochemistry*
relating to biosynthetic information and biochemical logic.\
as a concatenation of *biosynthetic* and *bioinformatics*, it was coined\
during the creation of `BioSynFoni`.
\_________________________________________________________________________________________
### Getting started 🌿
Read more about Biosynfoni in our preprint [here](https://doi.org/10.26434/chemrxiv-2025-cwq74).
#### Predict biosynthetic class
We have trained a biosynthetic class predictor on `biosynfoni` fingerprints.
You can try out the predictor on your own molecules [here](https://moltools.bioinformatics.nl/biosynfoni)!
#### Installation
Biosynfoni requires Python 3.9 or later. RDKit is installed automatically as a dependency.
Install the package with pip:
```bash
pip install biosynfoni
```
Now you can import the `biosynfoni` package in your Python code or use the command line tool.
#### Usage in Python
Convert a SMILES string to a fingerprint:
```python
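from biosynfoni import Biosynfoni
from rdkit import Chem

# Placeholder input: the SMILES of ethanol. Replace it with your own molecule.
smi = "CCO"
mol = Chem.MolFromSmiles(smi)  # returns None if the SMILES cannot be parsed
fp = Biosynfoni(mol).fingerprint  # returns biosynfoni's count fingerprint of the molecule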
```
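Because the fingerprint is a count fingerprint, two molecules can be compared with a count-based similarity. The sketch below shows one common choice, the min-max (count) Tanimoto similarity, and the distance derived from it. It is an illustration and not part of the `biosynfoni` package: the function name `count_tanimoto` is hypothetical, the two vectors are made-up stand-ins for real fingerprints, and it assumes a fingerprint can be treated as a sequence of non-negative integers of equal length.

```python
from typing import Sequence


def count_tanimoto(a: Sequence[int], b: Sequence[int]) -> float:
    """Min-max (count-based) Tanimoto similarity between two count vectors."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    shared = sum(min(x, y) for x, y in zip(a, b))
    total = sum(max(x, y) for x, y in zip(a, b))
    # Two all-zero fingerprints are treated as identical here (a convention).
    return 1.0 if total == 0 else shared / total


# Made-up count vectors standing in for two Biosynfoni fingerprints.
fp_a = [2, 0, 1, 3]
fp_b = [1, 0, 1, 4]

similarity = count_tanimoto(fp_a, fp_b)
distance = 1.0 - similarity
print(f"similarity={similarity:.3f}, distance={distance:.3f}")
```

To use it with real molecules, pass in the values obtained from `Biosynfoni(mol).fingerprint`, converted to a list if needed.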
#### Usage in the command line
Create a fingerprint from a SMILES string:
```bash
biosynfoni
```
Create a fingerprint from an InChI string:
```bash
biosynfoni
```
Write the fingerprints of all molecules in an SDF file to a CSV file:
```bash
biosynfoni
```
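A similar table can also be written from Python. The following is a minimal sketch under stated assumptions, not the command line tool's implementation: the helper `write_fingerprints_csv` is hypothetical, the column layout (a record index followed by the counts) is an assumption and may differ from what the tool writes, and the demo uses a toy stand-in for the fingerprint function so that it runs without RDKit or `biosynfoni`.

```python
import csv
import io
from typing import Callable, Iterable, Sequence, TextIO


def write_fingerprints_csv(
    items: Iterable[object],
    fingerprint_of: Callable[[object], Sequence[int]],
    handle: TextIO,
) -> None:
    """Write one CSV row per item: its index, then its fingerprint counts."""
    writer = csv.writer(handle)
    for index, item in enumerate(items):
        if item is None:  # skip records that could not be parsed
            continue
        writer.writerow([index, *fingerprint_of(item)])


# Toy stand-ins: strings instead of molecules, character counts instead of
# a real fingerprint. The None entry mimics an unparseable record.
toy_items = ["CCO", None, "CCCC"]
toy_fingerprint = lambda s: [s.count("C"), s.count("O")]

buffer = io.StringIO()
write_fingerprints_csv(toy_items, toy_fingerprint, buffer)
print(buffer.getvalue())
```

With the real libraries, `items` could be an RDKit `Chem.SDMolSupplier` opened on the SDF file (it yields `None` for records it cannot parse) and `fingerprint_of` could be `lambda mol: Biosynfoni(mol).fingerprint`, assuming that value is iterable.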
### Publication
#### Citation
If you use `biosynfoni` in your research, please cite our [publication](https://jcheminf.biomedcentral.com/articles/10.1186/s13321-025-01081-6).
#### Data availability
We created several biosynthetic class predictors for our manuscript, which can be downloaded from Zenodo [here](https://zenodo.org/records/14791239).
We used data from the [COCONUT](https://coconut.naturalproducts.net) natural product database ([DOI](https://doi.org/10.1186/s13321-020-00478-9)) and the [ZINC](https://zinc.docking.org) compound database ([DOI](https://pubs.acs.org/doi/10.1021/acs.jcim.0c00675)). The parsed data used for the analysis in our manuscript can be downloaded from Zenodo [here](https://zenodo.org/records/14791205).
Results for the stratified classification analysis (see: `experiments/classification_stratified.py`) can be downloaded from Zenodo [here](https://doi.org/10.5281/zenodo.15150841).