https://github.com/adam-maz/molport-webscraper
Here I provide a script that lets the user browse MolPort pages and collect the SMILES strings of molecules of interest.
- Host: GitHub
- URL: https://github.com/adam-maz/molport-webscraper
- Owner: Adam-maz
- Created: 2025-01-26T21:08:01.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-01-26T21:53:08.000Z (4 months ago)
- Last Synced: 2025-03-22T04:09:09.426Z (3 months ago)
- Topics: lead-optimization, medicinal-chemistry, molport, object-oriented-programming, python, selenium, smiles, webscraping
- Language: Python
- Homepage: https://www.molport.com/shop/index
- Size: 8.79 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Collecting SMILES from MolPort with Selenium
## 1. Introduction
This document introduces a script that collects **SMILES** strings from **MolPort** by web scraping with **Selenium**. The script writes a `.csv` file containing the IDs and SMILES strings of the selected molecules. This `.csv` file can then be used, for example, to build a library of reactants for reaction-based enumeration in lead optimization.

## 2. Usage
To use this script:
1. Download an `.sdf` file containing the molecules you want to collect.
2. Convert this file to a `.csv` file (you can use tools like the **DataWarrior suite** for this step).
3. Launch the script and follow the provided instructions.

## 3. Content
1. molport_webscraper - the web scraping code.
2. spiro_all.sdf - file with spirocyclic compounds, downloaded from MolPort.
3. sprio_all.csv - file with spirocyclic compounds, used as the input file for the script.

## 4. Dependencies
To run this script, ensure the following packages are installed in your virtual environment:
- `pandas`
- `selenium`
- `webdriver_manager`

You can install them by running the following command in a terminal:
```bash
pip install pandas selenium webdriver_manager
```

## 5. Selenium documentation
https://selenium-python.readthedocs.io/
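
The workflow described in Usage (read IDs from the input `.csv`, look up each compound's page, write IDs and SMILES to an output `.csv`) could be sketched roughly as follows. This is a minimal illustration and not the code in `molport_webscraper`: the function names, the `id_column` name, the `url_template`, and the `smiles_selector` are all hypothetical placeholders that would have to be adapted to the real input file and to the actual MolPort page layout. The sketch also uses the standard-library `csv` module instead of `pandas` to stay small.

```python
import csv
from typing import Callable, Iterable, Optional


def read_ids(csv_path: str, id_column: str) -> list[str]:
    """Return the non-empty values of one column of the input CSV."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [row[id_column] for row in csv.DictReader(handle) if row.get(id_column)]


def collect_smiles(
    ids: Iterable[str],
    fetch_smiles: Callable[[str], Optional[str]],
    out_path: str,
) -> int:
    """Look up each ID and write (ID, SMILES) rows; return the number of rows written."""
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["ID", "SMILES"])
        for compound_id in ids:
            smiles = fetch_smiles(compound_id)
            if smiles:  # skip compounds for which nothing was found
                writer.writerow([compound_id, smiles])
                written += 1
    return written


def make_selenium_fetcher(url_template: str, smiles_selector: str, timeout: float = 10.0):
    """Build a Chrome-backed fetcher; returns (fetch, close).

    url_template and smiles_selector are assumptions supplied by the caller,
    e.g. a URL containing "{id}" and a CSS selector for the SMILES element.
    """
    # Imported lazily so the CSV helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    from webdriver_manager.chrome import ChromeDriverManager

    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

    def fetch(compound_id: str) -> Optional[str]:
        driver.get(url_template.format(id=compound_id))
        try:
            # Wait until the element holding the SMILES string is in the DOM.
            element = WebDriverWait(driver, timeout).until(
                EC.presence_of_element_located((By.CSS_SELECTOR, smiles_selector))
            )
        except TimeoutException:
            return None
        return element.text.strip() or None

    return fetch, driver.quit
```

In use, one would call `make_selenium_fetcher(...)` once, pass the returned `fetch` function to `collect_smiles(read_ids(...), fetch, "output.csv")`, and call the returned `close` function when finished. Keeping the browser code behind a plain callable means the CSV handling can be tried out with a stand-in such as a dictionary's `get` method before touching the website.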