https://github.com/adam-maz/molport-webscraper

Here I provide a script that lets users browse MolPort pages and collect the SMILES strings of molecules of interest

lead-optimization medicinal-chemistry molport object-oriented-programming python selenium smiles webscraping

Host: GitHub
URL: https://github.com/adam-maz/molport-webscraper
Owner: Adam-maz
Created: 2025-01-26T21:08:01.000Z (4 months ago)
Default Branch: main
Last Pushed: 2025-01-26T21:53:08.000Z (4 months ago)
Last Synced: 2025-03-22T04:09:09.426Z (3 months ago)
Topics: lead-optimization, medicinal-chemistry, molport, object-oriented-programming, python, selenium, smiles, webscraping
Language: Python
Homepage: https://www.molport.com/shop/index
Size: 8.79 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Collecting SMILES from MolPort with Selenium

## 1. Introduction
This document introduces a script that collects **SMILES** strings from **MolPort** by web scraping with **Selenium**. The script writes a `.csv` file containing the IDs and SMILES strings of the requested molecules. This `.csv` file can be used, for example, to build a library of reactants for reaction-based enumeration in lead optimization.
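
To make the shape of the output concrete, here is a minimal sketch of writing and reloading an ID/SMILES table. It uses only the standard library `csv` module rather than the script's own code, and the function names and the column names `molport_id` and `smiles` are assumptions for illustration; the actual script may label its columns differently.

```python
import csv
from pathlib import Path


def write_smiles_csv(records, path):
    """Write (ID, SMILES) pairs to a CSV file with a header row."""
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Assumed column names; adjust them to match the real output file.
        writer.writerow(["molport_id", "smiles"])
        writer.writerows(records)


def read_smiles_csv(path):
    """Load the CSV back as a list of dicts, one per molecule."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `write_smiles_csv([("ID-1", "CCO")], "library.csv")` followed by `read_smiles_csv("library.csv")` gives one dict per molecule; `ID-1` is a placeholder, not a real MolPort ID.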

## 2. Usage
To use this script:
1. Download an `.sdf` file containing the molecules you want to collect.
2. Convert this file to a `.csv` file (you can use tools like the **DataWarrior suite** for this step).
3. Launch the script and follow the provided instructions.
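
If DataWarrior is not at hand, step 2 can also be approximated in plain Python. The sketch below is an assumption-laden alternative, not part of the repository: it copies only the data fields of each SDF record (the `> <NAME>` blocks) into CSV columns and ignores the structure block. The function names are hypothetical, and the field names in a real MolPort export, as well as the columns the script expects, are not stated here, so check them against your own file.

```python
import csv
import re

# Matches the field name inside an SDF data header such as "> <SMILES>".
FIELD_HEADER = re.compile(r"<([^>]+)>")


def parse_sdf_fields(sdf_text):
    """Return one dict of data fields per record in an SDF string."""
    records, fields, current = [], {}, None
    for line in sdf_text.splitlines():
        stripped = line.strip()
        if stripped == "$$$$":            # end of one molecule record
            records.append(fields)
            fields, current = {}, None
        elif current is not None:         # reading the value of a data field
            if stripped:
                fields[current].append(stripped)
            else:                         # a blank line ends the value
                current = None
        elif line.startswith(">"):        # start of a new data field
            match = FIELD_HEADER.search(line)
            if match:
                current = match.group(1)
                fields[current] = []
    if fields:                            # tolerate a missing final "$$$$"
        records.append(fields)
    return [{k: " ".join(v) for k, v in rec.items()} for rec in records]


def sdf_to_csv(sdf_path, csv_path):
    """Write the data fields of every SDF record to a CSV file."""
    with open(sdf_path, encoding="utf-8") as handle:
        records = parse_sdf_fields(handle.read())
    columns = []
    for record in records:                # keep first-seen column order
        for name in record:
            if name not in columns:
                columns.append(name)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(records)
    return len(records)
```

Calling `sdf_to_csv("spiro_all.sdf", "converted.csv")` would return the number of records written. Fields missing from a record are left as empty cells.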

## 3. Content
1. `molport_webscraper` - the web-scraping script.
2. `spiro_all.sdf` - spirocyclic compounds downloaded from MolPort.
3. `sprio_all.csv` - the same spirocyclic compounds in `.csv` form, used as the input file for the script.

## 4. Dependencies
To run this script, ensure the following packages are installed in your virtual environment:
- `pandas`
- `selenium`
- `webdriver_manager`

You can install them by running the following command in a terminal:
```bash
pip install pandas selenium webdriver_manager
```
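
To confirm that the active environment actually has these packages before launching the script, a small check like the following can help. It is a sketch that relies only on the standard library; the helper name `missing_packages` is hypothetical and not part of the repository.

```python
from importlib.util import find_spec

# Packages listed in the Dependencies section.
REQUIRED = ("pandas", "selenium", "webdriver_manager")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

An empty list means all three packages can be imported; any names returned still need to be installed with `pip`.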

## 5. Selenium documentation
https://selenium-python.readthedocs.io/
