https://github.com/muhammedzohaib/drugscraper

A simple scrapy spider to scrape https://www.drugs.com

Host: GitHub
URL: https://github.com/muhammedzohaib/drugscraper
Owner: MuhammedZohaib
Created: 2024-07-11T09:53:38.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-07-11T10:03:43.000Z (about 2 years ago)
Last Synced: 2025-06-04T14:44:09.527Z (about 1 year ago)
Language: Python
Size: 10.7 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# DrugScraper

This Scrapy spider (`DrugsspiderSpider`) scrapes `www.drugs.com` and extracts each drug's name, generic name, drug class, and page URL.

## Spider Details

- **Name:** `DrugsspiderSpider` (the spider class; the Usage command runs it as `drugsSpider`)
- **Allowed Domains:** `www.drugs.com`

## Fields Extracted

- **Name:** Drug name.
- **Generic Name:** Generic name of the drug.
- **Drug Class:** Class of the drug.
- **URL:** URL of the drug page.
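
The four fields above can be pictured as one flat record per drug. The following is only a sketch using the standard library: the `DrugRecord` class, its snake_case field names, and the sample values are hypothetical and are not taken from the repository's code.

```python
from dataclasses import dataclass, asdict


@dataclass
class DrugRecord:
    """Hypothetical container mirroring the four fields listed above."""
    name: str
    generic_name: str
    drug_class: str
    url: str


# Placeholder values for illustration only, not real scraped output.
record = DrugRecord(
    name="ExampleDrug",
    generic_name="examplium",
    drug_class="example class",
    url="https://www.drugs.com/example.html",
)

# A plain dict like this is what a JSON or CSV export would serialize per drug.
row = asdict(record)
```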

## Usage

1. Clone the repository:

```bash
git clone https://github.com/MuhammedZohaib/DrugScraper.git
cd DrugScraper
```

2. Run the spider from the project directory (Scrapy must be installed):

```bash
scrapy crawl drugsSpider
```

3. Output: Drug information (name, generic name, drug class, URL) in JSON and CSV format.
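
One way a Scrapy project can write both JSON and CSV from a single run is the `FEEDS` setting in `settings.py`. The fragment below is a sketch under that assumption; the README does not show how this repository configures its exports, and the file names `drugs.json` and `drugs.csv` are hypothetical.

```python
# settings.py fragment (assumed; the repository's actual settings may differ).
# Each key is an output path, each value selects the export format for that path.
FEEDS = {
    "drugs.json": {"format": "json"},  # hypothetical output file name
    "drugs.csv": {"format": "csv"},    # hypothetical output file name
}
```

With a configuration like this, a single `scrapy crawl drugsSpider` run would write every scraped item to both files.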
