https://github.com/chathumiamarasinghe/web-scraping

A versatile Python script for scraping data from websites. This script automates data extraction, processes the information, and saves it in a structured format like CSV. Ideal for data collection, research, and analysis tasks.


Host: GitHub
URL: https://github.com/chathumiamarasinghe/web-scraping
Owner: chathumiamarasinghe
Created: 2024-09-16T18:57:48.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-09-17T02:46:47.000Z (over 1 year ago)
Last Synced: 2025-04-19T20:52:44.619Z (10 months ago)
Topics: beautifulsoup, csv-export, dataextraction, phyton, pythonwebscraper, webscraping
Language: Python
Homepage:
Size: 9.77 KB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Academic Staff Scraper

This Python script scrapes academic staff information from the staff details page of the Faculty of Science website at the University of Kelaniya. It retrieves each staff member's name, position, room number, phone, fax, email, and specialization (if available) and exports the information to a CSV file.

## Prerequisites

Make sure you have the following Python packages installed before running the script:

- `requests`: For sending HTTP requests to fetch the webpage.
- `beautifulsoup4`: For parsing the HTML content of the webpage.
- `csv`: For writing the extracted data to a CSV file. This module is part of Python's standard library, so it needs no separate installation.

You can install the required packages using `pip`, for example `pip install requests beautifulsoup4`.

## How It Works

- **Extract Data from URL**: The script sends a request to the webpage containing the academic staff details.
- **Parse HTML**: It uses BeautifulSoup to parse the HTML and identify the relevant sections for staff data.
- **Retrieve Staff Information**: For each academic staff member, the script extracts:
  - Name
  - Position
  - Room number
  - Phone number
  - Fax
  - Email
  - Specialization (scraped from a link if available)
- **CSV Output**: The data is written to a CSV file named `academic_staff.csv`.
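
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's actual code: the URL in `STAFF_URL`, the `div.staff-card` container, and the per-field class names are hypothetical placeholders, because this README does not show the page's markup. For simplicity the sketch also reads the specialization from the same element instead of following a link.

```python
import csv

import requests
from bs4 import BeautifulSoup

# Hypothetical placeholder: the real staff page URL is not given in this README.
STAFF_URL = "https://example.org/academic-staff"
FIELDS = ["Name", "Position", "Room", "Phone", "Fax", "Email", "Specialization"]


def fetch_html(url: str) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def parse_staff(html: str) -> list[dict]:
    """Extract one dict per staff member from the assumed markup."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.staff-card"):  # assumed container class
        row = {}
        for field in FIELDS:
            # Assumed convention: each field sits in an element whose class
            # is the lowercased field name, e.g. <span class="email">.
            node = card.select_one(f".{field.lower()}")
            row[field] = node.get_text(strip=True) if node else "Not available"
        rows.append(row)
    return rows


def write_csv(rows: list[dict], path: str = "academic_staff.csv") -> None:
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

With the real URL and selectors filled in, `write_csv(parse_staff(fetch_html(STAFF_URL)))` would chain the three steps and produce `academic_staff.csv`.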

## Example Output

| Name | Position | Room | Phone | Fax | Email | Specialization |
| --- | --- | --- | --- | --- | --- | --- |
| Prof. Janaka Wijanayake | Professor | Room 201 | 011-2233445 | 011-2233446 | janaka@stu.kln.ac.lk | Computer Science |
| Dr. Thilini Mahanama | Senior Lecturer | Room 202 | 011-1234567 | Not available | thilinie@uni.lk | Physics |
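
Once the CSV exists, it can be read back with the standard `csv` module. The sketch below is illustrative only: it assumes the file has a header row with the column names from the example, and it uses inline sample text based on that example instead of a real `academic_staff.csv`. The helper names `load_staff` and `names_missing` are hypothetical.

```python
import csv
import io

# Sample data based on the example output; a real run would instead open
# academic_staff.csv with open(path, newline="", encoding="utf-8").
SAMPLE_CSV = """Name,Position,Room,Phone,Fax,Email,Specialization
Prof. Janaka Wijanayake,Professor,Room 201,011-2233445,011-2233446,janaka@stu.kln.ac.lk,Computer Science
Dr. Thilini Mahanama,Senior Lecturer,Room 202,011-1234567,Not available,thilinie@uni.lk,Physics
"""


def load_staff(handle):
    """Read all rows from an open CSV file object into a list of dicts."""
    return list(csv.DictReader(handle))


def names_missing(rows, column, placeholder="Not available"):
    """Return the names of staff whose value in `column` is the placeholder."""
    return [row["Name"] for row in rows if row[column] == placeholder]


staff = load_staff(io.StringIO(SAMPLE_CSV))
print(names_missing(staff, "Fax"))  # ['Dr. Thilini Mahanama']
```

Reading rows as dictionaries keeps the lookup tied to column names, so the code does not depend on column order.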

## Usage

1. Clone or download the repository containing this script.
2. Make sure you have Python installed on your system.
3. Install the required Python libraries using the following command:
```bash
pip install requests beautifulsoup4
```