https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
ITB Staff Data Scraping Script. The itb_staff_data_scraping.py script is a Python script for scraping staff data from the Institut Teknologi Bandung (ITB) website. It uses the BeautifulSoup library to parse web pages and extract information about staff members, including their names, expertise, school/faculty, titles, education history, and email addresses.
itb python scrape
- Host: GitHub
- URL: https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
- Owner: dms-codes
- Created: 2023-10-01T17:24:04.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-01T17:25:34.000Z (almost 2 years ago)
- Last Synced: 2025-01-18T21:20:05.116Z (6 months ago)
- Topics: itb, python, scrape
- Language: Python
- Homepage: https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
- Size: 54.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ITB Staff Data Scraping Script
The `itb_staff_data_scraping.py` script is a Python script for scraping staff data from the Institut Teknologi Bandung (ITB) website. It uses the BeautifulSoup library to parse web pages and extract information about staff members, including their names, expertise, school/faculty, titles, education history, and email addresses.
## Prerequisites
Before using this script, make sure you have the following prerequisites installed:
- Python 3.x
- BeautifulSoup4
- Requests

You can install BeautifulSoup4 and Requests using pip:
```bash
pip install beautifulsoup4 requests
```

## Usage
1. Clone or download the script to your local machine.
2. Install the required Python libraries mentioned in the "Prerequisites" section.
3. Configure the script by specifying the `BASE_URL` variable, which should be the URL of the ITB staff listing page.
4. Run the script:
```bash
python itb_staff_data_scraping.py
```

The script will start scraping staff data from the ITB website and save the results in a CSV file named `data_dosen.csv`.
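
The flow described above (fetch the page at `BASE_URL`, parse it with BeautifulSoup, write `data_dosen.csv`) can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the `BASE_URL` value is a placeholder, the helper names (`fetch_html`, `parse_staff_rows`, `save_rows`) are hypothetical, and the `div.staff`, `.name`, and `.email` selectors are assumptions about the markup that will need to be replaced after inspecting the live page.

```python
import csv

import requests
from bs4 import BeautifulSoup

# Placeholder (assumption): the site root stands in for the real staff listing URL.
BASE_URL = "https://www.itb.ac.id/"
OUTPUT_FILE = "data_dosen.csv"
# Column names as documented in the "CSV Output" section of this README.
COLUMNS = ["Name", "Expertise", "School/Faculty", "Title", "S1", "S2", "S3", "Email"]


def fetch_html(url: str) -> str:
    """Download a page and raise an exception on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_staff_rows(html: str) -> list[dict]:
    """Build one dict per staff entry.

    Assumes hypothetical markup: one <div class="staff"> per person,
    containing elements with the classes "name" and "email".
    """
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.staff"):
        name = card.select_one(".name")
        email = card.select_one(".email")
        rows.append({
            "Name": name.get_text(strip=True) if name else "",
            "Email": email.get_text(strip=True) if email else "",
        })
    return rows


def save_rows(rows: list[dict], path: str = OUTPUT_FILE) -> None:
    """Write the rows to CSV; columns missing from a row are left empty."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(rows)


def main() -> None:
    """Run the whole flow against the live site (requires network access)."""
    save_rows(parse_staff_rows(fetch_html(BASE_URL)))
```

Calling `main()` runs the whole flow. Keeping the download, parsing, and CSV steps in separate functions means the parser can be tested against a saved HTML file without contacting the website.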
## CSV Output
The output CSV file will have the following columns:
- Name: The name of the staff member.
- Expertise: The expertise or field of study of the staff member.
- School/Faculty: The school or faculty to which the staff member belongs.
- Title: The title or position of the staff member.
- S1: Undergraduate education information (Institution and Year).
- S2: Master's degree education information (Institution and Year).
- S3: Doctorate education information (Institution and Year).
- Email: The email address of the staff member.

## Customization
You can customize the script by modifying the code to adapt it to the structure of the ITB website or to extract additional information if needed.
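
One way to make this kind of customization easier is to keep the CSS selectors in a single mapping, so that adapting to a changed page layout or adding a field means editing one dictionary. The sketch below is an assumption about how that could look, not the script's own structure: `FIELD_SELECTORS`, `extract_fields`, and every selector in the mapping are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical selectors: inspect the live page and replace these values.
FIELD_SELECTORS = {
    "Name": "h1.staff-name",
    "Expertise": ".expertise",
    "School/Faculty": ".faculty",
    "Title": ".title",
    "Email": "a[href^='mailto:']",
}


def extract_fields(html, selectors=FIELD_SELECTORS):
    """Return {field: text} for each selector; a missing element yields ""."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field, css in selectors.items():
        node = soup.select_one(css)
        record[field] = node.get_text(" ", strip=True) if node else ""
    return record
```

To extract an additional field, pass an extended mapping, for example `extract_fields(html, {**FIELD_SELECTORS, "Phone": ".phone"})`, where `.phone` is again a made-up selector.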
## Example
```bash
python itb_staff_data_scraping.py
```

This command will execute the script, scrape staff data from the ITB website, and save the results in `data_dosen.csv`.
Feel free to use this script to collect staff data from ITB or similar websites for your data analysis or research purposes.
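
As a starting point for analysis, the output file can be read back with the standard library alone. The sketch below counts staff per school or faculty. It assumes the CSV header uses exactly the column names listed in this README (in particular `School/Faculty`) and that the file is UTF-8 encoded. The function name `count_by_faculty` is hypothetical.

```python
import csv
from collections import Counter


def count_by_faculty(path="data_dosen.csv"):
    """Count rows per "School/Faculty" value in the scraper's CSV output."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        return Counter(row["School/Faculty"] for row in reader)
```

For example, `count_by_faculty().most_common(5)` lists the five schools or faculties with the most rows in the file.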