Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
ITB Staff Data Scraping Script The itb_staff_data_scraping.py script is a Python script for scraping staff data from the Institut Teknologi Bandung (ITB) website. It uses the BeautifulSoup library to parse web pages and extract information about staff members, including their names, expertise, school/faculty, titles, education history, and email ad
- Host: GitHub
- URL: https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
- Owner: dms-codes
- Created: 2023-10-01T17:24:04.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-01T17:25:34.000Z (over 1 year ago)
- Last Synced: 2023-10-01T20:45:15.223Z (over 1 year ago)
- Topics: itb, python, scrape
- Language: Python
- Homepage: https://github.com/dms-codes/httpswww.itb.ac.idstafflistby
- Size: 54.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ITB Staff Data Scraping Script
The `itb_staff_data_scraping.py` script is a Python script for scraping staff data from the Institut Teknologi Bandung (ITB) website. It uses the BeautifulSoup library to parse web pages and extract information about staff members, including their names, expertise, school/faculty, titles, education history, and email addresses.
## Prerequisites
Before using this script, make sure you have the following prerequisites installed:
- Python 3.x
- BeautifulSoup4
- Requests

You can install BeautifulSoup4 and Requests using pip:
```bash
pip install beautifulsoup4 requests
```

## Usage
1. Clone or download the script to your local machine.
2. Install the required Python libraries mentioned in the "Prerequisites" section.
3. Configure the script by specifying the `BASE_URL` variable, which should be the URL of the ITB staff listing page.
4. Run the script:
```bash
python itb_staff_data_scraping.py
```

The script will start scraping staff data from the ITB website and save the results in a CSV file named `data_dosen.csv`.
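
As a rough illustration of the parsing step, the sketch below runs BeautifulSoup over an inline HTML string instead of a live page. The markup, the class names (`staff-card`, `staff-name`, `staff-expertise`, `staff-email`), and the helpers `text_or_empty` and `parse_staff` are all hypothetical; the real ITB page and the actual script may be structured quite differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real ITB staff page
# may use different tags and class names.
SAMPLE_HTML = """
<div class="staff-card">
  <h3 class="staff-name">Jane Doe</h3>
  <p class="staff-expertise">Machine Learning</p>
  <a class="staff-email" href="mailto:jane@example.org">jane@example.org</a>
</div>
<div class="staff-card">
  <h3 class="staff-name">John Roe</h3>
  <p class="staff-expertise">Structural Engineering</p>
</div>
"""


def text_or_empty(parent, class_name):
    """Return the stripped text of the first element with this class, or ""."""
    node = parent.find(class_=class_name)
    return node.get_text(strip=True) if node else ""


def parse_staff(html):
    """Extract one dict per (hypothetical) staff card from an HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="staff-card"):
        rows.append({
            "Name": text_or_empty(card, "staff-name"),
            "Expertise": text_or_empty(card, "staff-expertise"),
            "Email": text_or_empty(card, "staff-email"),
        })
    return rows


if __name__ == "__main__":
    for row in parse_staff(SAMPLE_HTML):
        print(row)
```

Returning an empty string for a missing field, rather than raising, keeps one incomplete profile from stopping the whole run. To work against the live site, the HTML string would instead come from a `requests.get(BASE_URL)` response.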
## CSV Output
The output CSV file will have the following columns:
- Name: The name of the staff member.
- Expertise: The expertise or field of study of the staff member.
- School/Faculty: The school or faculty to which the staff member belongs.
- Title: The title or position of the staff member.
- S1: Undergraduate education information (Institution and Year).
- S2: Master's degree education information (Institution and Year).
- S3: Doctorate education information (Institution and Year).
- Email: The email address of the staff member.

## Customization
You can customize the script by modifying the code to adapt it to the structure of the ITB website or to extract additional information if needed.
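
One possible way to keep such changes small is to hold the column names in a single list and write rows with `csv.DictWriter`, so that adding a field is a one-line change. This is only a sketch under that assumption, not the script's actual code: the `COLUMNS` constant and the `write_rows` helper are hypothetical names, and the column labels are taken from the "CSV Output" list.

```python
import csv
import io

# Column names as listed in the "CSV Output" section; add new fields here.
COLUMNS = ["Name", "Expertise", "School/Faculty", "Title", "S1", "S2", "S3", "Email"]


def write_rows(rows, out_file, columns=COLUMNS):
    """Write dict rows as CSV; missing keys become empty cells,
    and keys not listed in `columns` are silently dropped."""
    writer = csv.DictWriter(
        out_file, fieldnames=columns, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Write to an in-memory buffer so the example has no side effects.
    buffer = io.StringIO()
    write_rows([{"Name": "Jane Doe", "Email": "jane@example.org"}], buffer)
    print(buffer.getvalue())
```

When writing to a real file such as `data_dosen.csv`, open it with `open("data_dosen.csv", "w", newline="", encoding="utf-8")`, because the `csv` module expects `newline=""` to avoid blank lines on some platforms.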
## Example
```bash
python itb_staff_data_scraping.py
```

This command will execute the script, scrape staff data from the ITB website, and save the results in `data_dosen.csv`.
Feel free to use this script to collect staff data from ITB or similar websites for your data analysis or research purposes.