https://github.com/cobalt-uoft/uoft-scrapers
Public web scraping scripts for the University of Toronto.
https://github.com/cobalt-uoft/uoft-scrapers
open-data toronto uoft web-scraper
Last synced: about 1 year ago
JSON representation
Public web scraping scripts for the University of Toronto.
- Host: GitHub
- URL: https://github.com/cobalt-uoft/uoft-scrapers
- Owner: cobalt-uoft
- License: mit
- Created: 2015-02-20T22:03:50.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2020-05-16T15:47:08.000Z (about 6 years ago)
- Last Synced: 2025-04-01T16:56:58.399Z (about 1 year ago)
- Topics: open-data, toronto, uoft, web-scraper
- Language: Python
- Homepage: https://pypi.python.org/pypi/uoftscrapers
- Size: 619 KB
- Stars: 51
- Watchers: 8
- Forks: 14
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
- License: LICENSE.md
Awesome Lists containing this project
README
# UofT Scrapers [](https://badge.fury.io/py/uoftscrapers)
This is a library of scrapers for various University of Toronto websites.
## Table of Contents
- [Requirements](#requirements)
- [Installation](#installation)
- [Usage](#usage)
- [Library Reference](#library-reference)
- [Courses](#courses)
- [Buildings](#buildings)
- [Textbooks](#textbooks)
- [Food](#food)
- [Calendar](#calendar)
- [UTSG Calendar](#utsg-calendar)
- [UTM Calendar](#utm-calendar)
- [UTSC Calendar](#utsc-calendar)
- [Timetable](#timetable)
- [UTSG Timetable](#utsg-timetable)
- [UTM Timetable](#utm-timetable)
- [UTSC Timetable](#utsc-timetable)
- [Exams](#exams)
- [UTSG Exams](#utsg-exams)
- [UTM Exams](#utm-exams)
- [UTSC Exams](#utsc-exams)
- [Athletics](#athletics)
- [UTSG Athletics](#utsg-athletics)
- [UTM Athletics](#utm-exams)
- [UTSC Athletics](#utsc-athletics)
- [Parking](#parking)
- [Shuttle Bus Schedule](#shuttles)
- [Events](#events)
- [Libraries](#libraries)
- [Dates](#dates)
- [UTSG Dates](#utsg-dates)
- [UTM Dates](#utm-dates)
## Requirements
- [python3](https://www.python.org/download/releases/3.5.1)
- [pip](https://pypi.python.org/pypi/pip#downloads)
- [tidy](http://binaries.html-tidy.org/)
- [lxml](http://lxml.de/installation.html)
## Installation
```bash
pip install uoftscrapers
```
## Usage
```python
import uoftscrapers
# Example: scrape http://map.utoronto.ca building data to ./some/path
uoftscrapers.Buildings.scrape('./some/path')
# Example: scrape http://coursefinder.utoronto.ca to current working directory
uoftscrapers.Courses.scrape()
```
## Library Reference
### Courses
##### Class name
```python
uoftscrapers.Courses
```
##### Scraper source
http://coursefinder.utoronto.ca
#### Output format
```js
{
"id": String,
"code": String,
"name": String,
"description": String,
"division": String,
"department": String,
"prerequisites": String,
"exclusions": String,
"level": Number,
"campus": String,
"term": String,
"breadths": [Number],
"meeting_sections": [{
"code": String,
"instructors": [String],
"times": [{
"day": String,
"start": Number,
"end": Number,
"duration": Number,
"location": String
}],
"size": Number,
"enrolment": Number
}]
}
```
--------------------------------------------------------------------------------
### Buildings
##### Class name
```python
uoftscrapers.Buildings
```
##### Scraper source
http://map.utoronto.ca
##### Output format
```js
{
"id": String,
"code": String,
"name": String,
"short_name": String,
"campus": String,
"address": {
"street": String,
"city": String,
"province": String,
"country": String,
"postal": String
},
"lat": Number,
"lng": Number,
"polygon": [
[Number, Number]
]
}
```
--------------------------------------------------------------------------------
### Textbooks
##### Class name
```python
uoftscrapers.Textbooks
```
##### Scraper source
http://uoftbookstore.com
##### Output format
```js
{
"id": String,
"isbn": String,
"title": String,
"edition": Number,
"author": String,
"image": String,
"price": Number,
"url": String,
"courses":[{
"id": String,
"code": String,
"requirement": String,
"meeting_sections":[{
"code": String,
"instructors": [String]
}]
}]
}
```
--------------------------------------------------------------------------------
### Food
##### Class name
```python
uoftscrapers.Food
```
##### Scraper source
http://map.utoronto.ca
##### Output format
```js
{
"id": String,
"building_id": String,
"name": String,
"short_name": String,
"description": String,
"url": String,
"tags": [String],
"image": String,
"campus": String,
"lat": Number,
"lng": Number,
"address": String,
"hours": {
"sunday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"monday": {
"closed": Boolean,
"open": Number,
"close": Number
}
"tuesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"wednesday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"thursday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"friday": {
"closed": Boolean,
"open": Number,
"close": Number
},
"saturday": {
"closed": Boolean,
"open": Number,
"close": Number
}
}
}
```
--------------------------------------------------------------------------------
### Calendar
##### Class name
```python
uoftscrapers.Calendar
```
##### Scraper source
- [UTSG Calendar](#utsg-calendar)
- [UTM Calendar](#utm-calendar)
- [UTSC Calendar](#utsc-calendar)
##### Output format
Not implemented.
----------------------------------------
### UTSG Calendar
##### Class name
```python
uoftscrapers.UTSGCalendar
```
##### Scraper source
http://www.artsandscience.utoronto.ca/ofr/calendar/
##### Output format
Refer to [Calendar](#calendar)
--------------------
### UTM Calendar
##### Class name
```python
uoftscrapers.UTMCalendar
```
##### Scraper source
https://student.utm.utoronto.ca/calendar/calendar.pl
##### Output format
Refer to [Calendar](#calendar)
--------------------
### UTSC Calendar
##### Class name
```python
uoftscrapers.UTSCCalendar
```
##### Scraper source
http://www.utsc.utoronto.ca/~registrar/calendars/calendar/index.html
##### Output format
Refer to [Calendar](#calendar)
--------------------------------------------------------------------------------
### Timetable
##### Class name
```python
uoftscrapers.Timetable
```
##### Scraper source
- [UTSG Timetable](#utsg-timetable)
- [UTM Timetable](#utm-timetable)
- [UTSC Timetable](#utsc-timetable)
##### Output format
```js
{
"id": String,
"code": String,
"name": String,
"description": String,
"division": String,
"department": String,
"prerequisites": String,
"exclusions": String,
"level": Number,
"campus": String,
"term": String,
"breadths": [Number],
"meeting_sections": [{
"code": String,
"instructors": [String],
"times": [{
"day": String,
"start": Number,
"end": Number,
"duration": Number,
"location": String
}],
"size": Number,
"enrolment": Number
}]
}
```
----------------------------------------
### UTSG Timetable
##### Class name
```python
uoftscrapers.UTSGTimetable
```
##### Scraper source
https://timetable.iit.artsci.utoronto.ca
##### Output format
Refer to [Timetable](#timetable)
--------------------
### UTM Timetable
##### Class name
```python
uoftscrapers.UTMTimetable
```
##### Scraper source
https://student.utm.utoronto.ca/timetable
##### Output format
Refer to [Timetable](#timetable)
--------------------
### UTSC Timetable
##### Class name
```python
uoftscrapers.UTSCTimetable
```
##### Scraper source
http://www.utsc.utoronto.ca/~registrar/scheduling/timetable
##### Output format
Refer to [Timetable](#timetable)
--------------------------------------------------------------------------------
### Exams
##### Class name
```python
uoftscrapers.Exams
```
##### Scraper source
- [UTSG Exams](#utsg-exams)
- [UTM Exams](#utm-exams)
- [UTSC Exams](#utsc-exams)
##### Output format
```js
{
"id": String,
"course_id": String,
"course_code": String
"period": String,
"date": String,
"start_time": Number,
"end_time": Number,
"duration": Number,
"sections": [{
"lecture_code": String,
"exam_section": String,
"location": String
}]
}
```
----------------------------------------
### UTSG Exams
##### Class name
```python
uoftscrapers.UTSGExams
```
##### Scraper source
http://www.artsci.utoronto.ca/current/exams
##### Output format
Refer to [Exams](#exams)
--------------------
### UTM Exams
##### Class name
```python
uoftscrapers.UTMExams
```
##### Scraper source
https://student.utm.utoronto.ca/examschedule/finalexams.php
##### Output format
Refer to [Exams](#exams)
--------------------
### UTSC Exams
##### Class name
```python
uoftscrapers.UTSCExams
```
##### Scraper source
http://www.utsc.utoronto.ca/registrar/examination-schedule
##### Output format
Refer to [Exams](#exams)
--------------------------------------------------------------------------------
### Athletics
##### Class name
```python
uoftscrapers.Athletics
```
##### Scraper source
- [UTSG Athletics](#utsg-athletics)
- [UTM Athletics](#utm-athletics)
- [UTSC Athletics](#utsc-athletics)
##### Output format
```js
{
"date": String,
"events":[{
"title": String,
"campus": String,
"location": String,
"building_id": String,
"start_time": Number,
"end_time": Number,
"duration": Number
}]
}
```
----------------------------------------
### UTSG Athletics
##### Class name
```python
uoftscrapers.UTSGAthletics
```
##### Scraper source
_Not yet implemented_
##### Output format
Refer to [Athletics](#athletics)
--------------------
### UTM Athletics
##### Class name
```python
uoftscrapers.UTMAthletics
```
##### Scraper source
http://www.utm.utoronto.ca/athletics/schedule/month/
##### Output format
Refer to [Athletics](#athletics)
--------------------
### UTSC Athletics
##### Class name
```python
uoftscrapers.UTSCAthletics
```
##### Scraper source
http://www.utsc.utoronto.ca/athletics/calendar-node-field-date-time/month/
##### Output format
Refer to [Athletics](#athletics)
--------------------------------------------------------------------------------
### Parking
##### Class name
```python
uoftscrapers.Parking
```
##### Scraper source
http://map.utoronto.ca
##### Output format
```js
{
"id": String,
"title": String,
"building_id": String,
"campus": String,
"type": String,
"description": String,
"lat": Number,
"lng": Number,
"address": String
}
```
--------------------------------------------------------------------------------
### Shuttles
##### Class name
```python
uoftscrapers.Shuttles
```
##### Scraper source
https://m.utm.utoronto.ca/shuttle.php
##### Output format
```js
{
"date": String,
"routes": [{
"id": String,
"name": String,
"stops": [{
"location": String,
"building_id": String,
"times": [{
"time": Number,
"rush_hour": Boolean,
"no_overload": Boolean
}]
}]
}]
}
```
------
### Events
##### Class name
```python
uoftscrapers.Events
```
##### Scraper source
https://www.events.utoronto.ca/
##### Output format
```js
{
id: String,
title: String,
start_date: String
end_date: String,
start_time: Number,
end_time: Number,
duration: Number,
url: String,
description: String,
admission_price: String,
campus: String,
location: String,
audiences: [String],
}
```
------
### Libraries
##### Class name
```python
uoftscrapers.Libraries
```
##### Scraper source
https://onesearch.library.utoronto.ca/
##### Output format
```js
{
id: String,
name: String,
image: String,
website: String,
address: String,
phone: String,
about: String,
collection_strengths: String,
access: String,
hours: {
sunday: {
closed: Boolean,
open: String,
close: String,
},
monday: {
closed: Boolean,
open: Number,
close: Number,
},
tuesday: {
closed: Boolean,
open: Number,
close: Number,
},
wednesday: {
closed: Boolean,
open: Number,
close: Number,
},
thursday: {
closed: Boolean,
open: Number,
close: Number,
},
friday: {
closed: Boolean,
open: Number,
close: Number,
},
saturday: {
closed: Boolean,
open: Number,
close: Number,
}
}
}
```
--------------------------------------------------------------------------------
### Dates
##### Class name
```python
uoftscrapers.Dates
```
##### Scraper source
- [UTSG Dates](#utsg-dates)
- [UTM Dates](#utm-dates)
##### Output format
```js
{
"date": String,
"events": [{
"end_date": String,
"session": String,
"campus": String,
"description": String
}]
}
```
----------------------------------------
### UTSG Dates
##### Class name
```python
uoftscrapers.UTSGDates
```
##### Scraper source
http://www.artsci.utoronto.ca/current/course/timetable/
http://www.undergrad.engineering.utoronto.ca/About/Dates_Deadlines.htm
##### Output format
Refer to [Exams](#exams)
--------------------
### UTM Dates
##### Class name
```python
uoftscrapers.UTMDates
```
##### Scraper source
http://m.utm.utoronto.ca/importantDates.php
##### Output format
Refer to [Exams](#exams)
--------------------------------------------------------------------------------