
# UFRGS - Vestibular Scraper

Scraper written in `JavaScript`, using `Node.js`, to fetch all the freshmen from the UFRGS vestibular.

There is also a `Shell` scraper, which has fewer features but runs a lot faster.

This code has been tested against UFRGS's "Listão" for the 2022 and 2023 editions.
There is no guarantee that it will work for future editions: it is only a scraper and depends on the website layout, which UFRGS can change at any time.

> **NOTE:** A previous version worked for the years 2016 through 2021, but it recently stopped working.
> You can find it by looking at previous commits.

---

## Configuring

You must have `Node.js` installed on your computer to run this code. You can download it [here](https://nodejs.org/en/download/).

You can clone this repository by running `git clone https://github.com/rafaeelaudibert/UFRGS_scraper.js.git && cd UFRGS_scraper.js`.

Afterwards, install the dependencies with `npm install`.

You should also set the year to be scraped in the `.ENV` file as a key/value pair, such as `YEAR=2023`.
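
As a minimal sketch of how the `YEAR` value could be checked before scraping starts: the `parseYear` and `isTestedYear` helpers below are hypothetical and are not taken from this repository. They only assume that the value ends up in `process.env.YEAR`.

```javascript
// Hypothetical helper (not part of this repository): validates the raw
// YEAR value and returns it as a number.
function parseYear(rawYear) {
  const value = String(rawYear ?? "").trim();
  if (!/^\d{4}$/.test(value)) {
    throw new Error(`YEAR must be a four-digit year, got "${value}"`);
  }
  return Number(value);
}

// Editions this README reports as tested. Other years may or may not work.
const TESTED_YEARS = [2022, 2023];

// Hypothetical helper: tells whether the year is one of the tested editions.
function isTestedYear(year) {
  return TESTED_YEARS.includes(year);
}

// Possible usage, assuming YEAR was loaded into the environment:
//   const year = parseYear(process.env.YEAR);
//   if (!isTestedYear(year)) console.warn(`Year ${year} is untested`);
```

Failing early on a malformed year avoids sending requests for an edition that cannot exist.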

---

## Running the code

To run the code, simply run `npm start`.

The code erases any folder named `./json` in the root of the project, so make sure it holds nothing important before you run the code and type `YES` when prompted.
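
The confirm-then-erase behaviour described above could look roughly like the sketch below. The `resetOutputDir` helper is hypothetical, and both the exact-match check on `YES` and the recreation of an empty folder are assumptions, not the repository's actual implementation.

```javascript
const fs = require("node:fs");

// Hypothetical helper (not part of this repository): erases `outputDir`
// only when the user's answer is exactly "YES", then recreates it empty.
// Returns true when the folder was reset and false when nothing was touched.
function resetOutputDir(answer, outputDir) {
  if (answer.trim() !== "YES") {
    return false;
  }
  // `force: true` keeps this from throwing when the folder does not exist.
  fs.rmSync(outputDir, { recursive: true, force: true });
  fs.mkdirSync(outputDir, { recursive: true });
  return true;
}

// Possible usage, with the answer coming from a prompt:
//   resetOutputDir(answerFromPrompt, "./json");
```

Any answer other than `YES` leaves the existing folder untouched in this sketch.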

---

## Understanding the data

The generated data is easy to understand. The code creates a folder tree like this:

```
./json
|
|- course1
|  |
|  |- freshmen.json
|  \- freshmen.txt
|- course2
|- course3
\- course4
```

Each course will have its own folder containing two files: `freshmen.json` and `freshmen.txt`. The former has the following structure:

```json
[
  {
    "name": "First freshman name",
    "semester": "First freshman semester (1º or 2º)"
  },
  {
    "name": "Second freshman name",
    "semester": "Second freshman semester (1º or 2º)"
  },
  ...
]
```

The latter is a plain text file containing one freshman name per line, _without the semester_, as follows:

```text
First freshman name
Second freshman name
Third freshman name
...
```
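
To illustrate how the two files relate, here is a minimal sketch for consuming the generated data. The helper names (`loadFreshmen`, `namesAsText`, `countBySemester`) are hypothetical and not part of this repository. The sketch assumes only the `name` and `semester` fields shown above.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: reads and parses one course's freshmen.json.
function loadFreshmen(courseDir) {
  const file = path.join(courseDir, "freshmen.json");
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Rebuilds the freshmen.txt layout: one name per line, without the semester.
function namesAsText(freshmen) {
  return freshmen.map((freshman) => freshman.name).join("\n");
}

// Counts how many freshmen fall in each semester value.
function countBySemester(freshmen) {
  const counts = {};
  for (const { semester } of freshmen) {
    counts[semester] = (counts[semester] ?? 0) + 1;
  }
  return counts;
}

// Possible usage, assuming a course folder named "course1" exists:
//   const freshmen = loadFreshmen("./json/course1");
//   console.log(namesAsText(freshmen));
//   console.log(countBySemester(freshmen));
```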

## Disclaimer

This program is not associated with the Universidade Federal do Rio Grande do Sul in any way. It was created only to make it easier to fetch the freshmen from the popular Listão do Vestibular.