https://github.com/caesariodito/bangkit-dashboard-scraper
A side project to scrape bangkit dashboard website to make my life better; am also trying out puppeteer;
- Host: GitHub
- URL: https://github.com/caesariodito/bangkit-dashboard-scraper
- Owner: caesariodito
- License: mit
- Created: 2024-04-12T05:21:49.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-28T04:42:01.000Z (about 2 years ago)
- Last Synced: 2025-06-22T06:36:45.396Z (11 months ago)
- Language: TypeScript
- Size: 1.33 MB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Bangkit Dashboard Scraper
A side project to scrape the bangkit dashboard website to make my life less burdensome. kekw.
Built with TypeScript, Bun, Puppeteer and ExcelJS.
Below is a brief intro and the reasons for the tech stack that I chose. **_It's just some personal notes, you don't need to read it tbh._**
Sooo, it's been a busy time for me doing mentoring on bangkit along with other stuff. Unfortunately, I couldn't keep up with the spreadsheet tracker that I made and shared with my mentees. It's pretty burdensome for me to check all the data one by one and make sure everything is up-to-date weekly or even daily – it's just not possible for me.
Then this project came to mind: I just wanted to automate the things that I can automate and also help my mentees with the tracker (since I share it with them). At the time I developed this side project, I had been neglecting the spreadsheet tracker for almost a month 💀. So yes, I think this is the right move for me, and fortunately, I got a bit of spare time to develop this project during the Eid holiday.
In this side project, I wanted to try new things, hence the stack. It was an _alien_ stack for me. **I had never gotten the time to use bun** _(well, I once tried it to run a laravel breeze project and everything went well, so I chose to use it again, since it was fast)_ and **had never used puppeteer** – heck, I didn't even know the syntax.
I used to scrape things with python selenium, and moving to puppeteer was pretty uncomfortable for me. But hey, it was an interesting experience. I also feel like puppeteer is much faster compared to selenium _(pleb thoughts, since I haven't dug deep into selenium that much)_. Puppeteer feels like a teenager, while selenium feels like an adult LMAOAO.
**Anyways, a small disclaimer:** I used some AI-generated code to develop this project since I don't know the syntax and mechanism of puppeteer that much. Ofc I read the puppeteer docs, but they were also vague for me, so I needed further assistance. **Pardon me if the code is ugly as hell**. I only spent ~a day developing this.
If you want to contribute to this side project as the demands of the spreadsheet tracker grow, feel free to open a PR! But make sure you document it well, tho currently I don't have any guidelines for it.
## The Demo
### Scraper Video
[Watch the scraper demo video](https://drive.google.com/file/d/1Vm6LOc4BTR1BNAmyknrGwFYgjky16HK2/preview)
### Convert Video
[Watch the convert demo video](https://drive.google.com/file/d/1xyuJiBIRteDHicauCl6sDtzlNWUvJ_jI/preview)
## How to Run
This is a step-by-step guide to running this project. I already documented how to use it starting from an empty folder (via the demo video), but having text documentation is convenient too.
### Steps
1. clone the project and cd into it
2. install the dependencies via
```bash
npm install
# or
pnpm install
# or
...
bun install # <-- I use this one, but the others should work as expected too
```
3. run the script via
```bash
bun run start
# or
npm run start
```
you can always view or edit the script in the [`package.json`](package.json) file.
4. after you run it, it will open the login page and ask you to log in **manually** via google. you can just type your email and password as usual.
5. after you finish logging in, make sure you wait for the page to fully load. it will show the list of available mentees in your class. if everything's good, press **Enter** in the terminal to continue the script.
6. it will scrape some of the important data related to the spreadsheet tracker, and after it finishes, it will save the output to the [`profile_data.json`](example_profile_data.json) file.
7. you are then free to tweak and preprocess the data further. for example, importing the .json file directly into a spreadsheet.
8. Optionally, I created a script to convert the JSON file into XLSX (Excel). You can run it via `bun run convert`, or if you want to modify the script, it is available in the [`convert.ts`](convert.ts) and [`package.json`](package.json) files. The result of the script can be viewed in [`example_profile.xlsx`](example_profile.xlsx)
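For the curious, the "log in manually, then press **Enter**" pause from steps 4 and 5 can be sketched roughly like this. It's a minimal sketch using only Node's built-in `readline`, not necessarily how this repo does it; `waitForEnter` and its parameters are hypothetical names made up for illustration.

```typescript
import { createInterface } from "node:readline";
import { Readable } from "node:stream";

// Hypothetical helper (not taken from this repo): pauses until a full line
// arrives on `input`, i.e. until someone presses Enter in the terminal.
export function waitForEnter(
  prompt = "Press Enter once the mentee list has loaded... ",
  input: Readable = process.stdin,
): Promise<void> {
  const rl = createInterface({ input, output: process.stdout });
  return new Promise((resolve) => {
    rl.question(prompt, () => {
      rl.close(); // release the input stream so the script can exit later
      resolve();
    });
  });
}
```

A scraper would open the login page first, `await waitForEnter()`, and only then start reading the page, so the browser session is already authenticated by the time scraping begins.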
## Future Improvements
- [x] export to excel/csv (soon)
- [ ] integrate with spreadsheet directly via API
- [ ] bypass login with indexedDB (idk much about this)
- [ ] create a weekly cron-job to fully automate the tasks
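On the excel/csv item above: a CSV export doesn't need any extra dependency. Below is a minimal sketch, assuming the scraped JSON is a flat array of objects that all share the same keys. The `Row` type and the `toCsv` and `escapeCell` names are hypothetical, and the real `profile_data.json` may be shaped differently (nested fields would need flattening first).

```typescript
// Hypothetical row shape: the real profile_data.json may use different fields.
type Row = Record<string, string | number | boolean | null>;

// Quote a cell only when it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled, following the usual CSV convention.
function escapeCell(value: Row[string]): string {
  const text = value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

export function toCsv(rows: Row[]): string {
  if (rows.length === 0) return "";
  // Assumption: every row has the same keys as the first one.
  const headers = Object.keys(rows[0]);
  const lines = [
    headers.map(escapeCell).join(","),
    ...rows.map((row) =>
      headers.map((h) => escapeCell(row[h] ?? null)).join(","),
    ),
  ];
  return lines.join("\r\n");
}
```

The returned string can be written to a `.csv` file and imported into a spreadsheet like any other CSV.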