https://github.com/j-hoplin/online-judge-scraper
Web Scraper for Online Judge System
baekjoon-online-judge prisma puppeteer typescript webscraper
- Host: GitHub
- URL: https://github.com/j-hoplin/online-judge-scraper
- Owner: J-Hoplin
- License: mit
- Created: 2024-02-09T08:09:22.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2024-02-26T07:22:54.000Z (11 months ago)
- Last Synced: 2024-02-26T08:34:31.469Z (11 months ago)
- Topics: baekjoon-online-judge, prisma, puppeteer, typescript, webscraper
- Language: TypeScript
- Homepage:
- Size: 214 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
## Online Judge Scraper
Online web scraper for the [Online Judge System Backend](https://github.com/J-Hoplin/Online-Judge-System). This scraper collects problem data from [Baekjoon Online Judge](https://www.acmicpc.net/).
---
### Used Skills
- TypeScript (v5.3)
- [Puppeteer](https://pptr.dev/)
- [Prisma ORM](https://www.prisma.io/)

### Be aware of these
**Be aware that the database of your [`Online Judge System`](https://github.com/J-Hoplin/Online-Judge-System) must be initialized.**

The `DATABASE_URL` in the `.env` file must be the same as the `DATABASE_URL` of `Online Judge System`.

Set configuration values through the `.env` file.
```
BOJ_ROOT="https://www.acmicpc.net/problem"
CHUNK="3"
RANGE_START="1000" # Must be greater than or equal to 1000
RANGE_END="1100" # Must be less than or equal to 31000
```

If validation fails, an error is reported.
```
❌Fail to load config datas
🔧RANGE_START: RANGE_START must not be less than 1000
🔧RANGE_END: RANGE_END must not be greater than 31000
```

**In this project, the Puppeteer cache directory is set to the project directory to prevent Chromium cache collisions with other Puppeteer applications. If you don't want this, remove `.puppeteerrc.js` and reinstall Puppeteer.**
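The range rules for `RANGE_START` and `RANGE_END` described above can be illustrated with a minimal sketch. This is not the project's actual validation code: the names `ScraperEnv` and `validateRange` are hypothetical, and the integer check is an added assumption. Only the two bounds and their error messages come from this README.

```typescript
// Hypothetical sketch: ScraperEnv and validateRange are illustrative names,
// not taken from the project's source.
type ScraperEnv = Record<string, string | undefined>;

const MIN_PROBLEM_ID = 1000; // lower bound stated in this README
const MAX_PROBLEM_ID = 31000; // upper bound stated in this README

// Returns a list of validation messages; an empty list means the range is accepted.
function validateRange(env: ScraperEnv): string[] {
  const errors: string[] = [];
  const start = Number(env['RANGE_START']);
  const end = Number(env['RANGE_END']);

  // Assumption: values that are missing or not integers are rejected as well.
  if (!Number.isInteger(start)) {
    errors.push('RANGE_START: RANGE_START must be an integer');
  } else if (start < MIN_PROBLEM_ID) {
    errors.push(`RANGE_START: RANGE_START must not be less than ${MIN_PROBLEM_ID}`);
  }

  if (!Number.isInteger(end)) {
    errors.push('RANGE_END: RANGE_END must be an integer');
  } else if (end > MAX_PROBLEM_ID) {
    errors.push(`RANGE_END: RANGE_END must not be greater than ${MAX_PROBLEM_ID}`);
  }

  return errors;
}

// Example: both values fall outside the allowed range.
const exampleErrors = validateRange({ RANGE_START: '999', RANGE_END: '40000' });
for (const message of exampleErrors) {
  console.log(`🔧${message}`);
}
```

In a Node.js process, the same function could be called as `validateRange(process.env)` once the `.env` file has been loaded.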
### Add repository
To accommodate future updates to `Online Judge System`, use the pre-defined repository pattern if you need to switch to another database management system. Below is example skeleton code for adding a PostgreSQL repository. **Repositories should be defined in `src/database/repository` (this is just a convention of the project).**
```typescript
import { PrismaConnector } from '../connector';
import { IRepository } from './repository.interface';

export class PostgreSQLRepository extends PrismaConnector implements IRepository {
  constructor() {
    super();
  }

  // Return type assumed to be Promise<void>; match the signature declared in IRepository.
  async saveProblem(
    title: string,
    problemHTML: string,
    inputHTML: string,
    outputHTML: string,
    timeLimit: number,
    memoryLimit: number,
    examples: string[][],
  ): Promise<void> {
    // Implement your repository
  }
}
```

### How to use?
1. Install dependencies

   ```
   yarn install
   ```

2. Generate prisma client

   ```
   yarn generate
   ```

3. Modify `.env` in accordance with your preference
4. Start scraper

   ```
   yarn start
   ```

![](./img/1.png)