Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ecshreve/jepp
api fun with jeopardy!
https://github.com/ecshreve/jepp
api colly data gin-gonic github-actions go golang jeopardy mysql scrape trivia
Last synced: about 1 month ago
JSON representation
api fun with jeopardy!
- Host: GitHub
- URL: https://github.com/ecshreve/jepp
- Owner: ecshreve
- License: mit
- Created: 2023-06-14T04:02:31.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-02-23T05:02:03.000Z (9 months ago)
- Last Synced: 2024-10-02T05:05:03.375Z (about 1 month ago)
- Topics: api, colly, data, gin-gonic, github-actions, go, golang, jeopardy, mysql, scrape, trivia
- Language: Go
- Homepage: https://jepp.app
- Size: 117 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
README
# [jepp](https://jepp.app)
API fun with Jeopardy! Access >100k Jeopardy clues scraped from [j-archive] via a simple api.
[![CI](https://github.com/ecshreve/jepp/actions/workflows/ci.yml/badge.svg?branch=main&event=push)](https://github.com/ecshreve/jepp/actions/workflows/ci.yml)
![GitHub go.mod Go version](https://img.shields.io/github/go-mod/go-version/ecshreve/jepp)
[![Go Report Card](https://goreportcard.com/badge/github.com/ecshreve/jepp)](https://goreportcard.com/report/github.com/ecshreve/jepp)
[![GoDoc](https://godoc.org/github.com/ecshreve/jepp?status.svg)](https://godoc.org/github.com/ecshreve/jepp)
![GitHub release (release name instead of tag name)](https://img.shields.io/github/v/release/ecshreve/jepp)---
![jepp](static/repo/jepp-ui.png)
# API
The api is backed by a go web server built with [gin] that exposes a few endpoints to access historical Jeopardy data.
## Types
The shape of the data returned from the api aligns with the db schema, this is accomplished via various struct tags on the type definitions.
### Struct tags quick reference
- `db` tag is used by the [sqlx] library to map the db columns to the struct fields
- `json` tag is used by the [gin] library to map the struct fields to the json response
- `example` tag is used by the [swaggo] library to generate example responses for the swagger docs
- `form` and `binding` tags are used by [gin] to map query arguments to a struct with some basic validationfor example, the `pkg.models.Clue` type is defined as follows:
```{golang}
type Clue struct {
ClueID int64 `db:"clue_id" json:"clueId" example:"804002032"`
GameID int64 `db:"game_id" json:"gameId" example:"8040"`
CategoryID int64 `db:"category_id" json:"categoryId" example:"804092001"`
Question string `db:"question" json:"question" example:"This is the question."`
Answer string `db:"answer" json:"answer" example:"This is the answer."`
}
```- Struct tags also appear on some helper structs like the `pkg.server.Filter` type:
```{golang}
// Filter describes the query parameters that can be used to filter the results of an API query.
type Filter struct {
Random *bool `form:"random"`
ID *int64 `form:"id"`
Page *int64 `form:"page,default=0" binding:"min=0"`
Limit *int64 `form:"limit,default=1" binding:"min=1,max=100"`
}
```## Frontend / UI
- The ui is served from the `/` endpoint and is an html template that displays the swagger docs, some
general information, and a sample request.
- The embedded swagger ui provides runnable request / response examples and type references.## Swagger Docs
- Swagger documentation is generated with [swaggo] and embedded in the homepage as part of the html template.
- Figuring out the right build/deploy configuration was challenging here, I ran into some problems in my Taskfile task dependencies. The main problem seemed to be multiple tasks with the same set of files listed as `sources` causing a watched task to continuously rebuild because of some circular dependencies.
- These problems seem to be solved after breaking up and organizing the taskfiles better.# DB
Currently the app uses a file based sqlite database. Below are some notes on the deprecated mysql setup.
All in all, the 15 seasons of data currently in the DB only end up as ~25 MB .sql file. Using
sqlite removed the need to run a mysql server and made the app easier to deploy and test.## Notes on deprecated mysql setup
Getting the data into the database started as a manual process, and hasn't been automated yet because the data is all there and I haven't needed to import / export it recently.
Here's how I went about doing it initially:
- For local development I set the `DB_HOST`, `DB_USER`, `DB_PASS`, `DB_NAME` environment variables to target a `mariadb/mysql` server running in my home lab.
- Most of the time I play with that local copy of the data, but the public api uses a mysql db hosted on [digital ocean](https://www.digitalocean.com/products/managed-databases-mysql)
- Initially to populate the prod db I just manually created a backup of my local database and restored it to the prod database, both via an [adminer](https://hub.docker.com/_/adminer/) instance running in my home lab.
- Currently the `task sql:dump` command will create a dump of the database defined by the environment variables and write it to `data/dump.sql.gz`.
- Recent dumps of the prod database are available in the [data](data/) directory or as downloads on repository's [Releases](https://github.com/ecshreve/jepp/releases) page.## Data Scraping
note: all the scraping was done against the mysql databse, not the current sqlite setup (though I did
some brief testing and things seemed to still work for the most part _ymmv_)The [scraper](pkg/scraper/) package contains the code to scrape [j-archive] for jeopardy clues and write the data to a mysql database. [Colly] is the package use to scrape the data and [sqlx] is used to write the data to the db. The scraping happened in a few passes, more or less following these steps:
Get all the seasons and populate the seasons table.
- This scrape targeted the season [summary page on j-archive](https://www.j-archive.com/listseasons.php) and pulled the season number, start date, end date for each season
Get all the games for each season and populate the game table.
- This scrape targets the individual [season show pages on j-archive](https://www.j-archive.com/showseason.php?season=1) and pulls the game number, air date, taped date for each season
Get all the clues for each game in each season and populate the category and clue tables- This scrape targeted the individual [game pages on j-archive](https://www.j-archive.com/showgame.php?game_id=7040) and pulls the clue data from the `` elements on the page
## references / prior art
- [jservice](https://jservice.io/)
- [jservice repo](https://github.com/sottenad/jService)
- [jeppy](https://github.com/ecshreve/jeppy)
- [illustrated sqlx](https://jmoiron.github.io/sqlx/)[sqlx]:
[gin]:
[swaggo]:
[j-archive]:
[colly]:
![cf](https://img.shields.io/badge/Cloudflare-F38020?style=for-the-badge&logo=Cloudflare&logoColor=white)
![do](https://img.shields.io/badge/Digital_Ocean-0080FF?style=for-the-badge&logo=DigitalOcean&logoColor=white)
![ga](https://img.shields.io/badge/GitHub_Actions-2088FF?style=for-the-badge&logo=github-actions&logoColor=white)
![mysql](https://img.shields.io/badge/MySQL-005C84?style=for-the-badge&logo=mysql&logoColor=white)
![mariadb](https://img.shields.io/badge/MariaDB-003545?style=for-the-badge&logo=mariadb&logoColor=white)
![dock](https://img.shields.io/badge/Docker-2CA5E0?style=for-the-badge&logo=docker&logoColor=white)
![swag](https://img.shields.io/badge/Swagger-85EA2D?style=for-the-badge&logo=Swagger&logoColor=white)
![golan](https://img.shields.io/badge/Go-00ADD8?style=for-the-badge&logo=go&logoColor=white)