https://github.com/hmarr/advisories-analysis
Analysing the GitHub Advisory Database with sqlite and pandas
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/hmarr/advisories-analysis
- Owner: hmarr
- Created: 2022-10-24T14:32:25.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-24T14:34:14.000Z (over 3 years ago)
- Last Synced: 2025-02-06T12:32:41.289Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 43 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GitHub Advisory Database Analysis
Scripts for performing analysis on the GitHub Advisory Database.
## Building a SQLite database of GHSAs
The public [GitHub Advisory Database](https://github.com/github/advisory-database) is a repo of 180k+ JSON files, which makes it awkward to query directly. This repo contains a script to download the data and a small Rust program that builds a SQLite database of the GitHub Security Advisories (GHSAs), which is much easier to work with.
Note: you'll need a recent version of Rust installed to import the data.
1. Download the data by running `./download-data.sh`. This downloads the GHSAs as OSV-formatted JSON files to `data/advisory-database-main`.
2. Build the SQLite database by running `cargo run --release`. The database is written to `data/advisory-database.db`.
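Once both steps have finished, a quick sanity check is to open the resulting file and count the rows in each table. The following is a minimal sketch using only Python's standard library. The helper name `table_row_counts` is hypothetical and not part of this repo; the only thing taken from the steps above is the output path.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    # Open read-only so a mistyped path raises an error
    # instead of silently creating an empty database file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names are read from sqlite_master, so quoting them is enough here.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in tables
        }
    finally:
        conn.close()


if __name__ == "__main__":
    db = Path("data/advisory-database.db")  # output path from step 2
    if db.exists():
        for table, count in table_row_counts(db).items():
            print(f"{table}: {count} rows")
    else:
        print(f"{db} not found; run the two steps above first")
```

An empty result or a count of zero suggests the download or the import did not complete.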
Here's the schema for the database:
```sql
CREATE TABLE advisories (
    ghsa TEXT PRIMARY KEY,
    modified TEXT NOT NULL,
    published TEXT,
    withdrawn TEXT,
    cve TEXT,
    ecosystems TEXT,
    summary TEXT,
    details TEXT,
    severity TEXT,
    cwes TEXT
);

CREATE TABLE affected_packages (
    ghsa TEXT,
    name TEXT NOT NULL,
    ecosystem TEXT NOT NULL,
    ranges TEXT,
    versions TEXT
);
```
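The two tables share the `ghsa` column, so per-package questions are answered with a join. The sketch below counts non-withdrawn advisories per ecosystem. It runs against an in-memory database built from the schema above and filled with placeholder rows, so the IDs, package names, and timestamps are made up. It also assumes that `withdrawn` is `NULL` for advisories that have not been withdrawn, which the schema permits but does not guarantee.

```python
import sqlite3

SCHEMA = """
CREATE TABLE advisories (
    ghsa TEXT PRIMARY KEY, modified TEXT NOT NULL, published TEXT,
    withdrawn TEXT, cve TEXT, ecosystems TEXT, summary TEXT,
    details TEXT, severity TEXT, cwes TEXT
);
CREATE TABLE affected_packages (
    ghsa TEXT, name TEXT NOT NULL, ecosystem TEXT NOT NULL,
    ranges TEXT, versions TEXT
);
"""


def advisories_per_ecosystem(conn):
    """Count distinct non-withdrawn advisories for each package ecosystem."""
    query = """
        SELECT p.ecosystem, COUNT(DISTINCT a.ghsa) AS advisory_count
        FROM advisories AS a
        JOIN affected_packages AS p ON p.ghsa = a.ghsa
        WHERE a.withdrawn IS NULL  -- assumption: NULL means "not withdrawn"
        GROUP BY p.ecosystem
        ORDER BY advisory_count DESC, p.ecosystem
    """
    return conn.execute(query).fetchall()


# In-memory stand-in with placeholder rows (not real advisories).
# For the real data, use sqlite3.connect("data/advisory-database.db") instead.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO advisories (ghsa, modified, withdrawn) VALUES (?, ?, ?)",
    [
        ("GHSA-0000-0000-0001", "2022-01-01T00:00:00Z", None),
        ("GHSA-0000-0000-0002", "2022-01-02T00:00:00Z", None),
        ("GHSA-0000-0000-0003", "2022-01-03T00:00:00Z", "2022-02-01T00:00:00Z"),
    ],
)
conn.executemany(
    "INSERT INTO affected_packages (ghsa, name, ecosystem) VALUES (?, ?, ?)",
    [
        ("GHSA-0000-0000-0001", "pkg-a", "npm"),
        ("GHSA-0000-0000-0001", "pkg-b", "npm"),
        ("GHSA-0000-0000-0002", "pkg-c", "npm"),
        ("GHSA-0000-0000-0002", "pkg-d", "PyPI"),
        ("GHSA-0000-0000-0003", "pkg-e", "PyPI"),
    ],
)

print(advisories_per_ecosystem(conn))
# [('npm', 2), ('PyPI', 1)]
```

`COUNT(DISTINCT a.ghsa)` matters because one advisory can affect several packages in the same ecosystem; a plain `COUNT(*)` would count it once per affected package.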
## Analysis notebook
The `analysis.ipynb` notebook contains some basic analysis of the data and is a good starting point for anyone who wants to dig further. To run it, you'll need pandas, matplotlib, and Jupyter (or the notebook extension for VS Code) installed.
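For readers who want to try pandas against the database before opening the notebook, here is a minimal sketch of loading query results into a DataFrame. It is not taken from `analysis.ipynb`. The rows and severity labels are placeholders inserted into an in-memory database so the example runs on its own, and the values stored in the real `severity` column may differ.

```python
import sqlite3

import pandas as pd


def severity_counts(conn):
    """Return a Series mapping each severity label to its advisory count."""
    df = pd.read_sql_query(
        "SELECT ghsa, severity FROM advisories WHERE withdrawn IS NULL",
        conn,
    )
    # value_counts() ignores rows where severity is missing.
    return df["severity"].value_counts()


# In-memory stand-in with placeholder rows (not real advisories).
# For the real data, use sqlite3.connect("data/advisory-database.db") instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE advisories (
        ghsa TEXT PRIMARY KEY, modified TEXT NOT NULL, published TEXT,
        withdrawn TEXT, cve TEXT, ecosystems TEXT, summary TEXT,
        details TEXT, severity TEXT, cwes TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO advisories (ghsa, modified, withdrawn, severity) VALUES (?, ?, ?, ?)",
    [
        ("GHSA-0000-0000-0001", "2022-01-01T00:00:00Z", None, "HIGH"),
        ("GHSA-0000-0000-0002", "2022-01-02T00:00:00Z", None, "HIGH"),
        ("GHSA-0000-0000-0003", "2022-01-03T00:00:00Z", None, "MODERATE"),
        ("GHSA-0000-0000-0004", "2022-01-04T00:00:00Z", "2022-02-01T00:00:00Z", "HIGH"),
    ],
)

counts = severity_counts(conn)
print(counts)
```

With matplotlib installed, `counts.plot.bar()` turns the same Series into a bar chart inside a notebook cell.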