https://github.com/sgibson91/github-activity-dashboard

A tool to help visualise activity in issues and PRs across many repos and orgs

binder binder-ready github github-activity jupyter-notebook python template voila voila-dashboard


# My GitHub Activity Dashboard

[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/sgibson91/github-activity-dashboard/main.svg)](https://results.pre-commit.ci/latest/github/sgibson91/github-activity-dashboard/main)

Jupyter-based dashboards to help visualise activity in issues and Pull Requests across many repositories and organisations - all in one place!

Click here to view the activity dashboard! :point_right: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sgibson91/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fsgibson91%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Factivity-dashboard.ipynb%26branch%3Dmain)

Click here to view the past activity summary! :point_right: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/sgibson91/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252Fsgibson91%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Fpast-activity-summary.ipynb%26branch%3Dmain)

---

**Table of Contents:**

- [My GitHub Activity Dashboard](#my-github-activity-dashboard)
  - [How the dashboards work](#how-the-dashboards-work)
    - [Python script](#python-script)
    - [Continuous Delivery of data](#continuous-delivery-of-data)
    - [Visualising the data](#visualising-the-data)
    - [Binder and `nbgitpuller`](#binder-and-nbgitpuller)
  - [Get your own dashboards!](#get-your-own-dashboards)
  - [Using the tools locally](#using-the-tools-locally)
    - [Installation requirements](#installation-requirements)
    - [Getting the data](#getting-the-data)
    - [Viewing the dashboards](#viewing-the-dashboards)

## How the dashboards work

### Python script

`get-data.py` is a Python script that makes calls to the [GitHub REST API](https://docs.github.com/en/rest) in order to collect information about issues and pull requests.
It specifically makes requests to the [search endpoint](https://docs.github.com/en/rest/reference/search#search-issues-and-pull-requests), which allows us to search for issues and pull requests as we would in GitHub's own search bar.
For example, `is:issue is:open assignee:sgibson91` would return all open issues assigned to me.
This turned out to be much more efficient than using the ['list issues assigned to the authenticated user' endpoint](https://docs.github.com/en/rest/reference/issues#list-issues-assigned-to-the-authenticated-user) since it made fewer individual requests and, therefore, the script was less likely to be rate-limited.
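
As a rough illustration of the request described above, the sketch below turns a search-bar style query into a request URL for the search endpoint. The helper name `build_search_url` and the paging defaults are assumptions made for this example, not taken from `get-data.py`.

```python
from urllib.parse import urlencode

# Endpoint for searching issues and pull requests in the GitHub REST API.
SEARCH_URL = "https://api.github.com/search/issues"


def build_search_url(query: str, per_page: int = 100, page: int = 1) -> str:
    """Return the full request URL for a search-bar style query (hypothetical helper)."""
    params = {"q": query, "per_page": per_page, "page": page}
    # urlencode escapes the colons and turns spaces into "+".
    return f"{SEARCH_URL}?{urlencode(params)}"


url = build_search_url("is:issue is:open assignee:sgibson91")
print(url)
```

One search request can return up to 100 results per page, which is part of why a single query can stand in for many per-repository calls.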

The script searches for all issues and pull requests that meet the following criteria:

- the user is either assigned to them or has created them,
- they involve the user and were closed in the last month,
- they involve the user and were closed or updated in the last week, and
- they are pull requests where the user's review has been requested.
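
The criteria above could map onto GitHub search qualifiers roughly as follows. This is a sketch only: the function name `build_queries`, the 30-day reading of "last month", and the exact qualifier combinations are assumptions, and `get-data.py` may phrase its queries differently.

```python
from datetime import date, timedelta


def build_queries(user: str, today: date) -> list[str]:
    """Return one search query per criterion (hypothetical helper)."""
    last_week = (today - timedelta(days=7)).isoformat()
    # Assumption: "the last month" is treated as the last 30 days.
    last_month = (today - timedelta(days=30)).isoformat()
    return [
        f"assignee:{user}",
        f"author:{user}",
        f"involves:{user} closed:>={last_month}",
        # Items closed in the last week are already covered by the query above,
        # so only the "updated" half of that criterion needs its own query.
        f"involves:{user} updated:>={last_week}",
        f"is:pr review-requested:{user}",
    ]


for query in build_queries("sgibson91", date.today()):
    print(query)
```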

The results are compiled into a pandas dataframe, along with some metadata, and then written to a CSV file called `github-activity.csv`.
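
To give a feel for this step, here is a minimal sketch that flattens search results into rows and writes them as CSV. The script itself uses pandas; this example swaps in the standard-library `csv` module to stay dependency-free. The column names and the sample item are made up for illustration, so the real `github-activity.csv` layout may differ.

```python
import csv
import io

# Hypothetical column names; the real CSV may use different ones.
COLUMNS = ["repo", "title", "url", "kind", "state", "created_at", "updated_at", "closed_at"]


def to_row(item: dict) -> dict:
    """Flatten one search result item into a CSV row."""
    # repository_url has the form https://api.github.com/repos/ORG/REPO
    repo = "/".join(item["repository_url"].split("/")[-2:])
    return {
        "repo": repo,
        "title": item["title"],
        "url": item["html_url"],
        # Search results mark pull requests with a "pull_request" key.
        "kind": "pull request" if "pull_request" in item else "issue",
        "state": item["state"],
        "created_at": item["created_at"],
        "updated_at": item["updated_at"],
        "closed_at": item.get("closed_at") or "",
    }


def write_csv(items: list[dict], handle) -> None:
    """Write a header row followed by one row per search result."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(to_row(item) for item in items)


# Made-up item for illustration only.
sample = {
    "repository_url": "https://api.github.com/repos/octo-org/example",
    "title": "Fix typo in docs",
    "html_url": "https://github.com/octo-org/example/issues/1",
    "state": "open",
    "created_at": "2024-01-01T00:00:00Z",
    "updated_at": "2024-01-02T00:00:00Z",
    "closed_at": None,
}
buffer = io.StringIO()
write_csv([sample], buffer)
print(buffer.getvalue())
```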

You can provide a `.repoignore` file to prevent results from specific repos turning up in the dataset.
This is a plain text file with a repository to be ignored on each new line.
The repository to be ignored is represented by the form `ORG_OR_USER/REPO_NAME`.
You can also use [regular expressions](https://en.wikipedia.org/wiki/Regular_expression) here.
E.g., if you would like to ignore a whole organisation, this would look like `ORG_NAME/.*`.
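
One way such a filter might work is sketched below. The helper names are hypothetical, and matching each pattern against the whole `ORG_OR_USER/REPO_NAME` string (rather than a prefix) is an assumption; the script may match differently.

```python
import re


def load_patterns(text: str) -> list[re.Pattern]:
    """Compile one pattern per non-empty line of a .repoignore file."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def is_ignored(repo: str, patterns: list[re.Pattern]) -> bool:
    # Assumption: a pattern must match the full name, so "someuser/notes"
    # does not also hide "someuser/notes-archive".
    return any(pattern.fullmatch(repo) for pattern in patterns)


patterns = load_patterns("octo-org/.*\nsomeuser/notes\n")
repos = ["octo-org/api", "someuser/notes", "someuser/notes-archive", "other/project"]
kept = [repo for repo in repos if not is_ignored(repo, patterns)]
print(kept)  # ['someuser/notes-archive', 'other/project']
```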

### Continuous Delivery of data

The `get-data.py` script is run in a GitHub Actions workflow on a regular cron trigger.
This cron job runs as if running the script locally and commits the updated CSV file to the `main` branch.

### Visualising the data

The data are visualised using the `activity-dashboard.ipynb` and `past-activity-summary.ipynb` Jupyter Notebooks.
They each implement widgets to interact with the data so that users can filter by an individual repository and sort by time created, updated, or closed (past activity summary only).
The Notebooks are executed with `voila` in order to give the dashboards a more aesthetically pleasing look.

### Binder and `nbgitpuller`

The dashboards can be launched in Binder to generate a quick view without needing to use the repository locally.
Binder usually rebuilds the Docker image of the repository with every new commit it sees on the provided git reference.
However, since the CSV file is regularly updated, Binder was rebuilding _a lot_ when it didn't need to, because only the data were changing and not the Notebooks or the environment they require.

To mitigate the number of rebuilds Binder would need to make, the `requirements.txt` file containing _only_ the packages needed to run the Notebooks has been separated out onto the `notebook-env` branch.
This is the branch we build with Binder.
We then use [`nbgitpuller`](https://jupyterhub.github.io/nbgitpuller/) to dynamically pull in the content from the `main` branch.
This results in a Binder environment that is only rebuilt when the Notebooks' requirements are changed, but still operates with the most up-to-date data from the `main` branch.

**Binder needs BOTH the `main` branch and the `notebook-env` branch to operate in this way!**
**If you are using this project as a template or forking it, DO NOT remove the `notebook-env` branch without ALSO updating the Binder link!**
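
The Binder badge links in this README have two layers: the outer URL tells Binder to build the `notebook-env` branch, and its `urlpath` parameter carries an `nbgitpuller` request that pulls `main` and then opens a notebook with `voila`. Because one URL is nested inside another, the inner one ends up percent-encoded twice. The sketch below rebuilds such a link; the function name `binder_link` is made up for this example.

```python
from urllib.parse import quote, urlencode


def binder_link(handle: str, notebook: str, repo: str = "github-activity-dashboard") -> str:
    """Build a Binder + nbgitpuller link for one notebook (hypothetical helper)."""
    # Inner layer: the nbgitpuller request that runs inside the Binder session.
    inner = "git-pull?" + urlencode(
        {
            "repo": f"https://github.com/{handle}/{repo}",
            "urlpath": f"/voila/render/{repo}/{notebook}",
            "branch": "main",
        }
    )
    # Outer layer: Binder builds notebook-env and opens `inner` as its urlpath,
    # so the already-encoded inner string is encoded a second time.
    return (
        f"https://mybinder.org/v2/gh/{handle}/{repo}/notebook-env"
        f"?urlpath={quote(inner, safe='')}"
    )


print(binder_link("sgibson91", "activity-dashboard.ipynb"))
```

Called with `sgibson91` and `activity-dashboard.ipynb`, this reproduces the first badge link at the top of this README.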

## Get your own dashboards!

1. [Create your own version of this repository](https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template) by clicking the "Use this template" button at the top of this page.
:fire: **Make sure to check the "Include all branches" box when creating your repo, as you will need the `notebook-env` branch as well for the Binder links to work!** :fire:
You can delete any other branches, **except** for `main` and `notebook-env`.

![include-all-branches](https://docs.github.com/assets/cb-28415/images/help/repository/include-all-branches.png)

2. Delete the `github-activity.csv` file from your repo.
(It will be regenerated when the CI job next runs!)
3. Delete the `.repoignore` file **or** edit it to contain a list of repos you'd like excluded from the dataset, in the form `ORG_OR_USER/REPO_NAME`.
4. [Create a Personal Access Token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) with `public_repo` scope and [add it as a repository secret](https://docs.github.com/en/actions/security-guides/encrypted-secrets#creating-encrypted-secrets-for-a-repository) called `ACCESS_TOKEN`.
5. Edit the [README](./README.md) and update the Binder badges at the top of the document, replacing all instances of `{{ YOUR_GITHUB_HANDLE_HERE }}` (including `{{}}`!!!) with your GitHub handle in the below snippet:

```markdown
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/{{ YOUR_GITHUB_HANDLE_HERE }}/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252F{{ YOUR_GITHUB_HANDLE_HERE }}%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Factivity-dashboard.ipynb%26branch%3Dmain)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/{{ YOUR_GITHUB_HANDLE_HERE }}/github-activity-dashboard/notebook-env?urlpath=git-pull%3Frepo%3Dhttps%253A%252F%252Fgithub.com%252F{{ YOUR_GITHUB_HANDLE_HERE }}%252Fgithub-activity-dashboard%26urlpath%3D%252Fvoila%252Frender%252Fgithub-activity-dashboard%252Fpast-activity-summary.ipynb%26branch%3Dmain)
```

:rotating_light: Be careful not to edit anything else in the URL! :rotating_light:

You can either get started straight away by [manually triggering the 'Update GitHub Activity' workflow](https://docs.github.com/en/actions/managing-workflow-runs/manually-running-a-workflow#running-a-workflow) or wait for the cron job to run it for you to produce your `github-activity.csv`.
Once that has been added to your repo, click your edited Binder badges to see your dashboards!

## Using the tools locally

### Installation requirements

This project requires a Python installation.
Any minor version of Python 3 should suffice, but that hasn't been tested so proceed with caution!

The packages required to run this project are stored in `requirements.txt` and can be installed via `pip`:

```bash
pip install -r requirements.txt
```

### Getting the data

1. If you have not already done so, [create a Personal Access Token](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) with the `public_repo` scope
2. Add this as a variable called `ACCESS_TOKEN` to your shell environment

```bash
export ACCESS_TOKEN="PASTE YOUR TOKEN HERE"
```

3. Run the Python script to generate the `github-activity.csv` file

```bash
python get-data.py
```

:rotating_light: If you see the message "You are rate limited! :scream:", you will need to wait ~1 hour before trying to run the script again :rotating_light:
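
If you would rather not guess how long to wait, GitHub's API responses carry `x-ratelimit-remaining` and `x-ratelimit-reset` headers. The sketch below shows one way to read them. The function name is hypothetical, `get-data.py` is not known to do this, and the plain `dict` stands in for the response headers of whichever HTTP library you use.

```python
import time
from typing import Optional


def seconds_until_reset(headers: dict, now: Optional[float] = None) -> int:
    """Return how long to wait before retrying, or 0 if requests remain."""
    now = time.time() if now is None else now
    # Assumption: header names are lower-case keys in a plain dict.
    if int(headers.get("x-ratelimit-remaining", "1")) > 0:
        return 0
    reset_at = int(headers["x-ratelimit-reset"])  # UTC epoch seconds
    return max(0, int(reset_at - now))


example = {"x-ratelimit-remaining": "0", "x-ratelimit-reset": "1700000600"}
print(seconds_until_reset(example, now=1700000000))  # 600
```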

### Viewing the dashboards

Once `github-activity.csv` has been generated, view the dashboards by running:

```bash
# Each command starts its own server and keeps running,
# so launch them in separate terminals.
voila activity-dashboard.ipynb
voila past-activity-summary.ipynb
```

A browser window should open automatically.