https://github.com/digitalinteraction/catalyst-trello-scraper

A Dockerized CLI that scrapes a Trello list, parses label relationships and stores them in redis

# Catalyst | Trello Scraper

This is the repo for the Not-Equal Catalyst's Trello scraper.
It is a [node.js](https://nodejs.org)
CLI written in [TypeScript](https://www.typescriptlang.org/)
and deployed through [Docker](https://www.docker.com/).
It queries the Trello API for cards on a specific list,
parses relationships from card labels, then writes into [redis](https://redis.io/).

[What is Not-Equal Catalyst?](https://github.com/unplatform/catalyst-about)

## Table of contents

- [Why a CLI](#why-a-cli)
- [Usage](#usage)
  - [Environment variables](#environment-variables)
  - [Commands](#commands)
- [Development](#development)
  - [Setup](#setup)
  - [Regular use](#regular-use)
  - [Irregular use](#irregular-use)
  - [Commits](#commits)
  - [Code Structure](#code-structure)
  - [Code formatting](#code-formatting)
  - [Testing](#testing)
- [Deployment](#deployment)
  - [Building the image](#building-the-image)
- [Future work](#future-work)

## Why a CLI

This is a CLI that is designed to be run with Docker and docker-compose.
This means you can set environment variables and link in a redis container.

It uses Docker's `ENTRYPOINT` so that you can pass your CLI command via docker.
This means you can deploy and run through `docker-compose` and customise the command
at the deployment level, not at the application level.

## Usage

This repo has a Docker image for each version of the scraper.

### Environment variables

These are the environment variables to set inside the docker container.
See [Setup](#setup) for more info about generating these.

- `TRELLO_APP_KEY` - Your trello app key, from https://trello.com/app-key
- `TRELLO_TOKEN` - Your generated app token
- `TRELLO_BOARD_ID` - The board id to scrape from
- `TRELLO_TARGET_LIST_ID` - The list id to pull cards from
- `TRELLO_CONTENT_LIST_ID` - The list id to pull content cards from (optional)
- `REDIS_URL` - The url to access redis from
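
A scraper like this might check these variables at startup and fail early when one is missing.
The following is a minimal TypeScript sketch, not the repo's actual code:
`loadConfig`, `REQUIRED_VARS` and `ScraperConfig` are hypothetical names,
and treating every variable except `TRELLO_CONTENT_LIST_ID` as required is an assumption based on the list above.

```typescript
// Variable names come from the list above; the helper itself is hypothetical.
const REQUIRED_VARS = [
  "TRELLO_APP_KEY",
  "TRELLO_TOKEN",
  "TRELLO_BOARD_ID",
  "TRELLO_TARGET_LIST_ID",
  "REDIS_URL",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

type ScraperConfig = Record<RequiredVar, string> & {
  TRELLO_CONTENT_LIST_ID?: string;
};

// Reads the variables from an env-like object and throws if any required one is unset or empty.
function loadConfig(env: Record<string, string | undefined>): ScraperConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const config = {} as ScraperConfig;
  for (const name of REQUIRED_VARS) {
    config[name] = env[name] as string;
  }

  // The content list is optional, so only copy it when it is set.
  const contentListId = env["TRELLO_CONTENT_LIST_ID"];
  if (contentListId) {
    config.TRELLO_CONTENT_LIST_ID = contentListId;
  }
  return config;
}
```

In a node.js process you would call it as `loadConfig(process.env)`.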

### Commands

Set the container's command to what you want it to perform.
Here is the output of the `--help` option for reference.

```
Usage: @openlab/catalyst-trello-scraper [options] [command]

Options:
  -V, --version       output the version number
  -h, --help          output usage information

Commands:
  fetch [options]     Fetch the current projects and content and store them in redis
  schedule [options]  Schedule a fetch based on a cron job, https://crontab.guru
  show:labels         Show the trello labels in redis
  show:cards          Show the matched trello cards in redis
  show:content        Show the content stored in redis
```
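
To give a rough picture of what `fetch` has to do, here is a hedged TypeScript sketch.
It builds the URL for Trello's REST endpoint that returns the cards on a list, then indexes card ids by label id.
Whether the scraper uses this exact endpoint, and what it means by label relationships, are assumptions;
`listCardsUrl`, `groupCardsByLabel` and the trimmed-down `TrelloCard` type are hypothetical names for illustration.

```typescript
// Minimal shapes for the fields used here; real Trello cards carry many more.
interface TrelloLabel {
  id: string;
  name: string;
}

interface TrelloCard {
  id: string;
  name: string;
  labels: TrelloLabel[];
}

// Builds the URL for Trello's "get cards in a list" endpoint,
// authenticating with the app key and token as query parameters.
function listCardsUrl(listId: string, appKey: string, token: string): string {
  const query = `key=${encodeURIComponent(appKey)}&token=${encodeURIComponent(token)}`;
  return `https://api.trello.com/1/lists/${encodeURIComponent(listId)}/cards?${query}`;
}

// Assumption: a "relationship" is modelled as label id -> ids of cards carrying that label.
function groupCardsByLabel(cards: TrelloCard[]): Map<string, string[]> {
  const byLabel = new Map<string, string[]>();
  for (const card of cards) {
    for (const label of card.labels) {
      const ids = byLabel.get(label.id) ?? [];
      ids.push(card.id);
      byLabel.set(label.id, ids);
    }
  }
  return byLabel;
}
```

Keeping both functions free of network and redis calls makes them easy to unit test; the HTTP request and the write to redis would wrap around them.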

## Development

### Setup

To develop on this repo you will need to have [Docker](https://www.docker.com/) and
[node.js](https://nodejs.org) installed on your dev machine and have an understanding of them.
This guide assumes you have the repo checked out and are on macOS.
You will also need a Trello account to pull the data from.

You'll only need to follow this setup once for your dev machine.

```bash
# Install dependencies
npm install

# Start up a redis container
# -> Launches a redis:4-alpine docker container
# -> Remember to 'docker-compose stop' after developing
# -> Binds port 6379 to localhost:6379
docker-compose up -d

# Setup your environment
cp .env.example .env

# Get your Trello App Key and put it into your .env
open https://trello.com/app-key

# Generate a Trello token for development, then fill it into your .env
source .env
open "https://trello.com/1/authorize?expiration=never&scope=read&response_type=token&name=Not-Equal%20Catalyst&key=$TRELLO_APP_KEY"

# To get your TRELLO_BOARD_ID & TRELLO_TARGET_LIST_ID, visit the board you want to pull from on trello.com
# Then add .json to the end of the url and inspect the contents to find your values
open https://trello.com
```

### Regular use

These are the commands you'll regularly run to develop the CLI, in no particular order.

```bash
# Run the CLI in development mode
# -> Runs the TypeScript directly with `ts-node` and loads the .env
# -> Use the `--` to stop npm swallowing the dash-dash arguments
npm run dev -- --help

# Execute redis commands to inspect the store
# -> Runs a command in the redis container (started by `npm run redis`)
# -> Attaches std in/output so it behaves like you've ssh'd into it
# -> Runs the internal redis-cli, so you can directly talk to redis
# -> For reference: https://redis.io/commands
npm run redis-cli

# Useful redis commands
127.0.0.1:6379> keys * # List all keys
127.0.0.1:6379> get cards # Get cards
127.0.0.1:6379> get labels # Get labels
127.0.0.1:6379> get content # Get content

# Run unit tests
# -> Looks for files named `*.spec.ts` in the src directory
npm run test
```

### Irregular use

These are commands you might need to run but probably won't, also in no particular order.

```bash
# Generate the table of contents for this readme
# -> It'll replace content between the toc-head and toc-tail HTML comments
npm run gen-readme-toc

# Manually lint code with TypeScript's `tsc`
npm run lint

# Manually format code
# -> This repo is set up to automatically format code on git commit
npm run prettier

# Manually transpile TypeScript to JavaScript
# -> This is part of the docker build which is triggered when deploying
# -> Writes files to dist, which is git-ignored
npm run build

# Manually start code from transpiled JavaScript
# -> It'll automatically load your local .env
npm run start
```

### Commits

All commits to this repo must follow [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/).
This ensures changes are structured and means the [CHANGELOG.md](/CHANGELOG.md) can be automatically generated.

This standard is enforced through a `commit-msg` hook using [yorkie](https://www.npmjs.com/package/yorkie).

### Code Structure

| Folder       | Contents                                    |
| ------------ | ------------------------------------------- |
| dist         | Where the transpiled JavaScript is built to |
| node_modules | Where npm's modules get installed into      |
| src          | Where the code of the app is                |

### Code formatting

This repo uses [Prettier](https://prettier.io/) to automatically format code to a consistent standard.
It uses the [husky](https://www.npmjs.com/package/husky)
and [lint-staged](https://www.npmjs.com/package/lint-staged) packages to
format staged files whenever you commit.
This means that code pushed to the repo is always formatted to a consistent standard.

You can manually run the formatter with `npm run prettier` if you want.

Prettier is lightly configured in [.prettierrc.yml](/.prettierrc.yml)
and also ignores files using [.prettierignore](/.prettierignore).

### Testing

> This CLI is currently quite simple and doesn't have any unit tests yet

This repo uses [unit tests](https://en.wikipedia.org/wiki/Unit_testing) to ensure that everything is working correctly, guide development, avoid bad code and reduce defects.
The [Jest](https://www.npmjs.com/package/jest) package is used to run unit tests.
Tests are any file in `src/` that ends with `.spec.ts`.
By convention they sit alongside the source code they cover, in a folder called `__tests__`.

```bash
# Run the tests
npm test -s

# Generate code coverage
npm run coverage -s
```

## Deployment

### Building the image

This repo uses [GitLab CI](https://about.gitlab.com/product/continuous-integration/)
to build a Docker image when you push a git tag.
This is designed to be used with the `npm version` command so all docker images are [semantically versioned](https://semver.org/).
The `:latest` docker tag is not used.

This job runs using the [.gitlab-ci.yml](/.gitlab-ci.yml) file which
runs a docker build using the [Dockerfile](/Dockerfile)
and **only** runs when you push a [tag](https://git-scm.com/book/en/v2/Git-Basics-Tagging).

It pushes these docker images to the [GitLab registry](https://openlab.ncl.ac.uk/gitlab/catalyst/trello-scraper/container_registry).
A slight nuance is that it strips a leading `v` from tag names, so the git tag `v1.0.0` becomes the docker tag `1.0.0`.

```bash
# Generate a new release
# -> Generates a new version based on the commits since the last version
# -> Generates the CHANGELOG.md based on those commits
# -> There is a "preversion" script to lint & run tests
npm run release

# Push the new version
# -> The GitLab CI will build a new docker image for it
git push --follow-tags
```
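
The tag formatting can be illustrated with a one-line TypeScript function.
This only demonstrates the rule; the real transformation happens inside the CI job,
and `imageTagFor` is a hypothetical name.

```typescript
// Strips a single leading "v" so a git tag like "v1.0.0" maps to the docker tag "1.0.0".
// Tags without a leading "v" are returned unchanged.
function imageTagFor(gitTag: string): string {
  return gitTag.replace(/^v/, "");
}
```

For example, `imageTagFor("v1.0.0")` gives `"1.0.0"`, while `imageTagFor("1.2.3")` is left as `"1.2.3"`.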

## Future work

- Add automated testing