Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/prx/castle.prx.org
Here, let me carry those metrics for you
Last synced: about 1 month ago
JSON representation
- Host: GitHub
- URL: https://github.com/prx/castle.prx.org
- Owner: PRX
- License: agpl-3.0
- Created: 2017-03-21T16:58:02.000Z (almost 8 years ago)
- Default Branch: main
- Last Pushed: 2024-07-30T19:21:35.000Z (5 months ago)
- Last Synced: 2024-07-30T23:47:08.629Z (5 months ago)
- Language: Elixir
- Homepage: https://castle.prx.org
- Size: 946 KB
- Stars: 0
- Watchers: 10
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Castle.prx.org
## Description
[Phoenix App](http://www.phoenixframework.org) providing an API to PRX metrics in BigQuery.
This project follows the [standards for PRX services](https://github.com/PRX/meta.prx.org/wiki/Project-Standards#services).
## Installation
### Local
To get started, make sure you have completed the [Phoenix install guide](https://hexdocs.pm/phoenix/installation.html#content). Then:
```
# Get the code
git clone git@github.com:PRX/castle.prx.org.git

# Install dependencies
mix deps.get

# Configure your environment (you'll need a bigquery table and service account)
cp env-example .env
vi .env
```

### Docker
Currently on OSX, [Dinghy](https://github.com/codekitchen/dinghy) is probably
the best way to set up your dev environment. Using VirtualBox is recommended.
Also be sure to install `docker-compose` along with the toolbox.

This project is set up primarily to build a `MIX_ENV=prod` docker image. To avoid
recompiling dependencies every time you run a non-prod docker-compose command,
mount some local directories to mask the build/deps directories in the image.

```
docker-compose build

# mount dev dependencies locally
mkdir _build_docker_compose deps_docker_compose
docker-compose run castle compile

# now you can run a local server
docker-compose up
open http://castle.prx.docker

# or run the tests
docker-compose run castle test
docker-compose run castle test --include external

# or run a single test
docker-compose run castle test test/controllers/api/root_controller_test.exs
```

## Dependencies
BigQuery, Redis, and Postgres. But if you use docker-compose, you'll only need
to configure BigQuery.

## Usage
```
# Start the phoenix server
mix phx.server

# Or run interactively
iex -S mix phx.server

# Or just get a console
iex -S mix
```

### Authorization
By default, Castle will restrict what podcasts you can see based on the
account-ids granted to you by [ID](https://github.com/PRX/id.prx.org). If you
want to impersonate other accounts, just set `DEV_AUTH=123,456,789` in your
ENV to grant you access to that comma-separated list of account ids. You can
also set `DEV_AUTH=*` to allow access to all accounts.

Note that the `DEV_AUTH` ENV does not work at all in production environments.
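For illustration only, a comma-separated `DEV_AUTH` value could be interpreted along these lines (the module and function names below are hypothetical, not Castle's actual code):

```elixir
# Illustrative sketch only; Castle's real authorization code may differ.
defmodule DevAuthSketch do
  def allowed_accounts do
    case System.get_env("DEV_AUTH") do
      nil -> []   # no override: fall back to the accounts granted by ID
      "*" -> :all # wildcard: every account is visible
      ids -> ids |> String.split(",") |> Enum.map(&String.to_integer/1)
    end
  end
end
```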
## Tasks
Background worker tasks are configured to run on a cron, in `config/prod.exs`.
By default, these are commented out in `dev.exs`, so you'll need to run them
manually or uncomment that line. Generally, these tasks run with a `--lock`
flag, which uses a redis lock to prevent multiple prod instances from doing the
same work at the same time.
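As an illustration of the locking idea, a `SET ... NX EX` call via [Redix](https://github.com/whatyouhide/redix) acquires a key only if no other instance holds it; the key name and TTL below are assumptions, not Castle's actual values:

```elixir
# Hypothetical redis lock sketch; key name and TTL are assumptions.
{:ok, conn} = Redix.start_link(host: "localhost")

# NX: only set the key if it is absent; EX 300: auto-expire after 5 minutes
# so a crashed worker cannot hold the lock forever.
case Redix.command!(conn, ["SET", "castle:lock:feeder.sync", "1", "NX", "EX", "300"]) do
  "OK" -> :acquired # safe to run the task
  nil -> :skipped   # another instance holds the lock
end
```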
### Feeder Sync
Sync all podcasts/episodes from `FEEDER_HOST` into your local Postgres database.
By default, it will only go through a few pages of results at a time before
returning. Use `--all` to process all pages (which might take a long time for
all episodes in Feeder). Similarly, the `--force` flag will sync all
podcasts/episodes since the beginning of time, and can take a long time.

```
mix feeder.sync [--lock,--all,--force]
```

### BigQuery Sync
Sync all podcasts/episodes from your local Postgres database back to BigQuery.
Currently, this replaces the entire table in BigQuery, but someday we may want
a more progressive sync process.

```
mix bigquery.sync.podcasts [--lock]
mix bigquery.sync.episodes [--lock]
mix bigquery.sync.agentnames [--lock]
mix bigquery.sync.geonames [--lock]
```

### Downloads Rollup
These tasks query BigQuery for `dt_downloads` on a single day and insert that
day of data into Postgres. They also update the `rollup_logs` table to keep
track of which days have already been rolled up, marking a day as "complete"
if it is in the past.

The only exception is the `monthly_downloads` table, which is calculated from
`hourly_downloads` to provide more efficient access to podcast/episode "total"
downloads, without having to scan every partition of hourly data.
By default, most tasks will find 5 incomplete days (not present in
`rollup_logs`) and process those. You can change that number with the
`--count 99` flag, or explicitly roll up a certain day with `--date 20180425`.
Rollup operations are idempotent; you can run them repeatedly for the same
day/month (see the upsert sketch after the task list below).

These are all the rollup tasks available:
```
mix castle.rollup.hourly [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.monthly [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.geocountries [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.geometros [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.geosubdivs [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.agents [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.weekly_uniques [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.monthly_uniques [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.last_week_uniques [--lock,--date [YYYYMMDD],--count [INT]]
mix castle.rollup.last_28_uniques [--lock,--date [YYYYMMDD],--count [INT]]
```
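As referenced above, one way to keep a rollup idempotent is an upsert that replaces any rows already stored for that day. A minimal sketch, where the repo module, table, and column names are all assumptions:

```elixir
# Hypothetical upsert: re-running the same day replaces the existing rows
# instead of duplicating them. All names here are assumptions.
Castle.Repo.insert_all(
  "hourly_downloads",
  rows_for_the_day,
  on_conflict: :replace_all,
  conflict_target: [:podcast_id, :episode_id, :dtim]
)
```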
### BigQuery Migrations
Changes to the BigQuery table structures are defined in `priv/big_query/migrations/*.exs`. These are
**NOT** run automatically on deploy, and will need to be run locally.

To run or roll back migrations:
1. Change your local `.env` to have a `BQ_PRIVATE_KEY` that can make schema changes.
2. Change your local `.env` to have the `BQ_DATASET` you want to change. **ALWAYS** try out
your changes in development/staging before production.
3. Run `mix bigquery.migrate` to run a single migration at a time.
- You'll be prompted many times to double-check you know what you're doing.
- Watch the output, and double-check the changes it made to your schema.
- Alternatively, `mix bigquery.rollback` rolls back a single migration.
- Note that BigQuery will eventually throw `was recently deleted` errors if you keep adding
and removing the same column names.

To add a new migration, just do something like:
```bash
touch "priv/big_query/migrations/$(date -u +"%Y%m%d%H%M%S")_make_a_change.exs"
```
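The shape of a migration file is only guessed at below (the module and function names are hypothetical), so mirror an existing file in `priv/big_query/migrations/` rather than trusting this sketch:

```elixir
# Hypothetical migration shape; check an existing file in
# priv/big_query/migrations/ for the real conventions.
defmodule MakeAChange do
  def up do
    # apply the schema change, e.g. add a column to dt_downloads
  end

  def down do
    # reverse the change, for `mix bigquery.rollback`
  end
end
```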
## Testing
```
# Run all the tests
mix test

# Run a specific test
mix test test/big_query/base/http_test.exs

# Include external dependency tests (requires a valid .env)
mix test --include external
```

## Scripts
The `/scripts/` directory contains some useful utilities for loading/reloading
third-party data (Geolite, User-Agents, etc). These are not intended to be run
often, so buyer beware.

You will need to install ruby and some gems to get the scripts to work.
These are managed with `.ruby-version` and `Gemfile`; install them as follows:
```
# install the ruby specified in /scripts/.ruby-version
rbenv install

# install the necessary gems
bundle install
```

You'll also need to [set up a Google API key](https://support.google.com/googleapi/answer/6158862).
Create a Service Account key with write access to the project/tables you want to
alter, and save it to `/scripts/.credentials.json`.

## License
[AGPL License](https://www.gnu.org/licenses/agpl-3.0.html)
## Contributing
Completing a Contributor License Agreement (CLA) is required for PRs to be accepted.