Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/dosm-malaysia/aksara-back

backend data django open-data python

Last synced: about 2 months ago
JSON representation

Host: GitHub
URL: https://github.com/dosm-malaysia/aksara-back
Owner: dosm-malaysia
License: mit
Created: 2022-06-27T08:57:55.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-03-29T00:31:01.000Z (almost 2 years ago)
Last Synced: 2024-11-30T00:03:43.934Z (about 2 months ago)
Topics: backend, data, django, open-data, python
Language: Python
Homepage: https://open.dosm.gov.my
Size: 7.19 MB
Stars: 25
Watchers: 2
Forks: 4
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: license.md

Awesome Lists containing this project

README

# OpenDOSM-back

The backend API for OpenDOSM that serves data available via [`dosm-malaysia/aksara-data`](https://github.com/dosm-malaysia/aksara-data) to the frontend at [`dosm-malaysia/aksara-front`](https://github.com/dosm-malaysia/aksara-front).
- Django app name: `aksara`
- Database: `postgresql`

## Table of Contents

- [Virtual environment setup](#setup-virtual-environment)
- [Database setup](#setup-db)
- [Run Development Server](#run-development-server)
- [API Endpoints](#api-endpoints)
- [Fetching new data](#fetching-new-data)
- [Private tokens required](#private-tokens-required)
- [How it works](#how-it-works)
- [Meta Jsons version 1.0](#meta-jsons)
- [Catalog Jsons version 1.0](#catalog-jsons)
- [Chart Builders version 1.0](#chart-builders)
- [Contributing](#contributing)

## Setup virtual environment

```bash
-m venv env

# activate and install dependencies
source env/bin/activate
pip install -r requirements.txt
```

## Setup DB

1. Setup a DB server (PostgresSQL DB recommended) and populate your DB instance settings in the `.env` file.
2. Run migrations to setup all tables: `python manage.py migrate`
3. Fetch data from `dosm-malaysia/aksara-data` repo and populate or update the DB: `python manage.py loader UPDATE`
4. To rebuild the DB from scratch: `python manage.py loader REBUILD`

## Run Development Server
`python manage.py runserver`

## API Endpoints
API will be running at `http://127.0.0.1:8000/`

Despite OpenDOSM having multiple dashboards and data variables, there is only 1 endpoint, dedicated to each type, to serve all variables and dashboards.

- `dashboard/`
- Every GET request to this endpoint, will have to be associated with a query parameter, `'dashboard'`.
- This `dashboard` parameter indicates which dashboard API should be returned.
- There can be multiple parameters as long as they are defined in the META Json and individual charts' parameter - this enables the `handle_request` method to execute them accordingly.
- `data-variable/`
- Every get request to this endpoint, will have to be associated with a query parameter, `'id'`.
- The `'id'` represents the unique identifier for each variable, within the data catalog.
- The response from this endpoint, are information, such as chart data, variable explanations, download links, and more.
- `chart/`
- Every get request to this endpoint, will have to be associated with 2 query parameters, `'dashboard'` and `'chart'`.
- There can be multiple parameters as long as they are defined in the META Json and individual charts' parameter.
- This endpoint allows for the ability to request data for individual charts from within a dashboard.
- `update/`
- Every POST request to this endpoint, will trigger an update, to perform a selective update, based on the files changed in the git repository.
- `data-catalog/`
- Every get request to this endpoint, will populate the homepage, with a list of categories, as well as the relevant data variables.

## Fetching new data
As of now, OpenDOSM uses GitHub Actions to trigger a rebuild (when there is new code) or update of data (when there is new data). Alternatively, with your preferred deployment setup, there are 2 ways to update the data.
- Using POST method
- A post request can be sent to the endpoint `https:///update/` , which would by default trigger an update, and populate the DB with the new data.
- Using command line
* Alternatively, if your desired cron / task scheduler runs locally, you could use the command line to trigger an update and populate the DB with the new data.
> python manage.py loader ***category*** ***operation*** ***files***

* `category` - Possible values are either `DASHBOARDS` or `DATA_CATALOG`, depending on whichever you choose to update.
* `operation` - Possible values are either `UPDATE` or `REBUILD`. `UPDATE` will update the datasets, by inserting new rows, or updating existing rows by their ID. Whereas `REBUILD` would delete all rows from the database, and populate from an empty database.
* `files` - When using the `UPDATE` operation, will update specific files of your choice. The values must be a string, concatenated with a `,` dividing each dataset. E.g : `'meta_1,meta_2,meta_3'`. There is no need to add the additional `.json` suffix. This parameter is optional, and if left empty, will update every existing dataset of the chosen category.

## Private tokens required:
- GitHub token - Required for interacting with the GitHub API to pull data from the `aksara-data` repo.
- Telegram ChatID and token - The OpenDOSM backend loader updates status messages to a private Telegram channel for the development team to monitor data pipelines daily. Feel free to comment out any calls to the `trigger` module for the loader to work without Telegram updates.

## How it works
Before explaining the logic of OpenDOSM's backend operations, here is a brief description of the thinking behind the architecture choices:
1. OpenDOSM will only keep expanding as more dashboards are added, so we wanted to ensure that the number of endpoints did not balloon along with it.
2. OpenDOSM will likely include more chart types over time, which may vary in what they require in terms of API structure and business logic.

Therefore, to minimise complexity, it was vital that we used an architecture that could keep the number of endpoints minimal, while simultaneously ensuring that chart data could be served in as versatile a manner as possible. Here are the key ingredients we used to do this:

## META Jsons
Meta Json version 1.0
A META Json contains all information required to build a dashboard. These files can be found within the `dashboard` folder of the `aksara-data` repo. Each one is responsible for either a dashboard, or a dedicated chart.
- Each META Json contains information regarding the required and optional parameters for the API, and the list of charts needed.
- Each chart key contains multiple sub-keys representing chart names.
- Each sub-key contains all information required to fully specify a chart, such as parameters, chart types, API types, and variables.
- The variables and `chart_source` keys contain data which is fed to the chart builders, which serves the final data.

## Catalog Jsons
Catalog Json version 1.0
A Catalog Json contains all information required to build a data variable, or a set of data-variables. These files can be found within the `catalog` forlder of the `aksara-data` repo. Each one is responsible for 1 or more data variables.
- Each Catalog Json contains information regarding the chart type of the variable, the explanations of the variables, as well as the file source of the data.
- The Catalog JSON can be divided into 2 essential parts :
- File
- The `file` key contains the description of the dataset, as well as the source of the data set.
- Within the `file` key, there is a key named `variables`. This key, is an array of objects, containing the id, name, title, and description of each data-variable within the dataset.
- The following are the conventions of the variable's id :
1. `-1` : Indicates a variable should be used in the explanation, but has no dedicated chart / page for itself.
2. `0` : Indicates the variables excluded for a data-variable of the table chart.
3. `>= 1` : Indicates the variables included within the dataset, AND has a dedicated chart / page for itself.
- Catalog Data
- the `catalog_data` key, contains an array of objects, which represents the meta-data, chart variables and chart filters, required to build the respective data-variable page. These id in these variables will also correspond to the id within the `variables` key, nested in the `file` key.

## Chart Builders
Chart Builders version 1.0
Critical design principle: All charts of the same type should be built using only 1 method. For instance, all time series charts on the site should (as far as possible) be handled by the same code unit.
- Within the `utils` folder in the project, the `chart_builder` file contains several methods for various charts, such as heatmaps, bar charts, choropleths, etc.
- As long as sufficient information is supplied (by the META Json), the chart builder can dynamically format a chart's api, thus requiring minimal to no 'hard-coding'. This increases versatility.
- Variable structures and data-types, of each chart type can be viewed in the `variable_structures.py` file, within the `utils` folder.

## Contributing
Pull requests are welcome. For any pull request, please make sure to either create a new branch, or submit a pull request to the `dev` branch. Please make sure to briefly describe the changes introduced in the pull request.