https://github.com/ckan/ckanext-report
CKAN report infrastructure
- Host: GitHub
- URL: https://github.com/ckan/ckanext-report
- Owner: ckan
- License: other
- Created: 2014-05-22T11:24:29.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2023-03-31T18:17:17.000Z (almost 3 years ago)
- Last Synced: 2025-04-04T17:51:46.607Z (9 months ago)
- Language: Python
- Size: 1.22 MB
- Stars: 18
- Watchers: 7
- Forks: 36
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# ckanext-report
ckanext-report is a CKAN extension that provides a reporting infrastructure. Here are the features offered:
* All available reports are listed on a central web page and from the command-line.
* Breadcrumbs allow navigation from a report back to the reports page.
* Reports are served as a web page, JSON or CSV from a cache.
* The reports can be run in a nightly batch and saved to the cache.
* Admins can regenerate reports from the report's web page.
A number of extensions currently offer reports that rely on this extension, e.g. [ckanext-archiver](https://github.com/ckan/ckanext-archiver/blob/master/ckanext/archiver/reports.py), [ckanext-qa](https://github.com/ckan/ckanext-qa/blob/master/ckanext/qa/reports.py), [ckanext-dgu](https://github.com/datagovuk/ckanext-dgu/blob/master/ckanext/dgu/lib/reports.py).
TODO:
* Stop a report from being generated multiple times in parallel (unnecessary waste) - use a queue?
* Stop more than one report being generated in parallel (high load for the server) - maybe use a queue.
## Compatibility:
| CKAN version | Compatibility |
| --------------- | ------------------- |
| 2.6 and earlier | yes |
| 2.7 | yes |
| 2.8 | yes |
| 2.9-py2 | yes |
| 2.9 (py3) | yes |
| 2.10 (py3) | yes |
Status: was in production at data.gov.uk around 2014-2016. Because that site uses its own CSS rather than core CKAN's, CSS needs to be added before others can use it. For an example, see this branch: https://github.com/GSA/ckanext-report/tree/geoversion
Author(s): David Read and contributors
## Install & setup
Install ckanext-report into your CKAN virtual environment in the usual way:

    (pyenv) $ pip install -e git+https://github.com/ckan/ckanext-report.git#egg=ckanext-report

Initialize the database tables needed by ckanext-report.

CKAN < 2.9:

    (pyenv) $ paster --plugin=ckanext-report report initdb --config=mysite.ini

CKAN >= 2.9:

    (pyenv) $ ckan -c mysite.ini report initdb

Enable the plugin. In your config (e.g. development.ini or production.ini) add ``report`` to your ckan.plugins. e.g.:

    ckan.plugins = report

## Command-line interface
The following operations can be run from the command line using the ``paster --plugin=ckanext-report report`` (CKAN < 2.9) or ``ckan report`` (CKAN >= 2.9) commands:
```
report list
- lists the reports
report generate [report1,report2,...]
- generate the specified reports, or all of them if none specified
```
Get the list of reports:

    (pyenv) $ paster --plugin=ckanext-report report list --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report list

Generate all reports:

    (pyenv) $ paster --plugin=ckanext-report report generate --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report generate

Generate a single report (replace `REPORT_NAME` with a name shown by `report list`):

    (pyenv) $ paster --plugin=ckanext-report report generate REPORT_NAME --config=mysite.ini
    (pyenv) $ ckan --config=mysite.ini report generate REPORT_NAME

## Demo report - Tagless Datasets
ckanext-report includes a simple demonstration report, which you can enable by adding `tagless_report` to the `ckan.plugins` list in your ckan.ini. Once you've restarted CKAN (paster or whichever web server you use), the report should be listed on the web page at: `/report`.
## Dataset Notes
Reports that examine datasets include a column 'Dataset Notes', designed to show custom properties of the datasets. There are often key properties that you want to show, such as whether a dataset is private or harvested, but the column is configurable because every CKAN install is different. To configure its contents, put a Python expression in the CKAN config option `ckanext-report.notes.dataset`.
For example at data.gov.uk we flag up if a dataset is 'unpublished', has been harvested or was imported from ONSHUB:
```
ckanext-report.notes.dataset = ' '.join(('Unpublished' if asbool(pkg.extras.get('unpublished')) else '', 'UKLP' if asbool(pkg.extras.get('UKLP')) else '', 'National Statistics Pub Hub' if pkg.extras.get('external_reference')=='ONSHUB' else ''))
```
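To get a feel for what such an expression produces, here is a minimal standalone sketch. It assumes the configured expression is evaluated as ordinary Python with `pkg` and `asbool` available by name; `FakePackage`, the simplified `asbool` and `dataset_notes` below are hypothetical stand-ins written for this illustration, not CKAN or ckanext-report objects.

```python
# Standalone sketch: nothing here imports CKAN.

def asbool(value):
    """Simplified stand-in for a config-style truthiness helper."""
    return str(value).strip().lower() in ('true', 'yes', 'on', '1')


class FakePackage(object):
    """Hypothetical package exposing only the `extras` dict used below."""
    def __init__(self, extras):
        self.extras = extras


# The same kind of expression you would put in ckanext-report.notes.dataset
NOTES_EXPRESSION = (
    "' '.join(('Unpublished' if asbool(pkg.extras.get('unpublished')) else '', "
    "'UKLP' if asbool(pkg.extras.get('UKLP')) else ''))"
)


def dataset_notes(pkg, expression=NOTES_EXPRESSION):
    # Assumption: the expression sees `pkg` and `asbool` as plain names.
    return eval(expression, {'asbool': asbool}, {'pkg': pkg})


harvested = FakePackage({'UKLP': 'True'})
unpublished = FakePackage({'unpublished': 'true', 'UKLP': 'true'})

print(repr(dataset_notes(harvested)))
print(repr(dataset_notes(unpublished)))
```

Because the parts are joined with a space, a flag that does not apply still contributes an empty string, so the result can carry a stray leading or trailing space.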
# Creating a Report
A report has three key elements:
1. Report Code - a python function that produces the report.
2. Template - HTML for displaying the report data.
3. Registration - containing the configuration of the report.
The examples below are taken from the worked example "Tagless datasets" in this repository - see above for how to run this demo.
## Report Code
The code that produces the report will probably make some calls to the logic layer or database, assemble the data into dicts/lists and then return them. The result is saved as JSON in the `data_cache` database table.
The returned data should be a dict like this:
```javascript
{
    'table': [
        {'name': 'river-levels', 'title': 'River levels', 'notes': 'Harvested', 'user': 'bob', 'created': '2008-06-13T10:24:59.435631'},
        {'name': 'co2-monthly', 'title': 'CO2 monthly', 'notes': '', 'user': 'bob', 'created': '2009-12-14T08:42:45.473827'},
    ],
    'num_packages': 56,
    'packages_without_tags_percent': 4,
    'average_tags_per_package': 3.5,
}
```
There should be a `table` with the main body of the data, and any other totals or incidental pieces of data.
Note: the table is required because of the CSV download facility, and CSV demands a table. (The CSV download only includes the table, ignoring any other values in the data.) Although the data has to essentially be stored as a table, you do have the option to display it differently in the web page by using a clever template.
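The "table only" behaviour of the CSV download can be pictured with a short sketch. This is not the extension's own CSV code; `table_to_csv` is a hypothetical helper that assumes every row has the same keys as the first row.

```python
import csv
import io


def table_to_csv(report_data):
    """Serialize only report_data['table']; all other keys are ignored."""
    rows = report_data['table']
    if not rows:
        return ''
    # Assumption: the first row's keys define the CSV columns.
    fieldnames = list(rows[0].keys())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator='\n')
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


report_data = {
    'table': [
        {'name': 'river-levels', 'title': 'River levels', 'user': 'bob'},
        {'name': 'co2-monthly', 'title': 'CO2 monthly', 'user': 'bob'},
    ],
    'num_packages': 56,  # incidental value, dropped from the CSV
}
csv_text = table_to_csv(report_data)
print(csv_text)
```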
Dates should be returned as an ISO format string.
The convention is to put the report code in: `ckanext//reports.py`
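As a rough illustration of the shape described above, the sketch below builds report data from plain dicts. It is not the repository's tagless report: `tagless_report_sketch` and the input format are hypothetical, and a real report would query CKAN and would presumably also receive the report's options, both of which are left out here.

```python
from datetime import datetime


def tagless_report_sketch(packages):
    """Build report data from plain dicts (hypothetical input shape)."""
    table = [
        {
            'name': pkg['name'],
            'title': pkg['title'],
            'user': pkg['user'],
            # Dates go into the result as ISO format strings
            'created': pkg['created'].isoformat(),
        }
        for pkg in packages
        if not pkg['tags']
    ]
    num_packages = len(packages)
    total_tags = sum(len(pkg['tags']) for pkg in packages)
    return {
        'table': table,
        'num_packages': num_packages,
        'packages_without_tags_percent':
            100.0 * len(table) / num_packages if num_packages else 0,
        'average_tags_per_package':
            float(total_tags) / num_packages if num_packages else 0,
    }


packages = [
    {'name': 'river-levels', 'title': 'River levels', 'user': 'bob',
     'created': datetime(2008, 6, 13, 10, 24, 59), 'tags': []},
    {'name': 'co2-monthly', 'title': 'CO2 monthly', 'user': 'bob',
     'created': datetime(2009, 12, 14, 8, 42, 45), 'tags': ['climate', 'co2']},
]
report = tagless_report_sketch(packages)
print(report)
```

Everything in the returned dict is a string, number, list or dict, so it can be saved as JSON without further conversion.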
## Template
When you view a report, ckanext-report will automatically show the title, options, the CSV/JSON download buttons and for the administrator a 'refresh' button. Everything below that, the display of the data itself, is the job of the report template.
The report template will probably display the incidental data and then the table:
```html
{#
Report (snippet)
table - main data, as a list of rows, each row is a dict
data - other data values, as a dict
#}
{% set ckan_29_or_higher = h.ckan_version().split('.')[1] | int >= 9 %}
{% set dataset_read_route = 'dataset.read' if ckan_29_or_higher else 'dataset_read' %}
<ul>
  <li>Datasets without tags: {{ table|length }} / {{ data['num_packages'] }} ({{ data['packages_without_tags_percent'] }})</li>
  <li>Average tags per package: {{ data['average_tags_per_package'] }} tags</li>
</ul>
<table>
  <thead>
    <tr>
      <th>Dataset</th>
      <th>Notes</th>
      <th>User</th>
      <th>Created</th>
    </tr>
  </thead>
  <tbody>
    {% for row in table %}
    <tr>
      <td><a href="{{ h.url_for(dataset_read_route, id=row.name) }}">{{ row.title }}</a></td>
      <td>{{ row.notes }}</td>
      <td>{{ h.linked_user(row.user) }}</td>
      <td>{{ h.render_datetime(row.created) }}</td>
    </tr>
    {% endfor %}
  </tbody>
</table>
```
The convention is to put the report templates in: `ckanext//templates/report/.html`
Note: ckanext-report has not yet been styled for the core CKAN templates, because the author uses custom templates. Feel free to add styling.
## Option templates
Each option needs a template snippet containing the widget for the user to change the value. The snippet for selecting the 'organization' is already included, and you can define others.
* It must be located at: `report/option_.html`
* It is passed values:
    * value - Value of this option
    * default - Default value for this option
* The widget should be a field, with name: `option-` that returns the value for the option when the form is submitted.
* If it is missing then the report page falls back to just showing the option value (i.e. read-only).
Here's an example of a checkbox for whether to 'include sub-organizations' from `ckanext/report/templates/report/option_include_sub_organizations.html`:
```
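{#
Option snippet - organization
value - Value of this option
#}
<input type="checkbox" name="option-include_sub_organizations" {% if value %}checked="checked"{% endif %} />
Include results from sub-organizations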
```
## Registration
Register your report with ckanext-report with the IReport plugin and supply its configuration.
Your extension will probably have a file `plugin.py` defining plugins - classes which inherit from `p.SingletonPlugin`. Make a plugin implement IReport, based on this example plugin.py:
```python
import ckan.plugins as p
from ckanext.report.interfaces import IReport


class TaglessReportPlugin(p.SingletonPlugin):
    p.implements(IReport)

    # IReport

    def register_reports(self):
        # Explicit relative import: works on Python 3 as well as Python 2
        from . import reports
        return [reports.tagless_report_info]
```
The last line refers to `tagless_report_info`, which is a dictionary with properties of the report. This is stored in `reports.py` together with the report code (see above). The info dict looks like this:
```python
from collections import OrderedDict

tagless_report_info = {
    'name': 'tagless-datasets',
    'description': 'Datasets which have no tags.',
    'option_defaults': OrderedDict((('organization', None),
                                    ('include_sub_organizations', False),
                                    )),
    'option_combinations': tagless_report_option_combinations,
    'generate': tagless_report,
    'template': 'report/tagless.html',
}
```
Info dict spec:
* name - forms part of the URL
* title (optional) - this is the report title as it is displayed. Defaults to name, capitalized and with dashes changed to spaces.
* description (optional) - this is displayed in the report list page and on the report page.
* generate - function returning the report data
* template - filepath of the report HTML template
* option_defaults - dict of ALL option names and their default values. Use an OrderedDict (`collections.OrderedDict`, as in the example above, or `ckan.common.OrderedDict`). If there are no options, you can set this to None.
* option_combinations - function returning a list of all the options combinations (reports for these combinations are generated by default). If there are no options, return None.
* authorize (optional) - a function that says if the user is allowed to view the report. Takes params: (user_object, options_dict) and should return a boolean - if the user is authorized or not. The default is that anyone can see reports.
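As an illustration of the optional `authorize` entry, here is a minimal sketch of a function with the `(user_object, options_dict)` signature described above. The policy itself is made up for this example: it assumes anonymous visitors are passed as `None`, that the user object has a `sysadmin` attribute, and it invents an `org_names` attribute listing the organizations a user belongs to.

```python
from types import SimpleNamespace


def sysadmin_or_own_org(user, options):
    """Hypothetical authorize function: (user_object, options_dict) -> bool."""
    if user is None:
        # Assumption: anonymous visitors arrive as None
        return False
    if getattr(user, 'sysadmin', False):
        return True
    # Assumption: `org_names` is an invented attribute, not a CKAN field
    organization = options.get('organization')
    return (organization is not None
            and organization in getattr(user, 'org_names', ()))


# Stand-in user objects for trying the function outside CKAN
admin = SimpleNamespace(sysadmin=True, org_names=[])
editor = SimpleNamespace(sysadmin=False, org_names=['cabinet-office'])

print(sysadmin_or_own_org(admin, {'organization': None}))
print(sysadmin_or_own_org(editor, {'organization': 'cabinet-office'}))
print(sysadmin_or_own_org(editor, {'organization': None}))
```

Such a function would be supplied as `'authorize': sysadmin_or_own_org` in the info dict.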
Finally we need to define the function that returns the option_combinations:
```python
from ckanext.report import lib  # helper module shipped with ckanext-report


def tagless_report_option_combinations():
    for organization in lib.all_organizations(include_none=True):
        for include_sub_organizations in (False, True):
            yield {'organization': organization,
                   'include_sub_organizations': include_sub_organizations}
```
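To see what such a generator yields without a CKAN database, the sketch below replaces the organization lookup with a hard-coded list. The organization names are invented, `option_combinations_sketch` is a hypothetical stand-in, and `None` is assumed to mean "no particular organization".

```python
from itertools import product


def option_combinations_sketch(organizations):
    """Yield one options dict per (organization, include_sub_organizations) pair."""
    for organization, include_sub in product(organizations, (False, True)):
        yield {'organization': organization,
               'include_sub_organizations': include_sub}


# Invented organization names; None is assumed to mean "no particular organization"
organizations = [None, 'cabinet-office', 'environment-agency']
combinations = list(option_combinations_sketch(organizations))

for options in combinations:
    print(options)
```

The number of combinations is the product of the choices for each option (here 3 x 2 = 6), and since a report is generated for each combination by default, the batch run grows accordingly.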
## Translations
To translate the plugin into a new language (e.g. "pl"), run `python setup.py init_catalog -l pl`.
To update the template file with new translatable strings added in the code or templates,
run `python setup.py extract_messages` in the root plugin directory. Then run
`./ckanext/report/i18n/unique_pot.sh -v` to strip core CKAN's translations.
To update the translation files for locale "pl" from the new template, run `python setup.py update_catalog -l pl`.