https://github.com/kanedata/find-that-charity
Reconciliation for UK Charities and other nonprofit organisations, with elasticsearch back end.
- Host: GitHub
- URL: https://github.com/kanedata/find-that-charity
- Owner: kanedata
- License: mit
- Created: 2017-06-05T09:27:15.000Z (over 7 years ago)
- Default Branch: main
- Last Pushed: 2024-08-22T10:19:32.000Z (4 months ago)
- Last Synced: 2024-08-28T18:14:16.829Z (4 months ago)
- Topics: charity, charity-commission, elasticsearch, python-django, reconciliation, reconciliation-api, scottish-charities
- Language: Python
- Homepage: https://findthatcharity.uk/
- Size: 10.5 MB
- Stars: 28
- Watchers: 7
- Forks: 9
- Open Issues: 47
Metadata Files:
- Readme: readme.md
- License: LICENSE
Awesome Lists containing this project
- awesome-techforgood - Find That Charity - search engine for UK charities, with elasticsearch back end. (Awesome tech for good projects)
README
# Find that charity
Elasticsearch-powered search engine for looking for charities and other non-profit organisations. Allows for:
- importing data from nearly 20 sources in the UK, ensuring that duplicates
are matched to one record.
- An elasticsearch index that can be queried.
- [Org-ids](http://org-id.guide/about) are added to organisations.
- Reconciliation API for searching organisations, based on an optimised search query.
- Facility for uploading a CSV of charity names and adding a best-guess
charity number.
- HTML pages for searching for a charity

## Installation
1. [Clone repository](https://github.com/kanedata/find-that-charity)
2. Create virtual environment (`python -m venv env`)
3. Activate virtual environment (`source env/bin/activate`, or `env\Scripts\activate` on Windows)
4. Install requirements (`pip install -r requirements.txt`)
5. [Install postgres](https://www.postgresql.org/download/)
6. Start postgres
7. Create 2 postgres databases - one for admin (eg `ftc_admin`) and one for data (eg `ftc_data`)
8. [Install elasticsearch 7](https://www.elastic.co/guide/en/elasticsearch/reference/current/_installation.html) - you may need to increase available memory (see below)
9. Start elasticsearch
10. Create `.env` file in root directory. Contents based on `.env.example`.
11. Create the database tables (`python ./manage.py migrate --database=data && python ./manage.py migrate --database=admin && python ./manage.py createcachetable --database=admin`)
12. Import data on charities (`python ./manage.py import_charities`)
13. Import data on nonprofit companies (`python ./manage.py import_ch`)
14. Import data on other non-profit organisations (`python ./manage.py import_all`)
15. Add organisations to elasticsearch index (`python ./manage.py es_index`) - (Don't use the default `search_index` command as this won't set up aliases correctly)

## Dokku Installation
### 1. Set up dokku server
SSH into server and run:
```bash
# create app
dokku apps:create ftc

# postgres
sudo dokku plugin:install https://github.com/dokku/dokku-postgres.git postgres
dokku postgres:create ftc-db-data
dokku postgres:link ftc-db-data ftc --alias "DATABASE_URL"
dokku postgres:create ftc-db-admin
dokku postgres:link ftc-db-admin ftc --alias "DATABASE_ADMIN_URL"

# elasticsearch
sudo dokku plugin:install https://github.com/dokku/dokku-elasticsearch.git elasticsearch
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf; sudo sysctl -p
export ELASTICSEARCH_IMAGE="elasticsearch"
export ELASTICSEARCH_IMAGE_VERSION="7.7.1"
dokku elasticsearch:create ftc-es
dokku elasticsearch:link ftc-es ftc
# configure elasticsearch 7:
# https://github.com/dokku/dokku-elasticsearch/issues/72#issuecomment-510771763

# setup elasticsearch increased memory (might be needed)
nano /var/lib/dokku/services/elasticsearch/ftc-es/config/jvm.options
# replace `-Xms512m` with `-Xms2g`
# replace `-Xmx512m` with `-Xmx2g`
# restart elasticsearch
dokku elasticsearch:restart ftc-es

# Redirect
dokku plugin:install https://github.com/dokku/dokku-redirect.git
dokku redirect:set ftc www.findthatcharity.uk findthatcharity.uk

# SSL
sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
dokku letsencrypt:set ftc email [email protected]
dokku letsencrypt:enable ftc
dokku letsencrypt:cron-job --add
```

### 2. Add as a git remote and push
On local machine:
```bash
git remote add dokku dokku@SERVER_HOST:ftc
git push dokku main
```

### 3. Setup and run import
On Dokku server run:
```bash
# setup
dokku run ftc python ./manage.py migrate --database=data
dokku run ftc python ./manage.py migrate --database=admin
dokku run ftc python ./manage.py createcachetable --database=admin

# run import
dokku run ftc python ./manage.py charity_setup
dokku run ftc python ./manage.py import_oscr
dokku run ftc python ./manage.py import_charities
dokku run ftc python ./manage.py import_ch
dokku run ftc python ./manage.py import_other_data
dokku run ftc python ./manage.py import_all
dokku run ftc python ./manage.py es_index
```

## Server
The server uses [Django](https://www.djangoproject.com/). Run it with the
following command:

`python ./manage.py runserver`

The server offers the following API endpoints:
- `/reconcile`: a [reconciliation service API](https://github.com/OpenRefine/OpenRefine/wiki/Reconciliation-Service-API)
conforming to the OpenRefine reconciliation API specification.
- `/charity/12345`: Look up information about a particular charity
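
As a rough illustration of how a client might talk to `/reconcile`, the sketch below builds a batch query in the shape the OpenRefine reconciliation API describes (a `queries` form parameter holding a JSON object keyed by query id) and picks the top-scoring candidate from a response. The helper names (`build_queries`, `encode_form`, `best_match`, `send`), the `min_score` cut-off and the choice of `BASE_URL` are assumptions made for this example, not part of the project; check the response shape against your own instance before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the public site listed as the project homepage.
# Point this at your own host (e.g. a local runserver) if self-hosting.
BASE_URL = "https://findthatcharity.uk"


def build_queries(names, limit=3):
    """Build the batch query object: one entry per name, keyed q0, q1, ..."""
    return {f"q{i}": {"query": name, "limit": limit} for i, name in enumerate(names)}


def encode_form(queries):
    """Encode the batch as a form body with a single `queries` parameter."""
    return urlencode({"queries": json.dumps(queries)})


def best_match(response, key, min_score=0.0):
    """Return the highest-scoring candidate for one query id, or None.

    `min_score` is a hypothetical threshold; a suitable value depends on
    how the service scores its results.
    """
    candidates = response.get(key, {}).get("result", [])
    candidates = [c for c in candidates if c.get("score", 0) >= min_score]
    return max(candidates, key=lambda c: c.get("score", 0), default=None)


def send(queries):
    """POST the batch to the reconciliation endpoint and decode the JSON reply."""
    request = Request(
        f"{BASE_URL}/reconcile",
        data=encode_form(queries).encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

A caller would typically run `response = send(build_queries(["Some Charity Name"]))` and then `best_match(response, "q0")`, treating `None` as "no confident match".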
## Todo
Priorities:
- tests for ensuring data is correctly imported
- server tests
- use results of `server/recon_test.py` to produce the best reconciliation
search query for use in the server (`recon_test_7` seems the best at the moment)
- threshold for when to use the result vs discard

Future development:
- upload a CSV file and reconcile each row with a charity
- allow updating a charity with additional possible names

## Testing
```sh
coverage run -m pytest && coverage html
python -m http.server -d htmlcov --bind 127.0.0.1 8001
```