https://github.com/geoadmin/service-search-wsgi
Web service for sphinx-search. Managed by geoadmin/infra-terraform-github-bgdi
- Host: GitHub
- URL: https://github.com/geoadmin/service-search-wsgi
- Owner: geoadmin
- License: bsd-3-clause
- Created: 2021-11-15T15:31:24.000Z (about 3 years ago)
- Default Branch: develop
- Last Pushed: 2024-04-12T12:05:50.000Z (8 months ago)
- Last Synced: 2024-04-16T02:00:13.676Z (8 months ago)
- Topics: flask-application, sphinxsearch
- Language: Python
- Homepage:
- Size: 379 KB
- Stars: 0
- Watchers: 15
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# service-search-wsgi
| Branch | Status |
| ------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| develop | ![Build Status](https://codebuild.eu-central-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoiUDZNMlVLR3d5bUhsTUF3ZEo3RTRPdDFKdS90czR4ZE5vYmNjTXhtK2tzNGlOckNXb29yaE1DNktwVXFJSVpMdExEVWYzZHA5U1drcmdsTE5BU3lJWDBJPSIsIml2UGFyYW1ldGVyU3BlYyI6IjM2YlhQR1ltcEtlTU16WC8iLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=develop) |
| master | ![Build Status](https://codebuild.eu-central-1.amazonaws.com/badges?uuid=eyJlbmNyeXB0ZWREYXRhIjoiUDZNMlVLR3d5bUhsTUF3ZEo3RTRPdDFKdS90czR4ZE5vYmNjTXhtK2tzNGlOckNXb29yaE1DNktwVXFJSVpMdExEVWYzZHA5U1drcmdsTE5BU3lJWDBJPSIsIml2UGFyYW1ldGVyU3BlYyI6IjM2YlhQR1ltcEtlTU16WC8iLCJtYXRlcmlhbFNldFNlcmlhbCI6MX0%3D&branch=master) |

## Table of content
- [Table of content](#table-of-content)
- [Description](#description)
- [Versioning](#versioning)
- [Local Development](#local-development)
- [Make Dependencies](#make-dependencies)
- [Setting up to work](#setting-up-to-work)
- [Database access](#database-access)
- [Linting and formatting your work](#linting-and-formatting-your-work)
- [Test your work](#test-your-work)
- [Docker](#docker)
- [Deployment](#deployment)
- [Deployment configuration](#deployment-configuration)

## Description

This is the `SearchServer` service from mf-chsdi3. How to query the service is currently described at
[api3.geo.admin.ch/search](https://api3.geo.admin.ch/services/sdiservices.html#search); that documentation will have to be migrated to this repository in some form. The service is a simple Flask application that queries a [Sphinx Search](http://sphinxsearch.com/docs/current.html) server. The currently supported Sphinx Search server version is v2.2.11.

## Versioning

This service uses [SemVer](https://semver.org/) as its versioning scheme. Versioning is handled automatically by the `.github/workflows/main.yml` file.
See also [Git Flow - Versioning](https://github.com/geoadmin/doc-guidelines/blob/master/GIT_FLOW.md#versioning) for more information on the versioning guidelines.
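
As an illustration of the scheme, the following sketch parses a version string into its SemVer parts. The regular expression is a simplified subset of the one suggested by the SemVer specification (build metadata is not handled), and `parse_version` is a hypothetical helper, not part of this repository.

```python
import re

# Simplified SemVer pattern: MAJOR.MINOR.PATCH with an optional pre-release part.
# Build metadata ("+...") is deliberately not handled in this sketch.
SEMVER_RE = re.compile(
    r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)"
    r"(?:-(?P<prerelease>[0-9A-Za-z.-]+))?$"
)


def parse_version(version):
    """Return (major, minor, patch, prerelease) or raise ValueError."""
    match = SEMVER_RE.match(version)
    if match is None:
        raise ValueError(f"not a SemVer version: {version!r}")
    return (
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
        match["prerelease"],  # None for a final release
    )
```

With this helper, a final release such as `1.2.3` has no pre-release part, while a string such as `1.2.3-beta.4` carries `beta.4` as its pre-release identifier.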
## Local Development
### Make Dependencies
The **Make** targets assume you have **python3.9**, **pipenv**, **bash**, **curl** and **docker** installed.
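
A quick way to confirm these prerequisites before running any target is sketched below. `missing_tools` is a hypothetical helper, not part of the repository; it only checks that the executables are on `PATH`, not their versions.

```python
import shutil

# Executables the Make targets are documented to rely on.
REQUIRED_TOOLS = ("python3.9", "pipenv", "bash", "curl", "docker")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools found.")
```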
### Setting up to work
First, you'll need to clone the repo
```bash
git clone git@github.com:geoadmin/service-search-wsgi
```

Then, run the `setup` target to create the virtual environment and install everything needed to develop, debug, test and serve the service locally:
```bash
make setup
```

To run the service, you will have to adapt **.env.local**, which is a copy of **.env.default**, and set the variables in it.
For local development you will need access to a running Sphinx Search server and to the database. To get it, you can use
SSH port forwarding to the database and to the current Sphinx deployment server.

#### Database access
Right now the BOD database is accessed to retrieve the topics and to translate labels.
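
Once the port forwarding is in place, a small reachability check can confirm that both tunnels are up before starting the service. This is a sketch under assumptions: `localhost` and the ports `5432` and `9321` are the defaults of `BOD_DB_PORT` and `SEARCH_SPHINX_PORT` from the deployment configuration, and your forwarded ports may differ. `port_is_open` is a hypothetical helper.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed local ends of the SSH tunnels; adapt to your forwarding setup.
    for name, port in (("BOD database", 5432), ("Sphinx Search", 9321)):
        state = "reachable" if port_is_open("localhost", port) else "NOT reachable"
        print(f"{name} on localhost:{port}: {state}")
```

The check only proves that something accepts TCP connections on the port; it does not verify credentials or that the peer is the expected server.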
### Linting and formatting your work
To keep the code style consistent, the code should be formatted with `yapf`. To avoid syntax errors and
non-Pythonic idioms, the project also uses the `pylint` linter. Both the formatter and the linter can be run manually with the
following command:

```bash
make format-lint
```

**Formatting and linting are best integrated into your IDE; for this, see
[Integrate yapf and pylint into IDE](https://github.com/geoadmin/doc-guidelines/blob/master/PYTHON.md#yapf-and-pylint-ide-integration)**

### Test your work
Testing whether what you developed works is simple. You have four targets at your disposal: **test, serve, gunicornserve, dockerrun**.

```bash
make test
```

This command runs the unit tests.

```bash
summon make serve
```

This serves the application through Flask without any WSGI server in front.

```bash
summon make gunicornserve
```

This serves the application with the Gunicorn layer in front of it.

```bash
summon make dockerrun
```

This serves the application with the WSGI server, inside a container.
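
To check a locally served instance by hand, you can send it a search request. The sketch below only builds the request URL. It assumes that the service listens on the default `HTTP_PORT` of `5000` and that it exposes the same path and query parameters (`searchText`, `type`) as the public API documented at api3.geo.admin.ch. The exact route of this service may differ, so treat the path, the topic `api` and the helper `build_search_url` as assumptions.

```python
from urllib.parse import urlencode


def build_search_url(search_text, search_type="locations",
                     base_url="http://localhost:5000", topic="api"):
    """Build a search URL; path and parameter names are assumptions (see above)."""
    query = urlencode({"searchText": search_text, "type": search_type})
    return f"{base_url}/rest/services/{topic}/SearchServer?{query}"


print(build_search_url("wabern"))
# http://localhost:5000/rest/services/api/SearchServer?searchText=wabern&type=locations
```

The resulting URL can be passed to `curl` while one of the serve targets is running.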
## Docker
The service is packaged as a Docker image. Images are pushed to the `swisstopo-bgdi-builder` account of the [AWS ECR](https://eu-central-1.console.aws.amazon.com/ecr/repositories?region=eu-central-1) registry. For each GitHub PR that is merged into the `develop` branch, one Docker image is built and pushed with the following tags:
- `develop.latest`
- `CURRENT_VERSION-beta.INCREMENTAL_NUMBER`

For each GitHub PR that is merged into `master`, one Docker image is built and pushed with the following tag:
- `VERSION`
Each image contains the following metadata:
- author
- git.branch
- git.hash
- git.dirty
- version

These metadata can be seen directly on the dockerhub registry in the image layers or can be read with the following command:

```bash
# NOTE: jq is only used for pretty printing the json output,
# you can install it with `apt install jq` or simply enter the command without it
docker image inspect --format='{{json .Config.Labels}}' 974517877189.dkr.ecr.eu-central-1.amazonaws.com/service-search-wsgi:develop.latest | jq
```

You can also check these metadata on a running container as follows:

```bash
docker ps --format="table {{.ID}}\t{{.Image}}\t{{.Labels}}"
```

## Deployment
This service is going to be deployed on a vhost. The ***docker-compose.yml*** configuration of the vhost setup is going to be here:
[https://github.com/geoadmin/infra-vhost](https://github.com/geoadmin/infra-vhost)

### Deployment configuration

The service is configured through environment variables:

| Env | Default | Description |
| --------------------------- | ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| HTTP_PORT | 5000 | The port on which the service can be queried. |
| SEARCH_WORKERS | `0` | Number of workers. `0` or a negative value means that the number of workers is computed from the number of CPUs. |
| TESTING | False | When TESTING=True, the application does not need a db connection to retrieve a list of topics. A list with the topics used in the tests is being set. |
| BOD_DB_NAME | - | Depending on the staging level usually |
| BOD_DB_HOST | - | The db host. |
| BOD_DB_PORT | 5432 | The db port |
| BOD_DB_USER | - | The read-only db user |
| BOD_DB_PASSWD | - | The db password. |
| GEODATA_STAGING | prod | In the database bod, a dataset itself has the attribute staging. This staging (dev, int and prod) is being filtered when querying the indexes. |
| SEARCH_SPHINX_HOST | localhost | The host for sphinx search server. |
| SEARCH_SPHINX_PORT | 9321 | The port for sphinx search server. |
| SEARCH_SPHINX_TIMEOUT | 3 | Sphinx server timeout |
| CACHE_DEFAULT_TIMEOUT | 86400 | The time in seconds for which the db queries for `topics` and `translations` are cached. Defaults to 24 hours, as these change rarely. |
| LOGGING_CFG | logging-cfg-local.yml | Logging configuration file |
| FORWARED_ALLOW_IPS | `*` | Sets the gunicorn `forwarded_allow_ips` (see https://docs.gunicorn.org/en/stable/settings.html#forwarded-allow-ips). This is required for `secure_scheme_headers` to work. |
| FORWARDED_PROTO_HEADER_NAME | `X-Forwarded-Proto` | Sets gunicorn `secure_scheme_headers` parameter to `{FORWARDED_PROTO_HEADER_NAME: 'https'}`, see https://docs.gunicorn.org/en/stable/settings.html#secure-scheme-headers. |
| SCRIPT_NAME | '' | The script name, f.ex. `/api/search/`, used by gunicorn (the WSGI server). It will be put to use once it is decided how search-wsgi is going to be queried. |
| CACHE_CONTROL_HEADER | `'public, max-age=600'` | Cache-Control header value for the search endpoint |
| GZIP_COMPRESSION_LEVEL | `9` | GZIP compression level |
| WSGI_TIMEOUT | 1 | WSGI timeout. Note that the final timeout used is `SEARCH_SPHINX_TIMEOUT + WSGI_TIMEOUT`, so `WSGI_TIMEOUT` should be the maximum amount of time that the WSGI app needs to handle the data received from the Sphinx server. |
| GUNICORN_WORKER_TMP_DIR | `None` | This should be set to a tmpfs file system for better performance. See https://docs.gunicorn.org/en/stable/settings.html#worker-tmp-dir. |
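
To make the interplay of these variables concrete, here is a minimal sketch of how such settings could be read from the environment. It is not the service's actual configuration code: the `load_settings` helper, the returned key names, the `0.0.0.0` bind address and in particular the `2 * cpu + 1` worker formula (a common Gunicorn rule of thumb) are assumptions. The table only states that the worker count is derived from the number of CPUs and that the final timeout is `SEARCH_SPHINX_TIMEOUT + WSGI_TIMEOUT`.

```python
import multiprocessing
import os


def load_settings(env=None):
    """Read a subset of the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env

    workers = int(env.get("SEARCH_WORKERS", "0"))
    if workers <= 0:
        # Assumption: the table only says the count is computed from the CPUs;
        # 2 * cpu + 1 is a common Gunicorn rule of thumb, not confirmed here.
        workers = 2 * multiprocessing.cpu_count() + 1

    sphinx_timeout = int(env.get("SEARCH_SPHINX_TIMEOUT", "3"))
    wsgi_timeout = int(env.get("WSGI_TIMEOUT", "1"))

    return {
        # Assumption: bind on all interfaces; only the port is documented.
        "bind": f"0.0.0.0:{env.get('HTTP_PORT', '5000')}",
        "workers": workers,
        "sphinx_host": env.get("SEARCH_SPHINX_HOST", "localhost"),
        "sphinx_port": int(env.get("SEARCH_SPHINX_PORT", "9321")),
        # Documented rule: final timeout = SEARCH_SPHINX_TIMEOUT + WSGI_TIMEOUT.
        "timeout": sphinx_timeout + wsgi_timeout,
        "testing": env.get("TESTING", "False").lower() == "true",
    }
```

With an empty environment, the sketch yields a final timeout of 4 seconds (3 + 1) and a Sphinx endpoint of `localhost:9321`, matching the defaults in the table.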