https://github.com/city-of-helsinki/unified-search
Common unified search
- Host: GitHub
- URL: https://github.com/city-of-helsinki/unified-search
- Owner: City-of-Helsinki
- License: mit
- Created: 2021-02-23T08:20:22.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2024-01-04T10:12:13.000Z (12 months ago)
- Last Synced: 2024-04-08T16:43:06.412Z (9 months ago)
- Language: Python
- Homepage:
- Size: 826 KB
- Stars: 1
- Watchers: 17
- Forks: 1
- Open Issues: 5
Metadata Files:
- Readme: README.md
- License: LICENSE
# Common unified search
This is the common unified search: a multi-domain search over multiple services.
The solution consists of the following parts:
[Data collector](https://github.com/City-of-Helsinki/unified-search/tree/develop/sources)
- Python Django application that fetches data from multiple sources and stores it in OpenSearch (previously Elasticsearch).
- Django management commands are triggered by Kubernetes cron jobs.
OpenSearch
- Search engine for indexing the data.
[GraphQL API](https://github.com/City-of-Helsinki/unified-search/tree/develop/graphql)
- GraphQL API on top of OpenSearch, providing a high-level interface for end users (frontends).
# Endpoints
- Stable at https://unified-search.prod.kuva.hel.ninja/search
- Staging at https://unified-search.test.kuva.hel.ninja/search
# Development
Docker Compose sets up a 3-node local test environment with Kibana. Make sure at least 4 GB of RAM is allocated to Docker.
docker-compose up
To verify nodes are up and running:
curl -X GET "localhost:9200/_cat/nodes?v=true&pretty"
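The same check can be scripted. The sketch below is a minimal example, assuming the cat API's `format=json` option and the 3-node setup described above. The names `count_nodes`, `fetch_nodes_json`, `NODES_URL` and `EXPECTED_NODES` are illustrative and not part of this repository.

```python
import json
import urllib.request

# Assumed values: the URL mirrors the curl command above with format=json,
# and 3 is the node count of the local Docker Compose environment.
NODES_URL = "http://localhost:9200/_cat/nodes?format=json"
EXPECTED_NODES = 3


def count_nodes(cat_nodes_json: str) -> int:
    """Return the number of nodes listed in a `_cat/nodes?format=json` response."""
    nodes = json.loads(cat_nodes_json)
    if not isinstance(nodes, list):
        raise ValueError("expected a JSON array of node entries")
    return len(nodes)


def fetch_nodes_json(url: str = NODES_URL, timeout: float = 5.0) -> str:
    """Fetch the raw JSON text from the cluster (requires the stack to be running)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the stack running, `count_nodes(fetch_nodes_json()) == EXPECTED_NODES` should hold once all nodes have joined the cluster.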
Services:
- GraphQL search API: http://localhost:4000/search
- OpenSearch Dashboard at http://localhost:5601
- OpenSearch Dashboard Dev Tools at http://localhost:5601/app/dev_tools#/console
- OpenSearch at http://localhost:9200
- Data sources (data collector) at http://localhost:5000/

Deprecated:
- Graphene based testing GraphQL search API at http://localhost:5001/graphql
## Fetching data with data collector
The following management command fetches data from external data sources and stores it in OpenSearch:
docker-compose exec sources python manage.py ingest_data
It is also possible to limit the command to a certain importer:
docker-compose exec sources python manage.py ingest_data location
Delete all data:
docker-compose exec sources python manage.py ingest_data --delete
Delete data imported by a given importer:
docker-compose exec sources python manage.py ingest_data location --delete
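For scripted use, the four command variants above can be generated from two parameters. This is a minimal sketch rather than part of the repository: `build_ingest_command` and `run_ingest` are hypothetical helper names, and the importer names are the ones listed in this README.

```python
import subprocess
from typing import List, Optional

# Importer names as listed in this README.
KNOWN_IMPORTERS = (
    "event",
    "location",
    "ontology_tree",
    "ontology_word",
    "administrative_division",
)


def build_ingest_command(importer: Optional[str] = None, delete: bool = False) -> List[str]:
    """Build the docker-compose command line for the ingest_data management command."""
    if importer is not None and importer not in KNOWN_IMPORTERS:
        raise ValueError(f"unknown importer: {importer!r}")
    command = ["docker-compose", "exec", "sources", "python", "manage.py", "ingest_data"]
    if importer is not None:
        command.append(importer)  # limit the run to a single importer
    if delete:
        command.append("--delete")  # delete instead of ingest
    return command


def run_ingest(importer: Optional[str] = None, delete: bool = False) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_ingest_command(importer, delete), check=True)
```

For example, `build_ingest_command("location", delete=True)` yields the same arguments as the last command above. When run without a terminal (for instance from CI), `docker-compose exec` may additionally need its `-T` flag.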
Currently implemented importers and the indexes they create:
- **event** (event)
- **location** (location)
- **ontology_tree** (ontology_tree)
- **ontology_word** (ontology_word)
- **administrative_division** (administrative_division, helsinki_common_administrative_division)
## Testing
The following test script is available for a basic health check:
pytest --log-cli-level=debug test_es_health.py
Sources tests, in docker-compose:
docker-compose exec sources pytest
GraphQL tests:
npx jest
## GraphQL search API
If not running with docker-compose, start the Apollo-based GraphQL server at `unified-search/graphql/`:
node index.js
## GraphQL queries
It is recommended to use a GraphQL client such as Altair for sending queries.
### Search all with specified ontology
query {
unifiedSearch(index: "event", text: "*", ontology: "vapaaehtoistoiminta", languages:FINNISH) {
edges {
cursor
node {
event {
name {
fi
}
description {
fi
}
}
}
}
}
}
### Free text search - event index
query {
unifiedSearch(index: "event", text: "koira", languages:FINNISH) {
edges {
cursor
node {
event {
name {
fi
}
description {
fi
}
}
}
}
}
}
### Free text search - location index
query {
unifiedSearch(index: "location", text: "koira", languages:FINNISH) {
edges {
cursor
node {
venue {
name {
fi
}
description {
fi
}
}
}
}
}
}
### Pagination and scores
query {
unifiedSearch(text: "koira", index: "location") {
count
max_score
pageInfo {
startCursor
endCursor
hasNextPage
hasPreviousPage
}
edges {
cursor
node {
venue {
name {
fi
sv
en
}
openingHours {
url
is_open_now_url
}
location {
url {
fi
}
}
}
_score
searchCategories
}
}
}
}
### Raw data for debugging purposes
query {
unifiedSearch(text: "koira", index: "location", first: 3) {
count
max_score
edges {
node {
venue {
name {
fi
sv
en
}
}
_score
}
}
es_results {
took
hits {
max_score
total {
value
}
hits {
_index
_source {
data
}
}
}
}
}
}
### Suggestions for text completion
query {
unifiedSearchCompletionSuggestions(prefix:"ki", languages:FINNISH, index:"location")
{
suggestions {
label
}
}
}
### Date ranges
Dates can be used in queries, assuming the mapping type is correct (`date` in ES, `datetime.datetime` in Python):
Get documents created in the last 2 minutes:
GET /location/_search
{
"query": {
"range": {
"venue.meta.createdAt": {
"gte": "now-2m/m"
}
}
}
}
For references, see
https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html#date-math
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#ranges-on-dates
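The range query above can also be generated for other time windows. The following is a minimal sketch, assuming the same `venue.meta.createdAt` field and rounding to the window's own unit, as in `now-2m/m`. The function name `build_created_since_query` is illustrative only.

```python
import json
import re
from typing import Any, Dict

# Date-math units accepted in expressions such as "now-2m/m".
_WINDOW_PATTERN = re.compile(r"^\d+[yMwdhHms]$")


def build_created_since_query(
    window: str = "2m", field: str = "venue.meta.createdAt"
) -> Dict[str, Any]:
    """Build a range query matching documents whose `field` falls within `window`.

    `window` is a number followed by a date-math unit, e.g. "2m", "1h" or "7d".
    """
    if not _WINDOW_PATTERN.match(window):
        raise ValueError(f"invalid window: {window!r}")
    unit = window[-1]
    # Round down to the same unit as the window, like "now-2m/m" above.
    return {"query": {"range": {field: {"gte": f"now-{window}/{unit}"}}}}


# Print the default query body as formatted JSON.
print(json.dumps(build_created_since_query(), indent=2))
```

The printed body can be pasted after `GET /location/_search` in the Dev Tools console.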
## GraphQL search API - using curl
$ curl --insecure -X POST -H "Content-Type: application/json" --data '{"query":"query{unifiedSearch(text:\"leikkipuisto\", index:\"location\"){count}}"}' /search
{"data":{"unifiedSearch":{"count":61}}}
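The same request can be made from Python using only the standard library. This is a minimal sketch, not the project's client: the helper names are hypothetical, the endpoint URL is passed in by the caller, and quoting the arguments with `json.dumps` assumes that JSON string escaping is acceptable for the search terms used. Unlike the curl call, it does not disable certificate verification.

```python
import json
import urllib.request


def build_search_query(text: str, index: str) -> str:
    """Build the count-only unifiedSearch query used in the curl example."""
    # json.dumps adds the surrounding quotes and escapes special characters.
    return "query{unifiedSearch(text:%s, index:%s){count}}" % (
        json.dumps(text),
        json.dumps(index),
    )


def build_search_request(url: str, text: str, index: str) -> urllib.request.Request:
    """Build a POST request carrying the query as a JSON body."""
    body = json.dumps({"query": build_search_query(text, index)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_count(url: str, text: str, index: str, timeout: float = 10.0) -> int:
    """Send the request and return the `count` field (requires a running API)."""
    request = build_search_request(url, text, index)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["data"]["unifiedSearch"]["count"]
```

Against the local environment described under Development, this could be called as `fetch_count("http://localhost:4000/search", "leikkipuisto", "location")`.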
## Python dependencies
Compile requirements.in to requirements.txt:
pip-compile
Install dependencies from requirements.txt:
pip install -r requirements.txt
## Known issues
1. A new index is added, but Elasticsearch returns `elasticsearch.exceptions.AuthorizationException`.
The Elasticsearch access control list needs to be updated to grant access to the new index. When using Aiven, this
can be done from its control panel (under ACL).
# Issues board
https://helsinkisolutionoffice.atlassian.net/projects/US/issues/