https://github.com/meedan/alegre
A text and media analysis service for Meedan Check, a collaborative media annotation platform
- Host: GitHub
- URL: https://github.com/meedan/alegre
- Owner: meedan
- License: mit
- Created: 2015-10-21T19:59:30.000Z (over 10 years ago)
- Default Branch: develop
- Last Pushed: 2025-10-03T16:39:56.000Z (8 months ago)
- Last Synced: 2025-10-03T18:42:41.638Z (8 months ago)
- Topics: hacktoberfest, image-classification, language-detection, natural-language-processing, similarity-search, translation-memory
- Language: HTML
- Homepage: https://meedan.com/check
- Size: 156 MB
- Stars: 16
- Watchers: 14
- Forks: 7
- Open Issues: 23
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
- Codeowners: CODEOWNERS
alegre
------
A media similarity analysis service. Part of the [Check platform](https://meedan.com/check). Refer to the [main repository](https://github.com/meedan/check) for quick start instructions.
There is also an [overview of the similarity infrastructure](doc/meedan_similarity_infra_overview.md) and a more [detailed explanation of the process for each media type](doc/similarity-media-type-detail.md).
## Development
- Update your [virtual memory settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html), e.g. by setting `vm.max_map_count=262144` in `/etc/sysctl.conf`. If you use the Docker UI, you can instead adjust its Resource settings to 12GB of memory and 128GB of disk.
- Ensure that the services needed are uncommented in the `docker-compose.yml` file. Specifically, to run the default tests the `xlm_r_bert_base_nli_stsb_mean_tokens`, `indian_sbert`, `video` and `audio` definitions are needed.
- `docker-compose build`
- `docker-compose up --abort-on-container-exit`
- Open http://localhost:3100 for the Alegre API
The Alegre API Swagger UI unfortunately [does not support sending body payloads to GET methods](https://github.com/swagger-api/swagger-ui/issues/2136). To test those API methods, you can still fill in your arguments and click "Execute". Swagger will fail, but it will show you a `curl` command that you can run in your console.
- Open http://localhost:5601 for the Kibana UI
- Open http://localhost:9200 for the Elasticsearch API
- `docker-compose exec alegre flask shell` to open a Python shell inside the Docker container with the app loaded
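
The Swagger limitation noted above can also be worked around from Python instead of `curl`. The following is a minimal sketch using only the standard library. The path `/example/endpoint/` and the `text` field used in the usage note are hypothetical placeholders, not actual Alegre routes or arguments; substitute the path and arguments shown in the Swagger UI.

```python
import json
import urllib.request

ALEGRE_URL = "http://localhost:3100"  # Alegre API address from the list above


def build_get_with_body(path, payload, base_url=ALEGRE_URL):
    """Build a GET request that carries a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        # Without an explicit method, urllib switches to POST when data is set.
        method="GET",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `send(build_get_with_body("/example/endpoint/", {"text": "hello"}))` would issue the request, with the placeholder path and payload replaced by real ones.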
## Testing
- For the full set of tests to pass, some configuration secrets are required (e.g. Google Translate API keys)
- `docker-compose -f docker-compose.yml -f docker-test.yml up --abort-on-container-exit`
- Wait for the logs to settle, then in a different console:
- `docker-compose exec alegre make test`
- `docker-compose exec alegre coverage report`
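
Instead of watching the logs by eye, the "wait for the logs to settle" step can be approximated by polling the service ports listed under Development. This is a rough sketch, not part of the repository: the helper names, retry count, and delay are assumptions. It only confirms that the HTTP ports respond, not that every model has finished loading.

```python
import http.client
import time
import urllib.error
import urllib.request

# Ports taken from the Development section; adjust if your compose file differs.
SERVICES = {
    "alegre": "http://localhost:3100",
    "elasticsearch": "http://localhost:9200",
}


def is_up(url, timeout=5):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status, so it is accepting connections.
        return True
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        return False


def wait_for_services(services=SERVICES, probe=is_up, attempts=60,
                      delay=5.0, sleep=time.sleep):
    """Poll until every service responds; return the names still down."""
    pending = dict(services)
    for _ in range(attempts):
        # Keep only the services that still fail the probe.
        pending = {name: url for name, url in pending.items() if not probe(url)}
        if not pending:
            break
        sleep(delay)
    return sorted(pending)
```

An empty list from `wait_for_services()` suggests it is reasonable to move on to `docker-compose exec alegre make test`; a non-empty list names the services that never answered.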
To test individual modules:
- `docker-compose exec alegre bash` (opens a bash shell with appropriate environment in the docker container)
- `python manage.py test -p test_similarity.py`
## Troubleshooting
- If you're having trouble starting Elasticsearch on macOS, with the error `container_name exited with code 137`, you will need to adjust your Docker settings, as per https://www.petefreitag.com/item/848.cfm
- Note that the alegre docker service definitions in the `alegre` repo may not align with the alegre service definitions in the `check` repository, so different variations of the service may be spun up depending on the directory where `docker-compose up` is executed.
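
When Elasticsearch fails to start on a Linux host, it can help to confirm that the `vm.max_map_count` value from the Development section has actually taken effect. The sketch below reads the kernel setting from `/proc`; the function names are illustrative, and the check does not apply directly on macOS, where Docker runs inside a virtual machine.

```python
from pathlib import Path

REQUIRED_MAX_MAP_COUNT = 262144  # value given in the Development section
PROC_PATH = Path("/proc/sys/vm/max_map_count")  # Linux only


def read_max_map_count(path=PROC_PATH):
    """Return the current vm.max_map_count, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def max_map_count_ok(current, required=REQUIRED_MAX_MAP_COUNT):
    """True when the kernel setting meets the required minimum."""
    return current is not None and current >= required
```

Calling `max_map_count_ok(read_max_map_count())` returns `False` both when the value is too low and when it cannot be read, so either case is a prompt to revisit the virtual memory settings.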
## Diagrams
NOTE: These diagrams need to be updated with the new endpoints from the Presto migration.
### Similarity-Related HTTP requests Alegre receives from Check API

(Source: https://docs.google.com/drawings/d/1-teqtZJfU4MSDUGVwWL9F4cXDKDnVObDYg3a9jJOP1Y/edit)
### Text Queries generated by Similarity Requests from Check API within Alegre

(Source: https://docs.google.com/drawings/d/1jvwn5wM6T2jlnaS_fS7_u6sH02HVHi6L8Q9H_vD4SuY/edit)