{"id":40209603,"url":"https://github.com/meedan/alegre","last_synced_at":"2026-01-19T21:04:17.329Z","repository":{"id":9772311,"uuid":"44700808","full_name":"meedan/alegre","owner":"meedan","description":"A text and media analysis service for Meedan Check, a collaborative media annotation platform","archived":false,"fork":false,"pushed_at":"2025-10-03T16:39:56.000Z","size":163423,"stargazers_count":16,"open_issues_count":23,"forks_count":7,"subscribers_count":14,"default_branch":"develop","last_synced_at":"2025-10-03T18:42:41.638Z","etag":null,"topics":["hacktoberfest","image-classification","language-detection","natural-language-processing","similarity-search","translation-memory"],"latest_commit_sha":null,"homepage":"https://meedan.com/check","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/meedan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2015-10-21T19:59:30.000Z","updated_at":"2025-10-03T16:39:04.000Z","dependencies_parsed_at":"2024-04-24T18:40:31.364Z","dependency_job_id":"f8a31e44-4dcb-4b1d-910d-6de10a923f31","html_url":"https://github.com/meedan/alegre","commit_stats":null,"previous_names":[],"tags_count":127,"template":false,"template_full_name":null,"purl":"pkg:github/meedan/alegre","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meedan%2Falegre","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meedan%2Falegre/tags","releases_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meedan%2Falegre/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meedan%2Falegre/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/meedan","download_url":"https://codeload.github.com/meedan/alegre/tar.gz/refs/heads/develop","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/meedan%2Falegre/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28585305,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-19T20:45:59.482Z","status":"ssl_error","status_checked_at":"2026-01-19T20:45:41.500Z","response_time":67,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","image-classification","language-detection","natural-language-processing","similarity-search","translation-memory"],"created_at":"2026-01-19T21:04:17.256Z","updated_at":"2026-01-19T21:04:17.316Z","avatar_url":"https://github.com/meedan.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"alegre\n------\n\nA media similarity analysis service. Part of the [Check platform](https://meedan.com/check). 
Refer to the [main repository](https://github.com/meedan/check) for quick start instructions.\n\nThere is also an [overview of the similarity infrastructure](doc/meedan_similarity_infra_overview.md) and a more [detailed explanation of the process for each media type](doc/similarity-media-type-detail.md). \n\n## Development\n\n- Update your [virtual memory settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html), e.g. by setting `vm.max_map_count=262144` in `/etc/sysctl.conf`. This can also be done in the Docker UI by adjusting the Resource settings to 12GB of memory and 128GB of disk.  \n- Ensure that the services needed are uncommented in the `docker-compose.yml` file.  Specifically, to run the default tests, the `xlm_r_bert_base_nli_stsb_mean_tokens`, `indian_sbert`, `video` and `audio` definitions are needed.\n- `docker-compose build`\n- `docker-compose up --abort-on-container-exit`\n- Open http://localhost:3100 for the Alegre API\n\nThe Alegre API Swagger UI unfortunately [does not support sending body payloads to GET methods](https://github.com/swagger-api/swagger-ui/issues/2136). To test those API methods, you can still fill in your arguments and click \"Execute\" - Swagger will fail, but it will show you a `curl` command that you can use in your console.\n\n- Open http://localhost:5601 for the Kibana UI\n- Open http://localhost:9200 for the Elasticsearch API\n- `docker-compose exec alegre flask shell` to open a Python shell inside the Docker container with the app loaded\n\n## Testing\n- For the full set of tests to pass, some configuration secrets are required (e.g. 
Google Translate API keys, etc)\n- `docker-compose -f docker-compose.yml -f docker-test.yml up --abort-on-container-exit`\n- Wait for the logs to settle, then in a different console:\n- `docker-compose exec alegre make test`\n- `docker-compose exec alegre coverage report`\n\nTo test individual modules:\n- `docker-compose exec alegre bash` (opens a bash shell with appropriate environment in the docker container)\n- `python manage.py test -p test_similarity.py`\n\n## Troubleshooting\n\n- If you're having trouble starting Elasticsearch on macOS, with the error `container_name exited with code 137`, you will need to adjust your Docker settings, as per https://www.petefreitag.com/item/848.cfm\n- Note that the alegre docker service definitions in the `alegre` repo may not align with the alegre service definitions in the `check` repository, so different variations of the service may be spun up depending on the directory where `docker-compose up` is executed. \n\n\n## Diagrams\n\nNOTE: these diagrams need to be updated with the new endpoints from Presto migration\n\n### Similarity-Related HTTP requests Alegre receives from Check API\n\n![Similarity-Related HTTP requests Alegre receives from Check API](doc/elasticsearch_detail.png?raw=true \"Similarity-Related HTTP requests Alegre receives from Check API\")\n\n(Source: https://docs.google.com/drawings/d/1-teqtZJfU4MSDUGVwWL9F4cXDKDnVObDYg3a9jJOP1Y/edit)\n### Text Queries generated by Similarity Requests from Check API within Alegre\n\n![Text Queries generated by Similarity Requests from Check API within Alegre](doc/alegre_parameter_breakdown.png?raw=true \"Text Queries generated by Similarity Requests from Check API within Alegre\")\n\n(Source: 
https://docs.google.com/drawings/d/1jvwn5wM6T2jlnaS_fS7_u6sH02HVHi6L8Q9H_vD4SuY/edit)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmeedan%2Falegre","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmeedan%2Falegre","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmeedan%2Falegre/lists"}