{"id":15912090,"url":"https://github.com/ulf1/simiscore-semantic","last_synced_at":"2025-04-03T02:26:22.020Z","repository":{"id":98707308,"uuid":"355460576","full_name":"ulf1/simiscore-semantic","owner":"ulf1","description":"An ML API to compute semantic similarity scores between sentence examples.","archived":false,"fork":false,"pushed_at":"2023-07-30T19:12:15.000Z","size":72,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-02-08T16:38:09.412Z","etag":null,"topics":["ml-api","sbert","sentence-bert","similarity-score"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ulf1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null},"funding":{"github":["ulf1","knit-bee"]}},"created_at":"2021-04-07T08:00:26.000Z","updated_at":"2023-07-30T19:00:33.000Z","dependencies_parsed_at":"2023-07-30T20:33:29.004Z","dependency_job_id":null,"html_url":"https://github.com/ulf1/simiscore-semantic","commit_stats":null,"previous_names":["ulf1/simiscore-semantic"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-semantic","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-semantic/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-semantic/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-semantic/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/ulf1","download_url":"https://codeload.github.com/ulf1/simiscore-semantic/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246923927,"owners_count":20855641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ml-api","sbert","sentence-bert","similarity-score"],"created_at":"2024-10-06T16:01:41.421Z","updated_at":"2025-04-03T02:26:21.997Z","avatar_url":"https://github.com/ulf1.png","language":"Python","funding_links":["https://github.com/sponsors/ulf1","https://github.com/sponsors/knit-bee"],"categories":[],"sub_categories":[],"readme":"[![DOI](https://zenodo.org/badge/355460576.svg)](https://zenodo.org/badge/latestdoi/355460576)\n\n\n# simiscore-semantic\nAn ML API to compute semantic similarity scores between sentence examples. \nThe API is programmed with the [`fastapi` Python package](https://fastapi.tiangolo.com/), \nand the semantic similarities are computed based on [SBert (`sentence-transformers` package)](https://github.com/UKPLab/sentence-transformers). \nThe deployment is configured for Docker Compose.\n\n\n\n## Docker Deployment\nCall Docker Compose\n\n```sh\nexport API_PORT=8083\ndocker-compose -f docker-compose.yml up --build\n\n# or as oneliner:\nAPI_PORT=8083 docker-compose -f docker-compose.yml up --build\n```\n\n(Start docker daemon before, e.g. 
`open /Applications/Docker.app` on macOS).\n\nCheck\n\n```sh\ncurl http://localhost:8083\n```\n\nNote: Only `main.py` is used in the `Dockerfile`.\n\n\n\n## Local Development\n\n### Install a virtual environment\n\n```sh\npython3 -m venv .venv\nsource .venv/bin/activate\npip install --upgrade pip\npip install -r requirements.txt --no-cache-dir\npip install -r requirements-dev.txt --no-cache-dir\n```\n\n(If your git repo is stored in a folder whose path contains whitespace, don't use the subfolder `.venv`. Use an absolute path without whitespace.)\n\n### Specify where to store the SBert model\nSBert lets you set the `cache_folder` via the environment variable `SENTENCE_TRANSFORMERS_HOME` (see [here](https://github.com/UKPLab/sentence-transformers/blob/bd19871d99068f4824ff6ef213d91596885889f7/sentence_transformers/SentenceTransformer.py#L48)).\n\n```sh\nmkdir ./sbert-models\nexport SENTENCE_TRANSFORMERS_HOME=\"$(pwd)/sbert-models\"\n```\n\n\n### Start Server\n\n```sh\nsource .venv/bin/activate\n# uvicorn app.main:app --reload\ngunicorn app.main:app --reload --bind=0.0.0.0:8083 \\\n    --worker-class=uvicorn.workers.UvicornH11Worker \\\n    --workers=1 --timeout=600\n```\n\n\n### Usage Examples\n\na) Send a list of strings.\n\n```sh\ncurl -X 'POST' \\\n  'http://localhost:8083/similarities/' \\\n  -H 'accept: application/json' \\\n  -H 'Content-Type: application/json' \\\n  -d '[\n    \"Der Film ist super.\",\n    \"Der Spielfilm ist gut.\",\n    \"Der Film ist Müll.\",\n    \"Der Spielfilm ist schlecht.\"\n  ]'\n```\n\nb) Send a JSON object with UUID4 strings as keys and texts as values.\n\n```sh\ncurl -X 'POST' \\\n  'http://localhost:8083/similarities/' \\\n  -H 'accept: application/json' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\n    \"80ba0456-1d26-4d22-8e80-113b919502ee\": \"Der Film ist super.\",\n    \"a47fe293-26e7-40f0-b0b5-202e0955458f\": \"Der Spielfilm ist gut.\",\n    \"86e356a3-5b42-4e03-91fc-cf69098b6dd2\": \"Der Film ist Müll.\",\n    
\"779d0245-8f54-49ec-9f0f-8e29dc987b41\": \"Der Spielfilm ist schlecht.\"\n   }'\n```\n\n### Other commands and help\n* Check syntax: `flake8 --ignore=F401 --exclude=$(grep -v '^#' .gitignore | xargs | sed -e 's/ /,/g')`\n* Run Unit Tests: `PYTHONPATH=. pytest`\n- Show the docs: [http://localhost:8083/docs](http://localhost:8083/docs)\n- Show Redoc: [http://localhost:8083/redoc](http://localhost:8083/redoc)\n\n\n### Clean up \n```sh\nfind . -type f -name \"*.pyc\" | xargs rm\nfind . -type d -name \"__pycache__\" | xargs rm -r\nrm -r .pytest_cache\nrm -r .venv\n```\n\n\n## Appendix\n\n### Citation\n\n```\n@software{ulf_hamster_2022_7096002,\n  author       = {Ulf Hamster and\n                  Luise Köhler},\n  title        = {simiscore-semantic: ML API for semantic similarities},\n  month        = sep,\n  year         = 2022,\n  publisher    = {Zenodo},\n  version      = {0.1.0},\n  doi          = {10.5281/zenodo.7096002},\n  url          = {https://doi.org/10.5281/zenodo.7096002}\n}\n```\n\n### References\n- Sebastián Ramírez, 2018, FastAPI, [https://github.com/tiangolo/fastapi](https://github.com/tiangolo/fastapi)\n- Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982–3992, Hong Kong, China. Association for Computational Linguistics. [http://dx.doi.org/10.18653/v1/D19-1410](http://dx.doi.org/10.18653/v1/D19-1410)\n\n### Support\nPlease [open an issue](https://github.com/satzbeleg/simiscore-semantic/issues) for support. \n\n\n### Contributing\nPlease contribute using [Github Flow](https://guides.github.com/introduction/flow/). 
Create a branch, add commits, and [open a pull request](https://github.com/satzbeleg/simiscore-semantic/compare/).\n\n\n### Acknowledgements\nThe \"Evidence\" project was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - [433249742](https://gepris.dfg.de/gepris/projekt/433249742) (GU 798/27-1; GE 1119/11-1).\n\n### Maintenance\n- Until 31 Aug 2023 (v0.1.0), the code repository was maintained within the DFG project [433249742](https://gepris.dfg.de/gepris/projekt/433249742).\n- Since 1 Sep 2023 (v0.2.0), the code repository has been maintained by Ulf Hamster.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulf1%2Fsimiscore-semantic","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fulf1%2Fsimiscore-semantic","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulf1%2Fsimiscore-semantic/lists"}