{"id":15911776,"url":"https://github.com/ulf1/simiscore-biblio","last_synced_at":"2026-01-27T07:05:28.708Z","repository":{"id":98707149,"uuid":"355466087","full_name":"ulf1/simiscore-biblio","owner":"ulf1","description":"An ML API to compute similarity scores between meta information about sentence examples.","archived":false,"fork":false,"pushed_at":"2023-07-30T19:11:06.000Z","size":67,"stargazers_count":2,"open_issues_count":2,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-05-30T11:35:02.077Z","etag":null,"topics":["k-shingling","ml-api","similarity-scores"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ulf1.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null},"funding":{"github":["ulf1","knit-bee"]}},"created_at":"2021-04-07T08:17:55.000Z","updated_at":"2023-07-30T18:59:22.000Z","dependencies_parsed_at":"2023-07-30T20:33:42.399Z","dependency_job_id":null,"html_url":"https://github.com/ulf1/simiscore-biblio","commit_stats":null,"previous_names":["ulf1/simiscore-biblio"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ulf1/simiscore-biblio","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-biblio","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-biblio/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-biblio/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-biblio/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ulf1","download_url":"https://codeload.github.com/ulf1/simiscore-biblio/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ulf1%2Fsimiscore-biblio/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28807172,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T06:25:51.065Z","status":"ssl_error","status_checked_at":"2026-01-27T06:25:50.640Z","response_time":168,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["k-shingling","ml-api","similarity-scores"],"created_at":"2024-10-06T16:01:00.219Z","updated_at":"2026-01-27T07:05:28.690Z","avatar_url":"https://github.com/ulf1.png","language":"Python","funding_links":["https://github.com/sponsors/ulf1","https://github.com/sponsors/knit-bee"],"categories":[],"sub_categories":[],"readme":"[![DOI](https://zenodo.org/badge/355466087.svg)](https://zenodo.org/badge/latestdoi/355466087)\n\n\n# simiscore-biblio\nAn ML API to compute similarity scores between meta information about sentence examples. 
\nThe API is built with the [`fastapi` Python package](https://fastapi.tiangolo.com/) \nand uses the packages [`datasketch`](http://ekzhu.com/datasketch/index.html) and [`kshingle`](https://github.com/ulf1/kshingle) to compute similarity scores.\nThe deployment is configured for Docker Compose.\n\n## Docker Deployment\nStart the service with Docker Compose:\n\n```sh\nexport API_PORT=8081\ndocker-compose -f docker-compose.yml up --build\n# or as a one-liner:\n\nAPI_PORT=8081 docker-compose up --build\n```\n\n(Start the Docker daemon first, e.g. `open /Applications/Docker.app` on macOS.)\n\nCheck that the API responds:\n\n```sh\ncurl http://localhost:8081\n```\n\nNote: The `Dockerfile` uses only `main.py`.\n\n\n## Local Development\n\n### Install a virtual environment\n\n```sh\npython3 -m venv .venv\nsource .venv/bin/activate\npip install --upgrade pip\npip install -r requirements.txt --no-cache-dir\npip install -r requirements-dev.txt --no-cache-dir\n```\n\n(If your git repo is stored in a folder whose path contains whitespace, don't use the subfolder `.venv`. Use an absolute path without whitespace instead.)\n\n\n### Start Server\n\n```sh\nsource .venv/bin/activate\n# uvicorn app.main:app --reload\ngunicorn app.main:app --reload --bind=0.0.0.0:8081 \\\n    --worker-class=uvicorn.workers.UvicornH11Worker \\\n    --workers=1 --timeout=600\n```\n\n### Run some requests\n\n```sh\ncurl -X POST \"http://localhost:8081/similarities/\" \\\n    -H \"accept: application/json\" \\\n    -H \"Content-Type: application/json\" \\\n    -d '[\n        \"Christ, Lena: Die Rumplhanni. In: Deutsche Literatur von Frauen, Berlin: Directmedia Publ. 2001 [1917], S. 13229\", \n        \"Christ, Lena: Erinnerungen einer Überflüssigen. In: Deutsche Literatur von Frauen, Berlin: Directmedia Publ. 2001 [1912], S. 12498\"\n    ]'\n```\n\n### Other commands and help\n- Check syntax: `flake8 --ignore=F401 --exclude=$(grep -v '^#' .gitignore | xargs | sed -e 's/ /,/g')`\n- Run Unit Tests: `PYTHONPATH=. 
pytest`\n- Show the docs: [http://localhost:8081/docs](http://localhost:8081/docs)\n- Show Redoc: [http://localhost:8081/redoc](http://localhost:8081/redoc)\n\n\n### Clean up\n```sh\n# -delete and -exec also handle paths with whitespace and empty result sets\nfind . -type f -name \"*.pyc\" -delete\nfind . -type d -name \"__pycache__\" -prune -exec rm -r {} +\nrm -r .pytest_cache\nrm -r .venv\n```\n\n\n## Appendix\n\n### Citation\n```\n@software{ulf_hamster_2022_7096467,\n  author       = {Ulf Hamster and\n                  Luise Köhler},\n  title        = {simiscore-biblio: ML API for bibliographic similarities},\n  month        = sep,\n  year         = 2022,\n  publisher    = {Zenodo},\n  version      = {0.1.0},\n  doi          = {10.5281/zenodo.7096467},\n  url          = {https://doi.org/10.5281/zenodo.7096467}\n}\n```\n\n### References\n- Sebastián Ramírez, 2018, FastAPI, [https://github.com/tiangolo/fastapi](https://github.com/tiangolo/fastapi)\n- Eric Zhu, Vadim Markovtsev, aastafiev, Wojciech Łukasiewicz, ae-foster, Sinusoidal36, Ekevoo, Kevin Mann, Keyur Joshi, Peter Kubov, Qin TianHuan, Spandan Thakur, Stefano Ortolani, Titusz, Vojtech Letal, Zac Bentley, fpug, \u0026 oisincar. (2021). ekzhu/datasketch: v1.5.4 (v1.5.4). Zenodo. [https://doi.org/10.5281/zenodo.5758425](https://doi.org/10.5281/zenodo.5758425)\n- Ulf Hamster. (2022). kshingle: Shingling text data (0.10.0). Zenodo. [https://doi.org/10.5281/zenodo.7096407](https://doi.org/10.5281/zenodo.7096407)\n- Leonard Richardson, 2007, Beautiful Soup, [https://www.crummy.com/software/BeautifulSoup/](https://www.crummy.com/software/BeautifulSoup/)\n\n### Support\nPlease [open an issue](https://github.com/satzbeleg/simiscore-biblio/issues/new) for support.\n\n\n### Contributing\nPlease contribute using [GitHub Flow](https://guides.github.com/introduction/flow/). 
Create a branch, add commits, and [open a pull request](https://github.com/satzbeleg/simiscore-biblio/compare/).\n\n### Acknowledgements\nThe \"Evidence\" project was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - [433249742](https://gepris.dfg.de/gepris/projekt/433249742) (GU 798/27-1; GE 1119/11-1).\n\n### Maintenance\n- Until 31 Aug 2023 (v0.1.0), the code repository was maintained within the DFG project [433249742](https://gepris.dfg.de/gepris/projekt/433249742).\n- Since 1 Sep 2023 (v0.2.0), the code repository has been maintained by Ulf Hamster.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulf1%2Fsimiscore-biblio","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fulf1%2Fsimiscore-biblio","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fulf1%2Fsimiscore-biblio/lists"}