{"id":15883218,"url":"https://github.com/cboulanger/citext","last_synced_at":"2025-07-31T06:13:40.965Z","repository":{"id":138345037,"uuid":"531562468","full_name":"cboulanger/citext","owner":"cboulanger","description":"A web-based annotation tool and frontend for ML-based citation extraction from PDFs based on the AnyStyle library","archived":false,"fork":false,"pushed_at":"2023-05-19T20:07:04.000Z","size":128816,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-02T06:12:19.395Z","etag":null,"topics":["anystyle","bibliometric-analysis","citation-mining","citations","references","textmining"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cboulanger.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-01T14:46:55.000Z","updated_at":"2024-06-22T10:56:32.000Z","dependencies_parsed_at":null,"dependency_job_id":"f5200b40-e056-4e9f-ad8f-310bc65be691","html_url":"https://github.com/cboulanger/citext","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cboulanger/citext","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cboulanger%2Fcitext","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cboulanger%2Fcitext/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cboulanger%2Fcitext/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHu
b/repositories/cboulanger%2Fcitext/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cboulanger","download_url":"https://codeload.github.com/cboulanger/citext/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cboulanger%2Fcitext/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267997183,"owners_count":24178251,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-31T02:00:08.723Z","response_time":66,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anystyle","bibliometric-analysis","citation-mining","citations","references","textmining"],"created_at":"2024-10-06T04:08:48.068Z","updated_at":"2025-07-31T06:13:40.942Z","avatar_url":"https://github.com/cboulanger.png","language":"JavaScript","readme":"# Web frontend for reference extraction\n\nThis is a docker image that provides a web application to produce training\nmaterial for ML-based reference extraction \u0026 segmentation engines. 
Currently\nsupported are\n\n- [AnyStyle](https://github.com/inukshuk/anystyle) (Annotation \u0026 reference extraction)\n- [EXParser](http://exparser.readthedocs.io) (Only editing of existing\n  annotations; EXparser reference extraction was supported in\n  [v1.0.0](https://github.com/cboulanger/excite-docker/tree/v1.0.0))\n\nThe image provides a Web UI for producing training material, which is needed to\nimprove citation recognition for particular corpora of scholarly literature\nwhere the current models do not perform well.\n\nA demo of the web frontend (without backend functionality) is available \n[here](https://cboulanger.github.io/excite-docker/web/index.html).\n\n## Installation\n\n1. Install [Docker](https://docs.docker.com/install)\n2. Clone this repo with: `git clone https://github.com/cboulanger/excite-docker.git \u0026\u0026 cd excite-docker`\n3. Build the Docker image: `./bin/build`\n4. If you want to use AnyStyle, please consult its GitHub page on how to install it: https://github.com/inukshuk/anystyle\n\n## Use of the web frontend\n\n1. Start the servers: `./bin/start-servers`\n2. Open the frontend at http://127.0.0.1:8000/web/index.html\n3. Click on \"Help\" for instructions (also lets you download the Zotero add-ons)\n\n## Zotero integration \n\nYou can connect the app to a local [Zotero](https://zotero.org) client to upload extracted\nreferences. 
This feature requires the installation of the following add-ons: \n\n- [Cita](https://github.com/diegodlh/zotero-cita/) \n- [API-Endpoint](https://github.com/Dominic-DallOsto/zotero-api-endpoint)\n\nThe webapp will then enable additional commands that let you retrieve the\nPDF attachment(s) of the currently selected item/collection, extract references\nfrom them and store them with the citing item.\n\nIf the Zotero storage folder is not located in `~/Zotero/storage`, you need to\nrename `.env.dist` to `.env` and in this file, set the `ZOTERO_STORAGE_PATH`\nenvironment variable to the path pointing to this directory.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcboulanger%2Fcitext","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcboulanger%2Fcitext","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcboulanger%2Fcitext/lists"}