https://github.com/cboulanger/citext
A web-based annotation tool and frontend for ML-based citation extraction from PDFs based on the AnyStyle library
anystyle bibliometric-analysis citation-mining citations references textmining
- Host: GitHub
- URL: https://github.com/cboulanger/citext
- Owner: cboulanger
- License: gpl-3.0
- Created: 2022-09-01T14:46:55.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-05-19T20:07:04.000Z (almost 2 years ago)
- Last Synced: 2025-02-07T20:31:29.380Z (3 months ago)
- Topics: anystyle, bibliometric-analysis, citation-mining, citations, references, textmining
- Language: JavaScript
- Homepage:
- Size: 123 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Web frontend for reference extraction
This is a docker image that provides a web application to produce training
material for ML-based reference extraction & segmentation engines. Currently
supported are:

- [AnyStyle](https://github.com/inukshuk/anystyle) (Annotation & reference extraction)
- [EXParser](http://exparser.readthedocs.io) (Only editing of existing
  annotations; EXparser reference extraction was supported in
  [v1.0.0](https://github.com/cboulanger/excite-docker/tree/v1.0.0))

The image provides a Web UI for producing training material which is needed to
improve citation recognition for particular corpora of scholarly literature
where the current models do not perform well.

A demo of the web frontend (without backend functionality) is available
[here](https://cboulanger.github.io/excite-docker/web/index.html).

## Installation
1. Install [Docker](https://docs.docker.com/install)
2. Clone this repo with: `git clone https://github.com/cboulanger/excite-docker.git && cd excite-docker`
3. Build docker image: `./bin/build`
4. If you want to use AnyStyle, please consult its GitHub page on how to install it: https://github.com/inukshuk/anystyle

## Use of the web frontend
1. Run server: `./bin/start-servers`
2. Open frontend at http://127.0.0.1:8000/web/index.html
3. Click on "Help" for instructions (also lets you download the Zotero add-ons)

## Zotero integration
You can connect the app to a local [Zotero](https://zotero.org) client to upload extracted
references. This feature requires the installation of the following add-ons:

- [Cita](https://github.com/diegodlh/zotero-cita/)
- [API-Endpoint](https://github.com/Dominic-DallOsto/zotero-api-endpoint)

The webapp will then enable additional commands that let you retrieve the
PDF attachment(s) of the currently selected item/collection, extract references
from them and store them with the citing item.

If the Zotero storage folder is not located in `~/Zotero/storage`, you need to
rename `.env.dist` to `.env` and in this file, set the `ZOTERO_STORAGE_PATH`
environment variable to the path pointing to this directory.
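
The `.env` setup described above could be scripted roughly as follows. This is only a sketch under a few assumptions: the path `/path/to/Zotero/storage` is a placeholder, the `ZOTERO_DIR` helper variable is hypothetical, the file is assumed to use plain `KEY=value` lines, and the template is copied rather than renamed so that `.env.dist` stays in place. Run it from the repository root.

```shell
# Placeholder path (assumption): replace with the real location of your Zotero storage folder.
ZOTERO_DIR="${ZOTERO_DIR:-/path/to/Zotero/storage}"

# Create .env from the template if it does not exist yet.
# Copying instead of renaming is a choice made here, not something the README prescribes.
if [ ! -f .env ]; then
  if [ -f .env.dist ]; then cp .env.dist .env; else : > .env; fi
fi

# Drop any existing ZOTERO_STORAGE_PATH line, then append the new value.
# KEY=value syntax is assumed; check .env.dist for the format actually used.
grep -v '^ZOTERO_STORAGE_PATH=' .env > .env.tmp || true
printf 'ZOTERO_STORAGE_PATH=%s\n' "$ZOTERO_DIR" >> .env.tmp
mv .env.tmp .env
```

The servers most likely need to be restarted (`./bin/start-servers`) before a changed value is picked up.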