https://github.com/greenelab/preprint-similarity-search
A web app that uses machine learning to recommend the most suitable journals based on the text content of your preprint
- Host: GitHub
- URL: https://github.com/greenelab/preprint-similarity-search
- Owner: greenelab
- License: other
- Created: 2020-06-16T14:09:22.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2023-05-23T01:19:29.000Z (over 1 year ago)
- Last Synced: 2023-08-21T19:10:49.620Z (about 1 year ago)
- Topics: journals, nlp, nlp-machine-learning, web-app
- Language: Python
- Homepage: https://greenelab.github.io/preprint-similarity-search/
- Size: 43.9 MB
- Stars: 18
- Watchers: 5
- Forks: 4
- Open Issues: 26
Metadata Files:
- Readme: README.md
- License: LICENSE.md
# Preprint Similarity Search
[⭐ OPEN THE APP](https://greenelab.github.io/preprint-similarity-search/) to start using the tool right away
[📜 READ THE MANUSCRIPT](http://greenelab.github.io/annorxiver_manuscript) for technical details on the machine learning model behind the tool
[🤖 USE THE API](https://api-pss.greenelab.com/doi/10.1101/833400) like `https://api-pss.greenelab.com/doi/YOUR-DOI`
Based on the work and classifiers in the [AnnoRxiver project](http://github.com/greenelab/annorxiver/)
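
As a rough illustration of the API link above, the following Python sketch builds the `https://api-pss.greenelab.com/doi/YOUR-DOI` URL for a given DOI and fetches it. The helper names `build_api_url` and `fetch_results` are hypothetical, and the code assumes the endpoint returns a JSON body. The response fields are not documented in this README, so none are assumed here.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the README's API link.
API_ROOT = "https://api-pss.greenelab.com/doi/"


def build_api_url(doi):
    """Return the API URL for a bioRxiv or medRxiv DOI (hypothetical helper)."""
    doi = doi.strip()
    if not doi:
        raise ValueError("DOI must not be empty")
    # Percent-encode unusual characters but keep the "/" inside the DOI.
    return API_ROOT + urllib.parse.quote(doi, safe="/")


def fetch_results(doi, timeout=30):
    """Fetch the API response for a DOI.

    Assumption: the endpoint responds with JSON. Inspect the returned
    object before relying on any particular field.
    """
    with urllib.request.urlopen(build_api_url(doi), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_results("10.1101/833400")` would request the same URL as the example link above. It needs network access, and it depends on the service being available.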
### About
This tool uses a machine learning model trained on 2.3 million [PubMed Central open access documents](https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/) to find similar papers and journals based on the textual content of your [bioRxiv](https://www.biorxiv.org/) or [medRxiv](https://www.medrxiv.org/) preprint. These results can be used as a starting point when searching for a place to publish your paper.

The tool also provides a "map" of the PubMed Central documents, grouped into bins based on similar textual content, and shows you where your preprint falls on the map. Select a square to learn more about the papers in that bin.

The map also incorporates a set of 50 [principal components](https://en.wikipedia.org/wiki/Principal_component_analysis) (PCs) generated from bioRxiv and medRxiv. Each PC represents two high-level concepts characterized by keywords of various strengths, illustrated in the word cloud thumbnails above the map. Select a thumbnail to color the map by that PC. Deeper orange squares contain papers that correlate more with the orange keywords in the image, and vice versa for blue.
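
To make the coloring idea concrete, here is a minimal sketch of how a bin could be scored against a single PC and mapped to orange or blue. It is an assumption for illustration only: this README does not say how the scores are computed. The sketch uses cosine similarity between a hypothetical bin vector and a hypothetical PC vector, and the tiny 3-dimensional vectors and the `bin_color` helper are invented.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def bin_color(score):
    """Map a score in [-1, 1] to a color name and an intensity in [0, 1]."""
    if score > 0:
        return ("orange", score)
    if score < 0:
        return ("blue", -score)
    return ("neutral", 0.0)


# Hypothetical data: one PC and three bins, each summarized by a vector.
pc = [1.0, 0.0, 0.0]
bins = {
    "bin_a": [2.0, 0.0, 0.0],   # points the same way as the PC
    "bin_b": [-1.0, 0.0, 0.0],  # points the opposite way
    "bin_c": [0.0, 3.0, 0.0],   # orthogonal to the PC
}

colors = {name: bin_color(cosine_similarity(vec, pc)) for name, vec in bins.items()}
```

Under these assumptions, a bin aligned with the PC is colored deep orange, an opposed bin deep blue, and an orthogonal bin stays neutral.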