https://github.com/creisle/acronym_parser
Basic regex-based acronym parser for pulling acronym definitions out of scientific articles
- Host: GitHub
- URL: https://github.com/creisle/acronym_parser
- Owner: creisle
- License: mit
- Created: 2024-09-10T19:07:19.000Z (2 months ago)
- Default Branch: master
- Last Pushed: 2024-09-12T17:04:49.000Z (2 months ago)
- Last Synced: 2024-09-13T06:13:13.091Z (2 months ago)
- Language: Python
- Size: 27.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Acronym Parser
A simple regex-based acronym parser that pulls acronym definitions out of scientific articles. It targets definitions that follow this pattern:
```text
some words defining an acronym (SWDA)
```

## Getting Started
Currently this package must be installed from this repo, but in the future I may publish it to PyPI. I use Poetry to install and manage dependencies. Note that this requires Python 3.11 or higher.
```bash
git clone ....acronym_parser
cd acronym_parser
poetry install
```

After installation, you can use the package to parse acronym definitions from BioC documents. The following example downloads a BioC document from PubMed Central and then applies the acronym parser:
```python
import bioc
import requests

from acronym_parser import mark_acronyms

url = 'https://www.ncbi.nlm.nih.gov/research/bionlp/RESTful/pmcoa.cgi/BioC_xml/PMC8345926/unicode'
resp = requests.get(url)
resp.raise_for_status()

# The response is a BioC collection; take its first document
doc = bioc.loads(resp.text).documents[0]
# Assumes mark_acronyms, the name imported above, is the intended entry point
acronyms = mark_acronyms(doc)
```