https://github.com/demetersson83/arxiv-parser

A set of scripts for parsing scientific articles from arXiv.

arxiv-api metadata-extraction scientific-papers scientific-publications scientific-research

# arXiv Parser

(C) 2021 Mark M. Bailey, PhD

## About
This set of scripts is useful for querying arXiv through its API. The 'arxiv_scraper.py' script saves the Atom XML output from the API as a set of JSON files. The 'arxiv_parse.py' script merges all of those JSON files into a single JSON file with the arXiv query metadata removed. Together, these scripts are useful for collecting data for meta-analysis of large bodies of scientific work.
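
The two-step workflow described above can be sketched roughly as follows. This is a minimal illustration and not the code from 'arxiv_scraper.py' or 'arxiv_parse.py': the function names, the selected fields, and the per-page JSON layout (`query` plus `entries`) are all assumptions made for the example. The endpoint and the `search_query`, `start`, and `max_results` parameters are those of the public arXiv API.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace used by the feed


def fetch_atom(search_query, start=0, max_results=10):
    """Request one page of results and return the Atom XML as text."""
    params = urllib.parse.urlencode(
        {"search_query": search_query, "start": start, "max_results": max_results}
    )
    with urllib.request.urlopen(f"{ARXIV_API}?{params}") as response:
        return response.read().decode("utf-8")


def atom_to_page(xml_text):
    """Convert an Atom feed into a dict of query metadata and paper entries.

    The chosen fields are illustrative; the real scripts may keep others.
    """
    feed = ET.fromstring(xml_text)
    entries = []
    for entry in feed.findall(f"{ATOM}entry"):
        entries.append({
            "id": entry.findtext(f"{ATOM}id", default="").strip(),
            # Collapse the line breaks arXiv inserts into titles and abstracts.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
            "published": entry.findtext(f"{ATOM}published", default=""),
            "authors": [
                author.findtext(f"{ATOM}name", default="")
                for author in entry.findall(f"{ATOM}author")
            ],
        })
    # Feed-level fields describe the query itself, not the papers.
    query_meta = {
        "title": feed.findtext(f"{ATOM}title", default=""),
        "updated": feed.findtext(f"{ATOM}updated", default=""),
    }
    return {"query": query_meta, "entries": entries}


def save_page(page, path):
    """Write one page of results to a JSON file."""
    Path(path).write_text(json.dumps(page, indent=2), encoding="utf-8")


def merge_pages(input_dir, output_path):
    """Merge the entries of every *.json page file, dropping query metadata."""
    merged = []
    for path in sorted(Path(input_dir).glob("*.json")):
        page = json.loads(path.read_text(encoding="utf-8"))
        merged.extend(page["entries"])
    Path(output_path).write_text(json.dumps(merged, indent=2), encoding="utf-8")
    return merged
```

A typical run under these assumptions would call `save_page(atom_to_page(fetch_atom("all:electron")), "pages/page_0.json")` for each page of results, then `merge_pages("pages", "merged.json")`. Writing the merged file outside the input directory keeps it from being picked up as a page on a later run.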

## Future Work
At some point, maybe I will build this into a library.