https://github.com/edsu/oai2pairtree
command line utility to dump records in an oai-pmh repository as xml in a pairtree
- Host: GitHub
- URL: https://github.com/edsu/oai2pairtree
- Owner: edsu
- Created: 2011-08-05T01:44:04.000Z (almost 15 years ago)
- Default Branch: master
- Last Pushed: 2012-04-03T10:09:23.000Z (about 14 years ago)
- Last Synced: 2024-10-12T14:08:26.337Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 92.8 KB
- Stars: 4
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
oai2pairtree
============
oai2pairtree.py harvests records from an [oai-pmh](http://www.openarchives.org/OAI/openarchivesprotocol.html) repository and stores them in a [pairtree](https://confluence.ucop.edu/display/Curation/PairTree) on the filesystem.
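
To make the pairtree layout concrete, here is a minimal sketch of how an identifier could be mapped to a directory path under the Pairtree convention (hex-encode reserved characters, swap `/`, `:` and `.`, then split into two-character segments). This is not taken from oai2pairtree or the ptree library; the function names `clean_identifier` and `pairtree_path` are hypothetical, and the OAI identifier in the example is only illustrative.

```python
# Illustrative sketch only: NOT the code used by oai2pairtree or ptree.
# Function names here are hypothetical.

# Characters that Pairtree hex-encodes as ^hh before the one-to-one swaps.
_HEX_ENCODE = set('"*+,<=>?\\^|')
# Single-character substitutions applied after hex-encoding.
_SWAPS = str.maketrans({"/": "=", ":": "+", ".": ","})


def clean_identifier(identifier):
    """Return the filesystem-safe form of an identifier."""
    out = []
    for byte in identifier.encode("utf-8"):
        ch = chr(byte)
        # Encode anything outside visible ASCII plus the reserved characters.
        if byte < 0x21 or byte > 0x7E or ch in _HEX_ENCODE:
            out.append("^%02x" % byte)
        else:
            out.append(ch)
    return "".join(out).translate(_SWAPS)


def pairtree_path(identifier, root="pairtree_root"):
    """Split the cleaned identifier into 2-character directory segments."""
    cleaned = clean_identifier(identifier)
    pairs = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join([root] + pairs)


if __name__ == "__main__":
    # An OAI identifier of the kind a harvested record might carry (example value).
    print(pairtree_path("oai:pubmedcentral.nih.gov:13900"))
```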
Usage
-----

    oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi

or if you want to limit to a particular set:

    oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi --set pmc-open

or if you want to also limit to a particular kind of record metadata:

    oai2pairtree.py http://www.pubmedcentral.nih.gov/oai/oai.cgi --set pmc-open --metadata_prefix pmc
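
For orientation, the set and metadata-prefix options correspond to the `set` and `metadataPrefix` arguments of an OAI-PMH `ListRecords` request. The sketch below shows how such a request URL could be built; it is an assumption about the kind of request involved, not the tool's actual code. The helper name `list_records_url` is hypothetical, and the `oai_dc` default is chosen only because every OAI-PMH repository must support it, not because oai2pairtree is known to default to it.

```python
# Illustrative sketch only: NOT taken from oai2pairtree.
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument:
        # it may not be combined with metadataPrefix or set.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url(
        "http://www.pubmedcentral.nih.gov/oai/oai.cgi",
        metadata_prefix="pmc",
        set_spec="pmc-open",
    ))
```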
Installation
------------

oai2pairtree requires [lxml](http://lxml.de/) and [ptree](http://pypi.python.org/pypi/ptree) to run. The easiest way to get them is:

    easy_install oai2pairtree

or:

    pip install oai2pairtree

or, if you prefer to install from source:

    git clone https://github.com/edsu/oai2pairtree.git
    cd oai2pairtree
    python setup.py install
License
-------
* CC0