https://github.com/sckott/pyminer
Text-mining toolset for Crossref data
https://github.com/sckott/pyminer
crossref python text-mining
Last synced: 6 months ago
JSON representation
Text-mining toolset for Crossref data
- Host: GitHub
- URL: https://github.com/sckott/pyminer
- Owner: sckott
- License: mit
- Created: 2016-02-02T18:45:50.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2021-05-03T21:48:19.000Z (over 4 years ago)
- Last Synced: 2025-04-18T18:29:24.880Z (6 months ago)
- Topics: crossref, python, text-mining
- Language: Python
- Homepage:
- Size: 7.93 MB
- Stars: 8
- Watchers: 2
- Forks: 0
- Open Issues: 4
-
Metadata Files:
- Readme: README.rst
- Changelog: Changelog.rst
- License: LICENSE
Awesome Lists containing this project
README
pyminer
=======|pypi| |docs| |travis| |coverage|
Python client for text mining levaraging `Crossrefs Text and Data Mining service
`__.`Source on GitHub at sckott/pyminer `__
Other Crossref text mining (and related) clients:
* R: `rcrossref`, `ropensci/rcrossref `__
* R: `crminer`, `ropensci/crminer `__
* R: `fulltext`, `ropensci/fulltext `__
* Ruby: `textminer`, `sckott/textminer `__
* Python: `habanero`, `sckott/habanero `__Installation
============Stable from pypi
.. code-block:: console
pip install pyminer
Development version
.. code-block:: console
[sudo] pip install git+git://github.com/sckott/pyminer.git#egg=pyminer
Search
======Strongly recommend for search using your email in the mailto parameter in the
Miner() call to get in the "fast lane"... code-block:: python
from pyminer import Miner
import os
m = Miner(mailto = os.environ['crossref_email'])
m.search(filter = {'has_full_text': True}, limit = 5)Fetch
=====If you have a Crossref Text and Data Mining key/token, you can give it in the
tdmkey parameter in the Miner() call.. code-block:: python
# a Pensoft article
from pyminer import Miner
import os
m = Miner(mailto = os.environ['crossref_email'])
x = m.search(ids = '10.3897/rio.2.e10445')
x
out = x.fetch(type = "pdf")
out
out[0].url
out[0].path
out[0].type
out[0].parse()# an Elsevier article - BEWARE, they check IP addresses, so your IP address
# must be at a member institution or similar
from pyminer import Miner
import os
m = Miner(mailto = os.environ['crossref_email'], tdmkey = os.environ['CROSSREF_TDM'])
x = m.search(ids = "10.1016/j.funeco.2010.11.003")
out = x.fetch(type = "xml")
out
out[0].path
out[0].parse()Extract
=======.. code-block:: python
from pyminer import fetch, extract
url = 'http://www.nepjol.info/index.php/JSAN/article/viewFile/13527/10928'
x = fetch(url)
extract(x.path)Meta
====* License: MIT, see `LICENSE file `__
* Please note that this project is released with a `Contributor Code of Conduct `__. By participating in this project you agree to abide by its terms... |pypi| image:: https://img.shields.io/pypi/v/pyminer.svg
:target: https://pypi.python.org/pypi/pyminer.. |docs| image:: https://readthedocs.org/projects/pyminer/badge/?version=latest
:target: http://pyminer.readthedocs.io/en/latest/?badge=latest.. |travis| image:: https://travis-ci.org/sckott/pyminer.svg
:target: https://travis-ci.org/sckott/pyminer.. |coverage| image:: https://coveralls.io/repos/sckott/pyminer/badge.svg?branch=master&service=github
:target: https://coveralls.io/github/sckott/pyminer?branch=master