https://github.com/inspirehep/hepcrawl

Scrapy project for feeds into INSPIRE-HEP

crawler harvest-data publishing python


..
    This file is part of hepcrawl.
    Copyright (C) 2015, 2016, 2017 CERN.

    hepcrawl is a free software; you can redistribute it and/or modify it
    under the terms of the Revised BSD License; see LICENSE file for
    more details.

==========
HEPcrawl
==========

.. image:: https://img.shields.io/travis/inspirehep/hepcrawl.svg
   :target: https://travis-ci.org/inspirehep/hepcrawl

.. image:: https://img.shields.io/github/tag/inspirehep/hepcrawl.svg
   :target: https://github.com/inspirehep/hepcrawl/releases

.. image:: https://img.shields.io/pypi/dm/hepcrawl.svg
   :target: https://pypi.python.org/pypi/hepcrawl

.. image:: https://img.shields.io/github/license/inspirehep/hepcrawl.svg
   :target: https://github.com/inspirehep/hepcrawl/blob/master/LICENSE

HEPcrawl is a harvesting library based on Scrapy (http://scrapy.org) for INSPIRE-HEP
(http://inspirehep.net). It focuses on the automatic and semi-automatic retrieval of
new content from all the sources the site aggregates, in particular content from
major and minor publishers in the field of High-Energy Physics.

The project is currently in an early stage of development.

See the full documentation at http://pythonhosted.org/hepcrawl