Atoma
=====

.. image:: https://github.com/NicolasLM/atoma/actions/workflows/test.yml/badge.svg
    :target: https://github.com/NicolasLM/atoma/actions/workflows/test.yml
.. image:: https://codecov.io/gh/NicolasLM/atoma/branch/main/graph/badge.svg
    :target: https://codecov.io/gh/NicolasLM/atoma

Atom, RSS and JSON feed parser for Python 3.

Quickstart
----------

Install Atoma with pip::

    pip install atoma

Load and parse an Atom XML file:

.. code:: python

    >>> import atoma
    >>> feed = atoma.parse_atom_file('atom-feed.xml')
    >>> feed.subtitle.value
    'The blog relating the daily life of web agency developers'
    >>> len(feed.entries)
    5

A small change is needed if you are dealing with an RSS XML file:

.. code:: python

    >>> feed = atoma.parse_rss_file('rss-feed.xml')
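
Once parsed, a feed can be handled like any other Python object. The sketch
below lists the most recent item titles. It assumes that a parsed RSS feed
exposes an ``items`` list whose elements carry ``title`` and ``pub_date``
attributes; check the library's RSS classes before relying on these names.
``latest_titles`` is a hypothetical helper, and the ``SimpleNamespace`` objects
are stand-ins so that the snippet runs without a feed file.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def latest_titles(feed, limit=3):
    """Return the titles of the most recently published items, newest first."""
    # Items without a publication date cannot be ordered, so skip them.
    dated = [item for item in feed.items if item.pub_date is not None]
    dated.sort(key=lambda item: item.pub_date, reverse=True)
    return [item.title for item in dated[:limit]]


# Stand-in for a parsed feed (assumption: real items expose the same names).
demo_feed = SimpleNamespace(items=[
    SimpleNamespace(title='Older',
                    pub_date=datetime(2020, 1, 1, tzinfo=timezone.utc)),
    SimpleNamespace(title='Undated', pub_date=None),
    SimpleNamespace(title='Newer',
                    pub_date=datetime(2021, 6, 1, tzinfo=timezone.utc)),
])

print(latest_titles(demo_feed))
```

With a real feed, the object returned by the RSS parser would be passed in
place of ``demo_feed``.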

Parsing feeds from the Internet is easy as well:

.. code:: python

    >>> import atoma, requests
    >>> response = requests.get('http://lucumr.pocoo.org/feed.atom')
    >>> feed = atoma.parse_atom_bytes(response.content)
    >>> feed.title.value
    "Armin Ronacher's Thoughts and Writings"

Features
--------

* RSS 2.0 - RSS 2.0 Specification
* Atom Syndication Format v1 - RFC4287
* JSON Feed v1 - JSON Feed specification
* OPML 2.0, to share lists of feeds - OPML 2.0
* Typed: feeds decomposed into meaningful Python objects
* Secure: uses defusedxml to load untrusted feeds
* Compatible with Python 3.6+
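
Because each format has its own parser, code that downloads arbitrary feeds
has to pick one. A minimal sketch of choosing by HTTP ``Content-Type`` follows.
``pick_parser_name`` and ``fetch_feed`` are hypothetical helpers, not part of
the library. The names ``parse_rss_bytes`` and ``parse_json_feed_bytes`` are
assumed by analogy with ``parse_atom_bytes``, and treating every unrecognised
media type as RSS is a simplifying assumption.

```python
def pick_parser_name(content_type: str) -> str:
    """Map an HTTP Content-Type header value to an atoma parser name."""
    # Drop parameters such as "; charset=utf-8" and normalise case.
    mime = content_type.split(';', 1)[0].strip().lower()
    if mime == 'application/atom+xml':
        return 'parse_atom_bytes'
    if mime in ('application/feed+json', 'application/json'):
        return 'parse_json_feed_bytes'  # assumed name, see the text above
    # Assumption: everything else (application/rss+xml, text/xml, ...) is RSS.
    return 'parse_rss_bytes'  # assumed name, see the text above


def fetch_feed(url: str, timeout: float = 10.0):
    """Download a feed and parse it with the parser matching its media type."""
    # Third-party imports are kept local so that pick_parser_name stays
    # usable without atoma or requests installed.
    import atoma
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    name = pick_parser_name(response.headers.get('Content-Type', ''))
    return getattr(atoma, name)(response.content)


print(pick_parser_name('application/atom+xml; charset=utf-8'))
```

Servers often mislabel feeds, so a real application would probably want a
fallback that tries another parser when the first one rejects the document.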

Security warning
----------------

If you use this library to display content from feeds in a web page, you NEED
to clean the HTML contained in the feeds to prevent cross-site scripting (XSS).
The bleach library is recommended for cleaning feeds.
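
As an illustration, the sketch below passes untrusted feed HTML through
``bleach.clean`` with a small allow-list before it is rendered.
``clean_feed_html`` is a hypothetical helper, and the allowed tags and
attributes are assumptions to adapt to your pages. If bleach is not installed,
the helper falls back to escaping all markup with the standard library, which
is safe but discards formatting.

```python
import html

try:
    import bleach  # third-party sanitizer
except ImportError:
    bleach = None  # fall back to escaping everything

# Assumed allow-list: a few inline and paragraph tags, links with href/title.
ALLOWED_TAGS = {'a', 'b', 'em', 'i', 'p', 'strong'}
ALLOWED_ATTRIBUTES = {'a': ['href', 'title']}


def clean_feed_html(untrusted: str) -> str:
    """Return a version of untrusted feed HTML that is safer to embed."""
    if bleach is None:
        # No sanitizer available: render the markup as literal text.
        return html.escape(untrusted)
    # strip=True removes disallowed tags instead of escaping them.
    return bleach.clean(
        untrusted,
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRIBUTES,
        strip=True,
    )


print(clean_feed_html('<p>hello</p><script>alert(1)</script>'))
```

The cleaned string is what should reach the template; the raw value from the
feed should never be marked as safe HTML.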

Useful Resources
----------------

To use this library a basic understanding of feeds is required. For Atom, the
Introduction to Atom is a must-read. RFC 4287 can help clear up some
ambiguities. Finally, the feed validator is great for testing hand-crafted
feeds.

For RSS, the RSS specification and rssboard.org have a ton of information and
examples.

For OPML, the OPML specification has a paragraph dedicated to its usage for
syndication.

Non-implemented Features
------------------------

Some seldom-used features are not implemented:

* XML signature and encryption
* Some Atom and RSS extensions
* Atom content other than ``text``, ``html`` and ``xhtml``

License
-------

MIT