https://github.com/Promptly-Technologies-LLC/rss-fetch-action
Github Action to scrape an RSS feed to display on a Github Pages website
- Host: GitHub
- URL: https://github.com/Promptly-Technologies-LLC/rss-fetch-action
- Owner: Promptly-Technologies-LLC
- License: mit
- Created: 2023-09-25T18:03:58.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-01T07:31:51.000Z (6 months ago)
- Last Synced: 2024-08-03T01:17:56.909Z (6 months ago)
- Language: JavaScript
- Size: 1.64 MB
- Stars: 9
- Watchers: 0
- Forks: 0
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: CODEOWNERS
Awesome Lists containing this project
- awesome-rainmana - Promptly-Technologies-LLC/rss-fetch-action - Github Action to scrape an RSS feed to display on a Github Pages website (JavaScript)
README
# RSS Feed Fetch Action
[![Github Super-Linter](https://github.com/Promptly-Technologies-LLC/rss-fetch-action/actions/workflows/linter.yml/badge.svg)](https://github.com/Promptly-Technologies-LLC/rss-fetch-action/actions/workflows/linter.yml)
![CI](https://github.com/actions/javascript-action/actions/workflows/ci.yml/badge.svg)
## Introduction
The RSS Feed Fetch Action is a GitHub Action designed to automate the fetching of RSS feeds. It fetches an RSS feed from a given URL and saves it to a specified file in your GitHub repository. This action is particularly useful for populating content on GitHub Pages websites or on sites built with other static site generators.
This GitHub Action is a wrapper around the [feed-extractor](https://www.npmjs.com/package/@extractus/feed-extractor) library's `extract` function. Understanding the `extract` function's parameters will enable you to make the most of this GitHub Action. This tool offers powerful parsing and standardization across a wide range of different feed formats, while also enabling you to save feeds in an unopinionated and non-standardized way if you so choose.
## Features
- Fetches RSS, Atom, RDF, and JSON feeds
- Customizable parser and fetch options
- Saves the fetched RSS feed to a specified `.json` file
- Optionally removes the `published` field from the fetched feed to prevent unnecessary commits
## Usage
Here's a basic example to add to your GitHub Actions workflow YAML file:
```yaml
name: Fetch RSS Feed

on:
  push:
    branches:
      - main

jobs:
  fetch-rss:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Fetch RSS Feed
        uses: Promptly-Technologies-LLC/rss-fetch-action@v2
        with:
          feed_url: 'https://example.com/rss'
          file_path: './feed.json'

      - name: Commit and push changes to repository
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: 'Update RSS feed'
          file_pattern: '*.json'
```
In this workflow, we fetch the RSS feed at `https://example.com/rss` and save it to the file `./feed.json`. We then commit and push the changes to the repository.
By default, the saved output will have the format:
```
{
  title: String,
  link: String,
  description: String,
  generator: String,
  language: String,
  published: ISO Date String,
  entries: Array[
    {
      id: String,
      title: String,
      link: String,
      description: String,
      published: ISO Datetime String
    },
    // ...
  ]
}
```
## Advanced Usage
To customize the fetch and parser options used in calling `feed-extractor`, you can use the `parser_options` and `fetch_options` inputs. For example, if you want to fetch the original, unaltered feed rather than impose a standardized format, you can set `parser_options` to `{"normalization": false}`. For more information on the available options, see the [feed-extractor README](https://www.npmjs.com/package/@extractus/feed-extractor#extract).
A `remove_published` option is also available. If set to `true`, this option will remove the `published` field from the fetched feed. This is useful if you want to prevent unnecessary commits to your repository. Many Atom feed providers update the `published` field once an hour, causing the feed to appear as if it has been updated (fail a `diff` check) even though none of the actual content has changed. You can prevent this from happening by removing the `published` field.
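To make the effect concrete, here is a minimal JavaScript sketch. It is not taken from the action's source: the helper name `stripPublished` and the sample feed objects are hypothetical, and it assumes that only the top-level `published` field is removed while each entry keeps its own date.

```javascript
// Hypothetical helper, not taken from the action's source.
// Assumption: only the top-level `published` field is dropped;
// each entry keeps its own `published` value.
function stripPublished(feed) {
  const { published, ...rest } = feed;
  return rest;
}

// Two fetches of the same feed, one hour apart: only the
// feed-level timestamp differs (the sample data is made up).
const firstFetch = {
  title: 'Example Feed',
  published: '2024-01-01T10:00:00.000Z',
  entries: [{ id: '1', title: 'Hello', published: '2023-12-31T08:00:00.000Z' }],
};
const secondFetch = { ...firstFetch, published: '2024-01-01T11:00:00.000Z' };

// Serialize the way a saved JSON file would be compared.
const serialize = (feed) => JSON.stringify(feed, null, 2);

const changedWithField = serialize(firstFetch) !== serialize(secondFetch);
const changedWithoutField =
  serialize(stripPublished(firstFetch)) !== serialize(stripPublished(secondFetch));

console.log(changedWithField);    // true: the file differs, so a commit would be made
console.log(changedWithoutField); // false: identical output, nothing to commit
```

With the field in place, every hourly timestamp bump changes the saved file; without it, the file changes only when the feed's content does.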
In the example below, we fetch a Substack Atom feed and save it to the file `./feed.json`. Because we want to get entire blog posts rather than just titles and descriptions, we request the 'content:encoded' field for each entry in the Atom feed. We also request a human-readable date format rather than an ISO timestamp. To achieve this, we pass a `parser_options` object with `useISODateFormat` and `getExtraEntryFields`. And finally, we opt to remove the published date from the blog feed, since many Atom feed providers update this field once an hour, causing unnecessary commits to the repository.
```yaml
name: Fetch RSS Feed

on:
  push:
    branches:
      - main

jobs:
  fetch-rss:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Fetch RSS Feed
        uses: Promptly-Technologies-LLC/rss-fetch-action@v2
        with:
          feed_url: https://knowledgeworkersguide.substack.com/feed
          file_path: ./feed.json
          parser_options: "{\"useISODateFormat\": false, \"getExtraEntryFields\": \"(feedEntry) => { return { 'content:encoded': feedEntry['content:encoded'] || '' }; }\"}"
          fetch_options: "{}"
          remove_published: true

      - name: Commit and push changes to repository
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: 'Update RSS feed'
          file_pattern: '*.json'
```
Because of the `parser_options` we specified in this example, the output of the sample code would include the `content:encoded` field for each entry, and the `published` fields would be human-readable date strings rather than ISO timestamps.
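Hand-escaping a `parser_options` value like the one in the workflow above is error-prone. One way to generate it is sketched below in JavaScript. The variable names are illustrative, and the function body is kept as a plain string because that is how the workflow passes `getExtraEntryFields`.

```javascript
// Build the options object in plain JavaScript. The function is kept
// as a string, mirroring how the workflow passes it.
const parserOptions = {
  useISODateFormat: false,
  getExtraEntryFields:
    "(feedEntry) => { return { 'content:encoded': feedEntry['content:encoded'] || '' }; }",
};

// The first stringify produces the JSON text. The second wraps that text
// in double quotes and backslash-escapes the interior quotes, which is
// the form a YAML double-quoted scalar expects.
const json = JSON.stringify(parserOptions);
const yamlValue = JSON.stringify(json);

// Print a line that can be pasted under `with:` in the workflow.
console.log(`parser_options: ${yamlValue}`);
```

The printed value should differ from the hand-written one only in whitespace, since `JSON.stringify` omits the spaces after colons and commas.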
## Inputs
### `feed_url`
**Required**
The URL(s) of the RSS feed(s) you want to fetch. Can be either a string or a JSON array (of the same length as `file_path`).
### `file_path`
**Required**
The relative file path(s) where you want to save the fetched RSS feed(s). Can be either a string or a JSON array (of the same length as `feed_url`).
### `parser_options`
**Optional**
A JSON string representing parser options. This maps directly to the `parserOptions` parameter of the feed-extractor library's `extract` function. For example, to disable ISO date formatting, you can pass `{"useISODateFormat": false}`.
### `fetch_options`
**Optional**
A JSON string representing fetch options. This maps directly to the `fetchOptions` parameter of the feed-extractor library's `extract` function. Note that you need to enclose the JSON in quotes and escape interior quote marks with backslashes (e.g., `\"`). For example, to set custom headers, you can pass `"{\"headers\": {\"user-agent\": \"Custom-Agent\"}}"`.
### `remove_published`
**Optional**
A boolean value indicating whether to remove the `published` field from the fetched feed.
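
As a closing illustration of the `feed_url` and `file_path` rule described in this section (each is either a single string or a JSON array, and arrays must match in length), the following JavaScript sketch normalizes both inputs into URL/path pairs. The helper names `toList` and `pairInputs` are hypothetical, and the action's own input handling may differ.

```javascript
// Hypothetical helpers; the action's real input handling may differ.

// Accept either a plain string or a JSON array of strings.
function toList(input) {
  const trimmed = input.trim();
  if (trimmed.startsWith('[')) {
    const parsed = JSON.parse(trimmed);
    if (!Array.isArray(parsed) || !parsed.every((item) => typeof item === 'string')) {
      throw new TypeError('Expected a JSON array of strings');
    }
    return parsed;
  }
  return [trimmed];
}

// Pair each feed URL with the file path at the same index.
function pairInputs(feedUrl, filePath) {
  const urls = toList(feedUrl);
  const paths = toList(filePath);
  if (urls.length !== paths.length) {
    throw new RangeError(
      `feed_url has ${urls.length} item(s) but file_path has ${paths.length}`
    );
  }
  return urls.map((url, index) => ({ url, path: paths[index] }));
}

// A single feed and a two-feed configuration (example URLs are placeholders).
console.log(pairInputs('https://example.com/rss', './feed.json'));
console.log(
  pairInputs(
    '["https://example.com/a.xml", "https://example.com/b.xml"]',
    '["./a.json", "./b.json"]'
  )
);
```

Failing early on a length mismatch avoids silently writing one feed to the wrong file.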