{"id":26830682,"url":"https://github.com/guichaguri/post-feed-reader","last_synced_at":"2025-04-30T08:13:52.211Z","repository":{"id":37961314,"uuid":"446229727","full_name":"Guichaguri/post-feed-reader","owner":"Guichaguri","description":"Discovers and parses news, blog and podcast posts from any website","archived":false,"fork":false,"pushed_at":"2025-01-12T20:26:22.000Z","size":57,"stargazers_count":6,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-30T08:13:47.668Z","etag":null,"topics":["atom","autodiscovery","feed","jsonfeed","parser","posts","rss","wordpress"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/post-feed-reader","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Guichaguri.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-01-09T23:47:33.000Z","updated_at":"2025-01-12T20:26:19.000Z","dependencies_parsed_at":"2022-09-13T15:31:00.488Z","dependency_job_id":"ba230f7e-1e2c-492b-b1c4-d38bcfceb707","html_url":"https://github.com/Guichaguri/post-feed-reader","commit_stats":{"total_commits":26,"total_committers":2,"mean_commits":13.0,"dds":0.07692307692307687,"last_synced_commit":"69de1d9aed2380ce4a5f5e7f78e9695fcb1a1471"},"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Guichaguri%2Fpost-feed-reader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Guichaguri%2Fpost-feed-reader/tags","re
leases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Guichaguri%2Fpost-feed-reader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Guichaguri%2Fpost-feed-reader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Guichaguri","download_url":"https://codeload.github.com/Guichaguri/post-feed-reader/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251666361,"owners_count":21624298,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atom","autodiscovery","feed","jsonfeed","parser","posts","rss","wordpress"],"created_at":"2025-03-30T14:16:59.235Z","updated_at":"2025-04-30T08:13:52.189Z","avatar_url":"https://github.com/Guichaguri.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# post-feed-reader\n\n[![npm](https://img.shields.io/npm/v/post-feed-reader.svg)](https://www.npmjs.com/package/post-feed-reader)\n[![license](https://img.shields.io/github/license/Guichaguri/post-feed-reader)](https://github.com/Guichaguri/post-feed-reader/blob/main/LICENSE)\n\nA library to fetch news, blog or podcast posts from any site.\nIt works by auto-discovering a post source, which can be an RSS/Atom/JSON feed or the Wordpress REST API, then fetches and parses the list of posts.\n\nIt's meant for NodeJS, but as it is built on Isomorphic Javascript, it can work on browsers if the website allows cross-origin requests.\n\nOriginally built for apps that need to list the posts with their own UI, but don't actually 
manage the blog and need automatic fallbacks when the blog technology does change.\n\n# Features\n- **Simple**: Two-liner usage. Just discovers and fetches the posts.\n- **Supports multiple sources**: [Wordpress REST API](https://developer.wordpress.org/rest-api/reference/posts/), [RSS 2.0.1](https://www.rssboard.org/rss-2-0-11), [RSS 2.0](https://www.rssboard.org/rss-2-0), [RSS 1.0](https://web.resource.org/rss/1.0/spec), [RSS 0.91](https://www.rssboard.org/rss-0-9-1), [Atom 1.0](https://datatracker.ietf.org/doc/html/rfc4287), [JSON Feed 1.1](https://www.jsonfeed.org/version/1.1/), [JSON Feed 1.0](https://www.jsonfeed.org/version/1/) and [RSS-in-JSON](https://github.com/scripting/Scripting-News/blob/master/rss-in-json/README.md).\n- **Auto-discovery**: Give any site URL and the library will try to find the data automatically.\n- **Pagination**: For feeds that support it, you can fetch more than a single set of posts.\n\n# Getting Started\n\nInstall it with NPM or Yarn:\n\n```sh\nnpm install post-feed-reader # or yarn add post-feed-reader\n```\n\nYou first need to discover the post source, which will return an object containing a URL to the RSS/Atom/JSON Feed or the Wordpress REST API.\n\nThen you can pass the discovered source to the `getPostList`, which will fetch and parse it.\n\n```ts\nimport { discoverPostSource, getPostList } from 'post-feed-reader';\n\n// Looks for metadata pointing to the Wordpress REST API or Atom/RSS Feeds\nconst source = await discoverPostSource('https://www.nytimes.com');\n\n// Retrieves the posts from the given source\nconst list = await getPostList(source);\n\n// Logs all post titles\nconsole.log(list.posts.map(post =\u003e post.title));\n```\n\nSimple enough, eh? 
Try it on [RunKit](https://runkit.com/guichaguri/post-feed-reader)\n\n## Output\n\nSee [an example](https://gist.github.com/Guichaguri/f3d67ae99aeb9ca20fd5a19fafeb1afb) of the post list based on the Mozilla blog.\n\n## Options\n\n```ts\nconst source = await discoverPostSource('https://techcrunch.com', {\n  // Custom axios instance\n  axios: axios.create(...),\n\n  // Whether it will prioritize feeds over the wordpress api\n  preferFeeds: false,\n\n  // Custom data source filtering\n  canUseSource: (source: DiscoveredSource) =\u003e true,\n\n  // Whether it will try to guess wordpress api and feed urls if the auto-discovery process fails\n  tryToGuessPaths: false,\n  \n  // The paths that it will query trying to guess both the Wordpress API or the RSS/Atom/JSON feed\n  wpApiPaths: ['./wp-json', '?rest_route=/'],\n  feedPaths: ['./feed', './atom', './rss', './feed.json', './feed.xml', '?feed=atom'],\n});\n\nconst posts = await getPostList(source, {\n  // Custom axios instance\n  axios: axios.create(...),\n\n  // Whether missing plain text contents will be filled automatically from html contents\n  fillTextContents: false,\n\n  // Wordpress REST API only options\n  wordpress: {\n    // Whether it will include author, taxonomy and media data from the wordpress api\n    includeEmbedded: true,\n\n    // Whether it will fetch the blog info, such as the title, description, url and images\n    // Setting this to true adds one extra http request\n    fetchBlogInfo: false,\n\n    // The amount of items to return\n    limit: 10,\n\n    // The search string filter\n    search: '',\n\n    // The author id filter\n    authors: [...],\n\n    // The category id filter\n    categories: [...],\n\n    // The tag id filter\n    tags: [...],\n\n    // Any additional querystring parameter for the wordpress api you may want to include\n    additionalParams: { ... 
},\n  },\n});\n```\n\n## Skip the auto-discovery\n\nIf you already have an Atom/RSS/JSON Feed or Wordpress REST API URL on hand, you can fetch the posts directly:\n```ts\n// RSS, Atom or JSON Feed\nconst feedPosts = await getFeedPostList('https://news.google.com/atom');\n\n// WordPress API\nconst wpApiPosts = await getWordpressPostList('https://blog.mozilla.org/en/wp-json/');\n```\n\n## Pagination\n\nThe post list may have pagination metadata attached. You can use it to navigate through pages. Here's an example:\n```ts\nconst result = await getPostList(...);\n\nif (result.pagination.next) {\n  // There is a next page!\n  \n  const nextResult = await getPostList(result.pagination.next);\n  \n  // ...\n}\n\n// You can also check for result.pagination.previous, result.pagination.first and result.pagination.last\n```\n\n## Why support other sources, isn't RSS enough?\n\nRSS is the most widely used feed format on the web, but not only does it lack information that might be important to your application, [the specification is a mess](https://www.xml.com/pub/a/2002/12/18/dive-into-xml.html) with many properties left vague for implementers, so how the information is formatted differs from feed to feed.\nFor instance, the description can be the full post as HTML, or just an excerpt, or in plain text, or even just an HTML link to the post page.\n\nAtom's specification is far more rigid and robust, which makes the data more trustworthy to rely on. It's definitely the way to go when it comes to feeds. But it still lacks some properties that can only be fetched through the Wordpress REST API.\n\nSince [WordPress is by far the most used CMS](https://w3techs.com/technologies/details/cm-wordpress), supporting its API is a great alternative. 
The Wordpress REST API offers the following on top of what RSS and Atom feeds provide:\n- Filtering by category, tag and/or author\n- Searching\n- Pagination\n- Featured media\n- Author profile \n\nThe JSON Feed format is also just as good as the Atom format, but at the moment very few websites produce it.\n\n## How does the auto-discovery work?\n\n1. Fetches the site's main page\n2. Looks for [WordPress API Link headers](https://developer.wordpress.org/rest-api/using-the-rest-api/discovery/#link-header)\n3. Looks for [RSS](https://www.rssboard.org/rss-autodiscovery), [Atom](https://blog.whatwg.org/feed-autodiscovery) and [JSON Feed](https://www.jsonfeed.org/version/1.1/#discovery) `\u003clink\u003e` tags\n4. If `tryToGuessPaths` is set to `true`, it will query a few common paths to try to find a feed or the WP API.\n\n## Most properties are optional, what am I guaranteed to have?\n\nNothing.\n\nYeah, there's no property that is required in all specs, thus we can't guarantee any of them will be present.\n\nBut! The most basic properties are very likely to be present, such as `guid`, `title` and `link`.\n\nFor all the other properties, it's highly recommended to implement your own fallbacks.\nFor instance, showing a substring of the content when the summary isn't available. \n\nThe library will try its best to fetch as much data as is available.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguichaguri%2Fpost-feed-reader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fguichaguri%2Fpost-feed-reader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fguichaguri%2Fpost-feed-reader/lists"}