{"id":14155188,"url":"https://github.com/extractus/feed-extractor","last_synced_at":"2026-05-03T09:03:58.182Z","repository":{"id":2359408,"uuid":"46329174","full_name":"extractus/feed-extractor","owner":"extractus","description":"Simplest way to read \u0026 normalize RSS/ATOM/JSON feed data","archived":false,"fork":false,"pushed_at":"2025-09-04T06:41:47.000Z","size":1180,"stargazers_count":182,"open_issues_count":5,"forks_count":36,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-11-23T05:12:10.479Z","etag":null,"topics":["atom-feed","feed-reader","jsonfeed","nodejs","rss"],"latest_commit_sha":null,"homepage":"https://extractus.pwshub.com/feed","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/extractus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-11-17T07:00:18.000Z","updated_at":"2025-11-07T02:07:13.000Z","dependencies_parsed_at":"2023-11-06T10:43:34.125Z","dependency_job_id":"e6cb0407-748d-4524-a57d-4d2033f5c734","html_url":"https://github.com/extractus/feed-extractor","commit_stats":{"total_commits":143,"total_committers":8,"mean_commits":17.875,"dds":"0.15384615384615385","last_synced_commit":"b81646a23dbec4f9224cb1b19ea10f106d28d27f"},"previous_names":["ndaidong/feed-reader"],"tags_count":54,"template":false,"template_full_name":null,"purl":"pkg:github/extractus/feed-extractor","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/extractus%2Ffeed-extractor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/extractus%2Ffeed-extractor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/extractus%2Ffeed-extractor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/extractus%2Ffeed-extractor/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/extractus","download_url":"https://codeload.github.com/extractus/feed-extractor/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/extractus%2Ffeed-extractor/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32563522,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-03T06:36:36.687Z","status":"ssl_error","status_checked_at":"2026-05-03T06:36:09.306Z","response_time":103,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atom-feed","feed-reader","jsonfeed","nodejs","rss"],"created_at":"2024-08-17T08:02:25.606Z","updated_at":"2026-05-03T09:03:58.176Z","avatar_url":"https://github.com/extractus.png","language":"JavaScript","funding_links":["https://paypal.me/ndaidong"],"categories":["rss"],"sub_categories":[],"readme":"# feed-extractor\n\nTo read \u0026 normalize RSS/ATOM/JSON feed data.\n\n[![npm 
version](https://badge.fury.io/js/@extractus%2Ffeed-extractor.svg)](https://badge.fury.io/js/@extractus%2Ffeed-extractor)\n![CodeQL](https://github.com/extractus/feed-extractor/workflows/CodeQL/badge.svg)\n![CI test](https://github.com/extractus/feed-extractor/workflows/ci-test/badge.svg)\n\n(This library is derived from [feed-reader](https://www.npmjs.com/package/feed-reader) renamed.)\n\n## Demo\n\n- [Give it a try!](https://extractus.pwshub.com/feed)\n\n\n## Install\n\n```bash\n# bun\nbun add @extractus/feed-extractor\n\n# npm\nnpm i @extractus/feed-extractor\n\n# pnpm\npnpm install @extractus/feed-extractor\n\n# yarn\nyarn add @extractus/feed-extractor\n```\n\n## Usage\n\n```ts\nimport { extract } from '@extractus/feed-extractor'\n\nconst data = await extract(RSS_URL)\nconsole.log(data)\n```\n\n## Automate RSS feed extraction with GitHub Actions\n\n[RSS Feed Fetch Action](https://github.com/Promptly-Technologies-LLC/rss-fetch-action) is a GitHub Action designed to automate the fetching of RSS feeds.\nIt fetches an RSS feed from a given URL and saves it to a specified file in your GitHub repository.\nThis action is particularly useful for populating content on GitHub Pages websites or other static site generators.\n\n\n## CJS Deprecated\n\nCJS is deprecated for this package.  When calling `require('@extractus/feed-extractor')` a deprecation warning is now logged.  You should update your code to use the ESM export.\n\n- You can ignore this warning via the environment variable `FEED_EXTRACTOR_CJS_IGNORE_WARNING=true`\n- To see where the warning is coming from you can set the environment variable `FEED_EXTRACTOR_CJS_TRACE_WARNING=true`\n\n\n## APIs\n\n- [extract()](#extract)\n- [extractFromJson()](#extractfromjson)\n- [extractFromXml()](#extractfromxml)\n\n#### Note:\n\n- *Old method `read()` has been marked as deprecated and will be removed in next major release.*\n\n---\n\n### `extract()`\n\nLoad and extract feed data from given RSS/ATOM/JSON source. 
Return a Promise object.\n\n#### Syntax\n\n```ts\nextract(String url)\nextract(String url, Object parserOptions)\nextract(String url, Object parserOptions, Object fetchOptions)\n```\n\nExample:\n\n```js\nimport { extract } from '@extractus/feed-extractor'\n\nconst result = await extract('https://news.google.com/atom')\nconsole.log(result)\n```\n\nWithout any options, the result should have the following structure:\n\n```ts\n{\n  title: String,\n  link: String,\n  description: String,\n  generator: String,\n  language: String,\n  published: ISO Date String,\n  entries: Array[\n    {\n      id: String,\n      title: String,\n      link: String,\n      description: String,\n      published: ISO Datetime String\n    },\n    // ...\n  ]\n}\n```\n\n#### Parameters\n\n##### `url` *required*\n\nURL of a valid feed source\n\nFeed content must be accessible and conform to one of the following standards:\n\n  - [RSS Feed](https://www.rssboard.org/rss-specification)\n    - [RDF Feed](https://web.resource.org/rss/1.0/spec)\n  - [ATOM Feed](https://datatracker.ietf.org/doc/html/rfc5023)\n  - [JSON Feed](https://www.jsonfeed.org/version/1.1/)\n\n##### `parserOptions` *optional*\n\nObject with all or several of the following properties:\n\n  - `normalization`: Boolean, normalize feed data or keep original. Default `true`.\n  - `useISODateFormat`: Boolean, convert datetime to ISO format. Default `true`.\n  - `descriptionMaxLen`: Number, to truncate description. Default `250` characters. 
Set to `0` = no truncation.\n  - `xmlParserOptions`: Object, used by xml parser, view [fast-xml-parser's docs](https://github.com/NaturalIntelligence/fast-xml-parser/blob/master/docs/v4/2.XMLparseOptions.md)\n  - `getExtraFeedFields`: Function, to get more fields from feed data\n  - `getExtraEntryFields`: Function, to get more fields from feed entry data\n  - `baseUrl`: URL string, to absolutify the links within feed content\n\nFor example:\n\n```ts\nimport { extract } from '@extractus/feed-extractor'\n\nawait extract('https://news.google.com/atom', {\n  useISODateFormat: false\n})\n\nawait extract('https://news.google.com/rss', {\n  useISODateFormat: false,\n  getExtraFeedFields: (feedData) =\u003e {\n    return {\n      subtitle: feedData.subtitle || ''\n    }\n  },\n  getExtraEntryFields: (feedEntry) =\u003e {\n    const {\n      enclosure,\n      category\n    } = feedEntry\n    return {\n      enclosure: {\n        url: enclosure['@_url'],\n        type: enclosure['@_type'],\n        length: enclosure['@_length']\n      },\n      category: isString(category) ? 
category : {\n        text: category['@_text'],\n        domain: category['@_domain']\n      }\n    }\n  }\n})\n```\n\n##### `fetchOptions` *optional*\n\n`fetchOptions` is an object that can have the following properties:\n\n- `headers`: to set request headers\n- `proxy`: another endpoint to forward the request to\n- `agent`: an HTTP proxy agent\n- `signal`: AbortController signal or AbortSignal timeout to terminate the request\n\nFor example, you can use this parameter to set request headers as below:\n\n```js\nimport { extract } from '@extractus/feed-extractor'\n\nconst url = 'https://news.google.com/rss'\nawait extract(url, null, {\n  headers: {\n    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'\n  }\n})\n```\n\nYou can also specify a proxy endpoint to load remote content, instead of fetching directly.\n\nFor example:\n\n```js\nimport { extract } from '@extractus/feed-extractor'\n\nconst url = 'https://news.google.com/rss'\n\nawait extract(url, null, {\n  headers: {\n    'user-agent': 'Opera/9.60 (Windows NT 6.0; U; en) Presto/2.1.1'\n  },\n  proxy: {\n    target: 'https://your-secret-proxy.io/loadXml?url=',\n    headers: {\n      'Proxy-Authorization': 'Bearer YWxhZGRpbjpvcGVuc2VzYW1l...'\n    }\n  }\n})\n```\n\nAnother way to work with a proxy is to use the `agent` option instead of `proxy`, as below:\n\n```js\nimport { extract } from '@extractus/feed-extractor'\n\nimport { HttpsProxyAgent } from 'https-proxy-agent'\n\nconst proxy = 'http://abc:RaNdoMpasswORd_country-France@proxy.packetstream.io:31113'\n\nconst url = 'https://news.google.com/rss'\n\nconst feed = await extract(url, null, {\n  agent: new HttpsProxyAgent(proxy),\n})\nconsole.log('Run feed-extractor with proxy:', proxy)\nconsole.log(feed)\n```\n\nFor more info about [https-proxy-agent](https://www.npmjs.com/package/https-proxy-agent), check [its repo](https://github.com/TooTallNate/proxy-agents).\n\nBy default, there is no request timeout. 
You can use the `signal` option to cancel the request when needed.\n\nA common way is to use AbortController:\n\n```js\nconst controller = new AbortController()\n\n// stop after 5 seconds\nsetTimeout(() =\u003e {\n  controller.abort()\n}, 5000)\n\nconst data = await extract(url, null, {\n  signal: controller.signal,\n})\n```\n\nA newer solution is AbortSignal's `timeout()` static method:\n\n```js\n// stop after 5 seconds\nconst data = await extract(url, null, {\n  signal: AbortSignal.timeout(5000),\n})\n```\n\nFor more info:\n\n- [AbortController constructor](https://developer.mozilla.org/en-US/docs/Web/API/AbortController)\n- [AbortSignal: timeout() static method](https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/timeout_static)\n\n\n### `extractFromJson()`\n\nExtract feed data from JSON string.\nReturn an object which contains feed data.\n\n#### Syntax\n\n```ts\nextractFromJson(String json)\nextractFromJson(String json, Object parserOptions)\n```\n\nExample:\n\n```js\nimport { extractFromJson } from '@extractus/feed-extractor'\n\nconst url = 'https://www.jsonfeed.org/feed.json'\n// this resource provides data in JSON feed format\n// so we fetch remote content as json\n// then pass to feed-extractor\nconst res = await fetch(url)\nconst json = await res.json()\n\nconst feed = extractFromJson(json)\nconsole.log(feed)\n```\n\n#### Parameters\n\n##### `json` *required*\n\nJSON string loaded from JSON feed resource.\n\n##### `parserOptions` *optional*\n\nSee [parserOptions](#parseroptions-optional) above.\n\n\n### `extractFromXml()`\n\nExtract feed data from XML string.\nReturn an object which contains feed data.\n\n#### Syntax\n\n```ts\nextractFromXml(String xml)\nextractFromXml(String xml, Object parserOptions)\n```\n\nExample:\n\n```js\nimport { extractFromXml } from '@extractus/feed-extractor'\n\nconst url = 'https://news.google.com/atom'\n// this resource provides data in ATOM feed format\n// so we fetch remote content as text\n// then pass to 
feed-extractor\nconst res = await fetch(url)\nconst xml = await res.text()\n\nconst feed = extractFromXml(xml)\nconsole.log(feed)\n```\n\n#### Parameters\n\n##### `xml` *required*\n\nXML string loaded from RSS/ATOM feed resource.\n\n##### `parserOptions` *optional*\n\nSee [parserOptions](#parseroptions-optional) above.\n\n\n## Test\n\n```bash\ngit clone https://github.com/extractus/feed-extractor.git\ncd feed-extractor\npnpm i\npnpm test\n```\n\n![feed-extractor-test.png](https://i.imgur.com/2b5xt6S.png)\n\n\n## Quick evaluation\n\n```bash\ngit clone https://github.com/extractus/feed-extractor.git\ncd feed-extractor\npnpm i\npnpm eval https://news.google.com/rss\n```\n\n## License\nThe MIT License (MIT)\n\n## Support the project\n\nIf you find value from this open source project, you can support in the following ways:\n\n- Give it a star ⭐\n- Buy me a coffee: https://paypal.me/ndaidong 🍵\n- Subscribe [Feed Reader service](https://rapidapi.com/pwshub-pwshub-default/api/feed-reader1/) on RapidAPI 😉\n\nThank you.\n\n---\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fextractus%2Ffeed-extractor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fextractus%2Ffeed-extractor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fextractus%2Ffeed-extractor/lists"}