{"id":13701075,"url":"https://github.com/Ranchero-Software/RSParser","last_synced_at":"2025-05-04T20:31:59.267Z","repository":{"id":42384818,"uuid":"115465328","full_name":"Ranchero-Software/RSParser","owner":"Ranchero-Software","description":"Parser for RSS, Atom, JSON Feed, RSS-inJSON, OPML, and HTML.","archived":true,"fork":false,"pushed_at":"2024-04-08T02:44:17.000Z","size":1417,"stargazers_count":365,"open_issues_count":11,"forks_count":39,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-04-03T02:18:32.509Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://inessential.com/2017/12/26/evergreens_parser_as_separate_open_sour","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ranchero-Software.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-27T00:36:52.000Z","updated_at":"2025-03-07T18:28:50.000Z","dependencies_parsed_at":"2024-04-08T03:47:58.346Z","dependency_job_id":null,"html_url":"https://github.com/Ranchero-Software/RSParser","commit_stats":null,"previous_names":["brentsimmons/rsparser"],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ranchero-Software%2FRSParser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ranchero-Software%2FRSParser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ranchero-Software%2FRSParser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ranchero-Software%2FRSP
arser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ranchero-Software","download_url":"https://codeload.github.com/Ranchero-Software/RSParser/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252395416,"owners_count":21741033,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T20:01:16.504Z","updated_at":"2025-05-04T20:31:54.255Z","avatar_url":"https://github.com/Ranchero-Software.png","language":"HTML","funding_links":[],"categories":["HTML"],"sub_categories":[],"readme":"# RSParser\n\nThis framework was developed for [NetNewsWire](https://github.com/brentsimmons/NetNewsWire) and is made available here for developers who just need the parsing code. It has no dependencies that aren’t provided by the system.\n\n_Update 6 Feb. 2018_: RSParser is now a CocoaPod, with the much-appreciated help of [Silver Fox](https://github.com/dcilia). (We _think_ it worked, anyway. Looked like it did.)\n\n## What’s inside\n\nThis framework includes parsers for:\n\n* [RSS](http://cyber.harvard.edu/rss/rss.html), [Atom](https://tools.ietf.org/html/rfc4287), [JSON Feed](https://jsonfeed.org/), and [RSS-in-JSON](https://github.com/scripting/Scripting-News/blob/master/rss-in-json/README.md)\n* [OPML](http://dev.opml.org/)\n* Internet dates\n* HTML metadata and links\n* HTML entities\n\nIt also includes Objective-C wrappers for libXML2’s XML SAX and HTML SAX parsers. You can write your own parsers on top of these.\n\nThis framework builds for macOS. 
It *could* be made to build for iOS also, but I haven’t gotten around to it yet.\n\n## How to parse feeds\n\nTo get the type of a feed, even with partial data, call `FeedParser.feedType(parserData)`, which will return a `FeedType`.\n\nTo parse a feed, call `FeedParser.parse(parserData)`, which will return a [ParsedFeed](Feeds/ParsedFeed.swift). Also see related structs: `ParsedAuthor`, `ParsedItem`, `ParsedAttachment`, and `ParsedHub`.\n\nYou do *not* need to know the type of feed when calling `FeedParser.parse` — it will figure it out and use the correct concrete parser.\n\nHowever, if you do want to use a concrete parser directly, see [RSSInJSONParser](Feeds/JSON/RSSInJSONParser.swift), [JSONFeedParser](Feeds/JSON/JSONFeedParser.swift), [RSSParser](Feeds/XML/RSSParser.swift), and [AtomParser](Feeds/XML/AtomParser.swift).\n\n(Note: if you want to write a feed reader app, please do! You have my blessing and encouragement. Let me know when it’s shipping so I can check it out.)\n\n## How to parse OPML\n\nCall `+[RSOPMLParser parseOPMLWithParserData:error:]`, which returns an `RSOPMLDocument`. See related objects: `RSOPMLItem`, `RSOPMLAttributes`, `RSOPMLFeedSpecifier`, and `RSOPMLError`.\n\n## How to parse dates\n\nCall `RSDateWithString` or `RSDateWithBytes` (see `RSDateParser`). These handle the common internet date formats. You don’t need to know which format.\n\n## How to parse HTML\n\nTo get an array of `\u003ca href=…` links from an HTML document, call `+[RSHTMLLinkParser htmlLinksWithParserData:]`. It returns an array of `RSHTMLLink`.\n\nTo parse the metadata in an HTML document, call `+[RSHTMLMetadataParser HTMLMetadataWithParserData:]`. It returns an `RSHTMLMetadata` object.\n\nTo write your own HTML parser, see `RSSAXHTMLParser`. 
The two parsers above can serve as examples.\n\n## How to parse HTML entities\n\nWhen you have a string with things like `\u0026#8212;` and `\u0026euml;` and you want to turn those into the correct characters, call `-[NSString rsparser_stringByDecodingHTMLEntities]`. (See `NSString+RSParser.h`.)\n\n## How to parse XML\n\nIf you need to parse some XML that isn’t RSS, Atom, or OPML, you can use `RSSAXParser`. Don’t subclass it — instead, create an `RSSAXParserDelegate`. See `RSRSSParser`, `RSAtomParser`, and `RSOPMLParser` as examples.\n\n### Why use libXML2’s SAX API?\n\nSAX is kind of a pain because of all the state you have to manage.\n\nAn alternative is to use `NSXMLParser`, which is event-driven like SAX. However, `RSSAXParser` was written to avoid allocating Objective-C objects except when absolutely needed. You’ll note use of things like `memcpy` and `strncmp`.\n\nNormally I avoid this kind of thing *strenuously*. I prefer to work at the highest level possible.\n\nBut my more-than-a-decade of experience parsing XML has led me to this solution, which — last time I checked, which was, admittedly, a few years ago — was not only the fastest but also used the least memory. (The two things are related, of course: creating objects is bad for performance, so this code attempts to do the minimum possible.)\n\nAll that low-level stuff is encapsulated, however. If you just want to parse one of the popular feed formats, see `FeedParser`, which makes it easy and Swift-y.\n\n## Thread safety\n\nEverything here is thread-safe.\n\nEverything’s pretty fast, too, so you probably could just use the main thread/queue. But it’s totally a-okay to use a non-serial background queue.\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRanchero-Software%2FRSParser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FRanchero-Software%2FRSParser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FRanchero-Software%2FRSParser/lists"}