{"id":16487766,"url":"https://github.com/fiam/readable","last_synced_at":"2025-09-10T14:24:29.150Z","repository":{"id":1869586,"uuid":"2794792","full_name":"fiam/readable","owner":"fiam","description":"C library for extracting interesting content from web pages","archived":false,"fork":false,"pushed_at":"2013-01-22T19:46:58.000Z","size":642,"stargazers_count":28,"open_issues_count":1,"forks_count":6,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-12-03T07:14:23.749Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"feross/webtorrent","license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fiam.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2011-11-17T11:01:29.000Z","updated_at":"2019-11-19T23:49:47.000Z","dependencies_parsed_at":"2022-09-09T09:00:26.364Z","dependency_job_id":null,"html_url":"https://github.com/fiam/readable","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fiam%2Freadable","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fiam%2Freadable/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fiam%2Freadable/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fiam%2Freadable/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fiam","download_url":"https://codeload.github.com/fiam/readable/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228413881,"owners_count":17915914,
"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-11T13:35:47.235Z","updated_at":"2024-12-06T04:56:56.724Z","avatar_url":"https://github.com/fiam.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# C implementation of the Readability algorithm (plus some goodies)\n\n## Dependencies:\n\n- libxml2\n- libpcre or ICU for regular expressions support\n\n\n## Building:\n\n- make\n\n## Building with ICU rather than pcre:\n\n- ICU=1 make\n\nBy default, both the readable program and the Python extension\nwill be built.\n\n## Building for OS X using Xcode\n\n- Create a new directory named readable\n- Copy readable.h and readable.c in the newly created directory\n- Copy the directory named unicode from the ICU headers into your project\n(you can get it from the iPhoneSimulator SDK, under /usr/include/unicode)\n- Add the readable parent directory, the unicode parent directory and\n/usr/include/libxml2 to Header Search Path under Build Settings\n- Add libicucore.dylib and libxml2.xylib to the Link Binary with libraries\nBuild Phase\n- In your code, import readable.h\n\n## Building for iOS using Xcode\n\n- Create a new directory named readable\n- Copy readable.h and readable.c in the newly created directory\n- Add the readable parent directory and /usr/include/libxml2 to\nHeader Search Path under Build Settings\n- Add libicucore.dylib and libxml2.xylib to the Link Binary with libraries\nBuild Phase\n- In your code, import readable.h\n\n## API:\n\n### char * readable(const char *html, const char *url, const char *encoding, int 
options)\n\n\nParses HTML and extracts the interesting content.\n\n\n- html: HTML code to parse\n- url: URL where this HTML was fetched from\n- encoding: HTML encoding\n- options: See readable.h for the available options\n\n### char * next_page_url(const char *html, const char *url, const char *encoding);\n\n\nReturns the URL of the next page in a multipage article (still alpha quality):\n\n\n- html: HTML code to parse\n- url: URL where this HTML was fetched from\n- encoding: HTML encoding\n\n## License\n\nThis code is licensed under the AGPLv3. If you'd like to use the code under\na different license, drop me a line at alberto@garciahierro.com\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffiam%2Freadable","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffiam%2Freadable","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffiam%2Freadable/lists"}