{"id":25922071,"url":"https://github.com/gitup24/jsoup_scraper","last_synced_at":"2025-07-11T14:15:06.843Z","repository":{"id":278081669,"uuid":"934462708","full_name":"gitup24/jsoup_scraper","owner":"gitup24","description":null,"archived":false,"fork":false,"pushed_at":"2025-02-17T22:07:12.000Z","size":0,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-17T22:33:36.593Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gitup24.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-17T21:56:52.000Z","updated_at":"2025-02-17T21:59:41.000Z","dependencies_parsed_at":"2025-02-17T22:43:57.660Z","dependency_job_id":null,"html_url":"https://github.com/gitup24/jsoup_scraper","commit_stats":null,"previous_names":["gitup24/jsoup_scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitup24%2Fjsoup_scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitup24%2Fjsoup_scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitup24%2Fjsoup_scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gitup24%2Fjsoup_scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gitup24","download_url":"https://codeload.github.com/gitup24/jsoup_scraper/tar.gz/refs/h
eads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241696160,"owners_count":20004748,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-03T16:17:18.035Z","updated_at":"2025-03-03T16:17:18.749Z","avatar_url":"https://github.com/gitup24.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# jsoup: Java HTML Parser\n\n**jsoup** is a Java library that makes it easy to work with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS, and xpath selectors.\n\n**jsoup** implements the [WHATWG HTML5](https://html.spec.whatwg.org/multipage/) specification, and parses HTML to the same DOM as modern browsers.\n\n* scrape and [parse](https://jsoup.org/cookbook/input/parse-document-from-string) HTML from a URL, file, or string\n* find and [extract data](https://jsoup.org/cookbook/extracting-data/selector-syntax), using DOM traversal or CSS selectors\n* manipulate the [HTML elements](https://jsoup.org/cookbook/modifying-data/set-html), attributes, and text\n* [clean](https://jsoup.org/cookbook/cleaning-html/safelist-sanitizer) user-submitted content against a safe-list, to prevent XSS attacks\n* output tidy HTML\n\njsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree.\n\nSee [**jsoup.org**](https://jsoup.org/) for downloads and the full [API 
documentation](https://jsoup.org/apidocs/).\n\n[![Build Status](https://github.com/jhy/jsoup/workflows/Build/badge.svg)](https://github.com/jhy/jsoup/actions?query=workflow%3ABuild)\n\n## Example\nFetch the [Wikipedia](https://en.wikipedia.org/wiki/Main_Page) homepage, parse it to a [DOM](https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Introduction), and select the headlines from the *In the News* section into a list of [Elements](https://jsoup.org/apidocs/org/jsoup/select/Elements.html):\n\n```java\nDocument doc = Jsoup.connect(\"https://en.wikipedia.org/\").get();\nlog(doc.title());\nElements newsHeadlines = doc.select(\"#mp-itn b a\");\nfor (Element headline : newsHeadlines) {\n  log(\"%s\\n\\t%s\", \n    headline.attr(\"title\"), headline.absUrl(\"href\"));\n}\n```\n[Online sample](https://try.jsoup.org/~LGB7rk_atM2roavV0d-czMt3J_g), [full source](https://github.com/jhy/jsoup/blob/master/src/main/java/org/jsoup/examples/Wikipedia.java).\n\n## Open source\njsoup is an open source project distributed under the liberal [MIT license](https://jsoup.org/license). The source code is available on [GitHub](https://github.com/jhy/jsoup).\n\n## Getting started\n1. [Download](https://jsoup.org/download) the latest jsoup jar (or add it to your Maven/Gradle build)\n2. Read the [cookbook](https://jsoup.org/cookbook/)\n3. 
Enjoy!\n\n### Android support\nWhen used in Android projects, [core library desugaring](https://developer.android.com/studio/write/java8-support#library-desugaring) with the [NIO specification](https://developer.android.com/studio/write/java11-nio-support-table) should be enabled to support Java 8+ features.\n\n## Development and support\nIf you have any questions about how to use jsoup, or have ideas for future development, please get in touch via [jsoup Discussions](https://github.com/jhy/jsoup/discussions).\n\nIf you find any issues, please file a [bug](https://jsoup.org/bugs) report after checking for duplicates.\n\nThe [colophon](https://jsoup.org/colophon) covers the history of jsoup and the tools used to build it.\n\n## Status\njsoup is in general release and is stable.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgitup24%2Fjsoup_scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgitup24%2Fjsoup_scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgitup24%2Fjsoup_scraper/lists"}