{"id":13461667,"url":"https://github.com/scrapy/parsel","last_synced_at":"2025-05-14T03:08:06.779Z","repository":{"id":30969532,"uuid":"34527721","full_name":"scrapy/parsel","owner":"scrapy","description":"Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors","archived":false,"fork":false,"pushed_at":"2025-05-12T06:07:25.000Z","size":908,"stargazers_count":1224,"open_issues_count":44,"forks_count":152,"subscribers_count":35,"default_branch":"master","last_synced_at":"2025-05-12T07:24:17.057Z","etag":null,"topics":["css","hacktoberfest","lxml","python","scraping","selectors","xml","xpath"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scrapy.png","metadata":{"files":{"readme":"README.rst","changelog":"NEWS","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-04-24T15:53:36.000Z","updated_at":"2025-05-12T06:06:46.000Z","dependencies_parsed_at":"2024-04-24T19:46:01.173Z","dependency_job_id":"24160ee8-ccf0-420b-a515-29fd29cfc33a","html_url":"https://github.com/scrapy/parsel","commit_stats":{"total_commits":629,"total_committers":51,"mean_commits":"12.333333333333334","dds":0.8282988871224165,"last_synced_commit":"54678b5425650a37780b2ef85225c2e45fe88e3d"},"previous_names":[],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy%2Fparsel","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy%2Fparsel/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy%2Fparsel/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scrapy%2Fparsel/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scrapy","download_url":"https://codeload.github.com/scrapy/parsel/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253694283,"owners_count":21948660,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["css","hacktoberfest","lxml","python","scraping","selectors","xml","xpath"],"created_at":"2024-07-31T11:00:51.590Z","updated_at":"2025-05-14T03:08:06.762Z","avatar_url":"https://github.com/scrapy.png","language":"Python","readme":"======\nParsel\n======\n\n.. image:: https://github.com/scrapy/parsel/actions/workflows/tests-ubuntu.yml/badge.svg\n   :target: https://github.com/scrapy/parsel/actions/workflows/tests-ubuntu.yml\n   :alt: Tests\n\n.. image:: https://img.shields.io/pypi/pyversions/parsel.svg\n   :target: https://github.com/scrapy/parsel/actions/workflows/tests.yml\n   :alt: Supported Python versions\n\n.. image:: https://img.shields.io/pypi/v/parsel.svg\n   :target: https://pypi.python.org/pypi/parsel\n   :alt: PyPI Version\n\n.. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg\n   :target: https://codecov.io/github/scrapy/parsel?branch=master\n   :alt: Coverage report\n\n\nParsel is a BSD-licensed Python_ library to extract data from HTML_, JSON_, and\nXML_ documents.\n\nIt supports:\n\n-   CSS_ and XPath_ expressions for HTML and XML documents\n\n-   JMESPath_ expressions for JSON documents\n\n-   `Regular expressions`_\n\nFind the Parsel online documentation at https://parsel.readthedocs.org.\n\nExample (`open online demo`_):\n\n.. code-block:: python\n\n    \u003e\u003e\u003e from parsel import Selector\n    \u003e\u003e\u003e text = \"\"\"\n            \u003chtml\u003e\n                \u003cbody\u003e\n                    \u003ch1\u003eHello, Parsel!\u003c/h1\u003e\n                    \u003cul\u003e\n                        \u003cli\u003e\u003ca href=\"http://example.com\"\u003eLink 1\u003c/a\u003e\u003c/li\u003e\n                        \u003cli\u003e\u003ca href=\"http://scrapy.org\"\u003eLink 2\u003c/a\u003e\u003c/li\u003e\n                    \u003c/ul\u003e\n                    \u003cscript type=\"application/json\"\u003e{\"a\": [\"b\", \"c\"]}\u003c/script\u003e\n                \u003c/body\u003e\n            \u003c/html\u003e\"\"\"\n    \u003e\u003e\u003e selector = Selector(text=text)\n    \u003e\u003e\u003e selector.css('h1::text').get()\n    'Hello, Parsel!'\n    \u003e\u003e\u003e selector.xpath('//h1/text()').re(r'\\w+')\n    ['Hello', 'Parsel']\n    \u003e\u003e\u003e for li in selector.css('ul \u003e li'):\n    ...     print(li.xpath('.//@href').get())\n    http://example.com\n    http://scrapy.org\n    \u003e\u003e\u003e selector.css('script::text').jmespath(\"a\").get()\n    'b'\n    \u003e\u003e\u003e selector.css('script::text').jmespath(\"a\").getall()\n    ['b', 'c']\n\n.. _CSS: https://en.wikipedia.org/wiki/Cascading_Style_Sheets\n.. _HTML: https://en.wikipedia.org/wiki/HTML\n.. _JMESPath: https://jmespath.org/\n.. _JSON: https://en.wikipedia.org/wiki/JSON\n.. _open online demo: https://colab.research.google.com/drive/149VFa6Px3wg7S3SEnUqk--TyBrKplxCN#forceEdit=true\u0026sandboxMode=true\n.. _Python: https://www.python.org/\n.. _regular expressions: https://docs.python.org/library/re.html\n.. _XML: https://en.wikipedia.org/wiki/XML\n.. _XPath: https://en.wikipedia.org/wiki/XPath\n","funding_links":[],"categories":["Python","🧩 HTML \u0026 XML Parsing","Web Scraping \u0026 Crawling","\u003ca name=\"data-extraction\"\u003e\u003c/a\u003e⛏️ Data Extraction"],"sub_categories":["Ruby"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapy%2Fparsel","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscrapy%2Fparsel","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscrapy%2Fparsel/lists"}