https://github.com/elliotgao2/htmlparsing

No pain HTML parsing library.
css html markdown parse xpath

Last synced: about 2 months ago
Host: GitHub
URL: https://github.com/elliotgao2/htmlparsing
Owner: elliotgao2
Created: 2018-02-26T09:57:15.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-04-02T09:19:56.000Z (about 7 years ago)
Last Synced: 2025-04-02T23:32:40.532Z (3 months ago)
Topics: css, html, markdown, parse, xpath
Language: Python
Homepage:
Size: 17.6 KB
Stars: 12
Watchers: 2
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# HTML Parsing

No Pain HTML parsing library.

## Installation

```shell
pip install htmlparsing
```

## Usage

### Parse list

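```python
import requests

# htmlparsing also exports Element, Parse, HTML and Markdown.
from htmlparsing import HTMLParsing, Text, Attr

url = 'https://news.ycombinator.com/'
r = requests.get(url)

# Each element matching '.athing' yields one entry; the dict maps
# output keys to extractors that take a CSS selector.
article_list = HTMLParsing(r.text).list('.athing', {
    'title': Text('a.storylink'),         # text of the matched element
    'link': Attr('a.storylink', 'href'),  # value of its href attribute
})
print(article_list)
```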
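To make the idea behind the list call concrete, here is a minimal standard-library sketch of the same pattern: pick the repeating elements, then apply a mapping of field extractors to each one. It does not use htmlparsing and is not how the library is implemented. The `extract_list` helper, the sample markup, and the one-dict-per-row result shape are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup (not real Hacker News HTML).
SAMPLE = """
<table>
  <tr class="athing"><td><a class="storylink" href="https://example.com/a">First story</a></td></tr>
  <tr class="athing"><td><a class="storylink" href="https://example.com/b">Second story</a></td></tr>
</table>
""".strip()


def extract_list(markup, row_path, fields):
    """Return one dict per element matching row_path (illustrative helper).

    fields maps an output key to a (path, attribute) pair; an attribute of
    None means "take the element text".
    """
    root = ET.fromstring(markup)
    rows = []
    for row in root.findall(row_path):
        item = {}
        for key, (path, attribute) in fields.items():
            node = row.find(path)
            if node is None:
                item[key] = None
            elif attribute is None:
                item[key] = (node.text or "").strip()
            else:
                item[key] = node.get(attribute)
        rows.append(item)
    return rows


articles = extract_list(
    SAMPLE,
    ".//tr[@class='athing']",
    {
        "title": (".//a[@class='storylink']", None),
        "link": (".//a[@class='storylink']", "href"),
    },
)
print(articles)
```

ElementTree only accepts well-formed XML and a limited XPath subset, so this sketch is suitable for small samples rather than real-world pages.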

### Parse detail

```python
import requests

from htmlparsing import HTMLParsing, Text, Attr, Parse

url = 'https://news.ycombinator.com/item?id=16476454'
r = requests.get(url)

# detail() extracts a single record from the whole page.
article_detail = HTMLParsing(r.text).detail({
    'title': Text('a.storylink'),
    'points': Parse('span.score', '>{} points'),  # {} captures the score
    'link': Attr('a.storylink', 'href'),
})
print(article_detail)
```
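The `'>{} points'` argument reads like a format-style template in which `{}` marks the value to capture. The README does not say how the matching is done, so the following is only a standard-library sketch of the general idea. The `parse_template` helper and the sample markup are hypothetical and are not part of htmlparsing.

```python
import re


def parse_template(template, text):
    """Match text against a template in which each {} captures a value.

    Returns the captured strings as a list, or None if nothing matches.
    """
    # Escape the literal pieces and join them with non-greedy capture groups.
    literal_parts = template.split("{}")
    pattern = "(.+?)".join(re.escape(part) for part in literal_parts)
    match = re.search(pattern, text)
    return list(match.groups()) if match else None


# Hypothetical markup resembling a score element.
html = '<span class="score" id="score_1">153 points</span>'
print(parse_template(">{} points", html))
```

Because the capture groups are non-greedy, a placeholder at the very end of a template would capture a single character in this sketch; a fuller implementation would need to handle that case.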

### Element

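```python
import requests

from htmlparsing import Element

url = 'https://python.org/'
r = requests.get(url)

e = Element(text=r.text)

e.links                          # links found in the document
e.absolute_links                 # the same links as absolute URLs
e.xpath('//a')[0].attrs          # attributes of the first <a> (XPath)
e.xpath('//a')[0].attrs.title    # attribute access by name
e.css('a')[0].attrs              # attributes of the first <a> (CSS selector)
e.parse('{}')                    # match the content against a template
e.css('a')[5].text               # text content of the sixth <a>
e.css('a')[5].html               # its HTML
e.css('a')[5].markdown           # its content converted to Markdown
```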
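The difference between `links` and `absolute_links` is presumably that relative URLs are resolved against a base URL. The README does not show how `Element` obtains that base, so the sketch below only demonstrates the resolution step with the standard library. The `base_url` and `raw_links` values are made up for illustration.

```python
from urllib.parse import urljoin

# Assumed base URL and sample href values, for illustration only.
base_url = "https://python.org/"
raw_links = ["/downloads/", "about/", "https://docs.python.org/3/", "#content"]

# urljoin leaves absolute URLs untouched and resolves the rest against the base.
absolute = [urljoin(base_url, href) for href in raw_links]
print(absolute)
```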
