https://github.com/mwd1993/parsemyhtml

Html DOM Parser Written in Python (library for data scraping)

Host: GitHub
URL: https://github.com/mwd1993/parsemyhtml
Owner: mwd1993
Created: 2020-05-06T11:08:26.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2021-03-09T04:11:09.000Z (over 4 years ago)
Last Synced: 2025-01-06T16:44:19.022Z (9 months ago)
Language: Python
Size: 27.3 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# parseMyHtml

HTML DOM parser written in Python (a library for data scraping).

# Example 1:

```python
# --------------------------------------------------
# Gets the price of Bitcoin from businessinsider.com
# --------------------------------------------------

# Import requests
import requests

# Import parseMyHtml
import parseMyHtml

# URL where the Bitcoin price is located
url = 'https://markets.businessinsider.com/currencies/btc-usd'

# Fetch the page HTML
# --------------------------
_request = requests.get(url)
_request_html = _request.text
# --------------------------

# **parseMyHtml** Instantiate the parseMyHtml class
parser = parseMyHtml.parseMyHtml()

# **parseMyHtml** Pass in the request HTML
parser.parse(_request_html)

# **parseMyHtml** Get a list of elements that match this element text (Bitcoin price element)
# The element string is missing from this copy of the README;
# replace '...' with the opening tag of the price element.
_price_element = parser.get_by_full_element('...')

if len(_price_element) > 0:
    # **parseMyHtml** The Bitcoin price is stored in the page under the attribute jsvalue
    print("Bitcoin Price: " + str(_price_element[0].getAttribute("jsvalue")))
    # **parseMyHtml** OR print("Bitcoin Price: " + str(_price_element[0].text))
else:
    # Display the HTML returned from the request
    print(parser.html)
    print("Nothing found")
```
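
Example 1 matches an element by the text of its opening tag and then reads an attribute from the first hit. The sketch below illustrates that idea with only the standard library. It is not parseMyHtml's implementation: the `PrefixTagFinder` class, the sample markup, and the attribute values are all made up for the illustration.

```python
from html.parser import HTMLParser


class PrefixTagFinder(HTMLParser):
    """Collects the attributes of start tags whose raw text begins with a prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.matches = []  # one dict of attributes per matching start tag

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written in the source
        raw = self.get_starttag_text()
        if raw.startswith(self.prefix):
            self.matches.append(dict(attrs))


# Hypothetical markup standing in for a downloaded page
sample_html = (
    '<div><span class="price" jsvalue="123.45">123.45</span>'
    '<span class="other">x</span></div>'
)

# Opening tag of the wanted element, without the trailing ">"
finder = PrefixTagFinder('<span class="price"')
finder.feed(sample_html)

if finder.matches:
    print("Price attribute:", finder.matches[0].get("jsvalue"))
else:
    print("Nothing found")
```

Matching on the raw opening tag is simple, but it breaks when a page reorders attributes or changes whitespace, so a looser match (for example on tag name plus class) may hold up better on pages that change often.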

# Example 2:

```python
# ---------------------------------------------------------------
# Gets the number of jobs available from the Python jobs website
# ---------------------------------------------------------------

# Import requests
import requests

# Import parseMyHtml
import parseMyHtml

# URL where the number of available jobs is located
url = 'https://www.python.org/jobs/'

# Fetch the page HTML
# --------------------------
_request = requests.get(url)
_request_html = _request.text
# --------------------------

# **parseMyHtml** Instantiate the parseMyHtml class
parser = parseMyHtml.parseMyHtml()

# **parseMyHtml** Pass in the request HTML
parser.parse(_request_html)

# **parseMyHtml** Get a list of elements that match this element fully (omit the trailing >)
# The element string is missing from this copy of the README;
# replace '...' with the opening tag of the jobs element.
_jobs_element = parser.get_by_full_element('...')

if len(_jobs_element) > 0:
    # Get the first returned element in the list
    __jobs_element = _jobs_element[0]

    # **parseMyHtml** Take the text before "jobs on" and strip the surrounding whitespace
    _text_jobs_available = "( " + __jobs_element.text[:__jobs_element.text.rfind("jobs on")].strip() + " )"

    # **parseMyHtml** Get the attributes of the element object
    _text_jobs_attributes = __jobs_element.attributes

    # **parseMyHtml** Get the value of the attribute 'class'
    _text_jobs_class = __jobs_element.getAttribute("class")

    # Print the jobs available, the element's attributes, its 'class' value and its raw HTML
    # str() guards against non-string values when concatenating
    # -----------------------------------------------------------------------------------------------------------------------------------
    print("Jobs available:\t\t" + _text_jobs_available + "\nElement Object Attributes:\t\t" + str(_text_jobs_attributes) + "\nElement Object Class Value:\t\t" + str(_text_jobs_class))
    print("Element Object HTML:\t\t" + __jobs_element.getHtml())
    # -----------------------------------------------------------------------------------------------------------------------------------
else:
    # Display the HTML returned from the request
    print(parser.html)
    print("Nothing found")
```
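
Example 2 extracts the job count by slicing the element text up to the last occurrence of "jobs on". The standalone sketch below shows how that slice behaves and adds a guard for a missing marker. The sample string is hypothetical, and the real page's wording may differ.

```python
# Hypothetical element text; the real page's wording may differ
element_text = "  42 jobs on the Python Job Board  "

# rfind returns the index of the last occurrence of the marker, or -1 if absent
marker_index = element_text.rfind("jobs on")

if marker_index != -1:
    # Keep everything before the marker and trim the whitespace
    jobs_available = element_text[:marker_index].strip()
else:
    # Without this guard, text[:-1] would silently drop only the last character
    jobs_available = element_text.strip()

print("( " + jobs_available + " )")
```

The guard matters because a slice ending at `-1` does not raise an error, so a change in the page's wording would otherwise yield a wrong value instead of a visible failure.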
