https://github.com/natac13/web-parse-re
My first attempt to parse a website with re from python
- Host: GitHub
- URL: https://github.com/natac13/web-parse-re
- Owner: natac13
- Created: 2015-02-26T11:27:27.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2015-02-26T11:38:39.000Z (about 10 years ago)
- Last Synced: 2023-02-27T21:46:52.637Z (about 2 years ago)
- Language: Python
- Size: 117 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Web Parser from D3 fan site
### By: Natac
This is simply my first successful attempt at parsing a website using Python!
This script, although small, allowed me to explore the urllib.request and regular
expression (re) modules in Python.

The program simply goes to a site I visit and prints out all the topics
that are current. This means I do not have to visit the website unless I see
something of interest! It saves me a bit of time.

*I tried to change this to a more practical program to do the same thing with
the [zerohedge.com](http://www.zerohedge.com/) website. However, the request
returns something that is not HTML... I am not really sure what it is, as I
am still very new to this! My first guess is that it has something to do with
ASCII and UTF-8, but I am not sure where to even start. **ANY** help would be
very welcome!*
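
For anyone picking this up, here is a minimal sketch of the fetch-and-regex approach described above. It also covers one possible explanation for the "not HTML" response. That explanation is an assumption, not something confirmed for zerohedge.com: `urlopen(...).read()` returns raw bytes, and some servers send those bytes gzip-compressed, so they may need decompressing and decoding before `re` can search them as text. The `topic-title` class in the pattern and the helper names `decode_body`, `extract_topics`, and `fetch_topics` are hypothetical and are not taken from the actual script.

```python
import gzip
import re
import urllib.request

# Hypothetical pattern: assumes each topic is a link with class="topic-title".
# The real site's markup will differ, so adjust the pattern to match it.
TOPIC_RE = re.compile(
    r'<a[^>]*class="topic-title"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def decode_body(raw, content_encoding=None, charset=None):
    """Turn raw response bytes into text, undoing gzip compression if present."""
    # gzip streams start with the magic bytes 0x1f 0x8b.
    if content_encoding == "gzip" or raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Fall back to UTF-8 when the server does not declare a charset.
    return raw.decode(charset or "utf-8", errors="replace")


def extract_topics(html):
    """Return the text of every link matched by TOPIC_RE, whitespace collapsed."""
    return [re.sub(r"\s+", " ", title).strip() for title in TOPIC_RE.findall(html)]


def fetch_topics(url):
    """Download a page and return the topic titles found in it."""
    # Some sites reject the default urllib User-Agent, so send a browser-like one.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request) as response:
        raw = response.read()
        content_encoding = response.headers.get("Content-Encoding")
        charset = response.headers.get_content_charset()
    return extract_topics(decode_body(raw, content_encoding, charset))
```

Calling `fetch_topics` with a page URL and printing each returned title reproduces the "print the current topics" behaviour. Keeping `decode_body` and `extract_topics` separate from the network call makes it possible to try them on a saved copy of a page first.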