https://github.com/jwheare/bizparse
Python scraper for parsing the House of Commons Future Business pages
- Host: GitHub
- URL: https://github.com/jwheare/bizparse
- Owner: jwheare
- License: bsd-3-clause
- Created: 2009-08-13T00:10:26.000Z (almost 17 years ago)
- Default Branch: master
- Last Pushed: 2009-08-13T07:39:06.000Z (almost 17 years ago)
- Last Synced: 2025-02-12T08:54:41.422Z (over 1 year ago)
- Language: Python
- Homepage:
- Size: 125 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.txt
- License: LICENSE.txt
README
bizparse.py
A scraper for parsing the House of Commons Future Business pages
http://www.publications.parliament.uk/pa/cm/cmfbusi/fbusi.htm
Usage:
./bizparse.py
Writes an XML file to bizparseYYYY-MM-DD.xml, where YYYY-MM-DD is the period ending date.
Outputs human-readable debug logging for the extracted data to stdout.
Uses BeautifulSoup: http://www.crummy.com/software/BeautifulSoup/
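
The output naming described under Usage can be sketched roughly as follows. This is a minimal illustration, not the code in bizparse.py: it assumes Python 3 and the standard library only, and the function names (output_filename, build_xml) and the XML element and attribute names (futurebusiness, item, periodend, date) are hypothetical, since the README does not describe the actual schema.

```python
import datetime
import xml.etree.ElementTree as ET


def output_filename(period_end):
    """Build the bizparseYYYY-MM-DD.xml name from the period ending date."""
    # date.isoformat() yields YYYY-MM-DD, matching the pattern in the README.
    return "bizparse%s.xml" % period_end.isoformat()


def build_xml(period_end, items):
    """Serialise (date string, title) pairs to an XML string.

    The element and attribute names are hypothetical placeholders,
    not the schema that bizparse.py actually writes.
    """
    root = ET.Element("futurebusiness", {"periodend": period_end.isoformat()})
    for date_str, title in items:
        item = ET.SubElement(root, "item", {"date": date_str})
        item.text = title
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    end = datetime.date(2009, 8, 13)
    xml_text = build_xml(end, [("2009-10-12", "Example item")])
    with open(output_filename(end), "w", encoding="utf-8") as fh:
        fh.write(xml_text)
    # Debug output to stdout, in the spirit of the README's description.
    print("wrote", output_filename(end))
```

Deriving the filename from the period ending date rather than the run date means that repeated runs over the same period overwrite the same file.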