Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jabbalaci/Jabba-Webkit
Jabba's headless webkit browser for scraping AJAX-powered webpages.
- Host: GitHub
- URL: https://github.com/jabbalaci/Jabba-Webkit
- Owner: jabbalaci
- Created: 2012-12-27T14:52:05.000Z (almost 12 years ago)
- Default Branch: master
- Last Pushed: 2014-10-23T09:11:52.000Z (about 10 years ago)
- Last Synced: 2024-08-01T05:19:43.477Z (5 months ago)
- Language: Python
- Size: 117 KB
- Stars: 91
- Watchers: 12
- Forks: 11
- Open Issues: 2
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-projects - Jabba-Webkit - Jabba's headless webkit browser for scraping AJAX-powered webpages. (Python)
README
Jabba-Webkit
============

Jabba's headless webkit browser for scraping AJAX-powered webpages.

* Author: Laszlo Szathmary, 2012
* Blog post:
* GitHub:
* Reddit:

Usage:
------

    jabba_webkit.py <url> [<time>]

`url`: the page whose source you want to get

`time`: The application will quit after this given time (in seconds).
If the webpage is AJAX-powered and updates itself, you can
tell this browser to wait X seconds. Then it fetches the
*generated* HTML source.

You can also use it as a library:

    >>> import jabba_webkit as jw
    >>> html1 = jw.get_page(url1, time1)
    >>> html2 = jw.get_page(url2)    # yes, you can call it several times
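
Since `get_page` can be called several times, a small wrapper can collect the generated HTML for a batch of pages. The following is only a sketch under stated assumptions: `fetch_all` and `fake_get_page` are hypothetical names that are not part of Jabba-Webkit, and the stand-in fetcher only mimics the `get_page(url)` / `get_page(url, time)` call shapes shown above so that the example runs without a WebKit installation. It is written for Python 3.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def fetch_all(fetch: Callable[..., str],
              jobs: Iterable[Tuple[str, Optional[int]]]) -> Dict[str, str]:
    """Run every (url, wait_seconds) job and map each URL to its HTML.

    `fetch` is any callable with the call shapes shown in the README,
    i.e. fetch(url) or fetch(url, wait_seconds).
    """
    pages = {}
    for url, wait in jobs:
        if wait is None:
            # No wait requested: use the fetcher's default behaviour.
            pages[url] = fetch(url)
        else:
            # Give an AJAX-powered page `wait` seconds to update itself.
            pages[url] = fetch(url, wait)
    return pages


def fake_get_page(url, wait=None):
    """Hypothetical stand-in for jw.get_page; returns a dummy document."""
    return "<html>{}|{}</html>".format(url, wait)


if __name__ == "__main__":
    jobs = [("http://example.com/a", 5), ("http://example.com/b", None)]
    for url, html in fetch_all(fake_get_page, jobs).items():
        print(url, len(html))
```

With the real library installed, passing `jw.get_page` instead of `fake_get_page` should work the same way, assuming it accepts the wait time as its second positional argument as in the README example.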