https://github.com/jabbalaci/Jabba-Webkit


Jabba-Webkit
============

Jabba's headless webkit browser for scraping AJAX-powered webpages.

* Author: Laszlo Szathmary, 2012
* Blog post:
* GitHub: https://github.com/jabbalaci/Jabba-Webkit
* Reddit:

Usage:
------
`jabba_webkit.py <url> [<time>]`

`url`: The page whose source you want to get.

`time`: Optional. The application will quit after this given time (in seconds).

If the webpage is AJAX-powered and updates itself, you can
tell this browser to wait X seconds. Then it fetches the
*generated* HTML source.
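The command-line form can also be driven from another program. The following is a minimal sketch, not part of Jabba-Webkit itself: the helper names `build_command` and `fetch_generated_html` are hypothetical, and it assumes that the script prints the generated HTML to standard output and that it runs under the same interpreter as the caller.

```python
import subprocess
import sys


def build_command(url, wait_seconds=None, script="jabba_webkit.py"):
    """Build the argument list for running the script on one URL.

    The wait time is appended only when given, mirroring the
    `url` and `time` arguments described above.
    """
    cmd = [sys.executable, script, url]
    if wait_seconds is not None:
        cmd.append(str(wait_seconds))
    return cmd


def fetch_generated_html(url, wait_seconds=None, script="jabba_webkit.py"):
    """Run the script and return whatever it wrote to stdout.

    Assumption: the script prints the generated HTML source to stdout.
    """
    result = subprocess.run(
        build_command(url, wait_seconds, script),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with URLs that contain `&` or `?`.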

You can also use it as a library:

    >>> import jabba_webkit as jw
    >>> html1 = jw.get_page(url1, time1)
    >>> html2 = jw.get_page(url2) # yes, you can call it several times
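
Once the generated HTML is available, it can be handed to any parser. The sketch below collects link targets using only the standard library. The names `LinkCollector`, `extract_links` and `scrape_links` are hypothetical helpers, not part of Jabba-Webkit, and the code assumes that `get_page` returns the page source as a text string. The fetch function is passed in as a parameter so the parsing logic can be tried without a browser.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def scrape_links(get_page, url, wait_seconds):
    """Fetch a page with the given get_page callable and extract its links.

    get_page is expected to accept (url, time), like jw.get_page above.
    """
    return extract_links(get_page(url, wait_seconds))


# Intended use with the library (assumes jabba_webkit is importable):
#   import jabba_webkit as jw
#   links = scrape_links(jw.get_page, "http://example.com", 5)
```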