Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/yasulab/simple-tag-getter-with-lxml
Get html elements with just one line command.
- Host: GitHub
- URL: https://github.com/yasulab/simple-tag-getter-with-lxml
- Owner: yasulab
- Created: 2011-04-09T18:42:18.000Z (over 13 years ago)
- Default Branch: master
- Last Pushed: 2011-05-06T10:49:08.000Z (over 13 years ago)
- Last Synced: 2024-10-10T09:22:35.114Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 93.8 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README
README
Description:
Given a URL and a tag (optionally with attributes),
scrape the text of the matching elements from the page at that URL.

Usage Example:
$ python tag-getter.py http://ebooks.adelaide.edu.au/c/carroll/lewis/alice/chapter1.html div class=dochead
Created Xpath: //div[@class="dochead"]
Alice in Wonderland, by Lewis Carroll

$ python tag-getter.py http://ebooks.adelaide.edu.au/c/carroll/lewis/alice/chapter1.html p
Created Xpath: //p
Alice was beginning to get very tired of sitting by her sister on the bank, and of having nothing to do: once or twice
she had peeped into the book her sister was reading, but it had no pictures or conversations in it, ‘and what is the use of
a book,’ thought Alice ‘without pictures or conversation?’
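
The usage above suggests that the tag and any name=value arguments are combined into an XPath expression, which is then evaluated against the fetched page. The following is a minimal sketch of that idea in Python 3 and not the script's actual source: the function names build_xpath and scrape are hypothetical, and it assumes lxml is installed and that each attribute is passed as a single name=value argument.

```python
import sys


def build_xpath(tag, attr_args=()):
    """Turn a tag name and 'name=value' strings into an XPath expression.

    Hypothetical helper: mirrors the "Created Xpath" lines shown above.
    """
    predicates = []
    for arg in attr_args:
        # Split on the first '=' only, so values may contain '=' themselves.
        name, _, value = arg.partition("=")
        predicates.append('[@{}="{}"]'.format(name, value))
    return "//" + tag + "".join(predicates)


def scrape(url, xpath):
    """Return the stripped text content of every element matching xpath."""
    # Imported here so build_xpath can be tried without lxml installed.
    import lxml.html
    from urllib.request import urlopen

    with urlopen(url) as response:
        root = lxml.html.fromstring(response.read())
    return [element.text_content().strip() for element in root.xpath(xpath)]


# Only run as a command when a URL and a tag were actually supplied.
if __name__ == "__main__" and len(sys.argv) >= 3:
    url, tag, attr_args = sys.argv[1], sys.argv[2], sys.argv[3:]
    xpath = build_xpath(tag, attr_args)
    print("Created Xpath:", xpath)
    for text in scrape(url, xpath):
        print(text)
```

With this sketch, build_xpath("div", ["class=dochead"]) yields //div[@class="dochead"] and build_xpath("p") yields //p, matching the expressions printed in the examples. Attribute values containing double quotes are not escaped here.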