https://github.com/09u2h4n/selestium
A Python module for web scraping with javascript, alternative to requests-html
- Host: GitHub
- URL: https://github.com/09u2h4n/selestium
- Owner: 09u2h4n
- License: gpl-3.0
- Created: 2024-03-21T13:15:07.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-03T10:57:56.000Z (about 2 years ago)
- Last Synced: 2025-07-06T07:41:50.119Z (12 months ago)
- Topics: python, python3, requests, requests-html, selenium, selenium-python, selestium
- Language: Python
- Homepage:
- Size: 43.9 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Selestium
Selestium is a Python module for web scraping and automation using Selenium WebDriver.
## Features
- Provides a high-level interface for interacting with HTML content in web pages.
- Supports rendering JavaScript-based web pages using headless browsers (Firefox and Chrome).
- Allows easy navigation, element identification, and data extraction from web pages.
## Installation
You can install Selestium using pip:
```
pip install selestium
```
## Dependencies for Termux
On Termux, a few dependencies must be installed manually. A later release is planned to automate this.
#### Note: Chrome does not work in Termux; only Firefox is supported.
First, update the package lists and install the `tur-repo` and `x11-repo` repositories:
```
pkg update -y; pkg install -y tur-repo x11-repo
```
Then install Firefox and geckodriver:
```
pkg install -y firefox geckodriver
```
After that, you are ready to go.
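To confirm that the install step worked, a short check like the following can help. It is only a sketch using the standard library: the helper name `missing_binaries` is hypothetical and not part of Selestium, and it assumes the packages expose executables named `firefox` and `geckodriver` on `PATH`.

```python
import shutil


def missing_binaries(names=("firefox", "geckodriver")):
    """Return the executable names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_binaries()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("firefox and geckodriver are available")
```

An empty result only means the executables were found; it does not verify that their versions are compatible with each other.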
## Dependencies for Linux
On Linux, you also need the [Firefox dependencies](https://www.mozilla.org/en-US/firefox/124.0.1/system-requirements/). The requirements listed on that page are:
Please note that GNU/Linux distributors may provide packages for your distribution which have different requirements.
Firefox will not run at all without the following libraries or packages:
- glibc 2.17 or higher
- GTK+ 3.14 or higher
- libglib 2.42 or higher
- libstdc++ 4.8.1 or higher
- X.Org 1.0 or higher (1.7 or higher is recommended)
For optimal functionality, we recommend the following libraries or packages:
- DBus 1.0 or higher
- NetworkManager 0.7 or higher
- PulseAudio
For Debian-based distros:
```
sudo apt update -y && sudo apt install -y \
libc6 \
libgtk-3-0 \
libglib2.0-0 \
libstdc++6 \
xorg
```
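If Firefox still fails to start, it may help to see which of the required shared libraries the system can locate. The sketch below uses `ctypes.util.find_library` from the standard library. The short library names in `REQUIRED_LIBS` and the helper `locate_libs` are assumptions made for illustration, and the check reports presence only, not versions.

```python
from ctypes.util import find_library

# Assumption: these short names correspond to the libraries listed above.
REQUIRED_LIBS = {
    "glibc": "c",
    "GTK+ 3": "gtk-3",
    "libglib": "glib-2.0",
    "libstdc++": "stdc++",
}


def locate_libs(libs=REQUIRED_LIBS):
    """Map each label to the library file name found, or None if not found."""
    return {label: find_library(name) for label, name in libs.items()}


for label, found in locate_libs().items():
    print(f"{label}: {found or 'NOT FOUND'}")
```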
## Usage
The examples below show how to use Selestium to fetch a web page, with or without rendering, and extract information from it:
### Make a Request Without Rendering:
```python
from Selestium import HTMLRequests
# Initialize a HTMLRequests instance with default settings (Firefox browser)
req = HTMLRequests()
# Make a GET request to a web page without rendering
response = req.get("https://www.example.com")
# Extract information from the response
print(response.content)
```
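Without rendering, the response carries the raw HTML, so any extraction has to be done on that markup. As a minimal sketch, the standard-library `html.parser` can pull out heading text. The names `HeadingCollector` and `extract_h1` are hypothetical, and the code assumes `response.content` is either `str` or UTF-8 `bytes`, which this README does not specify.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of every <h1> element in an HTML document."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self._parts = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self._parts = []

    def handle_data(self, data):
        if self._in_h1:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h1" and self._in_h1:
            self.headings.append("".join(self._parts).strip())
            self._in_h1 = False


def extract_h1(content):
    """Return the text of all <h1> elements found in content."""
    # Assumption: content may be bytes or str.
    if isinstance(content, bytes):
        content = content.decode("utf-8", errors="replace")
    parser = HeadingCollector()
    parser.feed(content)
    parser.close()
    return parser.headings


sample = "<html><body><h1>Example Domain</h1><p>Text</p></body></html>"
print(extract_h1(sample))  # prints ['Example Domain']
```

In practice, `extract_h1(response.content)` would take the place of the `sample` string, provided the assumption about the content type holds.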
### Make a Request With Rendering:
```python
from Selestium import HTMLRequests
# Initialize a HTMLRequests instance with Firefox browser
req = HTMLRequests(browser='firefox')
# Get a web page and render it using the browser
response = req.get("https://www.example.com", render=True)
# Extract information from the rendered page
titles = response.find("h1")
for title in titles:
    print(title.text)
```
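The render-and-extract pattern above can be wrapped in a small helper for reuse. This is a sketch, not part of Selestium: `fetch_headings` is a hypothetical name, and it assumes only what the example shows, namely that `req.get(url, render=True)` returns an object whose `find(selector)` yields items with a `.text` attribute.

```python
def fetch_headings(req, url, selector="h1"):
    """Render url with req and return the text of each matching element.

    Assumption: req behaves like the HTMLRequests instance shown above.
    """
    response = req.get(url, render=True)
    return [element.text for element in response.find(selector)]
```

It would be called as `fetch_headings(HTMLRequests(browser='firefox'), "https://www.example.com")`. Passing the request object in, rather than creating it inside the function, lets one browser instance be reused across several pages.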
### Using the Controller Method:
```python
from Selestium import HTMLRequests
# Initialize a HTMLRequests instance with Chrome browser
req = HTMLRequests(browser='chrome')
# Get the browser controller (WebDriver) instance
driver = req.browser_controller()
# Navigate to a web page
driver.get("https://www.example.com")
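# Perform additional actions using the browser controller,
# for example click a button or fill out a form.
# Selenium 4 removed find_element_by_id; use find_element with a By locator:
# from selenium.webdriver.common.by import By
# driver.find_element(By.ID, "button_id").click()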
```
## Contributing
Contributions are welcome! If you encounter any issues or have suggestions for improvement, please open an issue or submit a pull request on GitHub.
## License
This project is licensed under the GPL-3.0 License - see the [LICENSE](https://github.com/09u2h4n/selestium/blob/main/LICENSE) file for details.