Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/vivint/selenium-docker
Extending Selenium drivers with extra runtime goodies!
https://github.com/vivint/selenium-docker
docker gevent selenium selenium-python web-scraping
Last synced: 3 months ago
JSON representation
Extending Selenium drivers with extra runtime goodies!
- Host: GitHub
- URL: https://github.com/vivint/selenium-docker
- Owner: vivint
- License: apache-2.0
- Created: 2018-01-18T18:04:47.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2018-01-18T18:42:38.000Z (about 7 years ago)
- Last Synced: 2024-10-15T21:15:29.858Z (4 months ago)
- Topics: docker, gevent, selenium, selenium-python, web-scraping
- Language: Python
- Homepage: https://selenium-docker.readthedocs.io/
- Size: 111 KB
- Stars: 6
- Watchers: 4
- Forks: 3
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# vivint-selenium-docker
[![pypi](https://img.shields.io/pypi/v/selenium-docker.svg)]()
[![Read the Docs](https://img.shields.io/readthedocs/selenium-docker.svg)](https://selenium-docker.readthedocs.io/en/latest/)
[![License ALv2](https://img.shields.io/github/license/vivint/selenium-docker.svg)](https://github.com/vivint/selenium-docker/blob/master/LICENSE)Extending Selenium with drop in replacements for Chrome and Firefox webdrivers
that run in Docker containers. Additional goodies like automatic proxies, live
video recording and driver-pools are included!## Getting Started
1. Install the module:
Latest stable version from pypi,
```bash
$ pip install selenium-docker
```
Development version from source,
```bash
$ pip install git+ssh://[email protected]:vivint/selenium-council.git
```2. Download [docker](https://www.docker.com/get-docker) for your operating system and ensure it's running.
```bash
$ docker version
Client:
Version: 17.10.0-ce
API version: 1.33
Server:
Version: 17.10.0-ce
API version: 1.33 (minimum version 1.12)
```3.
#### You should know...
- Calling `getLogger('selenium_docker').setLevel(logging.DEBUG)` during Logging setup will turn on lots of debug statements involved with with spawning and managing the underlying containers and driver instances.
- You can use the script below to stop and remove all running containers created by this library:
```python
from selenium_docker.base import ContainerFactory
factory = ContainerFactory.get_default_factory()
factory.scrub_containers()
```This will do a search in the default Docker engine for all containers that use our `browser` and `dynamic` labels.
- We use [`gevent`](http://www.gevent.org/contents.html) for its concurrency idioms.
- We call `gevent.monkey.patch_socket` to communicate with Docker engine via REST. Other libraries may need to be patched contingent on what your project is trying to accomplish.
Read about [monkey patching](http://www.gevent.org/intro.html#monkey-patching) on the gevent website.## Examples
#### Basic
Creates a single container with a running Chrome Driver instance inside. Connecting and managing the container is all done automatically. This should function as a drop in replacement for using the desktop version of Chrome and Firefox drivers.
```python
import sys
import loggingfrom selenium_docker import ChromeDriver
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger('selenium_docker').setLevel(logging.DEBUG)driver = ChromeDriver()
driver.get('https://google.com')
print(driver.title)
driver.quit()
```#### Blocking driver pool
Used for performing a single task on multiple sites/items in parallel.
The blocking driver pool will create all the necessary containers in advance in order to distribute the work as resources become available. Drivers will be reused until the `.execute()` call is complete. If the driver throws an Exception then that driver will be removed from the pool.
```python
from selenium_docker.pool import DriverPooldef get_title(driver, url):
driver.get(url)
return driver.titleurls = [
'https://google.com',
'https://reddit.com',
'https://yahoo.com',
'http://ksl.com',
'http://cnn.com'
]pool = DriverPool(size=3)
for result in pool.execute(get_title, urls):
print(result)
```#### Asynchronous driver pool,
```python
from selenium_docker.pool import DriverPooldef get_title(driver, url):
driver.get(url)
return driver.titledef print_fn(s):
print(s)urls = [
'https://google.com',
'https://reddit.com',
'https://yahoo.com',
'http://ksl.com',
'http://cnn.com'
]pool = DriverPool(size=2)
pool.execute_async(get_title, urls, print_fn)
pool.add_async(['https://facebook.com',
'https://mail.com',
'https://outlook.com'])for x in pool.results():
print('result - ', x)if '.com' in x:
pool.add_async(['https://wikipedia.org'])if x == 'Wikipedia':
pool.stop_async()
pool.factory.scrub_containers()
```## License
Copyright 2017 - Vivint, inc.
Apache V2 -- See `LICENSE` for full statement.