search_google
=============

| Richard Wen
| [email protected]

* `Documentation `_
* `PyPi Package `_

A command line tool and module for Google API web and image search.

.. image:: https://badge.fury.io/py/search-google.svg
    :target: https://badge.fury.io/py/search-google
.. image:: https://travis-ci.org/rrwen/search_google.svg?branch=master
    :target: https://travis-ci.org/rrwen/search_google
.. image:: https://coveralls.io/repos/github/rrwen/search_google/badge.svg?branch=master
    :target: https://coveralls.io/github/rrwen/search_google?branch=master
.. image:: https://img.shields.io/github/issues/rrwen/search_google.svg
    :target: https://github.com/rrwen/search_google/issues
.. image:: https://img.shields.io/badge/license-MIT-blue.svg
    :target: https://raw.githubusercontent.com/rrwen/search_google/master/LICENSE
.. image:: https://img.shields.io/github/stars/rrwen/search_google.svg
    :target: https://github.com/rrwen/search_google/stargazers
.. image:: https://img.shields.io/twitter/url/https/github.com/rrwen/search_google.svg?style=social
    :target: https://twitter.com/intent/tweet?text=%23python%20%23dataextraction%20tool%20for%20%23googlesearch%20results%20and%20%23googleimages:%20https://github.com/rrwen/search_google

Install
-------

1. Install `Python `_
2. Install `search_google `_ via ``pip``

::

    pip install search_google

For the latest developer version, see `Developer Install`_.

Usage
-----

For help in the console::

    search_google -h

Ensure that a `CSE ID `_ and a `Google API developer key `_ are set::

    search_google -s cx="your_cse_id"
    search_google -s build_developerKey="your_dev_key"

Search the web for the keyword "cat"::

    search_google "cat"
    search_google "cat" --save_links=cat.txt
    search_google "cat" --save_downloads=downloads

Search for "cat" images::

    search_google "cat" --searchType=image
    search_google "cat" --searchType=image --save_links=cat_images.txt
    search_google "cat" --searchType=image --save_downloads=downloads

Use as a Python module:

.. code-block:: python

    # Import the api module for the results class
    import search_google.api

    # Define buildargs for cse api
    buildargs = {
        'serviceName': 'customsearch',
        'version': 'v1',
        'developerKey': 'your_api_key'
    }

    # Define cseargs for search
    cseargs = {
        'q': 'keyword query',
        'cx': 'your_cse_id',
        'num': 3
    }

    # Create a results object
    results = search_google.api.results(buildargs, cseargs)

    # Download the search results to a directory
    results.download_links('downloads')

For more usage details, see the `Documentation `_.
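
The two dictionaries in the module example can also be assembled by a small helper. The sketch below is not part of ``search_google``: the function name ``make_args`` and its parameters are hypothetical. It assumes the Custom Search API limit of 1 to 10 results per request for ``num`` and reuses the ``searchType`` option shown in the console examples.

```python
def make_args(api_key, cse_id, query, num=3, image=False):
    """Return (buildargs, cseargs) shaped like the module example above.

    Hypothetical helper for illustration; it only builds dictionaries
    and never contacts Google.
    """
    # The Custom Search API accepts between 1 and 10 results per request
    if not 1 <= num <= 10:
        raise ValueError("num must be between 1 and 10")

    buildargs = {
        'serviceName': 'customsearch',
        'version': 'v1',
        'developerKey': api_key,
    }
    cseargs = {'q': query, 'cx': cse_id, 'num': num}

    # Image search is requested through the searchType argument
    if image:
        cseargs['searchType'] = 'image'
    return buildargs, cseargs


buildargs, cseargs = make_args('your_api_key', 'your_cse_id', 'cat', image=True)
print(buildargs)
print(cseargs)
```

The returned pair can then be passed to ``search_google.api.results(buildargs, cseargs)`` as in the example above.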

Contributions
-------------

Report Contributions
********************

Reports for issues and suggestions can be made using the `issue submission `_ interface.

When possible, ensure that your submission is:

* **Descriptive**: has an informative title, explanations, and screenshots
* **Specific**: details the environment (such as operating system and hardware) and the software used
* **Reproducible**: includes steps, code, and examples to reproduce the issue

Code Contributions
******************

Code contributions are submitted via `pull requests `_:

1. Ensure that you pass the `Tests`_
2. Create a new `pull request `_
3. Provide an explanation of the changes

A template of the code contribution explanation is provided below:

::

    ## Purpose

    The purpose can mention goals such as bug fixes, new features, and other improvements.

    ## Description

    The description is a short summary of the changes made, such as improved speed or implementation.

    ## Changes

    The changes are a list of general edits made to the files and their respective components.
    * `file_path1`:
        * `function_module_etc`: changed loop to map
        * `function_module_etc`: changed variable value
    * `file_path2`:
        * `function_module_etc`: changed loop to map
        * `function_module_etc`: changed variable value

    ## Notes

    The notes provide any additional text that does not fit into the above sections.

For more information, see `Developer Install`_ and `Implementation`_.

Developer Notes
---------------

Developer Install
*****************

Install the latest developer version with ``pip`` from github::

    pip install git+https://github.com/rrwen/search_google

Install from ``git`` cloned source:

1. Ensure `git `_ is installed
2. Clone into current path
3. Install via ``pip``

::

    git clone https://github.com/rrwen/search_google
    cd search_google
    pip install . -I
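
After either install route, it can be useful to confirm which version ended up in the environment. The following is a minimal sketch using the standard library ``importlib.metadata`` (Python 3.8 or later). The helper name ``installed_version`` is made up for illustration, and the distribution name ``search_google`` is assumed to match the one used in the ``pip`` commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in the current environment
        return None


# Assumed distribution name, taken from the pip install commands
version = installed_version('search_google')
print(version if version else 'search_google is not installed')
```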

Tests
*****

1. Clone into the current path with ``git clone https://github.com/rrwen/search_google``
2. Enter the folder with ``cd search_google``
3. Ensure `unittest `_ is available
4. Set your `CSE ID `_ and `Google API developer key `_
5. Run the tests
6. Reset the config file to defaults

Note that running the tests uses 7 requests from your quota.

::

    pip install . -I
    python -m search_google -s cx="your_cse_id"
    python -m search_google -s build_developerKey="your_dev_key"
    python -m unittest
    python -m search_google -d

Documentation Maintenance
*************************

1. Ensure `sphinx `_ is installed ``pip install -U sphinx``
2. Update the documentation in ``docs/``

::

    pip install . -I
    sphinx-build -b html docs/source docs

Upload to github
****************

1. Ensure `git `_ is installed
2. Add all files and commit changes
3. Push to github

::

    git add .
    git commit -a -m "Generic update"
    git push

Upload to PyPi
**************

1. Ensure `twine `_ is installed ``pip install twine``
2. Ensure `sphinx `_ is installed ``pip install -U sphinx``
3. Run tests and check for OK status
4. Delete ``dist`` directory
5. Update the version ``search_google/__init__.py``
6. Update the documentation in ``docs/``
7. Create source distribution
8. Upload to `PyPi `_

::

    pip install . -I
    python -m search_google -s cx="your_cse_id"
    python -m search_google -s build_developerKey="your_dev_key"
    python -m unittest
    python -m search_google -d
    sphinx-build -b html docs/source docs
    python setup.py sdist
    twine upload dist/*

Implementation
**************

This command line tool uses the `Google Custom Search Engine (CSE) `_ to perform web and image searches. It relies on `googleapiclient.build `_ and `cse.list `_, where ``build`` creates a Google API service object and ``cse`` performs the searches.

The ``results`` class in `search_google.api `_ passes a dictionary of arguments into ``build`` and ``cse``, then exposes the returned results through properties and methods. `search_google.cli `_ provides a command line interface for `search_google.api `_.

To use ``build`` and ``cse``, a `Google Developer API Key `_ and a `Google CSE ID `_ must be created for API access (see `search_google Setup `_). Creating these keys also requires a `Gmail `_ account for login access.

::

    googleapiclient.build  <-- Google API
             |
         cse.list          <-- Google CSE
             |
     search_google.api     <-- search results
             |
     search_google.cli     <-- command line
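
The layering in the diagram can be sketched as a small wrapper that unpacks one dictionary into ``build`` and another into ``cse().list``. This is only an illustration of the pattern, not the actual ``search_google.api`` source. The names ``ResultsSketch``, ``fake_build``, and the ``Fake*`` classes are hypothetical, and the fake service stands in for ``googleapiclient.discovery.build`` so that the example runs without a network connection or an API key.

```python
class FakeRequest:
    """Stand-in for the request object returned by cse().list()."""

    def __init__(self, response):
        self._response = response

    def execute(self):
        return self._response


class FakeCse:
    """Stand-in for the cse resource; echoes the query into one fake item."""

    def list(self, **cseargs):
        link = 'https://example.com/' + cseargs['q']
        return FakeRequest({'items': [{'link': link}]})


class FakeService:
    """Stand-in for the service object that build() would return."""

    def cse(self):
        return FakeCse()


def fake_build(serviceName, version, developerKey):
    # Hypothetical replacement for googleapiclient.discovery.build
    return FakeService()


class ResultsSketch:
    """Unpack two dictionaries into build and cse.list, then keep the links."""

    def __init__(self, buildargs, cseargs, build_func=fake_build):
        # buildargs -> build(...), cseargs -> cse().list(...)
        service = build_func(**buildargs)
        self.response = service.cse().list(**cseargs).execute()
        self.links = [item['link'] for item in self.response.get('items', [])]


buildargs = {
    'serviceName': 'customsearch',
    'version': 'v1',
    'developerKey': 'your_api_key',
}
cseargs = {'q': 'cat', 'cx': 'your_cse_id', 'num': 1}

sketch = ResultsSketch(buildargs, cseargs)
print(sketch.links)
```

With the Google API client installed, passing ``build_func=googleapiclient.discovery.build`` should send a real request instead, which counts against the API quota.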

A rough example is provided below thanks to the `customsearch example `_ from Google:

.. code-block:: python

    from googleapiclient.discovery import build

    # Set developer key and CSE ID
    dev_key = 'a_developer_key'
    cse_id = 'a_cse_id'

    # Obtain search results from Google CSE
    service = build("customsearch", "v1", developerKey=dev_key)
    results = service.cse().list(q='cat', cx=cse_id).execute()

    # Manipulate search results after ...
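
As one possible way to manipulate the results, note that the value returned by ``execute()`` is a dictionary. Matches, when there are any, are listed under the ``items`` key, and each match carries fields such as ``title`` and ``link``. The sketch below uses a hand-written sample response in place of a live request; the helper name ``summarize`` and the sample data are made up for illustration.

```python
def summarize(response):
    """Return (title, link) pairs from a Custom Search response dictionary."""
    # The 'items' key is absent when a query has no matches
    items = response.get('items', [])
    return [(item.get('title', ''), item['link']) for item in items]


# Hand-written sample shaped like a Custom Search response (not real data)
sample_response = {
    'items': [
        {'title': 'Cat', 'link': 'https://example.com/cat'},
        {'title': 'Kitten', 'link': 'https://example.com/kitten'},
    ]
}

pairs = summarize(sample_response)
for title, link in pairs:
    print(title, link)

# A response without matches yields an empty list instead of a KeyError
no_matches = summarize({})
print(no_matches)
```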