https://github.com/qubole/qds-sdk-py
Python SDK for accessing Qubole Data Service
- Host: GitHub
- URL: https://github.com/qubole/qds-sdk-py
- Owner: qubole
- License: apache-2.0
- Created: 2013-06-04T06:49:05.000Z (over 12 years ago)
- Default Branch: unreleased
- Last Pushed: 2025-03-06T09:25:57.000Z (7 months ago)
- Last Synced: 2025-04-03T02:34:02.461Z (6 months ago)
- Topics: python, qubole, sdk-python
- Language: Python
- Homepage: https://qubole.com
- Size: 989 KB
- Stars: 52
- Watchers: 19
- Forks: 128
- Open Issues: 23
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
Qubole Data Service Python SDK
==============================

.. image:: https://travis-ci.org/qubole/qds-sdk-py.svg?branch=master
   :target: https://travis-ci.org/qubole/qds-sdk-py
   :alt: Build Status

A Python module that provides the tools you need to authenticate with,
and use the Qubole Data Service API.

Installation
------------

From PyPI
~~~~~~~~~

The SDK is available on PyPI::
$ pip install qds-sdk
From source
~~~~~~~~~~~
* Get the source code:

  - Either clone the project: ``git clone git@github.com:qubole/qds-sdk-py.git`` and check out the latest release tag from `Releases <https://github.com/qubole/qds-sdk-py/releases>`_.
  - Or download one of the releases from https://github.com/qubole/qds-sdk-py/releases

* Run the following command (you may need to do this as root):
::
$ python setup.py install
* Alternatively, if you use virtualenv, you can do this:
::
$ cd qds-sdk-py
$ virtualenv venv
$ source venv/bin/activate
$ python setup.py install

This should place a command line utility ``qds.py`` somewhere in your
path::

$ which qds.py
/usr/bin/qds.py

CLI
---

``qds.py`` allows running Hive, Hadoop, Pig, Presto and Shell commands
against QDS. Users can run commands synchronously - or submit a command
and check its status::
$ qds.py -h # will print detailed usage
Examples:
1. run a hive query and print the results
::
$ qds.py --token 'xxyyzz' hivecmd run --query "show tables"
$ qds.py --token 'xxyyzz' hivecmd run --script_location /tmp/myquery
$ qds.py --token 'xxyyzz' hivecmd run --script_location s3://my-qubole-location/myquery

2. pass in api token from bash environment variable
::
$ export QDS_API_TOKEN=xxyyzz
3. run the example hadoop command
::
$ qds.py hadoopcmd run streaming -files 's3n://paid-qubole/HadoopAPIExamples/WordCountPython/mapper.py,s3n://paid-qubole/HadoopAPIExamples/WordCountPython/reducer.py' -mapper mapper.py -reducer reducer.py -numReduceTasks 1 -input 's3n://paid-qubole/default-datasets/gutenberg' -output 's3n://example.bucket.com/wcout'
4. check the status of command # 12345678
::
$ qds.py hivecmd check 12345678
{"status": "done", ... }5. If you are hitting api\_url other than api.qubole.com, then you can pass it in command line as ``--url`` or set in as env variable
::
$ qds.py --token 'xxyyzz' --url https://<env>.qubole.com/api hivecmd ...
or
$ export QDS_API_URL=https://<env>.qubole.com/api
SDK API
-------

An example Python application needs to do the following:

1. Set the api_token and api_url (if the api_url is other than api.qubole.com):
::
from qds_sdk.qubole import Qubole
Qubole.configure(api_token='ksbdvcwdkjn123423')
# or
Qubole.configure(api_token='ksbdvcwdkjn123423', api_url='https://<env>.qubole.com/api')
2. Use the Command classes defined in commands.py to execute commands.
To run a Hive command::
from qds_sdk.commands import *
hc = HiveCommand.create(query='show tables')
print("Id: %s, Status: %s" % (str(hc.id), hc.status))

``example/mr_1.py`` contains a Hadoop Streaming example
Reporting Bugs and Contributing Code
------------------------------------

* Want to report a bug or request a feature? Please open `an issue <https://github.com/qubole/qds-sdk-py/issues>`_.
* Want to contribute? Fork the project and create a pull request with your changes against the ``unreleased`` branch.

Where are the maintainers?
---------------------------

Qubole was acquired. All the maintainers of this repo have moved on. Some of the employees founded ClearFeed. Others are on big data teams at Microsoft, Amazon, et al.