
.. image:: docs/images/hue_logo.png

Welcome to the repository for Hue
=================================

Hue is an open source Web interface for analyzing data with Apache Hadoop: `gethue.com <http://gethue.com>`_

.. image:: docs/images/hue-screen.png

It features:

* File Browser for accessing HDFS
* Hive Editor for developing and running Hive queries
* Search App for querying, exploring, visualizing data and dashboards with Solr
* Impala App for executing interactive SQL queries
* Spark Editor and Dashboard
* Pig Editor for submitting Pig scripts
* Oozie Editor and Dashboard for submitting and monitoring workflows, coordinators and bundles
* HBase Browser for visualizing, querying and modifying HBase tables
* Metastore Browser for accessing Hive metadata and HCatalog
* Job Browser for accessing MapReduce jobs (MR1/MR2-YARN)
* Job Designer for creating MapReduce/Streaming/Java jobs
* A Sqoop 2 Editor and Dashboard
* A ZooKeeper Browser and Editor
* A DB Query Editor for MySQL, PostgreSQL, SQLite and Oracle

On top of that, an SDK is available for creating new apps integrated with Hadoop.

More user and developer documentation is available at http://gethue.com.

Getting Started
===============
To build and get the development server running::

  $ git clone http://github.com/drylikov/hue.git
  $ cd hue
  $ make apps
  $ build/env/bin/hue runserver

Hue should now be running at http://localhost:8000.

The configuration in development mode is ``desktop/conf/pseudo-distributed.ini``.

Note: to start the production server instead (without automatic reloading after source changes)::

  $ build/env/bin/supervisor

To run the tests:

Install the mini cluster (only once)::

  $ ./tools/jenkins/jenkins.sh slow

Run all the tests::

  $ build/env/bin/hue test all

Or just some parts of the tests, e.g.::

  $ build/env/bin/hue test specific impala
  $ build/env/bin/hue test specific impala.tests:TestMockedImpala
  $ build/env/bin/hue test specific impala.tests:TestMockedImpala.test_basic_flow
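
The three ``test specific`` examples above narrow the target from an app, to a test class, to a single method. As a minimal sketch of that naming pattern, the helper below builds the matching argument list. ``hue_test_command`` is a hypothetical name that is not part of Hue, and it assumes the test module is always called ``tests``, as in the examples. It targets a standalone modern Python 3 interpreter, not Hue's own Python 2 environment.

```python
def hue_test_command(app, test_class=None, method=None,
                     hue_bin="build/env/bin/hue"):
    """Build the argument list for ``hue test specific <target>``.

    Hypothetical helper: it only mirrors the target syntax shown in the
    README examples (app, app.tests:Class, app.tests:Class.method).
    """
    if method and not test_class:
        raise ValueError("a test method can only be given with a test class")
    target = app
    if test_class:
        # Assumption: the test module is named "tests", as in the examples.
        target = "%s.tests:%s" % (app, test_class)
        if method:
            target += "." + method
    return [hue_bin, "test", "specific", target]


# The resulting list can be passed to subprocess.call() from the Hue checkout.
command = hue_test_command("impala", "TestMockedImpala", "test_basic_flow")
print(" ".join(command))
```

Returning a list rather than a single string avoids shell quoting issues if the command is later run through ``subprocess``.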

Development Prerequisites
===========================
You'll need these library development packages and tools installed on
your system:

Ubuntu:

* ant
* gcc
* g++
* libkrb5-dev
* libmysqlclient-dev
* libssl-dev
* libsasl2-dev
* libsasl2-modules-gssapi-mit
* libsqlite3-dev
* libtidy-0.99-0 (for unit tests only)
* libxml2-dev
* libxslt-dev
* mvn (from ``maven`` package or maven3 tarball)
* openldap-dev / libldap2-dev
* python-dev
* python-simplejson
* python-setuptools

CentOS:

* ant
* asciidoc
* cyrus-sasl-devel
* cyrus-sasl-gssapi
* gcc
* gcc-c++
* krb5-devel
* libtidy (for unit tests only)
* libxml2-devel
* libxslt-devel
* mvn (from ``maven`` package or maven3 tarball)
* mysql
* mysql-devel
* openldap-devel
* python-devel
* python-simplejson
* sqlite-devel

macOS (MacPorts):

* liblxml
* libxml2
* libxslt
* mysql5-devel
* simplejson (easy_install)
* sqlite3
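
Before running ``make apps`` it can help to confirm that the command-line build tools from the lists above are on the ``PATH``. The following is a minimal sketch, not part of Hue: ``missing_tools`` and ``BUILD_TOOLS`` are hypothetical names, only the executables named above (``ant``, ``gcc``, ``g++``, ``mvn``) are checked, and the library development packages still have to be verified with the system package manager. It assumes a standalone Python 3 interpreter.

```python
import shutil

# Executables named in the prerequisite lists. The -dev / -devel library
# packages are not checked here, since they do not install a command.
BUILD_TOOLS = ["ant", "gcc", "g++", "mvn"]


def missing_tools(tools=BUILD_TOOLS):
    """Return the tools from ``tools`` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing build tools: " + ", ".join(missing))
else:
    print("All listed build tools were found.")
```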

File Layout
===========
The Hue "framework" lives in ``desktop/``: ``desktop/core/`` contains the Web components and
``desktop/libs/`` the API for talking to Hadoop.
The installable apps live in ``apps/``. Please place third-party dependencies in the app's ``ext-py/``
directory.

The typical directory structure inside an application includes:

src/
  for Python/Django code::

    models.py
    urls.py
    views.py
    forms.py
    settings.py

conf/
  for configuration (``.ini``) files to be installed

static/
  for static HTML/js resources and help doc

templates/
  for data to be put through a template engine

locales/
  for localizations in multiple languages
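
As a rough illustration of the layout above, the sketch below reports which of the typical directories are absent from an app directory. ``missing_layout_entries`` and ``TYPICAL_ENTRIES`` are hypothetical names, and the check assumes the five directories sit directly under the app directory as listed. Real apps may nest or omit some of them, so treat the result as a hint rather than a validation. It assumes a standalone Python 3 interpreter.

```python
import os
import tempfile

# Directory names taken from the layout listing above.
TYPICAL_ENTRIES = ["src", "conf", "static", "templates", "locales"]


def missing_layout_entries(app_dir, expected=TYPICAL_ENTRIES):
    """Return the expected directory names that do not exist in ``app_dir``."""
    return [name for name in expected
            if not os.path.isdir(os.path.join(app_dir, name))]


# Demo on a throwaway directory that only contains src/ and conf/.
with tempfile.TemporaryDirectory() as app_dir:
    for name in ("src", "conf"):
        os.mkdir(os.path.join(app_dir, name))
    result = missing_layout_entries(app_dir)

print(result)
```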

For the URLs within your application, you should make your own ``urls.py``
which will be automatically rooted at ``/yourappname/`` in the global
namespace. See ``apps/about/src/about/urls.py`` for an example.
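
To make the idea of "rooting" concrete, here is a simplified pure-Python illustration of how app-relative paths end up under ``/yourappname/``. It is not Hue's or Django's actual mechanism: ``root_app_urls`` is a hypothetical function, and it uses plain path strings where a real ``urls.py`` would use URL patterns mapped to views.

```python
def root_app_urls(app_name, app_paths):
    """Prefix each app-relative path with ``/<app_name>/``.

    Illustration only: real URL routing matches patterns against views,
    whereas this just shows the resulting URL namespace.
    """
    prefix = "/%s/" % app_name.strip("/")
    return [prefix + path.lstrip("/") for path in app_paths]


# Hypothetical app-relative paths, as an app's own urls.py might declare them.
rooted = root_app_urls("yourappname", ["", "list/", "detail/42/"])
for url in rooted:
    print(url)
```

The app only declares paths relative to itself, so the same ``urls.py`` keeps working whatever prefix it is mounted under.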

Main Stack
==========
Hue would not be possible without:

* Python 2.6 - 2.7
* Django 1.4 (https://docs.djangoproject.com/en/1.4/)
* Knockout.js (http://knockoutjs.com/)
* jQuery (http://jquery.com/)
* Bootstrap (http://getbootstrap.com/)