Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/cfpb/aurora
An open source enterprise data warehousing and analysis platform.
- Host: GitHub
- URL: https://github.com/cfpb/aurora
- Owner: cfpb
- License: cc0-1.0
- Created: 2015-03-27T20:37:23.000Z (over 9 years ago)
- Default Branch: develop
- Last Pushed: 2021-11-08T22:12:23.000Z (about 3 years ago)
- Last Synced: 2024-04-08T00:16:51.973Z (9 months ago)
- Topics: ansible, data-science, data-warehousing
- Language: Jinja
- Homepage: https://github.com/cfpb/aurora
- Size: 2.42 MB
- Stars: 21
- Watchers: 10
- Forks: 36
- Open Issues: 6
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# Aurora - An Enterprise Data Platform
**Description**: This repository is a collection of Ansible scripts and other
supporting code required to build a scalable, secure, and powerful data
processing platform.

- **Technology stack**: Ansible is used for deployment.
- **Status**: Under active development. Once we've reached "Alpha", further
changes will be tracked in the [CHANGELOG](CHANGELOG.md).

## Dependencies
The Aurora data platform was designed to work on a network of RHEL 6.5 servers, and
has only been tested in that environment. Additionally, you must have Ansible
installed to deploy, and Vagrant to run locally.

## Installation
To install locally, run `vagrant up` from the /deploy directory. To deploy
to a remote environment, a custom inventory file is required, along with a custom
group_vars file to go with it. Once those have been added, run
`ansible-playbook site.yml -i inventories/{{ your_environment }}`.

* Note: if there isn't a postgres instance running on your machine, you'll need to pass an environment variable to install it
  - Ex: ```EXTRA_VARS='{pp_install:true}' vagrant up [server_name]```

## Configuration
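For orientation, here is a minimal sketch of what a custom inventory and its matching group_vars file could look like. Everything specific in it is an assumption made for illustration: the environment name `staging`, the host `db01.example.com`, the variable `example_setting`, and the idea that `site.yml` lives in the `deploy` directory. None of these values come from the Aurora repository, so substitute the groups and variables that the roles actually expect.

```shell
#!/bin/sh
# Hypothetical scaffold for a custom Aurora environment. The environment name,
# host name, and variable below are placeholders, not values taken from the
# Aurora repository.
set -eu

ENV_NAME="${ENV_NAME:-staging}"        # assumed environment name
DEPLOY_DIR="${DEPLOY_DIR:-./deploy}"   # assumed location of site.yml

mkdir -p "$DEPLOY_DIR/inventories" "$DEPLOY_DIR/group_vars"

# INI-style Ansible inventory with a single group named after the environment.
cat > "$DEPLOY_DIR/inventories/$ENV_NAME" <<EOF
[$ENV_NAME]
db01.example.com
EOF

# Variables applied to every host in that group. Ansible picks up
# group_vars/<group name> from the directory next to the playbook.
cat > "$DEPLOY_DIR/group_vars/$ENV_NAME" <<EOF
---
example_setting: "replace-me"
EOF

# Print (rather than run) the deploy command so the sketch has no side effects.
echo "cd $DEPLOY_DIR && ansible-playbook site.yml -i inventories/$ENV_NAME"
```

The script only writes two small files and prints the deploy command, so it can be tried in a scratch directory before touching a real checkout.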
As mentioned above, you can configure the deployment using Ansible's inventory
and group_vars functionality.

## Usage
TBD - Likely will create more substantial documentation defining what each
server is for and how it is meant to be used.

## How to test the software
### Running Docker on a Macbook
1. brew cask install docker-toolbox
1. docker-machine start default
1. docker-machine create --driver "virtualbox" default
1. eval "$(docker-machine env default)"
1. docker ps (to validate it works)

If Docker starts running out of disk space, connect to the boot2docker VM (or Mac terminal) and run this:
docker ps -a -q | xargs -n 1 -I {} docker rm {}
Command to make sure the exited containers are deleted:
docker rm -v $(docker ps -a -q -f status=exited)
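
One caveat with the command above: when there are no exited containers, the inner `docker ps` prints nothing and `docker rm` fails because it is given no container IDs. A minimal sketch of a guarded variant follows. The function name `remove_exited_containers` and the `DOCKER` override variable are assumptions introduced here for illustration; they are not part of Docker or of this repository.

```shell
#!/bin/sh
# Guarded variant of the cleanup command above. DOCKER is an assumed override
# hook (not part of Docker itself) so the function can be dry-run with a stub.
DOCKER="${DOCKER:-docker}"

remove_exited_containers() {
    # Collect the IDs of exited containers, one per line.
    ids=$("$DOCKER" ps -a -q -f status=exited)
    if [ -z "$ids" ]; then
        echo "No exited containers to remove."
        return 0
    fi
    # Intentionally unquoted so that each ID becomes its own argument.
    # -v also removes the anonymous volumes attached to each container.
    "$DOCKER" rm -v $ids
}
```

After sourcing the file, call `remove_exited_containers` from the same shell in which `docker ps` works.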
### Setting up Test Environment
When developing the Travis CI file, it can be helpful to test in Travis's environment, as described
here: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
* Note: you'll need to run the Travis image with --privileged
  * docker run --privileged -it quay.io/travisci/travis-ruby /bin/bash

To do this, follow the steps above up to actually running your commands. Before doing so, Docker must
be installed in the Travis CI image, like so:

1. sudo apt-get install apt-transport-https ca-certificates
1. sudo apt-key adv --keyserver hkp://ha.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
1. echo "deb https://apt.dockerproject.org/repo ubuntu-precise main" | sudo tee /etc/apt/sources.list.d/docker.list
1. sudo apt-get update
1. apt-cache policy docker-engine
1. sudo apt-get install docker-engine (may need --force-yes)
1. sudo ln -s /bin/true /sbin/initctl
1. sudo service docker start ->>
1. docker daemon -H unix:///var/run/docker.sock&>/var/log/docker.log &
1. git clone https://github.com/[githubfork]/aurora /aurora
1. cd /aurora
1. git checkout travis
1. Run commands in travis.yml file

TBD
## Role-specific documentation
Some Ansible roles in this project have role-specific documentation:
* [postgresql-server](deploy/roles/postgresql-server/README.md)
* [python27-scl](deploy/roles/python27-scl/README.md)
* [python36-scl](deploy/roles/python36-scl/README.md)

## Known issues
- Travis-CI hangs when jobs complete - [resolution](https://www.jeffgeerling.com/blog/2017/fix-ansible-hanging-when-used-docker-and-tty)
- R package installation takes too long (currently skipped)

## Getting help
Open an issue on GitHub if you need help, have a feature request, or have
code to contribute.

## Getting involved
Refer to [CONTRIBUTING](CONTRIBUTING.md) if you'd like to help!
----
## Open source licensing info
1. [TERMS](TERMS.md)
2. [LICENSE](LICENSE)
3. [CFPB Source Code Policy](https://github.com/cfpb/source-code-policy/)

----
## Credits and references