https://github.com/maxpoi/oz-twitter-analysis
Assignment 2 for COMP90024, Cluster and Cloud Computing.
- Host: GitHub
- URL: https://github.com/maxpoi/oz-twitter-analysis
- Owner: maxpoi
- Created: 2021-04-26T09:36:54.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-26T01:47:48.000Z (about 5 years ago)
- Last Synced: 2025-04-04T21:34:05.095Z (about 1 year ago)
- Topics: ansible, aurin, backend, cloud, couchdb, docker, docker-swarm, frontend, restful-api, twitter-api, visualization
- Language: JavaScript
- Homepage:
- Size: 3.94 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# COMP90024-Assignment-2
## Contributors ✨

- Jiacheng Ye 💻
- Ma-Yuyao 💻
- cc1032802 💻
- jxstar11 💻
- YUJGUAN 💻
## Project structure
> The ansible folder uses the Ansible playbook folder structure.
> If a *templates* folder exists (as in *ansible/roles/deploy/couchdb*), a J2 template is used to generate the required files.
> J2 templates are needed because some files have to use the Ansible inventory variables.
```
.
├── ansible                                 # The folder of all Ansible scripts (for setting up & deploying the server)
│   ├── roles                               # The Ansible roles folder, listing all the tasks
│   │   ├── deploy                          # where all the deploy tasks are listed
│   │   │   ├── couchdb
│   │   │   │   ├── tasks
│   │   │   │   │   └── main.yaml
│   │   │   │   └── templates
│   │   │   │       └── xxx.xxx.j2
│   │   │   ├── app
│   │   │   │   └── ...
│   │   │   ├── copy-directory
│   │   │   │   └── ...
│   │   │   └── harvester
│   │   │       └── ...
│   │   ├── openstack                       # where all the tasks for setting up MRC are listed
│   │   │   ├── ...                         # ⬆
│   │   │   └── remove                      # where all the tasks for uninstalling the server are listed
│   │   │       ├── ...                     # ⬆
│   │   │       └── ...                     # ⬆
│   │   └── set-up                          # where all the tasks for setting up each individual instance are listed
│   │       ├── ...                         # ⬆
│   │       └── ...                         # ⬆
│   ├── vars                                # A folder listing all Ansible environment variables used
│   ├── hosts                               # A customized Ansible inventory file; passed to the playbook with the -i option
│   ├── main.yaml                           # The main Ansible playbook file. It uses all the roles except the ones in the remove folder
│   └── uninstall_server.yaml               # If this playbook is run, all MRC instances, security groups, and volumes will be removed
├── app                                     # The folder for the actual application
│   ├── backend                             # The folder containing all back-end code
│   │   ├── api                             # The folder containing all APIs provided to the front-end
│   │   │   ├── get_aurin.py                # Defines & implements the APIs for getting AURIN data
│   │   │   └── get_map_reduce_result.py    # Defines & implements the APIs for getting map_reduce data
│   │   ├── crawler                         # The folder containing the Twitter Harvester code
│   │   │   ├── crawl_by_keyword            # Twitter Harvester code for crawling by keyword
│   │   │   │   ├── 5G                      # Twitter Harvester code for the 5G scenario
│   │   │   │   │   ├── ...                 # ⬆
│   │   │   │   │   └── ...                 # ⬆
│   │   │   │   ├── AFL                     # Twitter Harvester code for the AFL scenario
│   │   │   │   │   ├── ...                 # ⬆
│   │   │   │   │   └── ...                 # ⬆
│   │   │   │   ├── food                    # Twitter Harvester code for the food scenario
│   │   │   │   │   ├── ...                 # ⬆
│   │   │   │   │   └── ...                 # ⬆
│   │   │   │   └── vaccine                 # Twitter Harvester code for the vaccine scenario
│   │   │   │       ├── ...                 # ⬆
│   │   │   │       └── ...                 # ⬆
│   │   │   ├── crawl_by_raw_data           # Twitter Harvester code for crawling any keywords
│   │   │   │   ├── node_1                  # Twitter Harvester code for crawling any keywords, hosted on node_1
│   │   │   │   │   ├── Dockerfile          # ⬆
│   │   │   │   │   └── ...                 # ⬆
│   │   │   │   ├── node_2                  # Twitter Harvester code for crawling any keywords, hosted on node_2
│   │   │   │   │   ├── Dockerfile          # ⬆
│   │   │   │   │   └── ...                 # ⬆
│   │   │   │   └── node_3                  # Twitter Harvester code for crawling any keywords, hosted on node_3
│   │   │   │       ├── Dockerfile          # ⬆
│   │   │   │       └── ...                 # ⬆
│   │   │   └── twitter_api_config.py       # Sets the Twitter API configuration
│   │   ├── mapreduce                       # The folder containing the CouchDB map_reduce code
│   │   │   └── map_reduce.py               # Sets up map_reduce for CouchDB
│   │   ├── upload_data                     # The folder containing the code for uploading data to CouchDB
│   │   │   ├── AURIN                       # The AURIN data (.json) needed for this project
│   │   │   │   └── ...                     # ⬆
│   │   │   ├── AURIN-CSV                   # The AURIN data (.csv) needed for this project
│   │   │   │   └── ...                     # ⬆
│   │   │   ├── Dockerfile
│   │   │   ├── input_data_from_files.py    # Inputs data into CouchDB
│   │   │   └── requirements.txt
│   │   ├── utils                           # The folder containing utility functions
│   │   │   ├── get_path.py                 # Gets the path
│   │   │   └── sentiment_analysis.py       # Performs sentiment analysis
│   │   ├── couchdb_config.py               # Sets the CouchDB configuration
│   │   ├── requirements.txt
│   │   ├── run_node_1.sh
│   │   ├── run_node_2.sh
│   │   ├── run_node_3.sh
│   │   └── run_upload_data.sh
│   └── frontend
│       ├── Dockerfile
│       └── ...
├── .all-contributorsrc                     # Auto-generated file from the all-contributors plugin
├── openrc.sh                               # An environment set-up bash file; used by the run scripts
├── run_first.sh                            # The main shell script. Must be run at the very start
├── run_last.sh                             # The main shell script. Must be run at the very end
└── README.md
```
## How to run?
Before running the shell scripts, a couple of preparations must be done first.
1. Go to your MRC dashboard and download the OpenRC file by clicking your profile icon and choosing download.
2. Rename the downloaded file to “*openrc.sh*” and move it into the project root folder.
3. Navigate to “*Key Pairs*” under “*Project → Compute*” and create a new key pair by clicking “Create Key Pair”.
4. Fill in the key name and choose “*SSH Key*” as the key type.
5. Save the downloaded *.pem* file to a directory you can easily navigate to. (Warning! **DO NOT** share this key pair with anyone or publish it!)
6. Open the file “*hosts*” under *./ansible/* and, after “*ansible_ssh_private_key_file*”, append the *absolute path* to the .pem file you just downloaded.
7. Open the file “*mrc.yaml*” under *./ansible/vars/* and, in the field “*instance: key_name:*”, replace the old key name with the name of the key you just created.
8. Click “User” at the top right, then click “Setting”.
9. Navigate to “Reset Password”.
10. Click “Reset Password” and write down the new password somewhere safe. You will need to enter this password later when running the shell scripts.
11. *[Optional]* Open the “*mrc.yaml*” file again and decrease “*vol_size*” if there is no 500 GB of storage available in your project space.
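The file-related preparations (steps 2, 5 and 6) can be checked before any script is run. The following is only a minimal sketch and not part of the repository: the function name `preflight` is hypothetical, and it assumes that the *hosts* file contains the key path as plain text somewhere after “*ansible_ssh_private_key_file*”.

```python
from pathlib import Path
from typing import List


def preflight(project_root: Path, key_file: Path) -> List[str]:
    """Return a list of problems found; an empty list means the files look ready.

    `preflight` is a hypothetical helper, not something shipped with the project.
    """
    problems = []

    # Steps 1-2: the OpenRC file must sit in the project root as openrc.sh.
    if not (project_root / "openrc.sh").is_file():
        problems.append("openrc.sh is missing from the project root")

    # Steps 5-6: the .pem key must exist and be referenced by an absolute path.
    if not key_file.is_absolute():
        problems.append("the key path must be absolute")
    elif not key_file.is_file():
        problems.append(f"no key file found at {key_file}")

    # Step 6: the inventory file should mention the key path.
    # Assumption: the path appears verbatim in ansible/hosts.
    hosts = project_root / "ansible" / "hosts"
    if not hosts.is_file():
        problems.append("ansible/hosts not found")
    elif str(key_file) not in hosts.read_text():
        problems.append("ansible/hosts does not mention the key path")

    return problems
```

Calling `preflight(Path.cwd(), Path("/absolute/path/to/key.pem"))` from the project root (with your own key path substituted) returns an empty list when nothing is missing.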
After that, run the shell scripts as described below to run Ansible.
1) cd to the project folder.
2) In your terminal, enter “*sh ./run_first.sh*” (or simply double-click the “*run_first.sh*” file).
3) The terminal will ask you to enter a password, which is the password from step 10 above.
4) Copy the smallest IP address. By smallest we mean numerically smallest, for example, 172.0.0.0 < 172.0.0.1.
5) Then open the file “*couchdb_config.py*” under *./app/backend/* and replace all strings “172.xxx.xxx.xxx” with the smallest IP address you just copied.
6) Make sure you are still in the project root folder in your terminal, then enter “*sh ./run_last.sh*” and enter the same password again.
7) When this script finishes, the app is up on the cloud and can be accessed by entering the smallest IP address from step 4 followed by “:8003”, for example “172.168.123.43:8003”. Make sure you are using The University of Melbourne's network as well.
Note: This assumes the project space in MRC is empty. If it is not, uncomment the second line in “*run_first.sh*” before running it.
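
Comparing IP addresses as plain text gives a different order from the numeric one step 4 asks for: as strings, “172.26.130.10” sorts before “172.26.130.9”. The sketch below shows one way to pick the numerically smallest address and substitute it for the placeholder from step 5. It is an illustration only, not code from the repository: the function names and the sample addresses are hypothetical, and the `http://` scheme in `app_url` is an assumption (the steps above only give the port, 8003).

```python
import ipaddress
from typing import Iterable

# The literal placeholder named in step 5.
PLACEHOLDER = "172.xxx.xxx.xxx"


def smallest_ip(addresses: Iterable[str]) -> str:
    """Return the numerically smallest IPv4 address (hypothetical helper)."""
    # IPv4Address objects compare by numeric value, not by their text form.
    return str(min(ipaddress.IPv4Address(a) for a in addresses))


def fill_placeholder(config_text: str, ip: str) -> str:
    """Replace every occurrence of the placeholder with the chosen address."""
    return config_text.replace(PLACEHOLDER, ip)


def app_url(ip: str) -> str:
    """Build the address from step 7; the http:// scheme is an assumption."""
    return f"http://{ip}:8003"


if __name__ == "__main__":
    # Sample addresses are made up for illustration.
    ips = ["172.26.130.10", "172.26.130.9"]
    chosen = smallest_ip(ips)
    print(chosen)
    print(fill_placeholder('host = "172.xxx.xxx.xxx"', chosen))
    print(app_url(chosen))
```

To apply this to the real file, read *./app/backend/couchdb_config.py* as text, pass it through `fill_placeholder`, and write the result back.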
## Project specification
See the specification PDF.