https://github.com/brooksian/ds_gtdb
KMeans Clustering on Global Terrorism Database
global-terrorism-database machine-learning spark sparksql zeppelin-notebook
- Host: GitHub
- URL: https://github.com/brooksian/ds_gtdb
- Owner: BrooksIan
- Created: 2017-02-01T23:23:21.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-04-05T00:54:18.000Z (almost 7 years ago)
- Last Synced: 2024-11-18T16:37:37.115Z (3 months ago)
- Topics: global-terrorism-database, machine-learning, spark, sparksql, zeppelin-notebook
- Homepage: http://start.umd.edu/gtd/
- Size: 1.14 MB
- Stars: 2
- Watchers: 1
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Data Science in Apache Spark
## Exploring the Global Terrorism Database Dataset
**Level**: Moderate
**Language**: Scala
**Requirements**:
- [HDP 2.5](http://hortonworks.com/products/sandbox/) (or later) or [HDCloud](https://hortonworks.github.io/hdp-aws/)
- Spark 1.6.x
- Download [GTDB dataset](https://www.kaggle.com/START-UMD/gtd)

**Author**: Ian Brooks
**Follow** LinkedIn - [Ian Brooks PhD](https://www.linkedin.com/in/ianrbrooksphd/)
## Context
Information on more than 150,000 Terrorist Attacks

The Global Terrorism Database (GTD) is an open-source database including information on terrorist attacks around the world from 1970 through 2015 (with annual updates planned for the future). The GTD includes systematic data on domestic as well as international terrorist incidents that have occurred during this time period and now includes more than 150,000 cases. The database is maintained by researchers at the National Consortium for the Study of Terrorism and Responses to Terrorism (START), headquartered at the University of Maryland. [More Information](http://start.umd.edu/gtd/)
## Instructions
1. Download the Global Terrorism Database CSV file from [Kaggle](https://www.kaggle.com/START-UMD/gtd). Note: You will need a Kaggle account.
2. Download the [Zeppelin Note](https://github.com/BrooksIan/DS_GTDB).
3. Upload the GTDB CSV file.
4. In Zeppelin, import the Zeppelin Note [JSON file](https://github.com/BrooksIan/SacWomenInData). For assistance, please use the following [tutorial](https://hortonworks.com/tutorial/getting-started-with-apache-zeppelin/).
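
Once the CSV file is uploaded, a Zeppelin paragraph along the following lines could load it and run KMeans. This is a minimal sketch, not the code from the provided note. It assumes Spark 1.6.x with the `sqlContext` that Zeppelin provides and the spark-csv package (`com.databricks:spark-csv`) added as a dependency. It also assumes a hypothetical file path `/tmp/gtdb.csv`, that the dataset has `latitude` and `longitude` columns, and an arbitrary choice of 5 clusters.

```scala
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.ml.feature.VectorAssembler

// Hypothetical location of the uploaded CSV; adjust to where the file was placed.
val csvPath = "/tmp/gtdb.csv"

// Spark 1.6.x has no built-in CSV reader, so this relies on the spark-csv package.
val raw = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load(csvPath)

// Assumed column names: cast the coordinates to doubles and drop rows where either is missing.
val coords = raw
  .select(
    raw("latitude").cast("double").as("latitude"),
    raw("longitude").cast("double").as("longitude"))
  .na.drop()

// Pack the two numeric columns into the single vector column that KMeans expects.
val assembler = new VectorAssembler()
  .setInputCols(Array("latitude", "longitude"))
  .setOutputCol("features")
val features = assembler.transform(coords)

// k = 5 is an arbitrary starting point, not a tuned value.
val kmeans = new KMeans()
  .setK(5)
  .setSeed(1L)
  .setFeaturesCol("features")
  .setPredictionCol("cluster")
val model = kmeans.fit(features)

// Attach a cluster id to every incident and print the cluster centers.
val clustered = model.transform(features)
model.clusterCenters.foreach(println)
```

Clustering on coordinates alone groups incidents by geography only. Other numeric columns can be added to the `VectorAssembler` input in the same way.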
## License
Unlike other Apache projects, which use the Apache License, this project uses an advanced and modern license named The Star And Thank Author License (SATA). Please see the [LICENSE](LICENSE) file for more information.