https://github.com/lugu/cleanwebhackathon2013team3

process taxi samples

Host: GitHub
URL: https://github.com/lugu/cleanwebhackathon2013team3
Owner: lugu
Created: 2013-07-27T11:26:03.000Z (over 12 years ago)
Default Branch: master
Last Pushed: 2013-07-30T18:17:42.000Z (over 12 years ago)
Last Synced: 2024-12-30T19:56:58.525Z (11 months ago)
Language: Scala
Size: 59.1 MB
Stars: 0
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

          
Introduction
============

This project was created during the CleanWeb Hackathon 2013 in Beijing.
The goal is to analyze the taxi dataset to find areas in Beijing with high
potential for pedestrian development.
This program uses Spark, a distributed computing framework written in Scala.

Team
====

 * Bingyue
 * Florian
 * Laura
 * Ludovic
 * Martin
 * Ray
 * Sam
 * Yao

Processing
==========

The program proceeds as follows:

 1. read the input file
 2. parse the text format into a binary format (Sample objects)
 3. keep only the events where a passenger gets in or out of a taxi
 4. join consecutive events (get in and get out) into a "trip"
 5. measure the distance of each trip
 6. filter the trips shorter than 3 km
 7. group the departure and arrival points into 10 clusters
 8. for each cluster, print 30 points (for visualization)
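Steps 2 to 6 can be sketched with plain Scala collections, without Spark. This is only an illustration under stated assumptions, not the project's actual code: the `Sample` fields, the CSV column order (`taxiId,time,lat,lon,event`), the event labels `in` and `out`, and the use of the haversine formula for the trip distance are all hypothetical. The sketch also assumes that step 6 keeps the trips shorter than 3 km.

```scala
import scala.util.Try

// Hypothetical record layout: the real Sample class and CSV columns may differ.
case class Sample(taxiId: String, time: Long, lat: Double, lon: Double, event: String)
case class Trip(start: Sample, end: Sample, km: Double)

object TripSketch {
  // Step 2: parse one CSV line; malformed lines yield None.
  // Assumed column order: taxiId,time,lat,lon,event
  def parse(line: String): Option[Sample] = line.split(",").map(_.trim) match {
    case Array(id, t, lat, lon, ev) =>
      Try(Sample(id, t.toLong, lat.toDouble, lon.toDouble, ev)).toOption
    case _ => None
  }

  // Step 5: great-circle (haversine) distance in kilometres (an assumption;
  // the README does not say how the distance is measured).
  def haversineKm(a: Sample, b: Sample): Double = {
    val earthRadiusKm = 6371.0
    val dLat = math.toRadians(b.lat - a.lat)
    val dLon = math.toRadians(b.lon - a.lon)
    val h = math.pow(math.sin(dLat / 2), 2) +
      math.cos(math.toRadians(a.lat)) * math.cos(math.toRadians(b.lat)) *
        math.pow(math.sin(dLon / 2), 2)
    2 * earthRadiusKm * math.asin(math.sqrt(h))
  }

  // Steps 3 to 6: keep in/out events, pair them per taxi, measure, keep short trips.
  def shortTrips(lines: Seq[String], maxKm: Double = 3.0): Seq[Trip] = {
    val events = lines.flatMap(parse)
      .filter(s => s.event == "in" || s.event == "out") // step 3
    events.groupBy(_.taxiId).values.toSeq.flatMap { perTaxi =>
      val ordered = perTaxi.sortBy(_.time)
      ordered.zip(ordered.drop(1)).collect { // step 4: consecutive in -> out
        case (a, b) if a.event == "in" && b.event == "out" =>
          Trip(a, b, haversineKm(a, b)) // step 5
      }
    }.filter(_.km < maxKm) // step 6 (assumed to keep the short trips)
  }
}
```

With Spark, the same stages would typically map onto `map`, `filter` and `groupBy` transformations over a distributed dataset rather than a local `Seq`.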

Input is read from the file sample.csv.

Output is saved into:

 * ./rides.txt/part-XXXX : all the departure/arrival points
 * ./results.txt/part-XXXX : the departure/arrival points closest to the cluster centers
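The clustering behind results.txt (grouping the points, then keeping those closest to each center) can be sketched as follows. The README does not name the clustering algorithm, so this assumes plain k-means (Lloyd's algorithm) on (lat, lon) pairs with squared Euclidean distance, which is only a rough approximation on geographic coordinates. All names here (`ClusterSketch`, `kMeans`, `closest`) are hypothetical.

```scala
object ClusterSketch {
  type Point = (Double, Double) // assumed to be (lat, lon)

  def sqDist(a: Point, b: Point): Double = {
    val dx = a._1 - b._1
    val dy = a._2 - b._2
    dx * dx + dy * dy
  }

  // Index of the center nearest to p.
  def nearest(p: Point, centers: Seq[Point]): Int =
    centers.indices.minBy(i => sqDist(p, centers(i)))

  def mean(ps: Seq[Point]): Point =
    (ps.map(_._1).sum / ps.size, ps.map(_._2).sum / ps.size)

  // Lloyd's algorithm with a fixed number of iterations.
  // Initial centers are the first k distinct points (an arbitrary, deterministic choice).
  def kMeans(points: Seq[Point], k: Int, iterations: Int = 20): Seq[Point] = {
    var centers = points.distinct.take(k)
    for (_ <- 1 to iterations) {
      val groups = points.groupBy(p => nearest(p, centers))
      // A center that attracted no points keeps its previous position.
      centers = centers.indices.map(i => groups.get(i).map(mean).getOrElse(centers(i)))
    }
    centers
  }

  // For each center, the n points of its cluster that lie closest to it.
  def closest(points: Seq[Point], centers: Seq[Point], n: Int): Seq[Seq[Point]] = {
    val groups = points.groupBy(p => nearest(p, centers))
    centers.indices.map { i =>
      groups.getOrElse(i, Seq.empty).sortBy(p => sqDist(p, centers(i))).take(n)
    }
  }
}
```

For example, `ClusterSketch.closest(points, ClusterSketch.kMeans(points, 10), 30)` would return up to 30 points for each of 10 clusters, matching the numbers given in the processing steps.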

Build
=====

Install sbt (Simple Build Tool) and execute:

	$ sbt assembly

Run
===

Verify that the file sample.csv is in the current directory, then run:

	$ java -jar ./target/scala-2.9.3/CleanWebHackathon2013-assembly-1.0.jar 10 300
