{"id":23712138,"url":"https://github.com/lugu/cleanwebhackathon2013team3","last_synced_at":"2026-02-08T04:30:15.027Z","repository":{"id":9742143,"uuid":"11704053","full_name":"lugu/CleanWebHackathon2013Team3","owner":"lugu","description":"process taxi samples","archived":false,"fork":false,"pushed_at":"2013-07-30T18:17:42.000Z","size":61940,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-12-30T19:56:58.525Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lugu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-07-27T11:26:03.000Z","updated_at":"2013-08-15T01:59:56.000Z","dependencies_parsed_at":"2022-07-08T06:30:56.575Z","dependency_job_id":null,"html_url":"https://github.com/lugu/CleanWebHackathon2013Team3","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lugu%2FCleanWebHackathon2013Team3","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lugu%2FCleanWebHackathon2013Team3/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lugu%2FCleanWebHackathon2013Team3/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lugu%2FCleanWebHackathon2013Team3/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lugu","download_url":"https://codeload.github.com/lugu/CleanWebHackathon2013Team3/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"http
s://github.com","kind":"github","repositories_count":239800432,"owners_count":19699122,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-30T19:57:09.263Z","updated_at":"2026-02-08T04:30:14.904Z","avatar_url":"https://github.com/lugu.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"\nIntroduction\n============\n\nThis project was created during the CleanWeb Hackathon 2013 in Beijing.\n\nThe goal is to analyze the taxi dataset to find areas in Beijing with high\npotential for pedestrian development.\n\nThis program uses Spark, a distributed computing framework written in Scala.\n\nTeam\n====\n\n * Bingyue\n * Florian\n * Laura\n * Ludovic\n * Martin\n * Ray\n * Sam\n * Yao\n\nProcessing\n==========\n\nThe program proceeds as follows:\n 1. read the input file\n 2. parse the text format into binary format (Sample objects)\n 3. filter the \"get in\" and \"get out\" taxi events\n 4. join consecutive events (get in and get out) into a \"trip\"\n 5. measure the distance of the trip\n 6. filter the trips shorter than 3km.\n 7. group the departure and arrival points in 10 clusters\n 8. 
for each cluster, print 30 points (for visualization)\n\nInput will be read from the file sample.csv\n\nOutput will be saved into:\n * ./rides.txt/part-XXXX : all the departure/arrival points\n * ./results.txt/part-XXXX : the departure/arrival points closest to the centers\n\n\nBuild\n=====\n\nInstall sbt (Simple Build Tool) and execute:\n\n\t$ sbt assembly \n\nRun\n===\n\nVerify that the file sample.csv is in the current directory and run:\n\n\t$ java -jar ./target/scala-2.9.3/CleanWebHackathon2013-assembly-1.0.jar 10 300\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flugu%2Fcleanwebhackathon2013team3","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flugu%2Fcleanwebhackathon2013team3","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flugu%2Fcleanwebhackathon2013team3/lists"}