{"id":21269021,"url":"https://github.com/qubole/demotrends","last_synced_at":"2025-07-11T05:30:42.819Z","repository":{"id":9919024,"uuid":"11929619","full_name":"qubole/demotrends","owner":"qubole","description":"Code required to setup the demo trends website (http://demotrends.qubole.com)","archived":false,"fork":false,"pushed_at":"2016-09-26T08:26:12.000Z","size":511,"stargazers_count":6,"open_issues_count":0,"forks_count":3,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-04-17T22:49:27.309Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/qubole.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-08-06T17:03:34.000Z","updated_at":"2017-11-16T08:24:30.000Z","dependencies_parsed_at":"2022-09-14T05:00:29.619Z","dependency_job_id":null,"html_url":"https://github.com/qubole/demotrends","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fdemotrends","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fdemotrends/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fdemotrends/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/qubole%2Fdemotrends/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/qubole","download_url":"https://codeload.github.com/qubole/demotrends/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","reposi
tories_count":225693819,"owners_count":17509227,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-21T08:07:05.147Z","updated_at":"2024-11-21T08:07:05.726Z","avatar_url":"https://github.com/qubole.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DemoTrends (http://demotrends.qubole.com)\n\nA Big Data app that displays the topics that are trending on Wikipedia.\n\nThere are two main parts:\n\n1. Webapp in Ruby on Rails.\n\n2. Data pipeline hosted in *Qubole Data Service*\n\n\nYou can read more about demotrends in this [blog post](https://www.qubole.com/blog/big-data/build-a-data-pipeline-with-qubole).\n\n## Quick Start\n1. Register for a [Trial Plan](http://www.qubole.com/try) in Qubole\n2. [Obtain the API key](http://www.qubole.com/qds-api-reference/authentication/)\n3. Run the commands in the *commands* directory\n\n## Webapp\n\nCode required to set up the demo trends website (http://demotrends.qubole.com)\n\n#### Set up\n1. Create the database: `./webapp/script/init-mysql.sh`\n2. Run the migrations: `rake db:migrate`\n\n#### Populate Data in db\n1. Using sample data: `rake db:seed`. This inserts one row into each of the tables.\n2. Using the SQL dump: You can also populate your DB from the SQL dump file, which contains processed data from 30th June 2013 to 13th August 2013.\n                   `sudo mysql trend \u003c webapp/db/sqldump/mysqldump_13AUG13.sql`\n\n#### Start the webapp\n1.  
Run `./webapp/script/restart_server.sh`\n\n## Data Pipeline\n### Hive\nThis directory contains two UDFs required by the data pipeline:\n1. collect_all - A JAR UDF\n2. hive_trend_mapper - A Python UDF\n\n### Scripts\nThis directory contains scripts that are run in a *Shell Command*.\n1. pagecount_dump.py - A script to download one day's *pagecounts* data from the Wikimedia website.\n\n### Commands\nThis directory contains all the commands to process one day's worth of data.\nThe order of the commands is important. Each filename starts with a number giving its position in the execution sequence.\nRun the scripts using the [Qubole Python SDK](https://github.com/qubole/qds-sdk-py).\n\n### Airflow\nIf you want to use [Apache Airflow](https://github.com/apache/incubator-airflow) to manage the pipeline, see the `airflow` folder.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqubole%2Fdemotrends","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fqubole%2Fdemotrends","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fqubole%2Fdemotrends/lists"}