{"id":21020215,"url":"https://github.com/bringhurst/hello-samza","last_synced_at":"2026-04-29T06:31:39.145Z","repository":{"id":19527702,"uuid":"22775147","full_name":"bringhurst/hello-samza","owner":"bringhurst","description":null,"archived":false,"fork":false,"pushed_at":"2014-09-19T23:28:52.000Z","size":5112,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-01-02T15:19:23.304Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bringhurst.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-08-09T00:14:29.000Z","updated_at":"2017-06-21T08:04:07.000Z","dependencies_parsed_at":"2022-08-23T20:31:03.475Z","dependency_job_id":null,"html_url":"https://github.com/bringhurst/hello-samza","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bringhurst/hello-samza","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bringhurst%2Fhello-samza","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bringhurst%2Fhello-samza/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bringhurst%2Fhello-samza/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bringhurst%2Fhello-samza/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bringhurst","download_url":"https://codeload.github.com/bringhurst/hello-samza/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/bringhurst%2Fhello-samza/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32414413,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-29T06:29:02.080Z","status":"ssl_error","status_checked_at":"2026-04-29T06:29:00.631Z","response_time":110,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T10:36:19.608Z","updated_at":"2026-04-29T06:31:39.117Z","avatar_url":"https://github.com/bringhurst.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"hello-samza\n===========\n\nHello Samza is a starter project for [Apache Samza](http://samza.incubator.apache.org/) (Incubating) jobs.\n\nPlease see [Hello Samza](http://samza.incubator.apache.org/startup/hello-samza/0.7.0/) to get started.\n\nBy default, Hello Samza uses a recent release of Samza from a Maven repository. 
If you want to use a custom\nversion of Samza, you can publish it to your local Maven repository in `$HOME/.m2` by running the following\nin the Samza repository:\n\n    ./gradlew publishToMavenLocal\n\nYou can then use that version in Hello Samza by specifying the `samza.version` property when building\nHello Samza, for example:\n\n    mvn package -Dsamza.version=0.8.0-SNAPSHOT\n\n### Pull requests and questions\nHello Samza is developed as part of the Apache Samza project. Please direct questions, improvements and\nbug fixes there.  Questions about Hello Samza are welcome on the dev list (details on the main\nsite above) and the Samza JIRA has a hello-samza component for filing tickets.\n\n### Using Vagrant\n\nIf you'd like to use Vagrant to get up and running, follow these instructions.\n\n1) Install Vagrant [http://www.vagrantup.com/](http://www.vagrantup.com/)  \n2) Install VirtualBox [https://www.virtualbox.org/](https://www.virtualbox.org/)  \n\nOnce both are installed, clone this repository and boot the virtual machine.\n\n    cd hello-samza\n    vagrant up\n\nThis takes about 10-15 minutes: it installs Kafka, Hadoop/YARN, and Samza, configures them to work together, and launches the jobs.\n\nOnce the VM has launched and you are back at a command prompt, log in to the virtual machine and see what's running.\n\n    vagrant ssh\n    cd /vagrant\n\nThe wikipedia-feed Samza job consumes a feed of real-time edits from Wikipedia and produces them to a Kafka topic called \"wikipedia-raw\".  You can watch this in real time by using the Kafka console consumer to view the topic.\n\n    deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wikipedia-raw\n\nThe wikipedia-parser Samza job then parses the messages in wikipedia-raw and extracts information about each edit, such as its size and who made the change. 
It outputs this information to the wikipedia-edits topic.\n\n    deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wikipedia-edits\n\nThe wikipedia-stats Samza job reads messages from the wikipedia-edits topic and, every ten seconds, calculates counts for all edits made during that window. It outputs these counts to the wikipedia-stats topic.\n\n    deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wikipedia-stats\n\nYou can also view the running Samza jobs in the YARN UI at http://192.168.80.20:8088/cluster/apps.\n\nTo see how this is set up and how it works, look at `vagrant/bootstrap.sh` and [Hello Samza](http://samza.incubator.apache.org/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbringhurst%2Fhello-samza","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbringhurst%2Fhello-samza","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbringhurst%2Fhello-samza/lists"}