{"id":16868964,"url":"https://github.com/jeqo/talk-observing-distributed-systems","last_synced_at":"2025-07-14T15:35:24.976Z","repository":{"id":79229701,"uuid":"106022256","full_name":"jeqo/talk-observing-distributed-systems","owner":"jeqo","description":null,"archived":false,"fork":false,"pushed_at":"2020-10-13T08:47:40.000Z","size":23458,"stargazers_count":7,"open_issues_count":1,"forks_count":3,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-11T11:33:56.735Z","etag":null,"topics":["distributed-tracing","jaeger","logging","metrics","prometheus","tracing"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jeqo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2017-10-06T15:44:45.000Z","updated_at":"2024-03-21T17:28:13.000Z","dependencies_parsed_at":"2023-05-12T12:15:54.927Z","dependency_job_id":null,"html_url":"https://github.com/jeqo/talk-observing-distributed-systems","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jeqo/talk-observing-distributed-systems","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeqo%2Ftalk-observing-distributed-systems","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeqo%2Ftalk-observing-distributed-systems/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeqo%2Ftalk-observing-distributed-systems/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeqo%2Ftalk-observing-distributed-systems/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jeqo","download_url":"https://codeload.github.com/jeqo/talk-observing-distributed-systems/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jeqo%2Ftalk-observing-distributed-systems/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265312561,"owners_count":23745181,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed-tracing","jaeger","logging","metrics","prometheus","tracing"],"created_at":"2024-10-13T15:00:02.520Z","updated_at":"2025-07-14T15:35:24.953Z","avatar_url":"https://github.com/jeqo.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Talk: Observing Distributed Systems\n\nPresented at:\n\n* [NoSlidesConf 2017](http://www.noslidesconf.net/)\n\n* [PeruJUG Meetup](https://www.meetup.com/es-ES/Peru-Java-User-Group/events/245246354/) [[Slides](https://speakerdeck.com/jeqo/observando-sistemas-distribuidos-perujug)]\n\n* [GDG Oslo Meetup](https://www.meetup.com/es-ES/GDG-Cloud-Norway/events/247282228) [[Slides](https://speakerdeck.com/jeqo/increasing-observability-with-distributed-tracing)]\n\n**Observability** is the ability to understand what is going on with your systems, not \nonly from the point of view of how the system looks from the outside, but also by being able to\nanswer more granular questions, like: where did this message go?\n\n`Metrics`, `Logging` and `Tracing` are called the three pillars of Observability.\n\nIn this presentation we will see how we can use these tools and how they 
are related \nto be able to observe our systems.\n\n* Logging and Metrics\n\n* OpenTracing API\n\n* Demo: Tweets App\n\n## Tools\n\n* JDK 8\n* Docker (Docker-Machine, host: docker-vm)\n\n* Logging: Fluentd, Elasticsearch, Kibana\n* Metrics: Prometheus\n* Tracing: OpenTracing, Jaeger, Zipkin\n\n* Frameworks/Libraries: Dropwizard, jOOQ, Kafka Clients, HTTP Client, Elasticsearch, PostgreSQL.\n\n## Key takeaways\n\n* Distributed Tracing is just one more tool for your toolkit. It is not meant to replace metrics \nand logging, although it can be seen as an abstraction of them.\n\n* OpenTracing is an effort to standardize how you instrument your applications, so you can upgrade/migrate\nyour infrastructure without changing your implementations.\n\n* OpenTracing is a young project: go ahead, try it out, and give feedback to the community, or contribute\nto make it better.\n\n## Resources\n\n### Papers: \n\n* **dapper** https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/36356.pdf \n* **canopy** http://cs.brown.edu/~jcmace/papers/kaldor2017canopy.pdf\n* **automating failure testing research at internet scale** https://people.ucsc.edu/~palvaro/socc16.pdf \n* data on the outside vs data on the inside http://cidrdb.org/cidr2005/papers/P12.pdf \n* pivot tracing http://sigops.org/sosp/sosp15/current/2015-Monterey/printable/122-mace.pdf \n\n### Blog posts:\n\n* ok log https://peter.bourgon.org/ok-log/ \n* logs - 12 factor application https://12factor.net/logs \n* the problem with logging https://blog.codinghorror.com/the-problem-with-logging/ \n* logging v. 
instrumentation https://peter.bourgon.org/blog/2016/02/07/logging-v-instrumentation.html \n* logs and metrics https://medium.com/@copyconstruct/logs-and-metrics-6d34d3026e38 \n* measure anything, measure everything https://codeascraft.com/2011/02/15/measure-anything-measure-everything/ \n* metrics, tracing and logging https://peter.bourgon.org/blog/2017/02/21/metrics-tracing-and-logging.html \n* monitoring and observability https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c \n* monitoring in the time of cloud native https://medium.com/@copyconstruct/monitoring-in-the-time-of-cloud-native-c87c7a5bfa3e \n* sre book https://landing.google.com/sre/book/index.html \n* distributed tracing at uber https://eng.uber.com/distributed-tracing/ \n* spigo and simianviz https://github.com/adrianco/spigo \n* observability: what’s in a name? https://honeycomb.io/blog/2017/08/observability-whats-in-a-name/ \n* wtf is operations? #serverless https://charity.wtf/2016/05/31/wtf-is-operations-serverless/ \n* event foo: what should i add to an event https://honeycomb.io/blog/2017/08/event-foo-what-should-i-add-to-an-event/ \n* “The Verification of A Distributed System” - Caitie McCaffrey https://github.com/CaitieM20/Talks/tree/master/TheVerificationOfADistributedSystem \n* “Testing in Production” by Charity Majors https://opensource.com/article/17/8/testing-production \n* “Data on the outside vs Data on the inside - Review” by Adrian Colyer https://blog.acolyer.org/2016/09/13/data-on-the-outside-versus-data-on-the-inside/ \n* Google’s approach to Observability https://medium.com/@rakyll/googles-approach-to-observability-frameworks-c89fc1f0e058 \n* Microservices and Observability https://medium.com/@rakyll/microservices-observability-26a8b7056bb4 \n* Best Practices for Observability https://honeycomb.io/blog/2017/11/best-practices-for-observability/ \n* https://thenewstack.io/dev-ops-doesnt-matter-need-observability/ \n\n### Talks:\n\n* \"Observability for Emerging 
Infra: What Got You Here Won't Get You There\" by Charity Majors https://www.youtube.com/watch?v=1wjovFSCGhE  \n* “The Verification of a Distributed System” by Caitie McCaffrey https://www.youtube.com/watch?v=kDh5BrqiGhI \n* “Mastering Chaos - A Netflix Guide to Microservices” by Josh Evans https://www.youtube.com/watch?v=CZ3wIuvmHeM\n* “Monitoring Microservices” by Tom Wilkie https://www.youtube.com/watch?v=emaPPg_zxb4\n* “Microservice application tracing standards and simulations” by Adrian Cole and Adrian Cockcroft https://www.slideshare.net/adriancockcroft/microservices-application-tracing-standards-and-simulators-adrians-at-oscon \n* “Intuition Engineering at Netflix” by Justin Reynolds https://vimeo.com/173607639 \n* Distributed Tracing: Understanding how your all your components work together by José Carlos Chávez https://speakerdeck.com/jcchavezs/distributed-tracing-understanding-how-your-all-your-components-work-together \n* “Monitoring isn't just an accident” https://docs.google.com/presentation/d/1IEJIaQoCjzBsVq0h2Y7qcsWRWPS5lYt9CS2Jl25eurc/edit#slide=id.g327c9fd948_0_534 \n* Orchestrating Chaos Applying Database Research in the Wild - Peter Alvaro https://www.youtube.com/watch?v=YplkQu6a80Q \n\n### Books:\n\n* Martin Kleppmann - “Designing Data-Intensive Applications” https://dataintensive.net/\n\n* Google - “Site Reliability Engineering” https://landing.google.com/sre/book/index.html  \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeqo%2Ftalk-observing-distributed-systems","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjeqo%2Ftalk-observing-distributed-systems","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjeqo%2Ftalk-observing-distributed-systems/lists"}