https://github.com/ismoilovdevml/ops-in-maang
Ops In MAANG
- Host: GitHub
- URL: https://github.com/ismoilovdevml/ops-in-maang
- Owner: ismoilovdevml
- License: gpl-3.0
- Created: 2024-01-15T13:32:13.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-01-21T15:54:59.000Z (almost 2 years ago)
- Last Synced: 2025-04-06T04:13:17.816Z (7 months ago)
- Topics: devops, faang, infrastructure, maang, ops
- Homepage: https://devops-journey.uz/
- Size: 27.3 KB
- Stars: 5
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
## Ops In MAANG
Welcome to the **Ops in MAANG** repository, a curated collection of articles, manuals, and research posts sourced directly from the engineering blogs of top-tier technology companies: **M**eta (Facebook), **A**mazon (AWS), **A**pple, **N**etflix, and **G**oogle. Whether you're a seasoned practitioner looking to fine-tune your skills or a newcomer eager to dive into the world of cutting-edge technology, this repository serves as your go-to guide.
> Act now: MAANG is waiting for you.
## Rust
### Meta
* [A brief history of Rust at Facebook](https://engineering.fb.com/2021/04/29/developer-tools/rust/)
* [Programming languages endorsed for server-side use at Meta](https://engineering.fb.com/2022/07/27/developer-tools/programming-languages-endorsed-for-server-side-use-at-meta/)
### Google
* [Rust fact vs. fiction: 5 Insights from Google's Rust journey in 2022](https://opensource.googleblog.com/2023/06/rust-fact-vs-fiction-5-insights-from-googles-rust-journey-2022.html)
### Microsoft
* [Why Rust for safe systems programming](https://msrc.microsoft.com/blog/2019/07/why-rust-for-safe-systems-programming/)
* [Using Rust in Windows](https://msrc.microsoft.com/blog/2019/11/using-rust-in-windows/)
* [Microsoft joins Rust Foundation](https://cloudblogs.microsoft.com/opensource/2021/02/08/microsoft-joins-rust-foundation/)
* [Open-source Rust driver development platform](https://techcommunity.microsoft.com/t5/surface-it-pro-blog/open-source-rust-driver-development-platform/ba-p/3974222)
* [An intern's experience with Rust](https://msrc.microsoft.com/blog/2019/10/an-interns-experience-with-rust/)
* [Designing a COM library for Rust](https://msrc.microsoft.com/blog/2019/10/designing-a-com-library-for-rust/)
* [Microsoft Azure security evolution: Embrace secure multitenancy, Confidential Compute, and Rust](https://azure.microsoft.com/en-us/blog/microsoft-azure-security-evolution-embrace-secure-multitenancy-confidential-compute-and-rust/)
* [The Safety Boat: Kubernetes and Rust](https://msrc.microsoft.com/blog/2020/04/the-safety-boat-kubernetes-and-rust/)
* [Announcing Rust for Windows v0.9](https://blogs.windows.com/windowsdeveloper/2021/05/06/announcing-rust-for-windows-v0-9/)
* [Rust/WinRT Public Preview](https://blogs.windows.com/windowsdeveloper/2020/04/30/rust-winrt-public-preview/)
### AWS
* [Sustainability with Rust](https://aws.amazon.com/blogs/opensource/sustainability-with-rust/)
* [Why AWS loves Rust, and how we’d like to help](https://aws.amazon.com/blogs/opensource/why-aws-loves-rust-and-how-wed-like-to-help/)
* [Why AWS is the Best Place to Run Rust](https://aws.amazon.com/blogs/devops/why-aws-is-the-best-place-to-run-rust/)
* [How Open Source Projects are Using Kani to Write Better Software in Rust](https://aws.amazon.com/blogs/opensource/how-open-source-projects-are-using-kani-to-write-better-software-in-rust/)
* [How our AWS Rust team will contribute to Rust’s future successes](https://aws.amazon.com/blogs/opensource/how-our-aws-rust-team-will-contribute-to-rusts-future-successes/)
* [AWS’ sponsorship of the Rust project](https://aws.amazon.com/blogs/opensource/aws-sponsorship-of-the-rust-project/)
* [Innovating with Rust](https://aws.amazon.com/blogs/opensource/innovating-with-rust/)
## Linux
### Netflix
* [Debugging a FUSE deadlock in the Linux kernel](https://netflixtechblog.com/debugging-a-fuse-deadlock-in-the-linux-kernel-c75cd7989b6d)
* [Netflix at Velocity 2015: Linux Performance Tools](https://netflixtechblog.com/netflix-at-velocity-2015-linux-performance-tools-51964ddb81cf)
* [Linux Performance Analysis in 60,000 Milliseconds](https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55)
* [Extending Vector with eBPF to inspect host and container performance](https://netflixtechblog.com/extending-vector-with-ebpf-to-inspect-host-and-container-performance-5da3af4c584b)
* [Predictive CPU isolation of containers at Netflix](https://netflixtechblog.com/predictive-cpu-isolation-of-containers-at-netflix-91f014d856c7)
### Meta
* [Facebook open-sources new suite of Linux kernel components and tools](https://engineering.fb.com/2018/10/30/open-source/linux/)
* [drgn: How the Linux Kernel Team at Meta Debugs the Kernel at Scale](https://developers.facebook.com/blog/post/2021/12/09/drgn-how-linux-kernel-team-meta-debugs-kernel-scale/)
* [Transparent memory offloading: more memory at a fraction of the cost and power](https://engineering.fb.com/2022/06/20/data-infrastructure/transparent-memory-offloading-more-memory-at-a-fraction-of-the-cost-and-power/)
### Microsoft
* [Microsoft Loves Linux](https://cloudblogs.microsoft.com/windowsserver/2015/05/06/microsoft-loves-linux/)
### Google
* [Linux Kernel Security Done Right](https://security.googleblog.com/2021/08/linux-kernel-security-done-right.html)
### LinkedIn
* [Fixing Linux filesystem performance regressions](https://engineering.linkedin.com/blog/2020/fixing-linux-filesystem-performance-regressions)
* [Upgrading to RHEL7 with minimal interruptions](https://engineering.linkedin.com/blog/2020/upgrading-to-rhel7-with-minimal-interruptions)
* [Application Pauses When Running JVM Inside Linux Control Groups](https://engineering.linkedin.com/blog/2016/11/application-pauses-when-running-jvm-inside-linux-control-groups)
* [Optimizing Linux Memory Management for Low-latency / High-throughput Databases](https://engineering.linkedin.com/performance/optimizing-linux-memory-management-low-latency-high-throughput-databases)
* [Don’t Let Linux Control Groups Run Uncontrolled](https://engineering.linkedin.com/blog/2016/08/don_t-let-linux-control-groups-uncontrolled)
* [Skyfall: eBPF agent for infrastructure observability](https://engineering.linkedin.com/blog/2022/skyfall--ebpf-agent-for-infrastructure-observability)
* [Overcoming challenges with Linux cgroups memory accounting](https://engineering.linkedin.com/blog/2022/overcoming-challenges-with-linux-cgroups-memory-accounting)
### Instagram
* [What Powers Instagram: Hundreds of Instances, Dozens of Technologies](https://instagram-engineering.com/what-powers-instagram-hundreds-of-instances-dozens-of-technologies-adf2e22da2ad)
## Monitoring
### Netflix
* [Improved Alerting with Atlas Streaming Eval](https://netflixtechblog.com/improved-alerting-with-atlas-streaming-eval-e691c60dc61e)
* [Telltale: Netflix Application Monitoring Simplified](https://netflixtechblog.com/telltale-netflix-application-monitoring-simplified-5c08bfa780ba)
* [Lessons from Building Observability Tools at Netflix](https://netflixtechblog.com/lessons-from-building-observability-tools-at-netflix-7cfafed6ab17)
* [Introducing Vector: Netflix’s On-Host Performance Monitoring Tool](https://netflixtechblog.com/introducing-vector-netflixs-on-host-performance-monitoring-tool-c0d3058c3f6f)
* [Scryer: Netflix’s Predictive Auto Scaling Engine Part 1](https://netflixtechblog.com/scryer-netflixs-predictive-auto-scaling-engine-a3f8fc922270)
* [Scryer: Netflix’s Predictive Auto Scaling Engine — Part 2](https://netflixtechblog.com/scryer-netflixs-predictive-auto-scaling-engine-part-2-bb9c4f9b9385)
* [A Microscope on Microservices](https://netflixtechblog.com/a-microscope-on-microservices-923b906103f4)
### Meta
* [AI debugging at Meta with HawkEye](https://engineering.fb.com/2023/12/19/data-infrastructure/hawkeye-ai-debugging-meta/)
* [Building Resilient Monitoring at Meta](https://atscaleconference.com/building-resilient-monitoring-at-meta/)
* [Dynolog: Open source system observability](https://developers.facebook.com/blog/post/2022/11/16/dynolog-open-source-system-observability/)
* [Below: a time travelling resource monitoring tool](https://developers.facebook.com/blog/post/2021/09/21/below-time-travelling-resource-monitoring-tool/)
* [Resource Control Demo: Better Resource Control with Simulation](https://developers.facebook.com/blog/post/2022/05/24/resource-control-demo-with-simulation/)
* [Lessons Learned: Running Presto at Meta Scale](https://developers.facebook.com/blog/post/2023/04/11/running-presto-at-meta-scale/)
* [Uncovering the Unknown Unknown](https://developers.facebook.com/blog/post/2023/05/16/uncovering-the-unknown-unknown/)
* [Inside Meta's AI optimization platform for engineers across the company](https://ai.meta.com/blog/looper-meta-ai-optimization-platform-for-engineers/)
* [Using Chakra execution traces for benchmarking and network performance optimization](https://engineering.fb.com/2023/09/07/networking-traffic/chakra-execution-traces-benchmarking-network-performance-optimization/)
### Twitter(X)
* [Twitter’s Blobstore Hardware Lifecycle Monitoring and Reporting Service](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2023/twitters-blobstore-hardware-lifecycle-monitoring-and-reporting-service)
* [Powering real-time data analytics with Druid at Twitter](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2022/powering-real-time-data-analytics-with-druid-at-twitter)
* [Observability at Twitter: technical overview, part I](https://blog.twitter.com/engineering/en_us/a/2016/observability-at-twitter-technical-overview-part-i)
* [Observability at Twitter: technical overview, part II](https://blog.twitter.com/engineering/en_us/a/2016/observability-at-twitter-technical-overview-part-ii)
### LinkedIn
* [Monitoring business performance data with ThirdEye smart alerts](https://engineering.linkedin.com/blog/2020/monitoring-business-performance-data-with-thirdeye-smart-alerts)
* [Analyzing anomalies with ThirdEye](https://engineering.linkedin.com/blog/2020/analyzing-anomalies-with-thirdeye)
* [Smart alerts in ThirdEye, LinkedIn’s real-time monitoring platform](https://engineering.linkedin.com/blog/2019/06/smart-alerts-in-thirdeye--linkedins-real-time-monitoring-platfor)
* [InGraphs: Monitoring and Unexpected Artwork](https://engineering.linkedin.com/blog/2017/08/ingraphs--monitoring-and-unexpected-artwork)
* [Open Sourcing Kafka Cruise Control](https://engineering.linkedin.com/blog/2017/08/open-sourcing-kafka-cruise-control)
* [Samza Aeon: Latency Insights for Asynchronous One-Way Flows](https://engineering.linkedin.com/blog/2018/04/samza-aeon--latency-insights-for-asynchronous-one-way-flows)
* [inMesh: Real-Time Monitoring of Remote Sites](https://engineering.linkedin.com/blog/2016/04/inmesh--real-time-monitoring-of-remote-sites)
* [InFlow - Making the LinkedIn network visible](https://engineering.linkedin.com/blog/2016/03/inflow---making-the-linkedin-network-visible)
* [Burrow: Kafka Consumer Monitoring Reinvented](https://engineering.linkedin.com/apache-kafka/burrow-kafka-consumer-monitoring-reinvented)
* [Scaling the collection of self-service metrics](https://engineering.linkedin.com/metrics/scaling-collection-self-service-metrics)
### Spotify
* [Monitoring at Spotify: The Story So Far](https://engineering.atspotify.com/2015/11/monitoring-at-spotify-the-story-so-far/)
* [Analyzing Volatile Memory on a Google Kubernetes Engine Node](https://engineering.atspotify.com/2023/06/analyzing-volatile-memory-on-a-google-kubernetes-engine-node/)
* [Monitoring at Spotify: Introducing Heroic](https://engineering.atspotify.com/2015/11/monitoring-at-spotify-introducing-heroic/)
## Git
### LinkedIn
* [How LinkedIn handles merging code in high-velocity repositories](https://engineering.linkedin.com/blog/2020/continuous-integration)
* [Accelerating Code Delivery By 97% With Yarn Workspaces](https://engineering.linkedin.com/blog/2022/accelerating-code-delivery-by-97--with-yarn-workspaces)
* [How LinkedIn automates cherry-picking commits to improve developer productivity](https://engineering.linkedin.com/blog/2023/how-linkedin-automates-cherry-picking-commits-to-improve-develop)
* [Effective Code Reviews and File Ownerships](https://engineering.linkedin.com/blog/2016/01/effective-code-reviews-and-file-ownerships)
### Netflix
* [Introducing HubCommander](https://netflixtechblog.com/introducing-hubcommander-1774d8f08fc6)
* [Towards true continuous integration: distributed repositories and dependencies](https://netflixtechblog.com/towards-true-continuous-integration-distributed-repositories-and-dependencies-2a2e3108c051)
### Meta
* [Sapling: Source control that’s user-friendly and scalable](https://engineering.fb.com/2022/11/15/open-source/sapling-source-control-scalable/)
* [Build faster with Buck2: Our open source build system](https://engineering.fb.com/2023/04/06/open-source/buck2-open-source-large-scale-build-system/)
* [Meta developer tools: Working at scale](https://engineering.fb.com/2023/06/27/developer-tools/meta-developer-tools-open-source/)
* [A Meta developer's workflow: Exploring the tools used to code at scale](https://developers.facebook.com/blog/post/2022/11/15/meta-developers-workflow-exploring-tools-used-to-code/)
### Google
* [Why Google Stores Billions of Lines of Code in a Single Repository](https://research.google/pubs/why-google-stores-billions-of-lines-of-code-in-a-single-repository/)
### Uber
* [Faster Together: Uber Engineering’s iOS Monorepo](https://www.uber.com/blog/ios-monorepo/?uclick_id=4f040ea4-5355-40cf-bbbb-3b1bc3e808c7)
* [Building Uber’s Go Monorepo with Bazel](https://www.uber.com/blog/go-monorepo-bazel/?uclick_id=4f040ea4-5355-40cf-bbbb-3b1bc3e808c7)
* [The Journey To Android Monorepo: The History Of Uber Engineering’s Android Codebase Organization](https://www.uber.com/blog/android-engineering-code-monorepo/?uclick_id=4f040ea4-5355-40cf-bbbb-3b1bc3e808c7)
* [How We Halved Go Monorepo CI Build Time](https://www.uber.com/blog/how-we-halved-go-monorepo-ci-build-time/?uclick_id=4f040ea4-5355-40cf-bbbb-3b1bc3e808c7)
### Twitter(X)
* [Dynamic configuration at Twitter](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2018/dynamic-configuration-at-twitter)
* [The release of Pants 1.0](https://blog.twitter.com/engineering/en_us/a/2016/the-release-of-pants-10)
### Spotify
* [Fleet Management at Spotify (Part 1): Spotify’s Shift to a Fleet-First Mindset](https://engineering.atspotify.com/2023/04/spotifys-shift-to-a-fleet-first-mindset-part-1/)
* [Fleet Management at Spotify (Part 2): The Path to Declarative Infrastructure](https://engineering.atspotify.com/2023/05/fleet-management-at-spotify-part-2-the-path-to-declarative-infrastructure/)
* [Fleet Management at Spotify (Part 3): Fleet-wide Refactoring](https://engineering.atspotify.com/2023/05/fleet-management-at-spotify-part-3-fleet-wide-refactoring/)
## Container & Kubernetes
### LinkedIn
* [Open sourcing Kube2Hadoop: Secure access to HDFS from Kubernetes](https://engineering.linkedin.com/blog/2020/open-sourcing-kube2hadoop)
* [Scaling LinkedIn's Hadoop YARN cluster beyond 10,000 nodes](https://engineering.linkedin.com/blog/2021/scaling-linkedin-s-hadoop-yarn-cluster-beyond-10-000-nodes)
* [Benchmarking Apache Samza: 1.2 million messages per second on a single node](https://engineering.linkedin.com/blog/2015/08/benchmarking-apache-samza--1-2-million-messages-per-second-on-a-)
* [Asynchronous Processing and Multithreading in Apache Samza, Part I: Design and Architecture](https://engineering.linkedin.com/blog/2017/01/asynchronous-processing-and-multithreading-in-apache-samza--part)
* [Asynchronous Processing and Multithreading in Apache Samza, Part II: Experiments and Evaluation](https://engineering.linkedin.com/blog/2017/01/asynchronous-processing-and-multithreading-in-apache-samza--part0)
* [Stream processing with Apache Samza - Current and Future](https://engineering.linkedin.com/blog/2016/01/whats-new-samza)
* [Automating Large-Scale Application Build](https://engineering.linkedin.com/blog/2016/05/automating-large-scale-application-build-)
### Netflix
* [Kubernetes And Kernel Panics](https://netflixtechblog.com/kubernetes-and-kernel-panics-ed620b9c6225)
* [Evolving Container Security With Linux User Namespaces](https://netflixtechblog.com/evolving-container-security-with-linux-user-namespaces-afbe3308c082)
* [Predictive CPU isolation of containers at Netflix](https://netflixtechblog.com/predictive-cpu-isolation-of-containers-at-netflix-91f014d856c7)
* [Extending Vector with eBPF to inspect host and container performance](https://netflixtechblog.com/extending-vector-with-ebpf-to-inspect-host-and-container-performance-5da3af4c584b)
* [Auto Scaling Production Services on Titus](https://netflixtechblog.com/auto-scaling-production-services-on-titus-1f3cd49f5cd7)
* [Titus, the Netflix container management platform, is now open source](https://netflixtechblog.com/titus-the-netflix-container-management-platform-is-now-open-source-f868c9fb5436)
* [Updates on Netflix’s Container Management Platform](https://netflixtechblog.com/updates-on-netflixs-container-management-platform-a91738360bd8)
* [The Evolution of Container Usage at Netflix](https://netflixtechblog.com/the-evolution-of-container-usage-at-netflix-3abfc096781b)
### Uber
* [Containerizing Apache Hadoop Infrastructure at Uber](https://www.uber.com/blog/hadoop-container-blog/)
* [uBuild: Fast and Safe Building of Thousands of Container Images](https://www.uber.com/blog/ubuild-fast-and-safe-building-of-thousands-of-container-images/)
* [Containerizing the Beast – Hadoop NameNodes in Uber’s Infrastructure](https://www.uber.com/blog/hadoop-namenode-container/)
* [Devpod: Improving Developer Productivity at Uber with Remote Development](https://www.uber.com/blog/devpod-improving-developer-productivity-at-uber/)
* [Up: Portable Microservices Ready for the Cloud](https://www.uber.com/blog/up-portable-microservices-ready-for-the-cloud/)
* [Introducing Makisu: Uber’s Fast, Reliable Docker Image Builder for Apache Mesos and Kubernetes](https://www.uber.com/blog/makisu/)
* [Efficient and Reliable Compute Cluster Management at Scale](https://www.uber.com/blog/compute-cluster-management/)
* [Dockerizing MySQL at Uber Engineering](https://www.uber.com/blog/dockerizing-mysql/)
* [Uber’s Highly Scalable and Distributed Shuffle as a Service](https://www.uber.com/en-IN/blog/ubers-highly-scalable-and-distributed-shuffle-as-a-service/)
* [Uber Engineering’s Micro Deploy: Deploying Daily with Confidence](https://www.uber.com/blog/micro-deploy-code/)
### Spotify
* [Designing a Better Kubernetes Experience for Developers](https://engineering.atspotify.com/2021/03/designing-a-better-kubernetes-experience-for-developers/)
* [Analyzing Volatile Memory on a Google Kubernetes Engine Node](https://engineering.atspotify.com/2023/06/analyzing-volatile-memory-on-a-google-kubernetes-engine-node/)
### OpenAI
* [Scaling Kubernetes to 2,500 nodes](https://openai.com/research/scaling-kubernetes-to-2500-nodes)
* [Scaling Kubernetes to 7,500 nodes](https://openai.com/research/scaling-kubernetes-to-7500-nodes)
### Twitter(X)
* [The Infrastructure Behind Twitter: Scale](https://blog.twitter.com/engineering/en_us/topics/infrastructure/2017/the-infrastructure-behind-twitter-scale)
### Meta
* [Efficient, reliable cluster management at scale with Twine](https://engineering.fb.com/2019/06/06/data-center-engineering/twine/)
* [Containerizing ZooKeeper with Twine: Powering container orchestration from within](https://engineering.fb.com/2020/08/31/developer-tools/zookeeper-twine/)
* [Building a ubiquitous shared infrastructure using Twine](https://engineering.fb.com/2020/11/11/data-center-engineering/twine-2/)
* [Under the hood: Meta’s cloud gaming infrastructure](https://engineering.fb.com/2022/06/09/web/cloud-gaming-infrastructure/)