Scaling Kubernetes to 7,500 nodes
Scaling a Kubernetes cluster to this magnitude(7500 nodes) is a rare feat that demands careful consideration, but it offers the benefit of a simple infrastructure, empowering OpenAI’s machine learning teams to scale rapidly without altering their code. Following OpenAI’s previous update on scaling to 2,500 nodes, OpenAI has further developed its infrastructure, imparting valuable lessons…
Connecting Kernel Panics to Kubernetes Pods: Keeping Track of Lost Nodes at Netflix
With a dedicated effort to enhance user experience on the Titus container platform, Netflix delved into the issue of “orphaned” pods – those left incomplete without a clear final status. Although this may not be a concern for Service job owners, it holds significant importance for Batch users. This blog post provides insights into our…
Slack’s journey to reliable and scalable cron execution at scale
Slack started with the classic “one box, one crontab” approach for cron jobs. Initially, it worked fine, but as the platform grew, so did the number of scripts and their processing demands. This led to several issues: Building a Better Way: Introducing Chronos Facing these challenges, Slack opted for a custom solution: Chronos. Here’s a…
A “Krispr” Approach to Kubernetes Infrastructure: Keeping Pods Fresh and Rolling Out Updates Smoothly
Introduction In the demanding world of modern service-oriented architectures, maintaining fresh and up-to-date infrastructure is crucial for optimal performance and security. Airbnb, with its hundreds of services relying on Kubernetes, faced challenges in efficiently updating shared infrastructure components within their platform. Their existing approach, heavily dependent on service owner upgrades, led to version fragmentation, complexity,…
Get ready for KubeCon + CloudNativeCon Europe –Cloud Innovation
The KubeCon + CloudNativeCon Europe event is just around the corner! This is your chance to dive into the world of Kubernetes and cloud-native technologies, meet industry experts, and connect with fellow enthusiasts. Important Reminder: Reserve your spot now to lock in the current rates and save before prices go up! Don’t miss out on this…
Scaling Kubernetes to 2,500 Nodes for Deep Learning at OpenAI
OpenAI, a pioneer in artificial intelligence, pushes the boundaries of Kubernetes by scaling it to manage massive deep learning workloads. While managing bare VMs remains an option for the largest tasks, Kubernetes shines for its rapid iteration cycles, reasonable scalability, and reduced development overhead. This blog dives into OpenAI’s journey building a 2,500-node Kubernetes cluster…
Overusing getters and setters
Encapsulation is used to hide the values or state of a structured data object, preventing unauthorized parties’ direct access to them. In Golang there is no by default support of getters and setters, so it is optional. There are few advantage of using getters and setters event in golang and they are mentaion below :-…
Avoid any Type in TS (anti-pattern)
What are types in TS? Types in TS helps us understand what methods & properties are associated with a given value/variable in a program that can help us analyze our code for existing errors and prevent further errors. For example a value that is assigned a type of a string tells us that the value…
Variable Shadowing in Go
As a part of this blog post, I will try to explain about variable shadowing and how to avoid it. In programming, scope of variable defines to the places a variable can be referenced. In Golang, a variable name declared in a block can be redeclared in an inner block. This mechanism is called variable shadowing….
Interface on producer side
Interfaces are used to create common abstractions that multiple objects can implement. Before delving into this topic, let’s make sure the terms we use throughout this section are clear: It’s common to see developers creating interfaces on the producer side, alongside the concrete implementation. But in Go, in most cases, this is not what we…
Unnecessary nested code
Readable code requires less cognitive effort to maintain a mental model; hence, it is easier to read and maintain. A critical aspect of readability is the number of nested levels. Code is qualified as readable based on multiple criteria such as naming, consistency, formatting, and so forth. While programming, we need to maintain mental models…