DevOps resources tips and best practices
In this replay of a previous episode, unlock the secrets of OpenTelemetry, Lambda as a scalable CPU, and building a platform team. We revisit our past episode with Paul Osman, a Senior Staff Software Engineer at LinkedIn. At the time of the recording, Paul was a staff Platform Engineer at Honeycomb. In this episode, we dive into OpenTelemetry, architecture, and how to build long-term Ops strategies.
Join us as we discuss OpenTelemetry, Honeycomb's approach to observability and what it takes to build a platform team.
OpenTelemetry is an open-source observability framework made up of several tools, APIs and SDKs. You can use it to generate, collect and export telemetry data.
According to Paul, OpenTelemetry produces two main categories of work: a specification that describes what telemetry data should look like, and a set of open-source APIs that make it easy to generate that data from your systems. The project itself grew out of the merger of two earlier communities, OpenTracing and OpenCensus.
OpenTelemetry also offers a variety of services and tools to meet different business needs. For example, a component called the OpenTelemetry Collector lets you fork telemetry data to more than one destination. Rather than changing how telemetry is emitted inside each application, you change the Collector's configuration to evaluate multiple vendors.
Reinstrumenting code is frequently a non-starter for people. Often, they would rather stick with their existing systems or build something from scratch. Instrumentation and reinstrumentation are work that most customers do not necessarily care about. For teams with hundreds of services, instrumentation is something they want to think about once and never again.
Honeycomb and OpenTelemetry together offer just this: customers no longer have to reinstrument their code each time they evaluate or switch vendors.
Because teams can switch vendors by changing configuration rather than code, new opportunities for observability arise.
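The "configuration rather than code" idea can be sketched in a few lines of Python. This is only an illustration of the fan-out pattern, not the OpenTelemetry Collector or its configuration format: the names vendor_a, vendor_b, EXPORTERS, build_pipeline and handle_request are all hypothetical, and the "vendors" are plain in-memory lists.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Exporter = Callable[[Event], None]

# Hypothetical in-memory "backends" standing in for two vendors.
vendor_a: List[Event] = []
vendor_b: List[Event] = []

# Registry of available exporters, keyed by the name used in configuration.
EXPORTERS: Dict[str, Exporter] = {
    "vendor_a": vendor_a.append,
    "vendor_b": vendor_b.append,
}


def build_pipeline(config: Dict[str, List[str]]) -> Exporter:
    """Return a function that forwards each event to every configured exporter."""
    targets = [EXPORTERS[name] for name in config["exporters"]]

    def export(event: Event) -> None:
        for target in targets:
            target(dict(event))  # copy so backends never share one mutable dict

    return export


def handle_request(export: Exporter, route: str) -> None:
    """Application code: instrumented once, unaware of where the data goes."""
    export({"name": "http.request", "route": route})


# Evaluating a second vendor is a configuration change, not a code change.
pipeline = build_pipeline({"exporters": ["vendor_a", "vendor_b"]})
handle_request(pipeline, "/checkout")
```

In this sketch, dropping "vendor_b" from the configuration dictionary stops data flowing to it while handle_request stays untouched, which is the property the paragraphs above describe.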
Traditional observability relies on three kinds of data: logs, metrics and traces. But Honeycomb breaks the historical mold, relying on data beyond these three.
“At its heart, Honeycomb is an ultra-wide event store,” Paul says. “We just accept keys and values. You can embed those and have as many of them as possible to make an ultra-wide table that represents your data.”
Rather than relying on logs, metrics and traces, Honeycomb allows you to assess any data point as traceable and observable. Ultimately, this allows organizations to build massive datasets according to their needs and services.
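To make the "keys and values" idea concrete, here is a minimal sketch of a wide event in Python. It is not Honeycomb's API or storage model; make_event, where and every field name below are hypothetical, chosen only to show that each event can carry whatever keys are useful and that any key can later be used to slice the data.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def make_event(**fields: Any) -> Event:
    """Build one wide event: a flat mapping of arbitrary keys to values."""
    return dict(fields)


# Events do not need to share a schema; each carries the keys it has.
events: List[Event] = [
    make_event(service="checkout", route="/pay", duration_ms=182,
               user_plan="pro", status=200),
    make_event(service="checkout", route="/pay", duration_ms=954,
               user_plan="free", status=500, error="timeout"),
    make_event(service="search", route="/q", duration_ms=41,
               cache_hit=True, status=200),
]


def where(rows: List[Event], **conditions: Any) -> List[Event]:
    """Return the events whose fields match every given key/value pair."""
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in conditions.items())
    ]


# Any key can be used as a filter, including ones only some events have.
slow_failures = [
    e for e in where(events, service="checkout", status=500)
    if e["duration_ms"] > 500
]
```

Because nothing is aggregated in advance, a question nobody anticipated (for example, filtering on user_plan) only needs that key to be present on the events.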
According to Paul, the big problem areas that new platform teams face vary greatly depending on where, how and by whom their products are used. But regardless of industry or use, there are a few necessities for developing a platform team.
Chief among them is understanding, in depth, the problems your users are facing. In some cases, engineers may have difficulty with reliability; in others, engineers may take a long time to get their code into production.
In the end, it’s the responsibility of the platform or internal systems teams to investigate existing problems and help combat them.
“There are no best practices in our industry,” Paul says, “there are only sets of guidelines you can use to look for and evaluate problems with value in mind.”
Establishing this understanding before building a platform team is essential. While some organizations thrive with a specialized team, others work cross-functionally. But each environment faces a unique set of challenges and therefore requires an equally unique approach.
Want to learn more about moving away from monolithic software, empowering your teams and the idea of ‘build and rent’? Listen on Apple Music, Spotify or wherever you find your podcasts.
Kenneth (Ken) Rose is the CTO and Co-Founder of OpsLevel. Ken has spent over 15 years scaling engineering teams as an early engineer at PagerDuty and Shopify. That in-the-trenches experience has given Ken a unique perspective on how some of the best teams are built and scaled, and he brings this viewpoint to building products at OpsLevel, a service ownership platform built to turn chaos into consistency for engineering leaders.