
How Paul Osman thinks about long-term strategies, open telemetry, and the value of boring systems

Kenneth Rose | September 20, 2021

We’re kicking off an interview series, called Level-Up, with standout engineering leaders to learn what’s top of mind for them. Check out the full interview at the bottom of this post. And let us know who we should talk to next!

More and more teams are developing software, rushing to embrace new ways to stay current and competitive. But, according to Paul Osman, Staff Platform Engineer at Honeycomb, the best systems are actually pretty boring. If you’ve got a system that’s working for you, why mess with it?

That might make Osman sound like he’s content with legacy solutions, but that’s far from the case. When it comes to Ops, SDKs, and libraries, Osman looks for the best possible solution for the use case as well as the team, encouraging engineers to thoughtfully consider all their options.

At Honeycomb, he’s been tasked with developing a long-term strategy for client libraries and SDKs. In a previous role as a Senior Engineering Manager at Under Armour, he led infrastructure teams, specifically focusing on Kubernetes and microservices.

Osman and I actually worked together at PagerDuty where he led their internal platform team at the time.  I came to know Osman as someone who prolifically picked up new tools and technologies.  More importantly, he had (and still has) a phenomenal intuition of what good developer tooling feels like.

Recently, I got to sit down with Osman to talk about telemetry, architecture, and how to build long-term Ops strategies. He filled me in on his perspective: he’s excited about OpenTelemetry and is interested in how teams can gain better observability of their data. Osman’s insights are instructive for any engineer looking to become an adaptable, effective leader, so I’ve compiled the top takeaways from our conversation.

Building a long-term strategy for client libraries and SDKs

Osman has a lot of experience developing infrastructure and leading teams. At Under Armour, he was responsible for leading all of the infrastructure teams with a focus on migrating to Kubernetes. In previous roles at PagerDuty and SoundCloud, he led platform teams.

This experience, which made Osman an expert in client libraries and SDKs, led him to his current role at Honeycomb. When Osman joined, engineers were maintaining libraries in a variety of languages: Ruby, Go, Python, and Java. “My job was to come on board, figure out what we were going to continue to support, and develop a long-term strategy for client libraries and SDKs,” he said. “We also needed to figure out how to build a team to support that strategy.”

With the help of other folks at Honeycomb (Osman gave a shout-out to Liz Fong-Jones), Osman helped formulate a strategy for the company’s approach to ingesting observability data that relied on OpenTelemetry.

“We wanted to make it as easy as possible to get data into Honeycomb,” he said. “We wanted to meet customers wherever they were, so we leveraged OpenTelemetry and made our own client SDKs. This made it easy for customers to put instrumentation into code. We also built integrations that would take logs from different applications and ship them off to Honeycomb.”

Driving towards OpenTelemetry

OpenTelemetry is key to Osman’s work at Honeycomb, but he also has a bigger-picture perspective on its place in Ops. He sees it as a great story of combined efforts, and he is hopeful that it can change how engineers think about telemetry altogether.

“OpenTelemetry is one of those great stories in open source: two communities recognized that they were serving the same audience and then decided to combine their efforts,” said Osman. “OpenTelemetry specifically grew out of two different projects with very similar goals, which were to create an open-source ecosystem around telemetry data, specifically tracing, metrics, and logs.”

Osman and his team at Honeycomb have embraced OpenTelemetry wholeheartedly. He calls them “big fans.” For Osman, OpenTelemetry represents possibility. “I dream of a future where telemetry is not something engineers think a lot about. Instead, it’s baked into the framework,” he said. “I’m really hoping that through an effort like this, we can get closer to a world like that.”

Prioritizing observability, not just data

For a long time, metrics, logging, and tracing have been treated as equivalent to observability. Recently, however, there’s been a shift back toward the term’s original definition in control theory: how well a system’s internal state can be inferred from its outputs.

To Osman, metrics, logging, and tracing were born out of what engineers had available. Today, engineers have much more at their fingertips. “I really don’t like companies selling the idea that if you have these three things, you have observability,” he said. “They do give you data, but how well that data represents the internal state of your system is the degree to which you really have observability.”

Honeycomb’s goal is to transcend all three. “Honeycomb, at its heart, is an ultra-wide event store. I love seeing people’s eyes light up when they experience what Honeycomb can show them in terms of observability. Our model is simple: we accept keys and values, which our users can embed. They can have as many as they like, and they end up with an ultra-wide table that represents their data in what we call a Honeycomb. That ultra-wide part lets people slice and dice over a huge number of dimensions, quickly getting to the root of the issue.”
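To make the wide-event idea concrete, here is a minimal Python sketch. It is not Honeycomb’s implementation or API. The field names (`service`, `region`, `build_id`, and so on) and the `group_by` helper are hypothetical, chosen only to show how rows with many keys can be sliced along any dimension.

```python
from collections import defaultdict
from statistics import mean

# Each event is one wide row: arbitrary key/value pairs describing a unit of work.
# Every field name and value here is invented for illustration.
EVENTS = [
    {"service": "checkout", "region": "us-east", "status": 200, "duration_ms": 40.0},
    {"service": "checkout", "region": "us-east", "status": 500, "duration_ms": 900.0,
     "build_id": "b42"},
    {"service": "checkout", "region": "eu-west", "status": 200, "duration_ms": 60.0},
    {"service": "search", "region": "us-east", "status": 200, "duration_ms": 20.0},
]


def group_by(events, dimension, measure):
    """Average a numeric field, grouped by any key that appears in the events."""
    buckets = defaultdict(list)
    for event in events:
        # Wide events are sparse, so skip rows that lack either field.
        if dimension in event and measure in event:
            buckets[event[dimension]].append(event[measure])
    return {key: mean(values) for key, values in buckets.items()}


# The same data can be cut along any dimension without a predefined schema.
by_region = group_by(EVENTS, "region", "duration_ms")
by_status = group_by(EVENTS, "status", "duration_ms")
by_build = group_by(EVENTS, "build_id", "duration_ms")
```

In this toy data, grouping by `status` and then by `build_id` narrows a latency spike from failing requests down to a single build, even though only one event carries a `build_id` at all.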

The best systems are pretty boring

One of Osman’s favorite things about Honeycomb is how simple the architecture is. “I don’t think there’s such a thing as cool or uncool architecture: there’s just architecture that works,” he said. “One of the things I really appreciate about Honeycomb and the way we’ve built our systems is that it’s all boring. Our system is not that complicated.”


He shared that the company leverages a few internally built services, which are all dog-themed. Retriever is a columnar data store that stores events, Shepherd is their ingest service that persists events, and Doberman enforces usage restrictions, like a guard dog. To Osman, it’s not about having the fanciest Ops solutions. The best ones are simple, and they just work.

“Boring technology always wins,” said Osman.
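For readers unfamiliar with the term, a columnar store keeps one sequence of values per field instead of one record per event. The sketch below is a toy illustration of that layout in Python. It makes no claim about how Retriever works internally, and the `ColumnStore` class and its methods are hypothetical names.

```python
class ColumnStore:
    """Toy column-oriented store: one list per field, aligned by row index."""

    def __init__(self):
        self.columns = {}   # field name -> list of values (None where absent)
        self.row_count = 0

    def append(self, event):
        # A field seen for the first time is backfilled with None for earlier rows.
        for key in event:
            if key not in self.columns:
                self.columns[key] = [None] * self.row_count
        # Every known column receives exactly one entry per appended event.
        for key, values in self.columns.items():
            values.append(event.get(key))
        self.row_count += 1

    def column(self, name):
        """Return one field's values without reading any other field."""
        return self.columns.get(name, [None] * self.row_count)


store = ColumnStore()
store.append({"service": "checkout", "duration_ms": 40.0})
store.append({"service": "search", "region": "us-east"})
durations = store.column("duration_ms")
regions = store.column("region")
```

The design point is that a query over one field, such as `duration_ms`, reads only that list, which is part of why this layout suits tables with a very large number of sparse columns.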

How companies should adopt Kubernetes

Osman is well-versed in everything Kubernetes, and shared his wisdom on how companies should think about adopting it. Here are a few guiding principles he shared:

  • Simple architecture is the best architecture. Osman believes that the best architecture is simple. If you have solutions that work, you don’t need to suddenly embrace Kubernetes just because it’s the industry standard. “If the old and well-known thing is working for you and it’s not a limiter, why would you change it?” he said.
  • Make sure your engineers are comfortable. If you have engineers who are really good with Kubernetes and it’s what they do the fastest, then embrace it. But if you’ll need to train engineers to get up to speed, it may not be worth the investment, especially if you have a solution that’s working. “If you’ve got a group of six engineers, and they’re all Kubernetes experts, then go to Kubernetes. It’ll probably be fast for you,” he said. “But if you’ve got a team that’s used to using configuration management and VMs, then stick with that until it’s a bottleneck.”
  • Recognize the power of an industry standard. Although Osman appreciates all that Kubernetes has to offer, he also recognizes that it’s become an industry standard, partly due to great marketing. It may be the right solution for you, but it’s not the only one.

The job of an engineer: Solving business problems

Engineers love to solve problems with their work. But according to Osman, an engineer’s job is to solve business problems with as little code (or work) as possible. “At the end of the day, the customers who love to use the tools are paying your salary,” he said. “That doesn’t mean your work isn’t important, but you do need to focus on critical areas that satisfy customers like reliability, availability, and making sure the services are running smoothly.”

The experience of an engineer can be humbling, especially when you’re building things that you’re not an expert on. “The people you work with who are using your systems are going to be the experts,” said Osman. “You can do the best job you can scoping out the use cases, but the other engineers in your company that are using your service are going to find all the edge cases and surprising ways that your solution doesn’t fit their needs.”

Osman sees this as an exciting opportunity to adopt a service mentality, where engineers get excited about learning how people want to use their service. There’s always room to improve a system.

Interview

Check out my full interview with Paul Osman to learn more about his perspective. We talked about Ops, OpenTelemetry, and the role of an engineer.

 

 
