Breaking Down Zapier's Monolith with OpsLevel's Service Catalog

Case Study Highlights

Founded in 2011, Zapier helps users build their own code-less workflow automations, offering connections with more than 3,000 apps. Their mission is to help everyone be more productive at work.

Zapier’s team, including nearly 200 software engineers, is fully remote and distributed around the globe.

Zapier chose OpsLevel to help their engineers work more efficiently as they migrated to a microservice architecture.

Microservice Catalog

Engineers know where to quickly get correct answers to: Who owns this service? No more pinging large Slack channels and crossing fingers.

Production Readiness Checklist

As services exited the monolith, they were systematically evaluated against Zapier’s engineering standards.

Monolith to Microservices

Zapier de-risked their migration and turned it into an opportunity for increased reliability.

The Challenge

Battle tribal knowledge and break down the monolith at the same time

Like many growing engineering organizations, Zapier was used to tracking the components of its ever-evolving architecture with a series of spreadsheets and wikis. These tools were simple and flexible, but never remained accurate or widely discoverable by frontline engineers for long. Engineers were regularly resorting to pinging large Slack channels to get answers to questions about service ownership or service function.

“We had no solution to catalog all our services, so we didn’t know who owns what or which services are actually out there unless you go poking around in their repositories on Github.”

Mickey Wu
Architecture Engineering Manager at Zapier

Engineering management and architecture teams were similarly challenged, having to take manual inventories of services in order to progress key projects like their shift from monolith to microservices. A company-wide initiative to migrate their front-end applications to microservices by the end of 2021 increased the urgency.

The Solution

One platform for cataloging and internal discovery of services

As a remote-first company that makes automation easy for companies around the globe, Zapier didn’t settle for these manual, stop-gap solutions to its service ownership problems. Instead, they turned to OpsLevel.

Zapier’s internal scaffolding tool is integrated with OpsLevel to generate an opslevel.yml file, so that anything Zapier builds is tracked in OpsLevel by default. With OpsLevel’s service catalog now embedded into their workflows, Zapier has done more than answer questions of ownership or explain the function of unconventionally named services.
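As a rough illustration of what a scaffolding step like this could look like, here is a minimal sketch that renders a service descriptor as YAML text. Zapier's actual tool is not described in this case study, so the function name, the example service, and the exact set of fields are all assumptions for illustration; check OpsLevel's documentation for the real opslevel.yml schema.

```python
import json


def render_service_descriptor(name, owner, tier, tags=None):
    """Render a minimal YAML service descriptor as text.

    Hypothetical helper: the field names below are illustrative and are
    not confirmed by the case study. Values are emitted with json.dumps
    because a JSON string is also a valid double-quoted YAML scalar.
    """
    lines = [
        "version: 1",
        "service:",
        f"  name: {json.dumps(name)}",
        f"  owner: {json.dumps(owner)}",
        f"  tier: {json.dumps(tier)}",
    ]
    if tags:
        lines.append("  tags:")
        for key, value in sorted(tags.items()):
            lines.append(f"    - key: {json.dumps(key)}")
            lines.append(f"      value: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


# Example: a made-up service generated by a made-up scaffolding run.
descriptor = render_service_descriptor(
    "billing-api", "payments", "tier_2", {"needs-key-rotation": "true"}
)
print(descriptor)
```

Writing the descriptor at scaffold time, rather than asking teams to add it later, is what makes "tracked by default" possible: a new service cannot exist without its ownership metadata.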

“Teams would be developing a new feature and they didn’t really know what services to talk to. A lot of times it’s just tribal knowledge. It’s always a question on Slack,” said Mickey. “But now you can just say go to OpsLevel to easily gather that information.”

For example, using OpsLevel’s tags and GraphQL API, Zapier is continuously tracking which services need key rotation. Since engineers already know and trust OpsLevel as a source of truth, internal API documentation is also linked to relevant services in OpsLevel, making the docs more discoverable.
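The tag-based tracking described above can be sketched as a simple filter over catalog records. This is only an illustration: the Service record, the needs-key-rotation tag name, and the sample data are hypothetical, and the step that would fetch records through OpsLevel's GraphQL API is deliberately left out rather than guessed at.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical in-memory view of a catalog entry."""
    name: str
    owner: str
    tags: dict = field(default_factory=dict)


def services_needing_rotation(services, tag_key="needs-key-rotation"):
    """Return the sorted names of services whose tag marks them for rotation."""
    return sorted(s.name for s in services if s.tags.get(tag_key) == "true")


# Made-up records standing in for the result of a catalog query.
catalog = [
    Service("billing-api", "payments", {"needs-key-rotation": "true"}),
    Service("docs-site", "devrel"),
    Service("auth-gateway", "platform", {"needs-key-rotation": "true"}),
]
rotation_queue = services_needing_rotation(catalog)
print(rotation_queue)
```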

Zapier now tracks more than 40 microservices in OpsLevel and expects to add more as they further break down the monolith. Zapier also catalogs demo apps, third-party tools, and other libraries or packages in OpsLevel to ensure ownership is defined and tracked.

The Outcomes

A strong foundation and clear source of truth for continued development

Today, each service at Zapier that exits their monolith is evaluated with production readiness checklists built on OpsLevel’s rubric functionality. In the past, teams made decisions about deploying to production on their own. But since service health is built into OpsLevel, developers now have important conversations with Zapier’s production engineering team before their microservices get to production.

“As services exit the monolith one of our requirements is that they have to go through the production readiness checklist,” said Mickey. “And that’s actually sparked a lot of good conversations with production engineering.”

At Zapier, these checks include things like ensuring monitoring is in place for the four golden signals and authentication and security checks that verify no unintended exposures to the public web. The end result: increased adherence to Zapier’s best practices and more reliable services.
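To make the idea of a production readiness checklist concrete, here is a minimal sketch of how checks like these could be evaluated. It is not Zapier's checklist or OpsLevel's rubric implementation; the dictionary keys and the example service are assumptions. The four golden signals listed are the standard ones from site reliability engineering practice: latency, traffic, errors, and saturation.

```python
GOLDEN_SIGNALS = ("latency", "traffic", "errors", "saturation")


def evaluate_readiness(service):
    """Return a list of human-readable failures; an empty list means ready.

    `service` is a hypothetical dict describing a service. The keys used
    here are illustrative, not an OpsLevel data model.
    """
    failures = []

    # Check 1: every golden signal must have monitoring in place.
    monitored = set(service.get("monitored_signals", []))
    missing = [s for s in GOLDEN_SIGNALS if s not in monitored]
    if missing:
        failures.append("missing monitoring for: " + ", ".join(missing))

    # Check 2: authentication must be enforced.
    if not service.get("requires_auth", False):
        failures.append("authentication is not enforced")

    # Check 3: public exposure is only allowed when explicitly approved.
    if service.get("publicly_exposed", False) and not service.get(
        "public_exposure_approved", False
    ):
        failures.append("unapproved exposure to the public web")

    return failures


# Made-up service that is missing one signal and is exposed unintentionally.
example = {
    "monitored_signals": ["latency", "traffic", "errors"],
    "requires_auth": True,
    "publicly_exposed": True,
}
report = evaluate_readiness(example)
print(report)
```

Returning the list of failures, rather than a single pass or fail flag, gives service owners something specific to discuss with a production engineering team.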

Now that they’ve built a service catalog that their entire engineering organization can rely on, Zapier, as you might expect for a company focused on productivity and automation, has plans to keep optimizing.

“We’re going to continue building off of OpsLevel. And we’re never going to stop building services, so I think we’re just going to continue to see the benefits from all the ownership information … and being able to pinpoint where the gaps are and to making sure services follow all the health checks,” said Mickey. “That’s where I see us continuing to get value from OpsLevel.”

Visualizing service dependency information in OpsLevel will make meeting compliance frameworks like SOC 2 easier. And further refining their production readiness checklists will ensure services of different tiers receive the appropriate level of review, balancing service reliability and engineering efficiency.

Solve Service Ownership Forever

Say goodbye to stale spreadsheets and wikis. We'll show you how OpsLevel can give you a rock solid foundation for building and maintaining microservices.