DevOps resources tips and best practices
We’re delving into the benefits of developer portals this week with Seth Lochen, Senior Engineering Manager at Groupon.
Seth shares his experience leading Groupon's journey from monolith to microservices and the internal developer portal built during this migration. He gives an overview of Groupon’s current success and how the portal has supported their scale-up journey.
He dives into how ownership definitions and assignments have changed, the way that clean data has enabled more automation, and how they’ve used all of this to strengthen and clarify SLOs.
Join as we discuss:
It worked at the time but, of course, it was among the most costly ways to grow a business.
“As we scaled up, we had to break that monolith into pieces,” says Lochen, explaining the shift in thinking at the organization. “We headed towards the classic service-oriented microservice architecture. As we did that, we started losing track of how many services there were. It was easy enough to carve off things in groups, but over time we built up more and more microservices.”
Lochen refers to the “explosion of services” as the key driver of the service portal’s development. The portal made it easier for every team to understand the overall team structure and responsibilities, service definitions, and integration relationships.
An increase in operational complexity is an expected part of scaling up, but the motivation for tying ownership to services was quite simple.
Lochen explains: “We did this so we knew who to contact if something was broken.”
UX wins again — as it should.
According to Lochen, your portal is the key to reducing friction in development — just look at how he and the team were able to use theirs.
“We've added more and more to that service portal, which allowed us to do cross-service reporting and even get into SDLC automation, leveraging our own service portal API to make decisions,” he explains. “As we went from monolith to service-oriented architecture, the service portal became this necessary hub that enabled us to manage that service role.”
“We initiated a type of service scoring, where we're monitoring the health of certain things at a granular level so we can assign tasks to people who are actually going to do the work.” — Seth Lochen
Today the system includes automatic rules that reassign responsibilities when staff change or something similar happens. There is officially no such thing as an “unowned” service at Groupon, which is a project manager’s dream come true.
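As a rough illustration of what an automatic reassignment rule could look like, here is a minimal Python sketch. It is not Groupon's implementation: the `Service` record, the `reassign_unowned` function, the "fall back to the team lead" rule, and all names and data are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Service:
    name: str
    team: str
    owner: Optional[str]  # person currently responsible; None means unowned


def reassign_unowned(services, active_staff, team_leads):
    """Give every service an active owner and return the names that changed.

    Hypothetical rule: if the owner is missing or no longer on staff,
    hand the service to the lead of the owning team.
    """
    reassigned = []
    for svc in services:
        if svc.owner not in active_staff:
            svc.owner = team_leads[svc.team]
            reassigned.append(svc.name)
    return reassigned


# Made-up catalog data for illustration only.
services = [
    Service("checkout", team="payments", owner="alice"),
    Service("deal-search", team="discovery", owner="bob"),
    Service("email-digest", team="discovery", owner=None),
]
active_staff = {"alice", "carol"}  # "bob" has left in this scenario
team_leads = {"payments": "alice", "discovery": "carol"}

changed = reassign_unowned(services, active_staff, team_leads)
print(changed)
```

Running a rule like this on every staffing change is one way a catalog could guarantee that no service is ever left without an owner.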
Through more detailed ownership, overall risk and security management became easier, too. As with any tech project, what you initially build becomes obsolete and requires updating.
The granularity within the service portal allowed for intricate reporting of:
By reporting on particular items, appropriate corrective — and eventually, pre-emptive — actions could be assigned to the right people in good time.
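To make the link between reporting, scoring, and assignment concrete, here is a small Python sketch. The check names, the scoring formula, and the catalog layout are all hypothetical; the episode does not describe how Groupon's service scoring is actually computed.

```python
def score_service(checks):
    """Return the share of passing checks as a whole-number percentage."""
    return 100 * sum(checks.values()) // len(checks)


def corrective_tasks(catalog):
    """Create one task per failing check, assigned to the service owner."""
    tasks = []
    for name, entry in catalog.items():
        for check, passed in entry["checks"].items():
            if not passed:
                tasks.append(
                    {"service": name, "check": check, "assignee": entry["owner"]}
                )
    return tasks


# Made-up catalog entries; the check names are assumptions.
catalog = {
    "checkout": {
        "owner": "alice",
        "checks": {
            "has_runbook": True,
            "dependencies_current": False,
            "on_call_configured": True,
        },
    },
    "deal-search": {
        "owner": "carol",
        "checks": {
            "has_runbook": True,
            "dependencies_current": True,
            "on_call_configured": True,
        },
    },
}

scores = {name: score_service(entry["checks"]) for name, entry in catalog.items()}
tasks = corrective_tasks(catalog)
print(scores)
print(tasks)
```

Because every catalog entry carries an owner, each failing check turns directly into a task for a named person rather than a search for someone to take it.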
“If you can't assign that work to somebody, and know that they’re directly responsible for completing it, you're spending a lot of time on just looking for the right person to own it.” — Seth Lochen
In Groupon’s case, Lochen confirms that the data cleanup process triggered the other benefits. It’s easier to automate processes once you’re looking at data that makes sense.
When it comes to contextualizing solutions, it’s clear that Groupon crushed it. They’ve been intentional about what not to do, just as much as they have about what to actually pursue and invest energy toward.
“We're not necessarily at the point where we're adding too many hurdles,” Lochen says. “We can't afford it from a CI/CD standpoint; we don't want to slow everybody down.”
In some cases, keeping the CI/CD pipeline fast takes priority over addressing other risks. This is especially true when the UX could be jeopardized if something doesn’t execute or deploy on time.
The ability to proactively identify and intervene at ‘red flag areas’ is not the only benefit of a service portal. When it comes to the potential for positive reinforcement, Lochen points to the portal’s:
“We don't want people using software that has vulnerabilities, that have either been exposed or used outside of Groupon, for us to then find out about it and throw together a patch. Similarly, we don't want to fall too far behind. That makes service ownership and maintenance hard to do.” — Seth Lochen
Groupon may be decisively tackling response times, error rates and latency at the moment, but you need to think about what matters most for your organization before you can translate those into service level objectives (SLOs).
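Once you have chosen what matters, an SLO check can be quite small. The Python sketch below evaluates a batch of requests against an error-rate target and a 95th-percentile latency target. The thresholds, the sample data, and the function names are assumptions chosen for illustration, not Groupon's actual objectives.

```python
def percentile(values, pct):
    """Nearest-rank percentile using integer arithmetic for the rank."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # ceiling of pct * n / 100
    return ordered[max(rank, 1) - 1]


def evaluate_slo(requests, max_error_rate, max_p95_latency_ms):
    """Compare observed error rate and p95 latency against the targets."""
    errors = sum(1 for r in requests if r["status"] >= 500)
    error_rate = errors / len(requests)
    p95 = percentile([r["latency_ms"] for r in requests], 95)
    return {
        "error_rate": error_rate,
        "p95_latency_ms": p95,
        "met": error_rate <= max_error_rate and p95 <= max_p95_latency_ms,
    }


# Made-up sample: 19 successful requests plus one slow server error.
requests = [{"status": 200, "latency_ms": 100 + 10 * i} for i in range(19)]
requests.append({"status": 503, "latency_ms": 900})

# Hypothetical targets: at most 1% errors, p95 latency under 300 ms.
result = evaluate_slo(requests, max_error_rate=0.01, max_p95_latency_ms=300)
print(result)
```

In this sample the latency target is met but the error-rate target is not, so the objective as a whole is reported as missed.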
For further inspiration and insight, listen to the full episode.
Kenneth (Ken) Rose is the CTO and Co-Founder of OpsLevel. Ken has spent over 15 years scaling engineering teams as an early engineer at PagerDuty and Shopify. That in-the-trenches experience has given Ken a unique perspective on how some of the best teams are built and scaled, and he brings this viewpoint to building products for OpsLevel, a service ownership platform built to turn chaos into consistency for engineering leaders.