Shifting to a microservice architecture comes with well-known benefits. Isolation between systems and teams means feature teams can iterate faster and the entire software engineering organization can scale with more efficiency and agility.
But these benefits also come with tradeoffs. Microservices bring increased operational complexity in the form of increased surface area and more diverse tech stacks.
For central platform or SRE teams, this increased diversity in technologies makes their work more challenging. Tracking, monitoring, and supporting a variety of services, and ensuring their reliability, can start to feel like taking care of beloved (or idiosyncratic) pets instead of managing herds of cattle.
Custom Event Checks: Intuitive Building Blocks
At OpsLevel, we’ve built our service ownership platform with microservices top of mind, so we’re ready to support teams no matter what technologies or tooling they need to integrate with. One way we make this possible is with a new framework called Custom Event Checks.
Like any check within our broader Service Maturity model, the goal is to evaluate services against your preferred standards or best practices. With Custom Event Checks (CECs), the data source for these checks can be nearly anything.
OpsLevel can receive arbitrary JSON payloads from any source: homegrown systems, open-source tooling, or third-party vendors we don’t yet integrate with. The JSON data is then parsed, mapped to the corresponding services in your OpsLevel account, and evaluated against the logic that determines whether a service has passed or failed that CEC.
Custom Event Checks: Under the hood
The magic of CECs is that they’re simultaneously all-purpose, powerful, and simple to create. To illustrate that, let’s take a look at the process from start to finish. For our example, imagine an SRE wants to incorporate Sentry issues into their evaluation of their company’s external-facing services.
The check writer can quickly create a CEC endpoint for Sentry in OpsLevel. Then they can write a check, based on jq, that parses data extracted from Sentry’s API, maps the JSON payload to services in OpsLevel, and evaluates the payload data against any custom logic. For example, a check writer might look for any:
- services with more than 10 unresolved issues
- services with any issues impacting more than 100 users
- services with more than 3 recurring issues (i.e., regressions)
An individual check will only produce pass/fail results for one of the specific scenarios above, but CEC endpoints and payloads can easily be reused across multiple checks to capture pass/fail results at different levels or thresholds.
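As a rough illustration of the parse, map, and evaluate steps, here is a minimal Python sketch of the three scenarios above. It is not how OpsLevel implements checks (real CECs are written in jq), and the payload shape, field names (`service`, `status`, `user_count`, `is_regression`), and function names are all hypothetical; Sentry’s actual API responses look different.

```python
from collections import defaultdict

# Hypothetical payload shape, invented for illustration only.
payload = {
    "issues": [
        {"service": "checkout", "status": "unresolved", "user_count": 250, "is_regression": False},
        {"service": "checkout", "status": "resolved", "user_count": 4, "is_regression": True},
        {"service": "search", "status": "unresolved", "user_count": 12, "is_regression": False},
    ]
}


def group_by_service(payload):
    """Map each issue in the payload to the service it belongs to."""
    grouped = defaultdict(list)
    for issue in payload["issues"]:
        grouped[issue["service"]].append(issue)
    return dict(grouped)


# Each check returns True when the service passes.
def passes_unresolved_limit(issues, limit=10):
    unresolved = sum(1 for issue in issues if issue["status"] == "unresolved")
    return unresolved <= limit


def passes_user_impact_limit(issues, limit=100):
    return all(issue["user_count"] <= limit for issue in issues)


def passes_regression_limit(issues, limit=3):
    regressions = sum(1 for issue in issues if issue["is_regression"])
    return regressions <= limit


def evaluate(payload, check):
    """Run one check against every service found in the payload."""
    return {
        service: check(issues)
        for service, issues in group_by_service(payload).items()
    }
```

The same payload feeds all three checks, which mirrors how one CEC endpoint can back several checks. With the sample data, `evaluate(payload, passes_user_impact_limit)` fails `checkout` (one issue affects 250 users) and passes `search`.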
CECs are designed so that check writers don’t need sophisticated knowledge of OpsLevel’s data model or API. All that’s required is familiarity with the payload’s schema and basic jq in order to parse out service identifiers and the success condition, as pictured above.
Lastly, data processing actually kicks off when OpsLevel receives a JSON payload at the Sentry CEC endpoint. Often data will reach a CEC endpoint via a webhook triggered in an external system or tool, but any method that pushes JSON to OpsLevel (e.g., a bash script, cron job, or serverless function) will work.
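For teams pushing data from a script rather than a webhook, the sketch below shows one way to POST JSON using only Python’s standard library. The endpoint URL is a made-up placeholder and the function names are hypothetical; the real URL, and any authentication it needs, would come from your own CEC integration settings.

```python
import json
import urllib.request

# Hypothetical placeholder; replace with the URL of your own CEC endpoint.
CEC_ENDPOINT = "https://example.invalid/custom-event-checks/sentry"


def build_cec_request(payload, endpoint=CEC_ENDPOINT):
    """Build (but do not send) a POST request carrying the JSON payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_to_cec(payload, endpoint=CEC_ENDPOINT):
    """Send the payload and return the HTTP status code of the response."""
    request = build_cec_request(payload, endpoint)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A cron job or serverless function could call `send_to_cec(payload)` on a schedule. Keeping request construction separate from sending makes the script easy to test without network access.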
Next Steps: {% if check.failed %}
Once a CEC is producing boolean results per service as intended, it’s time for you, as the check writer, to consider what action you want service owners to take, particularly if their service is failing the check. Ideally, you want to provide service owners with information on why that check failed and what changes they should make in order to pass the check.
To make the process easy for everyone, CECs include a customizable result message. The message supports the open-source template language Liquid, including a set of variables for accessing check state and data from the check payload. The result message also supports Markdown, which makes the message easy to read.
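The branching a result message performs can be sketched in Python. This is an analogue of the logic only, not Liquid syntax and not OpsLevel’s template variables; the function name, parameters, and message wording are hypothetical.

```python
def result_message(service, passed, unresolved_count, limit=10):
    """Return a Markdown message explaining a check result to a service owner."""
    if passed:
        return (
            f"**{service}** passes: {unresolved_count} unresolved issues "
            f"(limit {limit})."
        )
    # On failure, say why the check failed and what to do about it.
    return "\n".join(
        [
            f"**{service}** fails: {unresolved_count} unresolved issues "
            f"(limit {limit}).",
            "",
            "To pass this check:",
            f"- Resolve or triage at least {unresolved_count - limit} issues.",
        ]
    )
```

In a real CEC, the same branch would be written as a Liquid conditional on the check’s state, with the counts pulled from the payload data.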
How will you use Custom Event Checks?
Making service ownership at scale achievable for service owners and platform teams requires a central source of truth for information about your microservices.
With OpsLevel and Custom Event Checks, building and automating that repository of service metadata and maturity is easier than ever. Ready to get started? Request your OpsLevel demo today.