Migrating OpsLevel search to Elastic

Farjaad Rawasia | November 10, 2022

OpsLevel recently upgraded its in-app search capabilities by migrating to Elasticsearch. We invested significant engineering resources into the project because we think search is a foundational capability for any catalog–and any foundational capability is worth doing well. For search, that means providing a fast, comprehensive, and user-friendly experience to the end user.

The first phase of the project is complete, so let’s review what prompted the switch, how we chose Elasticsearch, what we learned, and where we might go next.

The limitations of Search v1

Way back in 2018, OpsLevel’s first search was a fairly straightforward SQL query. Our data model was significantly simpler then, so the query was essentially:
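Roughly along these lines (a reconstructed sketch, not the original code; the model and column names are illustrative):

```ruby
# Illustrative sketch of Search v1: a single LIKE match over a couple of
# columns on the services table.
term = "%#{params[:query]}%"
Service.where("services.name LIKE :q OR services.description LIKE :q", q: term)
```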

As OpsLevel's product surface area grew over time, additional entities like tags and repos needed to become searchable too. Our query grew:
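Into something shaped more like this (again an illustrative sketch, with hypothetical association and column names):

```ruby
# The same LIKE search, now fanned out across joined associations.
term = "%#{params[:query]}%"
Service
  .left_joins(:tags, :repositories, :aliases)
  .where(
    "services.name LIKE :q OR services.description LIKE :q " \
    "OR tags.key LIKE :q OR tags.value LIKE :q " \
    "OR repositories.name LIKE :q OR aliases.name LIKE :q",
    q: term
  )
  .distinct
```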

As we continued to grow, new challenges emerged with our SQL-based search.

The first issue was maintainability. Though we used Arel to keep the Rails code somewhat well factored, the SQL query itself was gigantic and hard to reason about. There were a lot of columns we were searching on, which made it difficult to change the query or add new associations to it. The query had achieved an almost mythical status internally for its complexity.

The second issue was performance. As we onboarded new customers with hundreds or thousands of services, we'd occasionally see search queries time out. That wasn't surprising: the query was basically doing free-text search on almost every attribute across a bunch of different tables, and MySQL's optimizer didn't have any great indexes it could use. :(

The final issue was the quality of the search results. The search itself was purely a wildcard / free-text search using SQL LIKE statements. There was no concept of relevance, tokenization, or really anything a modern search engine offers.

So we opted to fix it.

In addition to improving the search UX for our larger customers, we realized rearchitecting search was an opportunity to:

  • lay the groundwork for a search that would easily extend to objects beyond services
  • provide users with context for why particular search results were returned

Options for Search v2

After committing to re-architect and upgrade search, our first step was to assess our options. We considered and evaluated: 

  • Postgres
  • OpenSearch
  • Elasticsearch

Postgres

OpsLevel is a Ruby app, so a key consideration for most technology choices we make is: how well do the existing Ruby gems fit our needs? For Postgres, the answer was not very well. 

The most popular Postgres-driven search solution, the PGSearch gem, runs in two modes: single-model search and multi-search. Single-model search includes advanced features like ranking, ordering, and highlighting of search results, but those features do not exist in multi-search, and multi-search is required for searching across multiple tables.
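To make the distinction concrete, here's a rough sketch of the two modes (illustrative fields and scope names, not code we wrote):

```ruby
class Service < ApplicationRecord
  include PgSearch::Model

  # Single-model search: ranking, ordering, and highlighting are available.
  pg_search_scope :search_services,
                  against: [:name, :description],
                  using: { tsearch: { highlight: { StartSel: "<em>", StopSel: "</em>" } } }

  # Multi-search: records feed a shared pg_search_documents table so you can
  # query across models, but without the advanced features above.
  multisearchable against: [:name, :description]
end

Service.search_services("payments").with_pg_search_highlight  # ranked + highlighted
PgSearch.multisearch("payments")                               # cross-model, but basic
```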

So PGSearch wouldn’t make it easy for us to deliver a comprehensive search (across multiple object types) that also provides context to users about search results.

OpenSearch

In parallel with investigating Elasticsearch, we considered OpenSearch, the fork supported by AWS and the open source community.

There are many similarities between the two (OpenSearch is derived from Elasticsearch 7.10.2), and there were infra and hosting cost considerations that made OpenSearch attractive.

Ultimately, Ruby compatibility was the deciding factor. OpenSearch isn't fully compatible with the gems and tooling that exist for Elasticsearch, and the alternatives weren't as robust.

Elasticsearch

In the end, Elasticsearch was the right choice for our needs. It's purpose-built for search use cases (unlike SQL databases), is highly scalable and customizable, and has quality, battle-tested Ruby gems.

Migrating to Elasticsearch

Overall, we found the migration path to Elastic to be smoother than expected. Some of the highlights:

Indexing Service metadata

Indexing data per service was very straightforward. We were able to use the Rails method as_json as our serializer inside a custom as_indexed_json method, as suggested by the Ruby gem (elasticsearch-rails).
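In practice that looked something like this (simplified; the attribute list is illustrative):

```ruby
class Service < ApplicationRecord
  include Elasticsearch::Model

  # elasticsearch-rails calls as_indexed_json to build each document body;
  # delegating to Rails' as_json keeps the serializer trivial.
  def as_indexed_json(_options = {})
    as_json(
      only:    [:id, :name, :description],
      include: { tags: { only: [:key, :value] }, aliases: { only: [:name] } }
    )
  end
end
```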

For the initial indexing, we used a Sidekiq job to import via Elasticsearch's bulk API. The worker indexed services in batches of 1000, metering itself out over time.

This incremental and scalable approach made sure we could build up our index in Elastic without overwhelming our primary database with queries for all of the data related to all our services (names, aliases, tags, descriptions, etc.).
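The worker took roughly this shape (a simplified sketch with hypothetical class names, not our exact job):

```ruby
class SearchBackfillWorker
  include Sidekiq::Worker

  BATCH_SIZE = 1_000

  def perform(offset = 0)
    ids = Service.order(:id).offset(offset).limit(BATCH_SIZE).pluck(:id)
    return if ids.empty?

    # elasticsearch-rails' import uses the bulk API under the hood.
    Service.import(query: -> { where(id: ids) })

    # Re-enqueue the next batch with a delay so the backfill meters itself
    # out instead of hammering the primary database.
    self.class.perform_in(30.seconds, offset + BATCH_SIZE)
  end
end
```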

We also had an existing pattern of using Wisper callbacks to trigger background jobs after any CRUD activity on Rails models, so near real-time updating of data in Elasticsearch was also easy to set up.
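That pattern looks roughly like this (a hypothetical sketch of the shape, not our exact code):

```ruby
# A Wisper listener reacts to model CRUD broadcasts by enqueueing a job
# that re-indexes the single record, keeping Elasticsearch near real-time.
class SearchIndexListener
  def service_updated(service_id)
    ReindexServiceWorker.perform_async(service_id)
  end
  alias_method :service_created, :service_updated
end

class ReindexServiceWorker
  include Sidekiq::Worker

  def perform(service_id)
    service = Service.find_by(id: service_id)
    return unless service

    service.__elasticsearch__.index_document
  end
end

# Registered once, e.g. in an initializer:
Wisper.subscribe(SearchIndexListener.new)
```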

elasticsearch-rails

The elasticsearch-rails gem made setting up our mapping (i.e. our schema definition) in Elasticsearch very simple. It also had a number of methods that made indexing, searching, and retrieving highlights easy. For example, we used the map_with_hit method, and the hit object it yields exposes the highlight via hit.highlight.
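Put together, those pieces look something like this (field names and query are illustrative):

```ruby
class Service < ApplicationRecord
  include Elasticsearch::Model

  settings index: { number_of_shards: 1 } do
    mappings dynamic: "false" do
      indexes :name,        type: "text"
      indexes :description, type: "text"
    end
  end
end

response = Service.search(
  query:     { multi_match: { query: "payments", fields: %w[name description] } },
  highlight: { fields: { name: {}, description: {} } }
)

# map_with_hit yields each record alongside its raw Elasticsearch hit, which
# exposes the highlight fragments via hit.highlight.
results = response.records.map_with_hit do |service, hit|
  { service: service, highlights: hit.highlight }
end
```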

Of course, not everything was crystal clear on the first pass. We had to sort out some vocab confusion in the various documentation as we were configuring our indexes. For example:

  • An Elastic Cloud “deployment” should be thought of as a “cluster”. It contains related instances/nodes.
  • An Elastic Cloud “instance” seems to be what Elasticsearch calls a “node”. The Elasticsearch nodes contain shards. 

Testing

[Image: a dead-end investigation. The pain is real.]

We considered a variety of testing approaches, but with the dead end above and without the bandwidth for our Platform Engineering team to set up an Elasticsearch cluster in our CI pipeline, we elected to take a mocking or stubbing approach.

We’ve previously used Webmock for similar CI use cases, but elected to go with VCR in this instance because we wanted to test our behavior all the way through to the actual Elasticsearch engine.

We needed to drive out complex behaviors in an unfamiliar domain with Test Driven Development (TDD) and have confidence that the queries we wrote would return the expected search hits, ranking, and ordering. VCR let us write tests that ran quickly on CI, without Elastic itself running on CI, while still putting the entire system under test in our local dev environments.
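The shape of those tests was roughly this (a hypothetical spec; cassette names are illustrative):

```ruby
require "vcr"

VCR.configure do |config|
  config.cassette_library_dir = "spec/cassettes"
  config.hook_into :webmock
end

RSpec.describe "service search" do
  it "ranks exact name matches first" do
    # The first local run records real Elasticsearch HTTP traffic into the
    # cassette; CI replays the cassette without an Elasticsearch cluster.
    VCR.use_cassette("search/exact_name_match") do
      response = Service.search(query: { match: { name: "checkout" } })
      expect(response.records.first.name).to eq("checkout")
    end
  end
end
```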

It got the job done eventually, but not without some struggle to resolve flaky tests and level up the team on VCR best practices.

Product Outcomes & Tradeoffs

Moving to Elasticsearch has given us:

  • Faster, more reliable search
  • Ranking and highlighting on our search results page
  • An extensible framework for adding new objects to search

But we did make one clear concession: no more true wildcard search. In Search v1, our SQL-based approach supported this by default. With Elasticsearch, we had the opportunity to be more intentional about our configuration. 

We could use a wildcard query or an nGram filter to support this use case, but the wildcard query would mean significantly slower queries, and the nGram filter would cause our index sizes to spike. 

Ultimately, we decided prefix matching (a “git” search string would match “GitHub”, “GitLab”, “GitKraken”) was sufficient to support the vast majority of search use cases. 
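For illustration, one common way to get that behavior in Elasticsearch (not necessarily our exact configuration) is a match_phrase_prefix query:

```ruby
# Treats the final term in the query as a prefix, so "git" matches
# "GitHub", "GitLab", "GitKraken", and so on.
response = Service.search(
  query: { match_phrase_prefix: { name: { query: "git" } } }
)
response.records.map(&:name)
```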

Future plans

Now that we’ve completed and GA’d the first phase of our migration to Elasticsearch, we’re excited about all the possibilities ahead of us for further improving our search experience:

  • Adding search to our external GraphQL API
  • Adding new objects to our search, so users can find information (and OpsLevel functionality) in their catalog faster.

Potential adds include:

  • API Docs
  • Tech Docs
  • Deploy events
  • Team metadata (e.g. description or charter)
  • Dependencies
  • Individual Check Reports

Tired of filtering spreadsheets or toiling in Confluence to find the service metadata or docs you need? Come check out our Elasticsearch-powered service catalog.
