Rosetta: Breaking language barriers at Headout

Ever tried explaining the magic of the Eiffel Tower to someone who doesn't speak your language? That's the challenge we face daily at Headout. As a global experiences marketplace, we've always known that speaking our customers' language isn't just nice; it's necessary. But until recently, our translation process felt a bit like trying to build the Tower of Babel: complex, fragmented, and increasingly inefficient as we scaled. That's where Rosetta, our purpose-built localization microservice, comes in. It is revolutionizing how we communicate with travelers worldwide.

The problem: Lost in translation

Before we dive into our solution, let's talk about why we needed one in the first place. Headout operates in over 160 cities, offering thousands of experiences that need accurate descriptions across multiple languages. Our previous localization process was heavily manual and looked something like this:

  1. Our content team would write the original English content.
  2. In-house linguists would translate it in Google Sheets.
  3. The linguists would then copy the translations and manually insert them into our database using our internal CMS.

This approach had several critical flaws that were becoming increasingly problematic as we scaled:

  1. Painfully manual and time-consuming: Teams were either translating directly within content management systems or, in many cases, relying on colleagues to transfer content from spreadsheets because of access limitations, which added another layer of manual work to an already slow process.
  2. High risk of human error: With manual insertions into the database, there was always the risk of copying the wrong translation or inserting it into the incorrect field.
  3. Required entity-specific knowledge: Translators needed to understand the context of each entity they were working with, creating a steep learning curve for new team members.
  4. Failed to be content-agnostic: Ideally, translation should be a neutral process that works the same way regardless of content type, but our system required different approaches for different content categories.

As our content library grew, these limitations weren't just inconvenient—they were actively impeding our ability to expand globally. The manual copy-paste workflow created bottlenecks, introduced human error, and simply couldn't keep pace with our rapid expansion. We needed a solution that could scale with us and eliminate the tedious manual steps involved in getting content translated.

The vision: A localization black box

What if translation could be as simple as sending content in, and getting localized versions back? No need to understand the complexities of translation workflows or provider integrations. Just a clean, abstract interface that handles everything behind the scenes.

That was the vision for Rosetta—named after the famous Rosetta Stone that unlocked the mystery of Egyptian hieroglyphics. Just as the Stone bridged ancient languages, our Rosetta would bridge modern ones, without forcing our other services to understand the complexities of translation.

How Rosetta works its magic

At its core, Rosetta is an event-driven microservice that acts as an abstraction layer between our content services and translation providers. Here's how it works:

  1. Content submission: A service sends content for translation through a simple API call, specifying target languages and translation method.
  2. Content processing: Rosetta breaks down the content into translatable components and manages the workflow.
  3. Translation management: Content is sent to translation providers (currently XTM & DeepL) with all necessary metadata.
  4. Status tracking: Rosetta monitors the translation status, handling retries and errors gracefully.
  5. Callback delivery: Once translation is complete, Rosetta sends the translated content back to the original service via a callback URL.
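
The content-processing step (step 2) could be sketched roughly as follows. This is a minimal Python illustration, not Rosetta's actual Kotlin code: the names `flatten` and `unflatten` and the dotted-path convention are assumptions made for the example. The idea is that nested content is reduced to a flat set of translatable strings, and the translated strings are later reassembled into the original shape.

```python
from typing import Any


def flatten(content: dict[str, Any], prefix: str = "") -> dict[str, str]:
    """Reduce nested content to {dotted.path: text} segments (hypothetical helper)."""
    segments: dict[str, str] = {}
    for key, value in content.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            segments.update(flatten(value, path))
        elif isinstance(value, str):
            segments[path] = value
        # Non-string leaves (numbers, booleans) are treated as non-translatable and skipped.
    return segments


def unflatten(segments: dict[str, str]) -> dict[str, Any]:
    """Rebuild the nested shape from dotted paths (assumes keys contain no dots)."""
    result: dict[str, Any] = {}
    for path, text in segments.items():
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = text
    return result
```

A translation provider would then only ever see the flat segments, which keeps the step independent of the content type that produced them.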

The beauty of this approach is its simplicity from the consumer service perspective. They don't need to know anything about translation workflows—they just need to be able to receive the translated content when it's ready.

POST /v1/{provider}/translate
{
  "files": [
    {
      "content": {
        "title": "Eiffel Tower Skip-the-Line Tickets",
        "description": "Enjoy breathtaking views of Paris from the city's most iconic landmark"
      },
      "metadata": {
        "field1": "value1"
      }
    }
  ],
  "metadata": {
    "languages": ["ES", "FR", "IT"],
    "translationMethod": "MACHINE",
    "callback": "https://example.com/content/translation-callback"
  }
}
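
From the consumer side, assembling that request might look something like the sketch below. It is written in Python purely for illustration; the helper names `translate_path` and `build_translate_request`, the `"HUMAN"` method value, the provider slug, and the validation rules are all assumptions, since the post only shows the payload shape and the `"MACHINE"` method.

```python
# Assumed values: the post only shows "MACHINE"; "HUMAN" is a guess for illustration.
TRANSLATION_METHODS = {"MACHINE", "HUMAN"}


def translate_path(provider: str) -> str:
    """Fill in the {provider} path parameter; the slug format is an assumption."""
    return f"/v1/{provider}/translate"


def build_translate_request(files, languages, method, callback):
    """Build the JSON body for the translate endpoint, rejecting obviously bad input."""
    if not files:
        raise ValueError("at least one file is required")
    if not languages:
        raise ValueError("at least one target language is required")
    if method not in TRANSLATION_METHODS:
        raise ValueError(f"unknown translation method: {method}")
    if not callback.startswith(("http://", "https://")):
        raise ValueError("callback must be an absolute HTTP(S) URL")
    return {
        "files": [
            # Per-file metadata is optional in this sketch and defaults to {}.
            {"content": f["content"], "metadata": f.get("metadata", {})}
            for f in files
        ],
        "metadata": {
            "languages": list(languages),
            "translationMethod": method,
            "callback": callback,
        },
    }
```

Validating on the caller's side is a design choice for the sketch; the service would presumably perform its own checks as well.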

The technical implementation

To build this system, we needed a robust, scalable architecture. We chose:

  • Spring Boot & Kotlin: For rapid development of our microservice
  • Apache Kafka: As our event streaming platform for handling asynchronous communication
  • MongoDB (managed DocDB): As our NoSQL database for flexibility in handling diverse document types

Event-driven architecture

The decision to use an event-driven architecture was crucial. Translation is inherently asynchronous: it can take minutes, hours, or even days depending on the method (machine translation using LLMs or other AI versus human translation). By building on Kafka, we ensured that our system could handle these varied timeframes without blocking or creating bottlenecks.

Each translation request generates several events in the system:

  • Request creation
  • Translation job assignment
  • Translation status updates
  • Completion and callback events

This approach supports fault tolerance and scalability, and it keeps request handling responsive even with high volumes of content.
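
One way to picture the event flow is as a small state machine that each event advances. The Python sketch below is illustrative only: the status names and the allowed transitions are assumptions, and in the real service the events would arrive over Kafka rather than from an in-memory list.

```python
from enum import Enum


class JobStatus(Enum):
    # Hypothetical statuses mirroring the event types listed above.
    REQUESTED = "REQUESTED"
    ASSIGNED = "ASSIGNED"
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    CALLBACK_SENT = "CALLBACK_SENT"
    FAILED = "FAILED"


# Which status may follow which; terminal states map to an empty set.
ALLOWED = {
    JobStatus.REQUESTED: {JobStatus.ASSIGNED, JobStatus.FAILED},
    JobStatus.ASSIGNED: {JobStatus.IN_PROGRESS, JobStatus.FAILED},
    JobStatus.IN_PROGRESS: {JobStatus.IN_PROGRESS, JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: {JobStatus.CALLBACK_SENT, JobStatus.FAILED},
    JobStatus.CALLBACK_SENT: set(),
    JobStatus.FAILED: set(),
}


def apply_event(current: JobStatus, event: JobStatus) -> JobStatus:
    """Advance the job, rejecting out-of-order or duplicate terminal events."""
    if event not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {event.name}")
    return event


def replay(events, start: JobStatus = JobStatus.REQUESTED) -> JobStatus:
    """Fold a sequence of events into the resulting status."""
    state = start
    for event in events:
        state = apply_event(state, event)
    return state
```

Rejecting illegal transitions explicitly means a late or duplicated event surfaces as an error instead of silently overwriting a finished job.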

Database design

Our database design focused on tracking the state of translation requests and jobs while maintaining the relationships between them. We use three main collections:

  1. Translation Request: Stores the original request with all files and metadata
  2. Translation Job Details: Tracks individual translation jobs for each file and language
  3. Project Details: Manages the relationship with translation provider projects

This structure allows us to maintain a clear history of all translation activities while providing the flexibility to adapt to different translation providers in the future.
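
As a rough illustration of how the three collections relate, the Python sketch below models them as dataclasses and fans a request out into one job per file and language. Every field name here is an assumption; the post names the collections but does not describe their schemas.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class TranslationRequest:
    # The original request with all files and metadata.
    request_id: str
    files: list[dict]
    languages: list[str]
    callback: str


@dataclass
class TranslationJobDetails:
    # One job per (file, target language) pair.
    request_id: str
    file_index: int
    language: str
    status: str = "PENDING"  # hypothetical initial status


@dataclass
class ProjectDetails:
    # Link between a request and the project created at the provider.
    request_id: str
    provider: str
    provider_project_id: str


def fan_out(request: TranslationRequest) -> list[TranslationJobDetails]:
    """Create one job for every combination of file and target language."""
    return [
        TranslationJobDetails(request.request_id, file_index, language)
        for file_index, language in product(range(len(request.files)), request.languages)
    ]
```

With this shape, a request for two files in three languages yields six independent jobs, each of which can succeed, fail, or be retried on its own.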

Error handling & reliability

Reliability was a top priority in our design. We retry failed operations with exponential backoff and route messages that still fail to Dead Letter Topics (DLTs). Any failure that persists after the retries is reported to our Slack channel so that an engineer can intervene.
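
The retry-then-dead-letter pattern can be sketched in a few lines. The Python below is a simplified, in-memory illustration under stated assumptions: the base delay, multiplier, and attempt count are made-up values, and a plain list stands in for the dead letter topic. It is not the configuration Rosetta uses.

```python
import time


def backoff_delays(base_seconds: float, multiplier: float, max_attempts: int) -> list[float]:
    """Delays between attempts: base, base*m, base*m^2, ... (one fewer than attempts)."""
    return [base_seconds * multiplier ** n for n in range(max_attempts - 1)]


def process_with_retries(handler, message, delays, dead_letters, sleep=time.sleep):
    """Call handler(message); on failure wait and retry, then dead-letter the message."""
    attempts = len(delays) + 1
    for attempt in range(attempts):
        try:
            return handler(message)
        except Exception as exc:  # broad on purpose: any failure triggers a retry
            if attempt == attempts - 1:
                # Retries exhausted: park the message for human follow-up.
                dead_letters.append((message, repr(exc)))
                return None
            sleep(delays[attempt])
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; in production the alerting step (Slack, in our case) would hang off whatever consumes the dead-lettered messages.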

The results: Speaking everyone's language

The impact of Rosetta has been transformative:

  1. Elimination of manual processes: No more copying and pasting between Google Sheets and our CMS—translations flow automatically through the system.
  2. Decoupled services: Content-serving services no longer need to understand translation workflows, reducing complexity and maintenance overhead.
  3. Improved scalability: We can now handle significantly more content and languages without performance issues or overwhelming our in-house linguists. We can outsource translation to freelancers and integrate AI more deeply into our workflows, while our in-house linguists focus on maintaining and improving quality. Turnaround time (TAT) has also dropped from a couple of days to mere minutes in some cases.
  4. Better fault tolerance: Event-driven architecture means that temporary outages or slowdowns don't cause system-wide issues.
  5. Provider flexibility: The abstract design makes it easier to integrate additional translation providers in the future.

Most importantly, Rosetta has enabled us to provide more content in more languages, more quickly—directly improving the experience for our global users.

Lessons learned

Building Rosetta wasn't without challenges. Some key lessons we learned:

  1. State management is crucial: When dealing with asynchronous workflows across services, careful state management prevents lost updates and inconsistencies.
  2. Abstractions need boundaries: While we aimed to abstract away translation complexity, we found that too much abstraction can make debugging difficult. We had to strike the right balance.
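
To make the first lesson concrete, one common way to prevent lost updates is optimistic concurrency: every record carries a version, and a write only succeeds if the version it read is still current. The Python sketch below shows the idea with an in-memory store. The class and field names are hypothetical, and the post does not say which technique Rosetta actually uses.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class JobRecord:
    job_id: str
    status: str
    version: int = 0


class StaleUpdateError(Exception):
    """Raised when a writer acts on an outdated copy of the record."""


class JobStore:
    """In-memory stand-in for a database collection."""

    def __init__(self) -> None:
        self._records: dict[str, JobRecord] = {}

    def insert(self, job_id: str, status: str) -> JobRecord:
        record = JobRecord(job_id, status)
        self._records[job_id] = record
        return record

    def get(self, job_id: str) -> JobRecord:
        return self._records[job_id]

    def update_status(self, job_id: str, new_status: str, expected_version: int) -> JobRecord:
        current = self._records[job_id]
        if current.version != expected_version:
            # Someone else wrote first; the caller must re-read and decide again.
            raise StaleUpdateError(f"{job_id}: expected v{expected_version}, found v{current.version}")
        updated = replace(current, status=new_status, version=current.version + 1)
        self._records[job_id] = updated
        return updated
```

In a document database the same check is typically expressed by including the expected version in the update's filter, so that the compare and the write happen as one operation.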

What's next for Rosetta

Rosetta is just getting started. Our roadmap includes:

  1. Additional translation providers: Beyond XTM and DeepL, we plan to integrate other services such as Google Translate and ChatGPT, as well as an in-house content management system that is still a work in progress, to give us more options.
  2. Enhanced monitoring: We're building better observability and monitoring tools to track translation quality and performance.
  3. Automated quality checks: Implementing automated verification of translation quality.

Conclusion: Breaking barriers, building bridges

In today's global marketplace, language should never be a barrier to great experiences. With Rosetta, we've taken a major step toward making Headout truly accessible to travelers from around the world, regardless of what language they speak.

By abstracting away the complexities of translation and creating a flexible, scalable system, we've built more than just a microservice—we've built a bridge that connects our experiences to people worldwide. And in the travel industry, those connections are what really matter.
