TL;DR
Babbel’s transition from its legacy learning content structure to the new, optimised one required the development of a dedicated content delivery service. This shift was necessary because the old pipeline couldn’t support the desired improvements, such as fine-grained learning objectives that help users apply the language in real-life scenarios.
AWS Lambda was chosen for scalability and ease of management. Certain API design patterns and integration strategies ensured a smooth transition and integration with existing infrastructure. Robust error handling mechanisms, including Lambda retry strategies and third-party tools like Rollbar and PagerDuty, ensured reliability. Testing and quality assurance were crucial in the process. Future plans include performance optimisation and content expansion.
What Existed Before
When the user first logs in to the app, Home is the landing page. It consolidates different features in one place, providing a central experience that guides the learner, helps them discover our content and marks their progress. Information is displayed in the form of cards, i.e. the placement test, lessons, reviews, their completions and other learning activities, e.g. a relevant Live class or a podcast, based on the learner’s needs. This page forms the user’s learning path. Note that the learning path indicates the user’s current position within their chosen language syllabus; it is represented through both the Today, i.e. “Up next”, widget and the Learning plan tab, each serving a distinct purpose in the user’s learning journey.
In the beginning, each client platform (web, iOS and Android) computed the path independently, driving the business logic for calling different services (content or progress APIs) and then putting all of these cards together. This compute cycle of requesting information and transforming the data happened three times (once per platform) and occasionally resulted in performance issues, inconsistencies or even race conditions, causing a delta in the overall experience because the business logic lived in three different places.
The prominent nature of the feature combined with a growing user base made it necessary to simplify all this into a separate service, i.e. a backend-for-frontend design pattern, responsible for helping mobile and web platforms retrieve the information they need to display the user’s path in a consistent manner (see Figure 1).
In the AWS ecosystem, we employed a Lambda function, integrated the necessary dependencies, i.e. legacy content endpoints, wrote an API that implements learning paths, then exposed it through API Gateway, eventually allowing mobile and web clients to consume it. This service has been launched and maintained for some years now, enabling our team to build and test features, to run experiments for validating certain hypotheses and ultimately to serve Babbel’s lesson content, i.e. lesson-type activities – the backbone of our content, with everything else helping us to personalise and vary the experience. All this is based on content structured in a specific course-based fashion: for a given language combination, a course overview contains different courses, and a course enumerates a – most of the time extensive – list of progressive lessons, with courses ranging from A1 to C1 CEFR levels (see Figure 2). In the context of modern language learning, this structure could be perceived as somewhat overwhelming.
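As a rough sketch of the backend-for-frontend idea, the aggregation step – merging lesson metadata from the content API with the user’s completions from the progress API into the cards the clients render – could look like the following. All names, fields and stub data are illustrative, not the actual API contract.

```python
import json

def build_cards(lessons, completions):
    """Merge lesson metadata with the user's completion state into the
    card list the clients render on Home."""
    completed = {c["lesson_id"] for c in completions}
    return [
        {"id": l["id"], "title": l["title"], "completed": l["id"] in completed}
        for l in lessons
    ]

def handler(event, context):
    """Lambda entry point. In the real service, lessons and completions
    would be fetched from the legacy content and progress endpoints; they
    are stubbed here to keep the sketch self-contained."""
    lessons = [
        {"id": "lesson-1", "title": "Greetings"},
        {"id": "lesson-2", "title": "Numbers"},
    ]
    completions = [{"lesson_id": "lesson-1"}]
    return {
        "statusCode": 200,
        "body": json.dumps({"cards": build_cards(lessons, completions)}),
    }
```

Centralising this merge in one Lambda means all three client platforms receive identical cards instead of each recomputing them.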
Why We Did Not Want to Change the Existing Services: Our Source of Information Changed
To improve the way we guide users and lead them towards success, we have conducted several discovery cycles to validate the desirability of a new structure. Our goal was to present our learning material in more manageable portions, meaningfully organised by learning objectives, and to create shorter milestones aiming at a clearer sense of progress for maintaining motivation. As a user-facing team tasked with the discovery and delivery of end-to-end features primarily in the app’s home page to help users form a language learning habit, this proved to be an intriguing challenge.
The decision to embark on a new initiative was driven by the desire to provide richer learning experiences to our learners. The existing course-based structure and the services providing the legacy content were not suited for this purpose due to technical limitations and their inability to accommodate the new learning experience we wanted to bring forward – they support only learning activities of type lesson. We also had to take into account that we continuously optimise our content based on user feedback. It was therefore important to design a solution that lets our learning content teams edit and version lessons (e.g. splitting lesson X into lessons Y and Z); without it, maintaining the pointers of a specific L2 lesson for all L1 languages, where L2 is the language being learned and L1 the user’s display language, can prove cumbersome.
Additionally, since the content is crafted and carefully curated by didactic experts, we are accounting for a process that demands considerable time and resources, especially when lessons from 16 languages ranging from newcomer to advanced levels need to be optimised. Modifying the existing services was therefore neither practical nor feasible. At the same time, the time span needed to develop the new content structure for a given language, i.e. the division of courses into shorter sets of learning activities with a specific learning objective – what we call at Babbel a unit – made a clear case for a new data model (see Figure 3).
At this point we knew that we would have to support the old and new content structures simultaneously. Decoupling these responsibilities into different services adhering to different data models was therefore our best bet: rolling out an improved learning experience to eligible users as it becomes available (see Figure 4), gracefully serving the older content for languages or proficiency levels that have not been optimised yet, and preserving the capability to roll something back in order to iterate on it. Considering these constraints, and the fact that mobile upgrades – where the initial release occurred – happen at regular intervals, leaving existing users on older app versions with legacy content only, it was crucial to enable an experience that seamlessly transitions between the old and new content structures.
Modelling a New Service
Below, we discuss how we approached the development of our new service within the constraints of the allocated time and resources. It’s important to note that the technical report that follows intends to inform rather than dictate best practices and patterns. Despite certain limitations, we strived to make the most of what we had available.
Why We “Stayed” with AWS Lambda
We used AWS Lambda to power our new service for three main reasons:
- Scalability: AWS Lambda automatically scales with incoming workloads, making it perfect for services serving a growing user base with fluctuating traffic. No manual intervention needed; it keeps things running smoothly.
- Simplified Management: With serverless architectures like Lambda, we abstract away server and infrastructure management, letting us focus on the application and business logic, while minimising the infrastructure fuss.
- Easy Deployment: Lambda enables deploying new code versions easily. At Babbel, we’ve crafted internal tools to manage Lambda versions in staging and production environments, ensuring agility for frequent updates and feature additions.
API Design Patterns
When introducing a new service, particularly one that supports user-facing features, it is crucial to consider seamless integration with the existing infrastructure and the management of dependencies on the services we consume data from. Below, we list the API design patterns we followed.
Microservice Architecture & Data Storage
We adopted a microservice architectural pattern for the functionalities our service performs. Specifically, to handle routing traffic to older content when optimised content for specific CEFR levels and language combinations is not available, we delegate several business-logic tasks to a few Lambdas: determining the user’s proficiency level and preferred learning language, storing their eligibility status in an AWS DynamoDB table, and – if the user is eligible – fetching optimised content from content endpoints and transforming it into the agreed-upon API contract for mobile apps and web consumers. It’s also important to note that the new service redirects traffic to legacy content managed by an older service if the user eligibility criteria are not met (see Figure 5).
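The routing decision described above can be sketched as follows. The availability table, service names and user fields are hypothetical stand-ins; in the real service the eligibility status is persisted in DynamoDB rather than held in memory.

```python
# Which CEFR levels have optimised content per (display language, learning
# language) combination -- illustrative data, not the real availability.
OPTIMISED_AVAILABILITY = {
    ("ENG", "SPA"): {"A1", "A2"},
}

def is_eligible(l1, l2, cefr_level):
    """A user is eligible only when optimised content exists for their
    display language, learning language and proficiency level."""
    return cefr_level in OPTIMISED_AVAILABILITY.get((l1, l2), set())

def route(user):
    """Route eligible users to optimised content, everyone else to legacy."""
    if is_eligible(user["l1"], user["l2"], user["level"]):
        return "optimised-content-service"
    return "legacy-content-service"
```

The important property is the default: any combination not yet optimised silently falls through to the legacy path, so rollout can proceed language by language.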
API Gateway, Versioning & Documentation
We leverage API Gateway to manage and expose our API, acting as a single entry point for clients to interact with the new service. API Gateway directs requests to specific Lambda functions based on the requested API version and method. Establishing this connection essentially tells API Gateway which Lambda function to invoke when a specific versioned resource and method are requested, e.g. a GET on /v1/resource for API version 1. To address future changes, we plan to maintain forward compatibility by including older properties in the API response and making new properties optional. Across engineering at Babbel, we maintain all services’ documentation in GitHub repos and update them as needed to reflect different versions and resource paths.
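The compatibility rule – newer responses keep every older property and only add optional ones – could be expressed as in the sketch below; the field names are hypothetical, not our actual contract.

```python
def serialize_card_v1(card):
    """v1 contract: the baseline fields every client can rely on."""
    return {"id": card["id"], "title": card["title"]}

def serialize_card_v2(card):
    """v2 keeps all v1 properties and only adds optional ones, so an older
    client reading a v2 payload still finds everything it expects."""
    payload = serialize_card_v1(card)
    if "objective" in card:  # new property, deliberately optional
        payload["objective"] = card["objective"]
    return payload
```

Building v2 on top of v1 makes the superset guarantee structural rather than a convention reviewers have to remember.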
Logging, Monitoring & Observability
Following the service deployment, we utilised AWS X-Ray to trace segments and identify bottlenecks in incoming requests. To gain insight into the service’s performance and usage, and to troubleshoot issues, we put together dashboards powered by AWS CloudWatch for logging and monitoring, including metrics such as p95 and p99 latencies (the response times within which 95% and 99% of requests complete).
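For reference, a p95/p99 figure like the ones on such dashboards is simply a percentile over request timings. CloudWatch computes this internally; a minimal nearest-rank version for intuition:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile: the latency under which p% of the
    observed requests completed."""
    ordered = sorted(latencies_ms)
    rank = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[rank]
```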
Integration of Dependencies into Existing Infrastructure
As mentioned, at Babbel we document our existing endpoints, including content and user progress tracking, along with their respective methods. This guide allowed us to identify the dependencies necessary for the new service and incorporate them using RESTful APIs. To ensure error handling and exception management, we implemented the following strategies:
- Lambda Retry Mechanisms: We have established Lambda retry mechanisms to handle transient failures and retries when communicating with these dependencies.
- Integration of Rollbar: We have integrated a Rollbar project into our workflow to log errors and exceptions promptly.
- Enabling PagerDuty Alerts: To address multiple erroneous occurrences within short timeframes or new types of errors, we have enabled PagerDuty alerts.
- Fallbacks: In cases where an upstream service experiences timeouts or other issues, we have implemented fallback mechanisms as fail-safes. This allows us to return default values, ensuring that our pipeline remains functional even in the face of dependency-related failures.
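Combining the first and last of these strategies, a retry-with-fallback wrapper might look like the following sketch. The function name, defaults and bare `except` are illustrative; our actual clients, and the Lambda runtime itself, handle more cases.

```python
import time

def fetch_with_fallback(fetch, retries=2, backoff_s=0.1, fallback=None):
    """Retry transient failures with exponential backoff; if every attempt
    fails, return a safe default instead of propagating the error."""
    for attempt in range(retries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == retries:
                return fallback  # fail-safe: keep the pipeline functional
            time.sleep(backoff_s * 2 ** attempt)
```

In production, the swallowed exception would also be reported to Rollbar so that repeated fallbacks still surface as alerts.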
Testing & Quality Assurance (QA) Processes
During development, we followed test-driven principles, focusing on robust unit test coverage. We integrated CI/CD using GitHub Actions and used CodeClimate to maintain code quality. After delivering optimised content to mobile/web consumers, we launched a bug-fixing sprint, working closely with our QA Analyst to address minor issues and make enhancements. To streamline testing, we temporarily disabled certain user validations and conducted a comprehensive end-to-end QA feedback loop in which our QA team explored different parts of the product to identify potential implications of the new feature.
Future Directions
Some initial ideas to improve the user experience in their language learning journey include topics like:
- Performance: We want to reduce latency and improve response times. One way to do this is by caching responses for a set period. We’re also optimising our code quality and execution for faster API responses through small refactoring cycles guided by CodeClimate reports.
- Coverage: We consider this feature crucial, so we’re building integration tests to ensure smooth dependency management. Performance testing will also ensure our service meets expected response times and can handle increased traffic as we introduce more optimised content for different CEFR levels and language combinations.
- Content Expansion: We have already released optimised content to new users learning Spanish & Mexican Spanish. Soon, we’ll introduce it to more users, including existing users of all levels, in more language combinations. As more levels and language combinations become available, we’ll add them to our system.
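The caching idea from the Performance point above could start as simple as a TTL cache in front of the upstream call. This is a sketch only; the cache period, keying scheme and single-process scope are assumptions, not a decided design.

```python
import time

class TTLCache:
    """Cache computed responses for a fixed period to cut repeated
    upstream calls and shave latency off hot paths."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key, compute):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]  # fresh hit: skip the upstream call
        value = compute()
        self._store[key] = (now, value)
        return value
```

A per-Lambda in-memory cache like this only helps across invocations on a warm container; a shared tier (e.g. a managed cache in front of the upstream) would be the next step if hit rates matter.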
Header Photo by Martin Sanchez on Unsplash.