
Recording Production Traffic Using Service Workers

Yigal Dviri · Published in Loadmill · Oct 22, 2019

We’ve been building Loadmill’s testing platform for a while now. What started as a load testing tool evolved into an API testing platform and has now gained a new superpower: Automatic Regression Testing!

This came naturally and made a lot of sense: since we already enable developers to record and replay API tests in their QA environment, why not use that ability to create tests from real user behavior?

Automatic Regression Testing: how do you start this magic?

Data powder (spoiler alert: this was our main challenge):
This is the crucial ingredient: what is happening in production? What are your users actually doing? Their journeys and behavior are the key to meaningful regression test coverage.

A dash of the secret sauce:
Loadmill’s algorithm processes the data recorded from production and converts it into a replayable test scenario. We’ve got this!

A magic execution wand: A tool for running and analyzing the replayable test scenarios. We’ve got that!

As you’ve probably noticed, obtaining the usage data is the MAIN challenge here, and that is what this post is about!

For a couple of reasons, we decided to record the traffic on the client side rather than on the server. While researching ways to capture network traffic on the client side, we started looking into Service Workers. Reading their description, it felt like a perfect match for our needs:

Service workers essentially act as proxy servers that sit between web applications, the browser, and the network (when available). They are intended, among other things, to … intercept network requests …

For those of you who aren’t familiar with Service Workers:

A service worker is a script that your browser runs in the background, separate from a web page…
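To make the idea concrete, here is a minimal sketch of the kind of recording worker we had in mind: it intercepts every fetch the page makes, lets the request through to the network, and reports the request/response pair to a recorder endpoint. The RECORDER_URL and the payload shape are made up for illustration; the actual worker does quite a bit more.

```javascript
// A minimal sketch of a recording service worker (not Loadmill's actual code).
// RECORDER_URL and the payload shape below are illustrative placeholders.
const RECORDER_URL = 'https://recorder.example.com/transactions';

self.addEventListener('fetch', (event) => {
  event.respondWith(
    fetch(event.request).then((response) => {
      // Clone the response so the page still receives the original body stream.
      const copy = response.clone();
      // Keep the worker alive until the transaction has been reported.
      event.waitUntil(
        copy
          .text() // good enough for a sketch; binary bodies would need different handling
          .then((body) =>
            fetch(RECORDER_URL, {
              method: 'POST',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify({
                url: event.request.url,
                method: event.request.method,
                status: copy.status,
                responseBody: body,
              }),
            })
          )
          .catch(() => {
            /* never break the page because recording failed */
          })
      );
      return response;
    })
  );
});
```

Note that fetches issued from inside the service worker itself (like the report to the recorder) are not intercepted again, so there is no recursion here.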

We decided to use Service Workers and thought of the following architecture:

Loadmill service-worker architecture

This way, all the customer needs to do is add a script tag to their HTML file, and we’ll do the rest. Easy, right? Well… not so easy.
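For reference, the intended integration was supposed to be roughly this simple. The file names and paths in this sketch are placeholders, not Loadmill’s real ones:

```javascript
// What the snippet loaded by the customer's <script> tag might do (illustrative only).
if ('serviceWorker' in navigator) {
  // The worker script has to be served from the customer's own origin
  // (e.g. under /service-worker); the reason why is the first challenge below.
  navigator.serviceWorker
    .register('/service-worker')
    .then((reg) => console.log('Recording worker registered with scope:', reg.scope))
    .catch((err) => console.warn('Recording worker registration failed:', err));
}
```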

The first challenge — Domain restriction

As shown in the last diagram, we want our customers to load (register, in Service Worker jargon) our Service Worker on their website. However, you can’t load a Service Worker’s JS from another domain.

Furthermore, the Service Worker will only listen to ‘fetch’ events triggered within the scope it was downloaded from. For example, if you registered your Service Worker from www.customer-domain/scopeA, it will intercept all the www.customer-domain/scopeA/* requests. Requests for www.customer-domain/scopeB/* won’t be intercepted!
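In code, the scope rule looks roughly like this (the paths are illustrative):

```javascript
// A registration's scope defaults to the directory the worker script was served from.
navigator.serviceWorker.register('/scopeA/sw.js');
// Controls pages under /scopeA/* only; pages under /scopeB/* are not intercepted.

// To cover the whole site, the script must be served from the root
// (widening the scope beyond the script's location also requires the
// Service-Worker-Allowed response header on the script).
navigator.serviceWorker.register('/sw.js', { scope: '/' });
```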

If we had to host the Service Worker on the customer’s domain, they would have to deploy a new version during our integration and again for every update to the Service Worker’s script file, and that’s not very elegant. So, somehow, we have to maintain a www.customer-domain/service-worker route without hosting it on the customer’s server. How can you do that?

CDN to the rescue

Let’s think about which component might sit in the middle, between the client and the server… a CDN! We can intercept the request at the CDN, and if a request is made for /service-worker, we simply return the Service Worker script from our servers!

For this purpose, we created an app in Cloudflare’s app store that adds a Cloudflare Worker (which works very much like a service worker) to our customer’s app and does exactly that. Now our customers can add our Service Worker in just one click.
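A stripped-down sketch of what such a Cloudflare Worker does, using the script URL mentioned below; the real app has more logic and the response handling here is simplified:

```javascript
// Serve the service-worker script from Loadmill's origin while the browser
// still sees it as coming from the customer's own domain.
addEventListener('fetch', (event) => {
  const url = new URL(event.request.url);
  if (url.pathname === '/service-worker') {
    event.respondWith(
      fetch('https://echo.loadmill.com/lm-worker-script').then(
        (res) =>
          new Response(res.body, {
            status: res.status,
            // Browsers require a JavaScript MIME type for service-worker scripts.
            headers: { 'Content-Type': 'application/javascript' },
          })
      )
    );
  } else {
    // Everything else passes through untouched.
    event.respondWith(fetch(event.request));
  }
});
```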

Registering a service worker from a cross-origin URL

In case our customer doesn’t have a CDN (get one…), we ask for just one route that returns this simple line: importScripts("https://echo.loadmill.com/lm-worker-script"), which downloads our worker.

Adding our custom route to your application takes a few more minutes, but it’s still a pretty simple way to integrate our service worker.
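For example, with Express the whole integration could be a single route like this (Express is just an example here; any server framework works the same way):

```javascript
// A sketch of the single route a customer without a CDN would add.
const express = require('express');
const app = express();

app.get('/service-worker', (req, res) => {
  // A one-line worker that pulls the real logic from Loadmill's servers.
  res.type('application/javascript');
  res.send('importScripts("https://echo.loadmill.com/lm-worker-script")');
});

app.listen(3000);
```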

The second challenge — AWS & XHR

We chose AWS Kinesis Firehose as the recorder endpoint to which we send all of the user’s transactions (transaction = request + response). The easiest way to work with AWS should be the AWS SDK, right? Well… that’s about as true as saying the AWS console has a good UX.

Our main problem was that the AWS SDK uses the old XMLHttpRequest web API, which is not available in the service worker scope (it was deemed deprecated by the service worker spec authors). We had to write some shim code to replace the XHR object with fetch under the hood. AWS will switch to the newer fetch API in the next major version (v3), which is currently in developer preview, but it is unclear exactly when it will be released.
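The shim boils down to something like this: a minimal XMLHttpRequest lookalike backed by fetch, installed on the worker’s global scope. This is only a rough sketch of the idea; the real shim covers much more of the XHR API surface (headers, errors, timeouts, and so on).

```javascript
// Rough sketch: expose an XMLHttpRequest-shaped object in the service worker
// scope, implemented with fetch, so SDKs that expect XHR can still run.
self.XMLHttpRequest = class {
  open(method, url) {
    this._method = method;
    this._url = url;
    this._headers = {};
  }
  setRequestHeader(name, value) {
    this._headers[name] = value;
  }
  send(body) {
    fetch(this._url, { method: this._method, headers: this._headers, body })
      .then((res) =>
        res.text().then((text) => {
          this.status = res.status;
          this.responseText = text;
          this.readyState = 4; // DONE
          if (this.onreadystatechange) this.onreadystatechange();
          if (this.onload) this.onload();
        })
      )
      .catch((err) => {
        if (this.onerror) this.onerror(err);
      });
  }
};
```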

The third challenge — Test generation

Correlations between requests

In order to replay what took place in production, we need to detect the relationships between the requests the user made. We have to do so because, in most cases, you can’t just replay what happened in production as-is.

Let’s consider the following scenario:

A typical scenario in production

The user made a POST request, got a new item id back in the response, and then made a GET request using that id. Simple, right? Two requests: POST to /items and GET to /items/123. Let’s replay it in the staging environment, then.

Replaying production scenario as is in stage

This simple example won’t work. You can’t just blindly request /items/123, since you don’t have an item with ID 123 in the staging environment. What you need to do is understand the relationship between the two requests and extract it into a parameter that is dynamically evaluated from the actual response value and used by the next request: a correlation.

Detecting a correlation and extracting it to a parameter
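Here is a toy version of that idea, just to make the transformation concrete. The real correlation detection is far more general than this hard-coded pair; the parameter name item_id is purely illustrative.

```javascript
// If a value from one response shows up in a later request, extract it into a
// parameter so the replay uses whatever the test environment actually returns.
const recorded = [
  { method: 'POST', url: '/items', responseBody: { id: 123 } },
  { method: 'GET', url: '/items/123' },
];

function correlate(transactions) {
  const [first, second] = transactions;
  const id = String(first.responseBody.id);
  if (second.url.includes(id)) {
    return [
      // Save the id from the first response into a named parameter...
      { ...first, extract: { item_id: 'responseBody.id' } },
      // ...and reference that parameter instead of the hard-coded production value.
      { ...second, url: second.url.replace(id, '${item_id}') },
    ];
  }
  return transactions;
}

console.log(correlate(recorded));
// The replayed scenario becomes: POST /items, then GET /items/${item_id},
// where item_id is evaluated at replay time from the actual POST response.
```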

User data obfuscation

Obviously, it’s a bad idea to store private user data (security-wise, GDPR-wise, and plain good manners). We hash all of it in an irreversible way that still lets us keep track of the correlations between requests. You can read more about it here.
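Conceptually, deterministic hashing is what makes this work: the same production value always maps to the same opaque token, so relationships between requests survive even though the original data can’t be recovered. The sketch below uses a plain SHA-256 digest for illustration; a real deployment would also salt or key the hash, and Loadmill’s actual scheme is described in the linked post.

```javascript
// Hash a sensitive value into a stable, irreversible token (illustrative only).
async function obfuscate(value) {
  const data = new TextEncoder().encode(value);
  const digest = await crypto.subtle.digest('SHA-256', data);
  // Hex-encode the digest so it can safely replace the original string.
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

// Both occurrences of "alice@example.com" (in request A and request B) map to
// the same token, so the correlation between the two requests is preserved.
obfuscate('alice@example.com').then(console.log);
```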

Conclusion (my 2¢)

I won’t lie to you; writing a Service Worker was harder than I expected. There were behavioral differences between browsers, and while looking for documentation across the web, we ran into a lot of outdated or incomplete information. We found ourselves asking StackOverflow questions about issues that seemed trivial enough to be covered in the documentation, but weren’t.

I feel like the service-worker ecosystem is still in its bleeding-edge phase, and to be honest, I think that by now a tool this awesome should be more developer-friendly. I know that many people are working hard to improve this. Hopefully, we’ll get there soon.

