This is the story of how we set up our infrastructure for painless scale.
Banners, banners everywhere!
We help companies comply with GDPR and ePrivacy, two European regulations with global reach. Among many new obligations, these regulations require companies to collect user consent before collecting and processing personal data.
When we are deployed on a website, our job is to make sure that user consent is collected the right way and then shared with all the third parties that will access user data (advertising, analytics, CRM, etc.), so that the whole chain stays in compliance.
A typical banner to collect user consent looks like this:
We offer different formats and variations to make sure that our banners fit in, and we keep them short and to the point while staying compliant with the regulation.
Our clients deploy our SDK (a standard <script> tag that we host) to their websites, and it gets loaded on every page. That leads to a few constraints:
- Our SDK needs to run in environments that we do not control at all. That means not polluting the DOM, coexisting with a wide range of libraries, etc. That is not always easy, especially since our SDK does a lot more than an analytics script that simply collects data: we display widgets, expose a public API on the page, and more.
- We need to be fast. Because we must collect consent before the website or third parties can start collecting data or setting cookies, we have to load and execute as fast as possible so that we do not delay other operations on the page.
- The scale of our operations grows quickly, as new clients deploy us to websites that already have an established audience.
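The first constraint (not polluting the page we run on) can be illustrated with a minimal sketch: keep all internal state inside a closure and attach a single namespaced object to the page. The ConsentSDK name and its two methods are hypothetical, not our actual public API.

```javascript
// Minimal sketch of a single-global SDK footprint.
// "ConsentSDK" and its methods are illustrative names.
(function (global) {
  'use strict';

  // All internal state stays inside this closure,
  // invisible to the host page and its libraries.
  var consentStatus = null;

  var api = {
    getConsentStatus: function () {
      return consentStatus;
    },
    setConsentStatus: function (status) {
      consentStatus = status;
    }
  };

  // Attach exactly one name to the page, and avoid clobbering
  // an object the site may have created before the SDK loaded.
  global.ConsentSDK = global.ConsentSDK || api;
})(typeof window !== 'undefined' ? window : globalThis);
```

Everything else (widgets, network calls, storage) hangs off that one namespace, so the SDK can coexist with whatever libraries the host page already uses.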
Looking at these constraints, we knew we wanted something that would rely on very scalable services that could help us grow.
How did we optimize our SDK distribution?
Most of the complexity of our SDK comes from its distribution: getting it to the user in the most efficient way (in terms of both speed and cost) and getting data back to our servers.
We had three guiding rules while building out our infrastructure.
1) Only load what’s necessary
We lazy-load modules to ensure that we only ship the code that is relevant to the end user and the website. For instance, we only load the code for the banner format that a given website actually uses, and never the formats it does not.
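A minimal sketch of that idea: map each banner format to its own code-split chunk and only fetch the one the site's configuration asks for. The module paths and the bannerFormat field are illustrative assumptions, not our real configuration schema.

```javascript
// Minimal lazy-loading sketch. Module paths and the
// "bannerFormat" config field are illustrative.
function bannerModulePath(config) {
  // Each supported banner format lives in its own chunk.
  var modules = {
    popin: './banners/popin.js',
    bottom: './banners/bottom.js'
  };
  return modules[config.bannerFormat] || null;
}

function loadBanner(config) {
  var path = bannerModulePath(config);
  if (path === null) {
    return Promise.reject(new Error('Unknown banner format'));
  }
  // Dynamic import: the browser only downloads the chunk
  // for the format this website actually uses.
  return import(path);
}
```

The win is that a site using a single banner format never pays the download or parse cost of the others.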
2) Stay close to the end user
When serving compressed static resources, the only way to make the HTTP requests faster is to reduce the distance that they have to travel. If your servers are located in Europe and users are in the US, there is only so much you can achieve in terms of latency. So we use a CDN (CloudFront) to make sure that the end user hits servers located close to them.
We also make sure that the resources served by the CDN are replicated in different datacenters both for reliability and speed purposes.
3) Minimize HTTP requests
HTTP requests are fundamentally slow, and it is even worse on mobile, so minimizing the number of requests that our SDK sends was an important goal.
The other element that was important to us was bundling HTTP requests together when possible. For instance, our SDK needs the country of the user, which is determined from their IP address. That could be done with an HTTP request from the SDK to an API server, but that is an extra round trip that we wanted to avoid. Instead, we use Lambda@Edge to modify the content of the SDK on the fly and inject the country of the user into it. We then cache one version of the SDK per country.
Our final infrastructure is the following:
These are some of the optimizations that we added to make sure that our SDK could be loaded as fast as possible. We also spent time setting up a scalable architecture for our API servers that store consents and various events. We’ll talk about that in a separate post!