Low Latency Serverless Typeahead using CloudFlare Workers

Helping users find what they want quickly is essential for a good user experience. A typeahead input box does this, but to be effective it needs to respond fast.

Typeahead search at www.signasl.org

The fastest solution would be to have all the search entries available locally in the browser. But this website currently has 41,046 entries, taking 508KB of data (gzip compression helps, but would still require 167KB to be sent over the wire). Sending this amount of data is not only slow, but also costly. With over 200,000 monthly users, that would result in roughly 100GB of chargeable egress per month (33GB if compressed). Unlike website images, this data cannot be aggressively cached. For the typeahead search to be most useful, the list needs to be regularly updated, not only with the latest search terms but also in its ordering: commonly searched or trending words need to be given priority over less frequently searched words. To do this I update the search terms JSON file every 20 minutes.
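As a quick sanity check of those egress figures, here is the arithmetic as a small sketch. It assumes each monthly user downloads the file exactly once and uses decimal units (1GB = 1,000,000KB); both are assumptions, not measured values.

```javascript
// Figures from the article; one download per user per month is an assumption.
const monthlyUsers = 200000;
const rawSizeKB = 508;
const gzipSizeKB = 167;

// Convert KB per download into GB per month (decimal units: 1 GB = 1,000,000 KB).
function monthlyEgressGB(users, sizeKB) {
    return (users * sizeKB) / 1e6;
}

const rawGB = monthlyEgressGB(monthlyUsers, rawSizeKB);
const gzipGB = monthlyEgressGB(monthlyUsers, gzipSizeKB);
console.log(rawGB.toFixed(1) + ' GB uncompressed, ' + gzipGB.toFixed(1) + ' GB gzipped');
```

Repeat visits after each 20 minute refresh would push the real number higher.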

So how can we make the typeahead fast without burdening the end user with a large JSON file? The trick is to bring that JSON file as close to the end users as possible and then transmit only the entries they need, as and when they request them. To do this I will use CloudFlare Workers, allowing me to leverage CloudFlare's global network of data centres. A Worker allows for a script size of up to 1MB, which is enough to store the whole dataset of search terms inside the script itself. This script is then deployed to their edge locations around the world. The Worker takes each search query request and filters the data, returning just a few rows to be displayed. CloudFlare's free tier provides up to 100,000 Worker requests per day, meaning that this performant, low latency typeahead can be deployed for free.
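On the browser side, this means requesting only the matches for the current input rather than downloading the whole list. The following is a minimal sketch of that idea, not the site's actual client code: buildSearchUrl, debounce and fetchSuggestions are hypothetical helper names, and the 150ms delay and the #search element id are assumptions. The /wordsearch.php path and the query parameter are the ones the Worker in this article responds to.

```javascript
// Hypothetical helper: builds the request URL for the current input value.
function buildSearchUrl(query) {
    const params = new URLSearchParams({ query: query.trim() });
    return '/wordsearch.php?' + params.toString();
}

// Hypothetical helper: delays calls until the user pauses typing,
// which also keeps the number of Worker requests down.
function debounce(fn, delayMs) {
    let timer = null;
    return (...args) => {
        clearTimeout(timer);
        timer = setTimeout(() => fn(...args), delayMs);
    };
}

// Fetches suggestions; the Worker replies with a JSON array of strings.
async function fetchSuggestions(query) {
    const response = await fetch(buildSearchUrl(query));
    if (!response.ok) throw new Error('Typeahead request failed: ' + response.status);
    return response.json();
}

// Example wiring (browser only; '#search' is an assumed element id):
// const input = document.querySelector('#search');
// input.addEventListener('input', debounce(async () => {
//     console.log(await fetchSuggestions(input.value));
// }, 150));
```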

Setup

Select Workers from the CloudFlare menu on the left, then choose Create a Service.

Enter a name and choose HTTP Handler.

Next, under Triggers, set up the path that this Worker should respond on. In my case this was www.signasl.org/wordsearch.php* so that it could replace the existing server side method. This means the CloudFlare Worker nearest to the user will respond to the request, rather than the request having to travel to the single server in the eu-west-1 region. I also set the Request limit failure mode to Fail open (proceed). This way, if I exceed the free tier allowance, the typeahead will continue to work by falling back to its original (albeit slower) functionality.

The next step is to automate the updating of the Worker script, including the large JSON array of search data. To do this you need to make a note of your CloudFlare AccountId and the name of the Worker script, and create credentials (an API token) that have permission to update this Worker script.

For me, I already have an existing C# AWS Lambda Function that I use to calculate recently searched data to make the typeahead ordering more helpful. This function already outputs the search options as a JSON file to S3. So I will add in an API call to this existing function to make an update to the Worker script.

The basic curl command is:

curl -X PUT "https://api.cloudflare.com/client/v4/accounts/YOUR-ACCOUNT-ID/workers/scripts/typeahead-asl" \
     -H "Authorization: Bearer YOUR-TOKEN-HERE" \
     -H "Content-Type: application/javascript" \
     --data "SCRIPT DATA HERE"

My C# code in part looks like:

// Upload the generated Worker script (word list plus handler) as JavaScript.
var requestContent = new StringContent(GetWorkerScript(jsonString), Encoding.UTF8, "application/javascript");
client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", CloudFlareWorkerBearerToken);
var response = client.PutAsync($"https://api.cloudflare.com/client/v4/accounts/{CloudFlareAccountId}/workers/scripts/{scriptName}", requestContent).Result;
response.EnsureSuccessStatusCode(); // Throw if the upload was rejected.
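The body of GetWorkerScript is not shown in this post. As a rough illustration only, assuming it simply prepends the serialised word list to a fixed handler, the same idea in JavaScript might look like the sketch below. buildWorkerScript and handlerSource are hypothetical names used for this example.

```javascript
// Hypothetical stand-in for the fixed request handler source code.
const handlerSource = 'addEventListener("fetch", event => { /* ... */ });';

// Embeds the ordered word list as a JavaScript array literal at the top of the script.
// JSON.stringify escapes quotes and backslashes, so a word cannot break out of the literal.
function buildWorkerScript(words, handler) {
    return 'var globalData = ' + JSON.stringify(words) + ';\n\n' + handler;
}

const script = buildWorkerScript(['WORD', 'LIST', 'ARRAY'], handlerSource);
console.log(script.split('\n')[0]);
```

Serialising with a JSON encoder rather than joining strings by hand matters here, because search terms can contain quotes or other characters that would otherwise produce an invalid script.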

The GetWorkerScript function produces the JavaScript output:

var globalData = ["WORD","LIST","ARRAY"];

addEventListener("fetch", event => {
    event.respondWith(handleRequest(event.request))
})

async function handleRequest(request)
{
    const url = new URL(request.url);
    // Pass anything other than the typeahead path through to the origin.
    if (!url.pathname.startsWith('/wordsearch.php')) return fetch(request);

    const query = (url.searchParams.get('query') || '').toLowerCase();
    
    var data = globalData;
    if (query != '')
    {
        data = data.filter(word => word.toLowerCase().includes(query));

        data = data.sort((a, b) => {
            // Give priority to exact matches
            if (a.toLowerCase() == query)
            {
                return -1;
            }
            if (b.toLowerCase() == query)
            {
                return 1;
            }

            // Then prefer words where the match starts earlier.
            // Every word left after the filter contains the query, so both positions are >= 0.
            const aPos = a.toLowerCase().indexOf(query);
            const bPos = b.toLowerCase().indexOf(query);

            if (aPos < bPos)
            {
                return -1;
            }
            else if (aPos > bPos)
            {
                return 1;
            }

            // Same position: return 0 so the stable sort keeps the pre-ordered (most searched first) ranking.
            return 0;
        });

        if (data.length > 30)
        {
            data.length = 30; // Return just a small number of results for this specific search.
        }
    }
    else
    {
        // If there is no search query then return a large number of results to initially populate the typeahead search.
        data = data.slice(0, 4000);
    }

    const response = new Response(JSON.stringify(data));
    response.headers.set('content-type', 'application/json;charset=UTF-8');
    return response;
}

The first line contains a global variable that lists all the search words, ordered by how commonly each word is searched for. This global variable persists between executions, acting as a super fast in-memory cache and making subsequent requests lightning fast.

Only one Workers instance runs on each of the many global Cloudflare edge servers. Each Workers instance can consume up to 128MB of memory. Use global variables to persist data between requests on individual nodes as a super fast cache. Note: As the global variable is shared across requests, you must make sure you do not edit its content as that would then affect subsequent requests.
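To make the warning about shared state concrete, here is a small sketch using a three word stand-in for the real list. It relies only on standard array behaviour: filter() and slice() return new arrays, while sort() and assigning to length modify the array they are called on.

```javascript
// A tiny stand-in for the Worker's global word list (pre-ordered by popularity).
var globalData = ['make', 'animal', 'man'];

// Safe: filter() returns a new array, so sorting or truncating the result
// leaves globalData untouched.
const matches = globalData.filter(word => word.includes('ma'));
matches.sort();
matches.length = 2;

// Unsafe would be globalData.sort(): it works in place and would reorder the
// shared list for every later request on this instance. Copy first instead.
const sortedCopy = [...globalData].sort();

console.log(globalData); // still in its original popularity order
```

This is why the Worker script can sort and truncate freely after the filter call, but must use slice() rather than setting length in the branch that works on the full list.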

A basic filter at the beginning makes the Worker respond only to the expected requests and pass anything else through to the origin. In practice it should never see other requests, as I have specified a specific path in the route setup.

Then it uses the array filter function followed by a custom sort, which allows me to prioritise words where the match appears nearer the beginning. Someone searching for ma is much more likely to be looking for man or mankind than for animal. If the position of the search term is the same in both words, the ordering is left unchanged, so the pre-ordered list still returns the most commonly searched words first.
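The ranking rule can be tried out on its own with a compact sketch. This is a condensed restatement of the same idea rather than the exact Worker code: rank and rankMatches are hypothetical names and the sample words are made up. It depends on Array.prototype.sort being stable, which the language guarantees since ES2019.

```javascript
// Rank of a word for a query: exact matches first, then by match position.
function rank(word, query) {
    const lower = word.toLowerCase();
    return lower === query ? -1 : lower.indexOf(query);
}

// Filters and orders a pre-ranked word list for a lower-case query.
function rankMatches(words, query, limit = 30) {
    return words
        .filter(word => word.toLowerCase().includes(query))
        // The sort is stable, so words with equal rank keep their input order.
        .sort((a, b) => rank(a, query) - rank(b, query))
        .slice(0, limit);
}

// Assumed sample data, most searched first.
const sample = ['animal', 'man', 'human', 'mankind', 'ma'];
console.log(rankMatches(sample, 'ma'));
// [ 'ma', 'man', 'mankind', 'human', 'animal' ]
```

The exact match ma comes first, man and mankind tie at position 0 and keep their popularity order, and human (position 2) beats animal (position 3).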

This sorted list is then serialised with JSON.stringify and returned as JSON, giving the following results when the user has entered ma:

["ma", "make", "made", "magic", "man", "management", "Mac", "Maker", "maths", "maybe", "make up", "March", "market", "Macintosh", "mainstream", "Makaton", "Malaysia", "manager", "math", "magnet", "magnetic flux", "main", "mango", "many", "masking", "May", "magazine", "mammal", "managing director", "map"]

This gives me response times of 30-40ms, with none of these requests hitting the origin server. Globally, performance is no longer limited by how close users are to the origin server, but only by how far they are from a CloudFlare edge server.

Summary of the Worker Requests
CPU Time per execution.
Daniel Mitchell
