Tag Archives: Cache

Cloudflare Integrations Marketplace introduces three new partners: Sentry, Momento and Turso

Post Syndicated from Tanushree Sharma original http://blog.cloudflare.com/cloudflare-integrations-marketplace-new-partners-sentry-momento-turso/

Building modern full-stack applications requires connecting to many hosted third party services, from observability platforms to databases and more. All too often, this means spending time doing busywork, managing credentials and writing glue code just to get started. This is why we’re building out the Cloudflare Integrations Marketplace to allow developers to easily discover, configure and deploy products to use with Workers.

Earlier this year, we introduced integrations with Supabase, PlanetScale, Neon and Upstash. Today, we are thrilled to introduce our newest additions to Cloudflare’s Integrations Marketplace – Sentry, Turso and Momento.

Let's take a closer look at some of the exciting integration providers that are now part of the Workers Integration Marketplace.

Improve performance and reliability by connecting Workers to Sentry

When your Worker encounters an error you want to know what happened and exactly what line of code triggered it. Sentry is an application monitoring platform that helps developers identify and resolve issues in real-time.

The Workers and Sentry integration automatically sends errors, exceptions and console.log() messages from your Worker to Sentry with no code changes required. Here’s how it works:

  1. You enable the integration from the Cloudflare Dashboard.
  2. The credentials from the Sentry project of your choice are automatically added to your Worker.
  3. You can configure sampling to control the volume of events you want sent to Sentry. This includes selecting the sample rate for different status codes and exceptions.
  4. Cloudflare deploys a Tail Worker behind the scenes that contains all the logic needed to capture and send data to Sentry.
  5. Like magic, errors, exceptions, and log messages are automatically sent to your Sentry project.

In the future, we’ll be improving this integration by adding support for uploading source maps and stack traces so that you can pinpoint exactly which line of your code caused the issue. We’ll also be tying in Workers deployments with Sentry releases to correlate new versions of your Worker with events in Sentry that help pinpoint problematic deployments. Check out our developer documentation for more information.

Develop at the Data Edge with Turso + Workers

Turso is an edge-hosted, distributed database based on libSQL, an open-source fork of SQLite. Turso focuses on providing a global service that minimizes query latency (and thus, application latency!). It’s perfect for use with Cloudflare Workers – both compute and data are served close to users.

Turso follows the model of having one primary database with replicas that are located globally, close to users. Turso automatically routes requests to a replica closest to where the Worker was invoked. This model works very efficiently for read heavy applications since read requests can be served globally. If you’re running an application that has heavy write workloads, or want to cut down on replication costs, you can run Turso with just the primary instance and use Smart Placement to speed up queries.

The Turso and Workers integration automatically pulls in Turso API credentials and adds them as secrets to your Worker, so that you can start using Turso by simply establishing a connection using the libsql SDK. Get started with the Turso and Workers Integration today by heading to our developer documentation.
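
For illustration, a Worker using the integration might look roughly like the sketch below. It assumes the integration has added the database URL and auth token as Worker secrets (the names LIBSQL_DB_URL and LIBSQL_DB_AUTH_TOKEN are placeholders; check the integration's docs for the actual names) and that the @libsql/client package is installed:

import { createClient } from "@libsql/client/web";

interface Env {
  LIBSQL_DB_URL: string;        // secret added by the integration (name assumed)
  LIBSQL_DB_AUTH_TOKEN: string; // secret added by the integration (name assumed)
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Connect to Turso from the edge using the secrets the integration provided.
    const client = createClient({
      url: env.LIBSQL_DB_URL,
      authToken: env.LIBSQL_DB_AUTH_TOKEN,
    });

    // Run a simple read query against the replica closest to this Worker invocation.
    const result = await client.execute("SELECT id, title FROM posts LIMIT 10");
    return Response.json(result.rows);
  },
};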

Cache responses from data stores with Momento

Momento Cache is a low latency serverless caching solution that can be used on top of relational databases, key-value databases or object stores to get faster load times and better performance. Momento abstracts details like scaling, warming and replication so that users can deploy cache in a matter of minutes.

The Momento and Workers integration automatically pulls in your Momento API key using an OAuth2 flow. The Momento API key is added as a secret in Workers and, from there, you can start using the Momento SDK in Workers. Head to our developer documentation to learn more and use the Momento and Workers integration!
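
Conceptually, what this unlocks is a classic cache-aside (read-through) pattern inside a Worker. The sketch below is purely illustrative: the CacheLike interface and readThrough helper are stand-ins for whatever the Momento SDK exposes, not actual SDK types, and loadFromDatabase is a hypothetical call to your backing data store.

// Illustrative cache-aside pattern. CacheLike is a stand-in for a Momento
// cache client; swap in the real SDK calls once the integration's API key
// secret is configured.
interface CacheLike {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function readThrough(
  cache: CacheLike,
  key: string,
  loadFromDatabase: (key: string) => Promise<string>,
): Promise<string> {
  const cached = await cache.get(key);
  if (cached !== null) {
    return cached; // cache hit: skip the data store entirely
  }
  const fresh = await loadFromDatabase(key); // cache miss: go to the data store
  await cache.set(key, fresh, 60);           // warm the cache for subsequent reads
  return fresh;
}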

Try integrations out today

We want to give you back time, so that you can focus less on configuring and connecting third party tools to Workers and spend more time building. We’re excited to see what you build with integrations. Share your projects with us on Twitter (@CloudflareDev) and stay tuned for more exciting updates as we continue to grow our Integrations Marketplace!

If you would like to build an integration with Cloudflare Workers, fill out the integration request form and we’ll be in touch.

Workers KV is faster than ever with a new architecture

Post Syndicated from Charles Burnett original http://blog.cloudflare.com/faster-workers-kv-architecture/

We’re excited to announce a significant performance improvement coming to Workers KV, focused on dramatically improving cold read performance and reducing latency, even for long tail access patterns.

Developers using KV have seen great performance on hot reads, but have asked why their 95th percentile latency, often on a key (or set of keys) that hadn't been accessed recently or in that region, was higher than expected. We took this feedback to heart. Behind the scenes we've been working feverishly on a new caching layer for KV that delivers much more frequent hot reads, lower worst-case latency, better flexibility and control over cache TTLs, and much faster consistency than our previous iterations. It's now live for all KV users.

The best part? Developers using KV don’t need to change anything to benefit from this increased performance.

What is Workers KV?

Workers KV is a key-value store designed for read-heavy use-cases and applications powered by Cloudflare’s network. KV’s focus on read-heavy use-cases allows it to serve hot (cached) reads in milliseconds, which makes it ideal for storing per-application or per-customer configuration data, routing configuration, multivariate (A/B testing) configurations, and even small asset data that you need to serve quickly. Anything that you can serialize and need quickly you can store in KV, up to 25 MiB per individual key, with no cap on total data stored.
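
As a quick illustration of that kind of workload, here is a minimal sketch of a Worker serving configuration from KV. The CONFIG binding and the feature-flags key are placeholders for the example:

interface Env {
  CONFIG: KVNamespace; // a KV namespace bound to the Worker (binding name illustrative)
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Hot (cached) reads like this return in milliseconds.
    const flags = await env.CONFIG.get("feature-flags", { type: "json" });
    if (flags === null) {
      return new Response("no flags configured", { status: 404 });
    }
    return Response.json(flags);
  },
};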

The problem

KV might be optimized for read-heavy workloads, but it’s critical that writes become globally visible quickly enough to be useful for your application. Under typical conditions, the convergence delay for an eventually consistent system like KV is approximately one minute, globally: a write from one location should be observable by all readers within that window. Typical conditions are great, but typical unfortunately didn’t mean “always”. It could take significant time to restore global consistency, so that regions like North America and Europe were reading the same value. We needed to improve not just the average convergence, but the worst case as well.

Speaking of consistency, setting a long cache Time to Live (cacheTTL) for reads meant you wouldn’t notice a write for the entire cacheTTL duration, because the existing cached data had not timed out yet. This forced a trade-off between read latency for infrequently accessed keys and how quickly writes were noticed. Developers using KV have been consistent in their feedback: a higher cache TTL should improve performance, but it shouldn’t multiply the time it takes for KV to converge on a write to that key.

Lastly, our cold read times also left room for improvement. While cache hits are fast in KV, a cache miss would result in a request being routed all the way to our storage backends. While this is slow for everyone, it was particularly slow for visitors in regions outside the US and EU. That’s poor performance that doesn’t represent what we can achieve with our global presence.

Our solution

A new horizontally scaled tiered cache

We’ve revamped Workers KV to be powered by a new tiered cache implementation. This implementation is written as a Worker service. We reuse the Dynamic Dispatch infrastructure developed for Workers for Platforms which lets us jump from our old KV worker into our new caching service within hundreds of microseconds. Importantly, this means we don’t impact cache hit performance to implement this new transparent caching layer. We leverage the same infrastructure powering Smart Placement to implement the tiering.

Before we re-designed KV, our topology looked like this:

All data centers in Cloudflare’s network can reach out to the origin in the event of a cache miss or to do a background refresh.

Cache TTL and efficiency

Our design goal was clear and ambitious: can we relax how we honor the cacheTTL constraint without violating it? While this seems contradictory, the motivation is clear: we want to minimize the need to communicate with our storage backends while honoring the user-facing semantics of the cacheTTL setting, as violating it can have security implications (e.g. if you use it to store and validate security tokens). Answering this design question also manages to simultaneously solve many of the problems outlined earlier.

Comparing existing solutions

First, let’s look at the design constraints for two eventually consistent storage systems at Cloudflare: Quicksilver and Tiered CDN.

Quicksilver gives us global consistency within seconds using a push architecture to replicate the data across all machines at Cloudflare. That architecture however doesn’t scale for Workers KV’s needs, which can have terabytes of data just within one namespace. This would be too much to replicate to every single data center.

By comparison, the tiered CDN cache is a pull mechanism where each hop pulls a more recent version of the asset into the local cache on access. That scales better because we only use storage for assets that are accessed, which works well with most use-cases where the vast majority of data is never retrieved. However, a pull-based architecture alone is insufficient: while it lets us aggregate traffic across broader regions, we still can’t decouple how long we serve content from cache from the cacheTTL.

Push-based architectures let us know when an asset is updated, while pull-based architectures enable scalable storage. By blending the properties of both systems, we can decouple how long we store assets in cache from the cacheTTL. And that’s exactly what we did: KV now uses a hybrid push/pull caching layer where data centers closest to customers pull from the regional data centers that are a little bit farther away. Writes broadcast to all regional data centers that a key has been updated, so that each regional data center removes that key from its local cache.

Traditional regional tiered cache topology

We can solve this problem by taking advantage of the fact that we semantically understand the write operations that are happening within Workers KV:

  1. Workers KV doesn’t have one data center per region as might be typical for your zone in a Cloudflare CDN regional tiered cache topology. Instead, each key in a KV namespace is deterministically assigned a data center by performing a weighted rendezvous hash (a sketch of this selection follows this list). The rendezvous hash ensures that load is distributed equally across the region and that outages result in optimal shifts of traffic.
  2. When the data center closest to a customer has a miss, it computes the regional data center affinity and provides that information to our Smart Placement infrastructure. When a regional tier misses, we repeat this process except using data centers in the KV origin region.
  3. Finally, a miss at the upper tier exits to our storage nodes located in that origin region.
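
For illustration only (this is not Cloudflare’s actual implementation), a weighted rendezvous hash can be sketched as below: every candidate data center gets a score derived from the key, and the highest score wins, so every location independently computes the same assignment for the same key.

// 32-bit FNV-1a hash, used only to derive a per-(key, data center) score.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

interface DataCenter {
  name: string;
  weight: number; // capacity weighting for the data center
}

function pickRegionalTier(key: string, candidates: DataCenter[]): DataCenter {
  if (candidates.length === 0) throw new Error("no candidate data centers");
  let best = candidates[0];
  let bestScore = -Infinity;
  for (const dc of candidates) {
    // Map the hash into (0, 1), then apply the weighted rendezvous score.
    const h = (fnv1a(`${key}:${dc.name}`) + 1) / 4294967297;
    const score = -dc.weight / Math.log(h);
    if (score > bestScore) {
      bestScore = score;
      best = dc;
    }
  }
  return best; // the same key always maps to the same data center
}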

When we do a write, we only purge (invalidate) the key from the regional and upper tier data centers. This is a fixed number of data centers in our network regardless of how many data centers we add, which ensures that we aren’t reducing cache hit rates as our network continues to grow. Compared with a global purge that delivers the event to every data center in our network, we only need to deliver this purge to a random fixed set of data centers, so our aggregate write capacity for Workers KV automatically scales horizontally as we add more data centers.

All lower-tier data centers will reach out to a regional tier responsible for a given key in the event of a cache miss. If the regional tier doesn’t have the content, the regional tier will then ask an upper-tier out of region for the content. On a write for a given key, the responsible regional and upper tiers have that key deleted from cache.

Why do we call this a hybrid topology? The data centers closest to customers pull from the regional data centers as normal, but we automatically push invalidation events to the regional tier data centers on every write. That way, those customer data center pulls know to get an updated value when there is one. This means that while the cacheTTL parameter controls the caching behavior closest to the customer, it’s treated as a suggestion at best at the regional and upper tiers.

This way we’ve combined the push design principles behind Quicksilver, which delivers global consistency within seconds, with the pull-based design of our CDN tiered caching which can scale to handle “infinite” size workloads and prioritizes the assets that are most frequently accessed.

Visualizing it

It can be a bit hard to follow what’s happening in the new caching layer since there are so many moving parts.

Here’s a video of a simplified version of how it works:

Small yellow balls represent KV read requests, larger green balls represent read responses. A larger purple ball represents a KV write request, while a read response ball represents a KV write response. Teal balls represent purge requests being broadcast. “E” is a data center that doesn’t participate as a regional tier. “R” represents the regional tier for key N, while “O” is the upper tier for key N.

Decoupled cache TTL and consistency parameters

As a refresher, the objects written to KV can specify a cacheTTL: by default this is set to 1 minute, which is also the minimum acceptable value. This means that if an asset has been in the cache for longer than a minute, we bypass the cache and read instead from our durable storage nodes. In order to prevent eyeballs from noticing origin fetches every minute, we implement stale-while-revalidate logic in our caching layer that automatically refreshes from the storage nodes in the background as requests come in.
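
To make that concrete, here is a simplified, illustrative sketch of stale-while-revalidate refresh logic. This is not KV’s internal code; the helper names, the in-memory map, and the 10-second refresh window are invented for the example.

interface CachedEntry {
  value: string;
  expiresAt: number; // epoch milliseconds
}

const REFRESH_WINDOW_MS = 10_000; // start refreshing 10s before expiry (arbitrary)

async function readWithBackgroundRefresh(
  key: string,
  cache: Map<string, CachedEntry>,
  fetchFromStorage: (key: string) => Promise<string>,
  scheduleBackgroundWork: (work: Promise<unknown>) => void, // e.g. ctx.waitUntil in a Worker
): Promise<string> {
  const entry = cache.get(key);
  const now = Date.now();

  if (entry && now < entry.expiresAt) {
    if (entry.expiresAt - now < REFRESH_WINDOW_MS) {
      // Nearly stale: refresh in the background, but serve the cached value now.
      scheduleBackgroundWork(
        fetchFromStorage(key).then((value) =>
          cache.set(key, { value, expiresAt: Date.now() + 60_000 }),
        ),
      );
    }
    return entry.value;
  }

  // Cold or expired: fetch synchronously and repopulate the cache.
  const value = await fetchFromStorage(key);
  cache.set(key, { value, expiresAt: now + 60_000 });
  return value;
}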

Here’s an example from a Worker that’s constantly reading the same key

Notice the absence of any spikes indicating a cache miss? You’d expect to see them roughly every minute, in the tens or even hundreds of milliseconds, when the cacheTTL expires. The reason this doesn’t happen is that as the expiry time approaches, a background request to the storage nodes occurs and the cache is updated with an expiry time one more minute into the future; thus the asset in cache is never too stale and eyeball requests are always served from cache. Let’s take a look at requests to our storage layer before and after adding tiering:

Yellow is the estimated number of requests that would have occurred to origin without the new caching layer. Blue is the number of requests we’re making now.

The above chart is for a system with conservative parameters set. The upper tier currently doesn’t store the data for much longer than the cacheTTL, and it still does a probabilistic background refresh even though it doesn’t actually need to, since we see all writes.

The new caching layer we’ve built inherits the old background refresh mechanism and expands on it. The first thing we did is decouple the background refresh period from the cacheTTL as a separate parameter (also defaulting to 1 minute). This means that even if you set a cacheTTL for 1 hour, KV will still check every minute from the regional tier to see if the value has been updated. If the data you’re storing within KV doesn’t have strict requirements on stale reads (think a key that’s accessed once every 10 minutes but needs to honor a write within 1 minute like security tokens), then you can increase the cacheTTL so that infrequently accessed keys stick around in the cache without changing the observed consistency.
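
From a developer’s point of view, cacheTTL remains a per-read option on the KV binding. For example (the CONFIG binding name is illustrative), a read like this can now use a long cacheTTL without also slowing down how quickly a write is noticed:

// Inside a Worker's fetch handler, with a KV namespace bound as CONFIG:
// keep this key cached at the edge for up to an hour between refreshes.
const value = await env.CONFIG.get("infrequently-read-key", { cacheTtl: 3600 });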

Consistency improvements

Speaking of consistency, we’ve improved the worst case there as well. Historically, we’ve had a background system that crawls all data in the storage nodes to figure out which region has the most up-to-date value and updates accordingly. This gives us complete consistency coverage, but could take a significant amount of time to confirm. We would also periodically check both backends to see if network conditions had changed, in order to pick the primary storage region to use for a given customer-close data center. Inconsistencies would, of course, be resolved then, but in practice this happens randomly and at a low enough probability that it won’t typically catch meaningful values being served inconsistently.

With the new caching layer all this changes. Since we’re now only reading keys on first access or after a write, we have enough storage capacity that we can check both backends on every read. When a customer requests data, we make a call to each origin data center, with the fastest response being returned immediately to reduce read latency. If the other data center has a newer value than what was returned first, we synchronize both data centers and notify our caching layer to purge that key from all regional data centers. If the other data center instead has an older value, we just synchronize the data centers without purging since we served the latest value. This means that even if our data centers are inconsistent, readers will notice new values much more quickly.
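
Conceptually, the new read path resembles the illustrative sketch below (not Cloudflare’s actual code; the backend functions, version numbers, and callbacks are placeholders): query both backends in parallel, serve whichever answers first, and reconcile in the background if they disagree.

interface VersionedValue {
  value: string;
  version: number; // higher means newer
}

async function readFromBothBackends(
  key: string,
  backendA: (key: string) => Promise<VersionedValue>,
  backendB: (key: string) => Promise<VersionedValue>,
  onNewerFound: (key: string, newer: VersionedValue) => Promise<void>, // e.g. purge regional tiers
  scheduleBackgroundWork: (work: Promise<unknown>) => void,
): Promise<VersionedValue> {
  const a = backendA(key);
  const b = backendB(key);

  // Serve the fastest response immediately to keep read latency low.
  const first = await Promise.race([a, b]);

  // In the background, compare against the slower response and reconcile.
  scheduleBackgroundWork(
    Promise.all([a, b]).then(async ([ra, rb]) => {
      const newer = ra.version >= rb.version ? ra : rb;
      if (newer.version > first.version) {
        await onNewerFound(key, newer); // the faster backend was stale
      }
    }),
  );

  return first;
}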

Latency improvements

Here’s the latency improvement at 10% rollout on a logarithmic x-axis:


Architecture that just gets better

This is just the start of what we can do. We now have a solid foundation for making further improvements, including making our best case reads even faster. We’ll be working on cutting out parts of our traditional stack that add unnecessary latency, and adding new high performance features that were too difficult to integrate otherwise. We can also explore features like a configurable consistency TTL parameter for sub-one-minute consistency for additional cost. Similarly, we could create a best-effort global purge feature if you want to signal writes that way. Finally, we’re looking at exposing this new caching layer as a general Worker binding anyone can use within a Worker in front of their own service or in front of their Worker. If these sound like the killer features you need, please reach out to us about trying them out.

What next?

Developers don’t have to do anything to benefit from KV’s new performance improvements. We are currently in the process of rolling out our new architecture, and you don’t have to redeploy your Worker or change the way you use KV to benefit.

Workers KV is a natural fit for any application built on top of our Workers platform. We provide a native API that enables any Worker script to read, write, list, and manage your Workers KV storage. You can also interact with Workers KV directly via our REST API from any client that can make an HTTP request, and the Cloudflare Dashboard provides an easy way to create, list, and delete keys to be used with the rest of your Workers setup.

Regardless of how you use Workers KV, it will be faster than ever before. We’re excited to see what you build with us, and you can dive into our documentation to start building with it.

Reduce latency and increase cache hits with Regional Tiered Cache

Post Syndicated from Alex Krivit original http://blog.cloudflare.com/introducing-regional-tiered-cache/

Today we’re excited to announce an update to our Tiered Cache offering: Regional Tiered Cache.

Tiered Cache allows customers to organize Cloudflare data centers into tiers so that only some “upper-tier” data centers can request content from an origin server, and then send content to “lower-tiers” closer to visitors. Tiered Cache helps content load faster for visitors, makes it cheaper to serve, and reduces origin resource consumption.

Regional Tiered Cache provides an additional layer of caching for Enterprise customers who have a global traffic footprint and want to serve content faster by avoiding network latency when there is a cache miss in a lower-tier, resulting in an upper-tier fetch in a data center located far away. In our trials, customers who have enabled Regional Tiered Cache have seen a 50-100ms improvement in tail cache hit response times from Cloudflare’s CDN.

What problem does Tiered Cache help solve?

First, a quick refresher on caching: a request for content is initiated from a visitor on their phone or computer. This request is generally routed to the closest Cloudflare data center. When the request arrives, we look to see if we have the content cached to respond to that request with. If it’s not in cache (it’s a miss), Cloudflare data centers must contact the origin server to get a new copy of the content.

Getting content from an origin server suffers from two issues: latency and increased origin egress and load.

Latency

Origin servers, where content is hosted, can be far away from visitors. This is especially true the more global of an audience a particular piece of content has relative to where the origin is located. This means that content hosted in New York can be served in dramatically different amounts of time for visitors in London, Tokyo, and Cape Town. The farther away from New York a visitor is, the longer they must wait before the content is returned. Serving content from cache helps provide a uniform experience to all of these visitors because the content is served from a data center that’s close.

Origin load

Even when using a CDN, many different visitors can be interacting with different data centers around the world and each data center, without the content visitors are requesting, will need to reach out to the origin for a copy. This can cost customers money because of egress fees origins charge for sending traffic to Cloudflare, and it places needless load on the origin by opening multiple connections for the same content, just headed to different data centers.

When Tiered Cache is not enabled, all data centers in Cloudflare’s network can reach out to the origin in the event of a cache miss.

Performance improvements and origin load reductions are the promise of tiered cache.

Tiered Caching means that instead of every data center reaching out to the origin when there is a cache miss, the lower-tier data center that is closest to the visitor will reach out to a larger upper-tier data center to see if it has the requested content cached before the upper-tier asks the origin for the content. Organizing Cloudflare’s data centers into tiers means that fewer requests will make it back to the origin for the same content, preserving origin resources, reducing load, and saving the customer money in egress fees.

What options are there to maximize the benefits of tiered caching?

Cloudflare customers are given access to different Tiered Cache topologies based on their plan level. There are currently two predefined Tiered Cache topologies to select from – Smart and Generic Global. If either of those don’t work for a particular customer’s traffic profile, Enterprise customers can also work with us to define a custom topology.

In 2021, we announced that we’d allow all plans to access Smart Tiered Cache. Smart Tiered Cache dynamically finds the single closest data center to a customer’s origin server and chooses that as the upper-tier that all lower-tier data centers reach out to in the event of a cache miss. All other data centers go through that single upper-tier for content and that data center is the only one that can reach out to the origin. This helps to drastically boost cache hit ratios and reduces the connections to the origin. However, this topology can come at the cost of increased latency for visitors that are farther away from that single upper-tier.

When Smart Tiered Cache is enabled, a single upper tier data center can communicate with the origin, helping to conserve origin resources.

Enterprise customers may select additional tiered cache topologies like the Generic Global topology which allows all of Cloudflare’s large data centers on our network (about 40 data centers) to serve as upper-tiers. While this topology may help reduce the long tail latencies for far-away visitors, it does so at the cost of increased connections and load on a customer's origin.

When Generic Global Tiered Cache is enabled, lower-tier data centers are mapped to all upper-tier data centers in Cloudflare’s network which can all reach out to the origin in the event of a cache miss. 

To describe the latency problem with Smart Tiered Cache in more detail, let’s use an example. Suppose the upper-tier data center is selected to be in New York using Smart Tiered Cache, and the traffic profile for the website with the New York upper-tier is relatively global: visitors are coming from London, Tokyo, and Cape Town. For every cache miss in a lower-tier, it will need to reach out to the New York upper-tier for content. This means requests from Tokyo must traverse the Pacific Ocean and most of the continental United States to check the New York upper-tier cache, then turn around and go all the way back to Tokyo. This is a giant performance hit for visitors outside the US for the sake of reducing origin resource load.

Regional Tiered Cache brings the best of both worlds

With Regional Tiered Cache we introduce a middle-tier in each region around the world. When a lower-tier fetches on a cache miss it tries the regional-tier first if the upper-tier is in a different region. If the regional-tier does not have the asset then it asks the upper-tier for it. On the response the regional-tier writes to its cache so other lower-tiers in the same region benefit.

By putting an additional tier in the same region as the lower-tier, there’s an increased chance that the content will be available in the region before heading to a far-away upper-tier. This can drastically improve the performance of assets while still reducing the number of connections that will eventually need to be made to the customer’s origin.

When Regional Tiered Cache is enabled, all lower-tier data centers will reach out to a regional tier close to them in the event of a cache miss. If the regional tier doesn’t have the content, the regional tier will then ask an upper-tier out of region for the content. This can help improve latency for Smart and Custom Tiered Cache topologies.

Who will benefit from regional tiered cache?

Regional Tiered Cache helps customers with Smart Tiered Cache or a Custom Tiered Cache topology with upper-tiers in one or two regions. Regional Tiered Cache is not beneficial for customers with many upper-tiers in many regions, like Generic Global Tiered Cache.

How to enable Regional Tiered Cache

Enterprise customers can enable Regional Tiered Cache via the Cloudflare Dashboard or the API:

UI

  • To enable Regional Tiered Cache, simply sign in to your account and select your website.
  • Navigate to the Cache tab of the dashboard and select the Tiered Cache section.
  • If you have the Smart or Custom Tiered Cache topology selected, you should have the ability to choose Regional Tiered Cache.

API

Please see the documentation for detailed information about how to configure Regional Tiered Cache from the API.

GET

curl --request GET \
 --url https://api.cloudflare.com/client/v4/zones/zone_identifier/cache/regional_tiered_cache \
 --header 'Content-Type: application/json' \
 --header 'X-Auth-Email: '

PATCH

curl --request PATCH \
 --url https://api.cloudflare.com/client/v4/zones/zone_identifier/cache/regional_tiered_cache \
 --header 'Content-Type: application/json' \
 --header 'X-Auth-Email: ' \
 --data '{
 "value": "on"
}'

Try Regional Tiered Cache out today!

Regional Tiered Cache is the first of many planned improvements to Cloudflare’s Tiered Cache offering which are currently in development. We look forward to hearing what you think about Regional Tiered Cache, and if you’re interested in helping us improve our CDN, we’re hiring.

Introducing Cache Rules: precision caching at your fingertips

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-cache-rules/

Ten years ago, in 2012, we released a product that put “a powerful new set of tools” in the hands of Cloudflare customers, allowing website owners to control how Cloudflare would cache, apply security controls, manipulate headers, implement redirects, and more on any page of their website. This product is called Page Rules and since its introduction, it has grown substantially in terms of popularity and functionality.

Page Rules are a common choice for customers that want to have fine-grained control over how Cloudflare should cache their content. There are more than 3.5 million caching Page Rules currently deployed that help websites customize their content. We have spent the last ten years learning how customers use those rules to cache content, and it’s clear the time is ripe for evolving rules-based caching on Cloudflare. This evolution will allow for greater flexibility in caching different types of content through additional rule configurability, while providing more visibility into when and how different rules interact across Cloudflare’s ecosystem.

Today, we’ve announced that Page Rules will be re-imagined into four product-specific rule sets: Origin Rules, Cache Rules, Configuration Rules, and Redirect Rules.

In this blog we’re going to discuss Cache Rules, and how we’re applying ten years of product iteration and learning from Page Rules to give you the tools and options to best optimize your cache.

Activating Page Rules, then and now

Adding a Page Rule is very simple: users either make an API call or navigate to the dashboard, enter a full or wildcard URL pattern (e.g. example.com/images/scr1.png or example.com/images/scr*), and tell us which actions to perform when we see that pattern. For example, a Page Rule could tell browsers to keep a copy of the response longer via “Browser Cache TTL”, or tell our cache to do the same via “Edge Cache TTL”. Low effort, high impact. All this is accomplished without fighting origin configuration or writing a single line of code.

Under the hood, a lot is happening to make that rule scale: we turn every rule condition into regexes, matching them against the tens of millions of requests per second across 275+ data centers globally. The compute necessary to process and apply new values on the fly across the globe is immense and corresponds directly to the number of rules we are able to offer to users. By moving cache actions from Page Rules to Cache Rules we can allow for users to not only set more rules, but also to trigger these rules more precisely.

More than a URL

Users of Page Rules are limited to specific URLs or URL patterns to define how browsers or Cloudflare cache their websites’ files. Cache Rules allow users to set caching behavior on additional criteria, such as the HTTP request headers or the requested file type. Users can also continue to match on the requested URL, as in our Page Rules example earlier. With Cache Rules, users can now define this behavior on one or more of the available fields.

For example, if a user wanted to specify cache behavior for all image/png content-types, it’s now as simple as pushing a few buttons in the UI or writing a small expression in the API. Cache Rules give users precise control over when and how Cloudflare and browsers cache their content. Cache Rules allow rules to be triggered on request header values using an expression as simple as:

any(http.request.headers["content-type"][*] == "image/png")

This triggers the Cache Rule to be applied to all image/png media types. Additionally, users may also leverage other request headers like cookie values, user-agents, or hostnames.

As a plus, these matching criteria can be stacked and configured with operators like AND and OR, providing additional simplicity in building complex rules from many discrete blocks, e.g. if you would like to target both image/png AND image/jpeg.
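
For instance, a rule targeting both of those image types might use an expression along these lines, written in the same expression syntax as the example above:

any(http.request.headers["content-type"][*] == "image/png") or any(http.request.headers["content-type"][*] == "image/jpeg")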

For the full list of available fields and conditionals you can apply Cache Rules to, please refer to the Cache Rules documentation.


Visibility into how and when Rules are applied

Our current offerings of Page Rules, Workers, and Transform Rules can all manipulate caching functionality for our users’ content. Often, there is some trial and error required to make sure that the confluence of several rules and/or Workers are behaving in an expected manner.

As part of upgrading Page Rules we have separated it into four new products:

  1. Origin Rules
  2. Cache Rules
  3. Configuration Rules
  4. Redirect Rules

This gives users a better understanding of how and when different parts of the Cloudflare stack are activated, reducing spin-up and debug time. We will also be providing additional visibility in the dashboard for when rules are activated as they go through Cloudflare.


Our users may take advantage of this strict precedence by chaining the results of one product into another. For example, the output of URL rewrites in Transform Rules will feed into the actions of Cache Rules, and the output of Cache Rules will feed into IP Access Rules, and so on.

In the future, we plan to increase this visibility further to allow for inputs and outputs across the rules products to be observed so that the modifications made on our network can be observed before the rule is even deployed.

Cache Rules. What are they? Are they improved? Let’s find out!

To start, Cache Rules will have all the caching functionality currently available in Page Rules. Users will be able to:

  • Tell Cloudflare to cache an asset or not,
  • Alter how long Cloudflare should cache an asset,
  • Alter how long a browser should cache an asset,
  • Define a custom cache key for an asset,
  • Configure how Cloudflare serves stale, revalidates, or otherwise uses header values to direct cache freshness and content continuity,

And so much more.

Cache Rules are intuitive and work similarly to our other ruleset engine-based products announced today: API or UI conditionals for URL or request headers are evaluated, and if matching, Cloudflare and browser caching options are configured on behalf of the user. For all the different options available, see our Cache Rules documentation.

Under the hood, Cache Rules apply rule evaluation in a more targeted way, so that additional rules can be supported per user and across the whole engine. What this means for our users is that by consuming less CPU for rule evaluations, we’re able to support more rules per user. For specifics on how many additional Cache Rules you’ll be able to use, please see the Future of Rules blog post.


How can you use Cache Rules today?

Cache Rules are available today in beta and can be configured via the API, Terraform, or UI in the Caching tab of the dashboard. We welcome you to try the functionality and provide us feedback for how they are working or what additional features you’d like to see via community posts, or however else you generally get our attention 🙂.

If you have Page Rules implemented for caching on the same path, Cache Rules will take precedence by design. For our more patient users, we plan on releasing a one-click migration tool for Page Rules in the near future.

What’s in store for the future of Cache Rules?

In addition to granular control and increased visibility, the new rules products also open the door to more complex features: recommending rules to help customers achieve better cache hit ratios and reduce their egress costs, adding additional caching actions and visibility so you can see precisely how Cache Rules will alter the headers Cloudflare uses to cache content, and letting customers run experiments with different rule configurations and see the outcome firsthand. These possibilities represent the tip of the iceberg for the next iteration of how customers will use rules on Cloudflare.

Try it out!

We look forward to you trying Cache Rules and providing feedback on what you’d like to see us build next.

Crawler Hints supports Microsoft’s IndexNow in helping users find new content

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/crawler-hints-supports-microsofts-indexnow-in-helping-users-find-new-content/

The web is constantly changing. Whether it’s news or updates to your social feed, it’s a constant flow of information. As a user, that’s great. But have you ever stopped to think how search engines deal with all the change?

It turns out, they “index” the web on a regular basis — sending bots out to constantly crawl webpages, looking for changes. Today, bot traffic accounts for about 30% of total traffic on the Internet, and given how foundational search is to using the Internet, it should come as no surprise that search engine bots make up a large proportion of that. What might come as a surprise is how inefficient the model is, though: we estimate that over 50% of crawler traffic is wasted effort.

This has a huge impact. There’s all the additional capacity that owners of websites need to bake into their site to absorb the bots crawling all over it. There’s the transmission of the data. There’s the CPU cost of running the bots. And when you’re running at the scale of the Internet, all of this has a pretty big environmental footprint.

Part of the problem, though, is nobody had really stopped to ask: maybe there’s a better way?

Right now, the model for indexing websites is the same as it has been since the 1990s: a “pull” model, where the search engine sends a crawler out to a website after a predetermined amount of time. During Impact Week last year, we asked: what about flipping the model on its head? What about moving to a push model, where a website could simply ping a search engine to let it know an update had been made?

There are a heap of advantages to such a model. The website wins: it’s not dealing with unnecessary crawls. It also makes sure that as soon as there’s an update to its content, it’s reflected in the search engine — it doesn’t need to wait for the next crawl. The website owner wins because they don’t need to manage distinct search engine crawl submissions. The search engine wins, too: it saves money on crawl costs, and it can make sure it gets the latest content.

Of course, this needs work to be done on both sides of the equation. The websites need a mechanism to alert the search engines; and the search engines need a mechanism to receive the alert, so they know when to do the crawl.

Crawler Hints — Cloudflare’s Solution for Websites

Solving this problem is why we launched Crawler Hints. Cloudflare sits in a unique position on the Internet — we’re serving on average 36 million HTTP requests per second. That represents a lot of websites. It also means we’re uniquely positioned to help solve this problem: to help give crawlers hints about when they should recrawl if new content has been added or if content on a site has recently changed.

With Crawler Hints, we send signals to web indexers based on cache data and origin status codes to help them understand when content has likely changed or been added to a site. The aim is to increase the number of relevant crawls as well as drastically reduce the number of crawls that don’t find fresh content, saving bandwidth and compute for both indexers and sites alike, and improving the experience of using the search engines.

But, of course, that’s just half the equation.

IndexNow Protocol — the Search Engine Moves from Pull to Push

Websites alerting the search engine about changes is useless if the search engines aren’t listening — and they simply continue to crawl the way they always have. Of course, search engines are incredibly complicated, and changing the way they operate is no easy task.

The IndexNow Protocol is a standard developed by Microsoft, Seznam.cz and Yandex, and it represents a major shift in the way search engines operate. Using IndexNow, search engines have a mechanism by which they can receive signals from Crawler Hints. Once they have that signal, they can shift their crawlers from a pull model to a push model.
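
At its simplest, an IndexNow submission is a single HTTP request to a participating search engine, roughly of the following shape (the host, URL, and key are placeholders):

GET https://<search-engine-host>/indexnow?url=https://example.com/updated-page&key=<your-indexnow-key>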

In a recent update, Microsoft announced that millions of websites are now using IndexNow to signal to search engine crawlers when their content needs to be crawled, and that IndexNow was used to index/crawl about 7% of all new URLs clicked when someone selects from web search results.

On the Cloudflare side, since the release of Crawler Hints in October 2021, Crawler Hints has processed about six-hundred-billion signals to IndexNow.

That’s a lot of saved crawls.

How to enable Crawler Hints

By enabling Crawler Hints on your website, with the simple click of a button, Cloudflare will take care of signaling to these search engines when your content has changed via the IndexNow API. You don’t need to do anything else!

Crawler Hints is free to use and available to all Cloudflare customers. If you’d like to see how Crawler Hints can benefit how your website is indexed by the world’s biggest search engines, please feel free to opt into the service by:

  1. Sign in to your Cloudflare Account.
  2. In the dashboard, navigate to the Cache tab.
  3. Click on the Configuration section.
  4. Locate Crawler Hints and enable it.

Upon enabling Crawler Hints, Cloudflare will share when content on your site has changed and needs to be re-crawled with search engines using the IndexNow protocol (this blog can help if you’re interested in finding out more about how the mechanism works).

What’s Next?

Going forward, because the benefits are so substantial for site owners, search operators, and the environment, we plan to start defaulting Crawler Hints on for all our customers. We’re also hopeful that Google, the world’s largest search engine and most wasteful user of Internet resources, will adopt IndexNow or a similar standard and lower the burden of search crawling on the planet.

When we think of helping to build a better Internet, this is exactly what comes to mind: creating and supporting standards that make it operate better, greener, faster. We’re really excited about the work to date, and will continue to work to improve the signaling to ensure the most valuable information is being sent to the search engines in a timely manner. This includes incorporating additional signals such as etags, last-modified headers, and content hash differences. Adding these signals will help further inform crawlers when they should reindex sites, and how often they need to return to a particular site to check if it’s been changed. This is only the beginning. We will continue testing more signals and working with industry partners so that we can help crawlers run efficiently with these hints.

And finally: if you’re on Cloudflare, and you’d like to be part of this revolution in how search engines operate on the web (it’s free!), simply follow the instructions in the section above.

Introducing Smart Edge Revalidation

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-smart-edge-revalidation/

Today we’re excited to announce Smart Edge Revalidation. It was designed to ensure that compute resources are synchronized efficiently between our edge and a browser. Right now, as many as 30% of objects cached on Cloudflare’s edge do not have the HTTP response headers required for revalidation. This can result in unnecessary origin calls. Smart Edge Revalidation fixes this: it does the work to ensure that these headers are present, even when an origin doesn’t send them to us. The advantage of this? There’s less wasted bandwidth and compute for objects that do not need to be redownloaded. And there are faster browser page loads for users.

So What Is Revalidation?


Revalidation is one part of a longer story about efficiently serving objects that live on an origin server from an intermediary cache. Visitors to a website want it to be fast. One foundational way to make sure that a website is fast for visitors is to serve objects from cache. In this way, requests and responses do not need to transit unnecessary parts of the Internet back to an origin and, instead, can be served from a data center that is closer to the visitor. As such, website operators generally only want to serve content from an origin when content has changed. So how do objects stay in cache for as long as necessary?

One way to do that is with HTTP response headers.

When Cloudflare gets a response from an origin, included in that response are a number of headers. You can see these headers by opening any webpage, inspecting the page, going to the network tab, and clicking any file. In the response headers section there will generally be a header known as “Cache-Control.” This header is a way for origins to answer caching intermediaries’ questions like: is this object eligible for cache? How long should this object be in cache? And what should the caching intermediary do after that time expires?

How long something should be in cache can be specified through the max-age or s-maxage directives. These directives specify a TTL or time-to-live for the object in seconds. Once the object has been in cache for the requisite TTL, the clock hits 0 (zero) and it is marked as expired. Cache can no longer safely serve expired content to requests without figuring out if the object has changed on the origin or if it is the same.
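
For example, an origin response might carry a header like the following (the values are illustrative), telling browsers to keep the object for an hour and shared caches such as Cloudflare to keep it for a day:

Cache-Control: public, max-age=3600, s-maxage=86400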

If it has changed, it must be redownloaded from the origin. If it hasn’t changed, then it can be marked as fresh and continue to be served. This check, again, is known as revalidation.

We’re excited that Smart Edge Revalidation extends the efficiency of revalidation to everyone, regardless of whether an origin sends the necessary response headers.

How is Revalidation Accomplished?

Two additional headers, Last-Modified and ETag, are set by an origin in order to distinguish different versions of the same URL/object across modifications. After the object expires and the revalidation check occurs, if the ETag value hasn’t changed or a more recent Last-Modified timestamp isn’t present, the object is marked “revalidated” and the expired object can continue to be served from cache. If there has been a change as indicated by the ETag value or Last-Modified timestamp, then the new object is downloaded and the old object is removed from cache.

Revalidation checks occur when a browser sends a request to a cache server using If-Modified-Since or If-None-Match headers. These request headers are questions sent from the browser cache about when an object has last changed that can be answered via the ETag or Last-Modified response headers on the cache server. For example, if the browser sends a request to a cache server with If-Modified-Since: Tue, 8 Nov 2021 07:28:00 GMT the cache server must look at the object being asked about and if it has not changed since November 8 at 7:28 AM, it will respond with a 304 status code indicating it’s unchanged. If the object has changed, the cache server will respond with the new object.
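
To illustrate the mechanics, here is a minimal sketch of an origin (written here as a Worker-style fetch handler) answering a revalidation request with a 304 instead of the full body. The asset, its ETag, and its Last-Modified date are hard-coded placeholders.

const BODY = "today's front page";
const ETAG = '"v42"';
const LAST_MODIFIED = "Tue, 09 Nov 2021 07:28:00 GMT";

export default {
  async fetch(request: Request): Promise<Response> {
    const ifNoneMatch = request.headers.get("If-None-Match");
    const ifModifiedSince = request.headers.get("If-Modified-Since");

    // The object is unchanged if the client's ETag matches, or if it was
    // modified at or before the timestamp the client already has.
    const unchanged =
      ifNoneMatch === ETAG ||
      (ifModifiedSince !== null &&
        Date.parse(ifModifiedSince) >= Date.parse(LAST_MODIFIED));

    if (unchanged) {
      // Nothing to re-download: a lightweight 304 lets the cache keep serving
      // the copy it already has.
      return new Response(null, {
        status: 304,
        headers: { ETag: ETAG, "Last-Modified": LAST_MODIFIED },
      });
    }

    // Otherwise send the full object along with its validators.
    return new Response(BODY, {
      headers: {
        ETag: ETAG,
        "Last-Modified": LAST_MODIFIED,
        "Cache-Control": "public, max-age=60",
      },
    });
  },
};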


Sending a 304 status code that indicates an object can be reused is much more efficient than sending the entire object. It’s like if you ran a news website that updated every 24 hours. Once the content is updated for the day, you wouldn’t want to keep redownloading the same unchanged content from the origin and instead, you would prefer to make sure that the day’s content was just reused by sending a lightweight signal to that effect, until the site changes the next day.

The problem with this system of browser questions and revalidation responses is that sometimes origins don’t set ETag or Last-Modified headers, or they aren’t configured by the website’s admin, making revalidation impossible. This means that every time an object expires, it must be redownloaded regardless of if there has been a change or not, because we have to assume that the asset has been updated, or else risk serving stale content.


This is an incredible waste of resources which costs hundreds of GB/sec of needless bandwidth between the edge and the visitor, meaning browsers are downloading hundreds of GB/sec of content they may already have. If our baseline of revalidations is around 10% of all traffic, and in initial tests Smart Edge Revalidation increased revalidations by just under 50%, then without users needing to configure anything we can increase total revalidations by around 5%!

Such a large reduction in bandwidth use also comes with potential environmental benefits. Based on Cloudflare’s carbon emissions per byte, the needless bandwidth being used could amount to 2000+ metric tons CO2e/year, the equivalent of the CO2 emissions from more than 400 cars in a year.

Revalidation also comes with a performance improvement, because it usually means a browser is downloading less than 1 KB of data to check whether the asset has changed, while pulling the full asset can mean downloading hundreds of KB. This improves performance and reduces the bandwidth between the visitor and our edge.

How Smart Edge Revalidation Works

When both Last-Modified and ETag headers are absent from the origin server response, Smart Edge Revalidation will use the time the object was cached on Cloudflare’s edge as the Last-Modified header value. When a browser sends a revalidation request to Cloudflare using If-Modified-Since or If-None-Match, our edge can answer those revalidation questions using the Last-Modified header generated from Smart Edge Revalidation. In this way, our edge can ensure efficient revalidation even if the headers are not sent from the origin.

Smart Edge Revalidation will be enabled automatically for all Cloudflare customers over the coming weeks. If this behavior is undesired, you can always ensure that Smart Edge Revalidation is not activated by confirming your origin is sending ETag or Last-Modified headers when you want to indicate changed content. Additionally, you could have your origin direct your desired revalidation behavior by making sure it sets appropriate cache-control headers.

Smart Edge Revalidation is a win for everyone: visitors will get more content faster from cache, website owners can serve and revalidate additional content from Cloudflare efficiently, and the Internet will get a bit greener and more efficient.

Smart Edge Revalidation is the latest announcement to join the list of ways we’re making our network more sustainable to help build a greener Internet — check out posts from earlier this week to learn about our climate commitments, Green Compute with Workers, Carbon Impact Report, Pages x Green Web Foundation partnership, and crawler hints.

Introducing: Smarter Tiered Cache Topology Generation

Post Syndicated from Alex Krivit original https://blog.cloudflare.com/introducing-smarter-tiered-cache-topology-generation/

Caching is a magic trick. Instead of a customer’s origin responding to every request, Cloudflare’s 200+ data centers around the world respond with content that is cached geographically close to visitors. This dramatically improves the load performance for web pages while decreasing the bandwidth costs by having Cloudflare respond to a request with cached content.

However, if content is not in cache, Cloudflare data centers must contact the origin server to receive the content. This isn’t as fast as delivering content from cache. It also places load on an origin server, and is more costly compared to serving directly from cache. These issues can be amplified depending on the geographic distribution of a website’s visitors, the number of data centers contacting the origin, and the available origin resources for responding to requests.

To decrease the number of times our network of data centers communicate with an origin, we organize data centers into tiers so that only upper-tier data centers can request content from an origin and then they spread content to lower tiers. This means content loads faster for visitors, is cheaper to serve, and consumes fewer origin resources.

Today, I’m thrilled to announce a fundamental improvement to Argo Tiered Cache we’re calling Smart Tiered Cache Topology. When enabled, Argo Tiered Cache will now dynamically select the single best upper tier for each of your website’s origins while providing tiered cache analytics showing how your custom topology is performing.

Smarter Tiered Cache Topology Generation

Tiered Cache is part of Argo, a constellation of products that analyzes and optimizes routing decisions across the global Internet in real-time by processing information from every Cloudflare request to determine which routes to an origin are fast, which are slow, and what the optimum path from visitor to content is at any given moment. Previously, Argo Tiered Cache would use a static collection of upper-tier data centers to communicate with the origin. With the improvements we’re announcing today, Tiered Cache can now dynamically find the single best upper tier for an origin using Argo performance and routing data. When Argo is enabled and a request for particular content is made, we collect latency data for each request to pick the optimal path. Using this latency data, we can determine how well any upper-tier data center is connected with an origin and can empirically select the best data center with the lowest latency to be the upper tier for an origin.

Argo Tiered Cache

Taking one step back, tiered caching is a practice where Cloudflare’s network of global data centers are subdivided into a hierarchy of upper tiers and lower tiers. In order to control bandwidth and number of connections between an origin and Cloudflare, only upper tiers are permitted to request content from an origin and must propagate that information to the lower tiers. In this way, Cloudflare data centers first talk to each other to find content before asking the origin. This practice improves bandwidth efficiency by limiting the number of data centers that can ask the origin for content, reduces origin load, and makes websites more cost-effective to operate. Argo Tiered Cache customers only pay for data transfer between the client and edge, and we take care of the rest. Tiered caching also allows for improved performance for visitors, because distances and links traversed between Cloudflare data centers are generally shorter and faster than the links between data centers and origins.

Introducing: Smarter Tiered Cache Topology Generation

Previously, when Argo Tiered Cache was enabled for a website, several of Cloudflare’s largest and most-connected data centers were determined to be upper tiers and could pull content from an origin on a cache MISS. While utilizing a topology consisting of numerous upper-tier data centers may be globally performant, we found that cost-sensitive customers generally wanted to find the single best upper tier for their origin to ensure efficient data transfer of their content to Cloudflare’s network. We built Smart Tiered Cache Topology for this reason.

How to enable Smart Tiered Cache Topology

When you enable Argo Tiered Cache, Cloudflare now by default concentrates connections to origin servers so they come from a single data center. This is done without needing to work with our Customer Success or Solutions Engineering organization to custom configure the best single upper tier. Argo customers can generate this topology by:

  • Logging into your Cloudflare account.
  • Navigating to the Traffic tab in the dashboard.
  • Ensuring you have Argo enabled.
  • From there, Non-Enterprise Argo customers will automatically be enrolled in Smart Tiered Cache Topology without needing to make any additional changes.

Enterprise customers can select the type of topology they’d like to generate.

Self-serve Argo customers are automatically enrolled in Smart Tiered Cache Topology

Enterprise customers can determine the tiered cache topology that works best for them.
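
If you manage zones through the API rather than the dashboard, the same settings can be toggled programmatically. Below is a minimal sketch in TypeScript, assuming the zone-level `argo/tiered_caching` and `cache/tiered_cache_smart_topology_enable` endpoints of the v4 API and an API token that can edit cache settings; treat the exact paths and payload shapes as assumptions and confirm them against the current API documentation.

```typescript
// Minimal sketch: enable Argo Tiered Cache and opt a zone into Smart Topology
// via the Cloudflare v4 API. Endpoint paths and payload shapes are assumptions
// drawn from the public API docs; verify them before relying on this.
const API = "https://api.cloudflare.com/client/v4";
const ZONE_ID = process.env.CF_ZONE_ID!;   // your zone identifier
const TOKEN = process.env.CF_API_TOKEN!;   // token with permission to edit cache settings

async function patchSetting(path: string, value: string): Promise<void> {
  const res = await fetch(`${API}/zones/${ZONE_ID}/${path}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ value }),
  });
  const body = await res.json();
  if (!res.ok || !body.success) {
    throw new Error(`failed to update ${path}: ${JSON.stringify(body.errors)}`);
  }
}

// Turn on Tiered Cache, then the smart (single upper tier) topology.
await patchSetting("argo/tiered_caching", "on");
await patchSetting("cache/tiered_cache_smart_topology_enable", "on");
```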

More data, fewer problems

Once enabled, in addition to performance and cost improvements, Smart Tiered Cache Topology also delivers summary analytics for how the upper tiers are performing so that you can monitor the cost and performance benefits your website is receiving. These analytics are available in the Cache tab of the dashboard, in the Tiered Cache section. The “Primary Data Center” and “Secondary Data Center” fields show you which data centers were determined to be the best upper tier and failover for your origin. “Cached Hits” and “Hit Ratio” show you the proportion of requests that were served by the upper tier and how many needed to be forwarded on to the origin for a response. “Bytes Saved” indicates the total transfer from the upper-tier data center to the lower tiers, showing the total bandwidth saved by having Cloudflare’s lower-tier data centers ask the upper tiers for the content instead of the origin.
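
As a rough illustration of how those summary fields relate to each other, here is a small sketch with made-up numbers; the interface and field names mirror the dashboard labels but are otherwise invented.

```typescript
// Rough sketch of how the dashboard's summary fields relate to raw counts.
// Field names mirror the dashboard labels; the numbers are made up.
interface TieredCacheSummary {
  cachedHits: number;              // requests served by the upper tier's cache
  forwardedToOrigin: number;       // requests the upper tier had to fetch from the origin
  bytesSentToLowerTiers: number;   // "Bytes Saved": transfer the origin never had to serve
}

function hitRatio(s: TieredCacheSummary): number {
  const total = s.cachedHits + s.forwardedToOrigin;
  return total === 0 ? 0 : s.cachedHits / total;
}

const example: TieredCacheSummary = {
  cachedHits: 9_000,
  forwardedToOrigin: 1_000,
  bytesSentToLowerTiers: 42_000_000_000,
};
console.log(`Hit Ratio: ${(hitRatio(example) * 100).toFixed(1)}%`); // "Hit Ratio: 90.0%"
```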

Smart Tiered Cache Topology works with Cloudflare’s existing products to deliver a seamless, easy, and performant experience that saves you money and provides useful information about how your upper tiers are working with your origins. Smart Tiered Cache Topology stands on the shoulders of some of the most resilient and useful products at Cloudflare to provide even more benefits to webmasters.

If you’re interested in seeing how Argo and Smart Tiered Cache Topology can benefit your web property, please log in to your Cloudflare account and find more information in the Traffic tab of the dashboard here.

Tiered Cache Smart Topology

Post Syndicated from Brian Bradley original https://blog.cloudflare.com/tiered-cache-smart-topology/

A few years ago, we released Argo to help make the Internet faster and more efficient. Argo observes network conditions and finds the optimal route across the Internet for origin server requests, avoiding congestion along the way.

Tiered Cache is an Argo feature that reduces the number of data centers responsible for requesting assets from the origin. With Tiered Cache active, a request in South Africa won’t go directly to an origin in North America; instead, it will first check a large, nearby data center to see if the requested data is cached there. The number and location of the data centers used by Tiered Cache are controlled by a piece of configuration called the topology. By default, we use a generic topology for every customer that strikes a balance between cache hit ratios and latency that is suitable for most users.

Today we’re introducing Smart Topology, which maximizes cache hit ratios by building on Argo’s internal infrastructure to identify the single best data center for making requests to the origin.

Standard Cache

The standard method for caching assets is to let each data center be a reverse proxy for the origin server. In this scheme, a miss in any data center causes a request to the origin for an asset. A request to the origin for one asset could be made as many times as there are data centers.

A cache miss in any data center will result in a request being sent to the origin server even if the asset is cached in some other data center. This is because the data centers are completely oblivious of each other.

Theoretically, a request for the asset would have to be sent to every data center in order to reduce the cache misses to the minimum possible. However, sending every request to every data center is not practical.

The minimum possible cache hit latency is achieved if the asset is moved into the nearest cache before the request for it is made, but this kind of prediction is generally not possible. Instead, a good heuristic is to move the asset into the nearest cache after the first cache miss.

However, the asset has to be copied from somewhere and it isn’t possible to know where in the network it might be without querying each data center.

To avoid querying each data center, a copy of the asset has to be stored in a known location after the first cache miss so it is available to other data centers. This is precisely what Tiered Cache does.

Tiered Cache

Tiered Cache improves cache hit ratios by allowing some data centers to serve as caches for others, before the latter have to make a request to the origin. With Tiered Cache, certain data centers are reverse proxies to the origin for other data centers.

If the proxied data centers make requests for the same asset, the asset will already be cached in the proxying data center and can be retrieved from there rather than from the origin. Fewer requests to the origin are made overall.
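
To make the flow concrete, here is a minimal sketch of the lookup order from a lower-tier data center’s point of view. The `Cache` interface and the fetch helpers are hypothetical and exist only to illustrate the order: local cache first, then the upper tier, and only then the origin.

```typescript
// Illustrative sketch of the lookup order in a lower-tier data center.
// The Cache type and the fetch helpers are hypothetical; the point is the
// order: local cache, then the upper tier, and only then the origin.
interface Cache {
  get(key: string): Promise<Response | null>;
  put(key: string, value: Response): Promise<void>;
}

async function handleRequest(
  key: string,
  localCache: Cache,
  fetchFromUpperTier: (key: string) => Promise<Response>, // upper tier checks its own cache, then the origin
  fetchFromOrigin: (key: string) => Promise<Response>,    // only used when tiering is unavailable
  upperTierAvailable: boolean,
): Promise<Response> {
  // 1. Local hit: serve directly from this data center.
  const cached = await localCache.get(key);
  if (cached) return cached;

  // 2. Local miss: ask the upper tier instead of the origin. The upper tier
  //    either serves from its cache or fetches from the origin once.
  const response = upperTierAvailable
    ? await fetchFromUpperTier(key)
    : await fetchFromOrigin(key); // fallback behaves as if Tiered Cache were disabled

  // 3. Store the asset locally so the next nearby visitor gets a local hit.
  await localCache.put(key, response.clone());
  return response;
}
```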

Custom Topology

In Tiered Cache, the topology describes which data center should serve as a proxy for others.

For customers, devising an optimal topology is a challenge requiring optimization and continuous maintenance. The best topology is a configuration based on information that is privately held by the customer and other information held only by Cloudflare.

For instance, knowing the desired balance of latency versus cache hit ratio is information only the customer has, but how to best make use of the Internet is something we would know. Enterprise customers usually have dedicated infrastructure teams that work with our solutions engineers to manually optimize and maintain their tiered cache topology.

Not every customer would want to personalize their topology. For this reason a generic topology exists.

Generic Topology

The generic topology is designed to achieve good latency and cache efficiency for any origin, regardless of location. A balance is struck between two constraints: cache efficiency and latency.

The generic topology has multiple proxying data centers that are distributed around the world in order to ensure that requests that result in a cache miss do not take a very long detour before going to the origin. There is a balance between the number of proxying data centers and the cache hit ratio, because the proxying data centers are oblivious to each other.

If a proxying data center is taken offline, the proxied data centers either use a fallback (if the fallback is online) or revert to behaving like Tiered Cache is disabled.

To achieve the best balance for general usage, the generic topology instructs the smaller data centers to be proxied by the larger data centers in the same geographic region.
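
A simplified sketch of that arrangement is below; the region names and data center assignments are invented, and returning `null` stands in for reverting to non-tiered behavior when neither the upper tier nor its fallback is reachable.

```typescript
// Illustrative sketch of the generic topology idea: each smaller data center is
// proxied by a larger data center in its region, with a regional fallback.
// Region names and assignments are invented for the example.
interface RegionTopology {
  upperTier: string;
  fallback?: string;
}

const genericTopology: Record<string, RegionTopology> = {
  "western-europe": { upperTier: "FRA", fallback: "AMS" },
  "eastern-us": { upperTier: "IAD", fallback: "EWR" },
};

// Resolve where a lower-tier data center should send its cache misses.
// Returning null means "behave as if Tiered Cache were disabled" and go
// straight to the origin.
function resolveUpperTier(
  region: string,
  isOnline: (dataCenter: string) => boolean,
): string | null {
  const topo = genericTopology[region];
  if (!topo) return null;
  if (isOnline(topo.upperTier)) return topo.upperTier;
  if (topo.fallback && isOnline(topo.fallback)) return topo.fallback;
  return null;
}

// Example: FRA is offline, so western-europe lower tiers use AMS instead.
console.log(resolveUpperTier("western-europe", (dc) => dc !== "FRA")); // "AMS"
```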

Smart Topology

Smart Topology assumes the origin is in one place and automatically configures itself to be optimal once the customer flips a switch in the dashboard. To actually do this, Cloudflare needs to be able to determine which data center has the lowest latency to the origin without requiring the customer to tell Cloudflare where the origin is.

Methods for Latency Determination

There are a few ways to determine which data center has the lowest latency with respect to the origin.

IP geolocation
Physical distance can be used as an approximation for latency, but Smart Topology was not built this way for a couple of reasons. First, even the best commercial IP geo database doesn’t have the required coverage and accuracy. Second, even with perfect accuracy, physical distance is a questionable approximation of Internet latency.

Probing
Latency to an IP address can be determined exactly by probing that address. The probe can just be the time required to perform the TCP handshake. Each data center probes the origin so that the latencies can be directly measured and the minimum can be found. Except for edge cases involving Anycast and TCP termination, we can assume that the latency to an IP address is the same as the latency to the origin server behind that IP address.
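
As a rough illustration of this kind of probe, the sketch below times a TCP handshake from Node.js. It is not how Cloudflare’s internal probing works; it simply measures the time from initiating a connection to the moment the handshake completes.

```typescript
// Minimal sketch of a TCP handshake probe using Node's net module. The measured
// interval runs from initiating the connection to the 'connect' event, which is
// roughly one network round trip plus local overhead.
import { Socket } from "node:net";

function probeHandshake(host: string, port = 443, timeoutMs = 5000): Promise<number> {
  return new Promise((resolve, reject) => {
    const socket = new Socket();
    const start = process.hrtime.bigint();

    socket.setTimeout(timeoutMs, () => {
      socket.destroy();
      reject(new Error(`probe to ${host}:${port} timed out`));
    });
    socket.once("error", (err) => {
      socket.destroy();
      reject(err);
    });
    socket.connect(port, host, () => {
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      socket.destroy(); // handshake finished; no payload needs to be exchanged
      resolve(elapsedMs);
    });
  });
}

// Example: measure handshake latency to a (documentation-range) origin IP.
probeHandshake("203.0.113.10").then((ms) => console.log(`handshake: ${ms.toFixed(1)} ms`));
```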

Topology Selection Algorithm

The goal of the topology selection algorithm is to minimize cache misses and latency. The topology chooses a single proxying data center in order to maximize the cache hit ratio. The proxying data center is chosen to be close to the origin so that the latencies of cache misses in the proxied data centers are not much worse than they would be with tiered cache turned off.

The choice should eventually become stable. Stability is important because each time the choice changes, cache misses in proxied data centers are likely to cause cache misses in the new proxying data center. Capacity is important because when a data center goes offline, it can cause a large number of cache misses. Minimizing latency to the origin is important to ensure that the network is used efficiently.

The data center selection algorithm is rather like a leaderboard of the fastest data center for each origin. As data is collected, a faster data center can knock others off a given origin’s leaderboard. This competition is based on the 24 hour median latency and is held each hour. Only a subset of data centers deemed large enough are permitted to compete.

Eventually, the choice for proxying data centers becomes stable. Over time, data centers produce competing records for each origin and less competitive records in the leaderboard are replaced as necessary. Thus, latencies for any origin on the leaderboard can only monotonically decrease. There are always physical limits in the real world, so eventually the ideal data center will set a record that is too good to beat.

Also, the leaderboard actually includes both the lowest latency data center and the second lowest latency data center. The second lowest latency data center serves as a fallback if the preferred data center is taken offline for maintenance.
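
A simplified version of that leaderboard might look like the sketch below. The data shapes are invented; the idea is that a candidate data center displaces the current record for an origin only by posting a lower median latency, so the preferred and fallback choices settle over time.

```typescript
// Illustrative sketch of the per-origin leaderboard. The data shapes are
// invented; a candidate displaces the current record only by posting a lower
// median latency, so recorded latencies for an origin only ever decrease.
interface Entry {
  dataCenter: string;
  medianLatencyMs: number; // e.g. the 24-hour median of probe latencies
}

interface Leaderboard {
  preferred?: Entry; // lowest recorded median latency to this origin
  fallback?: Entry;  // second lowest; used if the preferred data center is offline
}

// Called once per competition round (e.g. hourly) for each data center deemed
// large enough to compete.
function submit(board: Leaderboard, candidate: Entry): Leaderboard {
  const { preferred, fallback } = board;
  if (!preferred || candidate.medianLatencyMs < preferred.medianLatencyMs) {
    // New record: the previous preferred entry becomes the fallback.
    return { preferred: candidate, fallback: preferred };
  }
  if (!fallback || candidate.medianLatencyMs < fallback.medianLatencyMs) {
    return { preferred, fallback: candidate };
  }
  return board; // not competitive; the leaderboard is unchanged
}

// Example rounds for one origin.
let board: Leaderboard = {};
for (const entry of [
  { dataCenter: "AMS", medianLatencyMs: 12.4 },
  { dataCenter: "FRA", medianLatencyMs: 9.8 },
  { dataCenter: "CDG", medianLatencyMs: 11.1 },
]) {
  board = submit(board, entry);
}
console.log(board); // preferred: FRA (9.8 ms), fallback: CDG (11.1 ms)
```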

Anycast Networks
We are measuring the latency to the origin IP address and assuming that it represents the latency to the origin server, but this can break down in certain cases. A few cloud providers other than Cloudflare also use Anycast technology to provide their services. In Anycast, multiple machines can share an IP address regardless of where they are connected to the Internet, and the Internet will typically route packets destined for that address to the closest machine. If an Anycast network is used to proxy an origin server, then the apparent latency to the IP address for the origin server is actually the latency to the edge of the Anycast network rather than the latency to the origin server. The real latency to the origin server cannot be determined by probing.

The algorithm would fail to select the single best proxying data center if the latencies are not representative of the actual latency between data center and origin. Selecting the wrong data center would adversely affect latencies for requests to the origin, and could be expensive.

For instance, imagine a cloud provider provides an IP address that actually routes to multiple data centers all over the world. Packets are routed through private infrastructure to the correct destination once they enter the network. The lowest latency data center to this Anycast IP address could potentially even be on a different continent than the actual origin server. Therefore, the apparent latency cannot actually be trusted as a true measure of actual latency to the origin.

The data center selection algorithm assumes that the origin is in a single geographic location and can be probed to determine latency from each data center. These networks break one or both of these assumptions, so a procedure had to be developed to detect them. First, it is assumed that the IP is in a single geographic location and is not proxied by such a network. The latency to the origin is bounded below by the speed of light through fiber. Although the distance between any data center and the origin server is not known, the distances between data centers are known to Cloudflare.

Imagine the journey between two data centers, with the origin server as a pitstop along the way. Then, the theoretical minimum possible pair of latencies observable between the origin server and any two data centers can be computed. We have latency probe data from both of these data centers to the origin, so we can check whether the observed latency is lower than what is physically possible.

The original assumption was that the origin IP address identifies an origin server that is in one location and that the latency to that IP address is the latency to the origin server. If the observed latencies would require information to travel faster than light, then clearly the assumption is false. Smart Topology falls back to the generic topology when the original assumption does not hold. To be extra sure, we check this constraint on a number of data centers around the world and fall back if there is even a single physically impossible observation.
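
A minimal sketch of that plausibility check follows, assuming a propagation speed through fiber of roughly 200 km per millisecond (about two thirds of the speed of light in a vacuum) and known distances between data centers; all concrete numbers are illustrative.

```typescript
// Illustrative sketch of the physical-plausibility check. The distances and
// latencies are hypothetical inputs; only the inequality itself is the point.
const FIBER_KM_PER_MS = 200; // ~200,000 km/s, roughly two thirds of c in a vacuum

// Returns true if the pair of measurements is physically impossible for an
// origin at a single location, i.e. the probes are likely hitting an Anycast
// edge rather than the real origin.
function impossiblyFast(
  distanceBetweenDataCentersKm: number,
  rttFromA_ms: number,
  rttFromB_ms: number,
): boolean {
  // The one-way path A -> origin -> B can never be shorter than A -> B directly,
  // so half the summed round trips must be at least the direct propagation time.
  const minOneWayMs = distanceBetweenDataCentersKm / FIBER_KM_PER_MS;
  const observedOneWayMs = (rttFromA_ms + rttFromB_ms) / 2;
  return observedOneWayMs < minOneWayMs;
}

// Example: two data centers ~9,000 km apart both observe ~5 ms to the "origin".
// 5 ms of fiber covers at most ~1,000 km each way, so a single origin can't explain it.
console.log(impossiblyFast(9000, 5, 5)); // true -> fall back to the generic topology
```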

The Big Picture

When Smart Topology is enabled many Cloudflare systems work together to ensure the correct data center is eventually used to request assets from the origin.

When the customer enables Tiered Cache Smart Topology, one of a few things can happen from the perspective of the origin. If a proxying data center has already been assigned to the CIDR block that encompasses the origin IP, the preferred or fallback data center is used to request assets from the origin. Otherwise, the generic topology is used to determine which proxying data centers to use to pull assets from the origin. The latency to the proxying data center should only decrease as the choice for proxying data center is updated over time.
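
A rough sketch of that decision, with invented data shapes and IPv4 only, might look like the following: find the assignment whose prefix covers the origin IP and use its preferred and fallback upper tiers; otherwise fall back to the generic topology.

```typescript
// Illustrative sketch (IPv4 only, invented data shapes) of picking the proxying
// data center(s) for an origin IP once Smart Topology is enabled.
interface Assignment {
  cidr: string;       // e.g. "198.51.100.0/24"
  preferred: string;  // chosen upper-tier data center
  fallback?: string;  // second choice if the preferred data center is offline
}

function ipToInt(ip: string): number {
  return ip.split(".").reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

function inCidr(ip: string, cidr: string): boolean {
  const [base, bitsStr] = cidr.split("/");
  const bits = parseInt(bitsStr, 10);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(base) & mask);
}

// Returns the upper tiers to use for this origin, or null to fall back to the
// generic topology (e.g. no assignment has been computed yet for this prefix).
function upperTiersFor(originIp: string, assignments: Assignment[]): string[] | null {
  const match = assignments.find((a) => inCidr(originIp, a.cidr));
  if (!match) return null;
  return match.fallback ? [match.preferred, match.fallback] : [match.preferred];
}

// Example lookup.
const assignments: Assignment[] = [
  { cidr: "198.51.100.0/24", preferred: "FRA", fallback: "CDG" },
];
console.log(upperTiersFor("198.51.100.42", assignments)); // ["FRA", "CDG"]
console.log(upperTiersFor("203.0.113.7", assignments));   // null -> generic topology
```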

Conclusion

Developing this technology offered a lot of opportunities to exercise great engineering and build an impactful product. It was not done in a vacuum; we used infrastructure that Cloudflare had already built, and we moved along that exponential gradient of using existing progress to make more progress. Building this framework opens a lot of doors to future progress too; for instance, in the future, we can explore ways to select the ideal proxying data center even for origins behind Anycast networks that hide the true latency to the origin.