Prefetching? At This Age?

A while back, I wrote about using Netlify and SpeedCurve to A/B test performance changes. The one I specifically mentioned was testing out Instant.Page on my site.

While the bulk of the post is about the A/B testing setup (which I am very happy with), I did note at the end that I was seeing some small improvements from Instant.Page, though the results were far from conclusive yet.

Alexandre, the creator of Instant.Page, suggested on Twitter that the gains I was seeing were small because Netlify passes an Age header that messes with prefetching.

Tim, Netlify sends a Age header that conflicts with prefetching. A prefetched page will get fetched again on navigation if its Age header is over 300. The small gain you are seeing are due to the navigation request being a 304 and not a 200.

This lead down an interesting little rabbit hole and, eventually, a bug. I learned a few new things as I dug in, so I figured it was worth sharing for others as well (and for me to come back to when I inevitably forget the details).

First, before we dive in, let’s zero in on the critical components of what’s happening on my site specifically.

For all HTML responses, I pass a Cache-control: max-age=900, must-revalidate header. This tells the browser to cache the response for 15 minutes (90060). After that, it has to revalidate—basically, talk to the server again to make sure the asset is still valid and a newer version isn’t available. As soon as the resources is revalidated, the 15 minutes starts over.

Netlify also passes along an ‘Age’ header, indicating how long they’ve been caching the resource themselves. (More on that in a bit) So, for example, if they’ve had the resource on the servers for 14 minutes, that would look like this:

age: 840

And finally, as a recap, Instant.Page works by using the prefetch resource hint to fetch links early, when someone hovers over the link instead of waiting for the next navigation to start.

Now let’s dive into each part of that and how they fit together.

The Age Header

The ‘Age’ header is used by upstream caching layers (Varnish, CDNs, other proxies, etc.) to indicate how long it’s been since a response was either generated or validated at the origin server. In other words, how long has that resource been sitting in that upstream cache.

It’s not just something that Netlify does—open just about any site and you’ll find resources with the ‘Age’ header set. That’s because if you’ve got something sitting between your origin and the browser caching your content, setting the ‘Age’ header is exactly what you’re supposed to be doing. It’s important information.

Let’s say you’re using a CDN to cache content on their edge servers instead of making visitors wait while assets are requested from wherever your origin server resides. The first time a resource is requested, the CDN is going to have to go out and make a connection to your origin server to get it. At that point, since the CDN just got the resource from the origin server, the age is ‘0’.

Then, depending on what you’ve set up at the CDN level and assuming the CDN can cache the resource, the CDN will start serving that resource as it’s requested without talking to the origin again. As it does this, the age of the resource gets older and older.

Eventually, the CDN needs to talk to the origin server again.

Let’s say your CDN is set to cache a resource for 15 minutes before it needs to validate that the resource is still fresh. After 15 minutes, the CDN talks to the origin and will either get a new version of that resource or verification that the resource is still valid. At that point, the age of the resource resets to ‘0’—we’ve got a fresh start since we know what we have on the CDN is the latest version.

The browser’s primary mechanisms for determining what to cache and for how long are headers like Expires (which provides an expiration date for the resource being served), Cache-control (a ton of stuff here, but specifically for duration is max-age), Last-Modified (the date at which the resource was last modified), and Etag (a unique version identifier for the object). (For more detail on all of those,Paul Calvano’s post on Heuristic Caching and Harry’s post about Cache-Control are both top-notch resources.)

Age, too, factors in.

Let’s say that your CDN is set to cache a resource for 15 minutes, and you’ve also told the browser to cache that resource for 15 minutes using the Cache-control header (Cache-control: max-age=900, must-revalidate). With two layers of caching, each at 15 minutes, that means we have a potential Time to Live (the time a resource is stored in a cache before it’s deleted or updated) of up to 30 minutes—if the browser requests the resource just before the CDN’s version expires, then it’s been sitting in cache for 15 minutes on the CDN and will sit in the browser cache for another 15 minutes—so 30 minutes total.

For any sort of remotely dynamic content, this could be problematic. If the content changes in that upstream cache, we could still be serving an old version of the resource for 15 more minutes until the browser cache expires.

The Age header helps to prevent against this.

Let’s go back to our example, where the browser requests the resource just before the CDN’s version expires. Only this time, let’s say the CDN communicates how long it’s had the asset in cache by providing an Age header of 840 (14 minutes). The browser knows from the max-age directive that it’s ok to serve an asset that is 15 minutes old, and it knows that the asset has been sitting on the CDN for 14 minutes. So, the browser adjusts the TTL to 1 minute (15 minutes of browser TTL minus 14 minutes it’s already been on the CDN), protecting against this problem of cache layers stacking on top of each other.

This can all get a bit funky if the max-age directive you’re passing to the browser doesn’t align with how long you’re caching the resource upstream.

For example, if you’re telling your CDN to cache a file for a week, but you’re only telling the browser to cache that resource for 15 minutes, then as soon as the Age of that resource exceeds 900 (15*60) the browser will no longer consider that resource safe to cache. Everytime it sees the request, it will note that the age is past the maximum TTL it’s been told to pay attention to, so it goes back out to the servers to try to find a new version.

There are times where having mismatched TTL’s at a caching layer and at the browser may make sense. It’s pretty quick to purge the cache (basically, empty it out) for most CDNs. So sometimes what you’ll see is folks set a long TTL at the CDN layer and a short one at the browser level. Then, if the content does need to change, they can purge the CDN cache quickly and all they have to wait for is the browser to get past whatever short TTL they’ve set there. In those cases, it makes sense from a performance standpoint not to pass the Age header so that the browser can keep caching.

How prefetch works

When you use the prefetch resource hint (which is what Instant.Page does), you’re telling the browser to go grab that resource even though it hasn’t been requested by the current page, and put it into cache.

So, for example, the following example tells the browser to grab the about page and store it.

<link rel="prefetch" href="/about" as="document" />

The browser will request the resource at a very low priority during idle time so that the resource doesn’t compete with anything needed for the current navigation.

As with any request that gets cached, how long it’s cached depends on the caching headers. But with prefetch, there’s an added wrinkle.

The entire point of prefetch is to have something stored for the next navigation: making a prefetch for a resource that expires before that next navigation is wasted work and wasted bytes.

For this reasons, Chromium-based browsers have a period of five minutes where they’ll cache any prefetched resources regardless of any other caching indicators (unless no-store has explicitly been set in the Cache-control header). After that window has expired, the normal Cache-control directives kick in, minus that initial window.

In my case, I serve HTML documents with a max-age of 15 minutes. That means Chrome will save that prefetched resource for 15 minutes so this 5 minute window doesn’t really do anything special.

But if you served an asset with a max-age of 0, then Chrome is still going to hold that resource for 5 minutes before having to revalidate it. The main takeaway here is that to avoid wasted work, the browser ignores the usual indicators of freshness for a period of time.

Firefox, on the other hand, does not have this little extra window for prefetched resources—it treats them like any other cached object, paying attention to the caching headers as normal. So, if (for example) the max-age is 0 for a prefetched resource, Firefox will make the request as directed using prefetch and then make the request again once it discovers it on the next navigation.

Bringing it altogether

Phew. Ok. So we know what the Age header does, we know how the browser uses it to determine caching, and we know that Chromium-based browsers ignore all the usual freshness indicators when it comes to prefetch, at least for a short period of time, and Firefox does not.

All of this means that in Firefox, if the Age exceeds the max-age directive, then the prefetched resource is going to result in two requests: once for the actual prefetch and, because the asset is older than the TTL, once again on the next navigation.

In Chromium-based browsers, it seems likeAge shouldn’t impact prefetch behavior at all—if Chromium ignores other caching directives, why is Age any different? It seems like a bug.

Which is exactly the conclusion Yoav came to:

To clarify, sounds like a Chromium bug. Sending Age headers for cached resources is what caches are supposed to do
And indeed, the 5 minutes calculation includes the Age header, which IMO makes little sense https://source.chromium.org/chromium/chromium/src/+/master:net/http/http_cache_transaction.cc;l=2716;drc=2f11470d7ad8963a9add116df64d2edd1b85d3a4;bpv=1;bpt=1?originalUrl=https:%2F%2Fcs.chromium.org%2F

The bug is the source of what Alexandre was noting. Since Age is being included in the prefetch caching considerations, any prefetched resource in Chrome with an Age higher than either that 5 minute window or the max-age (whichever is longer) can’t be cached, so the request happens twice: once on prefetch and once on the next navigation.

In my specific case, while the bug’s behavior is definitely not ideal, it also doesn’t jump out in the metrics on the aggregate because of my service worker. When the request gets prefetched, the service worker caches it. On that next navigation, the request gets made again, but the service worker has it at the ready, which accounts for why I’m seeing some performance improvements even with the bug.

Now, if we ignore the prefetch specific issues here, we do still have an issue with the way Netlify handles the Age header. Netlify is, interestingly, both the CDN and the origin here. Typically, whenever the CDN has to revalidate that a resource is still fresh with the origin, it will reset the Age header back to 0.

In this case, because Netlify essentially is our origin, there’s no other layer somewhere for Netlify to revalidate with. The buck stops here, or something like that.

By passing the Age header along, and only updating it when the content is changed or cache is explicitly cleared, Netlify creates a situation where the browser will always have to go back to the server (Netlify) to see if the resource is fresh, regardless of that max-age window. The only way around this is to set a very long max-age or make sure to clear your Netlify cache on a semi-regular basis.

I suspect Netlify shouldn’t be passing the Age header down at all. Or, if that header is being applied at their edge layer (I’m not 100% clear on their architecture), then whenever their edge layer has to revalidate with the original source, they should be updating that Age at that point to avoid the issue of an ever-increasing Age.

Where do we go from here?

So, how do we make sure that our prefetched resources are as performant as possible?

First things first: measure. I tried to emphasize this in my last post, but the data about the impact of this approach on my site says nothing about the impact on other sites. In my situation, I’m seeing a small improvement in most situations even with the bug in place. Your mileage may vary. Testing performance changes is good.

From the Chromium side of things, don’t worry about it. Yoav was all over it, and a fix has already landed.

Firefox, however, is another story. It seems they’ve been contemplating making this change for awhile now, so it’s a matter of prioritizing the work. In the meantime, there are a few things to keep in mind.

One, if you have a service worker in place and you’re using an approach where the service worker serves from the cached version first, that helps to offset the double request penalty you might otherwise pay. The first request puts it in the service worker cache, the second gets pulled from there before it has to go any further.

If you don’t have a service worker in place, then you’re going to have to make a decision regarding the Age header.

If you don’t pass the Age header, then Firefox can cache the resource according to your cache headers regardless of whether the age of the resource on the CDN (or proxy) is longer than the max-age communicated to the browser, but it does introduce the risk of extending the total TTL as we saw above. If your max-age directive is set to a short duration and you can quickly purge the upstream cache, you reduce the pain here a little.

If you do pass the Age header along, you avoid longer total TTL issues, but you now risk issuing double requests for every prefetched resource as the age of the cached resource gets older. If the resource changes frequently in the upstream cache, or if you are passing a long max-age directive to the browser, the severity of this risk is reduced a little.

In the end, this comes down to a combination of what services and tools you’re using for those upstream caches, and how frequently your prefetched resources may change.

find the cost of your paper

Sep 13, Grand Remembrances

Today is Grandparents Day in the United States. Being a Grand is a special honor. I feel very blessed that my wife and I have two grandchildren. We were able to visit them today. Yes, we are still being cautious with the coronavirus, but we also find it very difficult to not see them when they live so close. So today we did drop by to visit Jacob (age 10) and Sophia (age 7) along with their parents. We brought donuts and caught up with them. Our grandchildren are still pretty young and this is a precious time in their lives – and ours!

I wish I had known my grandparents better. We never lived in the same place. Dad was a career Air Force pilot, so we moved around a lot. But we did get to see them once in a while when they would visit us, or we them.

A Plague of Giants

There are five known magical ‘kennings’ or types: air, water, fire, earth, and plants. Each nation specializes in of these kennings, and the magic influences the society. There’s a big pitfall with this diversity of ability and locale–not everyone gets along.

Enter the Hathrim giants, or ‘lavaborn’ whose kenning is fire. Where they live the trees that fuel their fire are long gone, but the giants are definitely not welcome anywhere else. They’re big, they’re violent, and they’re ruthless. When a volcano erupts and they are forced to evacuate, they take the opportunity to relocate. They don’t care that it’s in a place where they aren’t wanted.

I first read Kevin Hearne’s Iron Druid books and loved them (also the quirky The Tales of Pell), so was curious about this new venture, starting with A PLAGUE OF GIANTS. Think Avatar: The Last Airbender meets Jim Butcher’s Codex Alera series. Elemental magic, a variety of races, different lands. And it’s all thrown at you from page one.

But this story is told a little differently. It starts at the end of the war, after a difficult victory, and a bard with earth kenning uses his magic to re-tell the story of the war to a city of refugees. And it’s this movement back and forth in time and between key players in this war that we get a singularly grand view of the war as a whole. Hearne uses this method to great effect.

There are so many interesting characters in this book that I can’t cover them all here. Often in books like this such a large cast of ‘main’ character can make the storytelling suffer, especially since they don’t have a lot of interaction with each other for the first 3/4 of the book–but it doesn’t suffer, thankfully. And the characterization is good enough, despite these short bursts, that by the end we understand these people and care about what happens to them.

If there were a main character it would be Dervan, a historian who is assigned to record (also spy on?) the bard’s stories. He finds himself caught up in machinations he feels unfit to survive. Fintan is the bard from another country, who at first is rather mysterious and his true personality is hidden by the stories he tells; it takes a while to understand him. Gorin Mogen is the leader of the Hathrim giants who decide to find a new land to settle. He’s hard to like, but as far as villains go, you understand his motivations and he can be even a little convincing. There’s Abhi, the son of hunters, who decides hunting isn’t the life for him–and unexpectedly finds himself on a quest for the sixth kenning. And Gondel Vedd, a scholar of linguistics who finds himself tasked with finding a way to communicate with a race of giants never seen before (definitely not Hathrim) and stumbles onto a mystery no one could have guessed: there may be a seventh kenning.

There are other characters, but what makes them all interesting is that they’re regular people (well, maybe not Gorin Mogen or the viceroy–he’s a piece of work) who become heroes in their own little ways, whether it’s the teenage girl who isn’t afraid to share vital information, to the scholars who suddenly find how crucial their minds are to the survival of a nation, to the humble public servants who find bravery when they need it most. This is a story of loss, love, redemption, courage, unity, and overcoming despair to not give up. All very human experiences by simple people who do extraordinary things.

Hearne’s worldbuilding is engaging. He doesn’t bottle feed you, at first it feels like drinking from a hydrant, but then you settle in and pick up things along the way. Then he shows you stuff with a punch to the gut. This is no fluffy world with simple magic without price. All the magic has a price, and more often than not it leads you straight to death’s door. For most people just the seeking of the magic will kill you. I particularly enjoyed the scenes with Ahbi and his discovery of the sixth kenning and everything associated with it. But giants? I mean, really? It isn’t bad enough fighting people who can control fire that you have to add that they’re twice the size of normal people? For Hearne if it’s war, the stakes are pretty high, and it gets ugly.

The benefit of the storytelling style is that the book, despite its length, moves along steadily (Hearne is no novice, here). The bits of story lead you along without annoying cliffhangers (mostly), and I never got bored with the switch between characters. It was easy to move between them, and they were recognizable enough that I got lost or confused. The end of the novel felt a little abrupt, but I guess that has more to do with I was ready for the story to continue, despite the exiting climax.

If you’re looking for epic fantasy with fun storytelling and clever worldbuilding, check out A PLAGUE OF GIANTS.

The post A Plague of Giants appeared first on Elitist Book Reviews.

The Artwork Of Gary Choo

Gary Choo is a concept artist/illustrator based in Singapore. I’ve know Gary for a good many years ( 17, actually ), working together in animation studios in Singapore like Silicon Illusions and Lucasfilm. Gary currently runs an art team at Mighty Bear Games, but when time allows he also draws covers for Marvel comics, and they’re amazing –

The Art Of Gary Choo
The Art Of Gary Choo
The Art Of Gary Choo
The Art Of Gary Choo
The Art Of Gary Choo

To see more of Gary’s work or to engage him for freelance work, head down to his ArtStation.

The post The Art Of Gary Choo appeared first on Halcyon Realms – Art Book Reviews – Anime, Manga, Film, Photography.

27