This repository has been archived by the owner on Apr 21, 2023. It is now read-only.

Design Doc: Flush Subresources Early

Jeff Kaufman edited this page Jan 6, 2017 · 1 revision

Flush Subresources Early

Megha Mohabey, 2012-03-09

Last Updated: 2012-05-30

Objective

Induce the browser to start downloading the subresources of an HTML page early by flushing a dummy head before the response from the origin server arrives. Resources that otherwise take multiple round trips to discover benefit most: for example, images referenced in a CSS file are requested only after the CSS is downloaded and parsed, and the same holds for resources requested through XHR in AJAX. The browser must not request a resource twice; it must recognize that the resource has already been requested. The order of execution must also remain unaffected.

Basic Design

Usually the HTML is uncacheable, and fetching the page from the origin server incurs some latency. To hide that latency, the list of cacheable resources seen in the head section of the page is stored in the property cache. When a subsequent request for the same page arrives, a dummy head is flushed. The dummy head is constructed from the list of critical subresources as follows:

  <head>
    <script type="text/javascript">
      // One Image object per URL: reassigning src on a shared object
      // may abort the earlier request.
      new Image().src = "subresource1";
      new Image().src = "subresource2";
    </script>
  </head>

Either an image or an object tag can be used, depending on browser support. This induces the browser to download the resources without executing them. Chrome supports <link rel="subresource" href="jquery.js"/>, which can be used instead of image or object tags to preload the critical subresources.
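A minimal sketch of how such a dummy head could be assembled from a stored URL list. The names `buildDummyHead`, `escapeAttr`, `jsString`, and the `useLinkSubresource` option are hypothetical and are not taken from the PageSpeed source; the sketch covers only the image and link variants mentioned above.

```javascript
// Hypothetical helper: escape a URL for a double-quoted HTML attribute.
function escapeAttr(url) {
  return url
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: emit a URL as a JavaScript string literal that
// cannot terminate the surrounding inline <script> element.
function jsString(url) {
  return JSON.stringify(url).replace(/</g, "\\u003c");
}

// Build the dummy head. With useLinkSubresource (assumed to be chosen
// by the caller for Chrome), emit <link rel="subresource"> tags;
// otherwise emit one Image object per URL.
function buildDummyHead(urls, options = {}) {
  const lines = ["<head>"];
  if (options.useLinkSubresource) {
    for (const url of urls) {
      lines.push(`  <link rel="subresource" href="${escapeAttr(url)}"/>`);
    }
  } else {
    lines.push('  <script type="text/javascript">');
    for (const url of urls) {
      lines.push(`    new Image().src = ${jsString(url)};`);
    }
    lines.push("  </script>");
  }
  lines.push("</head>");
  return lines.join("\n");
}
```

Calling `buildDummyHead(["jquery.js"], { useLinkSubresource: true })` yields a head with a single link tag; omitting the option yields the script-based variant.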

Detailed Design

  1. When a request for an html page is received
    1. Get the response headers.
    2. Extract all the rewritten subresources in the head section of the page.
    3. Extract the html up to the first `<head>` tag (the pre head information).
    4. Store all the above information in the property cache.
  2. When another request for the same page comes, check the property cache to see if there is an entry for the subresources.
  3. If there is an existing entry, the html response is known to be non-cacheable, and the user agent is Chrome
    1. Create a new head from the list of critical subresources, as described under Basic Design.
    2. Prepend it with the pre head information, such as the `<html>` tag and doctype.
    3. Flush this dummy head.
    4. Initiate a fetch request to the origin server.
    5. Suppress the pre head information while writing the original html page, as it was already sent.
    6. Since the html head is flushed early, the response headers have already been sent, so a small piece of javascript needs to be sent to set the cookie after the response is received from the origin server.
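The flow above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `flush_early_flow.cc`: the `Map` stands in for the property cache, the function names `splitAtHead`, `earlyFlush`, and `finishResponse` are hypothetical, URLs are assumed to be attribute-safe already, and the cookie-setting script is left out.

```javascript
// In-memory stand-in for the property cache, keyed by page URL (assumption).
const propertyCache = new Map();

// Split HTML into the pre head part (doctype, <html> tag) and the rest.
function splitAtHead(html) {
  const idx = html.search(/<head[\s>]/i);
  if (idx < 0) return { preHead: "", rest: html };
  return { preHead: html.slice(0, idx), rest: html.slice(idx) };
}

// Decide whether to flush early. Returns the bytes to flush before the
// origin fetch completes, or null when the conditions are not met.
function earlyFlush(pageUrl, userAgentIsChrome) {
  const entry = propertyCache.get(pageUrl);
  if (!entry || entry.htmlCacheable || !userAgentIsChrome) return null;
  const links = entry.subresources
    .map((url) => `<link rel="subresource" href="${url}"/>`)
    .join("");
  return `${entry.preHead}<head>${links}</head>`;
}

// Called once the origin response is available. Records the data needed
// by the next request and returns what still has to be written: the full
// page, or the page minus its pre head part if that was already flushed.
function finishResponse(pageUrl, originHtml, htmlCacheable, headSubresources, flushedEarly) {
  const { preHead, rest } = splitAtHead(originHtml);
  propertyCache.set(pageUrl, {
    preHead,
    subresources: headSubresources,
    htmlCacheable,
  });
  return flushedEarly ? rest : originHtml;
}
```

On the first request `earlyFlush` returns null and the page is written unchanged; on a later request from Chrome it returns the stored pre head bytes followed by the dummy head, and `finishResponse` then omits the pre head bytes from the real page.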

Challenges

  • Ensure that the browser doesn’t download the same resources twice.
  • Ensure that the browser doesn’t cancel the preloading in the middle of a download. E.g., Firefox cancels the request when it discovers that the content type is not an image. So we use different strategies based on the user agent to induce preloading.
  • Ensure that the resource URLs which are flushed early do not change in the real response which comes later.
  • Enabling this filter might lead to contention on the browser side between the real requests and the preloading requests. The number of flushed resources needs to be tuned appropriately to avoid that.
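Two of these challenges have a simple bookkeeping side that can be sketched in a few lines: capping the number of flushed resources, and detecting flushed URLs that no longer appear in the real response. The function names and the `maxResources` parameter are hypothetical; the sketch does not address how browsers deduplicate or cancel requests.

```javascript
// Drop duplicate URLs (keeping first-seen order) and cap the list
// to limit contention with the page's real requests.
function selectFlushCandidates(urls, maxResources) {
  return [...new Set(urls)].slice(0, maxResources);
}

// Return the flushed URLs that are absent from the real response.
// A non-empty result means the early flush fetched something the page
// no longer uses, so the stored list is stale.
function findChangedUrls(flushedUrls, actualUrls) {
  const actual = new Set(actualUrls);
  return flushedUrls.filter((url) => !actual.has(url));
}
```

For example, `findChangedUrls(["a.css", "b.js"], ["a.css", "b.v2.js"])` reports `b.js` as stale, which could be used as a signal to refresh the stored entry.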

Design Alternatives

  • Use a headless browser or a PageSpeed rule to identify the critical subresources.
  • Analyze logs to identify the critical subresources.

Related Efforts

Code Pointers

  • automatic/flush_early_flow.cc
  • rewriter/public/collect_flush_early_content_filter.h
  • rewriter/public/suppress_prehead_filter.h