Slashdot
Optimizing Page Load Times
Posted by
kdawson
on Mon Oct 30, 2006 05:05 AM
John Callender writes, "Google engineer Aaron Hopkins has written an interesting analysis of optimizing page load time. Hopkins simulated connections to a web page consisting of many small objects (HTML file, images, external javascript and CSS files, etc.), and looked at how things like browser settings and request size affect perceived performance. Among his findings: For web pages consisting of many small objects, performance often bottlenecks on upload speed, rather than download speed. Also, by spreading static content across four different hostnames, site operators can achieve dramatic improvements in perceived performance."
Erm.. huh? (Score:2, Insightful)
(Last Journal: Sunday September 19 2004, @10:03PM)
I can see its use on large sites but this seems aimed at smaller sites.
Then again HTML isn't my thing so it goes over my head I guess.
Re:Erm.. huh? (Score:4, Informative)
(http://www.a2b2.com/)
Firstly, if the ISP has a proxy server, then using it will reduce the trip time for some stored content, meaning it only has to go over a few hops rather than perhaps all the way across the world. You can also look at something like Onspeed [onspeed.com], which is a paid-for product that compresses images (though it makes them look worse) and content; it can give a decent boost on very slow (GPRS/3G) connections and also get more out of your transfer quota.
Re:Erm.. huh? (Score:4, Informative)
1 - keepalive/pipelined connections mean only 1 DNS lookup is performed, and since the result is often cached on your local machine, this delay is minimal.
2 - the dns lookup can be happening for the second host while connections to the first host are still downloading, rather than stopping everything while the second host is looked up. This hides the latency of the second lookup.
3 - most browsers limit the number of connections to each server to 2. If you're loading lots of images, this means you can only be loading two at once (or one while the rest of the page is still downloading). If you put images on a different host, you can get extra connections to it. Also, cookies will usually stop an object from taking advantage of proxies/caches. Putting images on a different host is an easy way to make sure they're not cookied.
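A rough sketch of point 3 in Python, for anyone who wants to try it. The hostnames and the `static_url` helper are made up for illustration; the only real idea is that each path always maps to the same host, so a given image keeps one URL and stays cacheable.

```python
import zlib

# Hypothetical cookie-free hostnames that all serve the same static files.
STATIC_HOSTS = [
    "static1.example.com",
    "static2.example.com",
    "static3.example.com",
    "static4.example.com",
]


def static_url(path, hosts=STATIC_HOSTS):
    """Map a static file path to one of several hostnames.

    The choice is a stable hash of the path, so the same object always
    gets the same URL (a random choice would defeat browser caching).
    """
    index = zlib.crc32(path.encode("utf-8")) % len(hosts)
    return "http://%s/%s" % (hosts[index], path.lstrip("/"))


if __name__ == "__main__":
    for name in ["img/logo.png", "img/banner.png", "css/site.css"]:
        print(static_url(name))
```

With two connections per hostname, four hostnames would let the browser open up to eight connections for static content, at the cost of up to three extra DNS lookups.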
Those tenths of seconds add up (Score:5, Informative)
Re:Erm.. huh? (Score:4, Interesting)
(http://orasio.freeservers.com/)
Just because 1 second seems fast, it doesn't mean that it's fast enough to stop improving.
Once you get under that 200ms barrier, the interface has perfect responsiveness; any bigger interval can still be improved.
HTTP Pipelining (Score:5, Informative)
(http://www.phpgd.com/)
For those that don't know what that means: http://www.mozilla.org/projects/netlib/http/pipel
I've had it switched on for ages. I sometimes wonder why it's off by default.
Re:HTTP Pipelining (Score:5, Interesting)
Reference [operawiki.info]
HTTP/1.1 Design (Score:5, Insightful)
From TFA:
And:
From RFC 2616, section 8.1.4:
It's not a browser quirk, it's specified behavior.
Re:HTTP/1.1 Design (Score:5, Interesting)
Re:HTTP/1.1 Design (Score:4, Insightful)
Simulation software available? (Score:4, Informative)
(http://emulemorph.sourceforge.net/)
What (free) simulation is available for this? I only know dummynet which requires a linux server and some advanced routing. But surely there is more. Is there?
Re:Simulation software available? (Score:4, Interesting)
(http://www.ggvaidya.com/ | Last Journal: Sunday July 16 2006, @11:28PM)
Css and Scripts (Score:5, Informative)
(http://t3.dotgnu.info/ | Last Journal: Monday September 26 2005, @06:32AM)
I've done some benchmarks and measurements in the past which will never be made public (I work for Yahoo!). And the most important bits in those have been CSS and Scripts. A lot of performance has been squeezed out of the HTTP layers (akamai, Expires headers), but not enough attention has been paid to the render section of the experience. You could possibly reproduce the benchmarks with a php script which does a sleep() for a few seconds to introduce delays at various points and with a weekend to waste [dotgnu.info].
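The sleep() trick above can be tried without PHP. This is only a minimal sketch in Python, not the setup used for those benchmarks: the `?delay=` query parameter, the handler name and the 10 second cap are all assumptions made for the example.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def parse_delay(path, max_delay=10.0):
    """Return the delay in seconds asked for via ?delay=N, clamped to [0, max_delay]."""
    query = parse_qs(urlparse(path).query)
    try:
        delay = float(query.get("delay", ["0"])[0])
    except ValueError:
        return 0.0
    if not 0.0 <= delay:  # also rejects NaN
        return 0.0
    return min(delay, max_delay)


class SlowHandler(BaseHTTPRequestHandler):
    """Serves a tiny stylesheet after sleeping, to mimic a slow CSS fetch."""

    def do_GET(self):
        time.sleep(parse_delay(self.path))
        body = b"body { color: black; }\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/css")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(port=0):
    """Create (but do not start) a server on localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), SlowHandler)

# To run it: make_server(8000).serve_forever()
# then reference http://127.0.0.1:8000/style.css?delay=3 from a test page.
```

Pointing a stylesheet link, an @import or a script tag at a delayed URL like this should make it easy to see which resources hold up rendering in a given browser.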
The page does not start rendering till the last CSS stream is completed, which means if your CSS has @import url() entries, the delay before render increases (until that file is pulled & parsed too). It really pays to have the quickest load for the CSS data over anything else, because without it, all you'll get is a blank page for a while.
Scripts marked defer do not always defer, and so much inline code in <script> tags depends on such scripts that a lot of browsers just pull the scripts as and when they find them. There seem to be just two threads downloading data in parallel (from one hostname), which means a couple of large (but rarely used) scripts in the code will block the rest of the CSS/image fetches. See flickr's organizr [flickr.com] for an example of that in action.
You should understand that these resources have different priorities in the render land and you should really only venture here after you've optimized the other bits (server [yahoo.com] and application [php.net]).
All said and done, good tutorial by Aaron Hopkins - a lot of us have had to rediscover all that (& more) by ourselves.
Spreading content across hostnames... (Score:1)
Caching of dynamic content (Score:5, Insightful)
Abolishment of nasty long query strings in favor of nicer, more memorable URIs is also something we should be seeing more of in "Web 2.0." Use mod_rewrite [google.com], you'll feel better for it.
Pipelining (Score:2)
(http://inglorion.net/ | Last Journal: Thursday October 06 2005, @07:17AM)
``Neither IE nor Firefox ship with HTTP pipelining enabled by default.''
Huh? So all these web servers implement keep-alive connections and browsers don't use it?
Re:Pipelining (Score:4, Informative)
Keep-alive no:
Open connection
-Request
-Response
Close Connection
Open connection
-Request
-Response
Close Connection
-Repeat-
Keep-alive yes:
Open connection
-Request
-Response
-Request
-Response
-Repeat-
Close Connection
Pipe-lining yes:
Open connection
-Request
-Request
-Repeat-
-Response
-Response
-Repeat-
Close Connection
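The three diagrams above can be turned into a back-of-the-envelope count of round trips. This is a deliberately simplified model, not a measurement: it assumes one connection, one round trip for the TCP handshake, one round trip per request/response pair, and it ignores bandwidth, slow start and server time. The function names are made up for the example.

```python
def round_trips(n_objects, mode):
    """Round trips needed to fetch n_objects over a single connection path."""
    if n_objects <= 0:
        return 0
    if mode == "no-keepalive":
        # Every object pays for a new handshake plus its own request/response.
        return 2 * n_objects
    if mode == "keepalive":
        # One handshake, then each request waits for the previous response.
        return 1 + n_objects
    if mode == "pipelining":
        # One handshake, then all requests go out back to back.
        return 2
    raise ValueError("unknown mode: %r" % (mode,))


def estimated_seconds(n_objects, mode, rtt_ms):
    """Latency cost only, for a given round trip time in milliseconds."""
    return round_trips(n_objects, mode) * rtt_ms / 1000.0


if __name__ == "__main__":
    for mode in ("no-keepalive", "keepalive", "pipelining"):
        print(mode, round_trips(10, mode), estimated_seconds(10, mode, 100))
```

Under these assumptions, 10 objects at a 100 ms round trip cost 20, 11 and 2 round trips respectively, which is why the latency matters more than the bandwidth for pages made of many small objects.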
Connection Limits (Score:3, Interesting)
(http://inglorion.net/ | Last Journal: Thursday October 06 2005, @07:17AM)
Anybody know why? This seems pretty dumb to me. Request a page with several linked objects (images, stylesheets, scripts,
Requests Too Large (Score:3, Interesting)
(http://inglorion.net/ | Last Journal: Thursday October 06 2005, @07:17AM)
``Most DSL or cable Internet connections have asymmetric bandwidth, at rates like 1.5Mbit down/128Kbit up, 6Mbit down/512Kbit up, etc. Ratios of download to upload bandwidth are commonly in the 5:1 to 20:1 range. This means that for your users, a request takes the same amount of time to send as it takes to receive an object of 5 to 20 times the request size. Requests are commonly around 500 bytes, so this should significantly impact objects that are smaller than maybe 2.5k to 10k. This means that serving small objects might mean the page load is bottlenecked on the users' upload bandwidth, as strange as that may sound.''
I've said for years that HTTP requests are larger than they should be. It's good to hear it confirmed by someone who's taken seriously. This is even more of an issue when doing things like AJAX, where you send HTTP requests and receive HTTP responses + XML verbosity for what should be small and quick user interface actions.
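To put numbers on the quoted paragraph, here is a small worked calculation in Python. It assumes 1 Kbit = 1000 bits and ignores TCP/IP overhead and latency, so treat the figures as rough; the function names are made up for the example.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Time to push size_bytes through a link of the given raw speed."""
    return size_bytes * 8 / bits_per_second


def break_even_bytes(request_bytes, down_bps, up_bps):
    """Response size that takes as long to download as the request took to upload."""
    return request_bytes * down_bps / up_bps


if __name__ == "__main__":
    request = 500  # bytes, the typical request size quoted above
    for down, up in [(1_500_000, 128_000), (6_000_000, 512_000)]:
        print(
            "%d/%d bit/s: request takes %.2f ms to send, break-even object is %.0f bytes"
            % (down, up, transfer_seconds(request, up) * 1000,
               break_even_bytes(request, down, up))
        )
```

On the 1.5Mbit/128Kbit line a 500-byte request takes about 31 ms to send, the same time it takes to receive roughly 5.9k, which sits inside the 2.5k to 10k range from the article.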
ooh sub domain spam (Score:1)
(http://hauntingthunder.wordpress.com/)
Latency (Score:2)
(http://inglorion.net/ | Last Journal: Thursday October 06 2005, @07:17AM)
If you read the article, you will see that the default behavior for Firefox and MSIE is to use only up to two connections per hostname (resulting in many objects being received sequentially - add one round trip time for each), and that they don't use HTTP pipelining, meaning a new connection is set up for each object (add one round trip time for each).
In other words: it's the latency, stupid.
Gmail (Score:3, Insightful)
(http://www.upperland.net/)
The fun part is that newer AJAX products from Google (like goffice) don't suffer from this behavior; they have much cleaner code (just pick view source in your favorite browser and see). Probably Gmail's HTML/JavaScript is already showing its age, and paying the price for being one of the first AJAX apps at Google.
Is this really news? (Score:2)
(http://www.ofcourseimright.com/)
Also, the reason pipelining is turned off by default in many browsers is that there are a lot of middleboxes that can't handle it.
Page load time is still important (Score:2)
(http://www.cheapcheap.biz/)
There are a lot of posts here asking "why is this important" and saying that pages already load fast enough on their broadband Internet connection. That may be true for you, but I'm frequently in a position where I am designing a site that needs to load over a slow satellite connection in rural Africa, say, or into a remote village in Nepal. They have a fairly recent computer, OS and browser on the receiving end, but their Internet connection is dog slow; anything I can do to speed it up will be greatly appreciated. It's back to 1980s dial-up speeds.
This isn't everyone's problem, I admit, but it's an issue for a lot of people in the world.
"Also, by spreading static content..." (Score:2)
How ironic that a Google engineer would say this, since doing this will also pretty well kill your Google PageRank rankings. Google is great, yes, but among its many, many problems are the ridiculous ways that it forces people to do web design if they want a decent PageRank. Another is how it "helpfully" directs you to "geographically relevant" searches, meaning that, for example, if you want a hotel room in Egypt and browse from the UK, you get all the links from (much more expensive) UK-based hotel and travel shops rather than, say, ones in Egypt or elsewhere that, while also in English, are much cheaper.
4 hostnames and security (Score:2)
Not to mention the management issues of having to link to content on 4 different domains in an efficient enough manner.
This leaves us with pipelining on the client, which could result in much worse load peaks on the servers though.
In the end: let the page load a little slower, the workarounds are not worth it.
render time (Score:1)
(http://127.0.1.1/ | Last Journal: Tuesday October 03 2006, @08:10AM)
Cheers,
-S
Optimization of slashdot? (Score:1)
(Last Journal: Friday March 31 2006, @02:35PM)
The problem, as I see it, is that issues like page load times are partly caused by browser issues (HTTP pipelining, cache, etc) and partly caused by server issues. (yes, yes, I know it's obvious) However, consider the idea of specialized configurations. Essentially a per-site set of conditions. For Slashdot.org, allow multiple HTTP connections (have to load that style file) and just load the images from the old cache (after all, the Microsoft Borg icon hasn't changed, has it?)
To a certain extent, this could be handled almost in a cookie-like fashion, except it's read before the initial HTTP request is made. You'd know that you're only requesting parts of the page, and could do a background query for elements which have been updated (i.e. a new category image, etc).
Then again, I also hate it when the loading of a PDF causes a loss of focus and slowing of the browser. Not the same, but in the same category of annoyance.
Q. What if the connection limit was set to 2 ? (Score:1)
For example, say a popular browser in its next release had a default of 10?
What would be the effect on servers (small and large)? Would it help everyone? Do servers reject >2 connections by default?
Connection speed (Score:1)
(http://www.xanga.com/ipooptoomuch/ | Last Journal: Thursday September 06, @07:13AM)
This explains why blogger is down.... (Score:1)
(http://convergence.in/blog)
Static content in multipart packages? (Score:2, Interesting)
(http://vectorsigma.org/)
Faster downloading (Score:1)
All the offsite stuff is ads anyway. Block them. (Score:3, Insightful)
(http://www.animats.com)
This is an excellent argument for ad blocking. The article never mentions the basic truth - almost all offsite content on web pages is ads. (Of course, this is someone from Google talking, and Google, after all, is an ad-delivery service which runs a search engine to boost their hits.) Web page load is choking on ads. I noted previously that some sites load ads from as many as six different sources. This saturates the number of connections the browser supports. Page load then bottlenecks on the slowest ad server.
So install AdBlock and FlashBlock in Firefox, and watch your browsing speed up.
Web-based advertising looks like a saturated market. Watch for some big bankruptcies among advertising-supported services.
It's not reading the AMOL data in the VBI? (Score:2)
(http://www.animats.com)
So MythTV is still guessing, like the ad-skipping VCRs of twenty years ago.
There's better data available. Broadcast TV signals contain considerable metadata. The AMOL data in the VBI and the SID data in the audio clearly identify the program content and source. Here's an encoder [norpak.ca] for that information, which is inserted to make Nielsen ratings and advertising payments work.
See U.S. patent #5,699,124 for some details of how the data is encoded.
So far, the PVR community doesn't seem to have figured this stuff out, and the specs aren't easy to get, but the data is out there.
Web optimization techniques (Score:1)
E.g., Cisco's AVS (formerly Fineground): http://www.cisco.com/en/US/products/ps6492/products_white_paper0900aecd80321a32.shtml [cisco.com]
- implements the multiple DNS name solution suggested by Mr Hopkins
- has a clever way of eliminating browser cache validation requests
- has a mechanism to transparently measure actual (not simulated) user page load times
Other products have similar but, in this age of software patents, slightly different optimizations:
Slashdot Needs to Listen (Score:2)
Guys, Splunk apparently does not have the server power or bandwidth to service Slashdot. Get a clue and dump their ads or tell them to buy another server box.
Ninety percent of the time when I'm waiting on a page to load, it's because some ad server is overloaded. The rest of the time it's because the site server itself is overloaded (or "Slashdotted").
Use Firefox - then use Adblock generously.
Tweak Network Settings (Score:1)
(http://www.bitstorm.org/edwin/en/)
Tweak Network Settings:
http://www.bitstorm.org/extensions/tweak/ [bitstorm.org]
Very easy to use: just use the "Power profile".
Re:It's talking about 'perceived performance' (Score:2)
When I click on an element in a web page to manage my email or use a word processor, the response time is going to be around my ping (30-90 ms depending on where in the country it is) plus the time to load. That is long enough that I am clicking, and waiting. If I were working on a local native app, the response time would be under 30 ms and I would probably not even notice it.
For a quick email check or reading webpages, it doesn't really matter too much. But if you're trying to use that for constant daily productivity sorts of things (or even have a lot of email to go through) it is wasting a ton of your time. There are some real advantages to moving applications online and into a web browser (I've even heard people suggesting we should move to a web-browser for the full interface of our windowing system) but speed is currently NOT one of them. Since it seems like it's going to be more or less forced on me, anything that can make it faster and more tolerable is quite appreciated.