Analyser algorithm

The algorithm first analyses the HTML file of the website's home page. The file is parsed to detect all frames (which are in turn parsed to find further files), stylesheets (the @import directive is also supported now), JavaScript files, CSS background images, regular images, and multimedia files (<embed>). All of these files are then downloaded one by one, and the stylesheets are additionally scanned for CSS images and external files.
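
As a rough illustration, the resource-collection step could look like the following sketch (a hypothetical helper, not the tool's actual code); it gathers the URLs of frames, stylesheets, scripts, images and <embed> objects from a page:

    # Minimal sketch of the resource-collection step: walk the page's HTML
    # and gather candidate resource URLs by tag type.
    from html.parser import HTMLParser

    class ResourceCollector(HTMLParser):
        def __init__(self):
            super().__init__()
            self.frames, self.styles, self.scripts = [], [], []
            self.images, self.media = [], []

        def handle_starttag(self, tag, attrs):
            a = dict(attrs)
            if tag in ("frame", "iframe") and "src" in a:
                self.frames.append(a["src"])        # parsed further for more files
            elif tag == "link" and a.get("rel", "").lower() == "stylesheet":
                self.styles.append(a.get("href"))   # also scanned for @import
            elif tag == "script" and "src" in a:
                self.scripts.append(a["src"])
            elif tag == "img" and "src" in a:
                self.images.append(a["src"])
            elif tag == "embed" and "src" in a:
                self.media.append(a["src"])

    collector = ResourceCollector()
    with open("index.html", encoding="utf-8") as f:
        collector.feed(f.read())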

Size, type, ETag, and cache headers are recorded for each downloaded file. Simple minification is then performed: directly for HTML files, with CSS Tidy 1.3 for CSS, and with Dean Edwards' Packer for JavaScript. All textual files are then gzipped, and the result is compared with the originally downloaded file (to detect whether it was served compressed in the first place). This compression check isn't perfect, but it tells us whether the original file has already been minified in some way.
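
The already-compressed check can be approximated as follows (a minimal sketch, assuming only byte sizes are compared):

    # Gzip the re-minified file and compare sizes with the originally
    # downloaded bytes: if the original is no larger than our gzipped
    # result (within a tolerance), it was most likely already optimized.
    import gzip

    def looks_preoptimized(original: bytes, minified: bytes,
                           tolerance: float = 1.05) -> bool:
        return len(original) <= len(gzip.compress(minified)) * tolerance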

Optimization can also be performed for images. BMP images are converted to PNG with bmp2png, and GIF images with gif2png; PNG images are recompressed with pngcrush, and JPEG images are additionally optimized with jpegtran.
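
Dispatching to these tools could look roughly like the sketch below (the flags shown are common ones; verify them against your installed versions, and the output naming is an assumption):

    # Sketch of the image-optimization dispatch, shelling out to the
    # command-line tools named above.
    import os
    import subprocess

    def optimize_image(path: str) -> None:
        ext = os.path.splitext(path)[1].lower()
        if ext == ".bmp":
            subprocess.run(["bmp2png", path], check=True)       # BMP -> PNG
        elif ext == ".gif":
            subprocess.run(["gif2png", path], check=True)       # GIF -> PNG
        elif ext == ".png":
            subprocess.run(["pngcrush", path, path + ".opt"], check=True)
        elif ext in (".jpg", ".jpeg"):
            subprocess.run(["jpegtran", "-optimize", "-copy", "none",
                            "-outfile", path + ".opt", path], check=True)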

When all the data has been collected, an overall estimate is computed. The analysis results also include complete information on the downloaded files and recommendations for reducing the website's load time (by decreasing both the number of requests and the size of the downloaded files).
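
The exact weighting behind the overall estimate isn't specified here; purely for illustration, a score combining request count and total size might look like this (all weights and caps are invented for the example):

    # Illustrative only: the analyser's real scoring formula is not
    # documented here. Fewer requests and fewer bytes give a higher score.
    def overall_score(num_requests: int, total_bytes: int) -> float:
        req_penalty = min(num_requests / 100.0, 1.0)        # assumed cap: 100 requests
        size_penalty = min(total_bytes / 1_000_000.0, 1.0)  # assumed cap: 1 MB
        return 100.0 * (1.0 - 0.5 * req_penalty - 0.5 * size_penalty)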

Server cookie headers are analyzed as well. The DNS lookup time for the server (measured with dig) and the socket open time (measured with ab) are taken as the typical network delay, i.e. roughly the time before the client browser receives the first byte from the server. This doesn't cover every possible case, but it is the real delay of a download from the analysing server to the analyzed one. If the measured delay exceeds 200 ms, it is capped at that figure, so that connection lags don't skew the results of the analysis.
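
A comparable measurement in code (using Python's socket module here instead of dig and ab, just to show the idea):

    # Time a DNS lookup plus a TCP connect and cap the result at 200 ms,
    # mirroring the delay measurement described above.
    import socket
    import time

    def network_delay_ms(host: str, port: int = 80) -> float:
        start = time.monotonic()
        addr = socket.gethostbyname(host)                   # DNS lookup time
        with socket.create_connection((addr, port), timeout=5):
            pass                                            # socket open time
        delay = (time.monotonic() - start) * 1000.0
        return min(delay, 200.0)                            # cap at 200 ms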

The final website load times, both for broadband and for dial-up connections, are approximate and can differ considerably from real values, since many additional conditions are not taken into account. Both load times cover the download of the whole page (including all external files such as images), so the page may be displayed earlier: the browser keeps loading the remaining components for some time after it renders the initial layout. The speed-up calculated by Web Optimizator is not the maximum possible, but it can be achieved provided all the recommendations are followed.
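
A back-of-the-envelope version of such an estimate (the connection speeds are assumptions, and real browsers download several files in parallel, which this ignores):

    # Naive load-time model: transfer time at an assumed line speed plus
    # one network delay per request; parallel connections are ignored.
    def load_time_s(total_bytes: int, num_requests: int,
                    bytes_per_s: float, delay_s: float = 0.2) -> float:
        return total_bytes / bytes_per_s + num_requests * delay_s

    broadband = load_time_s(500_000, 30, bytes_per_s=1_000_000 / 8)  # ~1 Mbit/s
    dialup = load_time_s(500_000, 30, bytes_per_s=56_000 / 8)        # 56 kbit/s modem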

To model the website load process and its visual optimization (which allows the possible speed-up to be estimated more precisely), you need to register.

Known issues

  • File names containing spaces are parsed incorrectly (although such names contradict the specification)

See also