Now that Minify 2 is out, I’ve been thinking of expanding the project to take on the less-than-straightforward task of serving pre-encoded files from Apache and letting Apache negotiate which encoded version to send. Taking CGI out of the picture would be the natural next step toward serving these files as efficiently as possible.
All mod_negotiation docs and articles I could find applied mainly to language negotiation, so I hope this is helpful to someone. I’m using Apache 2.2.4 on WinXP (XAMPP package), so the rest of this article applies to this setup.
Type-map over MultiViews
I was first able to get this working with MultiViews, but everything I’ve read says this is much slower than using type-maps. Supposedly, with MultiViews, Apache has to read the directory contents, build an internal type-map structure for each file, and then apply the type-map algorithm to choose the resource to send, so creating the type-maps explicitly saves Apache the trouble. A type-map is required for each resource, but Minify will eventually automate this.
Setup
To simplify config, I’m applying this setup to one directory where all the content-negotiated files will be served from. Here’s the starting .htaccess (we’ll add more later):
# turn off MultiViews if enabled
Options -MultiViews
# For *.var requests, negotiate using type-map
AddHandler type-map .var
# custom extensions so existing handlers for .gz/.Z don't interfere
AddEncoding x-gzip .zg
AddEncoding x-compress .zc
AddEncoding deflate .zd
Now I placed 4 files in the directory (the encoded files were created with this little utility):
- before.js (not HTTP encoded)
- before.js.zd (deflate encoded – identical to gzip, but without header)
- before.js.zg (gzip encoded)
- before.js.zc (compress encoded)
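If you’d rather script the encoding step than use a utility, here is a minimal Python sketch of how the gzip and deflate variants could be produced. The function names are my own, and the custom .zg/.zd extensions come from the config above. It writes a raw (headerless) deflate stream to match the description of the .zd file; strictly, the HTTP spec defines “deflate” as zlib-wrapped data, so treat that choice as an assumption. The compress (.zc) variant is left out because Python’s standard library has no LZW encoder.

```python
import gzip
import zlib
from pathlib import Path


def raw_deflate(data, level=9):
    """Compress to a raw deflate stream (no zlib or gzip wrapper)."""
    # A negative wbits value asks zlib for a headerless stream.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def write_encoded_variants(source):
    """Write <source>.zg (gzip) and <source>.zd (deflate) next to source."""
    source = Path(source)
    data = source.read_bytes()
    outputs = {
        "x-gzip": source.with_name(source.name + ".zg"),
        "deflate": source.with_name(source.name + ".zd"),
    }
    outputs["x-gzip"].write_bytes(gzip.compress(data, compresslevel=9))
    outputs["deflate"].write_bytes(raw_deflate(data))
    return outputs
```

Calling write_encoded_variants("before.js") would leave before.js.zg and before.js.zd beside the original file.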
Now the type-map “before.js.var”:
URI: before.js.zd
Content-Type: application/x-javascript; qs=0.9
Content-Encoding: deflate

URI: before.js.zg
Content-Type: application/x-javascript; qs=0.8
Content-Encoding: x-gzip

URI: before.js.zc
Content-Type: application/x-javascript; qs=0.7
Content-Encoding: x-compress

URI: before.js
Content-Type: application/x-javascript; qs=0.6
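Since one type-map is needed per resource, generating them is an obvious candidate for a script. The following is only a sketch of what that could look like in Python; build_type_map and its argument layout are hypothetical, not part of Minify. It emits one record per variant and separates records with a blank line, which is how Apache tells variants apart in a type-map file.

```python
def build_type_map(base_name, content_type, variants):
    """Return type-map text for base_name.

    variants is a list of (extension, encoding, qs) tuples; pass
    "" and None as extension and encoding for the unencoded file.
    """
    records = []
    for extension, encoding, qs in variants:
        lines = [
            f"URI: {base_name}{extension}",
            f"Content-Type: {content_type}; qs={qs}",
        ]
        if encoding:
            lines.append(f"Content-Encoding: {encoding}")
        records.append("\n".join(lines))
    # Variant records are separated by a blank line.
    return "\n\n".join(records) + "\n"


type_map = build_type_map(
    "before.js",
    "application/x-javascript",
    [
        (".zd", "deflate", 0.9),
        (".zg", "x-gzip", 0.8),
        (".zc", "x-compress", 0.7),
        ("", None, 0.6),
    ],
)
```

Writing type_map to “before.js.var” would reproduce the four records listed above.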
What this gives us is already useful. When the browser requests before.js.var, Apache returns the file that (a) is encoded in a format the browser accepts and (b) has the highest qs value among those acceptable files. If the browser is Firefox, that will be “before.js.zd” (the deflated version). Apache will also send the necessary Content-Encoding header so Firefox can decode it, and the Vary header to tell caches that several versions exist at this URL and that which one you get depends on the Accept-Encoding header sent in the request.
The Content-Encoding lines in the type-map tell Apache to look out for these encodings in the Accept-Encoding request header. E.g. Firefox accepts only gzip or deflate, so “Content-Encoding: x-compress” tells Apache that Firefox can’t accept before.js.zc. If you strip out the Content-Encoding line from “before.js.zc” and give it the highest qs, Apache will dutifully send it to Firefox, which will choke on it. The “x-” in the Content-Encoding lines and AddEncoding directives is used to negotiate with older browsers that call gzip “x-gzip”. Apache understands that it also has to report this encoding the same way.
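To make the selection rule concrete, here is a deliberately simplified Python model of it. This is not Apache’s actual algorithm (mod_negotiation weighs more dimensions and handles wildcards and other cases ignored here); it only mirrors the two conditions described above: drop variants whose encoding the browser does not accept, then take the highest qs. All function names are hypothetical.

```python
def parse_accept_encoding(header):
    """Map each encoding token in the header to its q value (default 1.0)."""
    accepted = {}
    for part in header.split(","):
        token, _, params = part.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat an unparsable q as "not accepted"
        accepted[token] = q
    return accepted


def normalize(encoding):
    """Treat legacy names such as x-gzip as aliases of gzip."""
    return encoding[2:] if encoding.startswith("x-") else encoding


def choose_variant(variants, accept_encoding):
    """Pick a URI from (uri, encoding_or_None, qs) tuples."""
    accepted = {
        normalize(name): q
        for name, q in parse_accept_encoding(accept_encoding).items()
    }
    best = None
    for uri, encoding, qs in variants:
        # Unencoded variants are always acceptable in this model.
        if encoding is not None and accepted.get(normalize(encoding), 0.0) <= 0:
            continue
        if best is None or qs > best[1]:
            best = (uri, qs)
    return best[0] if best else None


VARIANTS = [
    ("before.js.zd", "deflate", 0.9),
    ("before.js.zg", "x-gzip", 0.8),
    ("before.js.zc", "x-compress", 0.7),
    ("before.js", None, 0.6),
]
```

With these variants, a request sending “Accept-Encoding: gzip,deflate” selects before.js.zd in this model, and a request with no accepted encodings falls back to before.js.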
Sending other desired headers
I’d like to add the charset to the Content-Type header, and also some headers to optimize caching. The following .htaccess snippet removes ETags and adds the far-off expiration headers:
# Below we remove the ETag header and set a far-off Expires
# header. Since clients will aggressively cache, make sure
# to modify the URL (querystring or via mod_rewrite) when
# the resource changes
# remove ETag
FileETag None
# requires mod_expires
ExpiresActive On
# sets Expires and Cache-Control: max-age, but not "public"
ExpiresDefault "access plus 1 year"
# requires mod_headers
# adds the "public" to Cache-Control.
Header set Cache-Control "public, max-age=31536000"
Adding the charset was a bit more tricky. The type-map docs show the author placing the charset in the Content-Type lines of the type-map, but this doesn’t work! The value sent is actually just the original type set in Apache’s “mime.types”. So to actually send a charset, you have to redefine the type for the js extension (I added CSS as well, since I’ll be serving those files the same way):
# Necessary to add charset while using type-map
AddType application/x-javascript;charset=utf-8 js
AddType text/css;charset=utf-8 css
Now we have full negotiation based on the browser’s Accept-Encoding and are sending far-off expiration headers and the charset. Try it.
Got old IE version users?
This config trusts Apache to decide correctly which encodings a browser can handle. The problem is that IE6 before XP SP2 (the release that added the “SV1” token to the user agent) and older IE versions lie: they can’t really handle encoded content in some situations. HTTP_Encoder::_isBuggyIe() roots these out, but Apache needs help. With mod_rewrite, we can sniff for the affected browsers and rewrite their requests to go directly to the non-encoded files:
# requires mod_rewrite
RewriteEngine On
RewriteBase /tests/encoding-negotiation
# IE 5 and 6 are the only ones we really care about
RewriteCond %{HTTP_USER_AGENT} MSIE\ [56]
# but not if it's got the SV1 patch or is really Opera
RewriteCond %{HTTP_USER_AGENT} !(\ SV1|Opera)
RewriteRule ^(.*)\.var$ $1 [L]
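If you want to check which user agents these conditions catch before deploying them, the same two patterns can be exercised outside Apache. Below is a small Python sketch of that logic; rewritten_path is a hypothetical helper that models the rule set, not anything mod_rewrite provides, and the user agent strings in the usage lines are just typical examples.

```python
import re

# Same patterns as the two RewriteCond lines above.
BUGGY_IE = re.compile(r"MSIE [56]")
EXEMPT = re.compile(r" SV1|Opera")


def rewritten_path(path, user_agent):
    """Return the path that would be served for this request in this model."""
    is_targeted = (
        BUGGY_IE.search(user_agent) is not None
        and EXEMPT.search(user_agent) is None
    )
    if is_targeted and path.endswith(".var"):
        # Mirrors RewriteRule ^(.*)\.var$ $1: strip the .var suffix.
        return path[: -len(".var")]
    return path


IE6_PLAIN = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
IE6_SV1 = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"

print(rewritten_path("before.js.var", IE6_PLAIN))
print(rewritten_path("before.js.var", IE6_SV1))
```

In this model, only the unpatched IE request is redirected to the plain before.js; the SV1 request keeps the .var URL and goes through negotiation.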
Also worth mentioning is that, if you figure out the browser encoding elsewhere (e.g. in PHP while generating markup), you can link directly to the encoded files (like before.js.zd) and they’ll be served correctly.
No mod_headers
If you’re on a shared host without mod_headers enabled (like mrclay.org for now), you’ll just have to accept what mod_expires sends, i.e. the HTTP/1.1 Cache-Control header won’t have the explicit “public” directive, but the response should still be quite cacheable.
Update: The configuration above can serve larger files 2.5x as fast as using mod_deflate.
Update Correction:
The Apache buggy browser code (last line) needs changing from
(orig) RewriteRule ^(.*)\.var$ $1 [L]
// fixed and tested on both IE6 with SV1 and IE6 without
to RewriteRule ^(.*)\.css$ $1 [L]
or RewriteRule ^(.*)\.js$ $1 [L]
Update: Mention to your readers that the above .htaccess file is to be placed in its respective static resource folder (e.g. /css or /javascript) and not the root folder. Also, you could explain the value of “RewriteBase /tests/encoding-negotiation”: this is the path to the static resource folder (css or javascript).
On my own live website, I have implemented an HTTP encoding negotiation method with type-map resources (javascript and css) and it works very well. I have also tried the ‘minify’ code and benchmarked the ‘real-time’ results as marginal compared to the “Apache HTTP encoding negotiation notes” pre-compression technique.
Thanks for sharing! We learn from each other, your project and notes have helped me heaps!
@Peter: Thanks. The RewriteRule I gave is correct. It purposefully redirects to the un-encoded resource for the targeted browsers. So:
IE5: GET file.js.var => file.js
Only the non-buggy browser requests get handled by mod_negotiation.
BTW, I’ve recently rewritten Minify::serve to get about 3x the performance. It’s now faster than mod_deflate!
It would be useful if Apache could be configured to generate the different encodings just once, whenever the encoded versions do not exist yet or the base file has changed. That way you wouldn’t have to compress them manually each time you upload or change a file.
Hi Steve,
You said in project goal #3:
“Minify should work towards maintaining builds of pre-encoded files and letting Apache serve them.”
Will this work?
– Dynamic CMS which knows which JS files to include (e.g. the same as “/?f=…”)
– Get the last modified date
– Create an MD5 hash of these files (to get the filename)
– Append the lastmod date so you get -12345.js
– Use Minify to create this file
– Use typemap so apache serves the correct file to the browser.
All I am wondering is how Apache will handle the caching, since you no longer go through the PHP interpreter. Any ideas / tests on this one?