Hopes for 2009

In no particular order, I hope…

  • the release of IE8 will spur organizations currently standardized on IE6 to finally bite the bullet and either upgrade their users to IE8 or move them to other browsers. Killing off IE6 (and IE7, really) will significantly decrease web development costs and reinvigorate CSS by opening up a world of selectors and properties that have been unofficially “off the table” due to the prevalence of IE6. As bad as IE6 and 7 have been in comparison to their competitors, IE8 looks to be a major step forward for the default browser of Windows systems.
  • the popularity of IE8 will encourage IE upgraders to try other browsers as well. While IE8 is great for the IE user, the last thing the web needs is another browser market so dominated by one product that web developers move from web standards back to coding for the dominant browser. Although IE8 looks to be committed to standards support, there will be plenty of quirky rendering modes that ignorant developers will grow accustomed to if they don’t test in other browsers.
  • Opera 10 will continue its strides toward compatibility with broken sites and stay so blazingly fast and handy out-of-the-box that I’m willing to do most of my browsing without the luxury of add-ons. As far as I know, no other browser lets me put my address bar and tabs on the bottom where I like ’em; it’s the little things.
  • that the kids who vandalized a bunch of cars last night, including mine, will receive better parenting than they have in the past. God knows making them spend time with other messed up kids in juvenile detention or giving them permanent criminal records isn’t going to do anything positive for their lives.
  • that our family will have fewer health problems. For the past couple months illnesses just haven’t let up long enough for us to catch our breath. For several events we were looking forward to we were either out of town, sick, or just too exhausted to bother. A Roller Rebels bout, the Of Montreal show, Don & Sarah’s mixtape party…
  • that the new president will choose a drug czar with a background in harm reduction or, better yet, open a public dialogue to discuss if the current federal system (ONDCP, DEA, and CSA) is the right way to reduce the public harms associated with drug use.
  • that my friends and random readers (especially those, like me, who don’t use any drugs) will take some time to learn about what the War on Drugs is doing to the world. The top search engine results are as good a place to start as any, and, of those, the Drug Policy Alliance provides the best overview of the harms, while Rolling Stone describes the last 20 years and the battle of cocaine and meth. Since October I’ve found this topic fascinating, and every day I uncover more evidence that our current system based on blanket prohibition causes tremendous societal harm.
  • that the web will continue to be an exhaustive source of information about drugs, policies, and history and help people form educated opinions based on facts. I grew up knowing nothing about drugs but the old “this is your brain on drugs” ads, so when I started reading about the real science and history of illegal drugs it was quite eye-opening. First you realize how dangerous heroin and meth are, then you find to your shock that marijuana is hardly the drug the government makes it out to be, then that alcohol and tobacco are so much worse and you wonder why they’re exempt from the CSA, then you read about when alcohol was illegal and the havoc that caused, and finally you realize it’s not the drugs, but the prohibition causing the biggest problems.
  • that the media will continue its coverage of the harms of the War on Drugs in Mexico and continue to give voice to clear-headed, intelligent criticism of drug policy, as it has done increasingly of late.
  • that online news sources continue to allow readers to openly discuss drug policy in their commenting systems. It’s obvious that more people are taking the time to do their research; the “won’t someone please think of the children!” arguments are thankfully falling out of fashion, though I’m increasingly seeing the “sends the wrong message to kids” argument from drug warriors anytime anyone suggests reducing criminal penalties for marijuana possession.
  • I’ll play and record more music.
  • that Skate 2 will be as awesome as it looks.
  • state budget cuts will cost neither Kathleen nor me a job. Did I mention the War on Drugs is damn expensive?
  • the recession will not cost Gainesville any of its awesome eateries. Yesterday at The Jones’ I had corn-flake-encrusted brioche french toast topped with almond whipped cream. It was possibly the most magical thing I’ve ever tasted.

Where’s the code?

Google’s free open source project hosting has been awesome for Minify, so when I was looking around for Subversion hosting for my personal code, I figured why not host it there? So here’s a bunch of my PHP and Javascript code. Hopefully some of it will be useful to people. A few PHP highlights:

  • HashUtils implements password hashing with a random salt, preventing the use of rainbow table cracks. It can also sign and verify the signature of string content.
  • StringDebug makes debugging strings with whitespace/non-printable/UTF-8 characters much less painful.
  • CookieStorage saves/fetches tamper-proof and optionally encrypted strings in cookies.
  • TimeZone simplifies the handling of date/times between timezones using an API you already know: strtotime() and date().
  • Utf8String is an immutable UTF-8 string class that aims to simplify the API of the phputf8 library and make behind-the-scene optimizations like using native functions whenever possible. Mostly a proof-of-concept, but it works.
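To illustrate the first of those, here’s the general shape of salted password hashing. This is only a sketch of the technique; HashUtils’ actual API, function names, and algorithm choices differ.

```php
<?php
// Sketch of salted password hashing (not HashUtils' actual API).
// A random salt is generated per password and stored alongside the
// hash, so the same password never produces the same stored value.
function createHash($password) {
    $salt = substr(md5(uniqid(mt_rand(), true)), 0, 8); // random 8-char salt
    return $salt . hash('sha256', $salt . $password);
}

// Verification re-derives the hash using the salt embedded in storage.
function validateHash($password, $stored) {
    $salt = substr($stored, 0, 8);
    return $stored === ($salt . hash('sha256', $salt . $password));
}
```

Because each stored value embeds a random salt, identical passwords hash differently, which is what defeats precomputed rainbow tables.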

I’ll get the examples of these online at some point, but if you export /trunk/php, all the lowercase-named files are demos/tests. There’s also an “oldies” branch (not necessarily goodies).

Hopefully this makes the political jibba jabba more forgivable.

Multibyte Input Bookmarklet

All modern web applications should be using UTF-8 encoding. For developers with English keyboards (where all keys produce 7-bit ASCII), testing web forms with multibyte characters can be a pain. You can, of course, enter Unicode characters via obscure key combinations, but using this bookmarklet may be easier:

Get it

(Right-click the bookmarklet link above and add it to your favorites or bookmarks.)

This simply visits all text/password/textarea inputs on a page and replaces Latin vowels with multibyte variants. E.g. John A Public → Jōhn Ā Pūblīc.
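The replacement itself is the easy part. Here’s a sketch of the core idea (the real bookmarklet also walks the page’s form elements and handles the “Match beginning” prompt):

```javascript
// Map Latin vowels to visually similar multibyte (macron) variants.
var vowelMap = { a: 'ā', e: 'ē', i: 'ī', o: 'ō', u: 'ū',
                 A: 'Ā', E: 'Ē', I: 'Ī', O: 'Ō', U: 'Ū' };

function multibytify(str) {
    return str.replace(/[aeiouAEIOU]/g, function (c) {
        return vowelMap[c];
    });
}

// The bookmarklet would apply this to each text/password/textarea value:
// input.value = multibytify(input.value);

console.log(multibytify('John A Public')); // Jōhn Ā Pūblīc
```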

Test it here

The bookmarklet prompts me for “Match beginning”. What is this?

If you only want to affect certain inputs, enter a phrase at the prompt. The bookmarklet will then only affect inputs whose values start with that phrase. E.g. To affect only a “comment” field, place “||” at the beginning of the field, and enter “||” at the match prompt. The bookmarklet will affect only this field and strip the “||” from the field for you.

Uncompressed source

Pre-encoding vs. mod_deflate

Recently I configured Apache to serve pre-encoded files with encoding-negotiation. In theory this should be faster than using mod_deflate, which has to re-encode every hit, but testing was in order.

My mod_deflate setup consisted of a directory with this .htaccess:

AddOutputFilterByType DEFLATE application/x-javascript
BrowserMatch \bMSIE\s[456] no-gzip
BrowserMatch \b(SV1|Opera)\b !no-gzip

and a pre-minified version of jquery-1.2.3 (54,417 bytes) saved as “before.js”. The BrowserMatch directives ensured the same rules for buggy browsers as used in the type-map setup. The compression level was left at the default, which I assume is optimal for performance.

The type-map setup was as described here. “before.js” was identical to the one in the mod_deflate setup, and separate files held the gzip/deflate/compress-encoded versions. Each encoded file was created using maximum compression (9), since we needn’t worry about encoding efficiency.
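Generating the gzip and deflate variants is straightforward with PHP’s zlib functions. This is just a sketch with a helper name of my own; note PHP has no built-in LZW encoder, so the legacy “compress” variant isn’t shown.

```php
<?php
// Sketch: write gzip and raw-deflate variants of a file at maximum
// compression. (Hypothetical helper; your build tool may differ.)
function preEncode($path) {
    $src = file_get_contents($path);
    file_put_contents($path . '.zg', gzencode($src, 9));  // gzip format
    file_put_contents($path . '.zd', gzdeflate($src, 9)); // raw deflate, no header
}
```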

I benchmarked with Apache ab and ran 10,000 requests, 100 concurrently. To make sure the encoded versions were returned I added the request header Accept-Encoding: deflate, gzip. The ab commands:

ab -c 100 -n 10000 -H "Accept-Encoding: deflate, gzip" http://path/to/mod_deflate/before.js > results_deflate.txt
ab -c 100 -n 10000 -H "Accept-Encoding: deflate, gzip" http://path/to/statics/before.js.var > results_statics.txt

Pre-encoding kicks butt

method                  | requests/sec | output size (bytes)
------------------------|--------------|--------------------
mod_deflate             | 187          | 16053
pre-encoding + type-map | 470          | 15993

Apache can serve a pre-encoded version of jQuery about 2.5x as fast as it can serve it using mod_deflate to compress on-the-fly. It may also be worth mentioning that mod_deflate chooses gzip over deflate, sending a few more bytes.

Basically, if you’re serving large textual files with Apache on a high-traffic site, you should work pre-encoding and type-map configuration into your build process and try these tests yourself rather than just flipping on mod_deflate.

mod_deflate may win for small files

Before serving jQuery, I ran the same test with a much smaller file (around 2K) and found that, at that size, mod_deflate actually outperformed the type-map by a little. Obviously, somewhere between 2K and 54K, the benefit of pre-encoding starts to pay off. It probably also depends on compression level, compression buffer sizes, and a bunch of other things no one wants to fiddle with.

Apache HTTP encoding negotiation notes

Now that Minify 2 is out, I’ve been thinking of expanding the project to take on the not-so-straightforward task of serving already-HTTP-encoded files on Apache and letting Apache negotiate which encoded version to send; taking CGI out of the picture would be the natural next step toward serving these files as efficiently as possible.

All mod_negotiation docs and articles I could find applied mainly to language negotiation, so I hope this is helpful to someone. I’m using Apache 2.2.4 on WinXP (XAMPP package), so the rest of this article applies to this setup.

Type-map over MultiViews

I was first able to get this working with MultiViews, but everything I’ve read says this is much slower than using type-maps. Supposedly, with MultiViews, Apache has to internally pull the directory contents and generate an internal type-map structure for each file, then apply the type-map algorithm to choose the resource to send, so creating the maps explicitly saves Apache the trouble. Although one is required for each resource, Minify will eventually automate their creation.

Setup

To simplify config, I’m applying this setup to one directory where all the content-negotiated files will be served from. Here’s the starting .htaccess (we’ll add more later):

# turn off MultiViews if enabled
Options -MultiViews

# For *.var requests, negotiate using type-map
AddHandler type-map .var

# custom extensions so existing handlers for .gz/.Z don't interfere
AddEncoding x-gzip .zg
AddEncoding x-compress .zc
AddEncoding deflate .zd

Now I placed 4 files in the directory (the encoded files were created with this little utility):

  • before.js (not HTTP encoded)
  • before.js.zd (deflate encoded – identical to gzip, but without header)
  • before.js.zg (gzip encoded)
  • before.js.zc (compress encoded)

Now the type-map “before.js.var”:

URI: before.js.zd
Content-Type: application/x-javascript; qs=0.9
Content-Encoding: deflate

URI: before.js.zg
Content-Type: application/x-javascript; qs=0.8
Content-Encoding: x-gzip

URI: before.js.zc
Content-Type: application/x-javascript; qs=0.7
Content-Encoding: x-compress

URI: before.js
Content-Type: application/x-javascript; qs=0.6

So what this gives us is already useful. When the browser requests before.js.var, Apache returns the file that (a) is encoded in a format the browser accepts and (b) has the highest qs value among those. If the browser is Firefox, that will be “before.js.zd” (the deflated version). Apache will also send the necessary Content-Encoding header so FF can decode it, and the Vary header to help caches understand that various versions exist at this URL and that what you get depends on the Accept-Encoding header sent in the request.

The Content-Encoding lines in the type-map tell Apache to look out for these encodings in the Accept-Encoding request header. E.g. Firefox accepts only gzip or deflate, so “Content-Encoding: x-compress” tells Apache that Firefox can’t accept before.js.zc. If you strip out the Content-Encoding line from “before.js.zc” and give it the highest qs, Apache will dutifully send it to Firefox, which will choke on it. The “x-” in the Content-Encoding lines and AddEncoding directives is used to negotiate with older browsers that call gzip “x-gzip”. Apache understands that it also has to report this encoding the same way.
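Apache’s full negotiation algorithm weighs several dimensions, but for an encoding-only type-map like this one, the effective logic boils down to roughly the following. This is a simplified sketch with names of my own choosing, not Apache’s actual code.

```php
<?php
// Among variants whose encoding the client accepts, pick the highest qs.
// An encoding of null means the variant is unencoded (always acceptable).
function negotiate(array $acceptedEncodings, array $variants) {
    $best = null;
    foreach ($variants as $v) {
        if ($v['encoding'] !== null
            && !in_array($v['encoding'], $acceptedEncodings)) {
            continue; // client can't decode this variant
        }
        if ($best === null || $v['qs'] > $best['qs']) {
            $best = $v;
        }
    }
    return $best ? $best['uri'] : null;
}

// The variants from the type-map above:
$variants = array(
    array('uri' => 'before.js.zd', 'qs' => 0.9, 'encoding' => 'deflate'),
    array('uri' => 'before.js.zg', 'qs' => 0.8, 'encoding' => 'gzip'),
    array('uri' => 'before.js.zc', 'qs' => 0.7, 'encoding' => 'compress'),
    array('uri' => 'before.js',    'qs' => 0.6, 'encoding' => null),
);

echo negotiate(array('gzip', 'deflate'), $variants); // before.js.zd
```

A browser like Firefox (gzip and deflate only) gets the deflate version; a client accepting nothing falls through to the unencoded file.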

Sending other desired headers

I’d like to add the charset to the Content-Type header, and also some headers to optimize caching. The following .htaccess snippet removes ETags and adds the far-off expiration headers:

# Below we remove the ETag header and set a far-off Expires
# header. Since clients will aggressively cache, make sure
# to modify the URL (querystring or via mod_rewrite) when
# the resource changes

# remove ETag
FileETag None

# requires mod_expires
ExpiresActive On
# sets Expires and Cache-Control: max-age, but not "public"
ExpiresDefault "access plus 1 year"

# requires mod_headers
# adds the "public" to Cache-Control.
Header set Cache-Control "public, max-age=31536000"

Adding the charset was a bit trickier. The type-map docs show the author placing the charset in the Content-Type lines of the type-map, only this doesn’t work! The value sent is actually just the original type set in Apache’s “mime.types”. So to actually send a charset, you have to redefine the type for the js extension (I added CSS since I’ll be serving those files the same way):

# Necessary to add charset while using type-map
AddType application/x-javascript;charset=utf-8 js
AddType text/css;charset=utf-8 css

Now we have full negotiation based on the browser’s Accept-Encoding and are sending far-off expiration headers and the charset. Try it.

Got old IE version users?

This config trusts Apache to make the right decision of which encoding a browser can handle. The problem is that IE6 before XPSP1 and older versions lie; they can’t really handle encoded content in some situations. HTTP_Encoder::_isBuggyIe() roots these out, but Apache needs help. With the help of mod_rewrite, we can sniff for the affected browsers and rewrite their requests to go to the non-encoded files directly:

# requires mod_rewrite
RewriteEngine On
RewriteBase /tests/encoding-negotiation
# IE 5 and 6 are the only ones we really care about
RewriteCond %{HTTP_USER_AGENT}  MSIE\ [56]
# but not if it's got the SV1 patch or is really Opera
RewriteCond %{HTTP_USER_AGENT} !(\ SV1|Opera)
RewriteRule ^(.*)\.var$         $1 [L]

Also worth mentioning is that, if you figure out the browser encoding elsewhere (e.g. in PHP while generating markup), you can link directly to the encoded files (like before.js.zd) and they’ll be served correctly.

No mod_headers

If you’re on a shared host without mod_headers enabled (like mrclay.org, for now), you’ll just have to accept what mod_expires sends; i.e., the HTTP/1.1 Cache-Control header won’t have the explicit “public” directive, but responses should still be quite cacheable.

Minifying Javascript and CSS on mrclay.org

Update: Please read the new version of this article. It covers Minify 2.1, which is much easier to use.

Minify v2 is coming along, but it’s time to start getting some real-world testing, so last night I started serving this site’s Javascript and CSS (at least the 6 files in my WordPress templates) via a recent Minify snapshot.

As you can see below, I was serving 67K over 6 requests and was using some packed Javascript, which has a client-side decompression overhead.

[fiddler1.png: Fiddler screenshot of the original setup — 6 requests, 67K, with packed Javascript]

Using Minify, this is down to 2 requests, 28K (58% reduction), and I’m no longer using any packed Javascript:

[fiddler2.png: Fiddler screenshot after Minify — 2 requests, 28K]

Getting it working

  1. Exported Minify from svn (only the /lib tree is really needed).
  2. Placed the contents of /lib in my PHP include path.
  3. Determined where I wanted to store cache files (server-side caching is a must.)
  4. Gathered a list of the JS/CSS files I wanted to serve.
  5. Created “min.php” in the doc root:
    // load Minify
    require_once 'Minify.php';
    
    // setup caching
    Minify::useServerCache(realpath("{$_SERVER['DOCUMENT_ROOT']}/../tmp"));
    
    // controller options
    $options = array(
    	'groups' => array(
    		'js' => array(
    			'//wp-content/chili/jquery-1.2.3.min.js'
    			,'//wp-content/chili/chili-1.8b.js'
    			,'//wp-content/chili/recipes.js'
    			,'//js/email.js'
    		)
    		,'css' => array(
    			'//wp-content/chili/recipes.css'
    			,'//wp-content/themes/orangesky/style.css'
    		)
    	)
    );
    
    // serve it!
    Minify::serve('Groups', $options);

    (note: The double solidi at the beginning of the filenames are shortcuts for $_SERVER['DOCUMENT_ROOT'].)

  6. In HTML, replaced the 4 script elements with one:
    <script type="text/javascript" src="/min/js"></script>

    (note: Why not “min.php/js”? Since I use MultiViews, I can request min.php by just “min”.)

  7. and replaced the 2 stylesheet links with one:
    <link rel="stylesheet" href="/min/css" type="text/css" media="screen" />

At this point Minify was doing its job, but there was a big problem: My theme’s CSS uses relative URIs to reference images. Thankfully Minify’s CSS minifier can rewrite these, but I needed to specify that option just for style.css.

I did that by giving a Minify_Source object in place of the filename:

// load Minify_Source
require_once 'Minify/Source.php';

// new controller options
$options = array(
	'groups' => array(
		'js' => array(
			'//wp-content/chili/jquery-1.2.3.min.js'
			,'//wp-content/chili/chili-1.8b.js'
			,'//wp-content/chili/recipes.js'
			,'//js/email.js'
		)
		,'css' => array(
			'//wp-content/chili/recipes.css'

			// style.css has some relative URIs we'll need to fix since
			// it will be served from a different URL
			,new Minify_Source(array(
				'filepath' => '//wp-content/themes/orangesky/style.css'
				,'minifyOptions' => array(
					'prependRelativePath' => '../wp-content/themes/orangesky/'
				)
			))
		)
	)
);

Now, during the minification of style.css, Minify prepends all relative URIs with ../wp-content/themes/orangesky/, which fixes all the image links.
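Conceptually, that rewrite just prefixes relative url() values while leaving absolute paths and full URLs alone. Here’s a rough sketch of the idea; it is not Minify’s actual implementation, which handles more edge cases (@import, quoted URLs with odd whitespace, etc.).

```php
<?php
// Prefix relative url(...) references in CSS. Skips values that start
// with "/" or a scheme like "http:". (Illustrative sketch only.)
function prependRelativePath($css, $prefix) {
    return preg_replace(
        '~url\(\s*([\'"]?)(?![a-z]+:|/)~i',
        'url($1' . $prefix,
        $css
    );
}

echo prependRelativePath(
    'body { background: url(images/bg.png); }',
    '../wp-content/themes/orangesky/'
);
// body { background: url(../wp-content/themes/orangesky/images/bg.png); }
```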

What’s next

This is fine for now, but there’s one more step we could take: sending far-off Expires headers with our JS/CSS. This is tricky because, whenever a change is made to a source file, the URL used to call it must change in order to force the browser to download the new version. As of this morning, Minify has an elegant way to handle this, but I’ll tackle it in a later post.


Awesome Holiday boredom

My score: 88

I got: A, ABBR, ACRONYM, APPLET, AREA, B, BASE, BASEFONT, BIG, BLOCKQUOTE, BODY, BR, BUTTON, CAPTION, CENTER, CITE, CODE, COL, COLGROUP, DD, DEL, DFN, DIR, DIV, DL, DT, EM, FIELDSET, FONT, FORM, FRAME, FRAMESET, H1, H2, H3, H4, H5, H6, HEAD, HR, HTML, I, IFRAME, INPUT, INS, ISINDEX, KBD, LABEL, LEGEND, LI, LINK, MAP, MENU, META, NOFRAMES, NOSCRIPT, OBJECT, OL, OPTGROUP, OPTION, P, PARAM, PRE, Q, S, SAMP, SCRIPT, SELECT, SMALL, SPAN, STRIKE, STRONG, STYLE, SUB, SUP, TABLE, TBODY, TD, TEXTAREA, TFOOT, TH, THEAD, TITLE, TR, TT, U, UL, and VAR

I forgot: ADDRESS, BDO, and IMG. Yes, IMG.

To be fair, this was probably the 8th try. It helps to group them into forms, block/inline, quoting, embedding/linking, framing, lists, data/code display, etc.

LoadVars implements HTTP caching

Searching for info about Flash’s caching mechanism turned up endless posts on how to prevent caching, but none mentioned how LoadVars.load() handled server-sent Cache-Control headers. So I tested this myself.

In the SWF, I loaded the same URL once every 5 seconds using setInterval() and LoadVars.

The URL ran a PHP script that sent content along with Last-Modified and ETag headers based on its own mtime, and the header Cache-Control: max-age=0, private, must-revalidate. This basically means “browsers may cache this item, but must always check the server for updates before using it.”
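The validation side of that script is small. Here’s a hedged sketch of the conditional-GET decision (the helper name is mine; the test script inlined this logic and sent the headers directly):

```php
<?php
// Decide whether a conditional GET can be answered 304 Not Modified,
// given the validators we sent and the request headers that came back.
function conditionalGetStatus($etag, $lastModified, array $requestHeaders) {
    $inm = isset($requestHeaders['If-None-Match'])
        ? $requestHeaders['If-None-Match'] : null;
    $ims = isset($requestHeaders['If-Modified-Since'])
        ? $requestHeaders['If-Modified-Since'] : null;
    return ($inm === $etag || $ims === $lastModified) ? 304 : 200;
}

// In the script, the validators came from the file's own mtime:
// $mtime        = filemtime(__FILE__);
// $etag         = '"' . md5($mtime) . '"';
// $lastModified = gmdate('D, d M Y H:i:s', $mtime) . ' GMT';
// header('Cache-Control: max-age=0, private, must-revalidate');
// ...then either send "HTTP/1.1 304 Not Modified" and exit, or send the
// content along with the ETag and Last-Modified headers.
```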

It worked! Once Flash got those headers, load() would then make “conditional” GET requests: Flash included If-None-Match and If-Modified-Since headers, allowing PHP to respond with 304 Not Modified, and Flash used its cached copy of the data.

Flash seems to use the browser as a proxy to handle these requests and manage the cache, because Flash in Safari 3 shared a limitation of that browser, namely not supporting ETags; Safari only sends back an If-Modified-Since header for conditional GETs.

Actionscript pains

I have an Actionscript 3 book lined up to tackle at some point, but generally my interaction with Actionscript is having to modify someone else’s SWF, most commonly old code from the 1.0 days. When I open one of these source files it sometimes takes time to even figure out where the code is. When I do find it, it’s not obvious when this code executes and in what scope. The object model of a Flash movie may not be much more complex than the browser DOM, but they’re quite different. I think part of my problem is that the structure of a movie (and the IDE navigation) is still foreign to me. The programmer in me wants to dig in with a spec in hand, and that often works with the projects I have to work on, but it would probably benefit me greatly to spend some time doing the boring beginner Flash tutorials.