flock() blocking reads

(Sigh.) After applying recent upgrades to our Red Hat 5 server at work, PHP file locking suddenly blocks script execution for exactly 30 seconds! Both shared read locks (LOCK_SH) and exclusive write locks (LOCK_EX) do this. This was, shall we say, unpleasant to diagnose: since we use Cache_Lite (which locks by default) to cache lots of stuff, sometimes in multiple layers, most of our pages suddenly took 30, 60, etc. seconds to load! Here are the durations from a test script:

  'write' => '0.00257301',
  'read' => '0.00039792',
  'write w/ exclusive lock' => '30.00274611', (file_put_contents w/ LOCK_EX)
  'fopen' => '0.00080800',
  'get shared lock' => '29.99504590', (flock w/ LOCK_SH)
  'read file' => '0.00034904',
  'close file' => '0.00005412',
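The test script was essentially microtime() samples taken around each operation; a minimal sketch (the path is illustrative):

$timings = array();

$t = microtime(true);
file_put_contents('/nfs/tmp/test.txt', 'hello', LOCK_EX); // blocks ~30s on the patched box
$timings['write w/ exclusive lock'] = sprintf('%.8f', microtime(true) - $t);

$t = microtime(true);
$fp = fopen('/nfs/tmp/test.txt', 'rb');
$timings['fopen'] = sprintf('%.8f', microtime(true) - $t);

$t = microtime(true);
flock($fp, LOCK_SH); // also blocks ~30s
$timings['get shared lock'] = sprintf('%.8f', microtime(true) - $t);

flock($fp, LOCK_UN);
fclose($fp);
var_export($timings);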

Awesome. Now, it is documented that flock() “will not work on NFS” (which we are on), but this worked fine for over a year and continues to work on our unpatched server. The upstream Apache and PHP versions didn’t change, only Red Hat’s package releases. Here’s how Red Hat listed them:

old: httpd-2.2.3-6.el5 new: httpd-2.2.3-11.el5_1.3
old: php-5.1.6-12.el5 new: php-5.1.6-20.el5_2.1

So… any ideas? I’ll have to turn off fileLocking in Cache_Lite to move forward, but I’d love to know what’s up. This especially troubles me because I recently added default cache locking to Minify. Locking is also the default in Zend_Cache_Backend_File, so it seems like the right thing to do.
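In the meantime, disabling the locking is a one-line change; Cache_Lite takes fileLocking as a constructor option (the cache path here is illustrative):

require_once 'Cache/Lite.php';

$cache = new Cache_Lite(array(
    'cacheDir'    => '/tmp/cache/',  // illustrative path
    'fileLocking' => false           // skip flock() until the NFS issue is sorted out
));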

Javascript humor

hasthelargehadroncolliderdestroyedtheworldyet.com has a news feed. It also has this in the source:

if (!(typeof worldHasEnded == "undefined")) {
    document.write("YUP.");
} else {
    document.write("NOPE.");
}

Folks without Javascript get a more definite answer:

<noscript>NOPE.</noscript>

Also appreciated:

<!-- if the lhc actually destroys the earth & this page isn't
yet updated please email mike@frantic.org to receive a full
refund -->

uTag: like snurl, but crappy!

A SitePoint blogger recently wrote about uTag. Like my favorite, snurl, it’s one of many make-a-shorter-link URL redirect services, but with a 90s twist! When you click a uTag link, instead of the page you get a frameset: ads on top, broken address bar usability, and, if you know how uTag works, the warm fuzzies of knowing the jerk who put you through this annoyance earned ad revenue while the site they deemed worth linking to doesn’t even get PageRank credit for the inbound link. The best part: uTag encourages site owners to replace all their links with crappy uTag links, so when uTag folds, your site can add to the collective link rot. And if more site owners do this, browsing their sites starts to look like this:

Surfing a wonderful web of uTag links

Tip: Unless you’re as established as about.com (hated for this practice), you probably can’t get away with annoying users this thoroughly.

Site owners, it may be time to dust off this snippet:

if (top !== self) top.location.replace(document.location.href);

Case of the NS_ERROR_DOM_SECURITY_ERR

While working on a bookmarklet, I ran across “security errors” in Firefox and Opera (this may happen in other browsers; I didn’t check). Firefox threw “Security error (NS_ERROR_DOM_SECURITY_ERR)”; Opera’s message was similarly vague.

The culprit code was trying to access the cssRules property of a style sheet on a different domain (my CSS is on a subdomain). It appears browsers apply the same-origin policy to the DOM CSSStyleSheet interface: you can access the href property of a different domain’s styleSheet object, but not its cssRules. In Firebug, cssRules will appear to be null, but merely reading the property in your script throws the exception.

If your script is throwing NS_ERROR_DOM_SECURITY_ERR, check for code trying to access objects on a different domain.

Thanks, Recuva

Recuva is a drop-dead simple Windows file recovery program [1] that just saved me two hours of work. I was trying to reorganize some directories and ended up deleting two freshly written PHP files [2]. I’ve used about a half-dozen file recovery apps over the years due to my affliction [3], but Recuva was the most intuitive by far. Within 30 seconds of opening the program for the first time, I had closed it with my recovered files saved.

I tried to write a little review on download.com, but their Javascript was too broken for Opera.

[1] DOS’s “undelete.exe”, as crummy as it was, was the only reason ever to miss Win9x.

[2] Subversion lesson: Don’t commit directory renames/moves at the same time as modifications to descendant files/folders. Commit the rename, breathe, commit the rest.

[3] The shortcut geek in me developed the terrible habit of using [secret key]+[Delete] to delete a file, bypassing the Recycle Bin (if you don’t already know this key combination, don’t look it up; forget it existed; I wish I could).

9/6 update: Recuva just recovered 62GB of Kathleen’s stuff from a failing external drive. OS X couldn’t mount it, and a supposedly more heavy-duty Windows recovery app I tried couldn’t even open the drive.

Multibyte Input Bookmarklet

All modern web applications should be using UTF-8 encoding. For developers with English keyboards (where all keys produce 7-bit ASCII), testing web forms with multibyte characters can be a pain. You can, of course, enter Unicode characters via obscure key combinations, but using this bookmarklet may be easier:

Get it

(right-click, add to favorites or bookmarks)

This simply visits all text/password/textarea inputs on a page and replaces Latin vowels with multibyte variants. E.g. John A Public → Jōhn Ā Pūblīc.
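The mapping is nothing fancy; here’s the equivalent in PHP, in case you’d rather mangle test fixtures server-side (a sketch; the bookmarklet itself does this in Javascript):

// Latin vowels => multibyte (macron) variants, all UTF-8
$map = array(
    'a' => 'ā', 'e' => 'ē', 'i' => 'ī', 'o' => 'ō', 'u' => 'ū',
    'A' => 'Ā', 'E' => 'Ē', 'I' => 'Ī', 'O' => 'Ō', 'U' => 'Ū',
);
echo strtr('John A Public', $map); // Jōhn Ā Pūblīc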

Test it here

The bookmarklet prompts me for “Match beginning”. What is this?

If you only want to affect certain inputs, enter a phrase at the prompt; the bookmarklet will then only affect inputs whose values start with that phrase. E.g., to affect only a “comment” field, place “||” at the beginning of that field and enter “||” at the match prompt. The bookmarklet will affect only that field and strip the “||” from it for you.

Uncompressed source

Physics engine in Sketchup

I knew Sketchup was a great modeling tool, but apparently it’s also scriptable via a Ruby API and an embedded web browser. At Google’s I/O conference, Scott Lininger showed off some of this awesomeness.

Twenty minutes into the video below, Scott captures keystroke events within the browser instance and uses them to control a modeled character. The character, its surroundings, and even the whole visual rendering style can be edited in real time with the standard Sketchup tools.

Later he shows this YouTube clip that has more advanced physics demos made with the SketchyPhysics plugin. Wow.

EA skate has just about everything I could ask for in a skateboarding game (another post entirely), but you can’t design your own spots. Next logical step? Build a skateboarding simulator based on SketchyPhysics. The spots are already waiting.

Pre-encoding vs. mod_deflate

Recently I configured Apache to serve pre-encoded files with encoding-negotiation. In theory this should be faster than using mod_deflate, which has to re-encode every hit, but testing was in order.

My mod_deflate setup consisted of a directory with this .htaccess:

AddOutputFilterByType DEFLATE application/x-javascript
BrowserMatch \bMSIE\s[456] no-gzip
BrowserMatch \b(SV1|Opera)\b !no-gzip

and a pre-minified version of jquery-1.2.3 (54,417 bytes) saved as “before.js”. The BrowserMatch directives ensured the same rules for buggy browsers as used in the type-map setup. The compression level was left at the default, which I assume is optimal for performance.

The type-map setup was as described here: the same “before.js”, plus separate files for the gzip/deflate/compress-encoded versions. Each encoded file was created using maximum compression (9), since we needn’t worry about encoding speed (it happens once, ahead of time).

I benchmarked with Apache ab and ran 10,000 requests, 100 concurrently. To make sure the encoded versions were returned I added the request header Accept-Encoding: deflate, gzip. The ab commands:

ab -c 100 -n 10000 -H "Accept-Encoding: deflate, gzip" http://path/to/mod_deflate/before.js > results_deflate.txt
ab -c 100 -n 10000 -H "Accept-Encoding: deflate, gzip" http://path/to/statics/before.js.var > results_statics.txt

Pre-encoding kicks butt

method                    requests/sec   output size (bytes)
mod_deflate               187            16053
pre-encoding + type-map   470            15993

Apache can serve a pre-encoded version of jQuery 2.5x as fast as it can compress and serve it on-the-fly with mod_deflate. It may also be worth mentioning that mod_deflate chooses gzip over deflate, sending a few more bytes.

Basically, if you’re serving large textual files with Apache on a high-traffic site, work pre-encoding and type-map configuration into your build process and run tests like these yourself rather than just flipping on mod_deflate.

mod_deflate may win for small files

Before serving jQuery, I ran the same test with a much smaller file (around 2K) and found that, at that size, mod_deflate actually outperformed the type-map by a little. Somewhere between 2K and 54K, then, pre-encoding starts to pay off. The crossover probably also depends on compression level, compression buffer sizes, and a bunch of other things no one wants to fiddle with.

Chili enzyme for .htaccess

Here’s a Chili (Javascript syntax highlighter) enzyme for highlighting .htaccess code snippets. It’s for the 1.x series (I’m using 1.8b here), so it likely won’t work with the latest 2.0 release without some modification. The highlighting is pretty basic, but at least you get comments and the first directive on each line.

place in recipes.js

ChiliBook.recipes[ "htaccess.js" ] = {
    steps: {
        com : { exp: /(?:^|\n)\s*\#.*/ }
        ,dir : { exp: /(?:^|\n)\s*\w+/ }
    }
};

place in recipes.css

.htaccess .com { color: green; }
.htaccess .dir { color: navy; }

Apache HTTP encoding negotiation notes

Now that Minify 2 is out, I’ve been thinking of expanding the project to take on the not-so-straightforward task of serving pre-encoded files on Apache and letting Apache negotiate which encoding to send; taking CGI out of the picture would be the natural next step toward serving these files as efficiently as possible.

All mod_negotiation docs and articles I could find applied mainly to language negotiation, so I hope this is helpful to someone. I’m using Apache 2.2.4 on WinXP (XAMPP package), so the rest of this article applies to this setup.

Type-map over MultiViews

I first got this working with MultiViews, but everything I’ve read says that’s much slower than using type-maps. Supposedly, with MultiViews, Apache has to read the directory contents, build an internal type-map structure for each matching file, and then apply the type-map algorithm to choose the resource to send, so creating the maps explicitly saves Apache the trouble. Although one is required for each resource, Minify will eventually automate creating them.

Setup

To simplify config, I’m applying this setup to one directory where all the content-negotiated files will be served from. Here’s the starting .htaccess (we’ll add more later):

# turn off MultiViews if enabled
Options -MultiViews

# For *.var requests, negotiate using type-map
AddHandler type-map .var

# custom extensions so existing handlers for .gz/.Z don't interfere
AddEncoding x-gzip .zg
AddEncoding x-compress .zc
AddEncoding deflate .zd

Now I placed 4 files in the directory (the encoded files were created with this little utility):

  • before.js (not HTTP encoded)
  • before.js.zd (deflate encoded – identical to gzip, but without header)
  • before.js.zg (gzip encoded)
  • before.js.zc (compress encoded)
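If you’d rather not use the utility, PHP’s zlib functions can produce the gzip and deflate versions directly (LZW “compress” isn’t built into PHP, so that one takes an external tool); a sketch:

$src = file_get_contents('before.js');
// gzip format
file_put_contents('before.js.zg', gzencode($src, 9));
// raw deflate: as noted above, gzip without the header
file_put_contents('before.js.zd', gzdeflate($src, 9));
// for before.js.zc, shell out to the Unix compress utility, e.g.:
// exec('compress -c before.js > before.js.zc');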

Now the type-map “before.js.var”:

URI: before.js.zd
Content-Type: application/x-javascript; qs=0.9
Content-Encoding: deflate

URI: before.js.zg
Content-Type: application/x-javascript; qs=0.8
Content-Encoding: x-gzip

URI: before.js.zc
Content-Type: application/x-javascript; qs=0.7
Content-Encoding: x-compress

URI: before.js
Content-Type: application/x-javascript; qs=0.6

So what this gives us is already useful. When the browser requests before.js.var, Apache returns the file that (a) is encoded in a format the browser accepts and (b) has the highest qs value among those. For Firefox that’s “before.js.zd” (the deflated version). Apache also sends the necessary Content-Encoding header so Firefox can decode it, and a Vary header to tell caches that multiple versions exist at this URL and that what you get depends on the request’s Accept-Encoding header.

The Content-Encoding lines in the type-map tell Apache to look out for these encodings in the Accept-Encoding request header. E.g. Firefox accepts only gzip or deflate, so “Content-Encoding: x-compress” tells Apache that Firefox can’t accept before.js.zc. If you strip out the Content-Encoding line from “before.js.zc” and give it the highest qs, Apache will dutifully send it to Firefox, which will choke on it. The “x-” in the Content-Encoding lines and AddEncoding directives is used to negotiate with older browsers that call gzip “x-gzip”. Apache understands that it also has to report this encoding the same way.
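You can watch the negotiation happen from PHP via the http stream wrapper (the URL is illustrative; requires allow_url_fopen):

// request the type-map with a Firefox-like Accept-Encoding header
$context = stream_context_create(array('http' => array(
    'header' => "Accept-Encoding: deflate, gzip\r\n",
)));
file_get_contents('http://example.org/statics/before.js.var', false, $context);
// $http_response_header is populated by the http wrapper; look for
// "Content-Encoding: deflate" and "Vary: Accept-Encoding"
print_r($http_response_header);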

Sending other desired headers

I’d like to add the charset to the Content-Type header, and also some headers to optimize caching. The following .htaccess snippet removes ETags and adds the far-off expiration headers:

# Below we remove the ETag header and set a far-off Expires
# header. Since clients will aggressively cache, make sure
# to modify the URL (querystring or via mod_rewrite) when
# the resource changes

# remove ETag
FileETag None

# requires mod_expires
ExpiresActive On
# sets Expires and Cache-Control: max-age, but not "public"
ExpiresDefault "access plus 1 year"

# requires mod_headers
# adds the "public" to Cache-Control.
Header set Cache-Control "public, max-age=31536000"

Adding the charset was a bit trickier. The type-map docs show the author placing the charset in the Content-Type lines of the type-map, only this doesn’t work! The value actually sent is just the original type set in Apache’s “mime.types”. So to actually send a charset, you have to redefine the type for the js extension (I added CSS since I’ll be serving it the same way):

# Necessary to add charset while using type-map
AddType application/x-javascript;charset=utf-8 js
AddType text/css;charset=utf-8 css

Now we have full negotiation based on the browser’s Accept-Encoding and are sending far-off expiration headers and the charset. Try it.

Got old IE version users?

This config trusts Apache to decide which encoding a browser can handle. The problem is that IE6 before XP SP1, and older IE versions, lie: they can’t really handle encoded content in some situations. HTTP_Encoder::_isBuggyIe() roots these out, but Apache needs help. With mod_rewrite, we can sniff out the affected browsers and rewrite their requests to hit the non-encoded files directly:

# requires mod_rewrite
RewriteEngine On
RewriteBase /tests/encoding-negotiation
# IE 5 and 6 are the only ones we really care about
RewriteCond %{HTTP_USER_AGENT}  MSIE\ [56]
# but not if it's got the SV1 patch or is really Opera
RewriteCond %{HTTP_USER_AGENT} !(\ SV1|Opera)
RewriteRule ^(.*)\.var$         $1 [L]

Also worth mentioning: if you determine the browser’s encoding support elsewhere (e.g. in PHP while generating markup), you can link directly to the encoded files (like before.js.zd) and they’ll be served correctly.
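A rough sketch of that approach, mirroring the rewrite conditions above (HTTP_Encoder::_isBuggyIe() does the sniffing more carefully; the function name here is just for illustration):

// choose a suffix for linking directly to a pre-encoded file
function encodingSuffix() {
    $ae = isset($_SERVER['HTTP_ACCEPT_ENCODING']) ? $_SERVER['HTTP_ACCEPT_ENCODING'] : '';
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    // IE5/6 without the SV1 patch (and not really Opera): no encoded content
    if (preg_match('/MSIE [56]/', $ua) && !preg_match('/ SV1|Opera/', $ua)) {
        return '';
    }
    if (strpos($ae, 'deflate') !== false) return '.zd';
    if (strpos($ae, 'gzip') !== false)    return '.zg';
    return '';
}

// e.g.: echo '<script src="/statics/before.js' . encodingSuffix() . '"></script>';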

No mod_headers

If you’re on a shared host without mod_headers enabled (like mrclay.org, for now), you’ll just have to accept what mod_expires sends; i.e., the HTTP/1.1 Cache-Control header won’t have the explicit “public” directive, but responses should still be quite cacheable.