PHP RFC Preview: Dynamic Callback Expressions

I’m posting this to get some initial feedback on this idea before I officially submit an RFC.

Background

Even with PHP’s growing object-oriented and functional programming features, the callback remains widely used and useful. However, forcing authors to create callbacks via strings and arrays presents difficulties:

  1. Most IDEs do not recognize callbacks as such, and so cannot offer autocompletion, rename refactoring, and other benefits of code comprehension.
  2. Authors can misspell identifiers inside strings.
  3. Within namespaced code, authors can forget to prepend the namespace, since function calls within the namespace do not require it.
  4. Where use statements change the identifier for a class, authors can specify the local classname instead of the fully resolved name.
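
Difficulties 2 and 3 are easy to reproduce. The following is a minimal sketch, assuming a hypothetical namespace App\Util and a hypothetical function shout() (neither comes from the proposal itself). It shows that a direct call resolves against the current namespace while a string callback does not, and that a misspelled string is only detected at run time:

```php
<?php
namespace App\Util; // hypothetical namespace, for illustration only

// Hypothetical helper function.
function shout(string $s): string
{
    return strtoupper($s) . '!';
}

// A direct call resolves against the current namespace.
$direct = shout('hi');

// A string callback is resolved globally, so the bare name is not callable here.
$bareIsCallable = is_callable('shout');

// The author has to spell out the fully qualified name by hand.
$viaString = array_map('App\Util\shout', ['hi']);

// A misspelled identifier inside a string is only caught at run time.
$typoIsCallable = is_callable('App\Util\shuot');

var_dump($direct, $bareIsCallable, $viaString, $typoIsCallable);
```

Nothing in the string 'App\Util\shout' tells an IDE that it names a function, which is why rename refactoring and autocompletion tend to miss it.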

Proposal

Convert Google Maps embed HTML to Street View URL

You can use the form below to convert the HTML embed code Google Maps gives you to a usable Street View URL.

Why do I need this?

The new Google Maps layout has a chain-link icon on the left that gives you a URL to what you’re looking at. If you’re in Street View, sometimes the given URL doesn’t include the proper parameters and you end up back on the top-down map view. This converter pulls a valid Street View URL out of the embed HTML.

source code

String Subtypes for Safer Web Programming

Valid HTML markup involves several different contexts and escaping rules, yet many APIs give no precise indication of which context their string return values are escaped for, or how strings should be escaped before being passed in (let’s not even get into character encoding). Most programming languages only have a single String type, so there’s a strong urge to document a function with @param string and/or @return string and move on to other work, but this is rarely sufficient information.

Look at the documentation for WordPress’s get_the_title:

Returns

(string) 
Post title. …

If the title is Stan "The Man" & Capt. <Awesome>, will & and < be escaped? Will the quotes be escaped? “string” leaves these important questions unanswered. This isn’t meant to slight WordPress’s documentation team (they at least frequently give you example code from which you can guess the escaping model); the problem is endemic to web software.

So for better web security—and developer sanity—I think we need a shared vocabulary of string subtypes which can supply this missing metadata at least via mention or annotation in the documentation (if not via actual types).

Proposed Subtypes and Content Models

A basic set of four might help quite a bit. Each should have its own URL to explain its content model in detail, and how it should be handled:

Unescaped
    Arbitrary characters not escaped for HTML in any way, possibly including nulls/control characters. If a string’s subtype is not explicit, for safety it should be assumed to contain this content.
Markup
    Well-formed HTML markup matching the serialization of a DocumentFragment.
TaglessMarkup
    Markup containing no literal less-than sign (U+003C) characters (e.g. for output inside title/textarea elements).
AttrValue
    TaglessMarkup containing no literal apostrophe (U+0027) or quotation mark (U+0022) characters, for output as a single/double-quoted attribute value.
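
If the subtypes were expressed as actual types rather than annotations, one possible shape is sketched below. The class names mirror the four subtypes, but the classes, their methods, and the titleElement() function are all hypothetical and not part of any existing library. The sketch assumes UTF-8 input and uses htmlspecialchars() for the conversions; it makes no attempt to deal with nulls or control characters.

```php
<?php
// Hypothetical wrapper classes named after the four subtypes; not an existing library.

final class Unescaped
{
    private string $raw;

    public function __construct(string $raw)
    {
        $this->raw = $raw;
    }

    // Escapes &, < and > but leaves quotes alone (e.g. for title/textarea content).
    public function toTaglessMarkup(): TaglessMarkup
    {
        return new TaglessMarkup(htmlspecialchars($this->raw, ENT_NOQUOTES, 'UTF-8'));
    }

    // Additionally escapes " and ' for use inside a quoted attribute value.
    public function toAttrValue(): AttrValue
    {
        return new AttrValue(htmlspecialchars($this->raw, ENT_QUOTES, 'UTF-8'));
    }
}

class Markup
{
    protected string $html;

    // The constructor trusts its caller: the class records intent, it does not validate.
    public function __construct(string $html)
    {
        $this->html = $html;
    }

    public function __toString(): string
    {
        return $this->html;
    }
}

// Each narrower content model is also a valid instance of the wider one.
class TaglessMarkup extends Markup {}

final class AttrValue extends TaglessMarkup {}

// A signature can now state which subtype it expects and which it returns.
function titleElement(TaglessMarkup $title): Markup
{
    return new Markup('<title>' . $title . '</title>');
}

$title = new Unescaped('Stan "The Man" & Capt. <Awesome>');
echo titleElement($title->toTaglessMarkup()), "\n";
echo $title->toAttrValue(), "\n";
```

For the example title, the TaglessMarkup form keeps the quotes and becomes Stan "The Man" &amp;amp; Capt. &amp;lt;Awesome&amp;gt; when viewed as source text, i.e. only the ampersand and angle brackets are escaped, while the AttrValue form also turns each quotation mark into &amp;quot;. Passing an Unescaped value straight to titleElement() is a type error, which is the kind of mistake the vocabulary is meant to surface.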

What would these really give us?

These subtypes cannot make promises about what they contain, but are rather for making explicit what they should contain. It’s still up to developers to correctly handle input, character encoding, filtering, and string operations to fulfill those contracts.

The work left to do is to define how these subtypes should be handled and in what contexts they can be output as-is, and what escaping needs to be applied in other contexts.

Obvious Limitations

For the sake of simplicity, these subtypes shouldn’t attempt to address notions of input filtering or whether a string should be considered “clean”, “tainted”, “unsafe”, etc. A type/annotation convention like this should be used to assist—not replace—experienced developers practicing secure coding methods.

RotURL: Rot13 for URLs

RotURL is a simple substitution cipher for encoding/obscuring URLs embedded in other URLs (e.g. in a querystring). Also, common chars that need to be escaped (:/?=&%#) are mapped to infrequently used capital letters, so this generally yields shorter querystrings, too.

/**
 * Rot35 with URL/urlencode-friendly mappings. To avoid increasing size during
 * urlencode(), commonly encoded chars are mapped to more rarely used chars.
 */
function rotUrl($url) {
    return strtr($url,
        './-:?=&%# ZQXJKVWPY abcdefghijklmnopqrstuvwxyz123456789ABCDEFGHILMNORSTU',
        'ZQXJKVWPY ./-:?=&%# 123456789ABCDEFGHILMNORSTUabcdefghijklmnopqrstuvwxyz');
}

rotUrl('https://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=Base64#foo')
    == '8MMGLJQQ5EZR9B9G5491ZFI7QRQ9E45SZG8GKM9MC5VxG5391CPcjx51I38WL51I38Vk1L5fdY6FF';
rotUrl(rotUrl($anyUrl)) == $anyUrl;

You could save a few more bytes by encoding the scheme (e.g. “h” for http://, “H” for https://). Since your end encoding has to be URL-safe, there’s not much you can do beyond this to compress a URL embedded in a URL.
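
The scheme shortcut might look something like the sketch below. The functions rotUrlEncode() and rotUrlDecode() and the SCHEME_MARKERS constant are hypothetical names, and the '0' marker for "no shortcut applied" is an assumption added so that decoding stays unambiguous; rotUrl() is repeated from above so the block runs on its own.

```php
<?php
// Same mapping as rotUrl() above, repeated so this block is self-contained.
function rotUrl(string $url): string
{
    return strtr($url,
        './-:?=&%# ZQXJKVWPY abcdefghijklmnopqrstuvwxyz123456789ABCDEFGHILMNORSTU',
        'ZQXJKVWPY ./-:?=&%# 123456789ABCDEFGHILMNORSTUabcdefghijklmnopqrstuvwxyz');
}

// Hypothetical one-character markers for the two most common schemes.
const SCHEME_MARKERS = ['h' => 'http://', 'H' => 'https://'];

function rotUrlEncode(string $url): string
{
    foreach (SCHEME_MARKERS as $marker => $scheme) {
        if (strncmp($url, $scheme, strlen($scheme)) === 0) {
            // Drop the scheme and record it as a single leading character.
            return $marker . rotUrl(substr($url, strlen($scheme)));
        }
    }
    // Assumption: '0' marks a URL whose scheme was left in place.
    return '0' . rotUrl($url);
}

function rotUrlDecode(string $encoded): string
{
    $marker = (string) substr($encoded, 0, 1);
    $rest = rotUrl((string) substr($encoded, 1));
    // Unknown markers (including '0') restore no scheme prefix.
    return (SCHEME_MARKERS[$marker] ?? '') . $rest;
}

$url = 'https://en.wikipedia.org/';
$encoded = rotUrlEncode($url);

var_dump($encoded);                        // string(18) "H5EZR9B9G5491ZFI7Q"
var_dump(rotUrlDecode($encoded) === $url); // bool(true)
var_dump(strlen(urlencode($url)));         // int(33)
```

Here the 25-character URL shrinks to 18 characters, against 33 for plain urlencode(), because the encoded form contains only letters and digits and so survives urlencode() unchanged.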

Validate Private Page Bookmarklet

ValidatePrivatePage <– validates in current window

ValidatePrivatePage <– validates in new window (your pop-up blocker may complain)

If you need to validate the markup of a page that’s not public (e.g. on localhost), you can now use this bookmarklet to auto-submit the current page source to the validator (instead of viewing source, copying, opening the validator, pasting in, and pressing “check”).

Note: this gets the page source by making an XMLHttpRequest to the current URL, so the source is not interpreted by the browser; i.e. this is NOT based on innerHTML. If the request returns a different page (e.g. you were logged out in the meantime), that page’s source will be sent to the validator. Not much can be done about that. I once wrote a crusty PHP4 class/bookmarklet combo that helped do this, but thanks to the standardization of XMLHttpRequest, this is easy in JS now. You should also thank W3C for allowing cross-domain POSTs to the validator :)

NetBeans Love & Hate

For those cases where you have to work on remote code, NetBeans’ remote project functionality seems to put it ahead of other PHP IDEs. It pulls down a tree of files and uploads files that you save. Having a local copy allows it to offer its full code comprehension, auto-complete, and great rename refactoring for “remote” code. In contrast, Eclipse allows you to open remote files using Remote System Explorer, but you only get PHP syntax highlighting, not the excellent PDT.

But NetBeans is not all smiles and sunshine.

Helping NetBeans/PhpStorm with Autocomplete/Code-hinting

Where NetBeans can’t guess the type/existence of a local variable, you can tell it in a multiline comment:

/* @var $varName TypeName */

After this comment (and as long as TypeName is defined in your project/project’s include path), when you start to type $varName, NetBeans will offer to autocomplete it, and will offer TypeName method/property suggestions. If you rename the variable with Ctrl+R (rename refactoring), NetBeans will change the comment, too.

I usually forget this syntax because type comes first in @param declarations.

Update: PhpStorm supports a similar syntax, but reversing the type and variable name:

/* @var TypeName $varName */
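
In context, the two comment forms might be used as in the sketch below. The Invoice class and the fromRegistry() function are hypothetical and exist only to give the IDE a type it cannot infer by itself; the comments have no effect at run time.

```php
<?php
// Hypothetical class, only here to give the IDE something to complete.
final class Invoice
{
    public function total(): float
    {
        return 42.0;
    }
}

// Hypothetical factory with no declared return type, so the IDE cannot infer one.
function fromRegistry(string $key)
{
    $registry = ['invoice' => new Invoice()];
    return $registry[$key];
}

/* @var $invoice Invoice */   // NetBeans order: variable, then type
$invoice = fromRegistry('invoice');

/* @var Invoice $other */     // PhpStorm order: type, then variable
$other = fromRegistry('invoice');

// With the hints in place, the IDE can suggest total() on both variables.
echo $invoice->total() + $other->total(), "\n";
```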

Bookmarklet: Horizontally invert HTML5 videos

My demands for “reverse” glasses have gone unserved, but I made a bookmarklet that provides the same effect: “flopping” a video horizontally.

  1. Install the SwitchStance bookmarklet, for which you’ll need a modern browser that supports CSS transforms on video elements.
  2. Opt in to YouTube’s HTML5 trial.
  3. Load up any video without ads (here’s one of Matt Hensley skating)
  4. While the video plays, click the bookmarklet.

The video will mirror and you’ll see Hensley, a regular-footed skater, now skating goofy foot (in “switch stance”). Or you can get Paul McCartney to play guitar right-handed.