In Support of Bloated, Heavyweight IDEs

I’ve done plenty of programming in bare-bones text editors of all kinds over the years. Free/open editors were once pretty bad and a lot of capable commercial ones have been expensive. Today it’s still handy to pop a change into GitHub’s web editor or nano. Frankly, though, I’m unconvinced by arguments suggesting I use text editors that don’t really understand the code. I believe that, independent of your skill level, you’ll produce better code, and faster, by using the most powerful IDE you can get your hands on.

To convince you of this, I’ll try to show how each feature in the following list, in isolation, is good for your productivity. It should then follow that each one you work without lowers it. It’s also important to note that leaving in place or producing bugs that must be fixed later, or that create external costs, reduces your real productivity, and that “you” in the list below could also mean you six months from now or another developer. The list is not in any particular order, just numbered for reference.

  1. Syntax highlighting saves you from finding and fixing typos after compilation failures. In a language where a script file may be conditionally executed, like PHP, you may leave a bug that will have to be dug up by someone else after costing end users a lot of time. In rarer cases the code may compile but not do what you expected, costing even more time. Syntax highlighting also makes the many contexts available in code (strings, functions, vars, comments, etc.) significantly easier to see when scanning/scrolling.
  2. Having a background task scan the code can help catch errors that simple syntax highlighting cannot, since most highlighters are designed to expect valid syntax and may not show problems.
  3. Highlighting matching braces/parentheses eases the writing and reading of code and expressions.
  4. IDEs can show the opening/closing lines of blocks that appear offscreen without you needing to scroll. Although long blocks/function bodies can be a signal to refactor, this can aid you in working on existing code like this.
  5. Highlighting the use of unknown variables/functions/methods can show false positives for problems, but more often signals a bug that’s hidden from sight: E.g. a variable declared above has been removed; the type of a variable is not what is expected; a library upgrade has removed a method, or a piece of code has been transplanted from another context without its dependencies. Missing these problems has a big future cost as these may not always cause compile or even runtime errors.
  6. Highlighting an unused variable warns you that it isn’t being used how it was probably intended. It may uncover a logic bug or mean you can safely remove its declaration, preventing you from later having to wonder what it’s for.
  7. Highlighting the violation of type hints saves you from having to find those problems at compile or run-time.
  8. Auto-completing file paths and highlighting unresolvable ones saves you from time-consuming debugging of 404 errors.
  9. Background scanning other files for problems (applying all the above features to unopened project files) allows you to quickly see and fix bugs that you/others left. Simply opening an existing codebase in a more capable IDE can reveal thousands of certain/potential code problems. If you’re responsible for that code, you’ve potentially saved an enormous amount of time: End users hitting bugs, reporting them, you reading, investigating, fixing, typing summaries, etc. etc. etc. This feature is like having a whole team of programmers scouring your codebase for you; a big productivity boost.
  10. Understanding multiple language contexts can help a great deal when you’re forced to work in files with different contexts embedded within each other.
  11. Parameter/documentation tooltips eliminate the need to look up function purpose, signatures, and return types. While you should memorize commonly used functions, a significant amount of programming involves using new/unfamiliar libraries. Leaving your context to look up docs imposes costs in time and concentration. Sometimes that cost yields later benefits, but often you’ve just forgotten the order of a few parameters.
  12. Jumping to the declaration of a function/variable saves you from having to search for it.
  13. Find usages in an IDE that comprehends code allows you to quickly understand where and how a variable/function is used (or mentioned in comments!) in a codebase with very little error.
  14. Rename refactoring can carefully change an identifier in your code (and optionally filenames and comments) across an entire project of files. This can also apply to CSS; when renaming a class/id, the IDE may offer to replace its usages elsewhere in CSS and HTML markup. The obvious benefits are time savings and a reduction in the errors you might make using simpler string/regular expression replacements, but there are other gains: When the cost of changing a name reduces to almost nothing, you will be more inclined to improve names when needed. Better names can reduce the time needed to understand the code and how it should be used, and to recognize when it’s not being used well.
  15. Comprehension of variable/expression type allows the IDE to offer intelligent autocompletion options, reducing your time spent typing, fixing typing errors, and looking up property/method names on classes. But more than saving time, when an expected autocomplete option doesn’t appear, it can let you know that your variable/expression is not of the type that you think it is.
  16. IDEs can automatically suggest variables for function arguments based on type, so if you’re calling a function that needs a Foo and a Bar, the IDE can suggest your local vars of those types. This eliminates the need to remember parameter order or the exact names of your vars. Your role briefly becomes to check the work of the IDE, which is almost always correct. In statically typed languages like Java, this can be a great boost; I found Java development in NetBeans to be an eye-opener as to how helpful a good IDE can be.
  17. IDEs can grok Javadoc-style comments, auto-generate them based on code, and highlight discrepancies between the comments and the code. This reduces the time you spend documenting, improves your accuracy when documenting, and can highlight problems where someone has changed the signature/return type of a function, or has documented it incorrectly. IDEs can add support for various libraries so that that code can be understood by the IDE without being in the project.
  18. IDEs can maintain local histories of project files (like committing each change in git) so you can easily revert recent changes (or bring back files accidentally deleted!) or better understand the overall impact of your recent changes.
  19. IDEs can integrate with source control so you can see changes made that aren’t committed. E.g. a color might appear in the scroll bar next to uncommitted changes. You could click to jump to that change, mouse over to get a tooltip of the original code and choose to revert if needed. This could give you a good idea of changes before you switch focus to your source control tools to commit them. Of course being able to perform source control operations inside the IDE saves time, too.
  20. IDEs can maintain a local cache of code on a remote server, making it snappier to work on, reducing the time you’d spend switching to a separate SFTP app, and allowing you to adjust the code outside the IDE. The IDE could monitor local and remote changes and allow merging between the versions as necessary.
  21. IDEs can help you maintain code style standards in different projects, and allow you to instantly restyle a selection or file(s) according to those standards. When contributing to open source projects, this can save you from having to go back and restyle code after having your change rejected.
  22. Integrated debugging offers a huge productivity win. Inline debugging statements are no replacement for the ability to carefully step through execution with access to all variable contents/types and the full call stack. In some cases bugs that are practically impossible to find without a debugger can be found in a few minutes with one.
  23. Integrated unit test running also makes for much less context switching when using test-driven development.
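
To make items 5 and 6 concrete, here is a minimal sketch of the kind of bug they catch. The function and variable names are hypothetical, made up for illustration. The file parses fine, and at run time PHP only emits a warning/notice for the undefined variable, so the wrong total can slip through unnoticed:

```php
<?php
// Hypothetical example: orderTotal() and its variables are invented for illustration.
function orderTotal(array $prices): float
{
    $subtotal = array_sum($prices);
    $discount = 0.0;
    if ($subtotal > 100) {
        // An IDE would flag $discount as assigned but never read (item 6).
        $discount = $subtotal * 0.1;
    }
    // Typo: $discuont is undefined (item 5). PHP treats it as null here and
    // raises only a warning/notice, so the discount is silently dropped.
    return (float) ($subtotal - $discuont);
}

echo orderTotal([60, 60]), "\n"; // prints the undiscounted total
```

An IDE that understands the code marks both problems as you type, before the script ever runs.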

This list is obviously not exhaustive, and is geared towards the IDEs that I’m most familiar with, but the kernel is that environments that truly understand the language and your codebase as a whole can give you some powerful advantages over those that don’t. A fair argument is that lightweight editors with few bells and whistles “stay out of your way” and can be more responsive on large codebases–which is true–but you can’t ignore that there’s a large real productivity cost incurred in doing without these features.

For PHP users, PhpStorm* includes almost everything in the list, with NetBeans coming a close second. In my limited experience, Eclipse PDT was great for local projects, but I’ve only seen basic syntax highlighting working in the Remote System Explorer. All three also understand JavaScript, CSS, HTML, and to some extent basic SQL and XML DTDs fairly well.

*Full disclosure: PhpStorm granted me a copy for my work on Minify, but I requested it, and my ravings about it and other IDEs are all unsolicited.

Define namespace constants using expressions

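Since const is evaluated at compile time, you can’t use arbitrary expressions in namespace constants (before PHP 5.6 only literal values were accepted; later versions allow constant expressions, but still no function calls or variables). You can, however, use define() as long as the name argument is the fully qualified name from the global scope (run this):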

<?php
namespace Foo\Bar;
const CONST1 = 1;
define('CONST2', 1 + 1); // global
define(__NAMESPACE__ . '\\CONST3', 1 + 1 + 1); // in namespace!
echo CONST1, " ", \CONST2, " ", CONST3; // prints '1 2 3'

The bad: PhpStorm comprehends const and global defines, but not define(__NAMESPACE__ . '\\CONST', $value).
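
As a further sketch of the same trick, the value can come from a function call, and the resulting constant can be read back from outside the namespace. The constant name CACHE_DIR and the path suffix are assumptions chosen for illustration:

```php
<?php
namespace Foo\Bar;

// The value is computed at run time, so a function call is allowed here.
// CACHE_DIR and '/foo-bar-cache' are hypothetical names for this sketch.
define(__NAMESPACE__ . '\\CACHE_DIR', \sys_get_temp_dir() . '/foo-bar-cache');

// Inside the namespace, the unqualified name resolves to Foo\Bar\CACHE_DIR.
echo CACHE_DIR, "\n";

// Elsewhere, use the fully qualified name or look it up with constant().
echo \Foo\Bar\CACHE_DIR, "\n";
echo \constant('Foo\\Bar\\CACHE_DIR'), "\n";

// No global constant of that name was created.
\var_dump(\defined('CACHE_DIR')); // bool(false)
```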

PHP RFC Preview: Dynamic Callback Expressions

I’m posting this to get some initial feedback on this idea before I officially submit an RFC.

Background

Even with PHP’s growing object-oriented and functional programming features, the callback remains widely-used and useful. However, forcing authors to create callbacks via strings and arrays presents difficulties:

  1. Most IDEs do not recognize callbacks as such, and so cannot offer autocompletion, rename refactoring, and other benefits of code comprehension.
  2. Authors can misspell identifiers inside strings.
  3. Within namespaced code, authors can forget to prepend the namespace, since function calls within the namespace do not require it.
  4. Where use statements change the identifier for a class, authors can specify the local classname instead of the fully resolved name.
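
A minimal sketch of difficulties 1 and 3, using a hypothetical namespace and function (App\Util\shout is invented for this example):

```php
<?php
namespace App\Util;

// Hypothetical helper used only for this illustration.
function shout(string $s): string
{
    return \strtoupper($s) . '!';
}

// A direct call is resolved relative to the current namespace...
echo shout('hi'), "\n"; // HI!

// ...but a string callback is looked up from the global scope (difficulty 3).
\var_dump(\is_callable('shout'));                   // bool(false)
\var_dump(\is_callable(__NAMESPACE__ . '\\shout')); // bool(true)

// The working form hides the function name inside a string, where most IDEs
// cannot offer autocompletion or rename refactoring (difficulty 1).
\print_r(\array_map(__NAMESPACE__ . '\\shout', ['a', 'b']));
```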

Proposal Continue reading  

Convert Google Maps embed HTML to Street View URL

You can use the form below to convert the HTML embed code Google Maps gives you to a usable Street View URL.

Why do I need this?

The new Google Maps layout has a chain-link icon on the left that gives you a URL to what you’re looking at. If you’re in Street View, sometimes the given URL doesn’t include the proper parameters and you end up back on the top-down map view. This converter pulls a valid Street View URL out of the embed HTML.

source code

String Subtypes for Safer Web Programming

Valid HTML markup involves several different contexts and escaping rules, yet many APIs give no precise indication of which context their string return values are escaped for, or how strings should be escaped before being passed in (let’s not even get into character encoding). Most programming languages only have a single String type, so there’s a strong urge to document a function with @param string and/or @return string and move on to other work, but this is rarely sufficient information.

Look at the documentation for WordPress’s get_the_title:

Returns

(string) 
Post title. …

If the title is Stan "The Man" & Capt. <Awesome>, will & and < be escaped? Will the quotes be escaped? “string” leaves these important questions unanswered. This isn’t meant to slight WordPress’s documentation team (they at least frequently give you example code from which you can guess the escaping model); the problem is endemic to web software.

So for better web security—and developer sanity—I think we need a shared vocabulary of string subtypes which can supply this missing metadata at least via mention or annotation in the documentation (if not via actual types).

Proposed Subtypes and Content Models

A basic set of four might help quite a bit. Each should have its own URL to explain its content model in detail, and how it should be handled:

Unescaped
Arbitrary characters not escaped for HTML in any way, possibly including nulls/control characters. If a string’s subtype is not explicit, for safety it should be assumed to contain this content.
Markup
Well-formed HTML markup matching the serialization of a DocumentFragment
TaglessMarkup
Markup containing no literal less-than sign (U+003C) characters (e.g. for output inside title/textarea elements)
AttrValue
TaglessMarkup containing no literal apostrophe (U+0027) or quotation mark (U+0022) characters, for output as a single/double-quoted attribute value

What would these really give us?

These subtypes cannot make promises about what they contain, but are rather for making explicit what they should contain. It’s still up to developers to correctly handle input, character encoding, filtering, and string operations to fulfill those contracts.

The work left to do is to define how these subtypes should be handled and in what contexts they can be output as-is, and what escaping needs to be applied in other contexts.
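
As one possible sketch of that handling in PHP, the two functions below convert Unescaped content to TaglessMarkup and AttrValue and name the subtypes in their doc comments. The function names are hypothetical, and the choice of htmlspecialchars() flags is my assumption about how the content models could be met. Note that it does nothing about nulls or control characters:

```php
<?php
/**
 * @param string $s Unescaped
 * @return string TaglessMarkup (no literal "<"; safe inside title/textarea)
 */
function toTaglessMarkup(string $s): string
{
    // ENT_NOQUOTES escapes &, < and > but leaves both quote characters alone.
    return htmlspecialchars($s, ENT_NOQUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/**
 * @param string $s Unescaped
 * @return string AttrValue (also no literal ' or ")
 */
function toAttrValue(string $s): string
{
    // ENT_QUOTES additionally escapes single and double quotes.
    return htmlspecialchars($s, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$title = 'Stan "The Man" & Capt. <Awesome>'; // Unescaped

echo '<title>', toTaglessMarkup($title), "</title>\n";
echo '<a title="', toAttrValue($title), '">post</a>', "\n";
```

With annotations like these, the questions about get_the_title above would be answered by the documentation itself.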

Obvious Limitations

For the sake of simplicity, these subtypes shouldn’t attempt to address notions of input filtering or whether a string should be considered “clean”, “tainted”, “unsafe”, etc. A type/annotation convention like this should be used to assist—not replace—experienced developers practicing secure coding methods.

RotURL: Rot13 for URLs

RotURL is a simple substitution cipher for encoding/obscuring URLs embedded in other URLs (e.g. in a querystring). Also, common chars that need to be escaped (:/?=&%#) are mapped to infrequently used capital letters, so this generally yields shorter querystrings, too.

/**
 * Rot35 with URL/urlencode-friendly mappings. To avoid increasing size during
 * urlencode(), commonly encoded chars are mapped to more rarely used chars.
 */
function rotUrl($url) {
    return strtr($url,
        './-:?=&%# ZQXJKVWPY abcdefghijklmnopqrstuvwxyz123456789ABCDEFGHILMNORSTU',
        'ZQXJKVWPY ./-:?=&%# 123456789ABCDEFGHILMNORSTUabcdefghijklmnopqrstuvwxyz');
}

rotUrl('https://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=Base64#foo')
    == '8MMGLJQQ5EZR9B9G5491ZFI7QRQ9E45SZG8GKM9MC5VxG5391CPcjx51I38WL51I38Vk1L5fdY6FF';
rotUrl(rotUrl($anyUrl)) == $anyUrl; // applying it twice restores the original

You could save a few more bytes by encoding the scheme (e.g. “h” for http://, “H” for https://). Since your end encoding has to be URL-safe, there’s not much you can do beyond this to compress a URL embedded in a URL.
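
A rough sketch of that scheme shortcut follows. The wrapper names rotUrlPack/rotUrlUnpack and the "_" marker for other schemes are my own assumptions, not part of RotURL:

```php
<?php
// rotUrl() is repeated from above so this block runs on its own.
if (!function_exists('rotUrl')) {
    function rotUrl($url) {
        return strtr($url,
            './-:?=&%# ZQXJKVWPY abcdefghijklmnopqrstuvwxyz123456789ABCDEFGHILMNORSTU',
            'ZQXJKVWPY ./-:?=&%# 123456789ABCDEFGHILMNORSTUabcdefghijklmnopqrstuvwxyz');
    }
}

// Hypothetical wrapper: replace the scheme with a one-character marker.
function rotUrlPack($url) {
    if (strpos($url, 'https://') === 0) {
        return 'H' . rotUrl(substr($url, 8));
    }
    if (strpos($url, 'http://') === 0) {
        return 'h' . rotUrl(substr($url, 7));
    }
    return '_' . rotUrl($url); // any other scheme: encode the whole URL
}

// Hypothetical inverse of rotUrlPack().
function rotUrlUnpack($packed) {
    $rest = rotUrl(substr($packed, 1));
    switch (substr($packed, 0, 1)) {
        case 'H': return 'https://' . $rest;
        case 'h': return 'http://' . $rest;
        default:  return $rest;
    }
}

$target = 'https://en.wikipedia.org/w/index.php?title=Special%3ASearch&search=Base64#foo';
echo 'plain:  ', strlen(urlencode($target)), " bytes\n";
echo 'packed: ', strlen(urlencode(rotUrlPack($target))), " bytes\n";
```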

Validate Private Page Bookmarklet

ValidatePrivatePage <– validates in current window

ValidatePrivatePage <– validates in new window (your pop-up blocker may complain)

If you need to validate the markup of a page that’s not public (e.g. on localhost), you can now use this bookmarklet to auto-submit the current page source to the validator (instead of viewing source, copying, opening the validator, pasting in, and pressing “check”).

Note: this gets the page source by making an XMLHttpRequest to the current URL, so it does not get interpreted by the browser; i.e. this is NOT based on innerHTML. If the request made returns a different page (e.g. you were logged out in the meantime), that page’s source will be sent to the validator. Not much can be done about that. I once wrote a crusty PHP4 class/bookmarklet combo that helped do this, but thanks to the standardization of XMLHttpRequest, this is easy in JS now. You should also thank W3C for allowing cross-domain POSTs to the validator :)

NetBeans Love & Hate

For those cases where you have to work on remote code, NetBeans’ remote project functionality seems to put it ahead of other PHP IDEs. It pulls down a tree of files and uploads files that you save. Having a local copy allows it to offer its full code comprehension, auto-complete, and great rename refactoring for “remote” code. In contrast, Eclipse allows you to open remote files using Remote System Explorer, but you only get PHP syntax highlighting, not the excellent PDT.

But NetBeans is not all smiles and sunshine. Continue reading  

Helping NetBeans/PhpStorm with Autocomplete/Code-hinting

Where NetBeans can’t guess the type/existence of a local variable, you can tell it in a multiline comment:

/* @var $varName TypeName */

After this comment (and as long as TypeName is defined in your project/project’s include path), when you start to type $varName, NetBeans will offer to autocomplete it, and will offer TypeName method/property suggestions. If you rename the variable with Ctrl+r (rename refactoring), NetBeans will change the comment, too.

I usually forget this syntax because type comes first in @param declarations.

Update: PhpStorm supports a similar syntax, but reversing the type and variable name:

/* @var TypeName $varName */
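
Here is a small sketch of where such a hint helps. The Invoice and Registry classes are hypothetical. Registry::get() returns an untyped value, so the IDE has nothing to infer from until the comment supplies the type:

```php
<?php
// Hypothetical classes for illustration only.
class Invoice
{
    public function total(): float
    {
        return 42.0;
    }
}

class Registry
{
    private $items = [];

    public function set(string $key, $value): void
    {
        $this->items[$key] = $value;
    }

    // No return type: the IDE cannot tell what comes back from here.
    public function get(string $key)
    {
        return $this->items[$key];
    }
}

$registry = new Registry();
$registry->set('invoice', new Invoice());

// PhpStorm order shown; for NetBeans write: /* @var $invoice Invoice */
/* @var Invoice $invoice */
$invoice = $registry->get('invoice');

echo $invoice->total(), "\n"; // total() can now be offered by autocompletion
```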