All modern web applications should be using UTF-8 encoding. For developers with English keyboards (where all keys produce 7-bit ASCII), testing web forms with multibyte characters can be a pain. You can, of course, enter Unicode characters via obscure key combinations, but using this bookmarklet may be easier:
Get it
This simply visits all text/password/textarea inputs on a page and replaces Latin vowels with multibyte variants. E.g. John A Public ⇒ Jōhn Ā Pūblīc.
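The bookmarklet's source isn't reproduced here, but the idea can be sketched roughly as follows. The mapping table and the names `MACRON_VOWELS`, `macronize`, and `macronizeInputs` are assumptions made for illustration, not the bookmarklet's actual code.

```javascript
// Hypothetical vowel table: each Latin vowel maps to its macron variant,
// which takes two bytes in UTF-8 instead of one.
const MACRON_VOWELS = {
  a: "\u0101", e: "\u0113", i: "\u012B", o: "\u014D", u: "\u016B",
  A: "\u0100", E: "\u0112", I: "\u012A", O: "\u014C", U: "\u016A",
};

// Replace every Latin vowel in a string with its macron variant.
function macronize(text) {
  return text.replace(/[aeiouAEIOU]/g, (ch) => MACRON_VOWELS[ch]);
}

// Apply macronize to every text/password/textarea field in a document.
// An <input> with no type attribute is treated as a text input.
function macronizeInputs(doc) {
  const fields = doc.querySelectorAll(
    'input:not([type]), input[type="text"], input[type="password"], textarea'
  );
  for (const field of fields) {
    field.value = macronize(field.value);
  }
}
```

In a browser you would call `macronizeInputs(document)`. With this table, `macronize("John A Public")` gives the same result as the example above.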
Test it here
The bookmarklet prompts me for “Match beginning”. What is this?
If you only want to affect certain inputs, enter a phrase at the prompt. The bookmarklet will then affect only inputs whose values start with that phrase. For example, to affect only a “comment” field, place “||” at the beginning of the field, and enter “||” at the match prompt. The bookmarklet will affect only this field and strip the “||” from the field for you.
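The matching rule described above can be sketched like this. It is an illustration of the behavior, not the bookmarklet's real code: `applyToMatching` is a hypothetical helper, and it works on plain strings instead of form fields so it can be run anywhere.

```javascript
// Hypothetical helper: transform only the values that start with `prefix`,
// stripping the prefix first. Values that do not match are left unchanged.
// An empty prefix matches every value, so nothing is filtered out.
function applyToMatching(values, prefix, transform) {
  return values.map((value) => {
    if (!value.startsWith(prefix)) {
      return value;
    }
    return transform(value.slice(prefix.length));
  });
}

// Example: only the value marked with "||" is transformed.
const result = applyToMatching(
  ["||Nice post", "John"],
  "||",
  (s) => s.toUpperCase()
);
console.log(result);
```

Here the uppercase transform stands in for the vowel replacement. The first value loses its “||” marker and is transformed; the second is untouched.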
Would this solve a problem that I have? A person’s name is, say, Elisabeth Söderström. It’s too much trouble to go to the character map and copy the “ö”. “Soederstroem” is right, too, but just looks weird, and “Soderstrom” doesn’t produce an accurate pronunciation. But I want to type “Soderstrom” and have it find “Söderström”.
Smart search engines will generally store “Latinified” versions, if you will, of non-Latin words in their index alongside the originals. Google seems to do this. It probably also helps that many web authors are too lazy to break out the character map, so there’s a lot of common data for both spellings.
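One common way to get a “Latinified” form is to decompose each character and drop the combining accent marks. The sketch below shows that idea only; it is not a description of how Google or any particular search engine works, and `foldAccents` and `looseMatch` are hypothetical names.

```javascript
// Decompose characters (NFD) so that "ö" becomes "o" plus a combining
// diaeresis, then remove the combining marks (U+0300 to U+036F).
function foldAccents(text) {
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Compare a query with an indexed word, ignoring accents and case.
function looseMatch(query, indexed) {
  return foldAccents(query).toLowerCase() === foldAccents(indexed).toLowerCase();
}

console.log(foldAccents("S\u00F6derstr\u00F6m"));
console.log(looseMatch("soderstrom", "S\u00F6derstr\u00F6m"));
```

Folding this way lets “Soderstrom” find “Söderström”, but it does nothing for “Soederstroem”, which needs language-specific transliteration rules.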
“Soederstroem” is a transliteration. This is tricky business that search engines probably don’t get into.
As for this tool, it’s just for web developers to make sure whatever goes in comes out unmangled.