Using BINARY in a MySQL IN Expression

With the default collations in MySQL (e.g. utf8_general_ci), strings will be matched case-insensitively:

WHERE col IN ('IT', 'Ruby') will match “it” and “ruby”.

For case-sensitive matches, each string must be preceded by the keyword BINARY:

WHERE col IN (BINARY 'IT', BINARY 'Ruby')

Caveat: I would guess that BINARY affects the matching of strings with combining characters (same character, different byte representation).

Character Encoding Bug of the Day

Today I had one of those bugs that starts out looking simple and keeps going deeper and deeper. Video service Kaltura has a plugin for Moodle, that just stopped working one day (no changes on the server).

  • It’s throwing an exception because an expected element isn’t in the page.
  • Oh, the element’s supposed to be delivered by XHR from the plugin.
  • But the plugin’s code is generating correct markup…
  • Why is Moodle’s function to serialize an array into a JS function call returning null for that markup?
  • json_encode is converting the markup string to null?
  • Because json_encode is choking on invalid UTF-8.
  • Because the markup has a right single quotation encoded in Windows-1252 :(
  • And that string is coming from the Kaltura API.

So over 2 years ago someone named a video player Jim’s Test Player and over the weekend Kaltura’s API started returning that single quote in Windows-1252. We removed the quote from the name and the problem disappeared.