Anyone who has written a web page has probably had to decide which character encoding to use for their web site. The encoding is communicated from the web server to the browser either through the charset attribute of an HTML meta tag or through the Content-Type HTTP header. Most new web sites use UTF-8, as it can represent any Unicode character (including Japanese kanji, Chinese, Arabic, etc.), but many legacy web sites use an extended ASCII character set commonly called latin1. Our web site has always used latin1 and, more importantly, all our database tables are encoded in that same character set. Up until this past week, I had always worked under the assumption that ISO-8859-1 was in fact the same as latin1, but in the case of MySQL we found this NOT to be the case, even though all browsers will happily display a page declared as ISO-8859-1 that contains characters that are not technically legal in that encoding.
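To make the mismatch concrete, here is a minimal Python sketch comparing strict ISO-8859-1 with Windows-1252 (cp1252), which is the encoding MySQL's latin1 is documented to be based on and the one browsers fall back to for pages labelled ISO-8859-1. The sample characters and variable names are assumptions chosen for illustration; they are not taken from the site described here, and Python's cp1252 codec is only an approximation of MySQL's latin1, which also assigns the few bytes cp1252 leaves undefined.

```python
# Illustrative sketch only: the sample characters are assumptions.
samples = ["€", "“", "”", "é"]

for ch in samples:
    cp1252_bytes = ch.encode("cp1252")
    try:
        # Strict ISO-8859-1 only covers U+0000 to U+00FF.
        iso_bytes = ch.encode("iso-8859-1")
    except UnicodeEncodeError:
        iso_bytes = None  # not representable in ISO-8859-1
    print(repr(ch), cp1252_bytes, iso_bytes)

# The same byte decodes differently depending on which encoding is assumed.
raw = b"\x80"
as_iso = raw.decode("iso-8859-1")   # a C1 control character, U+0080
as_cp1252 = raw.decode("cp1252")    # the euro sign, U+20AC
print(repr(as_iso), repr(as_cp1252))
```

The two encodings agree on accented letters such as é, and differ only in the 0x80 to 0x9F range, where ISO-8859-1 has control codes and cp1252 has printable characters such as the euro sign and curly quotes.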