Search this site:

Converting incoherent sets of data between charsets

Dato complains that converting changelogs to be UTF8-clean is not always as simple as running iconv - One of the reasons that took me so long to migrate my blog is that, due to having migrated (at different points in time) my previous CMS (Jaws 0.4->0.5->0.7), its underlying database (MySQL 3.4 -> 4.0 -> 5.0 IIRC), the distribution (Debian Woody -> Sarge -> Etch+backports), several reinstalls and all… Well, I had a completely mixed-up database, with some tables in UTF, some tables in latin1, some tables with mixed rows, some tables that for some strange reason had double-mixed rows (that is, that had UTF8 misrepresented as latin1 and then re-encoded into UTF8)… No, it was not fun to sort out.