This has been bugging me for the last few months, ever since I got a job where I have to develop Greek sites. This encoding thing is a mess! You never know what will happen the next time you develop a new site. You have to be very careful about the database’s encoding and the one you use for your final output. If you’ve had problems with that, or this sounds like your problem too, keep reading. If you are about to develop a non-English site, I would definitely suggest that you take the following pointers into consideration…
Well, here comes a very bad scenario… You have a database that has, let’s say, a Greek encoding. So your database is Greek encoded, with greek_general_ci as its collation. And then it gets worse: your final output is in UTF-8. Now it’s time to scream “Houston… We have a problem!!”. Yup. You’re sc****d.
Been there before. What you need to do is this:
SET NAMES greek;
Although all of the above does work, it does sound a bit weird. One would ask: “If I have a site with many users, imagine doing this for every string on every request!”. Well, that sounds about right. The preferred way, and what I would suggest, is: “hey! utf’em all, fellas!”. I mean OK, I know: a UTF-8 string is bigger than the same string in the local encoding (Greek letters take two bytes each in UTF-8 instead of one in ISO-8859-7), but you will never have encoding problems. If somebody comes to you and says “I want to open my forums to the Chinese market”, the only thing you will have to say is “hire a translator”.
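To make the size trade-off and the mismatch concrete, here is a minimal sketch. It is in Python rather than PHP, simply because it is a quick way to poke at raw bytes; the sample word and the helper names `greek_to_utf8` and `breaks_as_utf8` are my own, not anything from a real site.

```python
# Assumption: SAMPLE is just an illustrative Greek word (unaccented capitals).
# "iso8859_7" is Python's codec name for ISO-8859-7.
SAMPLE = "ΕΛΛΑΔΑ"

greek_bytes = SAMPLE.encode("iso8859_7")  # what a Greek-encoded database would hand back
utf8_bytes = SAMPLE.encode("utf-8")       # what a UTF-8 page expects


def breaks_as_utf8(raw: bytes) -> bool:
    """Return True if the bytes are not valid UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        return True


def greek_to_utf8(raw: bytes) -> bytes:
    """Re-encode ISO-8859-7 bytes as UTF-8 (hypothetical helper)."""
    return raw.decode("iso8859_7").encode("utf-8")


# One byte per letter in the local encoding, two per letter in UTF-8.
print(len(greek_bytes), len(utf8_bytes))

# The local-encoding bytes are not valid UTF-8, which is the garbage you see on the page.
print(breaks_as_utf8(greek_bytes))

# Converting once gives exactly the bytes a UTF-8 page wants.
print(greek_to_utf8(greek_bytes) == utf8_bytes)
```

The point of the sketch is only that the conversion has to happen somewhere: either the connection does it for you, or you do it per string, or you store UTF-8 in the first place and skip the whole thing.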
But here comes a much worse scenario, which has happened to me several times over the past few months. What if you want to transfer content from one site that uses a local encoding (e.g. ISO-8859-7) to one that uses UTF-8? Well, this is trickier, not because it’s hard but because there are more variables to think about. With two databases, source (a local one, for example greek_general_ci) and dest (UTF-8), the step-by-step way would be:
SET NAMES greek;
SET NAMES utf8;
Do not forget to run the last query. Well, it sounds pretty clear, but somewhere along the way it gets really messy. Somehow you either forget something or something gets in the way. For example, mysql_real_escape_string() converts some of the local Greek characters to UTF-8 :S Anyway, I don’t want to go on nagging. It’s been a long day. I hope my small summary helps you guys make your way through the encodings. If you have anything to add, go ahead and post a comment.
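For the messy case where some strings have already been turned into UTF-8 along the way and others have not, a rough sketch of a clean-up pass could look like the following. It is Python, not PHP, and there is no database here: the rows are plain byte strings made up for the example, and `normalize_to_utf8` and `migrate` are hypothetical names. The "try UTF-8 first, fall back to ISO-8859-7" check is a heuristic I am assuming is good enough for typical Greek text, not a guaranteed detector.

```python
from typing import Iterable, List


def normalize_to_utf8(raw: bytes, legacy: str = "iso8859_7") -> bytes:
    """Return UTF-8 bytes, assuming raw is either valid UTF-8 or legacy-encoded.

    Heuristic: bytes that already decode as UTF-8 are left alone; anything
    else is treated as the legacy encoding and re-encoded.
    """
    try:
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        return raw.decode(legacy).encode("utf-8")


def migrate(rows: Iterable[bytes]) -> List[bytes]:
    """Normalize every row before it is written to the UTF-8 destination."""
    return [normalize_to_utf8(row) for row in rows]


# Made-up source rows: one still in ISO-8859-7, one already UTF-8, one plain ASCII.
source_rows = [
    "ΑΘΗΝΑ".encode("iso8859_7"),
    "ΑΘΗΝΑ".encode("utf-8"),
    b"plain ascii",
]

migrated = migrate(source_rows)
for row in migrated:
    print(row.decode("utf-8"))
```

Plain ASCII is identical in both encodings, so it passes through untouched. If a row is in some third encoding this will happily produce nonsense, so it is worth eyeballing a sample of the output before trusting it.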
/me out