Enabling Tamil Unicode support (utf8_unicode_ci collation) in WordPress 2.8.x
Monday, September 14, 2009 2:53:55 AM
WordPress is a great tool for blogging. Nowadays WordPress is gaining momentum after support for Unicode was introduced. The Unicode charset (UTF-8) lets the database store data in UTF-8 format, which is the best choice for Tamil, Hindi or any other Indic-language blog. This helps in publishing multilingual (Tamil, Hindi and various other non-English language) blogs.
How to start?
My webhost uses cPanel, which includes the WordPress 2.8.4 package under the Fantastico section. When you click the install WordPress link, it creates the database, initializes the PHP engines and writes the configuration files by itself. During creation, remember to set the "Charset" option to "utf8_unicode_ci" or "UTF8-Unicode".
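To confirm what the installer wrote, you can look at the DB_CHARSET and DB_COLLATE constants in wp-config.php. The following is only a rough sketch in Python: the sample config text and the helper name read_db_encoding are made up for illustration, and a real wp-config.php may format these lines differently.

```python
import re

# Matches lines such as: define('DB_CHARSET', 'utf8');
DEFINE_RE = re.compile(
    r"""define\(\s*['"](DB_CHARSET|DB_COLLATE)['"]\s*,\s*['"]([^'"]*)['"]\s*\)"""
)

def read_db_encoding(config_text):
    """Return the DB_CHARSET / DB_COLLATE values found in the given text."""
    return {name: value for name, value in DEFINE_RE.findall(config_text)}

# Hypothetical excerpt of a wp-config.php file (assumption, not the author's file).
sample_config = """
define('DB_NAME', 'user_wrdp1');
define('DB_CHARSET', 'utf8');
define('DB_COLLATE', 'utf8_unicode_ci');
"""

settings = read_db_encoding(sample_config)
print(settings)
```

If DB_CHARSET is missing or is not utf8, WordPress may talk to the database in a different charset even when the tables themselves are fine.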
If by any chance you forgot to set the charset, don't worry. Just log in to cPanel or your webhost's control panel. Then locate the Databases section and open phpMyAdmin. You should be looking at something similar to the screen shown below. Under "localhost", make sure the collation parameter is set to "utf8_unicode_ci". If not, click the drop-down and choose "utf8_unicode_ci".
Then click the WordPress database link on the left side. In most cases, WordPress 2.2 and above creates the database with a name ending in "wrdp1" by default. The opening screen should look something like this:
Note that in my case the collation was set to "latin1_swedish_ci". With this setting, Tamil and other non-Latin text cannot be stored correctly in the MySQL database. Now click on the "Operations" tab at the top right.
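Why a Latin-only charset is a problem can be seen with a tiny Python experiment. This is just an illustration: Python's cp1252 codec stands in for MySQL's Latin charset here, which is an approximation, and the helper name can_encode is made up.

```python
text = "தமிழ்"  # the word "Tamil" written in Tamil script

def can_encode(value, encoding):
    """Return True if every character of value exists in the given encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

latin_ok = can_encode(text, "cp1252")   # Western European single-byte charset
utf8_ok = can_encode(text, "utf-8")     # Unicode, covers Tamil

print("cp1252:", latin_ok)
print("utf-8:", utf8_ok)
print("bytes in UTF-8:", len(text.encode("utf-8")))
```

The Latin codec has no slots for Tamil letters at all, while UTF-8 stores each of them in three bytes.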
Locate the "Collation" box towards the end of the page as shown below and change the value to "utf8_unicode_ci". Click Go.
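For those who prefer the SQL tab over clicking, the same database-level change can probably be expressed as a single ALTER DATABASE statement. Here is a small Python sketch that builds it; the database name user_wrdp1 and the function name are placeholders, so substitute your own. As far as I know, this only changes the default for tables created later and leaves existing tables as they are.

```python
def alter_database_sql(db_name, charset="utf8", collation="utf8_unicode_ci"):
    """Build an ALTER DATABASE statement that sets the default charset and collation."""
    # Backticks quote the identifier; a backtick inside the name is doubled.
    quoted = "`" + db_name.replace("`", "``") + "`"
    return f"ALTER DATABASE {quoted} CHARACTER SET {charset} COLLATE {collation};"

# "user_wrdp1" is a made-up example name.
statement = alter_database_sql("user_wrdp1")
print(statement)
```

Paste the printed statement into the phpMyAdmin SQL tab (after taking a backup).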
Now you should see a screen like the one shown below. Note the "Collation" column. Earlier it was set to latin1_swedish_ci in my case. The screenshot actually shows the values after I changed them to utf8_unicode_ci. To change all the values in bulk, click the "Check all" link below the table. After all the rows above are selected automatically, click on the "Pencil" icon below the table. See the small circle in the image below.
After clicking the pencil icon, you will be shown a screen as below.
Under the "Collation" column, change the value to "utf8_unicode_ci" in every drop-down. Repeat the same for all tables in the database; the list of tables is shown on the left as links. Make sure you have changed the collation of every text field to "utf8_unicode_ci" in all tables. This is a must! Even a single non-UTF-8 collation may produce weird results.
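Doing this by hand for every table gets tedious, so here is a possible SQL shortcut, generated with a few lines of Python. It is a sketch, not a tested migration: the three table names assume the default wp_ prefix, the function name is made up, and ALTER TABLE ... CONVERT TO CHARACTER SET converts the stored data as well, which is not exactly the same as the drop-down route. Take a full backup first.

```python
def convert_table_sql(table, charset="utf8", collation="utf8_unicode_ci"):
    """Build an ALTER TABLE statement converting all text columns of one table."""
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
        f"COLLATE {collation};"
    )

# Assumed table names (default "wp_" prefix); list your own tables here.
tables = ["wp_posts", "wp_comments", "wp_options"]

statements = [convert_table_sql(name) for name in tables]
for statement in statements:
    print(statement)
```

One caution: if Tamil text was already saved into the old Latin columns and shows up garbled, a plain conversion like this may not repair it and could make it worse, so try it on a copy of the database first.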
That's it. Once you have changed the collation for the text fields in all tables, you are set. Don't worry if the database home page still shows a non-UTF-8 collation in the "Collation" column for some tables, as shown below. But make sure the collation value in the last row is set to "utf8_unicode_ci".
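To double-check that no text field was missed, one option is to ask MySQL's information_schema for every column's collation and filter out the ones that are already fine. The sketch below keeps the query as a string (the %s placeholder style is the one common Python MySQL drivers use, which is an assumption about your setup) and runs the filter on made-up sample rows instead of a live database.

```python
# Query to run against MySQL; the parameter is the database (schema) name.
CHECK_QUERY = (
    "SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME "
    "FROM information_schema.COLUMNS "
    "WHERE TABLE_SCHEMA = %s AND COLLATION_NAME IS NOT NULL"
)

def non_utf8_columns(rows, wanted="utf8_unicode_ci"):
    """Return (table, column) pairs whose collation differs from the wanted one."""
    return [(table, column) for table, column, collation in rows
            if collation != wanted]

# Hypothetical result rows, shaped like the output of CHECK_QUERY.
sample_rows = [
    ("wp_posts", "post_title", "utf8_unicode_ci"),
    ("wp_posts", "post_content", "latin1_swedish_ci"),
    ("wp_comments", "comment_content", "utf8_unicode_ci"),
]

leftovers = non_utf8_columns(sample_rows)
print(leftovers)
```

An empty list means every text column already uses utf8_unicode_ci.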
I hope this quick and dirty guide helps Tamil bloggers set up their WordPress blogs without any major hassles. If you know of any other shortcuts or SQL queries, let me know. :-)
- Rajesh Sundaram