Ruby and MySQL encoding flakiness

Manfred Stienstra

Over the last few weeks we noticed the dreaded question marks on our sites running against MySQL 5.0. We thought we had done everything to make sure our servers, databases, tables, clients and connections understood UTF-8, but somehow connections to the database were being reset to Latin1 after some time.

Instead of trying to fix the problem in Rails/Ruby/libmysql, I decided to squash it in the MySQL server configuration. By default we were seeing this:

mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 | 
| character_set_connection | latin1 | 
| character_set_database   | latin1 | 
| character_set_filesystem | binary | 
| character_set_results    | latin1 | 
| character_set_server     | latin1 | 
| character_set_system     | utf8   | 
+--------------------------+--------+

So I set the following in /etc/mysql/my.cnf:

[mysqld]
character-set-server = utf8

[client]
default-character-set = utf8

This forces all the encodings to UTF-8 by default:

mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   | 
| character_set_connection | utf8   | 
| character_set_database   | utf8   | 
| character_set_filesystem | binary | 
| character_set_results    | utf8   | 
| character_set_server     | utf8   | 
| character_set_system     | utf8   | 
+--------------------------+--------+
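If you want to keep an eye on this from Ruby, here is a minimal sketch of a check you could run against the output of that query. It is not part of our actual setup: the `non_utf8_charsets` helper and the `IGNORED` list are names I made up for illustration, and the rows are simply copied from the two tables above rather than fetched from a live connection. In practice you would feed it whatever your MySQL client library returns for the `SHOW VARIABLES` query.

```ruby
# Variables that are not expected to be UTF-8; character_set_filesystem
# is "binary" in both tables above, so the check skips it.
IGNORED = %w[character_set_filesystem].freeze

# Takes an array of [variable_name, value] pairs and returns a hash of
# the variables whose value does not start with "utf8".
def non_utf8_charsets(rows)
  rows.reject { |name, value| IGNORED.include?(name) || value.start_with?("utf8") }
      .to_h
end

# Rows copied from the first table (before the my.cnf change).
BEFORE = [
  ["character_set_client",     "latin1"],
  ["character_set_connection", "latin1"],
  ["character_set_database",   "latin1"],
  ["character_set_filesystem", "binary"],
  ["character_set_results",    "latin1"],
  ["character_set_server",     "latin1"],
  ["character_set_system",     "utf8"]
].freeze

# Rows copied from the second table (after the my.cnf change).
AFTER = BEFORE.map do |name, value|
  [name, value == "latin1" ? "utf8" : value]
end.freeze

offenders = non_utf8_charsets(BEFORE)
puts "Still not UTF-8: #{offenders.keys.join(', ')}" unless offenders.empty?
```

An empty hash means every variable the check looks at is on UTF-8, which is what the second table should give you. Running something like this periodically against a long-lived connection would be one way to catch the encoding falling back to Latin1.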