Quick ActiveSupport::Multibyte glossary trick

Manfred Stienstra

I was trying to make a glossary of words grouped by their first letter, but I wanted words starting with the letter é grouped with words starting with the letter e. No small feat you might imagine. Wrong.

dict = words.inject({}) do |dict, word|
  letter = word.chars.decompose[0..0].downcase.to_s
  dict[letter] ||= []
  dict[letter] << word; dict

The reason this works is that letters like é have a decomposed form in Unicode, this form consists of a latin letter and a accent modifier. I’m not sure what happens if you run Arabic through this code, but we’ll cross that bridge when we get there.