Unichars 0.2 released

Manfred Stienstra

Unichars is a simple wrapper around Glib2 Unicode functions. You can use it to speed up certain methods on Unicode strings. Currently supported are upcase, downcase, reverse, and size. The cool thing about it is that it works seamlessly with ActiveSupport::Multibyte, and it works just as well without it.

I know Unichars is not a very exciting name like God, Vlad the Deployer, or Gerard Joling, but I guess I’m not that kind of guy.

You can install Unichars with Rubygems:

$ gem install unichars

Or you can fetch it from GitHub:

$ git clone git://github.com/Manfred/unichars.git
$ cd unichars
$ rake gem:install

With Rails

The examples in the README tell you how to use Unichars with Rails 2.1 or newer. I’ll just reiterate how it’s done.

First, make sure you load the library. The easiest way to do this is with config.gem in environment.rb:

config.gem 'unichars'

Or, if you dislike gems, you can just require it:

require 'unichars'

When you’re not using config.gem, you have to make sure ActiveSupport is loaded before Unichars, otherwise the Rails integration won’t magically work.

After that, you have to tell ActiveSupport::Multibyte to use the Unichars class as its proxy class. You can do that in an initializer or at the end of your environment.rb. I would recommend doing it in config/initializers/unichars.rb.

ActiveSupport::Multibyte.proxy_class = Unichars

Now all of Rails will automatically use the Unichars character proxy. You can also use it yourself:

'Café'.mb_chars.reverse #=> 'éfaC'
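
To see why reverse needs a character-aware proxy at all, here is a minimal sketch in plain Ruby that compares reversing raw bytes with reversing code points. It does not use Unichars or Glib2; the helper names reverse_bytes and reverse_code_points are made up for this illustration, and it assumes Ruby 1.9 or newer for the encoding methods.

```ruby
# Hypothetical helpers for illustration only; neither is part of Unichars.

# Reverse the raw bytes and reinterpret the result as UTF-8.
def reverse_bytes(str)
  str.unpack('C*').reverse.pack('C*').force_encoding('UTF-8')
end

# Reverse by code point so multibyte sequences stay intact.
def reverse_code_points(str)
  str.unpack('U*').reverse.pack('U*')
end

word = "Caf\u00e9" # 'Café', written with an escape to keep the source ASCII

puts reverse_bytes(word).valid_encoding? # false
puts reverse_code_points(word)           # éfaC
```

The byte-wise version splits the two-byte é and produces invalid UTF-8, which is the kind of breakage a character proxy is there to prevent.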

Without Rails, but with ActiveSupport

require 'activesupport'
require 'unichars'
ActiveSupport::Multibyte.proxy_class = Unichars
$KCODE = 'u'

Other than that it’s similar to Rails:

'Sluß'.mb_chars.upcase #=> 'SLUSS'

This is a good time to talk about LC_CTYPE real quick. Note that Glib2 picks it up from your environment, so your results may vary depending on what it’s set to.
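
If your results look odd, it can help to check which locale your process would actually end up with. The sketch below follows the usual POSIX precedence (LC_ALL overrides LC_CTYPE, which overrides LANG). The helper name effective_ctype is hypothetical, and whether Glib2 resolves the locale in exactly this way on your system is an assumption, so treat it as a debugging aid rather than a description of what Unichars does.

```ruby
# Hypothetical helper: report the locale that POSIX precedence would select
# for the LC_CTYPE category. Accepts any Hash-like object so it can be tested.
def effective_ctype(env = ENV)
  %w[LC_ALL LC_CTYPE LANG].each do |name|
    value = env[name]
    return value unless value.nil? || value.empty?
  end
  'C' # assumption: treat a completely unset environment as the "C" locale
end

puts effective_ctype
```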

Without training wheels

require 'unichars'

If you don’t use ActiveSupport, you can still use Unichars because it comes with a light version of the Chars proxy. You will have to wire it up yourself though:

class String
  def mb_chars
    Unichars.new(self)
  end
end

'Copy-®'.mb_chars.size #=> 6
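
If you’re wondering what a character proxy like this boils down to, here is a rough sketch in plain Ruby. SketchChars is a made-up class for illustration and is not how Unichars is implemented (Unichars delegates to Glib2); it only shows the general shape: wrap a string, override the methods that need to count characters instead of bytes, and forward everything else.

```ruby
# Hypothetical stand-in for a character proxy; not the Unichars implementation.
class SketchChars
  def initialize(string)
    @wrapped = string
  end

  # Count code points rather than bytes.
  def size
    @wrapped.unpack('U*').size
  end

  # Reverse by code point and wrap the result in a new proxy.
  def reverse
    self.class.new(@wrapped.unpack('U*').reverse.pack('U*'))
  end

  def to_s
    @wrapped
  end

  # Forward any other String method, re-wrapping String results so calls chain.
  def method_missing(name, *args, &block)
    return super unless @wrapped.respond_to?(name)
    result = @wrapped.send(name, *args, &block)
    result.is_a?(String) ? self.class.new(result) : result
  end

  def respond_to_missing?(name, include_private = false)
    @wrapped.respond_to?(name, include_private) || super
  end
end

puts SketchChars.new("Copy-\u00ae").size # 6
```

Note that reversing by code point, as this sketch does, will still separate combining marks from their base characters.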

Without anything

Finally, you can just use the Glib2 wrapper and roll your own solution:

require 'glib'
Glib.utf8_upcase('Comme des Garçons') #=> 'COMME DES GARÇONS'

Questions?

If you have any questions or issues, please use the GitHub wiki as much as possible. If you want to discuss anything, you can find me on Freenode in #rails-contrib. Have fun with Unichars!


You’re reading an archived weblog post that was originally published on our website.