There is a gem called ydd that offers really simple import and export of smallish databases. It exports to YAML and then imports to whatever database Rails can connect to. After using YDD a few times I’ve found it easier to pinpoint the cause of problems that occur using taps.

It doesn’t handle character encodings though so I went about adding that. With the handy rchardet gem and IConv, detecting the character encoding of the incoming string and converting it to UTF-8 was pretty simple. I’ve created a pull request for the gem that will hopefully be accepted.

The essential code is below, and revolves mainly around the detection and conversion. Using //TRANSLIT causes IConv to try and convert the incoming character code to something that exists in the UTF8 character set, and then //IGNORE will ignore any characters that don’t exist in the UTF8 character set. Chaining //TRANSLIT and then //IGNORE will make IConv try a conversion first and then ignore anything it cannot convert.

I used this gem after the above changes to convert about 400,000 records of text data with ASCII, windows-1252, IBM866 and other character encodings from an old SQLite installation to a new postgres database without any issues.