There is a gem called ydd that offers really simple import and export of smallish databases. It exports to YAML and then imports into whatever database Rails can connect to. After using ydd a few times, I’ve found it easier to pinpoint the cause of problems that occur using taps.
It doesn’t handle character encodings, though, so I went about adding that. With the handy rchardet gem and Iconv, detecting the character encoding of the incoming string and converting it to UTF-8 was pretty simple. I’ve created a pull request for the gem that will hopefully be accepted.
The essential code is below, and revolves mainly around the detection and conversion. Appending //TRANSLIT to the target encoding makes Iconv try to transliterate any character that has no direct equivalent in UTF-8 into a close approximation, while //IGNORE makes it silently drop any character that cannot be represented in UTF-8. Chaining //TRANSLIT and then //IGNORE makes Iconv attempt a conversion first and then drop anything it still cannot convert.
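As a rough illustration of the convert-and-drop idea (not the code from the pull request), here is a minimal sketch that uses Ruby's built-in String#encode instead of Iconv, which was removed from the standard library in Ruby 2.0. The helper name to_utf8 is hypothetical, and the source encoding is passed in explicitly so the sketch runs without extra gems; in practice it would come from a detector such as rchardet (as far as I know, CharDet.detect(raw)["encoding"]). The replace options only approximate //IGNORE; nothing here transliterates the way //TRANSLIT does.

```ruby
# Hypothetical helper, not part of ydd: convert raw bytes in a known
# (or detected) source encoding to UTF-8, dropping anything that
# cannot be converted (roughly what //IGNORE does in Iconv).
def to_utf8(raw, source_encoding)
  raw.dup
     .force_encoding(source_encoding)   # label the bytes, no conversion yet
     .encode("UTF-8",
             invalid: :replace,         # malformed byte sequences
             undef:   :replace,         # bytes with no UTF-8 equivalent
             replace: "")               # replace with nothing, i.e. drop
     .scrub("")                         # covers the UTF-8 to UTF-8 case, where encode may do nothing
end

# Assumed usage: the encoding name would normally come from detection.
latin = to_utf8("caf\xE9".b, "Windows-1252")
cyrillic = to_utf8("\x8F\xE0\xA8\xA2\xA5\xE2".b, "IBM866")
puts latin
puts cyrillic
```

Dropping unconvertible bytes loses data silently, so for a real migration it may be worth logging the rows where the converted string is shorter than expected.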
I used this gem after the above changes to convert about 400,000 records of text data in ASCII, Windows-1252, IBM866 and other character encodings from an old SQLite installation to a new PostgreSQL database without any issues.
Benoist
May 25, 2011
Nice post!!
Could you tell me how long it took to convert 400k records??
I’ve got a similar task but a lot more records and limited time to migrate the data.
Greetz
Benoist
Aaron
May 25, 2011
Hi Benoist, it took about 6 minutes to export, and about the same to import. So all up, 10-15 minutes.
The DB had one table of 400k records. There was other data in there too, but it was just that table that had the problematic records.
Good luck.
sergio
Jun 30, 2011
Good evening friends, I am in Brazil and have been working with Ruby on Rails for 3 months.
I have Ruby 1.8.7 and Rails 3.
I am connecting to an old Firebird 2.1 database with the WIN1252 charset.
I’m having problems with characters, and while looking for a solution on Google I ended up finding your blog. I wonder if you have a more detailed tutorial on what you did to convert your database, because I could not follow it; you seem to be an advanced programmer.