Wednesday, October 24, 2007

Import and Export

Recently we have been doing some work on exporting and importing data between MySQL databases. Our customer wants to be able to export configuration data from one MySQL instance and import it into another.

The obvious tool to use is mysqldump. But since we need to edit the exported data (just replacing some codes) before importing it into the other instance, we use the --tab option. This means that instead of one big file (or rather, redirected standard output) we get two files per table: one with the CREATE TABLE statement and one with the data.

In the next step we run some scripts to modify the data. This is quite easy, since the data is in text files with one record per row.
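Such a script can be very small. Here is a runnable sketch in Java, assuming tab-separated files as produced by --tab; the column index and the codes "DEV" and "PROD" are made-up examples, not the actual codes we replace:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ReplaceCodes {
    // Replace oldCode with newCode in the given column of tab-separated rows.
    static List<String> replaceColumn(List<String> rows, int column,
                                      String oldCode, String newCode) {
        List<String> result = new ArrayList<>();
        for (String row : rows) {
            String[] fields = row.split("\t", -1); // keep empty trailing fields
            if (column < fields.length && fields[column].equals(oldCode)) {
                fields[column] = newCode;
            }
            result.add(String.join("\t", fields));
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Example: change the code "DEV" to "PROD" in column 2
        // of an exported table file.
        Path in = Paths.get(args[0]);
        Path out = Paths.get(args[1]);
        Files.write(out, replaceColumn(Files.readAllLines(in), 2, "DEV", "PROD"));
    }
}
```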

Finally we import the data into the new instance. While the default mode of mysqldump creates a file you can run in the command-line client, mysqldump --tab does not. You have to create that file yourself, but it is not very complicated. You need the following lines for every table:

source [tablename].sql
LOCK TABLES [tablename] WRITE;
ALTER TABLE [tablename] DISABLE KEYS;
LOAD DATA INFILE '[path][tablename].txt' INTO TABLE [tablename];
ALTER TABLE [tablename] ENABLE KEYS;
UNLOCK TABLES;

The ALTER TABLE and LOCK/UNLOCK TABLES commands are not strictly necessary, but they speed things up. We have not measured this approach, but the general impression is that it is fast.
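Since you have to produce this script yourself, a small generator helps when there are many tables. A sketch in Java; the table names and export path in the example are made up:

```java
import java.util.List;

public class ImportScript {
    // Emit the per-table import commands described above.
    // 'path' is the directory mysqldump --tab wrote the .txt files to.
    static String scriptFor(List<String> tables, String path) {
        StringBuilder sb = new StringBuilder();
        for (String t : tables) {
            sb.append("source ").append(t).append(".sql\n");
            sb.append("LOCK TABLES ").append(t).append(" WRITE;\n");
            sb.append("ALTER TABLE ").append(t).append(" DISABLE KEYS;\n");
            sb.append("LOAD DATA INFILE '").append(path).append(t)
              .append(".txt' INTO TABLE ").append(t).append(";\n");
            sb.append("ALTER TABLE ").append(t).append(" ENABLE KEYS;\n");
            sb.append("UNLOCK TABLES;\n\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(scriptFor(List.of("customer", "orders"), "/tmp/export/"));
    }
}
```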

This method is probably faster than the default mode of mysqldump, because the data is not sent through the SQL parser, where there have been some performance problems. LOAD DATA INFILE is in most cases very fast.

Monday, October 22, 2007

Out of memory...

The last week or so has been busy. We are working on our next release, which seems to work well on a functional level. But we are getting out-of-memory errors...

There are several causes for this, but the main one involves Hibernate. The part that causes the most problems is where we read quite a lot of data and generate data for the online part of the system. The generated data is stored as BLOBs in the database, and Hibernate is used throughout. We want to be sure that everything or nothing is processed, so it is all done in one transaction.

Now it seems that even though we only write to the table with the BLOBs and never read from it, Hibernate keeps all the records in the session cache, filling up memory. To me this seems very strange. We don't need this cache: once the records are written to the database, they should be removed from it. In my experience it is a very common scenario that records created in a process are not read again in the same process. In fact, it is good practice to gather all the data before inserting, so you never need to read the same record back.

It is also hard to understand why Hibernate does not handle out-of-memory conditions. A good cache should not grow without limit; it should purge data when memory usage gets too high. This is not something you should have to do in your application code.
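A common way to keep the session cache bounded in a write-mostly job like this is to flush and clear the session every N records, within the one transaction. The batching logic can be sketched like this; the Session interface here is a minimal stand-in for Hibernate's, so the sketch runs on its own:

```java
import java.util.List;

public class BatchWriter {
    // Minimal stand-in for the Hibernate Session operations we need;
    // in the real application this would be org.hibernate.Session.
    interface Session {
        void save(Object entity);
        void flush(); // push pending inserts to the database
        void clear(); // drop all entities from the session cache
    }

    // Save all entities, but keep the session cache small by
    // flushing and clearing every batchSize records.
    static void saveAll(Session session, List<?> entities, int batchSize) {
        int i = 0;
        for (Object e : entities) {
            session.save(e);
            if (++i % batchSize == 0) {
                session.flush();
                session.clear();
            }
        }
        session.flush(); // push the tail of the last, partial batch
    }
}
```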

The problem seems to be solved for now, but since data volumes in the system will grow, it will probably reappear. We will have to do something more permanent in the future, probably bypassing Hibernate for the BLOB inserts and doing them with plain JDBC calls.
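The JDBC variant could look like the sketch below; the table and column names are hypothetical stand-ins for the real schema. Since nothing is cached between statements, memory use stays flat no matter how many rows are written:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BlobInsert {
    // Hypothetical table and column names -- adjust to the real schema.
    static final String INSERT_SQL =
        "INSERT INTO generated_data (id, payload) VALUES (?, ?)";

    // Insert one BLOB row with plain JDBC; no session cache is involved,
    // so repeated calls do not accumulate objects in memory.
    static void insertBlob(Connection con, long id, byte[] payload)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(INSERT_SQL)) {
            ps.setLong(1, id);
            ps.setBytes(2, payload);
            ps.executeUpdate();
        }
    }
}
```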

Thursday, October 11, 2007

Moving !!


I haven't had much time for blogging for a while, and probably won't have until perhaps December. Most of my spare time is spent on moving.

After 23 years I am leaving Lund. Not going too far, though: just some 60 kilometers, to Helsingborg. From the university town of Lund, where Sony Ericsson is the largest employer, to the harbour town of Helsingborg, home of IKEA's IT department.

Wednesday, October 03, 2007

Hibernate proxy classes and reflection

I have been struggling the last few days with a small tool to export and import data from a system that uses Hibernate object-relational mapping. To make it fairly general, and to avoid having to change it when attributes are added to the data model, we based it on Java reflection. Fairly straightforward: create an XML file with class names as tag names and fields as attributes.
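The core of such an export fits in a few lines of plain Java reflection. A runnable sketch; the Customer class below is a made-up stand-in for a real mapped entity:

```java
import java.lang.reflect.Field;

public class XmlExport {
    // Render one object as an XML element: the class name becomes the tag,
    // each declared field becomes an attribute, read via Field access.
    static String toXml(Object o) throws IllegalAccessException {
        Class<?> c = o.getClass();
        StringBuilder sb = new StringBuilder("<" + c.getSimpleName());
        for (Field f : c.getDeclaredFields()) {
            f.setAccessible(true);
            sb.append(' ').append(f.getName()).append("=\"")
              .append(f.get(o)).append('"');
        }
        return sb.append("/>").toString();
    }

    // Hypothetical entity, stands in for a real mapped class.
    static class Customer {
        long id = 42;
        String name = "ACME";
    }
}
```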

The first problem was object relations. My initial plan was to simply follow them, but since they are all bi-directional this would mean looping forever. So we had to add a setup where you tell the tool which object relations should be included.

The next problem was worse. It seems that Hibernate sometimes does not create the right objects when loading them from the database; instead, so-called 'proxy classes' are created. This means that reflection does not work as expected... You cannot use Field to access the fields; instead you have to use the getter methods. OK, not too complicated, once you know. Unfortunately we had one attribute that did not follow the naming standards, so we needed a small hack to make it work. What I would have liked is a way to get the real class and skip the proxy. But I guess you can't get everything.
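The effect is easy to reproduce with the JDK's own dynamic proxies. This is only an analogy (Hibernate actually generates subclasses of the entity class, rather than interface proxies), but it shows why the getters keep working while Field access does not. As far as I can tell, Hibernate 3 does offer org.hibernate.Hibernate.getClass(entity) to get at the real class behind a proxy:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class ProxyDemo {
    interface Customer {
        String getName();
    }

    // Build a JDK dynamic proxy -- a rough analogy to the proxy
    // classes Hibernate generates for lazily loaded entities.
    static Customer proxiedCustomer(String name) {
        return (Customer) Proxy.newProxyInstance(
            Customer.class.getClassLoader(),
            new Class<?>[] { Customer.class },
            (Object p, Method m, Object[] margs) ->
                m.getName().equals("getName") ? name : null);
    }

    public static void main(String[] args) {
        Customer c = proxiedCustomer("ACME");
        // The runtime class is a generated proxy, not a hand-written
        // entity, so reflecting on its fields tells you nothing useful...
        System.out.println(Proxy.isProxyClass(c.getClass())); // true
        // ...but the getter still works, which is why the export tool
        // had to switch from Field access to getter calls.
        System.out.println(c.getName()); // ACME
    }
}
```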