Saturday, December 08, 2007
Web enabling MySQL
Mark Atwood has published an interesting post with his wishlist for MySQL. One of his points is already (partly) covered with the LOAD XML command I made for MySQL 6.0 (or is it 5.2).
The main problem with it is perhaps that, like the LOAD DATA command, it works only with files, but in a web world you want it to be able to take input from other sources. You also need to be able to send an XML document to several tables, with foreign keys coordinated, etc.
Mark also has a wishlist for the opposite, a SELECT OUTFILE command. I have made a couple of stored procedures to help with this, but an extended SELECT OUTFILE would be much easier to use and give many more possibilities. It would be fun to work on this; we'll see if I have the time, now that I am almost done with my relocation.
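As a baseline, MySQL's existing SELECT ... INTO OUTFILE only writes delimited text to a server-side file. A minimal sketch, with a hypothetical table and path:
SELECT customer_no, cust_name
    INTO OUTFILE '/tmp/customer.txt'
    FIELDS TERMINATED BY ','
    FROM customer;
An extended version would add control over the output format, XML among other things.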
Friday, November 16, 2007
RSS reader gadget
For a while I have been using a small PC program for the feeds I keep an eye on. It has lots of functionality, and with some work I can get it to do just what I want, which is very simple: just display a list of my feeds, where I can expand a feed and look at the posts.
But this gave me some problems... Sometimes it crashed, and I lost all my settings. I then had to set everything up again, which is a lot of work. Also, I only had it on one of the computers I use (three in all), which meant I could only follow the feeds on that computer.
So the last time it crashed, I uninstalled it and made a Google gadget for feeds instead. So far it is very basic; the main advantage over the ones I have seen is that it handles multiple feeds (there are excellent gadgets for handling only one feed). I will probably add more options in the future.
Thursday, November 01, 2007
Back to the J2ME world
I have got a new assignment, for a J2ME midlet this time. The primary target is not a small cellphone but the new BlackBerry Curve. This means we have to think differently: the BlackBerry has a much better interface, a larger screen and a QWERTY keyboard, and we have to make sure they are used in the best way possible. My previous J2ME experience has been with cellphones from Nokia, Sony Ericsson and Motorola, so this is something different.
The BlackBerry environment means that additional preprocessing has to be done. Not a big thing, but yet another point that differs from a PC/Windows client, where you just develop for Windows (possibly verifying on both XP and Vista) and don't care about the hardware. The J2ME platform is still not that mature (will it ever be?); you have to verify on the major phone brands, both that it works and that it looks OK.
Wednesday, October 24, 2007
Import and Export
Recently we have been doing some work with export and import of data from and to MySQL databases. Our customer wants to be able to export configuration data from one MySQL instance and import it into another.
The obvious tool to use is mysqldump. But since we need to edit the exported data (just replacing some codes) before importing it into the other instance, we use the --tab option. This means that instead of one big file (or actually redirected standard output) we get two files per table, one with the CREATE TABLE command and one with the data.
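The export step then looks something like this (user, directory and database name are just examples; note that --tab makes the server itself write the files, so the directory must be writable by it):
mysqldump --user=root --password --tab=/tmp/export mydb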
In the next step we run some scripts to modify the data. This is quite easy, since data is in text files with one record per row.
Finally we import the data into the new instance. While the default mode of mysqldump creates a file you can run in the command line client, mysqldump --tab does not. You have to create that yourself, but it is not very complicated. You need the following lines for every table:
source [tablename].sql
LOCK TABLES [tablename] WRITE;
ALTER TABLE [tablename] DISABLE KEYS ;
LOAD DATA INFILE '[path][tablename].txt' INTO TABLE [tablename] ;
ALTER TABLE [tablename] ENABLE KEYS ;
UNLOCK TABLES;
The ALTER TABLE and LOCK/UNLOCK commands are not absolutely necessary, but they speed things up. We have not measured the speed of this approach, but the general impression is that it is fast.
Probably this method is faster than the default mode of mysqldump, because data is not sent through the parser, where there have been some performance problems. LOAD DATA INFILE is in most cases very fast.
Monday, October 22, 2007
Out of memory......
The last week or so has been busy. We are working on our next release, which seems to work well on a functional level. But we are getting Out of memory errors...
There are different causes for this, but the main cause is Hibernate. The part that causes the most problems is where we read quite a lot of data and generate data for the online part of the system. The data is stored as BLOBs in the database, and Hibernate is used throughout. We want to be sure that everything or nothing is processed, so it is all done in one transaction.
Now it seems that even though we do not read from the table with the BLOBs, only write to it, Hibernate keeps all records in the session cache, filling the memory. To me this seems very strange. We don't need this cache; once the records are written to the database, they should be removed from the cache. In my experience this is a very common scenario: you do not read the records you just created again in the same process. In fact it is good practice to get all data before inserting, so you do not need to read the same record again.
It is also difficult to understand why Hibernate does not take care of out-of-memory conditions. A good cache should not grow without limit; it should purge data when memory usage is too high. This is not something you should have to do in your application.
The problem seems to be solved for now, but since data volumes in the system will grow, it will probably reappear. We will have to do something more permanent for the future, probably bypassing Hibernate for the BLOB inserts and doing them with plain JDBC calls.
Thursday, October 11, 2007
Moving !!
I haven't had much time for blogging for a while, and probably won't have until perhaps in December. Most of my spare time is spent on moving.
After 23 years I am leaving Lund. Not going too far though, just some 60 kilometers, to Helsingborg. From the university town of Lund, where Sony Ericsson is the largest employer, to the harbour town of Helsingborg, home of IKEA's IT department.
Wednesday, October 03, 2007
Hibernate proxy classes and reflection
I have been struggling the last few days with a small tool to export and import data from a system using Hibernate object-relational mapping. To make it fairly general, and to avoid having to change it when attributes are added to the data model, we based it on Java reflection. Fairly straightforward: create an XML file with class names as tag names and fields as attributes.
The first problem was object relations. My initial plan was to simply follow them, but since they are all bi-directional this would mean looping forever. So we had to add a setup where you tell the tool which object relations should be included.
The next problem was worse. It seems that Hibernate sometimes does not create the right objects when loading them from the database. Instead, so-called 'proxy classes' are created. This means that reflection does not work as expected... You cannot use Field to access the fields; instead you have to use the getter methods. OK, not too complicated, once you know. Unfortunately we had one attribute which did not follow naming standards, so we needed a small hack to make this work. What I would have liked is a way to get the real class and skip the proxy. But I guess you can't get everything.
Friday, September 07, 2007
SQL mode quiz answers
My quiz about SQL mode was a bit more difficult than I expected. The question was which of the following statements are affected by a change from the default SQL mode to ANSI mode.
I used the INFORMATION_SCHEMA tables for the examples throughout. This is to make it easy to try for yourself, without needing to create tables etc. If you use the Query Browser to try this out you should beware that setting the sql_mode might not work, since the Query Browser by default creates a new database connection for every statement. The easiest way to get around this is to use the 'set global sql_mode' command, but you should not do this on a production database, since strange things might happen...
If you want to read the documentation, you find it here.
Now for the solution...
a) SELECT concat(table_name,' ',column_name) FROM information_schema.COLUMNS
This was quite an easy one. CONCAT is a MySQL extension to ANSI SQL, but it is not disabled by the ANSI SQL mode. The command will work just as well whatever the SQL mode. But CONCAT is not portable to other databases. Oracle has a CONCAT function, but it takes only two arguments. SQL Server has no CONCAT, and I don't know of any other database that has one.
b) SELECT TABLE_NAME||' '||COLUMN_NAME FROM information_schema.COLUMNS
That was also an easy one. The double pipe is the ANSI-standard string concatenation operator. In MySQL's default mode it means OR.
So, in ANSI mode you will get:
+-------------------------------------+
| TABLE_NAME||' '||COLUMN_NAME |
+-------------------------------------+
| CHARACTER_SETS CHARACTER_SET_NAME
| CHARACTER_SETS DEFAULT_COLLATE_NAME
| CHARACTER_SETS DESCRIPTION
etc
In default mode:
+------------------------------+
| TABLE_NAME||' '||COLUMN_NAME |
+------------------------------+
| 0 |
| 0 |
| 0 |
| 0 |
| 0 |
c) SELECT count (*) FROM information_schema.COLUMNS
This was a tricky one. I guess it was not easy to see, but there is actually a space between count and the left parenthesis. This makes it invalid in MySQL standard mode:
mysql> SELECT count (*) FROM information_schema.COLUMNS;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near '*) FR
OM information_schema.COLUMNS' at line 1
Change to ANSI mode, and it works:
mysql> set sql_mode = 'ANSI';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT count (*) FROM information_schema.COLUMNS;
+-----------+
| count (*) |
+-----------+
|       503 |
+-----------+
1 row in set (0.11 sec)
So, this is a statement that works differently....
d) SELECT "TABLE_NAME" FROM information_schema.COLUMNS
This was also quite easy. According to the ANSI standard, double quotes are used for identifiers like table and column names. You only need them if the identifier contains characters that would otherwise be illegal, or if you want mixed case in your identifiers.
By default MySQL uses double quotes for strings. This means that the select will give the following result:
+------------+
| TABLE_NAME |
+------------+
| TABLE_NAME |
| TABLE_NAME |
| TABLE_NAME |
| TABLE_NAME |
If you change to ANSI, you will get the following result:
+----------------+
| TABLE_NAME |
+----------------+
| CHARACTER_SETS |
| CHARACTER_SETS |
| CHARACTER_SETS |
| CHARACTER_SETS |
| COLLATIONS |
e) SELECT `TABLE_NAME` FROM information_schema.COLUMNS
This is the MySQL alternative to the ANSI double quote. It does not follow the ANSI standard, but the MySQL ANSI mode will not turn it off. I do not know of another database that allows this syntax, so if you try to port your system you will have to remove the backticks.
Conclusion
Statements b, c and d are the ones that will behave differently if you change to ANSI mode. Statements a and e will work the same.
Wednesday, September 05, 2007
SQL mode quiz
MySQL follows the ANSI SQL standard quite closely. But there are a few points where it differs. To make MySQL follow the ANSI standard more closely you can use the SQL_MODE system variable.
If you set the SQL mode to ANSI, MySQL will interpret your SQL commands differently. Some of your SELECT statements might work differently or not at all. But which ones? Let's see if you know....
Suppose you change your sql_mode setting from none to ANSI. Which of the following statements will be affected?
a) SELECT concat(table_name,' ',column_name) FROM information_schema.COLUMNS
b) SELECT TABLE_NAME||' '||COLUMN_NAME FROM information_schema.COLUMNS
c) SELECT count (*) FROM information_schema.COLUMNS
d) SELECT "TABLE_NAME" FROM information_schema.COLUMNS
e) SELECT `TABLE_NAME` FROM information_schema.COLUMNS
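If you want to try them yourself before reading the answers, you can switch modes in the mysql client; the default mode is simply the empty string:
SET SESSION sql_mode = 'ANSI';
-- run the statements, then switch back:
SET SESSION sql_mode = '';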
Wednesday, August 29, 2007
Follow up on LOAD XML
A few weeks ago (time flies...) I posted a short note about my LOAD XML contribution. It has actually been available for some time, even if it is not so easy to find. It is included in MySQL 5.2.5:
download
documentation - quite good I think!
It is a bit sad that it is included in 5.2 (still in alpha) and not in 5.1. Since it is a separate command it will not affect existing installations and would not be risky. The faster it gets included, the faster we will get feedback on it, which is what I believe is needed right now. Is this something that MySQL users need and will use? Or will they use other tools to read XML data into their MySQL databases?
LOAD XML reads data into a MySQL table in much the same way as LOAD DATA does, and in fact the implementation is based on LOAD DATA. This means that it does not parse the entire XML document (I believe the MySQL XML function ExtractValue() does that) but instead reads one row at a time from the XML file and inserts it into the database. This makes it possible to read large XML files at quite a decent speed, although not as fast as LOAD DATA.
The mapping between XML and the table is based on two assumptions:
- there should be an XML tag that corresponds to a record in the table. You name this tag in the 'ROWS IDENTIFIED BY' clause
- both tags and attributes are matched automatically with column names, and the values are inserted into the table. Tags and attributes that do not match columns are simply ignored
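Putting the pieces together, a load might look like this (file and table names are hypothetical):
LOAD XML LOCAL INFILE 'person.xml'
    INTO TABLE person
    ROWS IDENTIFIED BY '<person>';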
The scenario it was designed for is an integration scenario, where you receive data in an XML file periodically and dump it into your database.
There are of course limitations you should be aware of:
- CDATA is not supported and binary fields cannot be handled
- it can only load data from a file
- the XML is not verified; an invalid XML document could very well be processed, and no validation against an XML schema or DTD is made
- there is no namespace support
All in all I think this could be quite a useful function. Arnold Daniels, who worked with me on this contribution initially, is planning some improvements, among other things making it work with XML data that is not read from a file.
Personally I will perhaps make some minor fixes, but probably start on something entirely different. Don't know what yet; your suggestions are welcome....
Monday, August 13, 2007
LOAD XML contribution added to MySQL
I got news from the MySQL people that my LOAD XML contribution has been added to release 5.2. Check the manual page.
The aim of the contribution is to simplify reading XML data into a MySQL database. The LOAD XML command supports three different XML formats, illustrated below:
- field values as attributes
- field values as tags
- the format produced by the mysql --xml command, with the field name as an attribute and the field value as the tag content
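Roughly, the three formats look like this (column names and values are placeholders):
<row column1="value1" column2="value2"/>

<row>
  <column1>value1</column1>
  <column2>value2</column2>
</row>

<row>
  <field name="column1">value1</field>
  <field name="column2">value2</field>
</row>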
If the LOAD XML command finds a matching field in the target table, the value is inserted; otherwise it is ignored. If you have used the LOAD DATA command (most of us have) you will recognize much of the functionality; LOAD XML works much the same way.
Monday, August 06, 2007
More on XML output from MySql
In a previous post I described how you can create XML output from SQL using stored functions. But that was a flat structure, with only one table and a simple XML structure: one tag for every row in the result set, with column values as attributes or sub tags. Part of the real power of the XML format is that it supports more complicated structures with hierarchical data. If you need to produce XML for integration with another application, you will probably have to produce an XML document with data from several tables and a hierarchical structure. To accomplish this in MySQL we can use the GROUP_CONCAT function, combined with our XML stored functions.
An example:
We take our select statement from the previous post, which gives us a list of tables in XML format:
select xml_tag('table',null, concat(xml_attr('name',t.table_name),
xml_attr('engine',t.engine)),null)
from information_schema.tables t
where t.table_schema = 'INFORMATION_SCHEMA'
We then add a join with the columns table to get column data:
inner join information_schema.columns c on c.table_name = t.table_name and c.table_schema = t.table_schema
and replace the last parameter of the xml_tag function call, the one for the subtags, with a new xml_tag call that produces a column tag:
xml_tag('column',c.column_name,null,null)
This gives us:
select xml_tag('table',null, concat(xml_attr('name',t.table_name),
xml_attr('engine',t.engine)),xml_tag('column',c.column_name,null,null))
from information_schema.tables t
inner join information_schema.columns c on c.table_name = t.table_name and c.table_schema = t.table_schema
where t.table_schema = 'INFORMATION_SCHEMA'
Running this gives us:
<table name="CHARACTER_SETS" engine="MEMORY" >
<column>CHARACTER_SET_NAME</column></table>
<table name="CHARACTER_SETS" engine="MEMORY" >
<column>DEFAULT_COLLATE_NAME</column></table>
<table name="CHARACTER_SETS" engine="MEMORY" >
<column>DESCRIPTION</column></table>
<table name="CHARACTER_SETS" engine="MEMORY" >
<column>MAXLEN</column></table>
<table name="COLLATIONS" engine="MEMORY" >
<column>COLLATION_NAME</column></table>
etc...
OK, this is some kind of structure, with column tags inside the table tag. But we still have one table tag for every column, while we want one for every table, with all the column tags inside. To accomplish this we add a group by clause and use the GROUP_CONCAT function for the column tags. This gives us the following SELECT:
select xml_tag('table',null, concat(xml_attr('name',t.table_name),
xml_attr('engine',t.engine)),GROUP_CONCAT(xml_tag('column',c.column_name,null,null) separator ''))
from information_schema.tables t
inner join information_schema.columns c on c.table_name = t.table_name and c.table_schema = t.table_schema
where t.table_schema = 'INFORMATION_SCHEMA'
group by t.table_name, t.engine
And the following result:
<table name="CHARACTER_SETS" engine="MEMORY" >
<column>CHARACTER_SET_NAME</column>
<column>DEFAULT_COLLATE_NAME</column>
<column>DESCRIPTION</column>
<column>MAXLEN</column>
</table>
<table name="COLLATIONS" engine="MEMORY" >
<column>COLLATION_NAME</column>
<column>CHARACTER_SET_NAME</column>
<column>ID</column>
<column>IS_DEFAULT</column>
<column>IS_COMPILED</column>
<column>SORTLEN</column>
</table>
etc
OK, this is what we wanted! Just one small problem left... When I run this on my laptop I get error 1260 with the following message:
%d line(s) were cut by GROUP_CONCAT()
And the result is only 7 rows. This is because the group_concat_max_len has a value that is too small. You can reset it with the command:
SET [SESSION | GLOBAL] group_concat_max_len = val
I set it with the command:
SET GLOBAL group_concat_max_len = 2048
I then rerun the command; this time there is no error, and I get 23 records.
Code for the stored functions is available at MySql forge.
Saturday, July 14, 2007
Comments on MySql XML output
I have had some comments on my stored functions for XML output from MySQL. First of all, publishing the code as part of a blog post was not a good choice. All XML characters, like < and >, must be escaped, and I missed some. I should probably have used the xml_escape function to process my text...
That is fixed now, and I have also put the function definitions on the MySQL Forge, here. Please do not hesitate to tell me if you find any errors or have suggestions for improvement.
Another comment was about the XML functionality of 'mysql --xml'. Yes, there is a function for generating XML output from MySQL; I do know that. But if you need to output XML from your application there are a number of problems with it:
1) it has to be invoked from the command line, so you need to start a command shell, redirect output somewhere, etc. This is not always possible, and might not be a good alternative even if it is.
2) you have no control over the XML format; you would probably need to add some XSL processing to get the format you need. This makes it complicated, and XSL knowledge is not that common.
So, there are a number of situations where the 'mysql --xml' approach is not very suitable, but the stored functions might do the job. But you should certainly consider the 'mysql --xml' possibility before choosing.
Friday, July 13, 2007
XML output from MySql
Many times you need to produce XML output from your MySQL database. MySQL has no built-in support for creating XML in its SQL implementation, but you can easily add support using the stored function feature.
To make XML output easier I have made three small stored SQL functions. They produce XML fragments, which you can easily put together to get a complete XML document:
xml_escape(value) replaces characters not allowed in XML with their escape sequences
xml_attr(name, value) creates an XML attribute
xml_tag(tagname, tagvalue, attrs, subtags) creates a tag, optionally with a text value, attributes and subtags
Lets see some examples. First a real simple one:
select xml_tag('table',table_name,null,null)
from information_schema.tables
where table_schema = 'INFORMATION_SCHEMA'
Gives you something like:
<table>CHARACTER_SETS</table>
<table>COLLATIONS</table>
<table>COLLATION_CHARACTER_SET_APPLICABILITY</table>
etc...
To make this a valid xml, you need to add an xml header and a root tag, surrounding all the content. Something like this:
<?xml version="1.0" encoding="UTF-8"?>
<list>
your content...
</list>
That will give you a really simple XML document, with a list of the tables in the information schema. But perhaps your specification says you should present the data as attributes and not as text nodes. Let's try that:
select xml_tag('table',null,xml_attr('name',table_name),null)
from information_schema.tables
where table_schema = 'INFORMATION_SCHEMA'
Gives you:
<table name="CHARACTER_SETS" />
<table name="COLLATIONS" />
<table name="COLLATION_CHARACTER_SET_APPLICABILITY" />
etc..
Since we don't want a text node, we let the tagvalue parameter be null. Instead we use the xml_attr function to format an XML attribute and pass it in the attrs parameter.
But what if we need several attributes? You cannot write a stored function with a variable number of arguments, so we use the MySQL standard function concat() for this.
select xml_tag('table',null, concat(xml_attr('name',table_name),
xml_attr('engine',engine)),null)
from information_schema.tables
where table_schema = 'INFORMATION_SCHEMA'
gives you:
<table name="CHARACTER_SETS" engine="MEMORY" />
<table name="COLLATIONS" engine="MEMORY" />
<table name="COLLATION_CHARACTER_SET_APPLICABILITY"
engine="MEMORY" />
<table name="COLUMNS" engine="MyISAM" />
And if you want your data as text nodes in sub tags, you do the same trick with the last parameter of xml_tag, the subtags parameter:
select xml_tag('table',null,null,
concat(xml_tag('name',table_name,null,null),
xml_tag('engine',engine,null,null)))
from information_schema.tables
where table_schema = 'INFORMATION_SCHEMA'
Results in:
<table><name>CHARACTER_SETS</name><engine>MEMORY</engine></table>
<table><name>COLLATIONS</name><engine>MEMORY</engine></table>
<table><name>COLLATION_CHARACTER_SET_APPLICABILITY</name>
<engine>MEMORY</engine></table>
etc..
Of course you can combine attributes, subtags and a text node, but I leave that for you to experiment with.
The stored functions are defined like this:
DELIMITER $$
DROP FUNCTION IF EXISTS `xml_escape` $$
CREATE FUNCTION `xml_escape`( tagvalue VARCHAR(2000))
RETURNS varchar(2000)
BEGIN
IF (tagvalue IS NULL) THEN
RETURN null;
END IF;
RETURN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(
tagvalue,'&','&amp;'),
'<','&lt;'),
'>','&gt;'),
'"','&quot;'),
'\'','&apos;');
END $$
DROP FUNCTION IF EXISTS `xml_attr` $$
CREATE FUNCTION `xml_attr`(tagname VARCHAR(2000),
tagvalue VARCHAR(2000))
RETURNS varchar(2000)
BEGIN
IF (tagvalue IS NULL) THEN
RETURN null;
END IF;
RETURN CONCAT(' ', tagname ,'="',xml_escape(tagvalue),'" ');
END $$
DROP FUNCTION IF EXISTS `xml_tag` $$
CREATE FUNCTION `xml_tag`(tagname VARCHAR(2000),
tagvalue VARCHAR(2000),
attrs VARCHAR(2000),
subtags VARCHAR(2000))
RETURNS varchar(2000)
BEGIN
DECLARE result VARCHAR(2000);
SET result = CONCAT('<' , tagname);
IF attrs IS NOT NULL THEN
SET result = CONCAT(result,' ', attrs);
END IF;
IF (tagvalue IS NULL AND subtags IS NULL) THEN
RETURN CONCAT(result,' />');
END IF;
RETURN CONCAT(result ,'>',ifnull(xml_escape(tagvalue),''),
ifnull(subtags,''),'</',tagname, '>');
END $$
DELIMITER ;
Friday, July 06, 2007
Wishes & Gifts
For a while I have been working on a new Google Gadget. The idea is to keep lists of your wishes and gifts for Christmas or other occasions. I sometimes have problems coming up with new gift ideas, both for myself and for others, so this should be a help. This is what the gifts tab looks like:
The wishes tab is simpler, with less functionality, no search etc.
The gadget was a bit more complicated than I thought, so it took some time, but it was finally ready yesterday. Here is the link to add it to your Google homepage:
Add to Google homepage
Thursday, June 28, 2007
Top 5 best MySql practices
My opinion on MySQL best practices, the new 'chain letter' started by Ronald Bradford. As always, this is a developer's point of view.
0. Make a database design. Databases generated automatically might work very well in development, but when you have large data volumes and lots of concurrent users they do not.
1. Be consistent in your choice of datatypes. I prefer to use just a subset in a system. Try to stick with SMALLINT UNSIGNED or INTEGER UNSIGNED for keys. Always use the same datatype for the same term in all tables, no joins with datatype conversion! Do not store numbers as CHAR/VARCHAR fields.
2. Make sure there are indexes supporting all searches that are part of routine system usage. If you have tables that are very small it might be better not to index them, but this should be an active decision.
3. Verify your database design with real volumes and tests covering the planned usage. A modern computer can quickly scan (without indexes) more data than you will enter by hand, so you will probably need a dump from a production system or generated data.
4. Use SQL_MODE ANSI and TRADITIONAL for new systems (see the example after this list). For old ones, you might have to allow a less strict mode.
5. Choose your storage engine with care. Use InnoDB (and keep an eye on Falcon!!) if you need transactions, but even if you do, MyISAM might be a good choice for some tables.
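A minimal sketch of the mode setting in point 4 (the mode names are standard; putting them in the option file keeps them across restarts):
SET GLOBAL sql_mode = 'ANSI,TRADITIONAL';
or in the option file:
[mysqld]
sql_mode = "ANSI,TRADITIONAL"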
Wednesday, June 27, 2007
MySql performance in a .NET environment
The last two weeks I have been working on a .NET application using MySQL (3.21) for data storage. The application is basically a stand-alone PC application where MySQL is used to store a lot of data from test equipment.
This is a bit different from what I spend most of my time on. No web site with lots of concurrent sessions, but instead many installations of the software on local PCs. This also means we cannot tune parameters on a live system; we have to set them before delivering to customers. Basically this is the main problem, since we don't know how the system will be run.
Well, even if the scenario is different, the most important action was the same. Changing the field order in one of the indexes to better match the queries was our number one action, and its impact was larger than that of all other actions together. If your indexes are not right, you should fix them before anything else.
An upgrade to MySQL 5.41 and .NET connector 5.0.7 was another obvious task. I am a bit surprised that it was so easy. The only real problem we had was with merge tables. They initially did not work, but after we ran an ALTER TABLE on all the underlying tables even the merge tables were OK.
The performance improvement after upgrading was not large, but when we switched shared memory on there was a noticeable improvement. The application transfers a lot of data between MySQL and the application, mainly to produce graphs, so this should be important. We also switched both the key cache and the query cache on, but since we do not know how the application will be used, it is hard to determine the effect of this.
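For reference, a sketch of the server-side settings involved (the option names are real MySQL options; the sizes are just examples, not what we shipped, and the client must also be told to use the shared-memory protocol in its connection string):
[mysqld]
shared-memory          # enable the shared-memory transport (Windows only)
key_buffer_size = 64M  # key cache for MyISAM indexes
query_cache_type = 1   # turn the query cache on
query_cache_size = 32M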
We had some problems with connections not being closed initially, but turning off connection pooling fixed this. Turning it on from the start was simply a reflex of an old web programmer, I'm afraid; connection pooling is not relevant in this scenario.
All in all a quite successful project. Performance is OK now, and the bottleneck is not the database, so we will look at some other changes, mainly in the graph package.
Thursday, June 21, 2007
MySql merge table problems
While upgrading from MySQL 3 to MySQL 5 our only real problem was with merge tables. We have two in the system, and neither worked. Creating the merge table did not give an error, but when we tried to select from it, we got error 1168:
Unable to open underlying table which is differently defined or of non-MyISAM type or doesn't exist
This kind of error message gives me the creeps. Why doesn't it say which one of the cases it is? Surely these are three different cases and it should be easy to separate them. Is it difficult to add error messages to MySQL? Or a lazy programmer??
OK, we try to look at the table definition in MySQL's own tool, the Query Browser. But if you try to do 'Edit Table' in the Query Browser, you get the following error message:
A MySQL error message was encountered. The error message is:
Cannot fetch table information
The following error occurred: Unable to open underlying table which is differently
defined or of non-MyISAM type or doesn't exist (1168)
OK, we knew that, but we wanted to take a look at the definition to check that it matches the underlying tables.
OK, time to check for bugs. We finally found one, saying that of our three possible causes, it is actually a fourth: the table does exist, is of MyISAM type, and the definitions match. But it is of an old MyISAM version... The fix is to run an ALTER TABLE on all the underlying tables; this updates the version too, and everything is OK.
For the future I think we will look into other solutions. Merge tables are very restrictive, and for our purposes a view with UNION ALL would work just as well, since we already insert into the underlying tables and have no updates. A quick test also says that the performance is about the same. Or, possibly, we will wait for MySQL 5.1 with partitions.
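A minimal sketch of that alternative, with hypothetical table names; the view simply concatenates the underlying MyISAM tables:
CREATE VIEW all_measurements AS
    SELECT * FROM measurements_2006
    UNION ALL
    SELECT * FROM measurements_2007;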
Tuesday, June 05, 2007
MySQL performance enhancements
I have been working on a new assignment this week, with the goal of improving performance in a .NET application using MySQL (3.21) for data storage. The application handles a lot of data in what is basically a single-user system.
The main reason for the performance problems was heavy searches that the indexes did not match. Simply switching the order of the columns in the composite key fixed most of it.
In the old version they copied data to a temporary table to work around the slow queries. Once we had the indexes fixed, we could drop this solution and use the base table in all cases. This also speeded things up a lot.
The change to MySQL 5.0 made a few other improvements possible:
- a switch to shared memory for communication between the database and the application
- buffering of indexes
- query caching
all of which meant further improved performance.
All in all quite successful so far. Performance is OK now, and the bottleneck is not the database, so we will look at some other changes.
It has been great fun; .NET is a new environment for me, but it was quite easy to get into.
Wednesday, May 30, 2007
More gadgeteering....
I am a great music fan, and from time to time I can't stop myself from singing a song or two, mostly rock classics. Although I know a lot of lyrics, I often can't recall them all.
Now they are all out there on the net, but sometimes difficult to find. So I made a small Google custom search engine, and a gadget to go with it, like this:
It is surprising what you can do with so little; the power of the web is amazing!
Wednesday, May 16, 2007
Pricerunner Google Gadget
I made another Google gadget the other day, this time one that searches Pricerunner to compare prices and products. I believe Pricerunner is a very good service, and it is widely used in Sweden.
The gadget looks like this:
Today it supports Pricerunner in the UK, Germany, Sweden and Denmark. Hopefully I will be able to add US and France soon.
You could also try it here.
Add Pricerunner to your Google homepage.
Thursday, May 10, 2007
Turning the result around
In many cases you don't want the result from your SQL statement as just a plain listing, but organized in columns, or with data grouped by status, period or whatever. This can often be accomplished with a combination of the IF function and SUM or another group function.
For example, to get your sales per customer grouped by period:
SELECT customer_no AS CUSTNO, cust_name AS CUSTNAME,
sum(if(period = '200701',amount,0)) as Jan_Amount,
sum(if(period = '200702',amount,0)) as Feb_Amount
FROM otd.salestrans_hist s
WHERE period IN ('200701','200702')
GROUP BY customer_no, cust_name
Or number of open and closed records in each class:
SELECT class, sum(if(status='Open',1,0)) as open,
sum(if(status='Closed',1,0)) as closed
FROM table_name
GROUP BY class
In the examples I have used the MySQL IF function. You could also use CASE, which is standard SQL, as in the sketch below.
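For example, the second query rewritten with CASE, so it also ports to other databases:
SELECT class, sum(CASE WHEN status='Open' THEN 1 ELSE 0 END) as open,
sum(CASE WHEN status='Closed' THEN 1 ELSE 0 END) as closed
FROM table_name
GROUP BY class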
Wednesday, April 25, 2007
Indexing, indexing and indexing....
Saw this table definition in a system I was working on:
CREATE TABLE session_role (
Session_sessionId bigint(20) NOT NULL,
activeRoles_roleId varchar(255) NOT NULL,
PRIMARY KEY(Session_sessionId,activeRoles_roleId),
UNIQUE KEY activeAccounts_actorId (activeRoles_roleId),
KEY `FK38CC06CCF5B03D50` (Session_sessionId),
KEY `FK38CC06CC4085AE4` (activeRoles_roleId),
CONSTRAINT `FK38CC06CC4085AE4` FOREIGN KEY (activeRoles_roleId)
REFERENCES role(roleId),
CONSTRAINT `FK38CC06CCF5B03D50` FOREIGN KEY (Session_sessionId)
REFERENCES session (sessionId)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
That is four indexes for a two-column table! Of course nobody actually wrote this; it was generated by JBoss/Hibernate, with a MySQL database used for storing data. Be very careful with code generators....
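For comparison, a sketch of what the same table could look like with the redundant indexes dropped, assuming the unique constraint on activeRoles_roleId really is intended. The primary key already covers lookups on Session_sessionId, and the unique key can back the other foreign key, so two of the four indexes can go:
CREATE TABLE session_role (
Session_sessionId bigint(20) NOT NULL,
activeRoles_roleId varchar(255) NOT NULL,
PRIMARY KEY (Session_sessionId, activeRoles_roleId),
UNIQUE KEY activeAccounts_actorId (activeRoles_roleId),
CONSTRAINT FOREIGN KEY (activeRoles_roleId) REFERENCES role (roleId),
CONSTRAINT FOREIGN KEY (Session_sessionId) REFERENCES session (sessionId)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;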
Wednesday, April 11, 2007
Google gadget
I have spent some time on a Google gadget, Shopping List, and now it is released.
Helps you remember what to buy. It can also be printed and sent via mail, Gmail or other services.
Click the link below to add to your Google homepage.
It is amazing what you can do with a little javascript and html. The small format also makes it necessary to think about your design.
Add "Shopping List" to your Google homepage
Thursday, March 22, 2007
More EJB questions
Our EJB project is moving forward... My scepticism is growing... Some questionable points:
Session beans are where most, if not all, of our business logic goes. But session beans are not pure Java classes. They are based on some architectural ideas, which are not obviously good:
- Communication with RMI and serialized Java objects. This is probably a bad idea in most cases, a solution that is not only Java specific but also has problems with different Java versions. Basically it assumes that both sides have the same Java classes.
- Putting your logic in separate threads. A good idea if your session beans serve a large number of Java clients, but probably a bad idea if your clients are servlets or message-driven beans. In that case it will give you more thread-handling overhead and little gain.
Entity beans have disappeared; instead we have got the new EntityManager. This of course means much faster development, but at what cost? In my opinion the Data Access Object pattern is a good one, since it means that you have all your database access for one object in one place. With the EntityManager you can easily end up with SQL statements scattered all over your application. Where did the object orientation go?
Another point is constructions like this:
customer.getOrders().add(new Order(.....))
this would mean that to add a new order, you first fetch all existing orders for a customer before adding a new one. A framework that encourages this kind of solution is not a good one.
Thursday, March 08, 2007
Started with EJB3
I have just started on a new project where we will use EJB 3.0. I am very sceptical of the EJB concept, because some of the worst systems I have seen use EJB: complicated to write, very hard to maintain, and with very bad performance, mostly because they generate a lot of database traffic.
Version 3.0 is a big improvement, that's for sure. But is it better than the old POJO way, with Data Access Objects and SQL/JDBC? That remains to be seen. Development is fast so far, but will performance be OK? If everything goes well, we will roll out the system in thousands of installations. We cannot do performance tuning once the systems are up and running, which in my experience is necessary in EJB systems. The performance has to be OK from day one.