November 20, 2009

ON DUPLICATE KEY With NULL Validation

I am not sure if this is a bug or how MySQL works on validating constraints in association with ON DUPLICATE KEY (late or early checking). For example, consider the following use case (this is irrepective of storage engine and MySQL version):

mysql> create table t1(id int not null primary key, val int not null) Engine=MyISAM;
Query OK, 0 rows affected (0.07 sec)
 
mysql> insert into t1 values(10,20);
Query OK, 1 row affected (0.01 sec)
 
mysql> insert into t1 values(20,10);
Query OK, 1 row affected (0.00 sec)
 
mysql> create table t2(id1 int not null primary key, val1 int) Engine=MyISAM;
Query OK, 0 rows affected (0.14 sec)
 
mysql> insert into t2 values(10,NULL);
Query OK, 1 row affected (0.00 sec)
 
mysql> insert into t1(id, val) select id1, val1 from t2 ON DUPLICATE KEY UPDATE val=IF (VALUES(val) IS NULL, val, VALUES(val));
Query OK, 2 rows affected, 1 warning (0.00 sec)
Records: 1  Duplicates: 1  Warnings: 1
 
mysql> show warnings;
+---------+------+-----------------------------+
| Level   | Code | Message                     |
+---------+------+-----------------------------+
| Warning | 1048 | Column 'val' cannot be null |
+---------+------+-----------------------------+
1 row in set (0.00 sec)
 
mysql> select * from t1;
+----+------+
| id | val  |
+----+------+
| 10 |    0 |
| 20 |   10 |
+----+------+
2 rows in set (0.00 sec)

In the above case, you have table t1 with ‘val’ column being NOT NULL; and t2 is trying to insert to t1 with NULL for val column as part of the INSERT, and ON DUPLICATE KEY; it actually keeps the old value and will not populate the NULL to t1 at all (common use case when you have a business logic to replace old with new or keep the old as is when new value is not appropriate or not modified, which in general is represented by NULL).

So; but as you can see; the validation of NULL happens before the ON DUPLICATE constraint checking; and in the above case, it should not as already primary key satisfies the duplicate constraint (before trigger of fill_records) and server should proceed with changes from what was specified from the ON DUPLICATE section (after trigger).

That is apart; and still you can see that it got a new value 0 from no where by replacing the old one 20 for column val, which is odd. The correct behavior is to just check only primary/unique constraint values when ON DUPLICATE KEY is specified in early binding and validate the rest in late binding by taking the final output from ON DUPLICATE clause… then the above should have been…

mysql> insert into t1(id, val) select id1, val1 from t2 ON DUPLICATE KEY UPDATE conv=IF (VALUES(val) IS NULL, val, VALUES(val));
Query OK, 2 rows affected (0.00 sec)
Records: 1  Duplicates: 1  Warnings: 0
 
mysql> select * from t1;
+----+------+
| id | val  |
+----+------+
| 10 |   20 |
| 20 |   10 |
+----+------+
2 rows in set (0.00 sec)

As no other server supports ON DUPLICATE KEY, its hard to derive what is the right logic that needs to be applied here. But I still think early validation is wrong here as the value can change in the late binding due to other constraints that can happen from ON DUPLICATE KEY derivation.

The above is only one use case with NULL; this can be simulated to other constraints as well…

November 17, 2009

InnoDB Tablespace Corruption

Over the weekend, I experienced a strange issue (even though its not new) with the InnoDB tablespace (ibdata) corruption. When in general InnoDB crashes, it automatically recovers during the next start by rolling back/forward based on what was pending and un-flushed/un-committed changes at the time of crash.

But for some reason, one of the server; we ran out of disk space (yeah, no alerts) on data directory; where we store everything (tablespace, logs and data); and server was running for few hours in this mode (disk full); and it became un-available and not responding after a while. Only option left was to kill the server process and its PID along with cleaning the stuff to get the space back. After I (re)started the server; server failed to start with the following error..

InnoDB: Error: trying to access page number 1098759810 in space 0,
InnoDB: space name /data/ibdata1,
InnoDB: which is outside the tablespace bounds.
InnoDB: Byte offset 0, len 16384, i/o type 10.

Means the tablespace is corrupted.. By enabling the monitor; I noticed the following..

InnoDB: Error: data dictionary entry for table DBXX/tableXX is corrupt!
InnoDB: Index field 0 is delete marked.
091117 20:32:47InnoDB: Assertion failure in thread 1148791104 in file dict0load.c line 503
InnoDB: Failing assertion: ut_memcmp(buf, field, len) == 0

Looks like a modification to primary key on that table which was never persisted…as it failed to write anything to disk just before the crash; wondering why the engine did not trap it earlier as it will not mark as persistent unless a write to log returns success (even if the modified entries are in memory, the change is written in log as everything else is rolled back)

Anyway, I recovered using innodb_force_recovery=4 and dumped and re-loaded it as it failed sub-sequent restarts to start again even after the first recovery (well, should not start as dictionary is wrong)…

Time for me to simulate this scenario on Drizzle and Maria for fun (Just got the source of Maria couple of days back, and should start contributing the code) to see how it works out in the coming days.