April 12, 2008

Solution to most common db problems ?

Today I just happen to read May 2008 edition of Dr.Dobb’s Journal printed edition and the article “Kernel-mode Databases” written by Andrel and Alexander from McObject really excited me.

The basic principle of any database engine is its performance and scalability and success of any database in most part relays on these two concepts. To achieve this, operating system kernel plays a major role in the form of …

  • resource allocation
  • thread scheduling
  • low-level hardware access
  • network
  • security

So, looks like eXtremeDB try to overcome from this by releasing kernel-mode database. The kernel mode package actually overrides most of the kernel access calls that database engine expects by replacing with optimized direct calls and exposing them to the top layer. And the run-time engine of eXtremeDB actually linked with the kernel mode to avoid any overhead from the RPC.

Interesting quotes:

“Kernel-mode threads have direct access to kernel-mode databases; concurrent access is coordinated by the database run-time”

“Applications performing computer security functions performed an order of magnitude faster”

It will be really nice if McObject can release benchmark results between regular kernel threads and the optimized one, along with conducting some more tests to prove the benefit from their kernel mode. Even though they found a solution for part of the common problems (kernel overhead, security, thread scheduling) but still there is a overhead from regular IO and exclusive lock time when doing the updates. Looks like some of their locks are using “Compare-and-swap“; but could not find much reference to where they were actually used within the code base.

April 11, 2008

SQL Server Data Services - Invitation to attend the event

After Microsoft released its beta version of SQL Server Data Services; I was more interested to know what kind of considerations were taken care in the design and architecture of SSDS in order to get performance, distribution of data and scalability as it is developed on top of SQL Server.

To my surprise, I noticed Microsoft is conducting the live event to answer some of these questions by following up Ryan Dunn blog; Today I also got the confirmation to attend the event and I look forward to get some of my general questions answered.

It will be interesting to see the pricing model of SSDS in the coming days, and might even attract if they make it FREE for initial customers and/or by setting the limit on the data usage by distributing it as part of OS. It will be really cool if they could also integrate some kind of access mechanism as part of the windows operating system itself to access/store the data.

I hope Microsoft will also change the name of SSDS to something else before the final release as Data Services is not catchy and does not sound right (check yet another post on the similar lines).

February 26, 2008

Query Cache - Prepared Statements

For years, MySQL never supported query cache with prepared statements. When I initially implemented MySQL prepared statements and its associated C API in MySQL 4.1 version; prepared statements were just saving the parsing cost of the query and nothing much. But now, the things are totally different.

Lot of developers stayed away from prepared statements due to this limitation (atleast for select statements). But now the latest MySQL 5.1 server does support the caching of prepared statements with some limitations.

I just wrote a simple C API test which uses mysql_prepare, mysql_execute and mysql_real_query and all seemed to work as expected and the cache works even with intermixed (prepare and non-prepare) on the same binary protocol. Here is a simple example how the QCache works with MySQL command line client when queries were intermixed with prepared and non-prepare execution on 5.1.23 server.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
mysql> show status like '%cache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 0        |
| Qcache_inserts          | 0        |
| Qcache_not_cached       | 1        |
| Qcache_queries_in_cache | 0        |
+-------------------------+----------+
11 rows in set (0.01 sec)
 
mysql> create table t1(c1 int);
Query OK, 0 rows affected (0.00 sec)
 
mysql> insert into t1 values(10),(20),(30);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0
 
mysql> select * from t1;
+------+
| c1   |
+------+
|   10 |
|   20 |
|   30 |
+------+
3 rows in set (0.00 sec)
 
mysql> prepare stmt1 from 'select * from t1 where c1=?';
Query OK, 0 rows affected (0.00 sec)
Statement prepared
 
mysql> set @a=10;
Query OK, 0 rows affected (0.00 sec)
 
mysql> execute stmt1 using @a;
+------+
| c1   |
+------+
|   10 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 0        |
| Qcache_inserts          | 1        |
| Qcache_not_cached       | 1        |
| Qcache_queries_in_cache | 1        |
+-------------------------+----------+
8 rows in set (0.00 sec)
 
mysql> select * from t1 where c1=10;
+------+
| c1   |
+------+
|   10 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 1        |
| Qcache_inserts          | 1        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 1        |
+-------------------------+----------+
8 rows in set (0.00 sec)
 
mysql> select * from t1 where c1=20;
+------+
| c1   |
+------+
|   20 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 1        |
| Qcache_inserts          | 2        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 2        |
+-------------------------+----------+
8 rows in set (0.01 sec)
 
mysql> set @a=20;
Query OK, 0 rows affected (0.02 sec)
 
mysql> execute stmt1 using @a;
+------+
| c1   |
+------+
|   20 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 2        |
| Qcache_inserts          | 2        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 2        |
+-------------------------+----------+
8 rows in set (0.00 sec)
 
mysql> select * from t1 where c1=20;
+------+
| c1   |
+------+
|   20 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 3        |
| Qcache_inserts          | 2        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 2        |
+-------------------------+----------+
8 rows in set (0.00 sec)
 
mysql> select * from t1 where c1=30;
+------+
| c1   |
+------+
|   30 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 3        |
| Qcache_inserts          | 3        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 3        |
+-------------------------+----------+
8 rows in set (0.00 sec)
 
mysql> set @a=30;
Query OK, 0 rows affected (0.00 sec)
 
mysql> execute stmt1 using @a;
+------+
| c1   |
+------+
|   30 |
+------+
1 row in set (0.00 sec)
 
mysql> show status like '%Qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_hits             | 4        |
| Qcache_inserts          | 3        |
| Qcache_not_cached       | 2        |
| Qcache_queries_in_cache | 3        |
+-------------------------+----------+
8 rows in set (0.00 sec)

February 8, 2008

MySQL MyISAM and Merge Tables Incompatibility - II

Few months back we noticed the incompatibility change related to MyISAM that was introduced in 4.1.23 and recent 5.0 and 5.1 MySQL versions which breaks when one wants to upgrade to 4.1.23 or latest 5.0 or 5.1 when the MyISAM or Merga table is created with 4.0 or earlier 4.1 version.

I reported a bug to MySQL team way back; and looks like they now pushed a patch to 5.1.

This happens even with Merge tables and looks like merge tables are overlooked in that bug report and I am not 100% sure if the patch fixes the merge table issue. This bug is triggered only when the MyISAM table is created with a key column and Merge is associated with that table.

For example, consider the following MyISAM and Merge tables in 4.1:

1
2
3
4
5
6
7
8
9
10
mysql> CREATE TABLE IF NOT EXISTS base_t1 (
    -> id int NOT NULL PRIMARY KEY
    -> ) TYPE = MyISAM;
Query OK, 0 rows affected (0.00 sec)
 
mysql>
mysql> CREATE TABLE IF NOT EXISTS merge_t1 (
    -> id int NOT NULL PRIMARY KEY
    -> ) TYPE = MRG_MyISAM INSERT_METHOD=LAST UNION=(base_t1);
Query OK, 0 rows affected (0.01 sec)

and when it comes to 5.0 or 5.1; the merge table completely fails to open the table files.

1
2
3
4
5
6
7
8
9
10
mysql> desc base_t1;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id    | int(11) |      | PRI | 0       |       |
+-------+---------+------+-----+---------+-------+
1 row in set (0.00 sec)
 
mysql> desc merge_t1;
ERROR 1168 (HY000): Unable to open underlying table which is differently defined or of non-MyISAM type or doesn't exist

So, the only option is to drop and re-create the tables back in the latest MySQL version. In case if the MyISAM table had blob/text fields; then you need to drop and re-create both MyISAM and Merge tables. Even though you can re-create the merge table, the table creation works; but again it fails when you try to do any operation on the table, so one needs to fix MyISAM as well.

1
2
3
4
5
6
7
8
mysql> create table mt4(id int) TYPE=MRG_MYISAM INSERT_METHOD=LAST UNION=(t4);
Query OK, 0 rows affected, 1 warning (0.00 sec)
 
mysql> select * from mt3;
Empty set (0.00 sec)
 
mysql> select * from mt4;
ERROR 1168 (HY000): Unable to open underlying table which is differently defined or of non-MyISAM type

I mad a patch in the internal Yahoo! builds to overcome from this issue and hopefully MySQL will address both the issues as lot of people might start migrating from the existing old tables (even though it is always a good idea to re-create the tables and populate the data when there is major version change)

February 4, 2008

MySQL Enterprise Monitor

Today I attended a webex demo of MySQL Enterprise Monitor from MySQL. As most of the Yahoo’s are interested in learning about this tool, so arranged a web-ex demo from MySQL. MySQL is kind enough to host this event.

I always thought a dedicated monitoring and alerting system is completely missing from MySQL product line for all these days, and I can see that this tool is heading in the right direction to capture the market. Currently it monitors all server variables, errors and identifies any critical conditions upfront to avoid a disaster. But the new features that are in the pipeline for the coming months seemed to be promising (like upgrade adviser, Load balancing, query analyzer and connection manager).

As this is not like a single node stand-alone architecture, instead it works on client-server architecture where the agent runs in the same box as that of MySQL server and collects and reports to centralized service repository (dashboard) where you have all the information.

Time for me to get the evaluation copy and start looking into how well we can adopt this.

December 13, 2007

MySQL 5.1 - LOAD DATA INFILE

Today to debug one of the issue related to InnoDB where the INSERT INTO .. SELECT simply bails with an error as “ERROR 1105 (HY000): Unknown error”; I simply started loading the data onto 5.1 server as it was consistent in 4.1 version (even though it was noted as fixed). So, LOAD DATA INFILE suddenly pops with an error as ..

1
2
3
4
5
6
7
8
9
mysql> LOAD DATA LOCAL INFILE 'venudata-all.txt' INTO TABLE DOCUMENT_DESC FIELDS TERMINATED BY ',';
ERROR 1148 (42000): The used command is not allowed with this MySQL version
mysql> show variables like '%infile%';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| local_infile  | ON    |
+---------------+-------+
1 row in set (0.00 sec)

As you can see I have the local_infile enabled in server. And further reading enabled me that I need to start my mysql command line utility also with –local-infile option… well…looks like time for me to read more documentation and feature changes.

December 5, 2007

MySQL 5.1 - query optimizer regression

Since few days I was debugging a case from Yahoo! Japan team (thanks to Akira Kitada) where the query runs fine in about 0 secs with the 5.1.16 build where as the same query takes almost close to 2 minutes with the latest 5.1.22 release.

One can simply identify that this is a optimizer issue as it is not picking the right index. Lets consider the following two tables urls and urls_categories

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
| urls  | CREATE TABLE `urls` (
  `url_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `url` varbinary(2000) NOT NULL,
  `url_status` enum('EXISTS','DEAD','N/A') NOT NULL,
  `yservice_id` smallint(5) unsigned DEFAULT NULL,
  `created_by` smallint(5) unsigned NOT NULL,
  `modified_by` smallint(5) unsigned NOT NULL,
  `created_on` datetime NOT NULL,
  `modified_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `nocheck` tinyint(1) NOT NULL DEFAULT '0',
  `url_md5` char(32) NOT NULL,
  PRIMARY KEY (`url_id`),
  UNIQUE KEY `url_unique` (`url_md5`),
  KEY `modified_on` (`modified_on`),
  KEY `url` (`url`(50)),
  KEY `yservice_id` (`yservice_id`),
  KEY `modified_by` (`modified_by`),
  KEY `created_by` (`created_by`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
 
 
| urls_categories | CREATE TABLE `urls_categories` (
  `url_id` int(10) unsigned NOT NULL,
  `category_id` tinyint(3) unsigned NOT NULL,
  `workflow` enum('NEW','ACTIVATED','DEACTIVATED','REMOVED') NOT NULL,
  `searchtype_id` tinyint(3) unsigned NOT NULL,
  `work_id` tinyint(3) unsigned NOT NULL,
  `created_by` smallint(5) unsigned NOT NULL,
  `modified_by` smallint(5) unsigned NOT NULL,
  `created_on` datetime NOT NULL,
  `modified_on` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `comment` varchar(200) NOT NULL,
  PRIMARY KEY (`url_id`,`category_id`),
  KEY `search_index` (`modified_on`,`workflow`,`category_id`),
  KEY `rd_index` (`url_id`,`category_id`,`workflow`),
  KEY `category_id` (`category_id`),
  KEY `work_id` (`work_id`),
  KEY `modified_by` (`modified_by`),
  KEY `created_by` (`created_by`),
  KEY `searchtype_id` (`searchtype_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |

And here is the explain output for a simple inner join query when ran with MySQL version 5.1.16:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
mysql> explain select SQL_NO_CACHE * from urls u inner join urls_categories uc  using (url_id) order by uc.modified_on limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: uc
         type: index
possible_keys: PRIMARY,rd_index
          key: search_index
      key_len: 6
          ref: NULL
         rows: 3121058
        Extra:
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: u
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.uc.url_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)

and here is what you get when ran with MySQL 5.1.22:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
mysql> explain select SQL_NO_CACHE * from urls u inner join urls_categories uc  using (url_id) order by uc.modified_on limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: uc
         type: ALL
possible_keys: PRIMARY,rd_index
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3121045
        Extra: Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: u
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.uc.url_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)

as you can cleary see, for urls_categories table, the key search_index is missing and optimizer is not picking that. Even if I force to use the key, still no effect:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
mysql> explain select SQL_NO_CACHE * from urls u inner join urls_categories uc force key(search_index,PRIMARY,rd_index)  using (url_id) order by uc.modified_on limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: uc
         type: ALL
possible_keys: PRIMARY,rd_index
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3121045
        Extra: Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: u
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.uc.url_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)
 
mysql> explain select SQL_NO_CACHE * from urls u inner join urls_categories uc force key(search_index)  using (url_id) order by uc.modified_on limit 10\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: uc
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 3121045
        Extra: Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: u
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: test.uc.url_id
         rows: 1
        Extra:
2 rows in set (0.00 sec)

As a result of the above bad index use, the query takes about 1m 20 secs with 5.1.22 where as the same one runs in about 0.0 secs as shown below..

with 5.1.22:

1
2
3
4
mysql> select SQL_NO_CACHE * from urls u inner join urls_categories uc  using (url_id) order by uc.modified_on limit 10;
...
...
10 rows in set (1 min 14.55 sec)

with 5.1.16:

1
2
3
4
mysql> select SQL_NO_CACHE * from urls u inner join urls_categories uc  using (url_id) order by uc.modified_on limit 10;
...
...
10 rows in set (0.00 sec)

Here is the bug report to mysql team: MySQL 5.1 optimizer regression.