November 29, 2007

MySQL - Simple ways to avoid filesort

It is bit panic to see ‘Using filesort‘ from the extra field when one runs a explain of select query on MySQL server. At times it is bit annoy why MySQL optimizer does not avoid this as you can see from the following cases…

This is the information (Extra field) that scares a lot for many users from Explain output:

1
2
3
4
5
6
7
8
9
10
          id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index; Using temporary; Using filesort

Lets consider the following simple table (MyISAM in this case, does not matter what storage engine it is) populated with distinct random 1000000 (1M) rows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
Create Table: CREATE TABLE `testtab` (
  `col1` int(11) DEFAULT NULL,
  `col2` text,
  `col3` int(11) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`col3`),
  KEY `col1key` (`col1`)
) ENGINE=MyISAM
 
mysql> select count(*) from testtab;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.00 sec)
 
mysql> show table status like 'testtab'\G
*************************** 1. row ***************************
           Name: testtab
         Engine: MyISAM
        Version: 10
     Row_format: Dynamic
           Rows: 1000000
 Avg_row_length: 88
    Data_length: 88000000
Max_data_length: 281474976710655
   Index_length: 26619904
      Data_free: 0
 Auto_increment: 1000001
    Create_time: 2007-11-29 20:32:19
    Update_time: 2007-11-29 20:35:03
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

And lets run a explain for a simple select query from the above table:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
mysql> explain select * from testtab where col1 > 50000 AND col1 < 150000 order by col3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: NULL
         rows: 91609
        Extra: Using where; Using filesort
1 row in set (0.00 sec)
 
mysql> EXPLAIN SELECT * FROM testtab WHERE col1 > 50000  AND col3 < 20000  ORDER BY col3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: PRIMARY,col1key
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 21209
        Extra: Using where
1 row in set (0.00 sec)
 
mysql> EXPLAIN SELECT * FROM testtab WHERE col3 < 20000 AND col1 > 50000  ORDER BY col3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: PRIMARY,col1key
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 21209
        Extra: Using where
1 row in set (0.00 sec)

The only difference from the first query (which has filesort) to that of other two are having the col3 in the where clause which appears in the ORDER BY clause…. and another example…

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
mysql> EXPLAIN SELECT * FROM testtab WHERE col3 < 20000 order by col3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 21209
        Extra: Using where
1 row in set (0.00 sec)
 
mysql> EXPLAIN SELECT * FROM testtab WHERE col3 < 20000 order by col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 21209
        Extra: Using where; Using filesort
1 row in set (0.00 sec)
 
mysql> EXPLAIN SELECT * FROM testtab WHERE col1 < 20000 order by col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: NULL
         rows: 19892
        Extra: Using where
1 row in set (0.00 sec)
 
mysql> EXPLAIN SELECT * FROM testtab WHERE col1 < 20000 order by col3\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: testtab
         type: range
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: NULL
         rows: 19892
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

The above indicates that any column which appears in the ORDER BY clause should also appear in WHERE clause in order to avoid filesort. Consider yet another query for the same table using a self join this time…

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
mysql> explain select t1.col1, t2.col2 from testtab t1 left outer join testtab t2 on (t1.col1=t2.col1) order by t2.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t1.col1
         rows: 11
        Extra:
2 rows in set (0.00 sec)
 
mysql> explain select t1.col1, t2.col2 from testtab t1 left outer join testtab t2 on (t1.col1=t2.col1) order by t1.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t1.col1
         rows: 11
        Extra:
2 rows in set (0.00 sec)

The only difference from the above case is “Order by column is from the left side of the join clause”… and here is yet another example..

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
mysql> explain select t1.col1, t2.col2 from testtab t1 left outer join testtab t2 on (t1.col1=t2.col1) order by t2.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index; Using temporary; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t1.col1
         rows: 11
        Extra:
2 rows in set (0.00 sec)
 
mysql> explain select t1.col1, t2.col2 from testtab t1 right outer join testtab t2 on (t1.col1=t2.col1) order by t2.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 1000000
        Extra: Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t2.col1
         rows: 11
        Extra: Using index
2 rows in set (0.00 sec)
mysql> explain select t1.col1, t2.col2 from testtab t2 right outer join testtab t1 on (t1.col1=t2.col1) order by t1.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t1.col1
         rows: 11
        Extra:
2 rows in set (0.00 sec)
mysql> explain select t1.col1, t2.col2 from testtab t1 left outer join testtab t2 on (t1.col1=t2.col1) order by t1.col1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: index
possible_keys: NULL
          key: col1key
      key_len: 5
          ref: NULL
         rows: 1000000
        Extra: Using index
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: ref
possible_keys: col1key
          key: col1key
      key_len: 5
          ref: test.t1.col1
         rows: 11
        Extra:
2 rows in set (0.00 sec)

So, the conclussion:

  1. You can avoid the filesort by making order by column appear in the where clause
  2. When using join, make sure the left side join table column is used in the ORDER BY clause or change the join type

There are so many other ways to avoid, and the complete order by optimization can be found from the MySQL documentation.

November 27, 2007

iPhone Stocks Ticker - Markets Closed

Since couple of weeks I had a weired issue with the iPhone. The stocks gadget or application powered by Yahoo! Finance always shows “Markets Closed” for almost all the day from morning to evening (atleast I started noticing it from 9.00 AM PST)

So, I sent a mail to finance team today to confirm this, and I received a immediate attention and after digging in to this; they were able to isolate the issue..may be its a known issue to them…The reason is; since Alibaba started trading in Hong Kong Exchange; I had that symbol 1688.HK in my porpolio; even though rest of them are all NASDAQ traded… So, if you have any international stock symbol in your list; you may see the same problem irrespective of rest of the stocks trading time.

Looks like they will fix this as soon as possible; but am not very sure if its Yahoo! specific issue or Apple one.

November 14, 2007

Confused MySQL Server

Today I saw a case where MySQL server is writing its binlog entries to the error log file ‘mysqld.err’ instead of writing it to regular binlog. This is happening with MySQL version 4.0.26 and not sure how it was triggered.

I am guessing it could be because the file handles got swapped, it can not be a memory overrun or anything else as I can see the error log is getting the right binlog entries. I am still trying to find what could have caused this to happen. This happened twise so far from the same property with the same setup; even though there is more than an year in between. If you restart the server or upon issuing FLUSH LOGS, everything will be back to normal.

November 13, 2007

MySQL 5.1 - Broken Read-Only Server

I had a interesting problem from Yahoo! mail team that they were not able to start the 5.1 server at all on couple of their replication setup boxes. They tried 5.1.20 , 21 and 22 and all are behaving in the same way.

After digging into the box, the mysql error log has the following info that it was failed to open event table.

1
2
3
4
5
6
7
8
9
071113 10:09:28 [ERROR] Event Scheduler: Failed to open table mysql.event
071113 10:09:28 [ERROR] Event Scheduler: Error while loading from disk.
071113 10:09:28 [Note] Event Scheduler: Purging the queue. 0 events
071113 10:09:28 [ERROR] Aborting
 
Creating new mysql user for ymon and ysar...
071113 10:09:28  InnoDB: Starting shutdown...
071113 10:09:30  InnoDB: Shutdown completed; log sequence number 0 46410
071113 10:09:30 [Note] /mysql/libexec64/mysqld: Shutdown complete

And this happened only on slave server and master server seemed to run fine; and after fighting a while figured out that this happens only when the my.cnf had read_only variable specified as all the slaves are acting as read_only servers.

And searching through MySQL Bug System, found that this is a existing bug in recent MySQL 5.1 version and already fixed and will be available from 5.1.23 on wards. To get around, I suggested the following alternative ways till we have the patched 5.1.22 or new 5.1.23 version.

  • Add EVENT_SCHEDULER=DISABLED in the my.cnf under [mysqld] section along with READ_ONLY option enabled; which will disable the event scheduler and thus will not open any event system files.
    1
    2
    3
    4
    5
    6
    7
    8
    
    [vanugant@pkgbuild7:t5 ~]$ grep event /mysql/etc/my.cnf
    event_scheduler=disabled
    [vanugant@pkgbuild7:t5 ~]$ mysql -e "show variables like 'event%'"
    +-----------------+----------+
    | Variable_name   | Value    |
    +-----------------+----------+
    | event_scheduler | DISABLED |
    +-----------------+----------+
  • Start the server as non-read only (without read_only in my.cnf) and once the server is up; set the global read_only flag using
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    
    mysql> set global read_only=1;
    Query OK, 0 rows affected (0.00 sec)
     
    mysql> show variables like 'READ_ONLY';
    +---------------+-------+
    | Variable_name | Value |
    +---------------+-------+
    | read_only     | ON    |
    +---------------+-------+
    1 row in set (0.01 sec)
  • Another way is to completely remove the event.* files from the mysql system data directory; even though this is not recommended.

November 7, 2007

LOAD DATA INFILE - performance case study

I often noticed that people complain about the LOAD DATA performance when loading the table with large number of rows of data. Even today I saw a case where the LOAD DATA on a simple 3 column table with about 5 million rows taking ~15 minutes of time. This is because the server did not had any tuning in regards to bulk insertion.

Consider the following simple MyISAM table on Redhat Linux 32-bit.

1
2
3
4
5
6
CREATE TABLE load1 (
  `col1` varchar(100) NOT NULL default '',
  `col2` int(11) default NULL,
  `col3` char(1) default NULL,
  PRIMARY KEY  (`col1`)
) TYPE=MyISAM;

The table has a string key column. Here is the data file(download here) that I used it for testing:

1
2
3
4
5
[vanugant@escapereply:t55 tmp]$ wc loaddata.csv
  5164946   5164946 227257389 loaddata.csv
[vanugant@escapereply:t55 tmp]$ ls -alh loaddata.csv
-rw-r--r--  1 vanugant users 217M Nov  6 14:42 loaddata.csv
[vanugant@escapereply:t55 tmp]$

Here is the default mysql system variables related to LOAD DATA:

1
2
3
4
5
6
7
8
mysql> show variables;
+-------------------------+---------+
| Variable_name              | Value   |
+-------------------------+---------+
| bulk_insert_buffer_size   | 8388608 |
| myisam_sort_buffer_size   | 16777216 |
| key_buffer_size            | 33554432 |
+-------------------------+----------+

and here is the actual LOAD DATA query to load all ~5m rows (~256M of data) to the table and its timing.

1
2
3
mysql> LOAD DATA INFILE '/home/vanugant/tmp/loaddata.csv' IGNORE INTO TABLE load1 FIELDS TERMINATED BY ',';
Query OK, 4675823 rows affected (14 min 56.84 sec)
Records: 5164946  Deleted: 0  Skipped: 489123  Warnings: 0

Now, lets experiment by disabling the keys in the table before running the LOAD DATA:

1
2
3
4
5
6
7
8
9
mysql> SET SESSION BULK_INSERT_BUFFER_SIZE=314572800;
Query OK, 0 rows affected (0.00 sec)
 
mysql> alter table load1 disable keys;
Query OK, 0 rows affected (0.00 sec)
 
mysql> LOAD DATA INFILE '/home/vanugant/tmp/loaddata.csv' IGNORE INTO TABLE load1 FIELDS TERMINATED BY ',';
Query OK, 4675823 rows affected (13 min 47.50 sec)
Records: 5164946  Deleted: 0  Skipped: 489123  Warnings: 0

No use, just 1% increase or same…., now lets set the real MyISAM values… and try again…

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
mysql> SET SESSION BULK_INSERT_BUFFER_SIZE=256217728;
Query OK, 0 rows affected (0.00 sec)
 
mysql> set session MYISAM_SORT_BUFFER_SIZE=256217728;
Query OK, 0 rows affected (0.00 sec)
 
mysql> set global KEY_BUFFER_SIZE=256217728;
Query OK, 0 rows affected (0.05 sec)
 
mysql> alter table load1 disable keys;
Query OK, 0 rows affected (0.00 sec)
 
mysql> LOAD DATA INFILE '/home/vanugant/tmp/loaddata.csv' IGNORE INTO TABLE load1 FIELDS TERMINATED BY ',';
Query OK, 4675823 rows affected (1 min 55.05 sec)
Records: 5164946  Deleted: 0  Skipped: 489123  Warnings: 0
 
mysql> alter table load1 enable keys;
Query OK, 0 rows affected (0.00 sec)

Wow…thats almost 90% increase in the performance. So, disabling the keys in MyISAM is not just the key, but tuning the buffer size does play role based on the input data.

For the same case with Innodb, here is the status by adjusting the Innodb_buffer_pool_size=1G and Innodb_log_file_size=256M along with innodb_flush_logs_at_trx_commit=1.

1
2
3
4
5
6
7
8
9
10
11
12
13
mysql> show variables like '%innodb%size';
+---------------------------------+------------+
| Variable_name                   | Value      |
+---------------------------------+------------+
| innodb_additional_mem_pool_size | 26214400   |
| innodb_buffer_pool_size         | 1073741824 |
| innodb_log_buffer_size          | 8388608    |
| innodb_log_file_size            | 268435456  |
+---------------------------------+------------+
 
mysql> LOAD DATA INFILE '/home/vanugant/tmp/loaddata.csv' IGNORE INTO TABLE load1 FIELDS TERMINATED BY ',';
Query OK, 4675823 rows affected (2 min 37.53 sec)
Records: 5164946  Deleted: 0  Skipped: 489123  Warnings: 0

With innodb_flush_logs_at_trx_commit=2, innodb_flush_method=O_DIRECT and innodb_doublewrite=0; it will be another 40% difference (use all these variables with caution, unless you know what you are doing)

1
2
3
mysql> LOAD DATA INFILE '/home/vanugant/tmp/loaddata.csv' IGNORE INTO TABLE load1 FIELDS TERMINATED BY ',';
Query OK, 4675823 rows affected (1 min 53.69 sec)
Records: 5164946  Deleted: 0  Skipped: 489123  Warnings: 0

November 6, 2007

MySQL - Large File Support

Today I had an interesting issue from one of the internal property within Yahoo!, where they were not able to start the MySQL server after the server has been upgraded to 5.1 from 4.1.

Here is the error log when the 5.1.22 server has started on 32-bit RHEL:

1
2
3
4
5
6
7
8
071105 12:25:50  InnoDB: Operating system error number 27 in a file operation.
InnoDB: Error number 27 means 'File too large'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refmastartn/5.1/en/operating-system-error-codes.html
InnoDB: File name /mysql/var/mysql/ibdata/ibdata1
InnoDB: File operation call: 'open'.
InnoDB: Cannot continue operation.
071105 12:25:50 mysqld_safe mysqld from pid file /mysql/var/mysql/mysqld.pid ended

The reason is pretty simple as the large file system is off and the above file is of size ~26G. By looking at the error realized that it is because of the large file support, but could not understand how the support has been disabled in the new 5.1.22 build and going back to 5.1.21 also caused the same problem, but 5.1.20 started working fine because it has the large file support on the 32-bit RHEL.

1
2
3
4
5
6
7
[vanugant@escapereply:t55 5.1]$ mysql -e "show variables like '%large_files%'"
+---------------------+-------+
| Variable_name       | Value |
+---------------------+-------+
| large_files_support | OFF   |
+---------------------+-------+
[vanugant@escapereply:t55 5.1]$

But the funny thing is; am using the same makefile and build environment and tools for all the above versions of MySQL builds, and the Makefile has the same build flags (getconf LFS_CFLAGS =”-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64″).

Looks like time to find out what went wrong in the build process.

November 2, 2007

YAC - Yet Another Car!

Well, I know people look at me strange at times.. whats wrong with this guy… keep changing the cars…but I have my own calculations..and hopefully I will stay with this car for next few years!!

Anyway, I brought yet another car this Sunday (Oct-28-2007), but this time it was a small model and it is not a luxury car. This is my first ever non-luxury car other than Mercedes and BMW. Its 2008 Nissan Versa SL Hatchback. It offers me everything that I needed in a car other than leather package. It opted for a convenience package which offers a key less entry to the car and ignition along with voice activated bucktooth phone.

Even though car looks small to the outside (may be because of hatchback), interior is spacious and leg room is longer than my previous Mercedes and BMW cars. I selected super black exterior with Charcoal interior. Here is the pictures of the car.

Nissan Versa