March 28, 2010

Dell MD1120 Storage Array Performance

Here is some file IO performance numbers from DELL MD1120 SAS storage array. Last year I did the same test with HP P800 storage array and numbers were impressive. But when it comes to this high end storage array, few surprises.  Before getting into actual details; lets see the test stats and configuration details.

System Configuration:

  1. DELL R710 with CentOS 5.4
  2. NOOP IO Scheduler
  3. MD1120 with 22 10K SAS disks
    • 20 disk RAID-10 (hardware)
    • 2 hot spares
    • Disk Cache disabled
  4. PERC 6/E RAID controller with BBU
    • Connected to DELL MD1120 using SAS
    • Write Back
    • Read Cache Disabled

Test Configuration:

  1. Sysbench fileio test with variable modes and threads
  2. 64 files with 50G total size
  3. All tests ran in un-buffered mode (O_DIRECT) as most of the workload is InnoDB based.

Test Results:

Number of Threads vs Number of Requests/Sec. Every mode ran with 5 iterations and average is taken.

Random IO:

rndio

Sequential IO:

seqio 

HDPARM Test:

[test~]# for i in `seq 1 3`; do hdparm --direct -tT /dev/sdc1; done | grep Timing
 Timing O_DIRECT cached reads:   2068 MB in  2.00 seconds = 1033.21 MB/sec
 Timing O_DIRECT disk reads:  2146 MB in  3.00 seconds = 715.32 MB/sec
 Timing O_DIRECT cached reads:   2020 MB in  2.00 seconds = 1010.26 MB/sec
 Timing O_DIRECT disk reads:  2162 MB in  3.00 seconds = 720.62 MB/sec
 Timing O_DIRECT cached reads:   2052 MB in  2.00 seconds = 1025.90 MB/sec
 Timing O_DIRECT disk reads:  2128 MB in  3.00 seconds = 709.17 MB/sec
 
[test ~]# for i in `seq 1 3`; do hdparm -tT /dev/sdc1; done | grep Timing
 Timing cached reads:   18920 MB in  2.00 seconds = 9475.34 MB/sec
 Timing buffered disk reads:  3442 MB in  3.02 seconds = 1141.44 MB/sec
 Timing cached reads:   19332 MB in  2.00 seconds = 9681.56 MB/sec
 Timing buffered disk reads:  3478 MB in  3.00 seconds = 1159.24 MB/sec
 Timing cached reads:   18012 MB in  2.00 seconds = 9019.50 MB/sec
 Timing buffered disk reads:  3492 MB in  3.02 seconds = 1155.53 MB/sec

Analysis:

  1. Overall the numbers are not bad when it comes to writes, but few surprises when it comes to reads. When compared with HP’s P800 storage array, the numbers still dropped by 20%.
  2. Radon IO:
    • Random write requests ranges from 3200-5000 per sec; due to write back mode (512M cache)
    • Writes are linearly scaling well with the threads, good sign that controller is able to manage the cache efficiently
    • Random reads and writes (rndrw) is also scaling linearly with the threads load, means the IO distribution and cache burst to satisfy reads seems be efficient as it needs to flush the data from controller cache to disk before the read can be satisfied.
  3. Sequential IO:
    • Writes seems to be scaling well even in sequential mode without much overhead
    • When it comes to reads, big surprise is drop from 5626 requests/sec to 615 from one thread to two threads. Which is really odd. Worst case it should be ~2000-3000 requests/sec; not sure where the overhead is. I can’t believe it could be thread scheduling as there is only 2 threads.
  4. During 100% IO, on and off I noticed IO serialization with higher queue waits, which indicates that there is some degree of serialization overhead in OS; but not able to track which layer is triggering this. Tried with cfq/deadline, still the same.
  5. Next attempt will be replacing 3Gb/s SAS to fiber channel HBA or 6Gb/s SAS (PERC H800) to see how it performs along with combination of HW and SW raid instead of only depending on controller.

March 21, 2010

Hyper Threading Performance

Its been a while anyone talked about Intel’s Hyper-Threading performance when it comes to databases. There were enough posts about disabling Hyper-Threading  completely when it comes to MySQL/InnoDB workloads way back when we had enough issues with scalability of InnoDB on multi-core systems. But things has changed quite a bit in the fast year or two in terms of multi-core support (thanks to Innobase/Mark Callaghan/Google and Percona). I still see lot of production servers running with HT disabled completely either in BIOS (append noht to kernel parameter) or manually disabling the CPUs (echo 0 > /sys/devices/system/node/node[0-1]/cpu[2-..]/online ).

Last few weeks or so; I ran quite a few tests on new production launch with 8-core CPUs and decided to go ahead with HT enabled (along with Intel Turbo mode) as it seems to give close to 15-20% performance gain. Ran tests which are completely InnoDB CPU/memory bound and HT seems to enable much better performance in every test mode.

Also tried with different thread schedulers with IO bound loads; still HT seems to give much better results; especially when most of the workload seems to be with 8-50 threads; and HT seems to give much better boost in that range; so no point to disable HT.

I tested both Intel Xeon X5570 and E5530 and here is the test results for CPU bound work load.

e5530

x5570

It is a clear indication that HT is not a bottleneck. In both the cases, turbo boost is enabled though.

As you can also notice; Intel Xeon X series CPUs yield much better results than E series (obvious as X-series is meant for performance where as E is for economy), which is close to $1000 extra, but yields ~15-20% performance for InnoDB workloads. You can find the comparison between these CPUs from Intel. If you are buying any new system and if you can spare extra $$; then always go for X series as its worth the price for performance

March 8, 2010

When indexes are created in internal temporary tables

During my previous post on how to improve derived tables performance, I patched the code to add indexes forcefully on internal derived table results, which made a huge difference in the performance. It was just an experiment and a thought to see if it really works without re-writing the queries, so that the logic can be pushed towards the engine rather than query re-write. \

But I got few emails in my inbox today asking whether MySQL really create any keys on internal temporary tables.

The answer is YES; and MySQL does create two keys on internal temporary tables namely ‘group_key‘ and ‘distinct_key‘ on the following conditions:

  • If there is any aggregate function and/or group-by (group_key)
  • Distinct column name(group_key)
  • Distinct in combination with group-by/aggregation functions (distinct_key)

Provided the query results are yielded in temporary table (Using temporary from the explain), else they get optimized away by the existing indexes from the regular table itself. These keys are added to both memory and disk based (MyISAM) internal temporary tables; so it does not matter if the internal temporary table is in memory or disk.

Here is a simple dump of internal temporary table index stats for some of the basic queries related to Information schema [Warning: these queries are really bad, and can't be used for any production use as they are meant for demonstration of different internal keys ]. This is a patch that I might be using for SHOW TEMPORARY TABLES when internal tables are included in the second version. The first version of the patch is already pushed to Maria branch, hoping that it gets pushed to 5.1.

-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_1f
   temp type  : MEMORY
   index count: 1
    key 1-1   : distinct_key
    field     : (null)
    key 1-2   : distinct_key
    field     : ENGINE
 query: select count(distinct engine) from information_schema.tables
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_21
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
 query: select table_name, sum(data_length+index_length) from information_schema.tables 
        where table_schema='mysql' group by 1
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_24
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_SCHEMA
    key 1-2   : group_key
    field     : TABLE_NAME
    key 1-3   : group_key
    field     : COLUMN_NAME
 query: select tab.table_schema, tab.table_name, column_name, index_name, seq_in_index 
        from Information_schema.tables tab join information_schema.statistics stast 
        using(table_schema,table_name) group by  1,2,3
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_bd
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
    key 1-2   : group_key
    field     : TABLE_TYPE
    key 1-3   : group_key
    field     : ENGINE
    key 1-4   : group_key
    field     : INDEX_SCHEMA
    key 1-5   : group_key
    field     : INDEX_NAME
 query: select tab.table_schema,  tab.table_name, table_type, engine, index_schema, 
        index_name from information_schema.tables tab join information_schema.statistics
        stats using(table_schema, table_name) where table_schema='mysql' group by 
        1,2,3,4,5, 6 order by 4,3,2,1
-----------------------------
 
-----------------------------
 TMP TABLE STATS, SESSION: 1
   temp file  : /tmp/#sqlf90_1_e0
   temp type  : MEMORY
   index count: 1
    key 1-1   : group_key
    field     : TABLE_NAME
 query: select  table_name, sum(data_length+index_length) from information_schema.tables
        where table_schema='mysql' group by 1
-----------------------------

March 6, 2010

How to improve subqueries derived tables performance

Last week I was working on one of the issue where the sub-query related to OLAP staging was running for about 2+ hours in the production server and finally nailed down to get the query to run in < 7 secs. It was bit interesting and kind of known issue in MySQL sub-queries world and one of the limitation from MySQL on giving more control over derived table results.

Sometimes we can re-write the sub-queries so that there is no derived tables complexity involved; but the bad part is; this particular sub-query is part of an UPDATE statement; so not all sub-queries can be re-written especially when they are part of UPDATE or DELETE statements due to its own limitations.

PROBLEM:

Here is the subset of the problem query and as you can see it runs for about 6 minutes in this small subset of data that I used for testing on Mac. All tables are InnoDB based.

--------------
SELECT
    SUM(aggrpt.imps) as imps,
    SUM(aggrpt.clicks) as clicks,
    SUM(aggrpt.pos) as pos
 
FROM aggrpt
LEFT JOIN
(
    SELECT
    DISTINCT ext_group_id, group_id
    FROM sub
) sub2  ON(sub2.ext_group_id=aggrpt.adgroupid)
 
GROUP BY
aggrpt.report_date,
aggrpt.campaignid,
aggrpt.adgroupid,
aggrpt.keywordid
ORDER BY NULL
INTO OUTFILE '/tmp/test-sub.txt'
-------------- 
 
Query OK, 47827 rows affected (6 min 47.48 sec)

HOW TO GET AROUND – SOLUTIONS:

Moving the derived table (in the above case sub2) to a view did not help; and timings are more or less the same. Here is the two alternative ways, which made the query run in < 10 secs.

  1. By creating external table and adding an index instead of using derived table
  2. Added an index within the mysql code on the derived table temporary results table by adding FORCE INDEX syntax (changed the syntax to support this, so that engine will create an index on temporary derived table results, which in this case happens to fit within heap engine instead of disk based)

CASE 1:

This is a known alternative and lot of people use this in production by avoiding the derived tables and/or sub-queries completely by creating tables for derived tables. The only thing that made the big difference is adding an index on this; without index it takes more or less the same 6 minutes times.

--------------
CREATE TEMPORARY TABLE sub_temp SELECT DISTINCT ext_group_id, group_id FROM sub
--------------
Query OK, 72385 rows affected (0.81 sec)
Records: 72385  Duplicates: 0  Warnings: 0
--------------
alter table sub_temp add index i_ext_group_id(ext_group_id)
--------------
Query OK, 72385 rows affected (0.10 sec)
Records: 72385  Duplicates: 0  Warnings: 0
--------------
SELECT
    SUM(aggrpt.imps) as imps,
    SUM(aggrpt.clicks) as clicks,
    SUM(aggrpt.pos) as pos
 
FROM aggrpt
LEFT JOIN sub_temp sub ON(sub.ext_group_id=aggrpt.adgroupid)
 
GROUP BY
    aggrpt.report_date,
    aggrpt.campaignid,
    aggrpt.adgroupid,
    aggrpt.keywordid
ORDER BY NULL
INTO OUTFILE '/tmp/test-sub-temp.txt'
-------------- 
 
Query OK, 47827 rows affected (10.41 sec)

As you can see its a great improvement from 6minutes to 10secs; but without index i_ext_group_id on sub-temp table; then things will be as usual.

CASE 2:

The engine should be smart enough to add the index on intermediate results temporary table (in this case sub2, index on column i_ext_group_id); but MySQL does not support this or at least should allow people to specify one using FORCE/USE INDEX, so that engine can add one.

For example; I patched MySQL to support this syntax, so that it adds the index on the derived table sub2 results automatically and the query immediately returns in 7 secs.

SELECT
    SUM(aggrpt.imps) as imps,
    SUM(aggrpt.clicks) as clicks,
    SUM(aggrpt.pos) as pos
 
FROM aggrpt
LEFT JOIN
(
    SELECT
    DISTINCT ext_group_id, group_id
    FROM sub
) sub2 <strong>USE INDEX(ext_group_id)</strong> ON(sub2.ext_group_id=aggrpt.adgroupid)
 
GROUP BY
aggrpt.report_date,
aggrpt.campaignid,
aggrpt.adgroupid,
aggrpt.keywordid
ORDER BY NULL
INTO OUTFILE '/tmp/test-sub-force.txt'
-------------- 
 
Query OK, 47827 rows affected (7.18 sec)

This is again a great improvement; but again using USE/FORCE INDEX is a hack here (this can also be implemented as hint) with column name; which acts as a hint to optimizer to create an index on that column (if its hint, then on JOIN column).

CHANGES TO MYSQL OPTIMIZER

But the idea for this post is to show how important the index in the intermediate temporary table results is for the query performance and optimizer should be smart enough to identify and add one automatically. The logic is same as how currently optimizer opts for index vs non-index scan by estimating the cost.

February 7, 2010

Changing MySQL parser code on Windows – Build breaks due to Bison

In case if you working on Windows environment for MySQL development (sometimes I use visual studio for easy debugging); and if you change the parser code (sql_yacc.yy) or if you working directly from the development branch (bzr launchpad), then the build breaks to generate the parser yacc files (sql_yacc.h and sql_yacc.cc) with an error bison: M4: Invalid argument as shown below:

------ Build started: Project: sql, Configuration: Debug Win32 ------
Generating sql_yacc.h, sql_yacc.cc
------ Build started: Project: GenServerSource, Configuration: Debug Win32 ------
Generating sql_yacc.h, sql_yacc.cc
bison: m4: Invalid argument
Project : error PRJ0019: A tool returned an error code from "Generating sql_yacc.h, sql_yacc.cc"
sql - 1 error(s), 0 warning(s)
bison: m4: Invalid argument
Project : error PRJ0019: A tool returned an error code from "Generating sql_yacc.h, sql_yacc.cc"
GenServerSource - 1 error(s), 0 warning(s)

But if use source zip file for any particular release, then it won’t fail as the files (sql_yacc.cc and sql_yacc.h) are pre-built and copied to the distribution zip file.

It looks like lot of people are experiencing the same problem to build parser code on Windows using any recent version of bison (not just MySQL code base).

Both bison.exe and m4.exe are in the path and they are the latest version; but still it fails..

c:\mysql-5.1\sql&gt;which bison
C:\Gnu\GetGnuWin32\gnuwin32\bin\bison.EXE
 
c:\mysql-5.1\sql&gt;which m4
C:\Gnu\GetGnuWin32\gnuwin32\bin\m4.EXE
 
c:\mysql-5.1\sql&gt;bison --version
bison (GNU Bison) 2.4.1
 
c:\mysql-5.1\sql&gt;m4 --version
m4 (GNU M4) 1.4.13

It looks like the problem is with Windows version of bison to pick m4 executable even though m4 is in the path. For example, you can directly try to generate the files from sql directory using bison as…

c:\mysql-5.1\sql&gt;bison -y -p MYSQL --defines=sql_yacc.h --output=sql_yacc.cc sql_yacc.yy
bison: m4: Invalid argument

The work around what I found is to copy m4.exe to sql directory directly, so that bison can pick from local working directory, then everything starts working as expected.

c:\mysql-5.1\sql&gt;bison -y -p MYSQL --defines=sql_yacc.h --output=sql_yacc.cc sql_yacc.yy
bison: m4: Invalid argument
 
c:\mysql-5.1\sql&gt;ls -al sql_yacc.*
-rw-rw-rw-  1 venu 0 413012 2010-02-07 11:58 sql_yacc.yy
 
c:\mysql-5.1\sql&gt;which m4
C:\Gnu\GetGnuWin32\gnuwin32\bin\m4.EXE
 
c:\mysql-5.1\sql&gt;copy C:\Gnu\GetGnuWin32\gnuwin32\bin\m4.EXE .
        1 file(s) copied.
 
c:\mysql-5.1\sql&gt;which m4
c:\mysql-5.1\sql\m4.EXE
 
c:\mysql-5.1\sql&gt;bison -y -p MYSQL --defines=sql_yacc.h --output=sql_yacc.cc sql_yacc.yy
 
c:\mysql-5.1\sql&gt;ls -al sql_yacc.*
-rw-rw-rw-  1 venu 0 1510389 2010-02-07 14:33 sql_yacc.cc
-rw-rw-rw-  1 venu 0   30532 2010-02-07 14:33 sql_yacc.h
-rw-rw-rw-  1 venu 0  413012 2010-02-07 11:58 sql_yacc.yy

Kind of weird, but at least there is a work around to change the parser code on Windows now; apart from command-line, even visual studio starts working without any errors.  But if you remove m4.exe from sql directory, then things starts to break immediately.

February 3, 2010

SHOW TEMPORARY TABLES

I had this patch for a while where one can get listing of both session and global temporary tables across all sessions. It really helped lot of times to understand the bottlenecks of some of the temporary table issues as MySQL never exposed them in the form of SHOW TABLES.

I also added a new status variable called ‘Created_tmp_heap_to_disk_tables‘, which keeps track of how many memory based temp tables are re-created back to disk based.

The patch is now ported to newer MySQL versions, both 5.0 and 5.1; and it works great on most of the platforms that I tested (Mac, Linux and Windows)

It introduces two new INFORMATION_SCHEMA tables, TEMPORARY_TABLES and GLOBAL_TEMPORARY_TABLES along with supporting regular SHOW syntax

 
SHOW [SESSION/GLOBAL] TEMPORARY TABLES [FROM db]

Some examples of how it works at present

Session Temp Tables

mysql> show session temporary tables;
+----+------+-------+--------+--------------+
| Id | Db   | Table | Engine | Name         |
+----+------+-------+--------+--------------+
|  1 | test | t2    | MEMORY | #sql29da_1_3 |
|  1 | test | t1    | MyISAM | #sql29da_1_2 |
+----+------+-------+--------+--------------+
2 rows in set (0.00 sec)
 
mysql> show temporary tables;
+----+------+-------+--------+--------------+
| Id | Db   | Table | Engine | Name         |
+----+------+-------+--------+--------------+
|  1 | test | t2    | MEMORY | #sql29da_1_3 |
|  1 | test | t1    | MyISAM | #sql29da_1_2 |
+----+------+-------+--------+--------------+
2 rows in set (0.00 sec)
 
mysql> select * from information_schema.temporary_tables;
+------------+----------+------------+--------+--------------+
| SESSION_ID | DATABASE | TABLE_NAME | ENGINE | NAME         |
+------------+----------+------------+--------+--------------+
|          1 | test     | t2         | MEMORY | #sql29da_1_3 |
|          1 | test     | t1         | MyISAM | #sql29da_1_2 |
+------------+----------+------------+--------+--------------+
2 rows in set (0.00 sec)

Global Temp Tables Across All Sessions:

mysql> select * from information_schema.global_temporary_tables;
+------------+----------+------------+--------+--------------+
| SESSION_ID | DATABASE | TABLE_NAME | ENGINE | NAME         |
+------------+----------+------------+--------+--------------+
|          6 | test     | t3         | MyISAM | #sql29da_6_0 |
|          5 | test     | t2         | MEMORY | #sql29da_5_3 |
|          5 | test     | t1         | MyISAM | #sql29da_5_2 |
|          4 | venu     | v1         | InnoDB | #sql29da_4_0 |
+------------+----------+------------+--------+--------------+
4 rows in set (0.00 sec)
 
mysql> show global temporary tables;
+----+------+-------+--------+--------------+
| Id | Db   | Table | Engine | Name         |
+----+------+-------+--------+--------------+
|  6 | test | t3    | MyISAM | #sql29da_6_0 |
|  5 | test | t2    | MEMORY | #sql29da_5_3 |
|  5 | test | t1    | MyISAM | #sql29da_5_2 |
|  4 | venu | v1    | InnoDB | #sql29da_4_0 |
+----+------+-------+--------+--------------+
4 rows in set (0.00 sec)
  • Id: Session ID
  • Db: Database Name
  • Engine: Engine Type
  • Name: Internal Name, how its stored in the file system

Will publish the patch once am done with porting internal temp tables listing by introducing two new columns Table_Type and Info where Table_Type will indicate if its internal or external and Info will have initial 1024 bytes of the query that caused the internal temp table.

January 10, 2010

Create Table Like Scalability Issues

When we wanted to quickly process intermediate data by using temporary or heap tables; then its normal tendency to create a table like its source table; so people will simply opt for CREATE TEMPORARY TABLE temp_table LIKE source_table.

This is an easy and convenient way. But the problem is if you have a simple stored procedure or code module that gets executed frequently, then you might indirectly experience a slow-down in the performance if your source_table is large enough and highly contended. Things will be really worst if concurrent threads starts using the source_table or in combination of source_table and CREATE LIKE.. statements and can lead to disaster scalability issues. The main reason for the post is; I noticed from SHOW PROCESSLIST from random servers where X threads getting into WAITING for TABLE lock state for simple CREATE TABLE LIKE statements and lot of threads associated with source table is opening and closing the tables.

The main source of issue is, In MySQL 5.0 or lower versions; CREATE LIKE uses exclusive named lock (lock_and_wait_for_table_name) on the source_table; which forces all threads to close the table which are currently in use and gets the named lock and no other thread can read/write from source_table until destination table is created. If source_table is large enough and few long running selects are running; then CREATE TABLE will wait until it gets the lock. To make it simple; you can even simulate this with a simple read lock on a table in one session and try to create the table like the locked table in another session; and notice that create table will wait until session one unlocks the tables.

Session 1:

localhost:test> create table source_table (id int)Engine=InnoDB;
Query OK, 0 rows affected (0.00 sec)
 
localhost:test> lock table source_table read;
Query OK, 0 rows affected (0.00 sec)

Session 2:

localhost:test> create temporary table temp_table like source_table;

Thats a simple case, but lets simulate a real example with stored procedure and mysqlslap with multiple threads where few threads will be using the source_table and few threads starts creating the temporary tables.

First, create the the source_table and populate the table with 10K rows..

localhost:test> use test;
Query OK, 0 rows affected (0.02 sec)
 
localhost:test> DROP TABLE IF EXISTS source_table;
Query OK, 0 rows affected, 1 warning (0.00 sec)
 
localhost:test> CREATE TABLE source_table(id int, name VARCHAR(20))Engine=InnoDB;
Query OK, 0 rows affected (0.00 sec)
 
localhost:test> DROP PROCEDURE IF EXISTS load_table_proc;
Query OK, 0 rows affected (0.00 sec)
 
localhost:test> DROP PROCEDURE IF EXISTS temp_scale_proc;
Query OK, 0 rows affected (0.00 sec)
 
localhost:test> DELIMITER GO
localhost:test> 
localhost:test> CREATE PROCEDURE load_table_proc(count INT)
    -> BEGIN
    -> DECLARE _i INT;
    -> SET _i = 1;
    -> WHILE _i < = count DO
    ->   INSERT INTO source_table SELECT _i, CONCAT('testing', _i);
    ->   SET _i = _i + 1;
    -> END WHILE;
    -> END
    -> GO
Query OK, 0 rows affected (0.00 sec)
 
localhost:test> 
localhost:test> DELIMITER ;
localhost:test> CALL load_table_proc(10000);
Query OK, 1 row affected (1.54 sec)

CREATE another stored procedure which can use the source_table to create the temp table and searches for a given number

localhost:test> DELIMITER GO
localhost:test> CREATE PROCEDURE temp_scale_proc(cnt INT)
    -> BEGIN
    ->     DROP TABLE IF EXISTS temp_tab;
    ->     CREATE TEMPORARY TABLE temp_tab LIKE source_table;
    ->     INSERT INTO temp_tab SELECT * FROM source_table;
    ->     SELECT * FROM temp_tab WHERE id=cnt;
    -> END
    -> GO
Query OK, 0 rows affected (0.00 sec)
localhost:test> DELIMITER ;
localhost:test>

Now execute the stored procedure temp_scale_proc in parallel using 10 threads through mysqlslap; and start watching the processlist and observe the thread states.

mysqlslap --create-schema="test" --concurrency=10 --iteration=100 --delimiter=";"  --query="call test.temp_scale_proc(1+(RAND() * 10000))"

The above calls the stored procedure by passing a random number between 1 to 10000.

Some thread states from SHOW PROCESSLIST showing locking and scalability issues…

2117    root    localhost       NULL    Sleep   0               NULL
2118    root    localhost       test    Query   0       Waiting for table       INSERT INTO temp_tab SELECT * FROM source_table
2119    root    localhost       test    Query   0       Sending data    INSERT INTO temp_tab SELECT * FROM source_table
2120    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2121    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2122    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2123    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2124    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2125    root    localhost       test    Query   0       Waiting for table       INSERT INTO temp_tab SELECT * FROM source_table
2126    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
2127    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
--
Total open tables: 8 and Name Locks: 7 (output of SHOW OPEN TABLES)

Not just the locking, the statement also indirectly causes table to be closed and opened again and again by other sessions…

2117    root    localhost       NULL    Sleep   15              NULL
3032    root    localhost       test    Query   0       closing tables  SELECT * FROM temp_tab WHERE id= NAME_CONST('cnt',9319)
3033    root    localhost       test    Query   0       Opening tables  INSERT INTO temp_tab SELECT * FROM source_table
3034    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
3035    root    localhost       test    Query   0       Waiting for table       INSERT INTO temp_tab SELECT * FROM source_table
3036    root    localhost       test    Query   0       checking permissions    CREATE TEMPORARY TABLE temp_tab LIKE source_table
3037    root    localhost       test    Query   0       closing tables  CREATE TEMPORARY TABLE temp_tab LIKE source_table
3038    root    localhost       test    Query   0       Opening tables  INSERT INTO temp_tab SELECT * FROM source_table
3039    root    localhost       test    Query   0       checking permissions    CREATE TEMPORARY TABLE temp_tab LIKE source_table
3040    root    localhost       test    Query   0       checking permissions    CREATE TEMPORARY TABLE temp_tab LIKE source_table
3041    root    localhost       test    Query   0       Waiting for table       CREATE TEMPORARY TABLE temp_tab LIKE source_table
--
Total open tables: 1 and Name Locks: 1 (output of SHOW OPEN TABLES)

So, the best bet is not to use CREATE TABLE LIKE on a highly contended source table or use direct table schema to create the table or if you upgrade to MySQL 5.1 and above; where locking issue is addressed as it opens the source_table instead of getting the named lock that prevents any updates to FRM file.

January 4, 2010

Performance comparison of Repair by Sorting or by Keycache

Other day I noticed a case where loading the same set of data to InnoDB took only 10 minutes where as loading it to MyISAM took ~2 hours.

Digging it further found that it is all because of well known Repair with keycache issue. But for some reason, it took me a while to get to the root cause of the issue as it was working fine until few days. When MyISAM needs to repair the table (REPAIR, ALTER or LOAD or ENABLE KEYS); it uses two modes for repair:

  • repair by sorting
  • repair using keycache (falls to this mode by default if repair by sort fails for any reason)

first it tests if the table can be repaired by sorting provided it meets the following requirements:

  • table at least has one key
  • total size needed for individual key is less than myisam_max_sort_file_size

If it meets the above requirements, then it uses either regular sorting if myisam_repair_threads  = 1 (default) by building each key at a time or in parallel if myisam_repair_threads > 1 by using ‘n’ threads in parallel (n = total keys in the table). If you have a table with more than one key and table needs a frequent key rebuild, then setting myisam_repair_threads = 2 can speedup the repair/alter process.

If it fails to satisfy the above conditions, then it falls to expensive keycache repair mode.

Lets consider the following simple example (lengthy index) where one uses ‘Repair by sort’ and another uses ‘Repair by keycache’ when rebuilding 2M rows; the only difference between the two cases is difference in myisam_max_sort_file_size

mysql> create table test_repair_cache (s_id bigint unsigned not null, 
                                       s_key VARCHAR(255) NOT NULL, 
                                       s_value text, 
                                       PRIMARY KEY(s_id, s_key), INDEX(s_key, s_id)
        )Engine=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1;
Query OK, 0 rows affected (0.00 sec)

with myisam_max_sort_file_size=1.5G with myisam_max_sort_file_size=512M
mysql> show global variables like 'myisam_max_sort_file_size';
+---------------------------+------------+
| Variable_name             | Value      |
+---------------------------+------------+
| myisam_max_sort_file_size | 1610612736 | 
+---------------------------+------------+
1 row in set (0.00 sec)
 
mysql> alter table test_repair_cache disable keys;
Query OK, 0 rows affected (0.00 sec)
 
mysql> load data local infile 'x' into table test_repair_cache  ;
Query OK, 2000000 rows affected (2 min 28.67 sec)
Records: 2000000  Deleted: 0  Skipped: 0  Warnings: 0
 
mysql> alter table test_repair_cache enable keys;
Query OK, 0 rows affected (13.96 sec)
 
mysql>
mysql> show global variables like 'myisam_max_sort_file_size';
+---------------------------+-----------+
| Variable_name             | Value     |
+---------------------------+-----------+
| myisam_max_sort_file_size | 536870912 | 
+---------------------------+-----------+
1 row in set (0.00 sec)
 
mysql> alter table test_repair_cache disable keys;
Query OK, 0 rows affected (0.00 sec)
 
mysql> load data local infile 'x' into table test_repair_cache  ;
Query OK, 2000000 rows affected (2 min 28.95 sec)
Records: 2000000  Deleted: 0  Skipped: 0  Warnings: 0
 
mysql> alter table test_repair_cache enable keys;
Query OK, 0 rows affected (6 min 47.92 sec)
 
mysql>

As you can see, it took only 13 secs when there is a sufficient myisam_max_sort_file_size to sort 2M rows and ~6m  in case of keycache mode. For larger data sets, this can take hours and sometimes even a day or two.

Here is the formula to find the right sort size for the above case (~1.5G)

mysql> select (2000000 * (255 * 3) + 8) / (1024 * 1024 * 1024) as SizeInGig;
+-----------+
| SizeInGig |
+-----------+
|    1.4249 | 
+-----------+
1 row in set (0.00 sec)

SHOW PROCESSLIST thread state will indicate if its using sort or keycache for repairing the keys, for example, here is the show processlist state for all three repair modes

mysql> show processlist;
+-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+
| Id    | User    | Host      | db   | Command | Time | State                | Info                                      |
+-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+
| 23735 | testing | localhost | test | Query   |  248 | Repair with keycache | alter table test_repair_cache enable keys | 
+-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+
 
+-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+
| Id    | User    | Host      | db   | Command | Time | State                 | Info                                      |
+-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+
| 23740 | testing | localhost | test | Query   |    4 | Repair with 2 threads | alter table test_repair_cache enable keys | 
+-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+
 
+-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+
| Id    | User    | Host      | db   | Command | Time | State             | Info                                      |
+-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+
| 23743 | testing | localhost | test | Query   |    5 | Repair by sorting | alter table test_repair_cache enable keys | 
+-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+