Performance comparison of Repair by Sorting or by Keycache
Other day I noticed a case where loading the same set of data to InnoDB took only 10 minutes where as loading it to MyISAM took ~2 hours.
Digging it further found that it is all because of well known Repair with keycache issue. But for some reason, it took me a while to get to the root cause of the issue as it was working fine until few days. When MyISAM needs to repair the table (REPAIR, ALTER or LOAD or ENABLE KEYS); it uses two modes for repair:
- repair by sorting
- repair using keycache (falls to this mode by default if repair by sort fails for any reason)
first it tests if the table can be repaired by sorting provided it meets the following requirements:
- table at least has one key
- total size needed for individual key is less than myisam_max_sort_file_size
If it meets the above requirements, then it uses either regular sorting if myisam_repair_threads = 1 (default) by building each key at a time or in parallel if myisam_repair_threads > 1 by using ‘n’ threads in parallel (n = total keys in the table). If you have a table with more than one key and table needs a frequent key rebuild, then setting myisam_repair_threads = 2 can speedup the repair/alter process.
If it fails to satisfy the above conditions, then it falls to expensive keycache repair mode.
Lets consider the following simple example (lengthy index) where one uses ‘Repair by sort’ and another uses ‘Repair by keycache’ when rebuilding 2M rows; the only difference between the two cases is difference in myisam_max_sort_file_size
mysql> create table test_repair_cache (s_id bigint unsigned not null, s_key VARCHAR(255) NOT NULL, s_value text, PRIMARY KEY(s_id, s_key), INDEX(s_key, s_id) )Engine=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1; Query OK, 0 rows affected (0.00 sec) |
| with myisam_max_sort_file_size=1.5G | with myisam_max_sort_file_size=512M | ||
|---|---|---|---|
|
|
As you can see, it took only 13 secs when there is a sufficient myisam_max_sort_file_size to sort 2M rows and ~6m in case of keycache mode. For larger data sets, this can take hours and sometimes even a day or two.
Here is the formula to find the right sort size for the above case (~1.5G)
mysql> select (2000000 * (255 * 3) + 8) / (1024 * 1024 * 1024) as SizeInGig; +-----------+ | SizeInGig | +-----------+ | 1.4249 | +-----------+ 1 row in set (0.00 sec) |
SHOW PROCESSLIST thread state will indicate if its using sort or keycache for repairing the keys, for example, here is the show processlist state for all three repair modes
mysql> show processlist; +-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+ | 23735 | testing | localhost | test | Query | 248 | Repair with keycache | alter table test_repair_cache enable keys | +-------+---------+-----------+------+---------+------+----------------------+-------------------------------------------+ +-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+ | 23740 | testing | localhost | test | Query | 4 | Repair with 2 threads | alter table test_repair_cache enable keys | +-------+---------+-----------+------+---------+------+-----------------------+-------------------------------------------+ +-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+ | Id | User | Host | db | Command | Time | State | Info | +-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+ | 23743 | testing | localhost | test | Query | 5 | Repair by sorting | alter table test_repair_cache enable keys | +-------+---------+-----------+------+---------+------+-------------------+-------------------------------------------+ |
Related posts:
11 Comments





myisam_repair_threads has numerous bugs over the years: #37756 (full text), 47444 (truncated table), etc…
It is a footgun.
- Rob
Comment :: January 4, 2010 @ 5:14 am
Yes, and at times, parallel repair takes more time than the sequential one
Comment :: January 4, 2010 @ 8:55 am
Nice article Venu, thanks for explaining how the index rebuild works, its always confusing by looking at the thread state. Good to know that ‘Repair with keycache’ should never appear.
Keep blogging more, I see no updates from you lately.
Comment :: January 4, 2010 @ 11:42 am
good article
in this formula (2000000 * (255 * 3) +
you want to know index size, is it right?
can we take the size from information_schema.index_length i think its easier when the table have multi index?
Comment :: January 5, 2010 @ 12:21 am
Comment :: January 5, 2010 @ 12:22 am
Not really as you are not looking for the total index size here.. also; I never use Information_Schema.tables/columns as its very expensive especially when you have more tables in the server.
Comment :: January 5, 2010 @ 12:25 pm
Venu
This is good piece of work. However, even after checking everything you mentioned, I still get “repair by keycache” occasionally. When I say “occasionally”, I mean that it works most of the time (“repair by sorting”), yet every once in a while, it still falls into “repair by keycache” without reason. When that does happen, the myisam_max_sort_file_size shows plenty of size (“9223372036854775807″), and the /tmp directory is not full.
What else am I missing?
Thanks,
Bing
Comment :: July 9, 2010 @ 2:26 pm
Bing
Unless you give an example; it is hard. Contact me at http://venublog.com/contact/
Comment :: July 15, 2010 @ 11:19 pm
I figured it out with help from others http://forums.mysql.com/read.php?21,375635,375939#msg-375939
It is because myisam_sort_buffer_size was not set properly. Apparently, there is one more thing you need to take into consideration when Repairing a MyISAM.
Thanks,
Bing
Comment :: July 16, 2010 @ 7:25 am
[...] พังเดี้ยงได้ อาจะเพิ่มเติมได้ที่ ลิงค์นี้ระหว่าง config มั่วๆถ้าไม่ set connection timeout [...]
Pingback :: September 16, 2011 @ 9:44 pm
[...] По этому поводу, есть хороше сранение тут: Performance comparison of Repair by Sorting or by Keycache [...]
Pingback :: April 11, 2012 @ 12:14 am