This is a continuation post on InnoDB plugin row format performance with few more test scenarios to cover based on the feedback and questions that I got in my Inbox.

This time the initial table looks like this

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL,
  `k` int(10) unsigned NOT NULL DEFAULT ‘0′,
  `c` char(120) NOT NULL DEFAULT ,
  `pad` char(60) NOT NULL DEFAULT ,
  PRIMARY KEY (`id`),
) ENGINE=InnoDB ROW_FORMAT=COMPRESSED;

This time, the table does not have a secondary index in the create time, instead it will be added after the table is populated with the data to test the performance of the new faster index support from the plugin. I also added a new scenario to test 2K compression as we noticed 4K compression seems to be a bottleneck in the earlier case.

Same test environment is chosen as of the first one, RHEL-4 64-Bit having MySQL-5.1.24 with InnoDB plugin 1.0 statically linked.

InnoDB configuration variables:

   1: mysql> show variables like ‘%innodb%’;
   2: +———————————+————————–+
   3: | Variable_name                   | Value                    |
   4: +———————————+————————–+
   5: | have_innodb                     | YES                      |
   6: | innodb_adaptive_hash_index      | ON                       |
   7: | innodb_additional_mem_pool_size | 104857600                |
   8: | innodb_autoextend_increment     | 8                        |
   9: | innodb_autoinc_lock_mode        | 1                        |
  10: | innodb_buffer_pool_size         | 6442450944               |
  11: | innodb_checksums                | ON                       |
  12: | innodb_commit_concurrency       | 0                        |
  13: | innodb_concurrency_tickets      | 500                      |
  14: | innodb_data_file_path           | ibdata1:256M:autoextend  |
  15: | innodb_data_home_dir            | /mysql/ibdata |
  16: | innodb_doublewrite              | ON                       |
  17: | innodb_fast_shutdown            | 1                        |
  18: | innodb_file_format              | Barracuda                |
  19: | innodb_file_io_threads          | 4                        |
  20: | innodb_file_per_table           | ON                       |
  21: | innodb_flush_log_at_trx_commit  | 1                        |
  22: | innodb_flush_method             | O_DIRECT                 |
  23: | innodb_force_recovery           | 0                        |
  24: | innodb_lock_wait_timeout        | 50                       |
  25: | innodb_locks_unsafe_for_binlog  | OFF                      |
  26: | innodb_log_buffer_size          | 8388608                  |
  27: | innodb_log_file_size            | 268435456                |
  28: | innodb_log_files_in_group       | 2                        |
  29: | innodb_log_group_home_dir       | /mysql/iblog  |
  30: | innodb_max_dirty_pages_pct      | 90                       |
  31: | innodb_max_purge_lag            | 0                        |
  32: | innodb_mirrored_log_groups      | 1                        |
  33: | innodb_open_files               | 300                      |
  34: | innodb_replication_delay        | 0                        |
  35: | innodb_rollback_on_timeout      | OFF                      |
  36: | innodb_strict_mode              | OFF                      |
  37: | innodb_support_xa               | ON                       |
  38: | innodb_sync_spin_loops          | 20                       |
  39: | innodb_table_locks              | ON                       |
  40: | innodb_thread_concurrency       | 32                       |
  41: | innodb_thread_sleep_delay       | 10000                    |
  42: +———————————+————————–+
  43: 37 rows in set (0.00 sec)

Normally I disable the following InnoDB configuration options when running any benchmark tests; but this time enabled all of them (including in the first case)…

innodb_checksums
innodb_doublewrite
innodb_support_xa

Table Load:

Load time from a dump of SQL script having 10M rows (not batched)

  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
6G buffer pool 27m 34s 28m 25s 29m 37s 32m 49s 27m 23s

Secondary Index Creation:

Secondary Index create time after the table has been populated. The index added using

CREATE INDEX k ON sbtest(k)
 
  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
6G buffer pool 0m 51s 1m 23s 7m 0s 7m 42s 0m 51s

File Sizes:

Here is the size of the .ibd file after each data load

  Compact Compressed (8K) Compressed (4K) Compressed (2K) Dynamic
Before Secondary Index 2.2G 1.1G 552M 284M 2.2G
After Secondary Index 2.3G 1.2G 592M 324M 2.3G

Testing:

I used the sysbench to run the tests by varying the threads to run mixed read/write on a  pre-populated table by creating the table with different ROW_FORMATS along with KEY_BLOCK_SIZE. The server is taken down by cleaning its logs and data directories completely and a fresh server is started back to re-run a different row-format .

Here is the sysbench options:

--test=oltp --db-driver=mysql --oltp-table-size=10000000
–oltp-skip-trx=on –oltp-auto-inc=off –mysql-table-engine=innodb
–mysql-user=root –mysql-ignore-duplicates=onmax-requests=100000
–init-rng=on –oltp-test-mode=complex –oltp-dist-type=uniform
–mysql-engine-trx=yes –oltp-read-only=off
–num-threads=XXX run

Some variables are non-relevant to test, but I use the same scripts to run lot of other cases, –mysql-ignore-duplicates is not a standard sysbench option, its a patch that I created to make sure the test does not bail on duplicate key errors due to poor random number generator.

The test typically executes 950000 queries in each iteration (700K read queries and 250K write queries) with max of 50000 transactions in this test and 100K transactions in the part-I case. Here is the typical output from sysbench:

LTP test statistics:
    queries performed:
        read:                            700000
        write:                           250000
        other:                           0
        total:                           950000
    transactions:                        50000  (206.00 per sec.)
    deadlocks:                           0      (0.00 per sec.)
    read/write requests:                 950000 (3914.01 per sec.)
    other operations:                    0      (0.00 per sec.)

Test execution summary:
    total time:                          242.7179s
    total number of events:              50000
    total time taken by event execution: 123984.2189
    per-request statistics:
         min:                            0.2133s
         avg:                            2.4797s
         max:                            15.6503s
         approx.  95 percentile:         4.0491s

Performance:

Here is the performance of various row formats with threads ranging from 1 to 512 for 6G buffer pool size for both concurrent reads and writes.

compress6g-new

 Observations:

  • The load time is almost same for both Compact and Dynamic row formats. In this table and data scenario, both perform same amount of transactions (as you can see from the graph, they overlap)
  • The load/insert time increases based on the compression ratio from 8K, 4K, 2K, 1K etc
  • When the whole data set can fit in memory; then compression will relatively slowdown the transaction rate
  • When the whole data set can not fit in memory, then compression on the table seems to be a right choice (8K in this case)
  • Creation of secondary index still takes lot of time when the compression is enabled. The 8K compression is much better and bit close to Compact/Dynamic; but 4K/2K seems to take ~6 times more time than 8K
  • The best compression in terms of performance is 8K; but it may vary based on the table structure and data and its distribution.
  • Part-I observations can be found from here

Looks like there is something wrong in terms of performance of 4K/2K compression when compared with 8K when the data set can fit everything in memory (as we have 6G buffer pool). I will dig further and see if we can find a solution for the actual problem or file a bug report..more later.

Convert from one row format to another:

Alter table seems to be a right choice to change from one row format to another one or to change the compression ratio and it is efficient. For example, here is a case where the table is converted from compressed to compact and compact to Compressed and InnoDB re-creates both .frm and .ibd by using temp files. Even here you can see that 4K takes much longer time than expected.

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.48 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=8;
Query OK, 10000000 rows affected (4 min 11.56 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=4;
Query OK, 10000000 rows affected (20 min 6.36 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> alter table sbtest ROW_FORMAT=Compact;
Query OK, 10000000 rows affected, 1 warning (2 min 56.61 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
+———+——+—————————————————————–+
| Level   | Code | Message                                                         |
+———+——+—————————————————————–+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+———+——+—————————————————————–+
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Dynamic;
Query OK, 10000000 rows affected, 1 warning (2 min 42.12 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> show warnings;
+———+——+—————————————————————–+
| Level   | Code | Message                                                         |
+———+——+—————————————————————–+
| Warning | 1478 | InnoDB: ignoring KEY_BLOCK_SIZE=4 unless ROW_FORMAT=COMPRESSED. |
+———+——+—————————————————————–+
1 row in set (0.00 sec)

mysql> alter table sbtest ROW_FORMAT=Compressed KEY_BLOCK_SIZE=2;
Query OK, 10000000 rows affected (20 min 53.76 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

Nice to have:

It will be nice if InnoDB team can extend the SHOW TABLE STATUS to show the compression details. Right now the Information schema.cmp will only lists general compression statistics but not on per table basis.

Also, it will be added advantage if they can support the following two config variables:

innodb_default_row_format (default compact or dynamic) and innodb_default_key_block_size (optionally, and make sure it does not raise a warning on default row_format)