Wednesday, April 18, 2012

Copying unused bytes is bad (duh!)

Last summer my colleague Marko Mäkelä committed this seemingly innocent performance fix for InnoDB in MySQL 5.6:

3581 Marko Makela    2011-08-10
Bug#12835650 VARCHAR maximum length performance impact

row_sel_field_store_in_mysql_format(): Do not pad the unused part of
the buffer reserved for a True VARCHAR column (introduced in 5.0.3).
Add Valgrind instrumentation ensuring that the unused part will be
flagged uninitialized.

Before this, buffers which were used to send VARCHARs from InnoDB to the MySQL server were padded with 0s if the string was shorter than specified by the column. If, e.g., the string "foo" was stored in a VARCHAR(8), InnoDB used to write "3foo00000" to the buffer (the first character - 3 - determines the actual length of the string). However, even though these trailing bytes are not used anywhere, writing 0s to the buffer certainly has a cost. Hence the fix which stops InnoDB from writing them.

Oh my were we up for a ride! All of a sudden Valgrind started barking like this (and similar) for a number of tests:
Thread 19:
Syscall param pwrite64(buf) points to uninitialised byte(s)
at 0x381C60EEE3: ??? (in /lib64/libpthread-2.12.so)
by 0x84B703: my_pwrite (my_pread.c:162) by 0x86FE6F: mi_nommap_pwrite (mysql_file.h:1203)
   by 0x88621B: _mi_write_static_record (mi_statrec.c:65)
   by 0x889592: mi_write (mi_write.c:142)
   by 0x532085: handler::ha_write_row(unsigned char*) (handler.cc:6043)
   by 0x69B7F7: end_write(JOIN*, st_join_table*, bool) (sql_select.cc:20237)
   by 0x69A43C: evaluate_join_record(JOIN*, st_join_table*, int)
(sql_select.cc:19187)
   by 0x69A8E0: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:19294)
The modifications we did to the server to remedy the problems can be divided into two categories as described below. In the end, the result was significantly improved performance.


What has been done 1: Don't copy trailing VARCHAR bytes
The MySQL server used to copy full-length VARCHARs in various places in the code. We've patched the server to employ the same strategy as InnoDB: don't copy unused parts of VARCHARs. The following are examples of such fixes:
Bug#12997905: VALGRIND: SYSCALL PARAM PWRITE64(BUF)
                           POINTS TO UNINITIALISED BYTE(S)
(...)
The fix is to reset the record buffer for MyISAM temporary
tables in addition to restricting the copying of fields from
one table to another to the bytes actually used by the varchar.

Bonus: Restricting Copy_field to copy only used varchar bytes
instead of the full column length also has a performance
benefit.

Bug#13389854 - VALGRIND ERROR IN _MI_WRITE_BLOB_RECORD
(...)
In this bug, trailing and unitialized space from a VARCHAR was
copied from an InnoDB fulltext index and stored in a MyISAM
temp table. This made valgrind bark.

The fix is to restrict copying of VARCHARs in field_conv() to
only the bytes actually used.

All these got into MySQL 5.6 in 2011.

What has been done 2: Enable MyISAM dynamic format for temp tables
Temporary tables in MySQL are either stored in memory or, if they grow too big, in MyISAM. For MyISAM there are a few different record formats, most notably "fixed" and "dynamic". One difference is that with the fixed format VARCHARs are stored in full length while with the dynamic format only the used bytes are stored.

One would expect that a temporary table used for big VARCHARs would pick the dynamic format. To our surprise that was not the case; there was a legacy bug that prevented temporary tables ("normal" tables not affected) from using this format at all! This has been fixed.

Performance impact
After fixing all issues mentioned above I created a test to illustrate the performance improvements :

CREATE TABLE t1 (
  uniqe int,
  col1 varchar(10),
  col2 varchar(2048)
);
/* insert 4M rows */
SELECT col1 FROM t1 GROUP BY col2, uniqe LIMIT 1;


I used three different data scenarios:
  • all rows had a string of size 1 in col2 ("empty")
  • all rows had a string of size 1024 in col2 ("half-full")
  • all rows had a string of size 2048 in col2 ("full")
I then ran the test on MySQL versions
  • 5.5 - without any of the described fixes
  • 5.6 from January 2012 - containing the "don't copy unused bytes" fixes but not the record format fix
  • 5.6.5 - containing all described fixes
The results speak for themselves:

Data scenario Response time (s) Temp table size (GB)
5.5 5.6 Jan 5.6.5 5.5 5.6 Jan 5.6.5
"empty" 342 327 147 8.1 8.1 0.08
"half-full" 469 439 290 8.1 8.1 4.1
"full" no difference


That's not half bad for a non-functional change that doesn't require the user to do anything. Admittedly it isn't realistic to assume anyone has VARCHAR(2048) columns with only a single character in all rows, but about half-full VARCHARs are fairly realistic.

All in all a good batch of changes to MySQL 5.6.

5 comments:

  1. I disagree -- it's completely realistic to assume people have long VARCHAR columns with a single character :-) I've seen a lot of things like this in the wild! The rule is "if it's possible, someone will do it."

    The more usual case will be VARCHAR(255) with something like a firstname stored in it. That should still be a decent improvement, given that most names are 4-10 characters or so.

    Thanks for fixing this!

    ReplyDelete
  2. Hi Baron, thanks for commenting. It's interesting to get the opinions of people with real field experience :-)

    ReplyDelete
  3. Any specific reason why all this good stuff is not aggregated on planet.mysql.com?

    ReplyDelete
  4. @Hartmut: It is aggregated both on planet.mysql.com and mysqloptimizerteam.blogspot.com. You must have missed it :-)

    ReplyDelete
  5. Uh ... it was there indeed, but it's not in my feed readers history ... weird, but most likely a local problem indeed ...

    ReplyDelete