Wednesday, April 18, 2012

Copying unused bytes is bad (duh!)

Last summer my colleague Marko Mäkelä committed this seemingly innocent performance fix for InnoDB in MySQL 5.6:

3581 Marko Makela    2011-08-10
Bug#12835650 VARCHAR maximum length performance impact

row_sel_field_store_in_mysql_format(): Do not pad the unused part of
the buffer reserved for a True VARCHAR column (introduced in 5.0.3).
Add Valgrind instrumentation ensuring that the unused part will be
flagged uninitialized.

Before this fix, the buffers used to send VARCHARs from InnoDB to the MySQL server were padded with 0s if the string was shorter than the maximum length of the column. If, e.g., the string "foo" was stored in a VARCHAR(8), InnoDB used to write "3foo00000" to the buffer (the first character, 3, gives the actual length of the string). Even though these trailing bytes are never used anywhere, writing the 0s certainly has a cost. Hence the fix, which stops InnoDB from writing them.

Oh my, were we in for a ride! All of a sudden Valgrind started barking like this (and similar) for a number of tests:
Thread 19:
Syscall param pwrite64(buf) points to uninitialised byte(s)
   at 0x381C60EEE3: ??? (in /lib64/libpthread-2.12.so)
   by 0x84B703: my_pwrite (my_pread.c:162)
   by 0x86FE6F: mi_nommap_pwrite (mysql_file.h:1203)
   by 0x88621B: _mi_write_static_record (mi_statrec.c:65)
   by 0x889592: mi_write (mi_write.c:142)
   by 0x532085: handler::ha_write_row(unsigned char*) (handler.cc:6043)
   by 0x69B7F7: end_write(JOIN*, st_join_table*, bool) (sql_select.cc:20237)
   by 0x69A43C: evaluate_join_record(JOIN*, st_join_table*, int) (sql_select.cc:19187)
   by 0x69A8E0: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:19294)

The modifications we made to the server to remedy these problems fall into two categories, as described below. In the end, the result was significantly improved performance.

Friday, April 13, 2012

New MySQL optimizer team page launched

FYI: A MySQL Optimizer Team blog aggregate page has been launched. It will be continuously updated with the latest posts published by the MySQL optimizer developers. Now you can easily stay up to date ;-)

Tuesday, April 10, 2012

Improvements for many-table joins in MySQL 5.6

A lot has happened in MySQL 5.6 for queries joining many tables. For the most common use cases we have drastically reduced the cost of finding the execution plan. We have also improved the heuristics and fixed bugs, so the final plan is often better than it used to be. Read on if you are one of those people who do 15-way joins!

Finding a query execution plan
First, some background. You can skip this part if you already know how MySQL picks the table join order in 5.5.

When presented with a query, MySQL tries to find the best order to join tables by employing a greedy search algorithm. The outcome is what we call a query execution plan (QEP). When you join just a few tables, there is no problem calculating the cost of all join order combinations and then picking the best plan. However, since there are (#tables)! possible combinations, the cost of calculating them all soon becomes too high: for five tables, e.g., there are 120 combinations, which is no problem to compute. For 10 tables there are 3.6 million combinations, and for 15 tables there are 1,307 billion. You get the picture. For this reason, MySQL makes a trade-off: it uses heuristics to explore only the promising plans. This is supposed to significantly reduce the number of plans MySQL needs to calculate, but at the same time you risk not finding the best one.
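
For reference, the breadth of this greedy search can be tuned through two existing system variables, optimizer_search_depth and optimizer_prune_level. A minimal session-level sketch (the values are chosen only for illustration):

-- Let the server pick a reasonable look-ahead depth automatically (0 = automatic).
SET SESSION optimizer_search_depth = 0;
-- Keep heuristic pruning of unpromising partial plans enabled (1 = on, the default).
SET SESSION optimizer_prune_level = 1;
-- Turning pruning off forces a much more exhaustive search; with many tables this
-- can make finding the plan far more expensive than executing the query.
-- SET SESSION optimizer_prune_level = 0;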

On queries with many values in the IN clause

A few customers with rather extreme needs have contacted us about a performance issue with the range optimizer. Our solution to the problem is to introduce a new variable in MySQL 5.6, eq_range_index_dive_limit, which can be used to control whether or not the range optimizer will a) do index dives, or b) use index statistics when estimating the number of rows in the ranges of the query. The former method gives a far more accurate estimate while the latter costs a lot less to compute.

This is what the help text has to say about the variable:

The optimizer will use existing index statistics instead of doing index dives for equality ranges if the number of equality ranges for the index is larger than or equal to [the value of variable]. If set to 0, index dives are always used.
"Equality range" means predicates using operators IN() or =, and it's important to notice that the number of such ranges is counted on a per index basis. Furthermore, index dives are not used on unique/primary indexes, so only non-unique indexes are affected. 

The two ways row estimates can be computed

For as long as there has been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and then counting the number of rows between them. This technique is accurate, and is therefore a good basis for making the best possible execution plan (QEP) for the query. Unfortunately, it involves two index dives for each range. That's not a big deal if you run normal queries with a few equality ranges like

SELECT title FROM albums WHERE artist IN (10, 12)
But some MySQL users have queries with hundreds of values in the IN clause. For such queries it may take considerably longer to optimize the query than to execute it. This is when it makes sense to use the less accurate index statistics. Consider this example: