- is not used when it should have been
- is used when ref access is obviously better
- merges suboptimal indexes
- is too restricted in which conditions can be used
The good news is that we have fixed all these issues in the latest released MySQL 5.6 server.
Consider the problem in BUG#65274 (simplified a bit):
CREATE TABLE phpbb_posts (
...
KEY `topic_id` (`topic_id`),
KEY `forum_id` (`forum_id`)
KEY `tid_fid` (`topic_id`,`forum_id`),
);
SELECT count(*) FROM phpbb_posts WHERE topic_id = 110126 AND forum_id = 19;
Before, MySQL would do an intersecting index merge between the indexes 'topic_id' and 'forum_id' even though 'ref' access using index 'tid_fid' is obviously better. The main problem was the order indexes were attempted added to the merge: MySQL used to start with 'topic_id' and 'forum_id' and, since all required fields were covered, would settle for this merge.
In 5.6.7 we have changed the order in which potential indexes are added to an intersection merge. Simply put, we now try to merge as few indexes as possible by preferring those that cover the most fields that are part of the condition. Thus, the index merge now starts out with the 'tid_fid' index. Since one index is enough to evaluate all conditions in this query, there will not be any index merge at all and 'ref' access is chosen instead.
Another improvement in 5.6.7 removes a great deal of restrictions to which conditions can be used for index merge union. Consider the following example, in which we want to count the biggest (in terms of value or quantity) orders through all times and in 2012, respectively. Note that the data is from an imagined startup, so only 5% of the orders are from before 2012. A session from MySQL 5.5.27:
mysql> CREATE TABLE orders (
-> order_id INT PRIMARY KEY AUTO_INCREMENT,
-> cust_id INT,
-> order_date date,
-> no_products INT,
-> value INT,
-> KEY ix_cust (cust_id),
-> KEY ix_odate (order_date),
-> KEY ix_noprod (no_products),
-> KEY ix_val (value)
-> ) ENGINE = InnoDB;
mysql> /* insert 130k rows /*
mysql> EXPLAIN SELECT COUNT(order_id) FROM orders
-> WHERE (no_products > 150 OR value > 1100)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: orders
type: index_merge
possible_keys: ix_noprod,ix_val
key: ix_noprod,ix_val
key_len: 5,5
ref: NULL
rows: 9052
Extra: Using sort_union(ix_noprod,ix_val); Using where
1 row in set (0.00 sec)
mysql> EXPLAIN SELECT COUNT(order_id) FROM orders
-> WHERE (no_products > 150 OR value > 1100) AND order_date >= '2012-01-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: orders
type: ALL
possible_keys: ix_odate,ix_noprod,ix_val
key: NULL
key_len: NULL
ref: NULL
rows: 131649
Extra: Using where
1 row in set (0.00 sec)
See how MySQL switches from index_merge union to table scan, and how this affects the rows estimate? This is obviously a bad choice and was due to a simplification in the range optimizer that just ignored possible index merge unions if it was possible to do a simple range scan instead (in this case a range scan using "order_date >= '2012-01-01'").
Note that this was a well documented restriction: "If a range scan is possible on some key, the optimizer will not consider using Index Merge Union or Index Merge Sort-Union algorithms."
Now let's see what MySQL 5.6.7 does. The query counting the biggest orders through all times choses the same execution plan as 5.5.27 so it's left out:
mysql> EXPLAIN SELECT COUNT(order_id) FROM orders
-> WHERE (no_products > 150 OR value > 1100) AND order_date >= '2012-01-01'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: orders
type: index_merge
possible_keys: ix_odate,ix_noprod,ix_val
key: ix_noprod,ix_val
key_len: 5,5
ref: NULL
rows: 8952
Extra: Using sort_union(ix_noprod,ix_val); Using where
1 row in set (0.01 sec)
As you can see, MySQL is now able to recognize that union index merge is the best way to access this table.
PS: it is still possible to construct test cases with complex conditions that in theory could and should be handled by index merge but is not. Please file bugs if you have examples of real use cases where this applies - this gives me an excuse to further improve the range optimizer. For now, we have raised the bar significantly and can (as far as I know) handle the conditions we have bug reports on.
Here's an example:
ReplyDeletefor the table t0 from mysql-test/include/index_merge1.inc:
mysql> explain select * from t0 where (key1 < 2 or key2 < 4) and (key3 > 10) or (key1 < 1 or key2 < 3) and (key3 < 20);
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref
| rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
| 1 | SIMPLE | t0 | ALL | i1,i2,i3 | NULL | NULL | NULL | 1024 | Using
where |
+----+-------------+-------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)
(This is the output from mysql-5.6.7-rc-linux2.6-x86_64.tar.gz. BTW
could you update 5.6 on launchpad, please?)
Index merge is not chosen here though it's much cheaper than the table scan.
Also almost any test case with multi-component indexes
from lp:~maria-captains/maria/5.3/mysql-test/t/range_vs_index_merge.test
fails to choose an index merge.
I'm afraid the problem cannot be resolved with such simple tweaks in opt_range.cc as you used. They could be enough to fix all the reported
bugs you mentioned though.
Igor Babaev
igor@askmonty.org
Igor,
DeleteThanks for the example. As I wrote above I am aware of and able to write examples of such complex conditions myself. What I wanted to know was if there are users with real problems that would need even more enhancements in this area. If not, I think you agree with me that a developer's time is best spent on enhancements that users care about.
I figured the MySQL community would be eager to provide this feedback, but it is better to do so through bug reports.
Jørgen.
DeleteLet's take the query from
http://bugs.mysql.com/bug.php?id=17673
We have for mysql-5.6.7
mysql> explain select * from tmerge2 where ((c2 = 1 and c3 = 1) or (c4 =2 and c5 = 1)) and cp = 1\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tmerge2
type: index_merge
possible_keys: k1,k2
key: k1,k2
key_len: 12,12
ref: NULL
rows: 2
Extra: Using sort_union(k1,k2); Using where
and
mysql> explain select * from tmerge2 where (c2 = 1 and c3 = 1 and cp=1) or (c4=2 and c5=1 and cp=1)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tmerge2
type: index_merge
possible_keys: k1,k2
key: k1,k2
key_len: 14,14
ref: NULL
rows: 2
Extra: Using sort_union(k1,k2); Using where
See that in the second case the index merge uses 3 components for each merged index, while in the first case
only two major components
(key_len: 14,14 and key_len: 12,12 respectively)
Igor,
DeleteI agree that this should be improved as well so I just made a fix. I'll brush the patch up as soon as I get the time for it :-)