tag:blogger.com,1999:blog-71714774571291878732024-03-05T07:53:37.594+01:00Jørgen's point of viewMostly about the MySQL OptimizerJørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.comBlogger19125tag:blogger.com,1999:blog-7171477457129187873.post-37796305897595095862014-03-28T13:59:00.000+01:002014-03-28T15:17:12.884+01:00The range access method and why you should use EXPLAIN JSONI got an interesting question about <span style="font-size: small;"><span style="font-family: "Courier New",Courier,monospace;">EXPLAIN</span></span> and the range access method recently. The person had a query that could be written either with a BETWEEN predicate or an IN predicate, something similar to this:
<br>
<div style="background: none repeat scroll 0% 0% black; color: white; height: 100%; line-height: 100%; overflow: auto; padding: 10px; width: 100%;">
<pre><span style="font-size: xx-small;">mysql> EXPLAIN SELECT *
-> FROM orders WHERE customer_id BETWEEN 7 AND 10 AND value > 500;
+----+-------------+--------+-------+----------+----------+------+------
| id | select_type | table | type | key | key_len | rows | Extra
+----+-------------+--------+-------+----------+----------+------+------
| 1 | SIMPLE | orders | range | cust_val | 10 | 91 | ...
+----+-------------+--------+-------+----------+----------+------+------
mysql> EXPLAIN SELECT *
-> FROM orders WHERE customer_id IN (7,8,9,10) AND value > 500;
+----+-------------+--------+-------+----------+----------+------+------
| id | select_type | table | type | key | key_len | rows | Extra
+----+-------------+--------+-------+----------+----------+------+------
| 1 | SIMPLE | orders | range | cust_val | 10 | 44 | ...
+----+-------------+--------+-------+----------+----------+------+------
</span></pre>
</div>
<br>
The table was:<br>
<br>
<span style="font-family: Verdana,sans-serif; font-size: xx-small;">CREATE TABLE orders (</span><br>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;"> order_id INT NOT NULL PRIMARY KEY,</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;"> customer_id INT,</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;"> value INT,</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;"> order_date DATE,</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;"> KEY custid_value (customer_id, value)</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: xx-small;">)</span></div>
<br>
Given that <span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">customer_id</span></span> is an integer value, these queries should be equivalent. And that's what <span style="font-family: "Courier New",Courier,monospace;">EXPLAIN</span> seems to tell us too: same access method, same key, same key_length etc. <br>
<br>
You may have guessed the question by now: If the queries are equivalent, why isn't the rows estimate identical? Is one of the numbers wrong or do these queries have different execution plans after all?<br>
<br>
<a href="http://jorgenloland.blogspot.com/2014/03/the-range-access-method-and-why-you.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com4tag:blogger.com,1999:blog-7171477457129187873.post-50311484082594429182013-10-21T13:11:00.000+02:002013-10-21T13:12:41.107+02:00FAQ: InnoDB extended secondary keysMySQL 5.6 introduced a new feature called <i>extended secondary keys</i>. We get a lot of questions about it and find that most of them come from a few incorrect assumption. In this post I'll try to get rid of the confusion once and for all. Famous last words... here goes:<br>
<b><br></b>
<b>Q1: Do I need to do anything to enable extended secondary keys?</b><br>
<br>
No, nothing at all. It's on by default and I can't see any sensible reason why you would want to disable it. However, it is possible to disable it by tuning the optimizer_switch: <span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">SET optimizer_switch='use_index_extensions={on|off}'.</span></span><br>
<br>
<b>Q2: Does extended secondary keys only work with InnoDB?</b><br>
<br>
No, it should work with any storage engine that uses the primary key columns as reference to the row, which means most storage engines with clustered primary keys. I say "should" because it requires a minimum of work from the storage engine provider; it must announce to MySQL that it is supported.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2013/10/faq-innodb-extended-secondary-indexes.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com4tag:blogger.com,1999:blog-7171477457129187873.post-21403147681640387002013-10-18T13:02:00.000+02:002013-10-20T07:53:29.756+02:00The second MySQL seminar: a summaryOnce again, Geir Høydalsvik and I had the pleasure of hosting a MySQL mini-seminar in Trondheim. 25+ attendants from at least 7 different companies and a few professors from the <a href="http://www.idi.ntnu.no//" target="_blank">computer science dept.</a> at <a href="http://www.ntnu.edu/" target="_blank">NTNU</a> showed up on yesterdays event. I recognized many of these from the first seminar but there were some new faces as well.<br />
<br />
This time, <a href="http://www.markleith.co.uk/" target="_blank">Mark Leith</a> came on a visit from the UK. He gave an introduction to Performance Schema and ps_helper. ps_helper is a really nice tool to make sense of the overwhelming amount of data collected by PS. He also gave a very convincing demo of MySQL Enterprise Monitor (MEM). More than a few attendants now plan to give MEM a try in their environment. You can too - there's a 30 day trial, which should be more than enough to decide if you need it :-)<br />
<br />
Like last time, we had great discussions, pizza and a very good time. I'm looking forward to the next seminar already.<br />
<br />
If you're interested, Mark's PS presentation can be found on <a href="http://www.slideshare.net/Leithal/performance-schema-andpshelper" target="_blank">slideshare</a>. ps_helper is both on <a href="http://github.com/MarkLeith/dbahelper" target="_blank">github</a> and on his <a href="http://www.markleith.co.uk/ps_helper/" target="_blank">blog</a>. ps_helper comes highly recommended!Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-47992410061610656362013-09-13T14:21:00.003+02:002013-09-16T09:49:38.843+02:00Next MySQL mini-seminar in Trondheim October 17We have the pleasure to announce that the next MySQL mini-seminar in Trondheim will be held on October 17 15:00-18:00. This time, MySQL Monitoring and performance schema evangelist Mark Leith will visit us. The agenda will be similar to last mini-seminar:<br />
<br />
<ul>
<li>Presentation: <span class="bodycopy"><span class="bodycopy">MySQL monitoring and performance Schema,</span></span> by Mark Leith</li>
<li>Q&A, discussions</li>
<li>Suggestions for presentations on future seminars</li>
<li>Food and mingling</li>
</ul>
For more info, see the <a href="http://www.oracle.com/webapps/events/ns/EventsDetail.jsp?p_eventId=174804&src=7866894&src=7866894&Act=75" target="_blank">registration page</a>. <br />
<br />
Don't miss this opportunity to meet experts, developers and other MySQL users. Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-72511651610737807262013-06-20T11:16:00.000+02:002013-06-20T11:16:03.870+02:00The first MySQL mini-seminar in Trondheim was a huge success!Yesterday, Geir Høydalsvik and I had the pleasure of hosting a MySQL mini-seminar in Trondheim, Norway.<br />
<br />
The topic of the day was a presentation by yours truly on how the MySQL optimizer works. Geir briefly explained how the MySQL teams are organized in Oracle, and that our focus is on delivering high quality on time.<br />
<br />
We had lots of interesting questions and discussions (and pizza) afterwards. Of particular interest was:<br />
<ul>
<li>How the MySQL code base can be modularized to make it maintainable and testable. The takeaway was that MySQL has invested a lot on refactoring the last couple of years to improve in this area and will continue to do so in both the near and far future.</li>
<li>How testing is done in MySQL. The answer was that the QA teams have been significantly ramped up since Sun acquired MySQL. In addition to much more resources to the QA teams, developers are now expected to write unit tests using the Google unit test framework for new functionality whenever it makes sense. Refactoring and modularization of the code makes this a lot easier to do now.</li>
</ul>
<br />
Close to 20 MySQL users attended in addition to a handful of MySQL hackers located in the Trondheim office. The users had very different backgrounds: from start-ups to solid established companies, and deployment on MySQL 5.1, 5.5 and 5.6. All agreed that the mini-seminar was a great initiative and that it should not be a one-time event. Yesterday's event will therefore not be <i>the</i> mini-seminar, but rather <i>the first in a series</i>.<br />
<br />
We'll get back to you with agenda and date for the next mini-seminar once the details have been decided.Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-42598857881515458392013-02-13T09:18:00.002+01:002013-02-13T12:30:06.170+01:00DBT-3 Q3: 6 x performance in MySQL 5.6.10When MySQL gets a query, it is the job of the optimizer to find the cheapest way to execute that query. Decisions include access method (range access, table scan, index lookup etc), join order, sorting strategy etc. If we simplify a bit, the optimizer first identifies the different ways to access each table and calculate their cost. After that, the join order is decided.<br>
<br>
However, some access methods can only be considered after the join order has been decided and therefore gets special treatment in the MySQL optimizer. For join conditions, e.g. <span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">"WHERE table1.col1 = table2.col2"</span></span>, index lookup can only be used in table2 if table1 is earlier in the join sequence. Another class of access methods is only meaningful for tables that are first in the join order. An example is queries with <span style="font-family: "Courier New",Courier,monospace;"><span style="font-size: x-small;">ORDER BY ... LIMIT</span></span>. Prior to MySQL 5.6.10 there was a bug in MySQL that made the optimizer choose inefficient execution plans for this query type. That's the topic I discuss below.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2013/02/dbt-3-q3-6-x-performance-in-mysql-5610.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com6tag:blogger.com,1999:blog-7171477457129187873.post-54518544469636674482012-12-20T08:38:00.000+01:002012-12-20T08:41:05.753+01:00Gluh on InnoDB extended secondary keys<div class="post-title entry-title" itemprop="name">
Sergey Glukhov (Gluh) recently wrote an interesting blog about InnoDB secondary key improvements in MySQL 5.6. His blog isn't aggregated to planet.mysql.com but certainly deserves some attention. </div>
<div class="post-title entry-title" itemprop="name">
<br /></div>
<div class="post-title entry-title" itemprop="name">
Here it is: <span style="font-size: small;"><a href="http://glukhsv.blogspot.co.uk/2012/12/innodb-extended-secondary-keys.html" target="_blank">InnoDB, extended secondary keys</a>.
</span></div>
Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-9142256718497892592012-12-17T13:55:00.001+01:002012-12-20T08:40:33.473+01:00Favorite MySQL 5.6 features: an optimizer perspectiveThere are so many exciting new features in MySQL 5.6 that I almost don't know where to start. To mention a few, MySQL's <a href="http://dimitrik.free.fr/blog/archives/2012/09/mysql-performance-welcome-56-rc-the-best-mysql-ever.html" target="_blank">multi-core scalability</a> has been significantly improved to meet modern hardware, InnoDB has better <a href="http://blogs.innodb.com/wp/2011/04/innodb-persistent-statistics-at-last/" target="_blank">index statistics</a>, much <a href="http://dev.mysql.com/doc/refman/5.6/en/innodb-performance.html" target="_blank">better performance</a>, and <a href="https://blogs.oracle.com/mysqlinnodb/entry/online_alter_table_in_mysql" target="_blank">online ALTER</a>, replication has <a href="http://d2-systems.blogspot.co.uk/2011/10/multi-threaded-slave-its-in.html" target="_blank">multi-threaded slaves</a> and <a href="http://d2-systems.blogspot.co.uk/2012/04/global-transaction-identifiers-are-in.html" target="_blank">global transaction identifiers</a>, <a href="http://www.markleith.co.uk/2012/07/13/monitoring-processes-with-performance-schema-in-mysql-5-6/" target="_blank">performance schema</a> has added capabilities to provide a much more detailed view of what's bogging a server down, and more... much more.<br>
<br>
However, my prime interest is the optimizer, which is why I've compiled a list of my favorite new optimizer features in 5.6. Here goes:<br>
<br>
<b>New ways to understand query plans:</b><br>
The most common requests from DBAs is to get more information to understand how and why MySQL behaves like it does. You can come a very long way at squeezing performance out of MySQL if you can answer questions like "What Query Execution Plan (QEP) did the server decide to use?" and "Why was this QEP picked?". In this department, MySQL 5.6 delivers a bunch of new features.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/12/favorite-mysql-56-features-optimizer.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com3tag:blogger.com,1999:blog-7171477457129187873.post-89050588233376568172012-10-02T10:32:00.001+02:002012-10-02T10:32:34.062+02:00Index merge annoyances fixed in MySQL 5.6While the <a href="http://dev.mysql.com/doc/refman/5.6/en/index-merge-optimization.html" target="_blank">index merge access types</a> certainly are useful for a number of queries, there has been some frustration expressed both from customers and the community about how it...<br>
<ol>
<li>is not used when it should have been</li>
<li>is used when ref access is obviously better</li>
<li>merges suboptimal indexes</li>
<li>is too restricted in which conditions can be used</li>
</ol>
I could come up with numerous examples of related bugs and feature requests dating back more than six years. To list a few: <a href="http://bugs.mysql.com/bug.php?id=17673" target="_blank">17673</a>, <a href="http://bugs.mysql.com/bug.php?id=30151" target="_blank">30151</a>, <a href="http://bugs.mysql.com/bug.php?id=23322" target="_blank">23322</a>, <a href="http://bugs.mysql.com/bug.php?id=65274" target="_blank">65274</a>, <a href="http://bugs.mysql.com/bug.php?id=65359" target="_blank">65359</a>.<br>
<br>
The good news is that we have fixed all these issues in the latest released MySQL 5.6 server.<br>
<br>
Consider the problem in <a href="http://bugs.mysql.com/bug.php?id=65274" target="_blank">BUG#65274</a> (simplified a bit):<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;"> </span></span><br>
<span style="font-size: x-small;"><span style="font-family: "Courier New",Courier,monospace;">CREATE TABLE phpbb_posts (<br> ...<br> KEY `topic_id` (`topic_id`),<br> KEY `forum_id` (`forum_id`)<br> KEY `tid_fid` (`topic_id`,`forum_id`),<br>);</span><br><span style="font-family: "Courier New",Courier,monospace;">SELECT count(*) FROM phpbb_posts WHERE topic_id = 110126 AND forum_id = 19;</span></span><br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/10/index-merge-annoyances-fixed-in-mysql-56.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com4tag:blogger.com,1999:blog-7171477457129187873.post-31387767436732228942012-05-04T08:27:00.001+02:002012-05-04T16:27:06.586+02:00Performance improvements for big INFORMATION_SCHEMA tablesA short while after <a href="http://jorgenloland.blogspot.co.uk/2012/04/copying-unused-bytes-is-bad-duh.html" target="_blank">I fixed the legacy bug</a> that prevented temporary
MyISAM tables from using the dynamic record format, I got an email from Davi
Arnaut @ Twitter. It turned out that Twitter needed to fix the very same
problem, but for the case when INFORMATION_SCHEMA temporary tables use
MyISAM.
<br>
<br>
In short, INFORMATION_SCHEMA tables provide access to database metadata.
Despite their name, they are more like views than tables: when you query
them, relevant data is gathered from the dictionary and other server
internals, not from tables. The gathered data is stored in a temporary table (memory or MyISAM depending on size) and then returned to the user.
<br>
<br>
The reason Davi emailed me was to let me know that he had further
improved the fix for temporary MyISAM tables to also enable the use of
dynamic record format for INFORMATION_SCHEMA tables. I usually don't
have huge databases on my development box so the problem of querying
metadata had gone unnoticed. But it turns out that Davi and his
colleagues at Twitter <i>do</i> deal with massive amounts of data :-)
<br>
<br>
This was one of the queries that caused problems:
<br>
<br>
<span style="font-family: "Courier New",Courier,monospace; font-size: x-small;">SELECT pool_id FROM INFORMATION_SCHEMA.INNODB_BUFFER_PAGE limit 1;
</span><br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/05/performance-improvements-for-big.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-44504818564443975222012-04-18T10:59:00.000+02:002012-04-24T10:52:34.740+02:00Copying unused bytes is bad (duh!)Last summer my colleague Marko Mäkelä committed this seemingly innocent performance fix for InnoDB in MySQL 5.6:<br>
<br>
<blockquote class="tr_bq">
<span style="font-size: x-small;">3581 Marko Makela 2011-08-10<br>Bug#12835650 VARCHAR maximum length performance impact<br><br>row_sel_field_store_in_mysql_format(): Do not pad the unused part of<br>the buffer reserved for a True VARCHAR column (introduced in 5.0.3).<br>Add Valgrind instrumentation ensuring that the unused part will be<br>flagged uninitialized.</span></blockquote>
<br>
Before this, buffers which were used to send VARCHARs from InnoDB to the MySQL server were padded with 0s if the string was shorter than specified by the column. If, e.g., the string "foo" was stored in a VARCHAR(8), InnoDB used to write "3foo00000" to the buffer (the first character - 3 - determines the actual length of the string). However, even though these trailing bytes are not used anywhere, writing 0s to the buffer certainly has a cost. Hence the fix which stops InnoDB from writing them.<br>
<br>
Oh my were we up for a ride! All of a sudden Valgrind started barking like this (and similar) for a number of tests:<br>
<blockquote class="tr_bq" style="font-family: "Courier New",Courier,monospace;">
<span style="font-size: x-small;">Thread 19:<br>Syscall param pwrite64(buf) points to uninitialised byte(s)<br>at 0x381C60EEE3: ??? (in /lib64/libpthread-2.12.so)<br>by 0x84B703: my_pwrite (my_pread.c:162) by 0x86FE6F: mi_nommap_pwrite (mysql_file.h:1203)<br> by 0x88621B: _mi_write_static_record (mi_statrec.c:65)<br> by 0x889592: mi_write (mi_write.c:142)<br> by 0x532085: handler::ha_write_row(unsigned char*) (handler.cc:6043)<br> by 0x69B7F7: end_write(JOIN*, st_join_table*, bool) (sql_select.cc:20237)<br> by 0x69A43C: evaluate_join_record(JOIN*, st_join_table*, int)<br>(sql_select.cc:19187)<br> by 0x69A8E0: sub_select(JOIN*, st_join_table*, bool) (sql_select.cc:19294)</span></blockquote>
The modifications we did to the server to remedy the problems can be divided into two categories as described below. In the end, the result was significantly improved performance.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/04/copying-unused-bytes-is-bad-duh.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com5tag:blogger.com,1999:blog-7171477457129187873.post-77171892608667089092012-04-13T15:07:00.001+02:002012-04-13T15:07:18.839+02:00New MySQL optimizer team page launchedFYI: A MySQL <a href="http://mysqloptimizerteam.blogspot.com/" target="_blank">Optimizer Team blog</a> aggregate page has been launched. It will be continuously updated with the latest blogs published by the MySQL optimizer developers. Now you can easily stay updated ;-)Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0tag:blogger.com,1999:blog-7171477457129187873.post-42279409401327111222012-04-10T22:00:00.000+02:002012-04-10T22:00:33.732+02:00Improvements for many-table joins in MySQL 5.6A lot has happened in MySQL 5.6 for queries joining many tables. For the most common use cases we have drastically reduced the cost of finding the execution plan. We have also improved the heuristics and removed bugs so that the final plan is often better than it used to be. Read on if you are one of those people who do 15 way joins!<br>
<br>
<span style="font-size: small;"><b>Finding a query execution plan</b></span><br>
First some background. You can skip this part if you know how MySQL picks the table join order in 5.5.<br>
<br>
When presented with a query, MySQL will try to find the best order to join tables by employing a greedy search algorithm. The outcome is what we call a query execution plan, QEP. When you join just a few tables, there's no problem calculating the cost of all join order combinations and then pick the best plan. However, since there are (#tables)! possible combinations, the cost of calculating them all soon becomes too high: for five tables, e.g., there are 120 combinations which is no problem to compute. For 10 tables there are 3.6 million combinations and for 15 tables there are 1307 billion. You get the picture. For this reason, MySQL makes a trade off: use heuristics to only explore promising plans. This is supposed to significantly reduce the number of plans MySQL needs to calculate, but at the same time you risk not finding the best one.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/04/improvements-for-many-table-joins-in.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com1tag:blogger.com,1999:blog-7171477457129187873.post-84103932027702320592012-04-10T15:00:00.000+02:002012-04-12T10:59:51.001+02:00On queries with many values in the IN clauseA few customers with rather extreme needs have contacted us about a performance issue with the range optimizer. Our solution to the problem is to introduce a new variable in MySQL 5.6, <span style="font-size: x-small;">eq_range_index_dive_limit<span style="font-size: small;">, which can be used to control whether or not the range optimizer will a) do index dives, or b) use index statistics when estimating the number of rows in the ranges of the query. The former method gives a far more accurate estimate while the latter costs a lot less to compute.</span></span><br>
<br>
<span style="font-size: x-small;"><span style="font-size: small;">This is what the help text has to tell about the variable:</span></span><br>
<span style="font-size: x-small;"><span style="font-size: small;"></span></span><br>
<blockquote class="tr_bq">
<span style="font-size: x-small;"><span style="font-size: small;"><span style="font-size: x-small;">The optimizer will use existing index statistics instead of doing index dives for equality ranges if the number of equality ranges for the index is larger than or equal to [the value of variable]. If set to 0, index dives are always used.</span></span></span></blockquote>
"Equality range" means predicates using operators <span style="font-size: x-small;">IN()<span style="font-size: small;"> or <span style="font-size: x-small;">=</span></span></span>, and it's important to notice that the number of such ranges is counted on a per index basis. Furthermore, index dives are not used on unique/primary indexes, so only non-unique indexes are affected.<b> </b><br>
<br>
<b>The two ways row estimates can be computed</b> <br>
<br>
For as long as there have been a range access method in MySQL, the number of rows in a range has been estimated by diving down the index to find the start and end of the range and use these to count the number of rows between them. This technique is accurate, and is therefore a good basis to make the best possible execution plan (QEP) for the query. Unfortunately, it involves two index dives for each range. That's not a big deal if you run normal queries with a few equality ranges like<br>
<br>
<blockquote class="tr_bq" style="font-family: "Courier New",Courier,monospace;">
<span style="font-size: x-small;">SELECT title FROM albums WHERE artist IN (10, 12)</span></blockquote>
But some MySQL users have queries with hundreds of values in the IN clause. For such queries it may take considerably longer to optimize the query than to execute it. This is when it makes sense to use the less accurate index statistics. Consider this example:<br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/04/on-queries-with-many-values-in-in.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com4tag:blogger.com,1999:blog-7171477457129187873.post-58045632854015802412012-03-19T13:22:00.001+01:002012-03-19T13:33:07.842+01:00Index Condition Pushdown to the rescue!A while ago, I <a href="http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html" target="_blank">explained</a> how range access in a multiple-part index works and why MySQL can't utilize key parts beyond the first occurrence of some often used comparison operators. Luckily, there is a great improvement underway in MySQL 5.6 that will remedy much of this limitation. Meet Index Condition Pushdown.<br>
<br>
<b>How does ICP work? </b><br>
<br>
Index Condition Pushdown is a new way for MySQL to evaluate conditions. Instead of evaluating conditions on rows read from a table, ICP makes it possible to evaluate conditions in the index and thereby avoid looking at the table if the condition is false.<br>
<br>
Let's assume that we have a multiple-part index covering columns <span style="font-size: x-small;">(keypart_1, ..., keypart_n)</span>. Further assume that we have a condition with a comparison operator on <span style="font-size: x-small;">keypart_1</span> that does not allow <span style="font-size: x-small;">keypart_2,...,keypart_n</span> to be used as a range condition (more on that <a href="http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html" target="_blank">here</a>).<br>
<br>
When MySQL does range access without ICP enabled, the index is traversed to locate rows in the table that are within the range(s) relevant for the conditions on <span style="font-size: x-small;">keypart_1</span>. The rows in these ranges are read from the table and returned from the storage engine to the MySQL server. The server then evaluates the remaining conditions, including those that apply to <span style="font-size: x-small;">keypart_2,..,keypart_n</span>.<br>
<br>
<a href="http://jorgenloland.blogspot.com/2012/03/index-condition-pushdown-to-rescue.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com2tag:blogger.com,1999:blog-7171477457129187873.post-88144699615741031552011-10-04T10:02:00.001+02:002012-03-01T09:11:15.944+01:00Optimizer tracing: Query Execution Plan descriptions beyond EXPLAINUnderstanding why MySQL chooses a particular join order or why table scan is chosen instead of range scan is often very hard even for experienced MySQL users. Two almost identical queries, differing only in constant values, may produce completely different plans. That's why we're introducing a great new feature in 5.6: <a href="http://forge.mysql.com/wiki/MySQL_Internals_Optimizer_tracing"><i>Optimizer Tracing</i></a>. The target users of this feature are developers and MySQL users experienced enough to understand the ins and outs of EXPLAIN.<br>
<br>
<b>What Optimizer Tracing is</b><br>
You may already have guessed
this, but optimizer tracing is a printout of important decisions the
MySQL optimizer has done during the process of making the Query
Execution Plan. <br>
<br>
The
trace is presented in JSON format which is easy to read both for humans and others.<br>
<br>
Currently, the optimizer trace includes short explanations for:<br>
<ul>
<li>Why the join order was chosen.</li>
<li>Important query transformations like IN to EXISTS.</li>
<li>The access methods applicable and why one of them was chosen.</li>
<li>Which conditions are attached to each table (i.e., what hides behind "Using where" in EXPLAIN)</li>
</ul>
More coverage will be added as we go along.<br>
<b></b><br>
<br>
<b>What Optimizer Tracing is NOT</b><br>
The feature is not, and never will be, complete in the sense that it does not describe all choices the optimizer does. However, we're going to add more information when we find valuable things to add. We will not guarantee backwards compatibility like we do for EXPLAIN. This gives us the freedom to improve tracing at a high pace without going through multi-release deprecation warnings etc. In other words: optimizer tracing is not a replacement for EXPLAIN. <br>
<br>
<b>A quick tour</b><br>
The best way get a feeling of optimizer tracing is to give it a try. Below is a teaser:<br>
<b></b><a href="http://jorgenloland.blogspot.com/2011/10/optimizer-tracing-query-execution-plan.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com8tag:blogger.com,1999:blog-7171477457129187873.post-63694333736870393432011-09-27T13:42:00.002+02:002012-03-12T13:50:36.121+01:00Tips and tricks: Killer response time for non-overlapping intervalsAssume you have a table where you store <i>non-overlapping intervals</i> using two columns, e.g. IP ranges. IP ranges are simple to represent using <a href="http://www.silisoftware.com/tools/ipconverter.php">integer notation</a>:<br>
<br>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;">CREATE TABLE ip_owner (</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;"> owner_id int NOT NULL,<br> /* some columns */ </span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;"> ip_start_int bigint NOT NULL, /* IP address converted to integer */<br> ip_end_int bigint NOT NULL, /* IP address converted to integer */<br> PRIMARY KEY (owner_id),</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;"> INDEX ip_range (ip_start_int, ip_end_int)</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;">) ENGINE=InnoDB;</span></div>
<br>
<br>
And then you find yourself in a situation where you want to know who, if anyone, owns the IP address <span style="font-size: x-small;">X</span>. This can be done using the following query:<br>
<br>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;">SELECT * FROM ip_owner WHERE ip_start_int <= X AND ip_end_int >= X;</span></div>
<br>
MySQL can resolve this using a range scan, but will unfortunately only be able to use the <span style="font-size: x-small;">ip_start_int <= X</span> part of the condition as a range <a href="http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html">as explained here</a>. Thus, the query will either be resolved by range scan if fairly few records have <span style="font-size: x-small;">ip_start_int <= X</span> or table scan otherwise. That means unreliable response time because it will be much quicker to query low-valued IPs than high valued IPs. I inserted 1M records into the table before running the queries below:<br>
<br>
<a href="http://jorgenloland.blogspot.com/2011/09/tips-and-tricks-killer-response-time.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com2tag:blogger.com,1999:blog-7171477457129187873.post-16386166087087375582011-08-23T13:32:00.000+02:002012-03-12T15:45:23.469+01:00The MySQL range access method explainedThe range access method uses an index to read a subset of rows that form one or multiple continuous index value intervals. The intervals are defined by the query's range predicates, which are comparisons using any of <span style="font-size: x-small;">=, <=>, IN(), IS NULL, IS NOT NULL, >, <, >=, <=, BETWEEN, !=, <> or LIKE</span>.<br>
<br>
Some examples:<br>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div class="separator" style="clear: both; text-align: center;">
</div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;">SELECT * FROM blog WHERE author_id IN (1, 7, 8, 10)</span></div>
<div style="font-family: Verdana,sans-serif;">
<span style="font-size: x-small;">SELECT * FROM orders WHERE value > 1000</span></div>
<br>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvfd7tW3AF8g6ik4us9-GK5pwVsiMjQ9u0y0g7HGgF8fAIpAoT_l0OWKBgZqi7stZN2cMyo4ZyAAPMr9cETm4u7l1g2Ms53cwwL_JdlyGzt6mzW9inIDh10Z_SMZTlYuMsTrg6ssd-xfoY/s1600/single.JPG" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="231" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjvfd7tW3AF8g6ik4us9-GK5pwVsiMjQ9u0y0g7HGgF8fAIpAoT_l0OWKBgZqi7stZN2cMyo4ZyAAPMr9cETm4u7l1g2Ms53cwwL_JdlyGzt6mzW9inIDh10Z_SMZTlYuMsTrg6ssd-xfoY/s320/single.JPG" width="320"></a></div>
You know that the range access method is used when EXPLAIN shows <span style="font-size: x-small;">type=range</span>.<br>
<br>
Naturally, there has to be an index on the column used by the range predicate. Since indexes are ordered, MySQL will, for each interval, dive down the index using the interval start value and read it's way through the index leaves until it reaches the interval end value:<br>
<br>
<br>
<br>
<a href="http://jorgenloland.blogspot.com/2011/08/mysql-range-access-method-explained.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com3tag:blogger.com,1999:blog-7171477457129187873.post-31814045261950264732011-08-17T12:52:00.001+02:002011-08-22T14:33:29.612+02:00The meaning of ref=func in MySQL EXPLAINWhen EXPLAIN shows that a table is accessed using the [eq_]ref access type, we're used to look at the ref column to see where MySQL gets the value to look up from. Usually, we see either "const" if the value is provided as a constant in the query or a column name if the value is read from a column in an already read table:<br>
<br>
<div style="background: none repeat scroll 0% 0% black; color: white; height: 100%; line-height: 100%; overflow: auto; padding: 10px; width: 100%;"><pre><span style="font-size: xx-small;">EXPLAIN
SELECT acc_rec.cust_id, acc_rec.amount, acc_rec.invoicenumber
FROM accounts_receivable as acc_rec
WHERE <span style="color: lime;">acc_rec.cust_id=1</span>;
+----+-------------+---------+------+---------+-------+
| id | select_type | table | type | key | ref |
+----+-------------+---------+------+---------+-------+
| 1 | SIMPLE | acc_rec | ref | cust_id | <span style="color: lime;">const</span> |
+----+-------------+---------+------+---------+-------+
EXPLAIN
SELECT acc_rec.cust_id, acc_rec.amount, acc_rec.invoicenumber,
customers.firstname, customers.lastname, customers.phone
FROM accounts_receivable AS acc_rec JOIN customers AS cust
ON <span style="color: lime;">acc_rec.cust_id = customers.cust_id</span>;
+----+-------------+---------+--------+---------+-----------------+
| id | select_type | table | type | key | ref |
+----+-------------+---------+--------+---------+-----------------+
| 1 | SIMPLE | acc_rec | ALL | NULL | NULL |
| 1 | SIMPLE | cust | eq_ref | PRIMARY | <span style="color: lime;">acc_rec.cust_id</span> |
+----+-------------+---------+--------+---------+-----------------+</span></pre></div><br>
But what if ref shows the value "func"? In this case, the value used as input to [eq_]ref is the output of some function. A few examples:<br>
<br>
<a href="http://jorgenloland.blogspot.com/2011/08/meaning-of-reffunc-in-mysql-explain.html#more">Read more »</a>Jørgen Lølandhttp://www.blogger.com/profile/07423633132356058535noreply@blogger.com0