CREATE TABLE ip_owner (
owner_id int NOT NULL,
/* some columns */
/* some columns */
ip_start_int bigint NOT NULL, /* IP address converted to integer */
ip_end_int bigint NOT NULL, /* IP address converted to integer */
PRIMARY KEY (owner_id),
ip_end_int bigint NOT NULL, /* IP address converted to integer */
PRIMARY KEY (owner_id),
INDEX ip_range (ip_start_int, ip_end_int)
) ENGINE=InnoDB;
And then you find yourself in a situation where you want to know who, if anyone, owns the IP address X. This can be done using the following query:
SELECT * FROM ip_owner WHERE ip_start_int <= X AND ip_end_int >= X;
MySQL can resolve this using a range scan, but will unfortunately only be able to use the ip_start_int <= X part of the condition as a range as explained here. Thus, the query will either be resolved by range scan if fairly few records have ip_start_int <= X or table scan otherwise. That means unreliable response time because it will be much quicker to query low-valued IPs than high valued IPs. I inserted 1M records into the table before running the queries below:
mysql> /* IP near lower end */
mysql> EXPLAIN SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip
-> WHERE ip_start_int <= 3270002832 AND ip_end_int >= 3270002832;
+----+-------------+----------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+-------+---------------+----------+---------+------+------+--------------------------+
| 1 | SIMPLE | owner_ip | range | ip_range | ip_range | 8 | NULL | 1417 | Using where; Using index |
+----+-------------+----------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip
-> WHERE ip_start_int <= 3270002832 AND ip_end_int >= 3270002832;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 1417 | 3270002832 | 3270002833 |
+----------+--------------+------------+
1 row in set (0.01 sec)
mysql> /* IP near upper end */
mysql> EXPLAIN SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip
-> WHERE ip_start_int <= 3272090000 AND ip_end_int >= 3272090000;
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
| 1 | SIMPLE | owner_ip | ALL | ip_range | NULL | NULL | NULL | 1048576 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+---------+-------------+
1 row in set (0.00 sec)
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip
-> WHERE ip_start_int <= 3272090000 AND ip_end_int >= 3272090000;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 1045001 | 3272090000 | 3272090001 |
+----------+--------------+------------+
1 row in set (6.63 sec)
mysql> /* IP close to median IP value */
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip
-> WHERE ip_start_int <= 3271032832 AND ip_end_int >= 3271032832;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 516417 | 3271032832 | 3271032833 |
+----------+--------------+------------+
1 row in set (3.32 sec)
Luckily, we have valuable information we can use to significantly improve the response time:
- The intervals are non-overlapping, so there is at most one record matching the conditions
- The record we're looking for is the last record in the range of records with ip_start_int <= X. If ordered by ip_start_int in descending order, MySQL will scan the range in reverse order and thus find the record we're interested in first.
mysql> /* Apply knowledge */
mysql> /* IP near lower end */
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip FORCE INDEX(ip_range)
-> WHERE ip_start_int <= 3270002832 AND ip_end_int >= 3270002832
-> ORDER BY ip_start_int DESC
-> LIMIT 1;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 1417 | 3270002832 | 3270002833 |
+----------+--------------+------------+
1 row in set (0.00 sec)
mysql> /* IP close to median IP value */
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip FORCE INDEX(ip_range)
-> WHERE ip_start_int <= 3271032832 AND ip_end_int >= 3271032832
-> ORDER BY ip_start_int DESC
-> LIMIT 1;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 516417 | 3271032832 | 3271032833 |
+----------+--------------+------------+
1 row in set (0.00 sec)
mysql> /* IP near upper end */
mysql> SELECT owner_id, ip_start_int, ip_end_int
-> FROM owner_ip FORCE INDEX(ip_range)
-> WHERE ip_start_int <= 3272090000 AND ip_end_int >= 3272090000
-> ORDER BY ip_start_int DESC
-> LIMIT 1;
+----------+--------------+------------+
| owner_id | ip_start_int | ip_end_int |
+----------+--------------+------------+
| 1045001 | 3272090000 | 3272090001 |
+----------+--------------+------------+
1 row in set (0.00 sec)
This can of course be used for any non-overlapping ranges, e.g to query which customer rented a particular car on the date 2011-01-01:
SELECT cust_id
FROM car_rental FORCE INDEX (rental_dates)
WHERE car_id=1001 AND pickup_date <= 2011-01-01 AND return_date >= 2011-01-01
ORDER BY picup_date DESC
LIMIT 1;
Good write.
ReplyDeletePlease also see my earlier similar post:
SQL: finding a user's country/region based on IP.
shlominoach: Whad'ya Know! Thanks for the link :)
ReplyDelete