冗余索引对查询效率的影响

背景：
在一般的数据库书籍中，简述到如何合理创建索引时都会出现这么一段话：
“索引能提高sql的执行效率，但是过多不合理的索引也会影响数据库的性能”
过度索引是如何影响数据库的性能的呢？
1。在执行sql之前，数据库会根据metadata信息决定该使用哪个索引，如果索引过多会影响这一步骤的效率。
2。由于每次数据更新和插入都要更新索引，因此会影响相关操作的效率
而第一点就是本文的讨论重点所在。
过度索引是否真的会影响sql执行效率？
如果影响，程度是多大？测试环境：
drop table if EXISTS test_index_performance;
CREATE TABLE test_index_performance (
id int primary key ,
col1 varchar(10),
col2 varchar(10),
col3 varchar(10),
col4 varchar(10),
col5 varchar(10),
col6 varchar(10),
col7 varchar(10),
col8 varchar(10),
col9 varchar(10),
col10 varchar(10)
)engine=innodb;
delimiter $$
create PROCEDURE insert_data_for_test_index_performance ()
begin
DECLARE total int default 100000;
DECLARE i int default 0;
truncate table test_index_performance;
while(i < total)
do
insert into test_index_performance values (i,’a’,’a’,’a’,’a’,’a’,’a’,’a’,’a’,’a’,’a’);
set i=i+1;
end while ;
end $$
delimiter ;
call insert_data_for_test_index_performance(); 正文：结果一：与执行计划相关的索引（出现在possible keys的那些），索引的数量与sql执行消耗时间成正比。
create index idx1 on test_index_performance (col1);
create index idx2 on test_index_performance (col1,col2);
create index idx3 on test_index_performance (col1,col2,col3);
create index idx4 on test_index_performance (col1,col2,col3,col4);
create index idx5 on test_index_performance (col1,col2,col3,col4,col5);
create index idx6 on test_index_performance (col1,col2,col3,col4,col5,col6);
create index idx7 on test_index_performance (col1,col2,col3,col4,col5,col6,col7);
create index idx8 on test_index_performance (col1,col2,col3,col4,col5,col6,col7,col8);
create index idx9 on test_index_performance (col1,col2,col3,col4,col5,col6,col7,col8,col9);
create index idx10 on test_index_performance (col1,col2,col3,col4,col5,col6,col7,col8,col9,col10);
执行以下语句
select count(*) from test_index_performance where col1=’a’ ;
show profile for query 1; 结果的statistics部分
1索引 0.000070
2索引 0.000083
3索引 0.000107
4索引 0.000112
5索引 0.000126
6索引 0.000155
7索引 0.000152
8索引 0.000164
9索引 0.000187 结果二：与执行计划无关的索引（不出现在possible keys的那些），不会影响sql的执行效率。
create index idx12 on test_index_performance (col2);
create index idx13 on test_index_performance (col2,col3);
create index idx14 on test_index_performance (col2,col3,col4);
create index idx15 on test_index_performance (col2,col3,col4,col5);
create index idx16 on test_index_performance (col2,col3,col4,col5,col6);
create index idx17 on test_index_performance (col2,col3,col4,col5,col6,col7);
create index idx18 on test_index_performance (col2,col3,col4,col5,col6,col7,col8);
create index idx19 on test_index_performance (col2,col3,col4,col5,col6,col7,col8,col9);
create index idx20 on test_index_performance (col2,col3,col4,col5,col6,col7,col8,col9,col10);
执行以下语句
select count(*) from test_index_performance where col1=’a’ ; 结果三：表的大小，与索引对于sql执行效率的影响，没有直接联系
show profile for query 1; 结果的statistics部分
1w条 0.000187
10w条 0.000192
20w条 0.000198
30w条 0.000192 总结：
1。与本条语句执行相关的index的数量（possible key），会影响最终效率
2。对效率的影响体现在，statistics阶段
3。原因在于优化器需要从information_schema中获取相关索引的metadata信息并分析，索引数量越多，这个过程越漫长
4。与本条语句执行无关的index数量不影响最终效率
5。效率影响在10%左右转自：mysqlsystems 作者：cenalulu