MySQL GROUP BY with Using temporary unnecessarily?

I am trying to optimize a query. EXPLAIN tells me it is Using temporary, which is really inefficient given the size of the table (20m+ records). Looking at the MySQL documentation on Internal Temporary Tables, I don't see anything that would imply the need for a temporary table in my query. I also tried setting the ORDER BY to the same column as the GROUP BY, but EXPLAIN still reports Using temporary and the query takes forever to run. I am using MySQL 5.7.
Is there a way to avoid using a temporary table for this query:
SELECT url,count(*) as sum
FROM `digital_pageviews` as `dp`
WHERE `publisher_uuid` = '8b83120e-3e19-4c34-8556-7b710bd7b812'
GROUP BY url
ORDER BY NULL;
This is my table schema:
create table digital_pageviews
(
id int unsigned auto_increment
primary key,
visitor_uuid char(36) null,
publisher_uuid char(36) default '' not null,
property_uuid char(36) null,
ip_address char(15) not null,
referrer text null,
url_delete text null,
url varchar(255) null,
url_tmp varchar(255) null,
meta text null,
date_created timestamp not null,
date_updated timestamp null
)
collate = utf8_unicode_ci;
create index digital_pageviews_url_index
on digital_pageviews (url);
create index ndx_date_created
on digital_pageviews (date_created);
create index ndx_property_uuid
on digital_pageviews (property_uuid);
create index ndx_publisher_uuid
on digital_pageviews (publisher_uuid);
create index ndx_visitor_uuid_page
on digital_pageviews (visitor_uuid);

The query needs a temporary table because no single index lets MySQL both filter by publisher_uuid and group by url. The first step is to filter by publisher_uuid, so it uses the index on publisher_uuid.
Next it has to group and order the matching rows by url. The publisher_uuid index does not contain url, so the rows it returns are not in url order, and MySQL has to collect them in a temporary table to do the grouping.
To filter where publisher_uuid = '8b83120e-3e19-4c34-8556-7b710bd7b812', group by url, and order by url without a temporary table, create a composite index with these columns in this order:
publisher_uuid
url
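-- The schema already has an index named ndx_publisher_uuid, so the new one needs a different name.
create index ndx_publisher_uuid_url
on digital_pageviews (publisher_uuid, url);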
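Once a composite index on (publisher_uuid, url) exists, the effect can be checked by running EXPLAIN on the original query again. This is only a sketch of the check, not output from the asker's server: the expectation is that the key column names the composite index and that the Extra column no longer lists Using temporary, but the optimizer makes the final choice.

```sql
-- Re-run EXPLAIN after creating the (publisher_uuid, url) index.
-- Expectation (not verified here): Extra no longer contains "Using temporary",
-- because rows for one publisher_uuid are read from the index already sorted by url.
EXPLAIN
SELECT url, COUNT(*) AS sum
FROM digital_pageviews AS dp
WHERE publisher_uuid = '8b83120e-3e19-4c34-8556-7b710bd7b812'
GROUP BY url
ORDER BY NULL;
```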

Related

Why does MySQL return different results when querying a column with or without ORDER BY?

I have this table:
CREATE TABLE `record_temp_var` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`p_id` bigint(20) NOT NULL,
`var_name` VARCHAR(128) NOT NULL,
`var_value` VARCHAR(128) NOT NULL,
PRIMARY KEY (`id`),
KEY idx_var_name(`var_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The table has a bunch of records, but I found something strange when executing these two queries.
SELECT id FROM `record_temp_var` order by id limit 10;
With order by id, the first row returned really is the first record of the table.
SELECT id FROM `record_temp_var` limit 10;
But without order by id, the first row returned is not the first record in the table.
I have researched this for quite a long time. I believe that MySQL may be using a different index in each case:
Primary Key
idx_var_name
But why does MySQL use the idx_var_name index when I only select id?
Regarding your second LIMIT query which does not have an ORDER BY clause:
SELECT id FROM record_temp_var LIMIT 10;
it is not really relevant how MySQL chooses 10 records from the table. Rather, the only thing you should assume in this case is that MySQL is free to return any 10 records it wants, since you did not provide instructions to the contrary.
In general, when using LIMIT you should always include an ORDER BY clause so that the result is well defined.
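As for why idx_var_name can be picked at all: in InnoDB, every secondary index entry also stores the primary key value, so idx_var_name alone is enough to answer SELECT id, and rows read from it come back in var_name order rather than id order. The statements below are a small sketch for confirming which index the optimizer chose; the result depends on the server and the data.

```sql
-- The "key" column shows which index was chosen for the unordered query.
-- It may well be idx_var_name, since that index already contains id.
EXPLAIN SELECT id FROM record_temp_var LIMIT 10;

-- Deterministic version: always the 10 lowest ids.
SELECT id FROM record_temp_var ORDER BY id LIMIT 10;
```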

Optimize MYSQL Select query in large table

Given the table:
CREATE TABLE `sample` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`vendorid` VARCHAR(45) NOT NULL,
`year` INT(10) NOT NULL,
`title` TEXT NOT NULL,
`description` TEXT NOT NULL,
PRIMARY KEY (`id`) USING BTREE
)
Table size: over 7 million rows. No field is unique except id.
Simple query:
SELECT * FROM sample WHERE title='milk'
Takes 45-60s to complete.
I tried to put a unique index on title and description but got error 1170.
How could I optimize it? Would be very grateful for suggestions.
TEXT columns need prefix indexes -- it's not possible to index their entire contents; they can be too large. And, if the column values aren't unique, don't use UNIQUE indexes; they won't work.
Try this:
ALTER TABLE sample ADD INDEX title_prefix (title(64));
Pro tip: For columns you need to use in WHERE clauses, do your best to use VARCHAR(n) where n is less than 768. Avoid TEXT and other blob types unless you absolutely need them; they can make for inefficient operation of your database server.
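If the titles turn out to be short, the column can be converted to VARCHAR and indexed in full. The following is a sketch under assumptions that are not in the question: that no existing title is longer than 255 characters, and that the table uses a 3-byte utf8 character set so the index stays within older InnoDB key-length limits. The index name title_idx is made up for illustration.

```sql
-- Step 1: check the longest existing title before shrinking the column.
SELECT MAX(CHAR_LENGTH(title)) AS longest_title FROM sample;

-- Step 2: only if the result is 255 or less (assumption), convert and index.
-- title_idx is a hypothetical index name.
ALTER TABLE sample
  MODIFY title VARCHAR(255) NOT NULL,
  ADD INDEX title_idx (title);

-- Step 3: confirm the lookup now uses the index.
EXPLAIN SELECT * FROM sample WHERE title = 'milk';
```

On a table of this size the ALTER rebuilds a lot of data, so it is worth trying on a copy first.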

Why does MySQL still use an index when the query filters on the 2nd column of a multiple-column index?

We know MySQL follows the leftmost-prefix rule, but here the query does not use the 1st column of the index, only the 2nd. The two SELECT results below show that MySQL sometimes uses the index and sometimes does not. Why? In addition, my MySQL version is 5.6.17.
1.create table:
CREATE TABLE `student` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`cid` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `name_cid_INX` (`name`,`cid`)
) ENGINE=InnoDB AUTO_INCREMENT=101 DEFAULT CHARSET=utf8
2.run select:
EXPLAIN SELECT * FROM student WHERE cid=1;
3. result:
Result with index
It shows that MySQL uses the index to get the data.
The following is another table.
1.create table:
CREATE TABLE `test_table` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(45) DEFAULT NULL,
`birthday` datetime DEFAULT NULL,
`address` varchar(45) DEFAULT NULL,
`phone` varchar(45) DEFAULT NULL,
`note` varchar(45) DEFAULT NULL,
`age` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `NAME` (`name`),
KEY `AGE` (`age`),
KEY `LeftMostPreFix` (`name`,`address`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
2.run select:
explain SELECT * FROM test.test_table where address = '东京'
3.result:
Result without index
Here, on the contrary, it shows that MySQL did not use an index to get the data.
Comparing the two results above, I am puzzled why the 1st query uses the index, which goes against the leftmost-prefix rule.
From the MySQL manual:
it is possible that key will name an index that is not present in the possible_keys value. This can happen if none of the possible_keys indexes are suitable for looking up rows, but all the columns selected by the query are columns of some other index. That is, the named index covers the selected columns, so although it is not used to determine which rows to retrieve, an index scan is more efficient than a data row scan.
So while there is a key listed here, it's not actually used in the normal sense: MySQL scans the whole index instead of the whole table. In some situations that is more efficient than a table scan (your first example), in others it is not (your second).
Most of the time these things are decided by the optimizer based on several factors (table statistics, etc.).
The thing to remember is that here you can NOT "use the index" for lookups, and that's why there is no index in possible_keys. The index can only be used for lookups if the query filters on its first column.
Neither index in either Case starts with what is in the WHERE, so there will be a full scan of table or of index.
Case 1: The index is "covering", so it is a tossup as to which (table scan vs index scan) is better. The Optimizer happened to pick the secondary index. EXPLAIN FORMAT=JSON SELECT ... may have enough details to explain 'why' in this case.
Case 2: Because of * (in SELECT *), the secondary index is at a disadvantage -- it is not "covering", so the processing will bounce back and forth between the index and the data. So it is clearly better to simply scan the table.
Instead of trying to understand EXPLAIN (in these cases), turn the question around: "What is the optimal index for this query against this table?" Then follow the guidelines here.
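Applied to the two queries in the question, "the optimal index" is simply one that starts with the column in the WHERE clause. A minimal sketch follows; the index names cid_idx and address_idx are invented for illustration, and whether the optimizer actually picks them still depends on the data.

```sql
-- Hypothetical index names; each index starts with the filtered column.
ALTER TABLE student ADD INDEX cid_idx (cid);
ALTER TABLE test.test_table ADD INDEX address_idx (address);

-- With these in place, possible_keys should list the new index,
-- because the WHERE column is now the leftmost column of an index.
EXPLAIN SELECT * FROM student WHERE cid = 1;
EXPLAIN SELECT * FROM test.test_table WHERE address = '东京';
```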

Can I create a MySQL index for LIKE searches with both left and right wildcards?

I’m using MySQL 5.5.37. I have a table with a column
`NAME` varchar(100) COLLATE utf8_bin NOT NULL
and I intend to have partial searches on the name column like
select * FROM organnization where name like '%abc%'
Note that I want to match the string “abc” anywhere in the name, not necessarily at the beginning. Given this, is there any index I can use on the column to optimize query execution?
If you expect only a few matching results, you can still create an index on the name column to speed up queries, with the help of a primary key.
If your table has a primary key, like
org_id int not null auto_increment primary key,
name varchar(100) COLLATE utf8_bin NOT NULL,
`desc` varchar(200) COLLATE utf8_bin NOT NULL, -- desc is a reserved word, so it must be quoted
size int,
....
you can create an index on (name, org_id)
and write your query like this:
select * from orgnizations o1 join (select org_id from orgnizations where name like '%abcd%' ) o2 using (org_id)
This should be faster than your original query, because the subquery only has to scan the index rather than the full rows.
If you only need one or two other columns alongside the name search, you can include those columns in the name index and run queries like
select org_id, name, size from orgnizations where name like '%abcd%'
which will still be much faster than a full table scan.
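For reference, the index described above could be created and checked as follows. This is a sketch: the index name idx_name_org_id is made up, and the table name follows the spelling used in the answer's queries. A leading wildcard still prevents a range lookup, so the gain comes only from scanning a narrow index instead of full rows.

```sql
-- idx_name_org_id is a hypothetical name for the (name, org_id) index.
CREATE INDEX idx_name_org_id ON orgnizations (name, org_id);

-- The inner query can be answered from the index alone;
-- only the matching org_id values are then looked up in the table.
EXPLAIN
SELECT *
FROM orgnizations o1
JOIN (SELECT org_id FROM orgnizations WHERE name LIKE '%abcd%') o2
USING (org_id);
```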

Stuck with optimization on MySql, getting using temporary, using file sort... messages on EXPLAIN

I'm profiling a web application, trying to cut unnecessary delays in queries, and I found one that seems simple but takes a lot of time to execute.
Using EXPLAIN I get the following messages:
Using where; Using temporary; Using filesort
This is the query:
SELECT `bt`.`id_brand`
FROM `brands_translation` AS `bt`
WHERE bt.code_language = 'es'
GROUP BY `bt`.`id_brand`
ORDER BY `bt`.`name` ASC
And the table definition:
CREATE TABLE IF NOT EXISTS `brands_translation` (
`id_brand` int(64) unsigned NOT NULL,
`code_language` varchar(3) NOT NULL,
`name` varchar(128) NOT NULL,
`link` varchar(255) default NULL,
`logo` varchar(255) default NULL,
`description` text NOT NULL,
KEY `id_brand` (`id_brand`),
KEY `code_language` (`code_language`),
KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I tried to solve it by creating indexes for every field involved, with no result.
Any ideas?
Thanks in advance
With a proper INDEX, MySQL may very well sort your results properly without the ORDER BY statement. You'll want to look into ORDER BY Optimization and LIMIT Optimization.
Otherwise, temporary and filesort usage come from a differing GROUP BY and ORDER BY. For this particular query, I would remove the indices you have in place and add this one:
ALTER TABLE `brands_translation` ADD INDEX (`code_language` , `id_brand` , `name` );
Of course, this may affect other queries throughout your project. Once that's done, change your query to:
SELECT `bt`.`id_brand`
FROM `brands_translation` AS `bt`
WHERE bt.code_language = 'es'
GROUP BY `bt`.`id_brand`, `bt`.`name`
ORDER BY `bt`.`id_brand`, `bt`.`name`
If you do not want to group by name, you can remove name from the GROUP BY clause, but that will give you Using temporary again (since the GROUP BY and ORDER BY are then different).
Please look at the following...
drop table if exists brands_translation;
create table brands_translation
(
code_language varchar(3) not null,
id_brand int unsigned not null,
-- other fields here...
primary key(code_language, id_brand) -- clustered composite primary key (please !!)
)
engine=innodb;
Why quote with ` when you don't need to? And sort out your data types, please.