mysql query with multiple groups? - mysql

I have a query that gets product IDs based on keywords.
SELECT indx_search.pid
FROM indx_search
LEFT JOIN word_index_mem ON (word_index_mem.word = indx_search.word)
WHERE indx_search.word = "phone"
GROUP BY indx_search.pid
ORDER BY indx_search.pid ASC
LIMIT 0,20
This works well but now I'm trying to go a step further and implement "price range" into this query.
CREATE TABLE `price_range` (
`pid` int(11) NOT NULL,
`range_id` tinyint(3) NOT NULL,
PRIMARY KEY (`pid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
This table simply contains product IDs and a range_id. The price range values are stored here:
CREATE TABLE `price_range_values` (
`ID` tinyint(3) NOT NULL AUTO_INCREMENT,
`rangeFrom` float(10,2) NOT NULL,
`rangeTo` float(10,2) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=MyISAM AUTO_INCREMENT=32 DEFAULT CHARSET=latin1
I want to GROUP price_range.range_id with COUNT() of how many products match the certain price range within the current query. I still want to receive my 20 results of product IDs.
So something along the lines of:
SELECT indx_search.pid, price_range.range_id as PriceRangeID, COUNT(price_range.range_id) as PriceGroupTotal
FROM indx_search
LEFT JOIN windex_mem ON ( windex_mem.word = indx_search.word )
LEFT JOIN price_range ON ( price_range.pid = indx_search.pid )
WHERE indx_search.word = "memory"
GROUP BY indx_search.pid, PriceRangeID
ORDER BY indx_search.pid ASC
LIMIT 0 , 20
Is this possible to accomplish without busting an additional query?

Try to use a subquery with a LIMIT clause, e.g. -
SELECT
i_s.pid, p_r.range_id as PriceRangeID, COUNT(p_r.range_id) as PriceGroupTotal
FROM
(SELECT * FROM indx_search WHERE i_s.word = 'memory' ORDER BY i_s.pid ASC LIMIT 0 , 20) i_s
LEFT JOIN
windex_mem w_m ON w_m.word = i_s.word
LEFT JOIN
price_range p_r ON p_r.pid = i_s.pid
GROUP BY
i_s.pid, PriceRangeID

Related

MySQL - how to optimize query with order by

I am trying to generate a list of the 5 most recent history items for for a collection of user tasks. If I remove the order by the execution drops from ~2 seconds to < 20msec.
Indexes are on
h.task_id
h.mod_date
i.task_id
i.user_id
This is the query
SELECT h.*
, i.task_id
, i.user_id
, i.name
, i.completed
FROM h
, i
WHERE i.task_id = h.task_id
AND i.user_id = 42
ORDER
BY h.mod_date DESC
LIMIT 5
Here is the explain:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE i ref PRIMARY,UserID UserID 4 const 3091 Using temporary; Using filesort
1 SIMPLE h ref TaskID TaskID 4 myDB.i.task_id 7
Here are the show create tables:
CREATE TABLE `h` (
`history_id` int(6) NOT NULL AUTO_INCREMENT,
`history_code` tinyint(4) NOT NULL DEFAULT '0',
`task_id` int(6) NOT NULL,
`mod_date` datetime NOT NULL,
`description` text NOT NULL,
PRIMARY KEY (`history_id`),
KEY `TaskID` (`task_id`),
KEY `historyCode` (`history_code`),
KEY `modDate` (`mod_date`)
) ENGINE=InnoDB AUTO_INCREMENT=185647 DEFAULT CHARSET=latin1
and
CREATE TABLE `i` (
`task_id` int(6) NOT NULL AUTO_INCREMENT,
`user_id` int(6) NOT NULL,
`name` varchar(60) NOT NULL,
`due_date` date DEFAULT NULL,
`create_date` date NOT NULL,
`completed` tinyint(1) NOT NULL DEFAULT '0',
`task_description` blob,
PRIMARY KEY (`task_id`),
KEY `name_2` (`name`),
KEY `UserID` (`user_id`)
) ENGINE=InnoDB AUTO_INCREMENT=12085 DEFAULT CHARSET=latin1
INDEX(task_id, mod_date, history_id) -- in this order
Will be "covering" and the columns will be in the optimal order
Also, DROP
KEY `TaskID` (`task_id`)
So that the Optimizer won't be tempted to use it.
Try changing the index on h.task_id so it's this compound index.
CREATE OR REPLACE INDEX TaskID ON h(task_id, mod_date DESC);
This may (or may not) allow MySql to shortcut some or all the extra work in your ORDER BY ... LIMIT ... request. It's a notorious performance anti pattern, by the way, but sometimes necessary.
Edit the index didn't help. So let's try a so-called deferred join so we don't have to ORDER and then LIMIT all the data from your h table.
Start with this subquery. It retrieves only the primary key values for the rows involved in your results, and will generate just five rows.
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date
LIMIT 5
Why this subquery? It handles the work-intensive ORDER BY ... LIMIT operation while manipulating only the primary keys and the date. It still must sort tons of rows only to discard all but five, but the rows it has to handle are much shorter. Because this subquery does the heavy work, you focus on optimizing it, rather than the whole query.
Keep the index I suggested above, because it covers the subquery for h.
Then, join it to the rest of your query like this. That way you'll only have to retrieve the expensive h.description column for the five rows you care about.
SELECT h.* , i.task_id, i.user_id , i.name, i.completed
FROM h
JOIN i ON i.task_id = h.task_id
JOIN (
SELECT h.history_id, i.task_id
FROM h
JOIN i ON h.task_id = i.task_id
WHERE i.user_id = 42
ORDER BY h.mod_date
LIMIT 5
) selected ON h.history_id = selected.history_id
AND i.task_id = selected.task_id
ORDER BY h.mod_date DESC
LIMIT 5

How to get NULL at the top of the sort?

Two tables are defined:
CREATE TABLE `users` (
`user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`score` tinyint(1) unsigned DEFAULT NULL,
PRIMARY KEY (`user_id`)
);
CREATE TABLE `online` (
`user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`url` varchar(255) NOT NULL,
PRIMARY KEY (`user_id`)
);
How to combine the tables so that the result would be sorted by the score field from the largest to the smallest but at the top there were records with the value NULL?
This query does not sort the second sample:
(SELECT * FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NULL)
UNION
(SELECT * FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NOT NULL ORDER BY `score` DESC)
Use two keys in the sort:
SELECT *
FROM `online` o JOIN
`users`
USING (user_id)
ORDER BY (`score` IS NULL) DESC, Score DESC;
MySQL treats booleans as numbers in a numeric context, with "1" for true and "0" for false. So, DESC puts the true values first.
Incidentally, your version would look like it works if you used UNION ALL rather than UNION. However, it is not guaranteed that the results are in any particular order unless you explicitly have an ORDER BY.
The UNION incurs overhead for removing duplicates and in doing so rearranges the data.
Try:
select * from online join users using (user_id) order by ifnull(score, 10) desc;
You can use order by Nulls Last in the end of your sql to show nulls on the first.
You can try below -
select * from
(
SELECT *,1 as ord FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NULL
UNION
SELECT *,2 FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NOT NULL
)A ORDER BY ord asc,`score` DESC

mysql: inner join with in and not in

I'm trying to pull rows from one table "articles" based on specific category tags from table "article_category_reference", to exclude articles that have a specific tag. I have this query right now:
SELECT DISTINCT
a.article_id,
a.`title`,
a.`text`,
a.`date`
FROM
`articles` a
INNER JOIN `article_category_reference` c ON
a.article_id = c.article_id AND c.`category_id` NOT IN (54)
WHERE
a.`active` = 1
ORDER BY
a.`date`
DESC
LIMIT 15
The problem is, it seems to grab rows even if they do have a row in the "article_category_reference" table where "category_id" matches "54". I've also tried it in the "where" clause and it makes no difference.
Keep in mind I'm using "NOT IN" as it may be excluding multiple tags.
SQL fiddle to show it: http://sqlfiddle.com/#!9/b2172/1
Tables:
CREATE TABLE `article_category_reference` (
`ref_id` int(11) NOT NULL,
`article_id` int(11) NOT NULL,
`category_id` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
CREATE TABLE `articles` (
`article_id` int(11) UNSIGNED NOT NULL,
`author_id` int(11) UNSIGNED NOT NULL,
`date` int(11) NOT NULL,
`title` varchar(120) NOT NULL,
`text` text CHARACTER SET utf8mb4 NOT NULL,
`active` int(1) NOT NULL DEFAULT '1'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
One option is to use an EXISTS clause:
SELECT DISTINCT
a.article_id,
a.title,
a.text,
a.date
FROM articles a
WHERE
a.active = 1 AND
NOT EXISTS (SELECT 1 FROM article_category_reference c
WHERE a.article_id = c.article_id AND c.category_id = 54)
ORDER BY
a.date DESC
LIMIT 15;
The logical problem with your current approach of checking the category in the WHERE clause is that it is checking individual records. You need to assert that all category records for a given article, in aggregate, do not match the category you wish to exclude. An EXISTS clause, as I have written above, is one way to do it. Using GROUP BY in a subquery is another way.
The NOT IN condition is evaluated for each joined rows. Since you have same article_id with multiple category_id-values, the ones that do match the NOT IN condition will get picked.
See SQLFiddle.
To select articles that do not have any rows with category_id 54 use a subquery:
SELECT
a.article_id,
a.`title`,
a.`text`,
a.`date`
FROM `articles` a
WHERE a.`active` = 1 AND
a.`article_id` not in (
SELECT c.article_id
FROM `article_category_reference` c
WHERE c.`category_id` = 54
)
ORDER BY a.`date`
DESC
LIMIT 15

Improve speed of MySQL query with 5 left joins

Working on a support ticketing system with not a lot of tickets (~3,000). To get a summary grid of ticket information, there are five LEFT JOIN statements on custom field table (j25_field_value) containing about 10,000 records. The query runs too long (~10 seconds) and in cases with a WHERE clause, it runs even longer (up to ~30 seconds or more).
Any suggestions for improving the query to reduce the time to run?
Four tables:
j25_support_tickets
CREATE TABLE `j25_support_tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) NOT NULL DEFAULT '0',
`user_id` int(11) DEFAULT NULL,
`email` varchar(50) DEFAULT NULL,
`subject` varchar(255) DEFAULT NULL,
`message` text,
`modified_date` datetime DEFAULT NULL,
`priority_id` tinyint(3) unsigned DEFAULT NULL,
`status_id` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3868 DEFAULT CHARSET=utf8
j25_support_priorities
CREATE TABLE `j25_support_priorities` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=14 DEFAULT CHARSET=utf8
j25_support_statuses
CREATE TABLE `j25_support_statuses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=utf8
j25_field_value (id, ticket_id, field_id, field_value)
CREATE TABLE `j25_support_field_value` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ticket_id` int(11) DEFAULT NULL,
`field_id` int(11) DEFAULT NULL,
`field_value` tinytext,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=10889 DEFAULT CHARSET=utf8
Also, ran this:
SELECT LENGTH(field_value) len FROM j25_support_field_value ORDER BY len DESC LIMIT 1
note: the result = 38
The query:
SELECT DISTINCT t.id as ID
, (select p.title from j25_support_priorities p where p.id = t.priority_id) as Priority
, (select s.title from j25_support_statuses s where s.id = t.status_id) as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5
ALTER TABLE j25_support_field_value
ADD INDEX (`ticket_id`,`field_id`,`field_value`(50))
This index will work as a covering index for your query. It will allow the joins to use only this index to look up the values. It should perform massively faster than without this index, since currently your query would have to read every row in the table to find what matches each combination of ticket_id and field_id.
I would also suggest converting your tables to InnoDB engine, unless you have a very explicit reason for using MyISAM.
ALTER TABLE tablename ENGINE=InnoDB
As above - a better index would help. You could probably then simplify your query into something like this too (join to the table only once):
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, case when v.field_id=1 then v.field_value else null end as IssueType
, case when v.field_id=2 then v.field_value else null end as Version
, case when v.field_id=3 then v.field_value else null end as UserType
, case when v.field_id=4 then v.field_value else null end as Company
, case when v.field_id=5 then v.field_value else null end as RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value v ON t.id = v.ticket_id
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;
You can do away with the subqueries for starters and just get them from another join. You can add an index to j25_support_field_value
alter table j25_support_field_value add key(id, field_type);
I assume there is an index on id in j25_support_tickets - if not and if they are unique, add a unique index alter table j25_support_tickets add unique key(id); If they're not unique, remove the word unique from that statement.
In MySQL, a join usually requires an index on the field(s) that you are using to join on. This will hold up and produce very reasonable results with huge tables (100m+), if you follow that rule, you will not go wrong.
are the ids in j25_support_tickets unique? If they are you can do away with the distinct - if not, or if you are getting exact dupicates in each row, still do away with the distinct and add a group by t.id to the end of this:
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;
Switch to InnoDB.
After switching to InnoDB, make the PRIMARY KEY for j25_support_field_value be (ticket_id, field_id) (and get rid if id). (Tacking on field_value(50) will hurt, not help.)
A PRIMARY KEY is a UNIQUE KEY, so don't have both.
Use VARCHAR(255) instead of the nearly-equivalent TINYTEXT.
EAV schema sucks. My ran on EAV.

Some help needed with a SQL query

I need some help with a MySQL query. I have two tables, one with offers and one with statuses. An offer can has one or more statuses. What I would like to do is get all the offers and their latest status. For each status there's a table field named 'added' which can be used for sorting.
I know this can be easily done with two queries, but I need to make it with only one because I also have to apply some filters later in the project.
Here's my setup:
CREATE TABLE `test`.`offers` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`client` TEXT NOT NULL ,
`products` TEXT NOT NULL ,
`contact` TEXT NOT NULL
) ENGINE = MYISAM ;
CREATE TABLE `statuses` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`offer_id` int(11) NOT NULL,
`options` text NOT NULL,
`deadline` date NOT NULL,
`added` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
Should work but not very optimal imho :
SELECT *
FROM offers
INNER JOIN statuses ON (statuses.offer_id = offers.id
AND statuses.id =
(SELECT allStatuses.id
FROM statuses allStatuses
WHERE allStatuses.offer_id = offers.id
ORDER BY allStatuses.added DESC LIMIT 1))
Try this:
SELECT
o.*
FROM offers o
INNER JOIN statuses s ON o.id = s.offer_id
ORDER BY s.added
LIMIT 1