I have a list with filters, and I have to count how many items match each filter, but the following query gets slower and slower as more filters are set:
SELECT COUNT(*) FROM (
SELECT itf.`filter_id`
FROM `item_to_filter` AS `itf`
JOIN `item_to_inventory` AS `iti` ON (itf.`item_id` = iti.`item_id` AND iti.`quantity` > 0)
WHERE 1 = 1
AND (
(itf.`filter_group_id` = 2 AND itf.`filter_id` IN (1))
OR (itf.`filter_group_id` = 4 AND itf.`filter_id` IN (55)) -- gets slower
OR (itf.`filter_group_id` = 1 AND itf.`filter_id` IN (107, 108)) -- gets much slower
)
GROUP BY itf.`item_id`
HAVING COUNT(DISTINCT itf.`filter_group_id`) = 3
) AS `total_items`
Is there another way to write the query to count the items for each filter?
Here you can see the table structure and data (are the indexes set correctly?)
Try this one: Wrap the conditions into a UNION ALL subquery. Then join it with your tables.
SELECT COUNT(*) FROM (
SELECT itf.filter_id
FROM (
SELECT 2 AS filter_group_id, 1 AS filter_id UNION ALL
SELECT 4 AS filter_group_id, 55 AS filter_id UNION ALL
SELECT 1 AS filter_group_id, 107 AS filter_id UNION ALL
SELECT 1 AS filter_group_id, 108 AS filter_id
) f -- filters
JOIN item_to_filter AS itf USING(filter_group_id, filter_id)
JOIN item_to_inventory AS iti ON itf.item_id = iti.item_id
WHERE iti.quantity > 0
GROUP BY itf.item_id
HAVING COUNT(DISTINCT itf.filter_group_id) = 3
) AS total_items
You should define an index on item_to_filter(filter_group_id, filter_id). But as I wrote in the comment, it's possible that the engine can do the same optimization if that index is present.
To make better use of the given indexes you could rewrite your query to
SELECT COUNT(*) FROM (
SELECT item_id
FROM item_to_inventory AS iti
JOIN item_to_filter AS itf2 USING(item_id) -- itf2: group_id = 2
JOIN item_to_filter AS itf4 USING(item_id) -- itf4: group_id = 4
JOIN item_to_filter AS itf1 USING(item_id) -- itf1: group_id = 1
WHERE iti.quantity > 0
AND itf2.filter_group_id = 2 AND itf2.filter_id IN (1)
AND itf4.filter_group_id = 4 AND itf4.filter_id IN (55)
AND itf1.filter_group_id = 1 AND itf1.filter_id IN (107,108)
) AS total_items
But even here an index on item_to_filter(filter_group_id, filter_id) should improve the performance.
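The GROUP BY / HAVING COUNT(DISTINCT ...) pattern from the question can be tried out on a small data set. The following is only a sketch: it uses Python's sqlite3 module instead of MySQL, the rows are made up for illustration, and the index name idx_group_filter is hypothetical. It shows which items are counted, not how fast MySQL executes the query.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_to_filter (item_id INT, filter_group_id INT, filter_id INT);
CREATE TABLE item_to_inventory (item_id INT, quantity INT);
-- Composite index as recommended in the answer (the index name is made up).
CREATE INDEX idx_group_filter ON item_to_filter (filter_group_id, filter_id);
INSERT INTO item_to_filter VALUES
  (1, 2, 1), (1, 4, 55), (1, 1, 107),
  (2, 2, 1), (2, 4, 55), (2, 1, 109),
  (3, 2, 1), (3, 4, 55), (3, 1, 108),
  (4, 2, 1), (4, 4, 55), (4, 1, 107), (4, 1, 108);
INSERT INTO item_to_inventory VALUES (1, 5), (2, 3), (3, 0), (4, 2);
""")

# An item qualifies only if it matches a filter in all three groups
# and has stock. Item 2 has the wrong filter in group 1; item 3 is out of stock.
total_items = conn.execute("""
SELECT COUNT(*) FROM (
  SELECT itf.item_id
  FROM item_to_filter AS itf
  JOIN item_to_inventory AS iti
    ON itf.item_id = iti.item_id AND iti.quantity > 0
  WHERE (itf.filter_group_id = 2 AND itf.filter_id IN (1))
     OR (itf.filter_group_id = 4 AND itf.filter_id IN (55))
     OR (itf.filter_group_id = 1 AND itf.filter_id IN (107, 108))
  GROUP BY itf.item_id
  HAVING COUNT(DISTINCT itf.filter_group_id) = 3
) AS total_items
""").fetchone()[0]

print(total_items)  # items 1 and 4 qualify
```

Item 4 matches two filters inside group 1 but is still counted once, because the HAVING clause counts distinct groups rather than matching rows.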
Related
I have a query making use of COALESCE() to count the combination of 2 columns:
SELECT method, main_ingredient, COUNT(*) AS cnt FROM `recipes`
GROUP BY COALESCE( method, main_ingredient )
The result is useful. Sample result:
method main_ingredient cnt
================================
1 4 10
2 1 6
3 6 3
4 6 5
5 2 4
6 8 2
However, how can I obtain the combinations for which COUNT(*) equals 0?
UPDATE with expected output:
method main_ingredient cnt
================================
1 2 0
1 3 0
1 5 0
1 6 0
2 2 0
2 3 0
.
.
.
.
etc
UPDATE added the tbl_methods and tbl_main_ingredients:
Schema of tbl_methods:
id method_name
=================
1 Method 1
2 Method 2
.
.
.
6 Method 6
Schema of tbl_main_ingredients:
id ingredient_name
======================
1 Ingredient 1
2 Ingredient 2
.
.
.
8 Ingredient 8
In both tables, id is the auto-increment primary key.
First you need to CROSS JOIN the tbl_methods and tbl_main_ingredients tables in order to obtain all possible combinations of method and ingredient.
Then LEFT JOIN that cross-joined result to your recipes table on matching method and main_ingredient.
This gives you a row for every possible combination of method and main_ingredient. If a combination exists in the recipes table you get its count; otherwise you get 0.
SELECT
method_ingredients.method_id,
method_ingredients.ingredients_id,
COUNT(R.method) AS cnt
FROM
(
SELECT
TM.id AS method_id,
TMI.id AS ingredients_id
FROM tbl_methods TM
CROSS JOIN tbl_main_ingredients TMI
) AS method_ingredients
LEFT JOIN `recipes` R ON R.method = method_ingredients.method_id AND R.main_ingredient = method_ingredients.ingredients_id
GROUP BY method_ingredients.method_id, method_ingredients.ingredients_id
ORDER BY method_ingredients.method_id, method_ingredients.ingredients_id;
Or you may prefer the shorter version of this query:
SELECT
TM.id AS method_id,
TMI.id AS ingredients_id,
COUNT(R.method) AS cnt
FROM tbl_methods TM
CROSS JOIN tbl_main_ingredients TMI
LEFT JOIN `recipes` R ON R.method = TM.id AND R.main_ingredient = TMI.id
GROUP BY TM.id, TMI.id
ORDER BY TM.id, TMI.id;
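To see why the query counts R.method rather than using COUNT(*), here is a small sketch using Python's sqlite3 module in place of MySQL. The three recipes rows are invented for illustration; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_methods (id INTEGER PRIMARY KEY, method_name TEXT);
CREATE TABLE tbl_main_ingredients (id INTEGER PRIMARY KEY, ingredient_name TEXT);
CREATE TABLE recipes (method INT, main_ingredient INT);
INSERT INTO tbl_methods VALUES (1, 'Method 1'), (2, 'Method 2');
INSERT INTO tbl_main_ingredients VALUES (1, 'Ingredient 1'), (2, 'Ingredient 2');
INSERT INTO recipes VALUES (1, 1), (1, 1), (2, 2);  -- made-up sample rows
""")

# Same shape as the shorter query above; only the COUNT expression varies.
query = """
SELECT TM.id, TMI.id, {count_expr} AS cnt
FROM tbl_methods TM
CROSS JOIN tbl_main_ingredients TMI
LEFT JOIN recipes R ON R.method = TM.id AND R.main_ingredient = TMI.id
GROUP BY TM.id, TMI.id
ORDER BY TM.id, TMI.id
"""

# COUNT(R.method) skips the NULLs produced by unmatched LEFT JOIN rows.
counts_by_column = conn.execute(query.format(count_expr="COUNT(R.method)")).fetchall()
# COUNT(*) counts the NULL-padded row too, so empty combinations show 1.
counts_by_star = conn.execute(query.format(count_expr="COUNT(*)")).fetchall()

print(counts_by_column)  # [(1, 1, 2), (1, 2, 0), (2, 1, 0), (2, 2, 1)]
print(counts_by_star)    # [(1, 1, 2), (1, 2, 1), (2, 1, 1), (2, 2, 1)]
```

Counting a column from the left-joined table is what turns the missing combinations into 0 instead of 1.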
More:
Some subtleties regarding COUNT:
SELECT COUNT(0); Result: 1
SELECT COUNT(-1); Result: 1
SELECT COUNT(NULL); Result: 0
SELECT COUNT(71); Result: 1
BTW, COALESCE has nothing to do with your use case. COALESCE returns the first non-NULL element from the list if there is one, otherwise NULL.
Example:
SELECT COALESCE(NULL,NULL,NULL,'abc',NULL,'def'); returns abc
SELECT COALESCE(NULL,NULL,NULL); returns NULL
It could be that you need to check whether main_ingredient is NULL:
SELECT method, ifnull(main_ingredient,0), COUNT(*) AS cnt FROM `recipes`
GROUP BY method
Cross join your 2 base tables, then left join on recipes. Then, if you count any of the left joined columns, you will get the desired result:
select m.id, i.id, count(r.method) as cnt
from tbl_methods m
cross join tbl_main_ingredients i
left join recipes r
on r.method = m.id
and r.main_ingredient = i.id
group by m.id, i.id
order by m.id, i.id
I have the following tables, for example:
invoices
ID Name
1 A
2 B
3 C
4 D
5 E
transactions
ID Invoice_ID User_ID
1 1 10
2 1 10
3 1 10
4 2 30
5 3 20
6 3 40
7 2 30
8 2 30
9 4 40
10 3 50
Now I want to make a select that will pull the invoices and the user_id from the related transactions, but of course if I do that I won't get all the ids, since they may be distinct but there will be only one column for that. What I want to do is that if there are distinct User_ids, I will display a pre-defined text in the column instead of the actual result.
select invoices.id, invoices.name, transactions.user_id(if there are distinct user_ids -> return null)
from invoices
left join transactions on invoices.id = transactions.invoice_id
and then this would be the result
ID Name User_ID
1 A 10
2 B 30
3 C null
4 D 40
5 E null
Is this possible?
You can do the following:
select
invoices.id,
invoices.name,
IF (
(SELECT COUNT(DISTINCT user_id) FROM transactions WHERE transactions.invoice_id = invoices.id) = 1,
(SELECT MAX(user_id) FROM transactions WHERE transactions.invoice_id = invoices.id),
null
) AS user_id
from invoices
Or, alternatively, you can use the GROUP_CONCAT function to output a comma-separated list of users for each invoice. It is not exactly what you asked for, but it may in fact be more useful:
select
invoices.id,
invoices.name,
GROUP_CONCAT(DISTINCT transactions.user_id SEPARATOR ',') AS user_ids
from invoices
left join transactions on invoices.id = transactions.invoice_id
group by invoices.id
Try something like:
select i.id, i.name, t.user_id
from invoices i left join
(
select invoice_ID, max(User_ID) as user_id
from transactions
group by invoice_ID
having count(distinct User_ID)=1
) t on i.id=t.invoice_id
You could list all the invoices that have multiple user ids, like this:
select invoices.id, invoices.name, null as user_id
from invoices
left join transactions on invoices.id = transactions.invoice_id
group by invoices.id, invoices.name
having count(distinct transactions.user_id) > 1
Also, I think this CASE might suit your needs here:
select invoices.id, invoices.name,
case when count(distinct transactions.user_id) > 1 then null else max(transactions.user_id) end as user_id
from invoices
left join transactions on invoices.id = transactions.invoice_id
group by invoices.id, invoices.name
although I'm not sure this is syntactically correct.
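The "show the user only when it is unambiguous" idea can be checked against the sample data from the question. This is a sketch using Python's sqlite3 module rather than MySQL; it combines COUNT(DISTINCT ...) with MAX() so that every selected column is either grouped or aggregated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (id INT, name TEXT);
CREATE TABLE transactions (id INT, invoice_id INT, user_id INT);
INSERT INTO invoices VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E');
INSERT INTO transactions VALUES
  (1,1,10),(2,1,10),(3,1,10),(4,2,30),(5,3,20),
  (6,3,40),(7,2,30),(8,2,30),(9,4,40),(10,3,50);
""")

# Exactly one distinct user -> return it (MAX picks the only value);
# zero or several distinct users -> CASE falls through to NULL.
rows = conn.execute("""
SELECT i.id, i.name,
       CASE WHEN COUNT(DISTINCT t.user_id) = 1 THEN MAX(t.user_id) END AS user_id
FROM invoices i
LEFT JOIN transactions t ON t.invoice_id = i.id
GROUP BY i.id, i.name
ORDER BY i.id
""").fetchall()

for row in rows:
    print(row)
```

Invoice 3 has three different users and invoice 5 has no transactions, so both come back with NULL (None in Python), matching the expected result in the question.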
Create or replace view cnPointsDetailsvw
as select sum(cd.value), sum(cd1.points)
from customerdetails cd left join
customerdetails1 cd1 on cd.customerid = cd1.customerid;
The problem is that the above query sums the column cd1.points multiple times.
If table customerdetails1 has only 1 row, why do you use the SUM() function?
Just use MAX().
I am confused by your tables, so let me give a sample structure and data.
table1
id points
-----------
1 10
2 20
3 40
table2
id points
-----------
1 10
1 2
1 4
2 20
3 40
3 5
And your query should look like this:
CREATE OR REPLACE VIEW view_name AS
SELECT t1.id,max(t1.points) as points1, sum(t2.points) as points2
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
GROUP BY t1.id
Your view should look like this:
id points1 points2
---------------------
1 10 16
2 20 20
3 40 45
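The reason for MAX() on the table1 side is that the join repeats each table1 row once per matching table2 row, so a plain SUM() adds the same value several times. A small sketch with Python's sqlite3 module (standing in for MySQL) and the sample tables above shows the difference:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INT, points INT);
CREATE TABLE table2 (id INT, points INT);
INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 40);
INSERT INTO table2 VALUES (1, 10), (1, 2), (1, 4), (2, 20), (3, 40), (3, 5);
""")

# Naive version: t1.points is repeated for every matching t2 row,
# so SUM(t1.points) is inflated (id 1 has three matches, id 3 has two).
inflated = conn.execute("""
SELECT t1.id, SUM(t1.points) AS points1, SUM(t2.points) AS points2
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()

# MAX() collapses the repeated t1 value back to a single copy.
corrected = conn.execute("""
SELECT t1.id, MAX(t1.points) AS points1, SUM(t2.points) AS points2
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
GROUP BY t1.id
ORDER BY t1.id
""").fetchall()

print(inflated)   # [(1, 30, 16), (2, 20, 20), (3, 80, 45)]
print(corrected)  # [(1, 10, 16), (2, 20, 20), (3, 40, 45)]
```

MAX() only works here because table1 has one row per id; if both sides can have several rows per key, each side has to be aggregated before joining.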
Do the calculation in subqueries, then join their results:
SELECT
CD.sum_value, CD1.sum_points
FROM
(SELECT sum(value) as sum_value FROM customerdetails) CD
INNER JOIN (SELECT sum(points) AS sum_points FROM customerdetails1) CD1
ON 1 = 1
Please note that SUM() without GROUP BY returns NULL when there are no matching rows, so each subquery returns exactly one row; any ON condition that evaluates to true will therefore work.
If you want to group by customers, then do the grouping in the subqueries:
SELECT
CD.customerid, CD.sum_value, CD1.sum_points
FROM
(
SELECT customerid, sum(value) as sum_value
FROM customerdetails
GROUP BY customerid
) CD
LEFT JOIN
(
SELECT customerid, sum(points) AS sum_points
FROM customerdetails1
GROUP BY customerid
) CD1
ON CD.customerid = CD1.customerid
UPDATE
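To create a view (and work around MySQL's restriction, in versions before 5.7.7, that a view definition cannot contain a subquery in the FROM clause), you have to create 3 views: two for the two subresults and one to join their results: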
CREATE VIEW customer_value AS
SELECT SUM(value) as sum_value FROM customerdetails;
CREATE VIEW customer_points AS
SELECT SUM(points) as sum_points FROM customerdetails1;
CREATE VIEW cnPointsDetailsvw AS
SELECT cv.sum_value, cp.sum_points
FROM customer_value cv
INNER JOIN customer_points cp
ON 1=1;
I've got a database table with logs which has 3 columns:
date | status | projectId
status can be either 0 or 1, primary key is on date and projectID
I'm trying to find out how many times a projectID had status 0 since the last time it was 1.
so if there would be only one projectId
date | status | projectId
1 0 3
2 0 3
3 1 3
4 1 3
5 0 3
6 0 3
this should return 2 (row 5 and 6 are 0 and row 4 is 1)
The thing that makes it hard for me is that I have to maintain the order of date. What would be a good way to tackle such problems, and this one in particular?
Here is how you would do it for one project:
select count(*)
from logs l
where status = 0 and
projectid = 3 and
date > (select max(date) from logs where projectid = 3 and status = 1)
Here is how you would do it for all projects:
select l.projectId, count(l1.projectId)
from logs l left outer join
(select projectId, max(date) as maxdate
from logs
where status = 1
group by projectId
) l1
on l.projectId = l1.projectId and
l.date > l1.maxdate and
l.status = 0
group by l.projectId;
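The all-projects query can be checked on the sample data with Python's sqlite3 module (a sketch only, standing in for MySQL). Project 4 is a made-up extra project whose latest row has status 1, added to show that it is reported with a count of 0. The join compares against the maxdate alias defined in the inline view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE logs (date INT, status INT, projectId INT, PRIMARY KEY (date, projectId));
INSERT INTO logs VALUES
  (1, 0, 3), (2, 0, 3), (3, 1, 3), (4, 1, 3), (5, 0, 3), (6, 0, 3),
  (1, 1, 4), (2, 0, 4), (3, 1, 4);  -- hypothetical second project
""")

# l1 holds the latest status=1 date per project. A log row only finds a
# partner in l1 if it is a status=0 row after that date, and COUNT()
# ignores the NULLs of the rows that found no partner.
zeros_since_last_one = conn.execute("""
SELECT l.projectId, COUNT(l1.projectId) AS cnt
FROM logs l
LEFT OUTER JOIN (
  SELECT projectId, MAX(date) AS maxdate
  FROM logs
  WHERE status = 1
  GROUP BY projectId
) l1
  ON l.projectId = l1.projectId
 AND l.date > l1.maxdate
 AND l.status = 0
GROUP BY l.projectId
ORDER BY l.projectId
""").fetchall()

print(zeros_since_last_one)  # [(3, 2), (4, 0)]
```

A project that has never had status 1 has no row in l1 at all, so this query reports 0 for it as well; whether that is the desired answer depends on the requirements.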
Here is an option using just one SELECT:
http://sqlfiddle.com/#!2/6ce87/11
select *
from logs
where status=0 and date > (select date from logs where status=1 order by date desc limit 1)
Here's one way to get the result for all project_id values:
SELECT m.project_id
, COUNT(1) AS mycount
FROM ( SELECT l.project_id
, MAX(l.date) AS latest_date
FROM mytable l
WHERE l.status = 1
GROUP BY l.project_id
) m
JOIN mytable t
ON t.project_id = m.project_id
AND t.date > m.latest_date
AND t.status = 0
GROUP BY m.project_id
If you need only a subset of project_id, the predicate should be added to the WHERE clause in the inline view query:
WHERE l.status = 1
AND l.project_id IN (3,5,7)
EDIT
That query does not return a row if there is no status=0 row after the latest status=1 row. To return a zero count, this could be done with an outer join.
SELECT m.project_id
, COUNT(t.status) AS mycount
FROM ( SELECT l.project_id
, MAX(l.date) AS latest_date
FROM mytable l
WHERE l.status = 1
AND l.project_id IN (3)
GROUP BY l.project_id
) m
LEFT
JOIN mytable t
ON t.project_id = m.project_id
AND t.date > m.latest_date
AND t.status = 0
GROUP BY m.project_id
For optimum performance, the statement could make use of an index with leading columns of project_id and date (in that order) and including the status column, e.g.
ON mytable (`project_id`,`date`,`status`)
Say I have the following table, named data:
ID foo1 foo2 foo3
1 11 22 33
2 22 17 92
3 31 33 53
4 53 22 11
5 43 23 9
I want to select all rows where either foo1, foo2 or foo3 match either of these columns in the first row. That is, I want all rows where at least one of the foos appears also in the first row. In the example above, I want to select rows 1, 2, 3 and 4. I thought that I could use something like
SELECT * FROM data WHERE foo1 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
OR foo2 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
OR foo3 IN (SELECT foo1,foo2,foo3 FROM data WHERE ID=1)
but this does not seem to work. I can, of course, use
WHERE foo1=(SELECT foo1 FROM data WHERE ID=1)
OR foo1=(SELECT foo2 FROM data WHERE ID=1)
OR ...
but that would involve many lines, and in my real data set there are actually 16 columns, so it will really be a pain in the lower back. Is there a more sophisticated way to do so?
Also, what should I do if I also want to count the number of hits (in the example above, get 3 for row 1, 2 for row 4, and 1 for rows 2 and 3)?
SELECT data.*,
(data.foo1 IN (t.foo1, t.foo2, t.foo3))
+ (data.foo2 IN (t.foo1, t.foo2, t.foo3))
+ (data.foo3 IN (t.foo1, t.foo2, t.foo3)) AS number_of_hits
FROM data JOIN data t ON t.id = 1
WHERE data.foo1 IN (t.foo1, t.foo2, t.foo3)
OR data.foo2 IN (t.foo1, t.foo2, t.foo3)
OR data.foo3 IN (t.foo1, t.foo2, t.foo3)
See it on sqlfiddle.
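The query works because a comparison such as x IN (...) evaluates to 1 or 0, so the three tests can simply be added up. Here is a sketch with Python's sqlite3 module (SQLite treats these boolean expressions the same way as MySQL does here) on the sample table from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (ID INT, foo1 INT, foo2 INT, foo3 INT);
INSERT INTO data VALUES
  (1, 11, 22, 33), (2, 22, 17, 92), (3, 31, 33, 53),
  (4, 53, 22, 11), (5, 43, 23, 9);
""")

# t is the reference row (ID = 1); each IN test contributes 0 or 1.
hits = conn.execute("""
SELECT data.ID,
       (data.foo1 IN (t.foo1, t.foo2, t.foo3))
     + (data.foo2 IN (t.foo1, t.foo2, t.foo3))
     + (data.foo3 IN (t.foo1, t.foo2, t.foo3)) AS number_of_hits
FROM data JOIN data t ON t.ID = 1
WHERE data.foo1 IN (t.foo1, t.foo2, t.foo3)
   OR data.foo2 IN (t.foo1, t.foo2, t.foo3)
   OR data.foo3 IN (t.foo1, t.foo2, t.foo3)
ORDER BY data.ID
""").fetchall()

print(hits)  # [(1, 3), (2, 1), (3, 1), (4, 2)]
```

Row 5 shares no value with row 1, so the WHERE clause drops it.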
Actually, on reflection, you might consider normalising your data:
CREATE TABLE data_new (
ID BIGINT UNSIGNED NOT NULL,
foo_number TINYINT UNSIGNED NOT NULL,
val INT,
PRIMARY KEY (ID, foo_number),
INDEX (val)
);
INSERT INTO data_new
(ID, foo_number, val)
SELECT ID, 1, foo1 FROM data
UNION ALL SELECT ID, 2, foo2 FROM data
UNION ALL SELECT ID, 3, foo3 FROM data;
DROP TABLE data;
Then you can do:
SELECT ID,
MAX(IF(foo_number=1,val,NULL)) AS foo1,
MAX(IF(foo_number=2,val,NULL)) AS foo2,
MAX(IF(foo_number=3,val,NULL)) AS foo3,
number_of_hits
FROM data_new JOIN (
SELECT d1.ID, COUNT(*) AS number_of_hits
FROM data_new d1 JOIN data_new d2 USING (val)
WHERE d2.ID = 1
GROUP BY d1.ID
) t USING (ID)
GROUP BY ID
See it on sqlfiddle.
As you can see from the execution plan, this will be considerably more efficient for large data sets.
There are several ways to get the result set.
Here's one approach (if you don't care about which fooN gets matched with which fooN, and also want to return that "first" row).
SELECT DISTINCT d.*
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
That ON clause could also be written like this:
ON d.foo1 = f.foo
OR d.foo2 = f.foo
OR d.foo3 = f.foo
To get a "count" of the hits...
SELECT d.id
, d.foo1
, d.foo2
, d.foo3
, SUM( IFNULL(d.foo1=f.foo,0)
+IFNULL(d.foo2=f.foo,0)
+IFNULL(d.foo3=f.foo,0)
) AS count_of_hits
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
GROUP
BY d.id
, d.foo1
, d.foo2
, d.foo3
eggyal is right, as usual. Getting the count of hits is actually much simpler: we can just use a SUM(1) or COUNT(1) aggregate, no need to run all those comparisons, we've already done all the necessary comparisons.
SELECT d.id
, d.foo1
, d.foo2
, d.foo3
, COUNT(1) AS count_of_hits
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
UNION ALL
SELECT foo2 FROM data WHERE id = 1
UNION ALL
SELECT foo3 FROM data WHERE id = 1
) f
JOIN data d
ON f.foo IN (d.foo1, d.foo2, d.foo3)
GROUP
BY d.id
, d.foo1
, d.foo2
, d.foo3
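The COUNT-based variant counts one hit per reference value that appears anywhere in the row, which is not always the same as counting matching columns. The following sketch uses Python's sqlite3 module in place of MySQL; row 6 is a made-up extra row, added only to show where the two definitions diverge.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (id INT, foo1 INT, foo2 INT, foo3 INT);
INSERT INTO data VALUES
  (1, 11, 22, 33), (2, 22, 17, 92), (3, 31, 33, 53),
  (4, 53, 22, 11), (5, 43, 23, 9),
  (6, 22, 22, 7);  -- hypothetical row with a repeated value
""")

# Unpivot the reference row into one value per line, then join:
# every (reference value, row) pair that matches yields one joined row.
count_of_hits = conn.execute("""
SELECT d.id, COUNT(*) AS count_of_hits
FROM ( SELECT foo1 AS foo FROM data WHERE id = 1
       UNION ALL
       SELECT foo2 FROM data WHERE id = 1
       UNION ALL
       SELECT foo3 FROM data WHERE id = 1
     ) f
JOIN data d
  ON f.foo IN (d.foo1, d.foo2, d.foo3)
GROUP BY d.id
ORDER BY d.id
""").fetchall()

# Per-column count for row 6 only, for comparison.
per_column_row6 = conn.execute("""
SELECT (d.foo1 IN (t.foo1, t.foo2, t.foo3))
     + (d.foo2 IN (t.foo1, t.foo2, t.foo3))
     + (d.foo3 IN (t.foo1, t.foo2, t.foo3))
FROM data d JOIN data t ON t.id = 1
WHERE d.id = 6
""").fetchone()[0]

print(count_of_hits)    # [(1, 3), (2, 1), (3, 1), (4, 2), (6, 1)]
print(per_column_row6)  # 2
```

For the original five rows both definitions agree; they differ only when a row repeats a value, as row 6 does with 22.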