Is there a better way to do this query? - mysql

SELECT *
FROM a
WHERE a.re_id = 3443499
AND a.id IN
(
SELECT b.rsp_id FROM b
WHERE b.f_id = 9
GROUP BY b.rsp_id
HAVING FIND_IN_SET(16, GROUP_CONCAT(b.o_id)) > 0
AND FIND_IN_SET(15, GROUP_CONCAT(b.o_id)) > 0
UNION
SELECT b.rsp_id FROM b
WHERE b.f_id = 4
GROUP BY b.rsp_id
HAVING FIND_IN_SET(5, GROUP_CONCAT(b.o_id)) > 0
)
ORDER BY id DESC
Here "f_id" is the key of an array whose values are the ones passed as the first parameter of the "FIND_IN_SET" function.
For example:
9=>(
16,
15
),
4=>(
5
)
Sample data for the two columns f_id and o_id in table b:
f_id o_id
9 15
9 18
9 23
4 5
3 8

The gist of this answer is that the current query does not run. So, fix the syntax and ask another question.
First, you could write the query so it is syntactically correct. The query will fail as written, because the first subquery returns at least two rows and the second only one.
Second, use UNION ALL instead of UNION, unless you specifically want to incur the overhead of removing duplicates.
Third, the ORDER BY will generate an error.
Fourth, the GROUP_CONCAT() is dangerous and unnecessary.
I'm not 100% sure this is the intention, but I would start with a query like this:
SELECT a.id, a.re_id
FROM a
WHERE a.re_id = 3443499 AND
a.id IN (SELECT b.rsp_id
FROM b
WHERE b.f_id = 9
GROUP BY b.rsp_id
HAVING MAX(b.o_id = 16) > 0 AND
MAX(b.o_id = 15) > 0
)
UNION ALL
SELECT b.rsp_id, NULL
FROM b
WHERE b.f_id = 4
GROUP BY b.rsp_id
HAVING MAX(b.o_id = 5) > 0
ORDER BY id;
Then, if you want this optimized, I would suggest asking another question, along with relevant information about the table structures and current performance.
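To see what the MAX(condition) > 0 test in the HAVING clause does, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption; both treat a comparison as 0 or 1). The rsp_id values are hypothetical, since the question only shows f_id and o_id. It also avoids GROUP_CONCAT(), whose result MySQL truncates at group_concat_max_len.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE b (rsp_id INTEGER, f_id INTEGER, o_id INTEGER);
-- rsp_id values are made up for illustration
INSERT INTO b VALUES
  (1, 9, 15), (1, 9, 16), (1, 9, 23),  -- has both 15 and 16 for f_id 9
  (2, 9, 15), (2, 9, 18),              -- has 15 but not 16
  (3, 4, 5),                           -- has 5 for f_id 4
  (4, 3, 8);                           -- unrelated f_id
""")

# MAX(o_id = 16) is 1 when at least one row in the group has o_id = 16.
matching = [row[0] for row in conn.execute("""
    SELECT rsp_id FROM b
    WHERE f_id = 9
    GROUP BY rsp_id
    HAVING MAX(o_id = 16) > 0 AND MAX(o_id = 15) > 0
    UNION ALL
    SELECT rsp_id FROM b
    WHERE f_id = 4
    GROUP BY rsp_id
    HAVING MAX(o_id = 5) > 0
    ORDER BY rsp_id
""")]
print(matching)
```

Only the groups that contain every required o_id survive the HAVING filter.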

Related

MySQL query row/column difference

I'm trying to write a MySQL query that does the following:
I have a table that looks like this
Id t
-----------
1 1
2 4
1 6
2 9
1 12
2 14
I need to find the sum of the t column for each Id of 2, and subtract from it the sum of the t column for each Id of 1.
So for this example, the sum of Id 1 is 19, and the sum of Id 2 is 27.
I would want the output to then be 8.
I would imagine the statement would look similar to:
SELECT sum(t) WHERE Id = 2 - sum(t) WHERE Id = 1;
But this obviously isn't proper syntax.
And I apologize for the poorly drawn table, I'm still new to stackoverflow.
You could use a CASE expression:
SELECT
SUM(CASE
WHEN Id = 2 THEN t
WHEN Id = 1 THEN 0 - t
ELSE 0
END) AS mySum
FROM myTable
Hopefully that works as-is... I only have SQL Server to test on, but the syntax should be the same for MySQL.
SELECT SUM(IF(`id` = 2, t, 0)) - SUM(IF(`id` = 1, t, 0)) as `result` FROM `table`
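As a quick check of the conditional-sum idea against the sample data from the question, here is a small sketch that runs the CASE version through Python's sqlite3 module (used here only as a convenient stand-in for MySQL; the table name myTable follows the answer above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE myTable (Id INTEGER, t INTEGER);
INSERT INTO myTable VALUES (1, 1), (2, 4), (1, 6), (2, 9), (1, 12), (2, 14);
""")

# Rows with Id = 2 are added, rows with Id = 1 are subtracted, the rest ignored.
(my_sum,) = conn.execute("""
    SELECT SUM(CASE
                 WHEN Id = 2 THEN t
                 WHEN Id = 1 THEN -t
                 ELSE 0
               END) AS mySum
    FROM myTable
""").fetchone()
print(my_sum)
```

The sum for Id 2 is 27 and for Id 1 is 19, so the single pass over the table yields 8.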
It depends on how big your table is. If it is small or has no indexes, you can do:
select sum(if( Id=2,t,if(Id=1,-t,0)))
from data;
If you have plenty of rows and an index on column Id:
select sum(id2)-sum(id1)
from (
select 0 as 'id1', sum(t) as 'id2'
from data
where id=2
union
select sum(t) as 'id1', 0 as 'id2'
from data
where id=1
) as d;

Nested queries and Join

As a beginner with SQL, I’m ok to do simple tasks but I’m struggling right now with multiple nested queries.
My problem is that I have 3 tables like this:
a Case table:
id nd date username
--------------------------------------------
1 596 2016-02-09 16:50:03 UserA
2 967 2015-10-09 21:12:23 UserB
3 967 2015-10-09 22:35:40 UserA
4 967 2015-10-09 23:50:31 UserB
5 580 2017-02-09 10:19:43 UserA
a Value table:
case_id labelValue_id Value Type
-------------------------------------------------
1 3633 2731858342 X
1 124 ["864","862"] X
1 8981 -2.103 X
1 27 443 X
... ... ... ...
2 7890 232478 X
2 765 0.2334 X
... ... ... ...
and a Label table:
id label
----------------------
3633 Value of W
124 Value of X
8981 Value of Y
27 Value of Z
Obviously, I want to join these tables. So I can do something like this:
SELECT *
from Case, Value, Label
where Case.id= Value.case_id
and Label.id = Value.labelValue_id
but I get pretty much everything whereas I would like to be more specific.
What I want is to do some filtering on the Case table and then use the resulting ids to join the two other tables. I'd like to:
1. Filter the Case.nd's such that if there are several instances of the same nd, take the oldest one,
2. Limit the number of nd's in the query. For example, I want to be able to join the tables for just 2, 3, 4 etc. different nd,
3. Use this query to make a join on the Value and Label tables.
For example, the output of the queries 1 and 2 would be:
id nd date username
--------------------------------------------
1 596 2016-02-09 16:50:03 UserA
2 967 2015-10-09 21:12:23 UserB
if I ask for 2 different nd. The nd 967 appears several times but we take the oldest one.
In fact, I think I found out how to do all these things but I can't/don't know how to merge them.
To select the oldest nd, I can do something like:
select min((date)), nd,id
from Case
group by nd
Then, to limit the number of nd in the output, I found this (based on this and that) :
select *,
@num := if(@type <> t.nd, @num + 1, 1) as row_number,
@type := t.nd as dummy
from(
select min((date)), nd,id
from Case
group by nd
) as t
group by t.nd
having row_number <= 2 -- number of output
It works but I feel it's getting slow.
Finally, when I try to make a join with this subquery and with the two other tables, the processing keeps going on for ever.
During my research, I could find answers for every part of the problem but I can't merge them. Also, for the "counting" problem, where I want to limit the number of nd, I feel my approach is kind of far-fetched.
I realize this is a long question but I think I am missing something and I wanted to give as many details as possible.
To filter the case table so that only the oldest row for each nd remains:
select * from `case` c
where date = (Select min(date) from `case`
where nd = c.nd)
Then just join this to the other tables:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
where date = (Select min(date) from `case`
where nd = c.nd)
To limit it to a certain number of records, there is a MySQL-specific clause, I think it is called LIMIT:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
where date = (Select min(date) from `case`
where nd = c.nd)
Limit 4 -- <=== will limit the returned result set to 4 rows
If you only want records for the top N values of nd, then the LIMIT goes on a subquery restricting which values of nd to retrieve. MySQL rejects LIMIT directly inside an IN subquery, so put it in a derived table and join to it:
select * from `case` c
join value v on v.Case_id = c.Id
join label l on l.Id = v.labelValue_id
join (select distinct nd from `case`
order by nd desc Limit N) top_nd on top_nd.nd = c.nd
where date = (Select min(date) from `case`
where nd = c.nd)
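The two filtering steps (oldest row per nd, then only N distinct nd values) can be sketched as follows. This is a minimal illustration under some assumptions: it runs on Python's sqlite3 module rather than MySQL, the table is called cases (a hypothetical name chosen because CASE is a reserved word), and the helper oldest_cases is made up for this example. The joins to Value and Label would attach to the result in the same way as in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (id INTEGER PRIMARY KEY, nd INTEGER, date TEXT, username TEXT);
INSERT INTO cases VALUES
  (1, 596, '2016-02-09 16:50:03', 'UserA'),
  (2, 967, '2015-10-09 21:12:23', 'UserB'),
  (3, 967, '2015-10-09 22:35:40', 'UserA'),
  (4, 967, '2015-10-09 23:50:31', 'UserB'),
  (5, 580, '2017-02-09 10:19:43', 'UserA');
""")

def oldest_cases(conn, max_nd):
    """Return the oldest row for each of the top max_nd values of nd."""
    sql = """
        SELECT c.id, c.nd, c.date, c.username
        FROM cases c
        JOIN (SELECT nd, MIN(date) AS first_date  -- oldest date per nd
              FROM cases
              GROUP BY nd
              ORDER BY nd DESC
              LIMIT ?) AS picked                  -- keep only N nd values
          ON picked.nd = c.nd AND picked.first_date = c.date
        ORDER BY c.id
    """
    return conn.execute(sql, (max_nd,)).fetchall()

for row in oldest_cases(conn, 2):
    print(row)
```

Because the LIMIT sits inside the derived table, it restricts the number of nd values, not the number of joined rows.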
So finally, here is what worked well for me:
select *
from (
select *
from Case
join (
select nd as T_ND, date as T_date
from Case
where nd in (select distinct nd from Case)
group by T_ND Limit 5 -- <========= Limit of nd's
) as t
on Case.nd = t.T_ND
where date = (select min(date)
from Case
where nd = t.T_ND)
) as subquery
join Value
on Value.context_id = subquery.id
join Label
on Label.id = Value.labelValue_id
Thank you @charlesbretana for leading me on the right track :).

Count first occurrence with column value ordered by another column

I have an assigns table with the following columns:
id - int
id_lead - int
id_source - int
date_assigned - int (this represents a unix timestamp)
Now, let's say I have the following data in this table:
id id_lead id_source date_assigned
1 20 5 1462544612
2 20 6 1462544624
3 22 6 1462544615
4 22 5 1462544626
5 22 7 1462544632
6 25 6 1462544614
7 25 8 1462544621
Now, let's say I want to get a count of the rows whose id_source is 6 and which are the first entry for their lead (sorted by date_assigned asc).
So in this case, the count would = 2, because there are 2 leads (id_lead 22 and 25) whose first id_source is 6.
How would I write this query so that it is fast and would work fine as a subquery select? I was thinking something like this which doesn't work:
select count(*) from `assigns` where `id_source`=6 order by `date_assigned` asc limit 1
I have no idea how to write this query in an optimal way. Any help would be appreciated.
Pseudocode:
select rows
with a.id_source = 6
but only if
there do not exist any row
with same id_lead
and smaller date_assigned
Translate it to SQL
select * -- select rows
from assigns a
where a.id_source = 6 -- with a.id_source = 6
and not exists ( -- but only if there do not exist any row
select 1
from assigns a1
where a1.id_lead = a.id_lead -- with same id_lead
and a1.date_assigned < a.date_assigned -- and smaller date_assigned
)
Now replace select * with select count(*) and you'll get your result.
http://sqlfiddle.com/#!9/3dc0f5/7
Update:
The NOT EXISTS query can be rewritten as an excluding LEFT JOIN query:
select count(*)
from assigns a
left join assigns a1
on a1.id_lead = a.id_lead
and a1.date_assigned < a.date_assigned
where a.id_source = 6
and a1.id_lead is null
If you want to get the count for all values of id_source, the following query might be the fastest:
select a.id_source, count(1)
from (
select a1.id_lead, min(a1.date_assigned) date_assigned
from assigns a1
group by a1.id_lead
) a1
join assigns a
on a.id_lead = a1.id_lead
and a.date_assigned = a1.date_assigned
group by a.id_source
You still can replace group by a.id_source with where a.id_source = 6.
The queries need indexes on assigns(id_source) and assigns(id_lead, date_assigned).
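A short sketch to check both approaches against the sample data from the question, run through Python's sqlite3 module as a stand-in for MySQL (an assumption; the index names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assigns (
  id INTEGER PRIMARY KEY, id_lead INTEGER, id_source INTEGER, date_assigned INTEGER);
INSERT INTO assigns VALUES
  (1, 20, 5, 1462544612), (2, 20, 6, 1462544624),
  (3, 22, 6, 1462544615), (4, 22, 5, 1462544626), (5, 22, 7, 1462544632),
  (6, 25, 6, 1462544614), (7, 25, 8, 1462544621);
-- hypothetical index names
CREATE INDEX idx_assigns_source ON assigns (id_source);
CREATE INDEX idx_assigns_lead_date ON assigns (id_lead, date_assigned);
""")

# NOT EXISTS: count rows with id_source = 6 that have no earlier row for the same lead.
first_from_6 = conn.execute("""
    SELECT COUNT(*)
    FROM assigns a
    WHERE a.id_source = 6
      AND NOT EXISTS (SELECT 1
                      FROM assigns a1
                      WHERE a1.id_lead = a.id_lead
                        AND a1.date_assigned < a.date_assigned)
""").fetchone()[0]

# Derived table: find each lead's earliest assignment, then count per source.
per_source = dict(conn.execute("""
    SELECT a.id_source, COUNT(*)
    FROM (SELECT id_lead, MIN(date_assigned) AS date_assigned
          FROM assigns
          GROUP BY id_lead) f
    JOIN assigns a
      ON a.id_lead = f.id_lead AND a.date_assigned = f.date_assigned
    GROUP BY a.id_source
"""))
print(first_from_6, per_source)
```

Leads 22 and 25 start with source 6 and lead 20 starts with source 5, which is what both queries report.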
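A simpler query for that would be the following (check it here: http://sqlfiddle.com/#!9/8666e0/7):
-- note: this relies on MySQL returning the earliest row of each group, which is not guaranteed
select count(*) from
(select * from assigns group by id_lead )t
where t.id_source=6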

row dependent variables in mysql query

I have a MySQL table as follows. I am using it with phpMyAdmin.
id time_came time_exit
0 2 3
1 3 5
5 5 1
7 1 10
9 1 8
I want another column, "wait", computed with the following logic:
foreach(i in time_came){
wait=count(time_came<i&&i<time_exit)
}
So then each row has a "wait" value too. I can do this with PHP, but I need to do it with MySQL. I am confused because "i" varies for each row.
Thanks in advance.
Is this what you want?
select t.*,
(select count(*)
from `table` t2
where t2.time_came < t.time_came and t.time_came < t2.time_exit
) as wait
from `table` t;
EDIT:
To do the update, you need to use join:
update `table` t join
(select t.id,
(select count(*)
from `table` t2
where t2.time_came < t.time_came and t.time_came < t2.time_exit
) as wait
from `table` t
) as newval
on t.id = newval.id
set t.wait = newval.wait
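The counting logic from the question (for each row, count the rows with time_came < i < time_exit, where i is that row's time_came) can be sketched as a correlated subquery. This is a minimal illustration on Python's sqlite3 module rather than MySQL, and the table name visits is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (id INTEGER PRIMARY KEY, time_came INTEGER, time_exit INTEGER);
INSERT INTO visits VALUES (0, 2, 3), (1, 3, 5), (5, 5, 1), (7, 1, 10), (9, 1, 8);
""")

# For every outer row t, the subquery is evaluated with t.time_came playing
# the role of "i" in the pseudocode.
waits = conn.execute("""
    SELECT t.id,
           (SELECT COUNT(*)
            FROM visits t2
            WHERE t2.time_came < t.time_came
              AND t.time_came < t2.time_exit) AS wait
    FROM visits t
    ORDER BY t.id
""").fetchall()
print(waits)
```

With the sample data, the rows with ids 7 and 9 (arrived at 1, left at 10 and 8) are counted for every row that arrived later, so ids 0, 1 and 5 get a wait of 2 and ids 7 and 9 get 0.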

JOIN vs UNION vs IN() - big tables and many WHERE conditions

I use MySQL 5.5 and I have 3 tables created for testing:
attributes (entity_id, cid, aid, value) - indexes: ALL
items (entity_id, price, currency) - indexes: entity_id
rates (currency_from, currency_to, rate) - indexes: NONE
I need to count the results for specified conditions (search by attributes) and select X rows ordered by some column.
The query should support searching in item attributes (attributes table).
I have a query like this at first:
SELECT i.entity_id, i.price * COALESCE(r.rate, 1) AS final_price
FROM items i
JOIN attributes a ON a.entity_id = i.entity_id
LEFT JOIN rates r ON i.currency = r.currency_from AND r.currency_to = 'EUR'
WHERE a.cid = 4 AND ( (a.aid >= 10 AND a.value > 2000) OR (a.aid <= 10 AND a.value > 5) )
HAVING final_price BETWEEN 0 AND 9000
ORDER BY final_price DESC
LIMIT 20
but it's quite slow on big tables. The WHERE conditions can be bigger (up to 30 parameters) and sometimes use CAST(a.value AS SIGNED) with BETWEEN (for range values).
For example:
SELECT
i.entity_id,
i.price * COALESCE(r.rate, 1) AS final_price
FROM
attributes a
JOIN items i
ON a.entity_id = i.entity_id
LEFT JOIN rates r
ON i.currency = r.currency_from
AND r.currency_to = 'EUR'
WHERE
a.cid = 4 AND (
(a.aid = 10 AND CAST(a.value AS SIGNED) BETWEEN 2000 AND 2014)
OR (a.aid = 121 AND CAST(a.value AS SIGNED) BETWEEN 40 AND 60)
OR (a.aid = 45 AND CAST(a.value AS SIGNED) BETWEEN 770 AND 1500)
OR (a.aid = 95 AND CAST(a.value AS SIGNED) BETWEEN 12770 AND 15500)
OR (a.aid = 98 AND a.value = 'some value')
OR (a.aid = 199 AND a.value = 'some another value')
OR (a.aid = 102 AND a.value = 1)
OR (a.aid = 112 AND a.value = 42) )
GROUP BY
i.entity_id
HAVING
COUNT(i.entity_id) = 7
AND final_price BETWEEN 0 AND 9000
ORDER BY
final_price DESC
LIMIT 20
I group by entity and use HAVING COUNT() equal to 7 (the number of attributes to search), because I need to find items that have all of these attributes.
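The "match all attributes" idea (OR the per-attribute conditions together, group by entity, and keep groups whose row count equals the number of conditions) can be sketched on a tiny data set. This is only an illustration: it uses Python's sqlite3 module instead of MySQL, CAST(... AS INTEGER) in place of CAST(... AS SIGNED), made-up rows, and it assumes at most one row per (entity_id, aid).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attributes (entity_id INTEGER, cid INTEGER, aid INTEGER, value TEXT);
-- made-up rows for illustration
INSERT INTO attributes VALUES
  (1, 4, 10, '2005'), (1, 4, 121, '50'),   -- matches both conditions
  (2, 4, 10, '2005'),                      -- has only one of the attributes
  (3, 4, 10, '1990'), (3, 4, 121, '50');   -- aid 10 is out of range
""")

wanted = 2  # number of attribute conditions that must all match

matching = [row[0] for row in conn.execute("""
    SELECT a.entity_id
    FROM attributes a
    WHERE a.cid = 4 AND (
          (a.aid = 10  AND CAST(a.value AS INTEGER) BETWEEN 2000 AND 2014)
       OR (a.aid = 121 AND CAST(a.value AS INTEGER) BETWEEN 40 AND 60))
    GROUP BY a.entity_id
    HAVING COUNT(*) = ?
    ORDER BY a.entity_id
""", (wanted,))]
print(matching)
```

Each entity contributes one row per satisfied condition, so only entities that satisfy every condition reach the required count.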
EXPLAIN for the base query (the first one):
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE a ALL entity_id,value NULL NULL NULL 379999 Using where; Using temporary; Using filesort
1 SIMPLE i eq_ref PRIMARY PRIMARY 4 testowa.a.entity_id 1 Using where
1 SIMPLE r ALL NULL NULL NULL NULL 2
I have read many topics comparing UNION vs JOIN vs IN(), and the second option gives the best results, but it's still too slow.
Is there any way to get better performance here? Why is it so slow?
Should I think about moving some logic (splitting this query into 3 smaller ones) to backend (php/ror) code?
I would restructure your query slightly and have the attributes table first
and then joined to the items. Also, I would have a covering index on the
items table via (entity_id, price) and an index on your attributes table
ON (cid, aid, value, entity_id), and your rates table index
ON (currency_from, currency_to, rate). This way, all are covering indexes
and the engine won't need to go to the raw data pages to get the data, it can
pull it from the indexes it is already using for the joining / criteria.
SELECT
i.entity_id,
i.price * COALESCE(r.rate, 1) AS final_price
FROM
attributes a
JOIN items i
ON a.entity_id = i.entity_id
LEFT JOIN rates r
ON i.currency = r.currency_from
AND r.currency_to = 'EUR'
WHERE
a.cid = 4 AND ( (a.aid >= 10 AND a.value > 2000) OR (a.aid <= 10 AND a.value > 5) )
HAVING
final_price BETWEEN 0 AND 9000
ORDER BY
final_price DESC
LIMIT 20
So, although this would help the query you have provided, could you show some other where you would have many more criteria conditions... you mentioned it could be as many (or more) than 30. Looking at more might alter the query slightly.
As for your updated query with multiple criteria, I would then add an IN() clause for all the "aid" values after the "a.cid = 4". This way, before it has to hit all the "OR" conditions, if it fails on the "aid" not being one you consider, it never has to hit those... such as
a.cid = 4
AND a.aid in ( 10, 121, 45, 95, 98, 199, 102 )
AND ( rest of the complex aid, casting and between criteria )
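The index suggestions above could be set up roughly like this. It is a sketch under clear assumptions: it uses Python's sqlite3 module, whose query planner differs from MySQL's, the index names are made up, and it only checks that a lookup on cid and aid can be served from the suggested attributes index. On MySQL you would compare EXPLAIN output before and after adding the indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attributes (entity_id INTEGER, cid INTEGER, aid INTEGER, value TEXT);
CREATE TABLE items (entity_id INTEGER PRIMARY KEY, price REAL, currency TEXT);
CREATE TABLE rates (currency_from TEXT, currency_to TEXT, rate REAL);

-- hypothetical index names; column order follows the answer's suggestion
CREATE INDEX idx_attr_cover  ON attributes (cid, aid, value, entity_id);
CREATE INDEX idx_items_cover ON items (entity_id, price);
CREATE INDEX idx_rates_cover ON rates (currency_from, currency_to, rate);
""")

# Ask the planner how it would run the attribute pre-filter.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT entity_id
    FROM attributes
    WHERE cid = 4 AND aid IN (10, 121, 45)
""").fetchall()

# The human-readable description is the last column of each plan row.
details = [row[-1] for row in plan]
for line in details:
    print(line)
```

Putting cid and aid first in the index lets the equality and IN filters narrow the search before any value comparison is evaluated.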