A fellow programmer showed me a query he created which looked like this:
SELECT a.row, b.row, c.row
FROM
a LEFT JOIN
b ON (a.id = b.id) LEFT JOIN
c ON (c.otherid= b.otherid)
WHERE a.id NOT IN (SELECT DISTINCT b.id bb
INNER JOIN
c cc ON (bb.a_id = cc.a_id)
WHERE (bb.date BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'))
GROUP BY a.id ORDER BY c.otherid DESC;
So I shortened it by removing the second query and applying the WHERE clause directly:
SELECT a.row, b.row, c.row
FROM
a LEFT JOIN
b ON (a.id = b.id) LEFT JOIN
c ON (c.otherid= b.otherid)
WHERE b.date NOT BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'
GROUP BY a.id ORDER BY c.otherid DESC;
Until here, everything seems fine and both queries return the same result set. The problem is that the second query takes three times longer to execute than the first one. How is that possible?
Thanks
The queries are significantly different. (We're assuming that the missing FROM keyword in the subquery of the first version is an artifact of copying the query into the question, and that the original query doesn't have the same syntax error. Also, the reference to b.id in the SELECT list of the subquery is highly suspicious; we suspect that's really meant to be a reference to bb.id, but we're just guessing.)
If the two queries return exactly the same result set, that's a coincidence of the data. (We could demonstrate data sets where the results of the two queries would be different.)
"Shortening" a query does not necessarily optimize it.
What really matters (in terms of performance) is the execution plan. That is, what operations are being performed, in what order, and with large tables, which indexes are available and being used.
Without table and index definitions, it's not possible to give a definitive diagnosis.
Suggestion: Use MySQL EXPLAIN to view the execution plan of each query.
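As a rough illustration of what to look for in a plan, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for MySQL. SQLite's EXPLAIN QUERY PLAN output differs from MySQL's EXPLAIN, and the table layout and the index name idx_b_date are assumptions made up for the demo. It prints the plan for the date-range filter before and after an index on the date column exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified version of table b from the question.
conn.execute("CREATE TABLE b (id INTEGER, date TEXT)")

QUERY = """
    SELECT date FROM b
    WHERE date >= '2018-08-04 00:00:00' AND date < '2018-08-07 00:00:00'
"""

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_without_index = plan(QUERY)
conn.execute("CREATE INDEX idx_b_date ON b (date)")  # assumed index name
plan_with_index = plan(QUERY)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

In MySQL the equivalent step is to prefix each of the two queries with EXPLAIN and compare the access type, chosen key, and estimated rows for every table.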
Assuming that the original query has a WHERE clause of the form:
WHERE a.id NOT IN ( SELECT DISTINCT bb.id
FROM b bb
JOIN c cc
ON bb.a_id = cc.a_id
WHERE bb.date BETWEEN '2018-08-04 00:00:00'
AND '2018-08-06 23:59:59'
AND bb.id IS NOT NULL
)
(assuming that we have a guarantee that a value returned by the subquery will never be NULL...)
That could be re-written as a NOT EXISTS correlated subquery to achieve an equivalent result:
WHERE NOT EXISTS ( SELECT 1
FROM b bb
JOIN c cc
ON cc.a_id = bb.a_id
WHERE bb.date >= '2018-08-04 00:00:00'
AND bb.date < '2018-08-07 00:00:00'
AND bb.id = a.id
)
or it could be re-written as an anti-join
LEFT
JOIN b bb
ON bb.id = a.id
AND bb.date >= '2018-08-04 00:00:00'
AND bb.date < '2018-08-07 00:00:00'
LEFT
JOIN c cc
ON cc.a_id = bb.a_id
WHERE cc.a_id IS NULL
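To make the comparison concrete, the following sketch runs the three forms (NOT IN, NOT EXISTS, anti-join) against a tiny made-up data set. It uses Python's sqlite3 as a stand-in for MySQL, the table contents are invented, and the NOT IN form deliberately omits the IS NOT NULL guard to show why that guarantee matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, a_id INTEGER, date TEXT);
    CREATE TABLE c (a_id INTEGER);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES
        (1, 10, '2018-08-05 12:00:00'),
        (2, 20, '2018-08-05 12:00:00'),
        (3, 30, '2018-09-01 12:00:00');
    INSERT INTO c VALUES (10), (30);
""")

IN_RANGE = ("bb.date >= '2018-08-04 00:00:00' "
            "AND bb.date < '2018-08-07 00:00:00'")

QUERIES = {
    "not_in": f"""
        SELECT a.id FROM a
        WHERE a.id NOT IN (SELECT bb.id FROM b bb
                           JOIN c cc ON cc.a_id = bb.a_id
                           WHERE {IN_RANGE})
        ORDER BY a.id""",
    "not_exists": f"""
        SELECT a.id FROM a
        WHERE NOT EXISTS (SELECT 1 FROM b bb
                          JOIN c cc ON cc.a_id = bb.a_id
                          WHERE {IN_RANGE} AND bb.id = a.id)
        ORDER BY a.id""",
    "anti_join": f"""
        SELECT a.id FROM a
        LEFT JOIN b bb ON bb.id = a.id AND {IN_RANGE}
        LEFT JOIN c cc ON cc.a_id = bb.a_id
        WHERE cc.a_id IS NULL
        ORDER BY a.id""",
}

def run_all():
    return {name: [r[0] for r in conn.execute(sql)]
            for name, sql in QUERIES.items()}

results = run_all()

# Add a row whose id is NULL so the NOT IN subquery now returns a NULL.
conn.execute("INSERT INTO b VALUES (NULL, 10, '2018-08-05 12:00:00')")
results_with_null = run_all()

print(results)
print(results_with_null)
```

On this data all three forms agree until the subquery produces a NULL. After that, the unguarded NOT IN returns no rows at all, while NOT EXISTS and the anti-join are unaffected.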
With large sets, appropriate indexes would need to be available for optimal performance.
The re-write presented in the question is not guaranteed to return an equivalent result.
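One possible data set where the two queries from the question disagree is sketched below, using Python's sqlite3 as a stand-in for MySQL. The schema is a simplified assumption (table c is left out and only the id column is selected), and the rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER);
    CREATE TABLE b (id INTEGER, date TEXT);
    INSERT INTO a VALUES (1), (2), (3);
    INSERT INTO b VALUES
        (1, '2018-08-05 12:00:00'),  -- inside the date range
        (1, '2018-09-01 12:00:00'),  -- outside the date range
        (2, '2018-09-01 12:00:00');  -- outside the date range
""")

# Shape of the original query: exclude every a.id that has ANY b row in range.
subquery_version = [r[0] for r in conn.execute("""
    SELECT a.id FROM a
    WHERE a.id NOT IN (SELECT bb.id FROM b bb
                       WHERE bb.date BETWEEN '2018-08-04 00:00:00'
                                         AND '2018-08-06 23:59:59')
    ORDER BY a.id""")]

# Shape of the shortened query: keep every a.id that has SOME b row out of range.
shortened_version = [r[0] for r in conn.execute("""
    SELECT a.id FROM a
    LEFT JOIN b ON a.id = b.id
    WHERE b.date NOT BETWEEN '2018-08-04 00:00:00' AND '2018-08-06 23:59:59'
    GROUP BY a.id
    ORDER BY a.id""")]

print(subquery_version)   # ids with no in-range row in b
print(shortened_version)  # ids with at least one out-of-range row in b
```

Here id 1 has rows both inside and outside the range, so the subquery version drops it while the shortened version keeps it. Id 3 has no rows in b at all, so the shortened version drops it (the comparison against NULL is not true) while the subquery version keeps it.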
I have two queries. The first returns 4554 rows from the database and the second returns 3830. I need to fetch and list the 724 rows that make up the difference, i.e. those that exist in the first result but not in the second. I tried the EXCEPT operator, but I get this error in my console:
FOR, GROUP, HAVING, INTO expected got 'EXCEPT'
Any help is appreciated. Here are my queries.
select * from billing_trans B,members M where (sign>='2021-03-01 00:00:00' && sign<='2021-03-31 23:59:59') AND M.id=B.mid
EXCEPT
select * from billing_trans B,members M where (sign>='2021-03-01 00:00:00' && sign<='2021-03-31 23:59:59') AND M.id=B.mid
AND (trans LIKE 'BH%' OR bank IN ('SM', 'TO', 'II'));
You can use not exists. Presumably you intend:
select *
from dt_billing_trans B join
dt_members M
on M.id = B.mid
where signD >= '2021-03-01' and signD < '2021-04-01' and
not exists (select 1
from dt_billing_trans b2
where b2.mid = b.mid and
b2.signD >= '2021-03-01' and
b2.signD < '2021-04-01' and
(b2.transId LIKE 'WH%' OR b2.bank IN ('WT', 'MO', 'SL'))
);
This returns all transactions for members who do not have transactions that match the second set of conditions.
This is not exactly equivalent to your code (which does not require except in any database), but I suspect it is closer to what you intend.
Notes:
Use explicit JOIN syntax rather than commas in the FROM clause.
The date comparisons use a half-open range, so they do not miss fractions of a second before midnight, and the coding is much simpler.
The standard SQL boolean operator is AND, not &&.
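The point about fractions of a second can be sketched as follows. This uses Python's sqlite3 with timestamps stored as text, which is an assumption made for the demo; a MySQL DATETIME column with fractional seconds compares typed values rather than strings, but the boundary behaviour is the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (signD TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [
    ("2021-03-31 23:59:59",),
    ("2021-03-31 23:59:59.500",),  # half a second before midnight
    ("2021-04-01 00:00:00",),      # first instant of the next month
])

# Closed range ending at 23:59:59, as in the question.
closed_range = [r[0] for r in conn.execute("""
    SELECT signD FROM t
    WHERE signD >= '2021-03-01 00:00:00' AND signD <= '2021-03-31 23:59:59'
    ORDER BY signD""")]

# Half-open range ending just before the first instant of the next month.
half_open_range = [r[0] for r in conn.execute("""
    SELECT signD FROM t
    WHERE signD >= '2021-03-01' AND signD < '2021-04-01'
    ORDER BY signD""")]

print(closed_range)
print(half_open_range)
```

The closed range silently drops the 23:59:59.500 row, while the half-open range keeps it and still excludes midnight of April 1.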
You don't need 2 queries.
Use 1 WHERE clause:
SELECT *
FROM dt_billing_trans B INNER JOIN dt_members M
ON M.id = B.mid
WHERE (signD >= '2021-03-01 00:00:00' AND signD <= '2021-03-31 23:59:59')
AND NOT (transId LIKE 'WH%' OR bank IN ('WT', 'MO', 'SL'));
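The two answers are not interchangeable: the NOT EXISTS version filters per member, the single WHERE clause filters per row. A small sketch of the difference, using Python's sqlite3 as a stand-in for MySQL, with invented rows, and with the members join and date filter left out for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dt_billing_trans (mid INTEGER, transId TEXT, bank TEXT);
    INSERT INTO dt_billing_trans VALUES
        (1, 'WH001', 'AA'),  -- matches the exclusion condition
        (1, 'XX002', 'AA'),  -- same member, does not match
        (2, 'XX003', 'BB');  -- other member, does not match
""")

# Per-member filter: drop every row of a member that has ANY matching row.
per_member = conn.execute("""
    SELECT b.mid, b.transId FROM dt_billing_trans b
    WHERE NOT EXISTS (SELECT 1 FROM dt_billing_trans b2
                      WHERE b2.mid = b.mid
                        AND (b2.transId LIKE 'WH%'
                             OR b2.bank IN ('WT', 'MO', 'SL')))
    ORDER BY b.mid, b.transId""").fetchall()

# Per-row filter: drop only the rows that match themselves.
per_row = conn.execute("""
    SELECT mid, transId FROM dt_billing_trans
    WHERE NOT (transId LIKE 'WH%' OR bank IN ('WT', 'MO', 'SL'))
    ORDER BY mid, transId""").fetchall()

print(per_member)
print(per_row)
```

The per-row form is the one that reproduces the set difference the question asks for, provided transId and bank are never NULL. If either can be NULL, the NOT (...) condition evaluates to NULL for some rows and silently drops them.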
I have a query which uses BETWEEN for showing the records between two dates. My query needs to show records whose arrival_date and departure_date fall between specific dates, but it somehow shows all records.
Column types are DATE.
SELECT DISTINCT art.* FROM accommodation_room_types art
INNER JOIN accommodation_rooms ar ON art.id = ar.room_type
INNER JOIN accommodation a ON art.accommodation = a.id
WHERE a.id = 13 AND NOT EXISTS
(
SELECT 1 FROM booked_rooms br INNER JOIN booking b ON br.booking = b.id
WHERE br.room = ar.id
AND
(
b.arrival_date BETWEEN '2017-12-16' AND '2018-04-16'
)
OR
(
b.departure_date BETWEEN '2017-12-16' AND '2018-04-16'
)
)
Even if I write BETWEEN 'asd' AND 'asd', it still shows all records and doesn't give any format error.
Is my query wrong for showing records between two specific dates?
I don't know if your logic is right or wrong, but your syntax is not doing what you intend. I would suggest:
WHERE a.id = 13 AND
NOT EXISTS (SELECT 1
FROM booked_rooms br INNER JOIN
booking b
ON br.booking = b.id
WHERE br.room = ar.id AND
(b.arrival_date BETWEEN '2017-12-16' AND '2018-04-16' OR
b.departure_date BETWEEN '2017-12-16' AND '2018-04-16'
)
)
It strikes me that all that empty space in the query makes it hard to see that your logic was written as: A AND B OR C. Your intention (presumably) is A AND (B OR C).
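The precedence rule can be checked directly. In this minimal sketch (Python's sqlite3; AND binds tighter than OR in both SQLite and MySQL) A and B are false and C is true:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A = 0 (false), B = 0 (false), C = 1 (true)
(as_written,) = conn.execute("SELECT 0 AND 0 OR 1").fetchone()     # (A AND B) OR C
(as_intended,) = conn.execute("SELECT 0 AND (0 OR 1)").fetchone()  # A AND (B OR C)

print(as_written, as_intended)
```

With the original grouping, the departure_date condition alone is enough to make the subquery's WHERE clause true, even for bookings that belong to a different room.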
Substitute 1 for *
The way you wrote the query, it always returns 1, regardless of the conditions. Moreover, that is totally legit.
My table user contains these fields:
id, company_id, created_by, name, image
Table valet contains:
id, vid, dept_id
Table cart contains:
id, dept_id, map_id, purchase, time
To get the details, I have written this MySQL query:
SELECT c.id, a.id, c.purchace, c.time
FROM user a
LEFT JOIN valet b ON a.vid = b.id
AND a.is_deleted = 0
LEFT JOIN cart c ON b.dept_id = c.dept_id
WHERE a.company_id = 18
AND a.created_by = 102
AND a.is_deleted = 0
AND c.time
IN ( SELECT MAX( time ) FROM cart WHERE dept_id = b.dept_id )
From these three tables I want to select the last updated row from cart, along with the id from the user table that is mapped in the valet table.
This query works fine, but it takes almost 15 seconds to retrieve the details.
Is there any way to improve this query, or am I doing something wrong?
Any help would be appreciated.
For one thing, I can see that you’re running the subquery for each row. Depending on what the optimiser does, that may have an impact. Without a suitable index, MAX is a pretty expensive operation (there’s nothing for it but to read every candidate row).
If you plan to update and use this query repeatedly, perhaps you should at least index the table on cart.time. This will make it much easier to find the maximum value.
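MySQL has the concept of user variables, so you could set a variable to the result of the subquery, and that might help. Another approach is to compute the maximum time per dept_id once in a derived table and join to it: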
SELECT c.id, a.id, c.purchace, c.time
FROM
user a
LEFT JOIN valet b ON a.vid = b.id AND a.is_deleted = '0'
LEFT JOIN cart c ON b.dept_id = c.dept_id
LEFT JOIN (SELECT dept_id,max(time) as mx FROM cart GROUP BY dept_id) m on m.dept_id=c.dept_id
WHERE
a.company_id = '18'
AND a.created_by = '102'
AND a.is_deleted = '0'
AND c.time=m.mx;
Note also:
Since you’re only testing a single value (the maximum) for c.time, you should be using = rather than IN.
I’m not sure why you are using strings instead of integers. I should have thought that leaving off the quotes makes more sense.
Your JOIN includes AND a.is_deleted = '0', though you make no mention of it in your table description. In any case, why is it in the JOIN and not in the WHERE clause?
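The derived-table idea can be tried in isolation. The sketch below uses Python's sqlite3 as a stand-in for MySQL, a cut-down cart table with invented rows, and a composite index whose name and column choice are assumptions (not something from the question). It checks that the correlated subquery and the join against a per-department maximum pick the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cart (id INTEGER, dept_id INTEGER, time TEXT);
    -- Assumed index: lets the engine find MAX(time) per dept_id cheaply.
    CREATE INDEX idx_cart_dept_time ON cart (dept_id, time);
    INSERT INTO cart VALUES
        (1, 1, '2020-01-01 10:00:00'),
        (2, 1, '2020-02-01 10:00:00'),
        (3, 2, '2020-01-15 10:00:00');
""")

# Correlated form: the inner query is evaluated per outer row.
correlated = [r[0] for r in conn.execute("""
    SELECT c.id FROM cart c
    WHERE c.time = (SELECT MAX(c2.time) FROM cart c2
                    WHERE c2.dept_id = c.dept_id)
    ORDER BY c.id""")]

# Derived-table form: the maxima are computed once, then joined.
derived = [r[0] for r in conn.execute("""
    SELECT c.id FROM cart c
    JOIN (SELECT dept_id, MAX(time) AS mx
          FROM cart GROUP BY dept_id) m
      ON m.dept_id = c.dept_id AND c.time = m.mx
    ORDER BY c.id""")]

print(correlated, derived)
```

Whether the derived table is actually faster on the real data depends on the optimiser and the indexes, so it is worth comparing the EXPLAIN output of both forms.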
I'm always amused and confused (at the same time) whenever I'm asked to prepare and run a join query on the SQL console.
The confusion mainly comes down to whether or not the ordering of the join predicate has any bearing on the join results.
Example.
SELECT "zones"."name", "ip_addresses".*
FROM "ip_addresses"
INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id"
WHERE "ip_addresses"."resporg_accnt_id" = 1
AND "zones"."name" = 'us-central1'
LIMIT 1;
Given the SQL query above, the join predicate looks like this:
... INNER JOIN "zones" ON "zones"."id" = "ip_addresses"."zone_id" WHERE "ip_addresses"."resporg_accnt_id"
Now, would it make any difference to the performance of the join, or to the correctness of the result, if I changed the predicate to look like this?
... INNER JOIN "zones" ON "ip_addresses"."zone_id" = "zones"."id" WHERE "ip_addresses"."resporg_accnt_id"
The predicate order won't make a performance difference in your case, a simple equality condition, but personally I like to place the columns from the table I'm JOINing to on the LHS of each ON condition
SELECT ...
FROM ip_addresses ia
JOIN zones z
ON z.id = ia.zone_id
WHERE ...
The optimiser can use any index available on these columns during the JOIN and I find it easier to visualise this way.
Any additional conditions also tend to be on columns of the table being JOINed to and I find again this reads better when this table is consistently on the LHS
Not quite the same, but I did see a case where performance was affected by the choice of column to isolate
I think the JOIN looked something like
SELECT ...
FROM table_a a
JOIN table_b b
ON a.id = b.id - 1
Changing this to
SELECT ...
FROM table_a a
JOIN table_b b
ON b.id = a.id + 1
allowed the optimiser to use an index on b.id, but presumably at the cost of an index on a.id
I suspect this kind of query might need analysing on a case by case basis
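A sketch of that effect, using Python's sqlite3 rather than the database from the anecdote (so the plans are SQLite's, and the two empty tables with integer primary keys are assumptions): the column that stands alone on one side of the equality is the one whose index can be used for the lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY);
""")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# a.id is isolated, so only table_a can be probed by key.
plan_a_isolated = plan(
    "SELECT * FROM table_a JOIN table_b ON table_a.id = table_b.id - 1")

# b.id is isolated, so only table_b can be probed by key.
plan_b_isolated = plan(
    "SELECT * FROM table_a JOIN table_b ON table_b.id = table_a.id + 1")

print(plan_a_isolated)
print(plan_b_isolated)
```

Both predicates describe the same set of row pairs; only the table that gets a keyed lookup changes, which is why the better form depends on the table sizes and available indexes.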
Furthermore, I would probably switch your table order round too and write your original query:
SELECT z.name,
ia.*
FROM zones z
JOIN ip_addresses ia
ON ia.zone_id = z.id
AND ia.resporg_accnt_id = 1
WHERE z.name = 'us-central1'
LIMIT 1
Conceptually, you are saying "Start with the 'us-central1' zone and fetch me all the ip_addresses associated with a resporg_accnt_id of 1"
Check the EXPLAIN plans if you want to verify that there is no difference in your case
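A quick way to run that check is sketched below with Python's sqlite3 as a stand-in (the question appears to use a different database, so the plan text will differ there, and the rows are invented). It runs the join with the ON operands in both orders and compares results and plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zones (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ip_addresses (id INTEGER PRIMARY KEY,
                               zone_id INTEGER,
                               resporg_accnt_id INTEGER);
    INSERT INTO zones VALUES (1, 'us-central1'), (2, 'us-east1');
    INSERT INTO ip_addresses VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
""")

TEMPLATE = """
    SELECT zones.name, ip_addresses.id
    FROM ip_addresses
    INNER JOIN zones ON {predicate}
    WHERE ip_addresses.resporg_accnt_id = 1
      AND zones.name = 'us-central1'
    ORDER BY ip_addresses.id"""

zones_first = TEMPLATE.format(predicate="zones.id = ip_addresses.zone_id")
ips_first = TEMPLATE.format(predicate="ip_addresses.zone_id = zones.id")

rows_zones_first = conn.execute(zones_first).fetchall()
rows_ips_first = conn.execute(ips_first).fetchall()

# Print both plans so they can be compared by eye.
for sql in (zones_first, ips_first):
    print([row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)])
print(rows_zones_first, rows_ips_first)
```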
I have the following query which returns the number of appointments that a particular subject has had:
select s.last_name, count(c.length)
from data.appointments a, data.subjects s, data.clinics c, research.sublog t
where s.id = a.subject_id and c.id = a.clinic_id and
s.ssn = t.ssn and a.status = '1' and
a.appt_date between '2012-10-01' and '2013-09-30' and a.appt_time not like '%01'
group by t.id
I would like to have counts for multiple time periods in the same query (add different years or quarters). I believe I would need to use subqueries for this but am having trouble deciphering what conditions to put in each subquery and what needs to remain outside of the subqueries (I have little experience in this area). Is this correct or is there a different method that would be better to use to return such a result? Thanks in advance for any help you can offer!
First, you want proper join syntax. Second, the solution to your problem is conditional aggregation functions. Here is an example:
select s.last_name,
sum(a.appt_date between '2012-10-01' and '2013-09-30') as cnt_2012,
sum(a.appt_date between '2013-10-01' and '2014-09-30') as cnt_2013
from data.appointments a join
data.subjects s
on s.id = a.subject_id join
data.clinics c
on c.id = a.clinic_id join
research.sublog t
on s.ssn = t.ssn
where a.status = '1' and
a.appt_time not like '%01'
group by t.id;
I didn't make the change, but you should probably have group by s.last_name because you have last_name in the select clause. And, the filter on appt_time doesn't make sense to me. You shouldn't use like on a date/time field.
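Conditional aggregation can be sketched on a reduced schema as follows. The example uses Python's sqlite3 as a stand-in for MySQL; both treat a boolean expression as 0 or 1 inside SUM. The two tables and their rows are invented, and the clinics and sublog joins are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subjects (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE appointments (subject_id INTEGER, appt_date TEXT);
    INSERT INTO subjects VALUES (1, 'Adams'), (2, 'Baker');
    INSERT INTO appointments VALUES
        (1, '2012-11-01'),
        (1, '2013-01-15'),
        (1, '2013-12-01'),
        (2, '2014-02-01');
""")

# One pass over the data; each SUM counts the rows that fall in its period.
counts = conn.execute("""
    SELECT s.last_name,
           SUM(a.appt_date BETWEEN '2012-10-01' AND '2013-09-30') AS cnt_2012,
           SUM(a.appt_date BETWEEN '2013-10-01' AND '2014-09-30') AS cnt_2013
    FROM appointments a
    JOIN subjects s ON s.id = a.subject_id
    GROUP BY s.id, s.last_name
    ORDER BY s.last_name""").fetchall()

print(counts)
```

Adding another period (a quarter, another year) is just one more SUM(...) column; no subquery is needed. In databases that do not treat booleans as integers, SUM(CASE WHEN ... THEN 1 ELSE 0 END) is the portable spelling.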