MySQL Subquery Optimization - mysql

The Query:
SELECT a FROM table1 WHERE b IN (SELECT b FROM table1 WHERE c = 2);
If the subquery returns zero results, MySQL takes a long time to notice before it finally returns an empty result (2.5s when the subquery is empty, 0.0005s when the subquery returns rows).
My question: is there a way to modify the query such that it will still return an empty result but take the same time as it did when there was a result?
I tried:
SELECT a FROM table1 WHERE b IN ((SELECT b FROM table1 WHERE c = 2), 555);
...but it only works WHEN the subquery is empty. If there is a result, the query fails.
-- I don't want to change the query format from nested to join/etc.
Ideas..? Thanks!
-- EDIT: Also, I forgot to add: The subquery will likely result in a decent-sized list of results, not just one result. --- ALSO, when I type '555', I am referring to a value that will never exist in the column.
-- EDIT 2: I also tried the following query and it "works" but it still takes several orders of magnitude longer than the original query when it has results:
SELECT a FROM table1 WHERE b IN (SELECT 555 AS b UNION SELECT b FROM table1 WHERE c = 2);

Wild guess (I can't test it right now):
SELECT a FROM table1 WHERE
EXISTS (SELECT b FROM table1 WHERE c = 2)
AND b IN (SELECT b FROM table1 WHERE c = 2);
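To check that the EXISTS guard above does not change the result, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL. The sample rows are made up, and the sketch only compares results; it says nothing about how MySQL's optimizer will time either form.

```python
import sqlite3

# SQLite stands in for MySQL here; this compares results only, not timing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO table1 (a, b, c) VALUES (?, ?, ?)",
    [(1, 10, 2), (2, 10, 1), (3, 20, 1), (4, 30, 2)],  # made-up sample rows
)

ORIGINAL = (
    "SELECT a FROM table1 "
    "WHERE b IN (SELECT b FROM table1 WHERE c = ?) ORDER BY a"
)
GUARDED = (
    "SELECT a FROM table1 "
    "WHERE EXISTS (SELECT b FROM table1 WHERE c = ?) "
    "AND b IN (SELECT b FROM table1 WHERE c = ?) ORDER BY a"
)

def run(sql, *params):
    """Run a query and return the values of column a as a list."""
    return [row[0] for row in conn.execute(sql, params)]

# c = 2 exists in the sample data; c = 99 makes the subquery empty.
with_rows = (run(ORIGINAL, 2), run(GUARDED, 2, 2))
without_rows = (run(ORIGINAL, 99), run(GUARDED, 99, 99))
print(with_rows, without_rows)
```

Whether the guard actually helps in MySQL would have to be measured on the real table, for example with EXPLAIN.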

Related

Explaining MySQL query with multiple tables listed in FROM

Tables a and b are not directly related.
What does listing a, b in FROM have to do with the results?
select * from a,b where b.id in (1,2,3)
Can you explain this SQL?
Since you haven't specified a relationship between a and b, this produces a cross product. It's equivalent to:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id IN (1, 2, 3)
It will combine every row in a with the three selected rows from b. If a has 100 rows, the result will be 300 rows.
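The row arithmetic can be checked with a small sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL and assumes, for illustration only, that a has 100 rows and that b.id is unique with ids 1, 2 and 3 all present.

```python
import sqlite3

# SQLite stands in for MySQL; table sizes are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO a (id) VALUES (?)", [(i,) for i in range(1, 101)])
conn.executemany("INSERT INTO b (id) VALUES (?)", [(i,) for i in range(1, 11)])

# Comma syntax and explicit CROSS JOIN should produce the same row count.
comma_count = conn.execute(
    "SELECT COUNT(*) FROM a, b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]
cross_count = conn.execute(
    "SELECT COUNT(*) FROM a CROSS JOIN b WHERE b.id IN (1, 2, 3)"
).fetchone()[0]
print(comma_count, cross_count)
```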
What you are using is a Multitable SELECT.
Multitable SELECT (M-SELECT) is similar to the join operation. You
select values from different tables, use WHERE clause to limit the
rows returned and send the resulting single table back to the
originator of the query.
The difference with M-SELECT is that it would return multiple tables
as the result set. For more details: https://dev.mysql.com/worklog/task/?id=358
In other words, your query is:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id in (1,2,3)

Compare one table with another in mysql and display matched record

I have two MySQL tables.
Table1:opened_datatable
Table2:unidata
Table1 has only one column:Emails
Table2 has 45 columns, three of them are:Email_Office, Email_Personal1, Email_Personal2
I want to fetch full rows from Table2 (unidata) if the Emails column of Table1 matches either Email_Office, Email_Personal1 or Email_Personal2. I am getting a little bit confused here. I tried it this way:
select a.emails
from opened_datatable a
where a.Emails in (select *
from unidata b
where b.email_office=a.emails
or b.Email_Personal1=a.emails
or b.Email_Personal2=a.Emails
)
It's showing only one row of the first table, while I want to show the matched rows of Table2 (unidata). I think I need to mention Table2 first and then match it against Table1 (opened_datatable). But how can I do that?
Try This:
SELECT a.emails, b.*
FROM opened_datatable a
INNER JOIN unidata b ON a.emails IN (b.email_office, b.Email_Personal1, b.Email_Personal2)
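A minimal sketch of how this join behaves, using Python's built-in sqlite3 as a stand-in for MySQL. The id column and the sample addresses are hypothetical, and unidata is cut down to four columns instead of 45.

```python
import sqlite3

# SQLite stands in for MySQL; the id column and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE opened_datatable (Emails TEXT)")
conn.execute(
    "CREATE TABLE unidata (id INTEGER, Email_Office TEXT, "
    "Email_Personal1 TEXT, Email_Personal2 TEXT)"
)
conn.executemany(
    "INSERT INTO opened_datatable VALUES (?)",
    [("ann@example.com",), ("bob@example.com",), ("nobody@example.com",)],
)
conn.executemany(
    "INSERT INTO unidata VALUES (?, ?, ?, ?)",
    [
        (1, "ann@example.com", None, None),
        (2, "x@example.com", "y@example.com", "bob@example.com"),
        (3, "z@example.com", None, None),
    ],
)

# Each opened email is paired with every unidata row that lists it
# in any of the three email columns.
matched = conn.execute(
    "SELECT a.Emails, b.* FROM opened_datatable a "
    "INNER JOIN unidata b "
    "ON a.Emails IN (b.Email_Office, b.Email_Personal1, b.Email_Personal2) "
    "ORDER BY b.id"
).fetchall()
for row in matched:
    print(row)
```

Note that SQLite compares text case-sensitively by default, whereas MySQL's behavior depends on the column collation, so mixed-case addresses may match differently on a real server.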
Your current query should return an error, because the subquery selects every column (select *) while IN expects a single column.
Try a correlated subquery using EXISTS, quite similar to your approach:
select a.emails
from opened_datatable a
where EXISTS
( select *
from unidata b
where b.email_office=a.emails
or b.Email_Personal1=a.emails
or b.Email_Personal2=a.Emails
)
You will probably not get good performance due to the OR-ed conditions.
Edit:
If performance is too bad, you might try a UNION approach:
select a.emails
from opened_datatable a
where a.emails IN
      ( select email_office
        from unidata b
        UNION
        select Email_Personal1
        from unidata b
        UNION
        select b.Email_Personal2
        from unidata b
      )

Count rows in second table with LEFT JOIN

I have a query where I output some results
SELECT
t1.busName,
t1.busCity,
COUNT(t2.ofr_id) AS cntOffers
FROM t1
LEFT JOIN t2 ON (t2.ofr_busID = t1.busID)
The query above returns only one row, however, if I remove COUNT and leave only below query I get multiple results. What am I missing? And how can I fetch results from the first table while getting associated results count from t2?
SELECT
t1.busName,
t1.busCity
FROM t1
LEFT JOIN t2 ON (t2.ofr_busID = t1.busID)
You need group by:
SELECT t1.busName, t1.busCity,
COUNT(t2.ofr_id) AS cntOffers
FROM t1 LEFT JOIN
t2
ON t2.ofr_busID = t1.busID
GROUP BY t1.busName, t1.busCity;
Most databases would return an error on your version of the query, because you have unaggregated and aggregated columns in the SELECT.
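Here is a small sketch of the grouped query, run through Python's built-in sqlite3 as a stand-in for MySQL with made-up rows. One change is my own assumption rather than part of the answer: t1.busID is added to the GROUP BY so that two businesses sharing a name and city would still be counted separately.

```python
import sqlite3

# SQLite stands in for MySQL; the sample businesses and offers are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t1 (busID INTEGER PRIMARY KEY, busName TEXT, busCity TEXT)"
)
conn.execute("CREATE TABLE t2 (ofr_id INTEGER PRIMARY KEY, ofr_busID INTEGER)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(1, "Acme", "Oslo"), (2, "Bolt", "Rome"), (3, "Cato", "Lyon")],
)
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(10, 1), (11, 1), (12, 2)])

# COUNT(t2.ofr_id) skips the NULLs produced by the LEFT JOIN, so a business
# without offers gets 0 rather than 1.
counts = conn.execute(
    "SELECT t1.busName, t1.busCity, COUNT(t2.ofr_id) AS cntOffers "
    "FROM t1 LEFT JOIN t2 ON t2.ofr_busID = t1.busID "
    "GROUP BY t1.busID, t1.busName, t1.busCity "  # busID added as an assumption
    "ORDER BY t1.busID"
).fetchall()
print(counts)
```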
It seems that the COUNT() in your first query, being an aggregate with no GROUP BY, makes MySQL treat the whole result as a single group. That explains why you get only one row; it does not imply that the join produced only one row.
Check out this SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE TestData (a int, b int);
INSERT INTO TestData
(a, b)
VALUES
(1, 1),
(2, 2),
(3, 3);
Query:
SELECT a, count(b) from TestData
Results:
| a | count(b) |
|---|----------|
| 1 | 3 |
As Gordon Linoff suggested, you need to use GROUP BY explicitly in order to replicate the same behavior without the COUNT.

Discrepancy obtaining values not in inner join by difference

I have table A and table B. I know table B has 7848 rows (count(*)) and I want to see which of those 7848 exist inside table A. As far as I know INNER JOIN returns the values that appear in BOTH tables A and B. So I inner joined them like this:
SELECT *
FROM
TABLE1 AS A
INNER JOIN
TABLE2 AS B
ON A.field1 = B.field1
This query returns 1902 rows. Now, I want to find out which rows of table B did NOT appear in that result, so I do this:
SELECT * FROM TABLE_B WHERE FIELD1 NOT IN (field1*1902....);
By difference I think I should be getting a result of 5946 rows, since I found 1902 positive rows. What is weird is that this NOT IN statement returns 6175 rows and if I add them I get 8077 which is more than count(*) told me table B had.
What can I possibly be doing wrong?
Thanks in advance.
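A join is a kind of multiplication. If you have multiple rows in table A with the same field1, then the matching rows in B are counted multiple times. Assuming there are no NULLs in field1, your numbers suggest that only 7848 - 6175 = 1673 rows of B actually have a match, so 229 of the 1902 joined rows are repeats.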
Perhaps you want
SELECT * FROM TABLE_B B
WHERE EXISTS (SELECT field1 from TABLE_A A WHERE A.field1 = B.field1);
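The multiplication effect can be reproduced on a toy data set. This sketch uses Python's built-in sqlite3 as a stand-in for MySQL; the rows are made up so that one value occurs twice in TABLE_A.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE_A (field1 INTEGER)")
conn.execute("CREATE TABLE TABLE_B (field1 INTEGER)")
# Value 1 appears twice in TABLE_A, so it matches its TABLE_B row twice.
conn.executemany("INSERT INTO TABLE_A VALUES (?)", [(1,), (1,), (2,)])
conn.executemany("INSERT INTO TABLE_B VALUES (?)", [(1,), (2,), (3,), (4,)])

def count(sql):
    """Return the single number produced by a COUNT(*) query."""
    return conn.execute(sql).fetchone()[0]

total_b = count("SELECT COUNT(*) FROM TABLE_B")
joined = count(
    "SELECT COUNT(*) FROM TABLE_A A INNER JOIN TABLE_B B ON A.field1 = B.field1"
)
matched = count(
    "SELECT COUNT(*) FROM TABLE_B B "
    "WHERE EXISTS (SELECT 1 FROM TABLE_A A WHERE A.field1 = B.field1)"
)
unmatched = count(
    "SELECT COUNT(*) FROM TABLE_B B "
    "WHERE NOT EXISTS (SELECT 1 FROM TABLE_A A WHERE A.field1 = B.field1)"
)
print(total_b, joined, matched, unmatched)
```

Here joined + unmatched overshoots the size of TABLE_B, while matched + unmatched adds up exactly, which is the same pattern as in the question.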
Try:
SELECT *
FROM
TABLE1 AS A
LEFT JOIN
TABLE2 AS B
ON A.field1 = B.field1
WHERE B.field1 IS NULL
The following query returns rows from table A that aren't in table B:
SELECT * FROM TABLE1 WHERE field1 NOT IN (SELECT field1 FROM TABLE2)
You can also get rid of the IN condition for better performance:
SELECT * FROM TABLE1 A WHERE NOT EXISTS (SELECT 1 FROM TABLE2 B WHERE B.field1 = A.field1)
You might have some duplicated values in Table1 that are also present in Table2. Your first query will return those records multiple times.
You also need to be careful if you have NULL values: INNER JOIN never matches a NULL field1, and NOT IN returns no rows at all if the subquery produces even one NULL.
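The NULL pitfall is easy to reproduce. A minimal sketch with Python's built-in sqlite3 as a stand-in for MySQL (made-up rows, with one NULL in TABLE2):

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TABLE1 (field1 INTEGER)")
conn.execute("CREATE TABLE TABLE2 (field1 INTEGER)")
conn.executemany("INSERT INTO TABLE1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO TABLE2 VALUES (?)", [(1,), (None,)])

# With a NULL in the subquery, "2 NOT IN (1, NULL)" evaluates to NULL,
# not TRUE, so no row passes the filter.
not_in = conn.execute(
    "SELECT field1 FROM TABLE1 "
    "WHERE field1 NOT IN (SELECT field1 FROM TABLE2) ORDER BY field1"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists = conn.execute(
    "SELECT field1 FROM TABLE1 A "
    "WHERE NOT EXISTS (SELECT 1 FROM TABLE2 B WHERE B.field1 = A.field1) "
    "ORDER BY field1"
).fetchall()
print(not_in, not_exists)
```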

How to express count(distinct) with subquery in MySQL?

A query returns a certain number. The query is:
select
count(distinct case when (A or B or C) and D then table_a.field1 else null end)
from table_a
left join table_b on table_b.x = table_a.y
group by table_a.y
;
where A, B, C and D are given conditions. Now, written in this form:
select
sum((select count(1) from table_b where table_b.x = table_a.y and ((A or B or C) and D) ))
from table_a
left join table_b on table_b.x = table_a.y
group by table_a.y
;
the result does not match the one we got with count(distinct).
What is the correct way of writing count(distinct) with a subquery?
It's not at all clear why you need a subquery. You still have the JOIN, so that subquery is potentially going to be "counting" the same rows multiple times.
If you want to get the number of distinct values for field1 in table_a which meet a set of criteria (on table_a), then you don't really need a subquery on table_b to get that. At least, I don't see any way that you can get that result using a subquery on table_b.
Here's an example that returns an equivalent result:
select (select sum(1) as mycount
          from ( select a.field1
                   from table_a a
                   left join table_b on table_b.x = a.y
                  where a.y = t.y
                    and ( (A or B or C) and D )
                    and a.field1 IS NOT NULL
                  group by a.field1
               ) s
       ) as mycount
  from table_a t
 group by t.y
That's really the only way I know to get something equivalent to a COUNT(DISTINCT expr). You've got to do a SELECT expr FROM ... WHERE expr IS NOT NULL GROUP BY expr, and then count the rows it returns. In this case, you could use either a COUNT(1) or a SUM(1).
(I'm not at all sure that answers the question you were asking, but it's my best shot at it.)
(We note that in your original query, you have a GROUP BY table_a.y, so that query can return multiple rows, each with its own count.)
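The equivalence between COUNT(DISTINCT expr) and counting the rows of a grouped subquery can be sketched in isolation. This uses Python's built-in sqlite3 as a stand-in for MySQL, a simplified table_a with made-up rows, and a plain filter on y in place of the conditions A, B, C and D.

```python
import sqlite3

# SQLite stands in for MySQL; table contents are made up, and "y = 1"
# replaces the unspecified conditions A, B, C and D.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (y INTEGER, field1 INTEGER)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(1, 5), (1, 5), (1, 6), (1, None), (2, 7)],
)

# COUNT(DISTINCT ...) ignores NULLs and counts each value once.
count_distinct = conn.execute(
    "SELECT COUNT(DISTINCT field1) FROM table_a WHERE y = 1"
).fetchone()[0]

# Same result: drop NULLs, collapse duplicates with GROUP BY, count the rows.
grouped_count = conn.execute(
    """
    SELECT COUNT(1)
    FROM (
        SELECT field1
        FROM table_a
        WHERE y = 1 AND field1 IS NOT NULL
        GROUP BY field1
    ) s
    """
).fetchone()[0]
print(count_distinct, grouped_count)
```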