Count rows in second table with LEFT JOIN - mysql

I have a query where I output some results
SELECT
t1.busName,
t1.busCity,
COUNT(t2.ofr_id) AS cntOffers
FROM t1
LEFT JOIN t2 ON (t2.ofr_busID = t1.busID)
The query above returns only one row; however, if I remove COUNT and run only the query below, I get multiple rows. What am I missing? And how can I fetch rows from the first table together with the count of associated rows in t2?
SELECT
t1.busName,
t1.busCity
FROM t1
LEFT JOIN t2 ON (t2.ofr_busID = t1.busID)

You need group by:
SELECT t1.busName, t1.busCity,
COUNT(t2.ofr_id) AS cntOffers
FROM t1 LEFT JOIN
t2
ON t2.ofr_busID = t1.busID
GROUP BY t1.busName, t1.busCity;
Most databases would return an error on your version of the query, because you have unaggregated and aggregated columns in the SELECT.
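As a rough illustration of the grouped count, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The sample rows are made up, and grouping by t1.busID in addition to the name and city is an assumption added here so that two businesses with the same name and city would not be merged.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (busID INTEGER PRIMARY KEY, busName TEXT, busCity TEXT);
    CREATE TABLE t2 (ofr_id INTEGER PRIMARY KEY, ofr_busID INTEGER);
    INSERT INTO t1 VALUES (1, 'Acme', 'Oslo'), (2, 'Bolt', 'Riga'), (3, 'Cask', 'Bern');
    INSERT INTO t2 VALUES (10, 1), (11, 1), (12, 2);
""")

# One output row per business; COUNT(t2.ofr_id) skips the NULLs that the
# LEFT JOIN produces for businesses without offers, so those get 0.
offer_counts = conn.execute("""
    SELECT t1.busName, t1.busCity, COUNT(t2.ofr_id) AS cntOffers
    FROM t1
    LEFT JOIN t2 ON t2.ofr_busID = t1.busID
    GROUP BY t1.busID, t1.busName, t1.busCity
    ORDER BY t1.busID
""").fetchall()

for row in offer_counts:
    print(row)
```

Counting t2.ofr_id rather than * is what keeps businesses with no offers at zero instead of one.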

It seems that the COUNT() in your first query turns the whole query into an aggregate: with no GROUP BY, all joined rows are treated as a single group, which is why you get only one row back. That does not mean the join itself produced only one row.
Check out this SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE TestData (a int, b int);
INSERT INTO TestData
(a, b)
VALUES
(1, 1),
(2, 2),
(3, 3);
Query:
SELECT a, count(b) from TestData
Results:
| a | count(b) |
|---|----------|
| 1 | 3 |
As Gordon Linoff suggested, you need an explicit GROUP BY to get one count per row of the first table instead of a single collapsed row.
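The collapse into one row can be reproduced with the same TestData rows. This is only a sketch using Python's sqlite3 module rather than MySQL 5.6, so it compares a plain aggregate with a grouped one and does not rely on how either engine picks the value of a non-aggregated column.

```python
import sqlite3

# SQLite stand-in for the MySQL fiddle; same three rows as above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TestData (a INT, b INT);
    INSERT INTO TestData (a, b) VALUES (1, 1), (2, 2), (3, 3);
""")

# No GROUP BY: the whole table is one group, so exactly one row comes back.
collapsed = conn.execute("SELECT COUNT(b) FROM TestData").fetchall()

# Explicit GROUP BY: one row per distinct value of a.
per_group = conn.execute(
    "SELECT a, COUNT(b) FROM TestData GROUP BY a ORDER BY a"
).fetchall()

print(collapsed)
print(per_group)
```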


Explaining MySQL query with multiple tables listed in FROM

Tables a and b are not directly related.
What do a and b have to do with the result of this query?
select * from a,b where b.id in (1,2,3)
Can you explain this SQL?
Since you haven't specified a relationship between a and b, this produces a cross product. It's equivalent to:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id IN (1, 2, 3)
It will combine every row in a with the three selected rows from b. If a has 100 rows, the result will be 300 rows.
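The 100 x 3 = 300 arithmetic can be checked with a small sketch. It uses Python's sqlite3 module instead of MySQL, and the column name val and the row counts are assumptions chosen to match the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (val INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER PRIMARY KEY)")

# 100 rows in a, 5 rows in b (ids 1..5); only ids 1, 2, 3 pass the filter.
conn.executemany("INSERT INTO a VALUES (?)", [(i,) for i in range(100)])
conn.executemany("INSERT INTO b VALUES (?)", [(i,) for i in range(1, 6)])

comma_rows = conn.execute(
    "SELECT * FROM a, b WHERE b.id IN (1, 2, 3)"
).fetchall()
cross_rows = conn.execute(
    "SELECT * FROM a CROSS JOIN b WHERE b.id IN (1, 2, 3)"
).fetchall()

print(len(comma_rows), len(cross_rows))
```

Both forms pair every row of a with each of the three selected rows of b, so they return the same set of rows.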
What you are using is a multitable SELECT.
Multitable SELECT (M-SELECT) is similar to the join operation. You
select values from different tables, use WHERE clause to limit the
rows returned and send the resulting single table back to the
originator of the query.
The difference with M-SELECT is that it would return multiple tables
as the result set. For more details: https://dev.mysql.com/worklog/task/?id=358
In other words, your query is:
SELECT *
FROM a
CROSS JOIN b
WHERE b.id in (1,2,3)

How to filter rows `(a,b)` and `(b,a)` to a single row `(a,b)` in SQL-result?

In SQL I join a table with itself and extract all rows for which the primary key matches and some other attribute doesn't match.
The result is that every pair appears twice in the result, once in each order. How do I filter these as described above?
SELECT t1.courseId, t1.teacherId, t2.teacherId
FROM Gives AS t1 INNER JOIN Gives AS t2 ON t1.courseId = t2.courseId AND t1.teacherName <> t2.teacherName
This gives the result:
dIntProg mch jat
dIntProg jat mch
dDbb ira sch
dDbb sch ira
Try this?
SELECT t1.courseId, t1.teacherId, t2.teacherId
FROM Gives AS t1 INNER JOIN Gives AS t2
ON t1.courseId = t2.courseId
AND t1.teacherName < t2.teacherName
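To see why switching from <> to < keeps exactly one row per pair, here is a minimal sketch using Python's sqlite3 module. It assumes a Gives table with only courseId and teacherId columns, and compares on teacherId instead of teacherName because those are the values shown in the question's output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Gives (courseId TEXT, teacherId TEXT);
    INSERT INTO Gives VALUES
        ('dIntProg', 'mch'), ('dIntProg', 'jat'),
        ('dDbb', 'ira'), ('dDbb', 'sch');
""")

# "<>" matches (mch, jat) and (jat, mch); "<" keeps only the ordering where
# the first teacher sorts before the second, so each pair appears once.
pairs = conn.execute("""
    SELECT t1.courseId, t1.teacherId, t2.teacherId
    FROM Gives AS t1
    INNER JOIN Gives AS t2
        ON t1.courseId = t2.courseId
       AND t1.teacherId < t2.teacherId
    ORDER BY t1.courseId
""").fetchall()

for row in pairs:
    print(row)
```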
Usually you would have to use GROUP BY to eliminate duplicates and use aggregate functions on all fields that are not part of the GROUP BY criteria.
For instance:
SELECT name, SUM(myCount)
FROM myTable
GROUP BY name

Inserting values from two tables into a new table

I ran this query:
Insert into transaction(matric,surname,other,level,bk_id,bk_title)
values(
(select matric,surname,others,level from member_master),
(select isbn,bk_title from book_master)
)
but I got this error:
column count doesn't match value count at row 1
Each parenthesized subquery inside VALUES counts as a single value, so your statement supplies two values for the six columns matric,surname,other,level,bk_id,bk_title. To copy rows from other tables, drop VALUES and use INSERT ... SELECT, with one selected column for each column named in the insert list. Try it like this:
Insert into transaction(matric,surname,other,level,bk_id,bk_title)
select m.matric,m.surname,m.others,m.level,b.isbn,b.bk_title
from member_master m inner join book_master b on m.id = b.id
This assumes that the two tables are linked by the id column.
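A minimal sketch of the INSERT ... SELECT form, run through Python's sqlite3 module rather than MySQL. The id columns, the column types, and the sample rows are all assumptions; the table name is quoted because transaction is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript('''
    CREATE TABLE member_master (id INTEGER, matric TEXT, surname TEXT,
                                others TEXT, level INTEGER);
    CREATE TABLE book_master (id INTEGER, isbn TEXT, bk_title TEXT);
    CREATE TABLE "transaction" (matric TEXT, surname TEXT, other TEXT,
                                level INTEGER, bk_id TEXT, bk_title TEXT);

    INSERT INTO member_master VALUES
        (1, 'M001', 'Okoro', 'Ada', 200), (2, 'M002', 'Bello', 'Tunde', 300);
    INSERT INTO book_master VALUES
        (1, '111', 'SQL Basics'), (2, '222', 'Joins');

    -- No VALUES keyword: the SELECT supplies six columns per row,
    -- matching the six columns named in the insert list.
    INSERT INTO "transaction" (matric, surname, other, level, bk_id, bk_title)
    SELECT m.matric, m.surname, m.others, m.level, b.isbn, b.bk_title
    FROM member_master m
    INNER JOIN book_master b ON m.id = b.id;
''')

inserted = conn.execute('SELECT * FROM "transaction" ORDER BY matric').fetchall()
for row in inserted:
    print(row)
```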

left join returning more than expected

Using the following query
select *
from table1
left join table2 on table1.name = table2.name
table1 returns 16 rows and table2 returns 35 rows.
I was expecting the above query to return 16 rows because of the left join, but it is returning 35 rows. right join also returns 35 rows
Why is this happening and how do I get it to return 16 rows?
LEFT JOIN can return multiple copies of a table1 row if that row is matched by multiple rows in table2.
If you want it to return only 16 rows, one for each table1 row, each paired with an arbitrary matching table2 row, you can use a plain GROUP BY:
select *
from table1
left join table2 on table1.name = table2.name
group by table1.name
GROUP BY aggregates rows based on a field, so this will collapse all the table1 duplicates into one row. Generally, you specify aggregate functions to say how the rows should collapse (for example, a numeric column could be collapsed with SUM() so the one row holds the total). If you just want any one row, though, you can leave the aggregate functions out, and MySQL will choose one row for you. Note that this is specific to MySQL, and only works when the ONLY_FULL_GROUP_BY SQL mode is disabled; that mode is enabled by default since MySQL 5.7.5, and most other databases always require aggregates when you group. The row it chooses is not technically "random", but it is not necessarily predictable either. I guess by "random" you really just mean "any row will do".
Let's assume you have the following tables:
tbl1:
|Name |
-------
|Name1|
|Name2|
tbl2:
|Name |Value |
--------------
|Name1|Value1|
|Name1|Value2|
|Name3|Value1|
For your LEFT JOIN you'll get:
|tbl1.Name|tbl2.Name|Value |
----------------------------
|Name1 | Name1 |Value1|
|Name1 | Name1 |Value2|
|Name2 | NULL | NULL |
So, LEFT JOIN means that all records from LEFT (first) table will be returned regardless of their presence in right table.
For your question, you need to select specific fields instead of using "*" and add GROUP BY tbl1.Name, so your query will look like:
select tbl1.Name, SOME_AGGREGATE_FUNCTION(tbl2.specific_field), ...
from table1 tbl1
left join table2 tbl2 on tbl1.Name = tbl2.Name
GROUP BY tbl1.Name
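The tbl1/tbl2 example above can be run as a quick check. This sketch uses Python's sqlite3 module in place of MySQL, and COUNT and MIN are only stand-ins for whichever aggregate functions suit the real columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (Name TEXT);
    CREATE TABLE tbl2 (Name TEXT, Value TEXT);
    INSERT INTO tbl1 VALUES ('Name1'), ('Name2');
    INSERT INTO tbl2 VALUES ('Name1', 'Value1'), ('Name1', 'Value2'),
                            ('Name3', 'Value1');
""")

# Plain LEFT JOIN: Name1 is repeated once per matching tbl2 row.
joined = conn.execute("""
    SELECT tbl1.Name, tbl2.Name, tbl2.Value
    FROM tbl1
    LEFT JOIN tbl2 ON tbl1.Name = tbl2.Name
    ORDER BY tbl1.Name, tbl2.Value
""").fetchall()

# GROUP BY with aggregates: exactly one row per tbl1 row.
grouped = conn.execute("""
    SELECT tbl1.Name, COUNT(tbl2.Value), MIN(tbl2.Value)
    FROM tbl1
    LEFT JOIN tbl2 ON tbl1.Name = tbl2.Name
    GROUP BY tbl1.Name
    ORDER BY tbl1.Name
""").fetchall()

print(joined)
print(grouped)
```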
One option is SELECT DISTINCT, which removes rows that are identical in every selected column:
select distinct tbl1.*, tbl2.*
from table1 tbl1
left join table2 tbl2 on tbl2.name = tbl1.name
where
....................
Please note that I am also using table aliases.
If the name column is not unique in the tables then you may simply have duplicates on table2.
Try running:
select * from table2 where name not in (select name from table1);
If you get no results back, then duplicates in the name column are the reason for the extra rows coming back.
Duplication may be the reason. See the example in this post:
https://alexpetralia.com/posts/2017/7/19/more-dangerous-subtleties-of-joins-in-sql
If you want to join only the single latest or earliest related row from the right table, you can restrict the joined data with MIN/MAX on the primary key and reduce it to one row per name with GROUP BY, like this:
SELECT * FROM table1
LEFT JOIN (SELECT max(tbl2_primary_col), {table2.etc} FROM table2 GROUP BY name) AS tbl2
ON table1.name = tbl2.name
WHERE {condition_for_table1}
And remember not to use * for the left join, because it will disable min/max and always return the first row.
As per your comment "A random row from table2, as long as name from table1 matches name from table2", you can use the following:
select table1.name, (select top 1 somecolumn from table2 where table2.name = table1.name)
from table1
Note that TOP 1 is SQL Server syntax, not MySQL. In MySQL, put LIMIT 1 at the end of the subquery instead.
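For engines that use LIMIT instead of TOP, the same correlated-subquery idea might look like the sketch below. It runs on SQLite through Python's sqlite3 module; the sample rows are hypothetical, and the ORDER BY inside the subquery is an addition here that makes the chosen row deterministic rather than arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table2 (name TEXT, somecolumn TEXT);
    INSERT INTO table1 VALUES ('x'), ('y');
    INSERT INTO table2 VALUES ('x', 'p'), ('x', 'q');
""")

# The scalar subquery returns at most one value per table1 row,
# so the outer query can never produce more rows than table1 has.
one_per_row = conn.execute("""
    SELECT table1.name,
           (SELECT somecolumn
            FROM table2
            WHERE table2.name = table1.name
            ORDER BY somecolumn
            LIMIT 1) AS somecolumn
    FROM table1
    ORDER BY table1.name
""").fetchall()

print(one_per_row)
```

Rows of table1 without a match still appear, with NULL (None in Python) in the second column.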

MySQL Subquery Optimization

The Query:
SELECT a FROM table1 WHERE b IN (SELECT b FROM table1 WHERE c = 2);
If the subquery returns zero rows, MySQL takes a long time to notice before it finally returns an empty result (2.5 s when the subquery is empty, 0.0005 s when the subquery returns rows).
My question: is there a way to modify the query such that it will still return an empty result but take the same time as it did when there was a result?
I tried:
SELECT a FROM table1 WHERE b IN ((SELECT b FROM table1 WHERE c = 2), 555);
...but it only works WHEN the subquery is empty. If there is a result, the query fails.
-- I don't want to change the query format from nested to join/etc.
Ideas..? Thanks!
-- EDIT: Also, I forgot to add: The subquery will likely result in a decent-sized list of results, not just one result. --- ALSO, when I type '555', I am referring to a value that will never exist in the column.
-- EDIT 2: I also tried the following query and it "works" but it still takes several orders of magnitude longer than the original query when it has results:
SELECT a FROM table1 WHERE b IN (SELECT 555 AS b UNION SELECT b FROM table1 WHERE c = 2);
Wild guess (I can't test it right now):
SELECT a FROM table1 WHERE
EXISTS (SELECT b FROM table1 WHERE c = 2)
AND b IN (SELECT b FROM table1 WHERE c = 2);
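The sketch below only checks that the EXISTS-guarded query returns the same rows as the original in both the matching and the empty case. It runs on SQLite through Python's sqlite3 module with made-up data, so it says nothing about whether MySQL's optimizer would actually execute the guarded form faster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER);
    INSERT INTO table1 VALUES (1, 10, 2), (2, 10, 5), (3, 20, 5);
""")

PLAIN = """
    SELECT a FROM table1
    WHERE b IN (SELECT b FROM table1 WHERE c = ?)
    ORDER BY a
"""
GUARDED = """
    SELECT a FROM table1
    WHERE EXISTS (SELECT b FROM table1 WHERE c = ?)
      AND b IN (SELECT b FROM table1 WHERE c = ?)
    ORDER BY a
"""

def compare(c_value):
    """Run both query forms for one value of c and return both results."""
    plain = conn.execute(PLAIN, (c_value,)).fetchall()
    guarded = conn.execute(GUARDED, (c_value, c_value)).fetchall()
    return plain, guarded

with_match = compare(2)      # subquery returns b = 10
without_match = compare(9)   # subquery returns no rows

print(with_match)
print(without_match)
```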