GROUP BY with aggregate and an INNER JOIN - mysql

I tried to narrow the problem down as much as possible, but it is still fairly involved. This is the query that doesn't work the way I want it to:
SELECT *, MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY id_edge
Here is what I get:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|2 | 23 | 2 | 23 | 35 | 1 | 9 |
|4 | 24 | 5 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
and this is what I would want:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|3 | 23 | 9 | 23 | 35 | 1 | 9 |
|5 | 24 | 9 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
But let me elaborate a bit...
I have a graph, let's say as such:
node1
|
node2
/ \
node3 node4
| |
node5 node6
Therefore I have a table I call tbl_edges like this:
| id | nodeA | nodeB |
------------------------
| 12 | 1 | 2 |
| 23 | 2 | 3 |
| 24 | 2 | 4 |
| 35 | 3 | 5 |
| 46 | 4 | 6 |
------------------------
Now each edge has "stop_overs" at a certain distance (to nodeA). Therefore I have a table tbl_stopover like this:
| id | edge | dist |
------------------------
| 1 | 12 | 5 |
| 2 | 23 | 2 |
| 3 | 23 | 9 |
| 4 | 24 | 5 |
| 5 | 24 | 9 |
| 6 | 35 | 5 |
| 7 | 46 | 5 |
------------------------
Why this query?
Let's assume I want to calculate the distance between stop_overs. Within one edge that is no problem; across edges it gets more difficult. But if two edges are connected and there is no other connection, I can still calculate the distance. Here is an example, assuming all edges have a length of 10:
edge23 has a stop_over (id=3) at dist=9, and edge35 has a stop_over (id=6) at dist=5. Therefore the distance between these two stop_overs is:
dist = (length - dist_id3) + dist_id6 = (10 - 9) + 5 = 6
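The arithmetic in this example can be written as a tiny helper. This is only a sketch: the function name and parameters are made up for illustration, and it assumes the first stop_over sits on the incoming edge and the second on the directly connected outgoing edge.

```python
def distance_across_edges(edge_length, dist_first, dist_second):
    """Distance between a stop_over on one edge and a stop_over on the next edge.

    dist_first and dist_second are measured from nodeA of their own edge,
    so the remainder of the first edge is edge_length - dist_first.
    """
    return (edge_length - dist_first) + dist_second


# stop_over id=3 (edge 23, dist=9) to stop_over id=6 (edge 35, dist=5)
example_distance = distance_across_edges(10, 9, 5)
```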
I am not sure if I made myself clear. If this is not understandable, feel free to ask questions and I will do my best to explain it better.

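MySQL allows you to do something silly (unless the ONLY_FULL_GROUP_BY SQL mode is enabled): select fields in an aggregate query that are neither part of the GROUP BY nor wrapped in an aggregate function like MAX. When you do this, MySQL is free to take the value from any row in the group, so you get indeterminate (random, as you said) results for those fields.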
In your query you are doing this twice - once in your inner query (id2 is not part of a GROUP BY or aggregate) and once in the outer.
Prepare for random results!
To fix it, try something like this:
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn,
MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id, edges2.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
The major changes are the explicit field list (note that I removed id_edge, since you are joining on id1 and already have that value) and the extra fields added to both the inner and outer GROUP BY clauses.
If this gives you more rows than you want then you may need to explain more about your desired result set. Something like this is the only way to ensure you get appropriate groupings.

Okay. This seems to be the answer to my question. I will do some further "investigation" though, because I'm not sure if this is reliable. If anybody has some thoughts on this, please leave a comment.
SELECT tbl.id, tbl.dist, tbl.id1, tbl.id2, MAX(dist) maxDist
FROM
(
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.dist, tbl_conn.id1
ORDER BY dist DESC) AS tbl
GROUP BY tbl.id1, tbl.id2
Thanks to JNK (my colleague at work) without whom I wouldn't have gotten this far.
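For comparison, here is a sketch of one way to get the row with the largest dist per edge without relying on the ordering inside a subquery: compute MAX(dist) per edge separately and join back on it. It is not the query from the posts above. It assumes the stop_over column is called id_edge (as in the queries), runs on SQLite through Python's sqlite3 only so it can be tried locally, and uses MIN(e2.id) just to make id2 deterministic when there is exactly one connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_edges (id INTEGER PRIMARY KEY, nodeA INTEGER, nodeB INTEGER);
CREATE TABLE tbl_stopover (id INTEGER PRIMARY KEY, id_edge INTEGER, dist INTEGER);
INSERT INTO tbl_edges VALUES (12,1,2),(23,2,3),(24,2,4),(35,3,5),(46,4,6);
INSERT INTO tbl_stopover VALUES
  (1,12,5),(2,23,2),(3,23,9),(4,24,5),(5,24,9),(6,35,5),(7,46,5);
""")

rows = conn.execute("""
SELECT s.id, s.id_edge, s.dist, c.id1, c.id2
FROM tbl_stopover s
-- edges that have exactly one following edge
JOIN (SELECT e1.id AS id1, MIN(e2.id) AS id2
      FROM tbl_edges e1
      JOIN tbl_edges e2 ON e1.nodeB = e2.nodeA
      GROUP BY e1.id
      HAVING COUNT(*) = 1) c ON s.id_edge = c.id1
-- largest dist per edge, joined back to find the matching stop_over row
JOIN (SELECT id_edge, MAX(dist) AS max_dist
      FROM tbl_stopover
      GROUP BY id_edge) m ON m.id_edge = s.id_edge AND m.max_dist = s.dist
ORDER BY s.id
""").fetchall()
```

With the sample data this returns stop_over 3 for edge 23 and stop_over 5 for edge 24, which matches the desired output in the question.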

Related

select rows where related record doesn't exist

I need to retrieve rows from a mysql database as follows: I have a contract table, a contract line item table, and another table called udac. I need all contracts which DO NOT have a line item record with criteria based on a relationship between contract line item and udac. If there is a better way to state this question, let me know.
Table Structures
----contract---------------------
| id | customer_id | entry_date |
---------------------------------
| 1 | 1234 | 2010-01-01 |
| 2 | 2345 | 2016-01-31 |
---------------------------------
---contractlineitem-----------
| id | contract_id | udac_id |
------------------------------
| 1 | 1 | 5 |
| 2 | 1 | 2 |
| 3 | 1 | 1 |
| 4 | 2 | 4 |
| 5 | 2 | 2 |
------------------------------
---udac----------
| id | udaccode |
-----------------
| 1 | SWBL/R |
| 2 | SWBL |
| 3 | ABL/R |
| 4 | ABL |
| 5 | XRS/F |
-----------------
Given the above data, contract 2 would show up but contract 1 would not, because it has contractlineitems that point to udacs that end in /F or /R.
Here's what I have so far, but it's not correct.
SELECT c.*
FROM contract c
JOIN contractlineitem cli
ON c.id = cli.contract_id
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
AND NOT EXISTS (
SELECT cli.id
FROM contractlineitem cli_i
JOIN udac u
ON cli_i.udac_id = u.id
WHERE u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'
AND cli_i.contract_id = cli.contract_id);

Tom's comment that your WHERE clause is wrong may be the problem you are chasing. Plus, using a correlated subquery may be problematic for performance if the optimizer can't figure out a better way to do it.
Here is the better way to do it using an OUTER JOIN:
SELECT c.*
FROM contract c
JOIN contractlineitem cli
ON c.id = cli.contract_id
LEFT OUTER JOIN udac u
ON ( u.id = cli.udac_id
AND ( u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R' ) )
WHERE c.entry_timestamp > '2016-01-01 00:00:00'
AND u.id IS NULL
Try that out and see if it does what you want. The query essentially does what you stated: It tries to join to udac where the code ends in '/F' or '/R', but then it only accepts the ones where it can't find a match (u.id IS NULL).
If the same row is incorrectly returned multiple times, add DISTINCT right after SELECT.
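To check the requirement against the sample data, here is a small sketch that can be run locally. It uses SQLite through Python's sqlite3 instead of MySQL, uses the entry_date column from the table listing, and leaves out the date filter so that only the udac rule decides the result; all of that is an assumption for illustration. It compares a NOT EXISTS correlated on the contract (with the OR conditions parenthesised) against a LEFT JOIN that is evaluated per line item. The per-line-item variant still returns contract 1, because line item 2 points to SWBL and has no matching udac row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contract (id INTEGER PRIMARY KEY, customer_id INTEGER, entry_date TEXT);
CREATE TABLE contractlineitem (id INTEGER PRIMARY KEY, contract_id INTEGER, udac_id INTEGER);
CREATE TABLE udac (id INTEGER PRIMARY KEY, udaccode TEXT);
INSERT INTO contract VALUES (1,1234,'2010-01-01'),(2,2345,'2016-01-31');
INSERT INTO contractlineitem VALUES (1,1,5),(2,1,2),(3,1,1),(4,2,4),(5,2,2);
INSERT INTO udac VALUES (1,'SWBL/R'),(2,'SWBL'),(3,'ABL/R'),(4,'ABL'),(5,'XRS/F');
""")

# Contracts for which no line item points to a udac ending in /F or /R.
not_exists_ids = [row[0] for row in conn.execute("""
SELECT c.id
FROM contract c
WHERE NOT EXISTS (
    SELECT 1
    FROM contractlineitem cli
    JOIN udac u ON u.id = cli.udac_id
    WHERE cli.contract_id = c.id
      AND (u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R'))
ORDER BY c.id
""")]

# Per-line-item LEFT JOIN: any single non-matching line item lets a contract through.
left_join_ids = [row[0] for row in conn.execute("""
SELECT DISTINCT c.id
FROM contract c
JOIN contractlineitem cli ON c.id = cli.contract_id
LEFT JOIN udac u
  ON u.id = cli.udac_id
 AND (u.udaccode LIKE '%/F' OR u.udaccode LIKE '%/R')
WHERE u.id IS NULL
ORDER BY c.id
""")]
```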

How to join tables with SQL query and take number of tied columns?

I have a BookTable in my database (with foreign key LibID):
| BookID | BookName | BookPrice | LibID |
-------------------------------------------
| 1 | Book_1 | 200 | 1 |
| 2 | Book_2 | 100 | 1 |
| 3 | Book_3 | 300 | 2 |
| 4 | Book_4 | 150 | 4 |
and also LibraryTable:
| LibID | LibName | LibLocation |
-----------------------------------
| 1 | Lib_1 | Loc_1 |
| 2 | Lib_2 | Loc_2 |
| 3 | Lib_3 | Loc_3 |
| 4 | Lib_4 | Loc_4 |
I need to write an SQL query that will return the info about each library and the number of books for that library:
| LibID | LibName | NumberOfBooks|
------------------------------------
| 1 | Lib_1 | 2 |
| 2 | Lib_2 | 1 |
| 3 | Lib_3 | 0 |
| 4 | Lib_4 | 1 |
It should be one SQL query, probably with nested queries or joins. I'm not sure what the query should look like:
SELECT L.LibID AS LibID, L.LibName AS LibName, COUNT(B) AS NumberOfBooks
FROM LibraryTable L, BookTable B
WHERE L.LibID = B.LibID
Will that work?

No, this query will not work. COUNT aggregates data, so you must explicitly tell the DBMS for which group of data you want the count. In your case this is the library (you want one result record per library).
COUNT's parameter is a column, not a table, so change this to * (i.e. count records) or a certain column (e.g. LibID).
The comma-separated join syntax you are using is valid, but outdated. Use explicit joins instead. In your case an outer join will also show libraries that have no books at all, if that is possible.
select l.libid, l.libname, count(b.libid) as numberofbooks
from librarytable l
left outer join booktable b on b.libid = l.libid
group by l.libid;
You could also do all this without a join at all and get the book count in a subquery instead. Then you wouldn't have to aggregate. That's way simpler and more readable in my opinion.
select
l.libid,
l.libname,
(select count(*) from booktable b where b.libid = l.libid) as numberofbooks
from librarytable l;
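As a quick check of the outer-join variant against the sample data, here is a sketch using SQLite through Python's sqlite3 (an assumption; the question does not name the DBMS). It counts b.BookID so that the NULL row produced for a library without books counts as 0, and it groups by both selected library columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LibraryTable (LibID INTEGER PRIMARY KEY, LibName TEXT, LibLocation TEXT);
CREATE TABLE BookTable (BookID INTEGER PRIMARY KEY, BookName TEXT, BookPrice INTEGER, LibID INTEGER);
INSERT INTO LibraryTable VALUES
  (1,'Lib_1','Loc_1'),(2,'Lib_2','Loc_2'),(3,'Lib_3','Loc_3'),(4,'Lib_4','Loc_4');
INSERT INTO BookTable VALUES
  (1,'Book_1',200,1),(2,'Book_2',100,1),(3,'Book_3',300,2),(4,'Book_4',150,4);
""")

library_counts = conn.execute("""
SELECT l.LibID, l.LibName, COUNT(b.BookID) AS NumberOfBooks
FROM LibraryTable l
LEFT OUTER JOIN BookTable b ON b.LibID = l.LibID
GROUP BY l.LibID, l.LibName
ORDER BY l.LibID
""").fetchall()
```

Using COUNT(*) here instead would report 1 for Lib_3, because the outer join still produces one (all-NULL) row for it.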

SELECT lt.LibID AS LibID, lt.LibName AS LibName, count(*) AS NumberOfBooks
FROM BookTable AS bt
LEFT JOIN LibraryTable AS lt ON bt.LibID = lt.LibID
GROUP BY bt.LibID

Getting the sum of several columns from two tables result is not correct

I am trying to get the sums of two columns from different tables. I have found some great posts on Stack Overflow and some of them helped, but I still can't solve this problem.
The query below returns incorrect totals for the two columns (the rate column and the materialprice column):
mysql> Tbl as_servicetickets;
+----------+----------+
|ticket_id | rate |
+----------+----------+
| 11 | 250.00 |
| 11 | 300.00 |
| 11 | 400.00 |
| 9 | 300.00 |
| 9 | 300.00 |
| 9 | 1500.00 |
| 9 | 250.00 |
+----------+----------+
total for ticket_id 9 is 2350.00
mysql> Tbl as_ticketmaterials;
+----------+---------------+
|ticket_id | materialprice |
+----------+---------------+
| 11 | 100 |
| 9 | 20 |
| 9 | 50 |
+----------+---------------+
total for ticket_id 9 is 70.00
Query:
SELECT SUM(`as_servicetickets`.`rate`) AS `sercnt`, SUM(`as_ticketmaterials`.`materialprice`) AS `sercnt`
FROM `as_servicetickets`, `as_ticketmaterials`
WHERE `as_servicetickets`.`ticket_id` = 9
AND `as_ticketmaterials`.`ticket_id` = 9
GROUP BY `as_servicetickets`.`ticket_id`, `as_ticketmaterials`.`ticket_id`
Result (this is not correct):
+---------+--------+
| sercnt | sercnt |
+---------+--------+
| 4700.00 | 280 |
+---------+--------+

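This is not the correct way to achieve the desired result. Joining the two tables pairs every service row for the ticket with every material row (4 x 2 = 8 rows), so each sum is multiplied by the other table's row count. Try this instead: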
SELECT (SELECT SUM(`as_servicetickets`.`rate`) AS `sercnt`
FROM `as_servicetickets`
WHERE `as_servicetickets`.`ticket_id` = 9),
(SELECT SUM(`as_ticketmaterials`.`materialprice`) AS `sercnt`
FROM `as_ticketmaterials`
WHERE `as_ticketmaterials`.`ticket_id` = 9);
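To see where 4700 and 280 come from, here is a sketch on the sample data using SQLite through Python's sqlite3 (an assumption made only so it can be run locally). It compares sums taken over the joined rows with sums taken by two independent scalar subqueries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE as_servicetickets (ticket_id INTEGER, rate REAL);
CREATE TABLE as_ticketmaterials (ticket_id INTEGER, materialprice INTEGER);
INSERT INTO as_servicetickets VALUES
  (11,250.00),(11,300.00),(11,400.00),(9,300.00),(9,300.00),(9,1500.00),(9,250.00);
INSERT INTO as_ticketmaterials VALUES (11,100),(9,20),(9,50);
""")

# Summing over the join: 4 service rows x 2 material rows = 8 rows for ticket 9,
# so the rate total is doubled and the material total is quadrupled.
joined_sums = conn.execute("""
SELECT SUM(s.rate), SUM(m.materialprice)
FROM as_servicetickets s
JOIN as_ticketmaterials m ON s.ticket_id = m.ticket_id
WHERE s.ticket_id = 9
""").fetchone()

# Summing each table on its own keeps the totals independent of each other.
separate_sums = conn.execute("""
SELECT (SELECT SUM(rate) FROM as_servicetickets WHERE ticket_id = 9),
       (SELECT SUM(materialprice) FROM as_ticketmaterials WHERE ticket_id = 9)
""").fetchone()
```

The joined version reproduces the wrong totals from the question, regardless of whether the join is written implicitly or with INNER JOIN.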

Try using an explicit join, as implicit joins are discouraged (your WHERE condition has an issue):
SELECT `as_servicetickets`.`ticket_id`, SUM(`as_servicetickets`.`rate`) AS `sercnt`, SUM(`as_ticketmaterials`.`materialprice`) AS `sercnt`
FROM `as_servicetickets` INNER JOIN `as_ticketmaterials`
ON `as_servicetickets`.`ticket_id` = `as_ticketmaterials`.`ticket_id`
WHERE `as_servicetickets`.`ticket_id` = 9
GROUP BY `as_servicetickets`.`ticket_id`

select sum(a.rate) as sercnt, sum(b.materialprice) as sercnt from
as_servicetickets a inner join as_ticketmaterials b on
a.ticket_id = b.ticket_id where a.ticket_id = 9

Select rows based on 2 columns

I am not the greatest at SQL and I am trying to achieve the following:
I have a table with columns like so:
id | cup_type | cup_id | name
I have a ton of records in the database which will have the same cup_id but different cup_types
I would really like to select records that have the same cup_id but different cup_types
id | cup_type | cup_id | name
1 | TypeOne | 12 | NameOne
2 | TypeTwo | 12 | NameTwo
3 | TypeOne | 13 | NameThree
4 | TypeTwo | 13 | NameFour
5 | TypeOne | 14 | NameFive
6 | TypeOne | 14 | NameSix
When I run the query, it should bring back the following:
id | cup_type | cup_id | name
1 | TypeOne | 12 | NameOne
2 | TypeTwo | 12 | NameTwo
3 | TypeOne | 13 | NameThree
4 | TypeTwo | 13 | NameFour
I hope I have explained this ok and let me know if more clarity is needed.

This query would do the trick
select * from
yourtable a
join (select cup_id, count(distinct cup_type) nbType
from yourTable
group by cup_id) b using(cup_id)
where b.nbType >= 2;
Get a result set from your table where you count the distinct cup_type.
Group that result set by cup_id.
Keep the cup_id so we can join on the same table, using that id.
Return only those where the count of distinct types was at least two.
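The steps above can be tried on the sample rows from the question. This is a sketch using SQLite through Python's sqlite3 (an assumption; yourtable is the placeholder name from the query).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (id INTEGER PRIMARY KEY, cup_type TEXT, cup_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (1, "TypeOne", 12, "NameOne"),
        (2, "TypeTwo", 12, "NameTwo"),
        (3, "TypeOne", 13, "NameThree"),
        (4, "TypeTwo", 13, "NameFour"),
        (5, "TypeOne", 14, "NameFive"),
        (6, "TypeOne", 14, "NameSix"),
    ],
)

# Keep only rows whose cup_id occurs with at least two distinct cup_types.
matching_ids = [row[0] for row in conn.execute("""
SELECT a.id
FROM yourtable a
JOIN (SELECT cup_id, COUNT(DISTINCT cup_type) AS nbType
      FROM yourtable
      GROUP BY cup_id) b USING (cup_id)
WHERE b.nbType >= 2
ORDER BY a.id
""")]
```

cup_id 14 is dropped because both of its rows have the same cup_type.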

Try something like this:
select a.id, b.id ....... from t1 as a, t2 as b where a.cup_id=b.cup_id and a.cup_type !=b.cup_type

Finding Max() values via foreign key

Consider this database structure:
Trucks
| ID | DRIVER |
---------------
| 1 | Tony |
| 2 | George |
| 3 | Mary |
---------------
Mileage
| TRUCK_ID | MILEAGE | OIL_CHANGE |
-----------------------------------
| 1 | 100000 | 105000 |
| 2 | 6020 | 10020 |
| 3 | 37798 | 41000 |
| 3 | 41233 | 47200 |
| 3 | 49000 | |
-----------------------------------
I want to end up with a result set containing the maximum miles and maximum oil_change for each driver.
_________________________________
| 1 | Tony | 100000 | 105000 |
| 2 | George| 6020 | 10020 |
| 3 | Mary | 49000 | 47200 |
|_______________________________|
This is what I have tried so far:
SELECT t.*, MAX(m.mileage) AS mileage, MAX(m.oil_change) AS oil_change
FROM trucks t
LEFT JOIN mileage m ON t.id = m.truck_id
GROUP BY t.id
But this doesn't seem to allow the MAX function to work properly. The result does not always contain the actual maximum value for mileage.

Got it! Your mileage column must be defined as a character type, not a numeric type! When that happens, values are compared alphabetically, not by numeric value.
You should convert your mileage and oil_change columns to a numeric type (I'd recommend INT based on the data sample provided).
Until you convert them, this will work:
SELECT t.*, MAX(cast(m.mileage as int)) AS mileage,
MAX(cast(m.oil_change as int)) AS oil_change
FROM trucks t
LEFT JOIN mileage m ON t.id = m.truck_id
GROUP BY t.id
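The effect is easy to reproduce. Here is a sketch using SQLite through Python's sqlite3 with two made-up mileage readings (9500 and 10200 are hypothetical values, not from the question) stored in a text column. Note that SQLite spells the cast target INTEGER; the exact CAST syntax differs between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# mileage is deliberately declared as TEXT to mimic a character column
conn.execute("CREATE TABLE mileage (truck_id INTEGER, mileage TEXT)")
conn.executemany(
    "INSERT INTO mileage VALUES (?, ?)",
    [(1, "9500"), (1, "10200")],  # hypothetical readings for one truck
)

text_max, numeric_max = conn.execute("""
SELECT MAX(mileage),                   -- compared as strings: '9' sorts after '1'
       MAX(CAST(mileage AS INTEGER))   -- compared as numbers
FROM mileage
WHERE truck_id = 1
""").fetchone()
```

The string comparison picks '9500' while the numeric comparison picks 10200, which is the kind of mismatch described in the question.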

The below queries should work for your question.
SELECT T.DRIVER,MIN(MILEAGE) AS MIN_MILEAGE,MIN(OIL_CHANGE) AS MIN_OIL_CHANGE
FROM TRUCKS T INNER JOIN MILEAGE M
ON T.ID = M.TRUCK_ID
GROUP BY T.DRIVER;
SELECT T.DRIVER,MAX(MILEAGE) AS MAX_MILEAGE,MAX(OIL_CHANGE) AS MAX_OIL_CHANGE
FROM TRUCKS T INNER JOIN MILEAGE M
ON T.ID = M.TRUCK_ID
GROUP BY T.DRIVER;
Regards
Venk
