MySQL: join grouped data to table - only first row joined

I have a problem joining tables with the following content:
Table RingOrderItem:
+----+-------------+--------+
| ID | ID_RingType | Amount |
+----+-------------+--------+
| 1 | A | 100 |
| 2 | B | 50 |
| 3 | A | 500 |
| 4 | C | 100 |
+----+-------------+--------+
Grouped table Rings - result of SELECT min(Rings.Number) AS Number, ID_RingType FROM Rings GROUP BY ID_RingType statement:
+--------+-------------+
| Number | ID_RingType |
+--------+-------------+
| 1 | A |
| 1 | B |
+--------+-------------+
I want to retrieve all records from RingOrderItem and join number from grouped table Rings to them, for which I used this query:
SELECT
roi.ID,
roi.ID_RingOrder,
roi.ID_RingType,
roi.Amount,
min(r.Number) AS `FromValue`,
min(r.Number) + roi.Amount - 1 AS `ToValue`
FROM
RingOrderItem AS roi
LEFT JOIN
(SELECT min(Rings.Number) AS Number, ID_RingType FROM Rings
GROUP BY ID_RingType)
AS r ON r.ID_RingType = roi.ID_RingType;
For some reason, I get only the first row from RingOrderItem table:
+----+--------------+-------------+--------+-----------+---------+
| ID | ID_RingOrder | ID_RingType | Amount | FromValue | ToValue |
+----+--------------+-------------+--------+-----------+---------+
| 1 | 1 | A | 100 | 1 | 100 |
+----+--------------+-------------+--------+-----------+---------+
I want all rows, and if the data cannot be joined (value C in ID_RingType), then simply return NULL.
Thanks,
Zbynek

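The two min() calls in the outer query are the cause: an aggregate function without a GROUP BY turns the whole result into a single group, so the query returns only one row. You don't need them, since the subquery already returns the minimum per ring type.
Also, be careful when doing arithmetic on a column that might be NULL: NULL + roi.Amount - 1 is NULL. The COALESCE below makes ToValue fall back to Amount - 1 for unmatched rows; if you want NULL there instead, use r.Number + roi.Amount - 1.
Try this: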
SELECT
roi.ID,
roi.ID_RingOrder,
roi.ID_RingType,
roi.Amount,
r.Number AS FromValue,
COALESCE(r.Number, 0) + roi.Amount - 1 AS ToValue
FROM
RingOrderItem AS roi
LEFT JOIN
(
SELECT
MIN(Rings.Number) AS Number,
ID_RingType
FROM
Rings
GROUP BY
ID_RingType
) AS r ON roi.ID_RingType = r.ID_RingType;
I also listed the left table first in the ON clause. That is only a readability choice and does not change the result.
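The difference can be checked with a minimal sketch. It makes several assumptions: SQLite (through Python's sqlite3 module) stands in for MySQL, the Rings rows and the ID_RingOrder values are invented because the question does not list them, and ToValue is computed without COALESCE so unmatched rows stay NULL as the question asks.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; Rings rows and ID_RingOrder values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE RingOrderItem (ID INTEGER, ID_RingOrder INTEGER, ID_RingType TEXT, Amount INTEGER);
INSERT INTO RingOrderItem VALUES (1, 1, 'A', 100), (2, 1, 'B', 50), (3, 1, 'A', 500), (4, 1, 'C', 100);
CREATE TABLE Rings (Number INTEGER, ID_RingType TEXT);
INSERT INTO Rings VALUES (1, 'A'), (2, 'A'), (1, 'B'), (7, 'B');
""")

# The grouped subquery from the question: lowest Number per ring type.
grouped = "(SELECT MIN(Number) AS Number, ID_RingType FROM Rings GROUP BY ID_RingType)"

# Aggregate in the outer SELECT with no GROUP BY: the whole result is one group.
collapsed = conn.execute(f"""
    SELECT roi.ID, MIN(r.Number) AS FromValue
    FROM RingOrderItem AS roi
    LEFT JOIN {grouped} AS r ON roi.ID_RingType = r.ID_RingType
""").fetchall()

# No outer aggregate: one output row per RingOrderItem row, NULL where nothing joins.
all_rows = conn.execute(f"""
    SELECT roi.ID, roi.ID_RingType, roi.Amount,
           r.Number AS FromValue,
           r.Number + roi.Amount - 1 AS ToValue
    FROM RingOrderItem AS roi
    LEFT JOIN {grouped} AS r ON roi.ID_RingType = r.ID_RingType
    ORDER BY roi.ID
""").fetchall()
```

Here collapsed holds a single row, while all_rows holds four rows, with None (NULL) in both computed columns for ring type C.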

Related

How to select the sum() of a group of rows and the sum() of another group

I have created a SQL Fiddle demo with sample data and the desired result here: http://sqlfiddle.com/#!9/dfe73a/7
sample data
-- table company
+--------+---------+
| id | name |
+--------+---------+
| 1 | foo |
| 2 | bar |
+--------+---------+
-- table sales
+--------+---------------+-----------------+
| id | company_id | total_amount |
+--------+---------------+-----------------+
| 1 | 1 | 300.0 |
| 2 | 1 | 300.0 |
| 2 | 1 | 100.0 |
+--------+---------------+-----------------+
-- table moves
+--------+---------------+-----------------+
| id | company_id | balance_move |
+--------+---------------+-----------------+
| 1 | 1 | 700.0 |
| 2 | 1 | -300.0 |
| 2 | 1 | -300.0 |
+--------+---------------+-----------------+
I need to select every company along with the sum of its sales total amounts and the sum of its balance moves.
desired result
+----+----------------------+---------------------+
| id | total_amount_sum | balance_move_sum |
+----+----------------------+---------------------+
| 1 | 700 | 100 |
+----+----------------------+---------------------+
| 2 | (null) | (null) |
+----+----------------------+---------------------+
I tried this SQL query
SELECT
company.id,
sum(total_amount) total_amount_sum,
sum(balance_move) balance_move_sum
FROM company
LEFT JOIN sales ON company.id = sales.company_id
LEFT JOIN moves ON company.id = moves.company_id
GROUP BY company.id
But the sum() functions also add up the duplicated rows produced by the joins, which results in 2100 (700*3) for the total amount and 300 (100*3) for the balance moves.
bad SQL statement result
+----+----------------------+---------------------+
| id | total_amount_sum | balance_move_sum |
+----+----------------------+---------------------+
| 1 | 2100 | 300 |
+----+----------------------+---------------------+
| 2 | (null) | (null) |
+----+----------------------+---------------------+
Is it possible to achieve the result I want?

You're repeating rows by doing your joins.
Company: 1 row per company
After Sales join: 3 rows per company (1x3)
After Moves join: 9 rows per company (3x3)
You end up triplicating your SUM because of this.
One way to fix is to use derived tables like this, which calculate the SUM first, then join the resulting rows 1-to-1.
SELECT
company.id,
total_amount_sum,
balance_move_sum
FROM company
LEFT JOIN (SELECT SUM(total_amount) total_amount_sum, company_id
FROM sales
GROUP BY company_id
) sales ON company.id = sales.company_id
LEFT JOIN (SELECT SUM(balance_move) balance_move_sum, company_id
FROM moves
GROUP BY company_id
) moves ON company.id = moves.company_id

Using sub-queries to calculate the two sums separately will work.
SELECT
company.id,
(Select sum(total_amount) from sales where sales.company_id = company.id) total_amount_sum,
(Select sum(balance_move) from moves where moves.company_id = company.id) balance_move_sum
FROM company
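Both fixes can be checked against the sample data from the question. This is a minimal sketch that assumes SQLite (via Python's sqlite3 module) as a stand-in for MySQL; the short aliases s and m are chosen here for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; data copied from the question's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER, name TEXT);
INSERT INTO company VALUES (1, 'foo'), (2, 'bar');
CREATE TABLE sales (id INTEGER, company_id INTEGER, total_amount REAL);
INSERT INTO sales VALUES (1, 1, 300.0), (2, 1, 300.0), (2, 1, 100.0);
CREATE TABLE moves (id INTEGER, company_id INTEGER, balance_move REAL);
INSERT INTO moves VALUES (1, 1, 700.0), (2, 1, -300.0), (2, 1, -300.0);
""")

# Joining both child tables first multiplies the rows (3 x 3) before SUM runs.
naive = conn.execute("""
    SELECT company.id, SUM(total_amount), SUM(balance_move)
    FROM company
    LEFT JOIN sales ON company.id = sales.company_id
    LEFT JOIN moves ON company.id = moves.company_id
    GROUP BY company.id ORDER BY company.id
""").fetchall()

# Derived tables: aggregate each child table first, then join 1-to-1.
derived = conn.execute("""
    SELECT company.id, total_amount_sum, balance_move_sum
    FROM company
    LEFT JOIN (SELECT SUM(total_amount) AS total_amount_sum, company_id
               FROM sales GROUP BY company_id) s ON company.id = s.company_id
    LEFT JOIN (SELECT SUM(balance_move) AS balance_move_sum, company_id
               FROM moves GROUP BY company_id) m ON company.id = m.company_id
    ORDER BY company.id
""").fetchall()

# Correlated subqueries: one independent SUM per company and per child table.
correlated = conn.execute("""
    SELECT company.id,
           (SELECT SUM(total_amount) FROM sales WHERE sales.company_id = company.id),
           (SELECT SUM(balance_move) FROM moves WHERE moves.company_id = company.id)
    FROM company ORDER BY company.id
""").fetchall()
```

The naive query reproduces the inflated 2100 and 300, while derived and correlated both give 700 and 100 for company 1 and NULL for company 2.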

Identifying the pairs of ID's in a column with the highest number of matches in SQL

I am trying to find the pairs of businesses with the highest number of common customers using MySQL.
The table is like the following:
+------------+------------+
| BusinessID | CustomerID |
+------------+------------+
| A | 1 |
| A | 2 |
| A | 3 |
| B | 4 |
| B | 1 |
| B | 3 |
| B | 2 |
| C | 3 |
| C | 4 |
| C | 5 |
+------------+------------+
And I want the output to be the pairs of businesses and the number of common customers, like this:
+-------------+-------------+------------------------+
| BusinessID | BusinessID | Common Customers Count |
+-------------+-------------+------------------------+
| A | B | 3 |
| A | C | 1 |
| B | C | 2 |
+-------------+-------------+------------------------+
This is the query I wrote:
SELECT a.BusinessID,b.BusinessID,COUNT(*) AS ncom
FROM (SELECT BusinessID, CustomerID FROM MYTABLE) AS a JOIN
(SELECT BusinessID,CustomerID FROM MYTABLE) AS b
ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
The problem is that my dataset has about 5m rows, and this seems to be too inefficient on large datasets. I tested the query on smaller datasets by limiting the data -- it took 8 seconds to process 10k rows and 30 seconds for 20k rows, so this query wouldn't be feasible to run for 5m rows. How else can I write the query to make it faster?

Don't use subqueries just to select the columns from the table; that is probably preventing MySQL from using indexes.
SELECT a.BusinessID, b.BusinessID, COUNT(*) as ncom
FROM MYTABLE AS a
JOIN MYTABLE AS b ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
Also, give the table the following index:
CREATE INDEX ix_cust_bus ON MYTABLE (CustomerID, BusinessID);
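As a quick sanity check of the self-join, here is a minimal sketch on the sample rows from the question. It assumes SQLite (via Python's sqlite3 module) in place of MySQL, so it says nothing about MySQL's execution plan or timing, and it sorts with ORDER BY ncom DESC to list the largest overlap first, which is a small change from the query above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (BusinessID TEXT, CustomerID INTEGER)")
conn.executemany(
    "INSERT INTO MYTABLE VALUES (?, ?)",
    [("A", 1), ("A", 2), ("A", 3),
     ("B", 4), ("B", 1), ("B", 3), ("B", 2),
     ("C", 3), ("C", 4), ("C", 5)],
)

# Composite index: the join matches on CustomerID and compares BusinessID.
conn.execute("CREATE INDEX ix_cust_bus ON MYTABLE (CustomerID, BusinessID)")

# Self-join directly on the table; a.BusinessID < b.BusinessID keeps each pair once.
pairs = conn.execute("""
    SELECT a.BusinessID, b.BusinessID, COUNT(*) AS ncom
    FROM MYTABLE AS a
    JOIN MYTABLE AS b
      ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
    GROUP BY a.BusinessID, b.BusinessID
    ORDER BY ncom DESC
""").fetchall()
```

On this data pairs comes back as (A, B, 3), (B, C, 2), (A, C, 1), matching the desired output in the question.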

using HAVING to filter results based on a reference row

I have the following table:
+---------+--------------+----------+
| item_id | location_id | price |
+---------+--------------+----------+
| 1 | 1 | 100 |
| 1 | 1 | 250 |
| 1 | 2 | 50 |
| 2 | 1 | 250 |
| 2 | 1 | 1000 |
| 3 | 1 | 1000 |
| 3 | 2 | 100 |
+---------+--------------+----------+
I can reduce this down to the minimum values using this query
SELECT
item_id, location_id, MIN(price) AS Price
from
table
GROUP BY item_id , location_id
This gets me
+---------+--------------+----------+
| item_id | location_id | price |
+---------+--------------+----------+
| 1 | 1 | 100 |
| 1 | 2 | 50 |
| 2 | 1 | 250 |
| 3 | 1 | 1000 |
| 3 | 2 | 100 |
+---------+--------------+----------+
I want to reduce this further. I am using the rows with a location_id of 1 as reference rows. For each row that has an item_id matching the reference row's item_id but a different location_id, I want to compare that row's price with the reference row's price. If the price is lower than the reference row's price, I want to filter that row out.
My final result should include the reference row for each item id and any rows that met the criteria of the price being lower than the reference row price.
I have a hunch that I can use the HAVING clause to do this but I am having trouble compiling the statement. How should I construct the HAVING statement?
Thanks in advance

No, HAVING can't help here. HAVING filters groups on an aggregate value, such as the result of min().
For example:
select id,min(price) from table where date = '2016-3-18' group by id having min(price) = 50
This returns only the groups whose min(price) is 50.
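To see HAVING filter whole groups, here is a minimal sketch on the question's rows. The table name prices is hypothetical (the question only calls it "table"), SQLite via Python's sqlite3 module stands in for MySQL, and the filter value 100 is picked only for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `prices` is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (item_id INTEGER, location_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 1, 250), (1, 2, 50),
     (2, 1, 250), (2, 1, 1000),
     (3, 1, 1000), (3, 2, 100)],
)

# HAVING runs after grouping, so it can test the aggregate; WHERE cannot.
groups_at_100 = conn.execute("""
    SELECT item_id, location_id, MIN(price) AS price
    FROM prices
    GROUP BY item_id, location_id
    HAVING MIN(price) = 100
    ORDER BY item_id, location_id
""").fetchall()
```

Only the (item_id, location_id) groups whose minimum price is exactly 100 survive, which here are (1, 1) and (3, 2).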
Back to your case: there are several ways to do it. Both queries below keep the lowest-priced row(s) for each location_id.
1. left join
select a.item_id,a.location_id,a.price
from table a
left join table b
on a.location_id = b.location_id and a.price > b.price
where b.price is null
2. exists
select a.item_id,a.location_id,a.price
from table a
where exists(
select 1 from
(select location_id,min(price)as price from table group by location_id)b
where a.location_id = b.location_id and a.price = b.price
)
Normally I'd recommend using EXISTS.
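The two forms can be compared side by side with a minimal sketch. It assumes SQLite (via Python's sqlite3 module) in place of MySQL and a hypothetical table name prices, since "table" in the queries above is only a placeholder.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; `prices` is a made-up table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (item_id INTEGER, location_id INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 1, 250), (1, 2, 50),
     (2, 1, 250), (2, 1, 1000),
     (3, 1, 1000), (3, 2, 100)],
)

# Anti-join: keep a row only if no cheaper row exists at the same location.
via_left_join = conn.execute("""
    SELECT a.item_id, a.location_id, a.price
    FROM prices a
    LEFT JOIN prices b
      ON a.location_id = b.location_id AND a.price > b.price
    WHERE b.price IS NULL
    ORDER BY a.location_id, a.item_id
""").fetchall()

# EXISTS: keep a row only if its price equals the minimum for its location.
via_exists = conn.execute("""
    SELECT a.item_id, a.location_id, a.price
    FROM prices a
    WHERE EXISTS (
        SELECT 1
        FROM (SELECT location_id, MIN(price) AS price
              FROM prices GROUP BY location_id) b
        WHERE a.location_id = b.location_id AND a.price = b.price
    )
    ORDER BY a.location_id, a.item_id
""").fetchall()
```

Both return the same two rows on this data: the cheapest row at location 1 and the cheapest row at location 2. To compare against a reference row per item instead, the join condition would need to match on item_id as well.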

MySQL select unique rows in two columns with the highest value in one column

I have a basic table:
+-----+--------+------+------+
| id | name | cat | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 2 | jamie | 2 | 100 |
| 3 | jamie | 1 | 50 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
I tried using the "Left Joining with self, tweaking join conditions and filters" part of this answer: SQL Select only rows with Max Value on a Column. For some reason it breaks when there are records with a value of 0, and it also doesn't return every unique row.
When doing the query on this table I'd like to receive the following values:
+-----+--------+------+------+
| id | name | cat | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
Because they are unique on name and cat and have the highest time value.
The query I adapted from the answer above is:
SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
SELECT name, cat, id, MAX(time) as time
FROM data
WHERE extra_column = 1
GROUP BY name, cat
) b ON a.id = b.id AND a.time = b.time

The issue is that id is unique per row, so the subquery cannot reliably return the id that belongs to the maximum time; you have to join on the grouped columns instead.
SELECT a.name, a.cat, a.id, a.time
FROM data a
INNER JOIN (
    SELECT name, cat, MAX(time) AS time
    FROM data
    WHERE extra_column = 1
    GROUP BY name, cat
) b ON a.cat = b.cat AND a.name = b.name AND a.time = b.time
Think about which id MySQL returns from the inline view. For jamie it could be 1 or 3 in cat 1, and 2 or 4 in cat 2. How would the engine know to pick the id of the row with the max time? The server is "free to choose any value from each group, so unless they are the same, the values chosen are indeterminate." It could pick the wrong one, which produces incorrect results, so you can't join on it. The aliases are also written in one consistent case here, because table aliases are case-sensitive on some platforms.
https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html

If you want to use a self join, you could use this query:
SELECT
d1.*
FROM
data d1 LEFT JOIN data d2
ON d1.name=d2.name
AND d1.cat=d2.cat
AND d1.time<d2.time
WHERE
d2.time IS NULL
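Both approaches can be tried on the sample table with a minimal sketch. It assumes SQLite (via Python's sqlite3 module) in place of MySQL and leaves out the extra_column filter, since that column is not part of the sample data.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, name TEXT, cat INTEGER, time INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [(1, "jamie", 1, 100), (2, "jamie", 2, 100), (3, "jamie", 1, 50),
     (4, "jamie", 2, 150), (5, "bob", 1, 100), (6, "tim", 1, 300),
     (7, "alice", 4, 100)],
)

# Join back on the grouped columns plus the max value, never on id.
via_group = conn.execute("""
    SELECT a.id, a.name, a.cat, a.time
    FROM data a
    INNER JOIN (SELECT name, cat, MAX(time) AS time
                FROM data GROUP BY name, cat) b
      ON a.name = b.name AND a.cat = b.cat AND a.time = b.time
    ORDER BY a.id
""").fetchall()

# Self anti-join: keep a row when no row in the same group has a larger time.
via_self_join = conn.execute("""
    SELECT d1.*
    FROM data d1
    LEFT JOIN data d2
      ON d1.name = d2.name AND d1.cat = d2.cat AND d1.time < d2.time
    WHERE d2.time IS NULL
    ORDER BY d1.id
""").fetchall()
```

Both queries return the rows with ids 1, 4, 5, 6 and 7. Note that either form returns every tied row if two rows in a group share the same maximum time.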

If you only need the highest time per name and cat, and not the id, a plain GROUP BY is enough:
SELECT name, cat, MAX(time) AS time FROM data GROUP BY name, cat

MySQL/MariaDB GROUP BY, ORDER BY returns same result twice

Assume I have the following table
+----+--------+--------+
| id | result | person |
+----+--------+--------+
| 1 | 1 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 2 |
| 4 | 4 | 3 |
| 5 | 4 | 1 |
| 6 | 1 | 2 |
+----+--------+--------+
Now I want to get the best result for each person, ordered high to low, where best result means the highest value of the result column, so basically I want to GROUP BY person and ORDER BY result. Also, if a person has the same result more than once, I only want to return one of those results. So the result I want is this:
+----+--------+--------+
| id | result | person |
+----+--------+--------+
| 4 | 4 | 3 |
| 5 | 4 | 1 |
| 2 | 2 | 2 |
+----+--------+--------+
The following query almost gets me there:
SELECT id, groupbytest.result, groupbytest.person
FROM groupbytest
JOIN (
SELECT MAX(result) as res, person
FROM groupbytest
GROUP BY person
) AS tmp
ON groupbytest.result = tmp.res
AND groupbytest.person = tmp.person
ORDER BY groupbytest.result DESC;
but returns two rows for the same person, if this person has made the same best result twice, so what I get back is
+----+--------+--------+
| id | result | person |
+----+--------+--------+
| 4 | 4 | 3 |
| 5 | 4 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 2 |
+----+--------+--------+
If two results for the same person are identical, only the one with the lowest id should be returned, so instead of returning the rows with ids 2 and 3, only the row with id 2 should be returned.
Any ideas how to implement this?

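Try this. The inner join finds each person's best result, and MIN(id) then keeps the lowest id among tied rows, as the question asks:
SELECT ttable.* FROM ttable
INNER JOIN
(
    SELECT MIN(ttable.id) AS minid FROM `ttable`
    INNER JOIN (SELECT MAX(`result`) AS res, `person` FROM `ttable` GROUP BY person) t
    ON ttable.result = t.res
    AND ttable.person = t.person
    GROUP BY ttable.person
) tt
ON ttable.id = tt.minid
ORDER BY ttable.result DESC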
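A minimal sketch of this two-step approach on the question's data follows. It makes a few assumptions: SQLite (via Python's sqlite3 module) stands in for MySQL/MariaDB, the table is named groupbytest as in the question, MIN(id) is used because the question asks for the lowest id among ties, and id is added as a secondary sort key so the output order is deterministic.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL/MariaDB; rows copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE groupbytest (id INTEGER, result INTEGER, person INTEGER)")
conn.executemany(
    "INSERT INTO groupbytest VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 2, 2), (3, 2, 2), (4, 4, 3), (5, 4, 1), (6, 1, 2)],
)

best = conn.execute("""
    SELECT g.id, g.result, g.person
    FROM groupbytest g
    JOIN (
        -- Step 2: among rows holding a person's best result, keep the lowest id.
        SELECT MIN(g2.id) AS minid
        FROM groupbytest g2
        JOIN (
            -- Step 1: best result per person.
            SELECT MAX(result) AS res, person
            FROM groupbytest
            GROUP BY person
        ) t ON g2.result = t.res AND g2.person = t.person
        GROUP BY g2.person
    ) tt ON g.id = tt.minid
    ORDER BY g.result DESC, g.id ASC
""").fetchall()
```

On the sample data best contains the rows with ids 4, 5 and 2, in that order, which is the result the question asks for.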

Check whether tmp on its own gives the correct result; I think it groups correctly. The join adds rows because the matching rows have different id values.
Rows with different ids are treated as different rows, no matter whether the other columns are equal. You have no duplicate results as long as there is no duplicate id. Try removing id from the SELECT and selecting DISTINCT rows. Then you should get the result you wanted, but without the id.
Example: Imagine rooms with your ids from above. Let result be the number of tables in the room and person the number of people. Just because rooms 2 and 3 happen to have the same number of tables and people, it doesn't mean they are the same room.
