Query to look up reference tables based on a summed result - MySQL

I am new to SQL and would like your suggestions on how to solve this problem.
I have the sales information by type.
I want to sum the prices of certain references by type and, based on the resulting sum, fetch a value from another table and populate it in the Output column.
Group | Type | 100000 | 200000 | 300000
---------------------------------------
1 | A | 1 | 2 | 3
1 | B | 0 | 1 | 1
2 | T | 2 | 2 | 4
2 | U | 0 | 2 | 2
3 | V | 2 | 2 | 3
4 | N | 1 | 1 | 1
From the above table (Table 2) we find that Types A and B belong to the same group, Group 1. So in the first table, the query should sum the prices of the references belonging to Group 1. If the sum is >100000 and <=200000, then the value in the 100000 column for that type must be chosen, and likewise for the higher thresholds.
In case the sum of prices for the group is less than 100000, or the type is not found in Table 2, it should take the values from the table below (Table 3):
+------+----+---+
| Type | 1  | 2 |
+------+----+---+
| A    | 50 | 2 |
| B    | 60 | 5 |
| C    | 65 | 2 |
| D    | 65 | 3 |
| E    | 65 | 4 |
+------+----+---+
Thus the final output for the above datasheet would be like below,
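Order ID | Reference | Type | Price | Output
--------------------------------------------
101 | AAA | A | 500000 | 3
101 | AAB | B | 100000 | 1
101 | ABC | C | 20000 | 67
101 | DCE | B | 50000 | 1
101 | BOD | D | 200000 | 68
101 | ZYZ | E | 200000 | 69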
102 AAA A 20000 52
So for the first line, the type is A, which is present under Group 1, and Group 1 also contains Type B. For the same order ID 101, the overall sales of Types A and B are 650000 > 300000, so for Type A we choose the value 3 from Table 2. Since Type C is not present in Table 2, I went to Table 3 and added the two values (65 + 2 = 67), and so on.
Sorry for the long post. Hope my question is clear? Would like to have your expert opinion.
Thanks,
SS

Join all the tables, using LEFT JOIN throughout, because we want to keep the records from the first table even when there is no corresponding data in the second or third table.
For the Total column, give priority to the second table: use CASE WHEN to check which range the summed mrp falls in. If it lies within one of the ranges, pick the value from the second table; otherwise pick it from the third table.
SELECT
    s.order_id,
    s.reference,
    s.`type`,
    s.mrp,
    @a := IFNULL(g_total.Total, s.mrp) AS MRP_Total, -- @a variable to use it in CASE WHEN clause
    CASE
        WHEN @a > 100000 AND @a <= 200000 AND sg.`type` IS NOT NULL THEN sg.price_100000
        WHEN @a > 200000 AND @a <= 300000 AND sg.`type` IS NOT NULL THEN sg.price_200000
        WHEN @a > 300000 AND sg.`type` IS NOT NULL THEN sg.price_300000
        ELSE tp.price_1 + tp.price_2
    END Total
FROM sales s
LEFT JOIN sales_group sg ON s.`type` = sg.`type`
LEFT JOIN type_prices tp ON s.`type` = tp.`type`
LEFT JOIN (
    SELECT
        s.order_id, sgg.`group`, SUM(mrp) AS Total
    FROM sales s
    INNER JOIN sales_group sgg ON s.`type` = sgg.`type`
    GROUP BY s.order_id, sgg.`group`
) AS g_total -- Derived table: total MRP per order and group
    ON s.order_id = g_total.order_id AND sg.`group` = g_total.`group`
ORDER BY s.order_id, s.`type`;
Output:
sales
---
| order_id | reference | type | mrp | MRP_Total | Total |
---------------------------------------------------------
| 101 | AAA | A | 500000 | 650000 | 3 |
| 101 | DCE | B | 50000 | 650000 | 1 |
| 101 | AAB | B | 100000 | 650000 | 1 |
| 101 | ABC | C | 200000 | 200000 | 67 |
| 101 | BOD | D | 200000 | 200000 | 68 |
| 101 | ZYZ | E | 200000 | 200000 | 69 |
| 102 | AAA | A | 20000 | 20000 | 52 |
Note: sg.`type` IS NOT NULL is added to each WHEN branch so that, when a type has no mapping in the second table, the CASE falls through to the ELSE branch, which reads from the third table.
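As a rough cross-check of the approach above, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, the table and column names follow the answer's query, and the sales rows are taken from the answer's output table. It is an assumption of this sketch that the user variable can be replaced by testing the per-group total directly in the CASE expression.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (order_id INT, reference TEXT, type TEXT, mrp INT);
CREATE TABLE sales_group ("group" INT, type TEXT,
                          price_100000 INT, price_200000 INT, price_300000 INT);
CREATE TABLE type_prices (type TEXT, price_1 INT, price_2 INT);

INSERT INTO sales VALUES
  (101,'AAA','A',500000), (101,'AAB','B',100000), (101,'ABC','C',200000),
  (101,'DCE','B',50000),  (101,'BOD','D',200000), (101,'ZYZ','E',200000),
  (102,'AAA','A',20000);
INSERT INTO sales_group VALUES
  (1,'A',1,2,3), (1,'B',0,1,1), (2,'T',2,2,4),
  (2,'U',0,2,2), (3,'V',2,2,3), (4,'N',1,1,1);
INSERT INTO type_prices VALUES
  ('A',50,2), ('B',60,5), ('C',65,2), ('D',65,3), ('E',65,4);
""")

query = """
SELECT s.order_id, s.reference, s.type, s.mrp,
       COALESCE(g.total, s.mrp) AS mrp_total,
       CASE
         -- No group mapping, or group total not above 100000: use the third table
         WHEN sg.type IS NULL OR g.total <= 100000 THEN tp.price_1 + tp.price_2
         WHEN g.total <= 200000 THEN sg.price_100000
         WHEN g.total <= 300000 THEN sg.price_200000
         ELSE sg.price_300000
       END AS output
FROM sales s
LEFT JOIN sales_group sg ON sg.type = s.type
LEFT JOIN type_prices tp ON tp.type = s.type
LEFT JOIN (
    -- Total mrp per order and group
    SELECT s2.order_id, sg2."group" AS grp, SUM(s2.mrp) AS total
    FROM sales s2
    JOIN sales_group sg2 ON sg2.type = s2.type
    GROUP BY s2.order_id, sg2."group"
) AS g ON g.order_id = s.order_id AND g.grp = sg."group"
ORDER BY s.order_id, s.type, s.reference
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Testing the aggregated total directly avoids relying on the order in which a user variable is assigned and read within one SELECT.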

Related

Joining tables but needs 0 for empty rows

I don't know how to explain the scenario in words, so I am writing examples:
I have a table named tblType:
type_id | type_name
---------------------
1 | abb
2 | cda
3 | edg
4 | hij
5 | klm
And I have another table named tblRequest:
req_id | type_id | user_id | duration
-------------------------------------------
1 | 4 | 1002 | 20
2 | 1 | 1002 | 60
3 | 5 | 1008 | 60
....
So what I am trying to do is fetch the SUM() of duration for each type, for a particular user.
This is what I tried:
SELECT
SUM(r.`duration`) AS `duration`,
t.`type_id`,
t.`type_name`
FROM `tblRequest` AS r
LEFT JOIN `tblType` AS t ON r.`type_id` = t.`type_id`
WHERE r.`user_id` = '1002'
GROUP BY r.`type_id`
It might return something like this:
type_id | type_name | duration
-------------------------------
1 | abb | 60
4 | hij | 20
It works. But the issue is that I want to get 0 as the value for the other types that don't have a row in tblRequest. I want the output to be like this:
type_id | type_name | duration
-------------------------------
1 | abb | 60
2 | cda | 0
3 | edg | 0
4 | hij | 20
5 | klm | 0
That is, it should return rows for all types, with 0 as the value for those types that don't have a row in tblRequest.
You could perform the aggregation on tblRequest and only then join it, using a left join to handle missing rows and coalesce to convert the nulls to 0s:
SELECT t.type_id, type_name, COALESCE(sum_duration, 0) AS duration
FROM tblType t
LEFT JOIN (SELECT type_id, SUM(duration) AS sum_duration
FROM tblRequest
WHERE user_id = '1002'
GROUP BY type_id) r ON t.type_id = r.type_id
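-- Alternative: filter on the user inside the ON clause, then aggregate.
-- IFNULL is the MySQL form; two-argument ISNULL is SQL Server syntax.
SELECT a.type_id, IFNULL(SUM(b.duration), 0) AS duration
FROM tblType a LEFT OUTER JOIN tblRequest b
ON a.type_id = b.type_id AND b.user_id = 1002
GROUP BY a.type_id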
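The placement of the user filter is what makes the difference between the question's query and the answers. The sketch below, run against an in-memory SQLite database as a stand-in for MySQL, contrasts the filter in the ON clause with the same filter in WHERE, using the sample rows from the question. The variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblType (type_id INT, type_name TEXT);
CREATE TABLE tblRequest (req_id INT, type_id INT, user_id INT, duration INT);
INSERT INTO tblType VALUES (1,'abb'), (2,'cda'), (3,'edg'), (4,'hij'), (5,'klm');
INSERT INTO tblRequest VALUES (1,4,1002,20), (2,1,1002,60), (3,5,1008,60);
""")

# Filter in ON: unmatched types survive the LEFT JOIN with NULLs, which
# COALESCE turns into 0.
filter_in_on = conn.execute("""
SELECT t.type_id, t.type_name, COALESCE(SUM(r.duration), 0) AS duration
FROM tblType t
LEFT JOIN tblRequest r ON r.type_id = t.type_id AND r.user_id = 1002
GROUP BY t.type_id, t.type_name
ORDER BY t.type_id
""").fetchall()

# Filter in WHERE: rows whose r.user_id is NULL are discarded after the join,
# so the LEFT JOIN effectively behaves like an INNER JOIN.
filter_in_where = conn.execute("""
SELECT t.type_id, t.type_name, COALESCE(SUM(r.duration), 0) AS duration
FROM tblType t
LEFT JOIN tblRequest r ON r.type_id = t.type_id
WHERE r.user_id = 1002
GROUP BY t.type_id, t.type_name
ORDER BY t.type_id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```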

a query that returns a single row for each foreign key

I have a table of routines. This table has a column "grade" (which is not mandatory) and a column "date". I also have a number of days and an array of user ids. I need a query that, for each id in the array, returns the last routine that has a non-null "grade" and satisfies datediff(current_date, date) >= number_of_days, and then averages all these values.
e.g.
today = 2014/10/15
number_of_days = 10
ids(1,3)
routines
id | type | date | grade | user_id
1 | 1 | 2014-10-10 | 3 | 1
2 | 1 | 2014-10-04 | 3 | 1
3 | 1 | 2014-10-01 | 3 | 1
4 | 1 | 2014-09-24 | 2 | 1
5 | 1 | 2014-10-10 | 2 | 2
6 | 1 | 2014-10-04 | 3 | 2
7 | 1 | 2014-10-01 | 3 | 2
8 | 1 | 2014-09-24 | 1 | 2
9 | 1 | 2014-10-10 | 1 | 3
10 | 1 | 2014-10-04 | 1 | 3
11 | 1 | 2014-10-01 | 1 | 3
12 | 1 | 2014-09-24 | 1 | 3
In this case, my query would return the average of the "grade" values of rows #2 and #10.
I think you're saying that you want to consider rows having non-null values in the grade column, a date at least a given number of days before the current date, and one of a given set of user_ids. Among those rows, for each user_id you want to choose the row with the latest date, and compute an average of the grade columns for those rows.
I will assume that you cannot have any two rows with the same user_id and date, both with non-null grades, else the question you want to ask does not have a well-defined answer.
A query along these lines should do the trick:
SELECT AVG(r.grade) AS average_grade
FROM
(SELECT user_id, MAX(date) AS date
FROM routines
WHERE grade IS NOT NULL
AND DATEDIFF(CURDATE(), date) >= 10
AND user_id IN (1,3)
GROUP BY user_id) AS md
JOIN routines r
ON r.user_id = md.user_id AND r.date = md.date
Note that in principle you need a grade IS NOT NULL condition on both the inner and the outer query to select the correct rows to average, but in practice AVG() ignores nulls, so you don't actually have to filter out the extra rows in the outer query.
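To make the "latest qualifying row per user, then average" pattern concrete, here is a small sketch on the question's sample data using an in-memory SQLite database. SQLite has no DATEDIFF() or CURDATE(), so this sketch assumes julianday() differences as a substitute and passes "today" in as a parameter fixed to the question's example date.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE routines (id INT, type INT, date TEXT, grade INT, user_id INT)"
)
conn.executemany(
    "INSERT INTO routines VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2014-10-10", 3, 1), (2, 1, "2014-10-04", 3, 1),
        (3, 1, "2014-10-01", 3, 1), (4, 1, "2014-09-24", 2, 1),
        (5, 1, "2014-10-10", 2, 2), (6, 1, "2014-10-04", 3, 2),
        (7, 1, "2014-10-01", 3, 2), (8, 1, "2014-09-24", 1, 2),
        (9, 1, "2014-10-10", 1, 3), (10, 1, "2014-10-04", 1, 3),
        (11, 1, "2014-10-01", 1, 3), (12, 1, "2014-09-24", 1, 3),
    ],
)

query = """
SELECT AVG(r.grade) AS average_grade
FROM (
    -- Latest qualifying date per user
    SELECT user_id, MAX(date) AS date
    FROM routines
    WHERE grade IS NOT NULL
      AND julianday(:today) - julianday(date) >= :days
      AND user_id IN (1, 3)
    GROUP BY user_id
) AS md
JOIN routines r ON r.user_id = md.user_id AND r.date = md.date
"""

params = {"today": "2014-10-15", "days": 10}
average_grade = conn.execute(query, params).fetchone()[0]
print(average_grade)
```

With these rows the inner query picks 2014-10-04 for both users 1 and 3, which are rows #2 and #10 from the question.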

prevent JOIN from matching same row over and over again

So I have 2 tables like this:
trans_data table
mnth| id | units
2 | ab | 20
3 | cd | 20
2 | ab | 25
2 | fd | 28
2 | ab | 40
2 | cd | 70
3 | ab | 80
2 | ab | 10
quota table
mnth | metric | id | quota
2 | 1 | ab | 30
2 | 1 | cd | 30
2 | 1 | fd | 30
3 | 1 | ab | 40
3 | 1 | cd | 40
3 | 1 | fd | 40
Here is my SQL
SELECT
SUM(trans_data.units) AS ga, SUM(quota.quota)
FROM
trans_data
LEFT JOIN quota ON
trans_data.id = quota.id
AND quota.mnth BETWEEN 2 AND 2
AND quota.metric = 1
WHERE trans_data.id = 'ab'
AND trans_data.mnth BETWEEN 2 AND 2
What is happening is that, since there are multiple rows in the trans_data table that have id='ab', each of those rows is getting paired with the one row in quota that has id='ab'.
This throws off the sum value. What can I do so that the rows from quota are not repeated in the SUM() calculation?
Desired Result:
sum(trans_data.units) | sum(Quota.quota)
183 | 30
You don't need to join the two tables on id; just calculate the two sums independently:
SELECT ga, total_quota
FROM (SELECT SUM(units) AS ga
FROM trans_data
WHERE id = 'ab'
AND mnth BETWEEN 2 AND 2) AS t1
CROSS JOIN
(SELECT SUM(quota) AS total_quota
FROM quota
WHERE id = 'ab'
AND mnth BETWEEN 2 AND 2
AND metric = 1) AS t2
Or:
SELECT
(SELECT SUM(units)
FROM trans_data
WHERE id = 'ab'
AND mnth BETWEEN 2 AND 2) AS ga,
(SELECT SUM(quota)
FROM quota
WHERE id = 'ab'
AND mnth BETWEEN 2 AND 2
AND metric = 1) AS total_quota
You don't actually need to sum the quota in this case since there is only a single row, so you could change the SUM function to MAX. In general, Barmar's solution is best since it will handle more than one row if that ever occurs.
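The fan-out effect and the independent-sums fix can be reproduced with the sample rows as posted, using an in-memory SQLite database as a stand-in for MySQL. Note that with exactly these rows the units for id 'ab' in month 2 add up to 95 rather than the 183 shown in the question, so the sketch only illustrates the difference between the two quota sums.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trans_data (mnth INT, id TEXT, units INT);
CREATE TABLE quota (mnth INT, metric INT, id TEXT, quota INT);
INSERT INTO trans_data VALUES
  (2,'ab',20), (3,'cd',20), (2,'ab',25), (2,'fd',28),
  (2,'ab',40), (2,'cd',70), (3,'ab',80), (2,'ab',10);
INSERT INTO quota VALUES
  (2,1,'ab',30), (2,1,'cd',30), (2,1,'fd',30),
  (3,1,'ab',40), (3,1,'cd',40), (3,1,'fd',40);
""")

# Original shape: the single quota row is repeated once per matching
# trans_data row, so SUM(quota) is inflated.
joined = conn.execute("""
SELECT SUM(t.units), SUM(q.quota)
FROM trans_data t
LEFT JOIN quota q
       ON t.id = q.id AND q.mnth BETWEEN 2 AND 2 AND q.metric = 1
WHERE t.id = 'ab' AND t.mnth BETWEEN 2 AND 2
""").fetchone()

# Independent scalar subqueries: each table is aggregated on its own.
separate = conn.execute("""
SELECT
  (SELECT SUM(units) FROM trans_data
    WHERE id = 'ab' AND mnth BETWEEN 2 AND 2) AS ga,
  (SELECT SUM(quota) FROM quota
    WHERE id = 'ab' AND mnth BETWEEN 2 AND 2 AND metric = 1) AS total_quota
""").fetchone()

print(joined)
print(separate)
```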

MySQL - Get row with the maximum HISTORY ID for COMPONENT IDs in non-existing months

I have a table INVENTORY which consists of inventory items. I have the following table structure:
INSTALLATION_ID
COMPONENT_ID
HISTORY_ID
ON_STOCK
LAST_CHANGE
I need to obtain the row with the max HISTORY_ID for records for which the specified LAST_CHANGE month doesn't exist.
Each COMPONENT_ID and INSTALLATION_ID can occur multiple times; they are distinguished by their respective HISTORY_ID.
Example:
I have the following records
COMPONENT_ID | INSTALLATION_ID | HISTORY_ID | LAST_CHANGE
1 | 100 | 1 | 2013-01-02
1 | 100 | 2 | 2013-02-01
1 | 100 | 3 | 2013-04-09
2 | 100 | 1 | 2013-02-22
2 | 100 | 2 | 2013-03-12
2 | 100 | 3 | 2013-07-07
2 | 100 | 4 | 2013-08-11
2 | 100 | 5 | 2013-09-15
2 | 100 | 6 | 2013-09-29
3 | 100 | 1 | 2013-02-14
3 | 100 | 2 | 2013-09-23
4 | 100 | 1 | 2013-04-17
I am now trying to retrieve the rows with the max HISTORY_ID for each component, but ONLY for COMPONENT_IDs for which the specified month does not exist.
I have tried the following:
SELECT
INVENTORY.COMPONENT_ID,
INVENTORY.HISTORY_ID
FROM INVENTORY
WHERE INVENTORY.HISTORY_ID = (SELECT
MAX(t2.HISTORY_ID)
FROM INVENTORY t2
WHERE NOT EXISTS
(
SELECT *
FROM INVENTORY t3
WHERE MONTH(t3.LAST_CHANGE) = 9
AND YEAR(t3.LAST_CHANGE)= 2013
AND t3.HISTORY_ID = t2.HISTORY_ID
)
)
AND INVENTORY.INSTALLATION_ID = 200
AND YEAR(INVENTORY.LAST_CHANGE) = 2013
The query seems to have correct syntax but it times out.
In this particular case, I would like to retrieve the maximum HISTORY_ID for all components except for those that have records in September.
Because I need to completely exclude components by month, I cannot use NOT IN, since that would just suppress the September records while the same component could still show up with another month.
Could anybody give some pointers? Thanks a lot.
If I understand correctly what you want, you can do it like this:
SELECT component_id, MAX(history_id) history_id
FROM inventory
WHERE last_change BETWEEN '2013-01-01' AND '2013-12-31'
AND installation_id = 100
GROUP BY component_id
HAVING MAX(MONTH(last_change) = 9) = 0
Output:
| COMPONENT_ID | HISTORY_ID |
|--------------|------------|
| 1 | 3 |
| 4 | 1 |
If you always filter by installation_id and the year of last_change, make sure that you have a compound index on (installation_id, last_change):
ALTER TABLE inventory ADD INDEX (installation_id, last_change);
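For anyone who wants to try the HAVING trick without a MySQL server, the following sketch runs an equivalent query on the question's sample rows in an in-memory SQLite database. MONTH() does not exist in SQLite, so strftime('%m', ...) is used instead as an assumption of this sketch. As in MySQL, a comparison evaluates to 0 or 1, so MAX(...) = 0 means that no row of the component falls in September.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (component_id INT, installation_id INT, "
    "history_id INT, last_change TEXT)"
)
conn.executemany(
    "INSERT INTO inventory VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, "2013-01-02"), (1, 100, 2, "2013-02-01"),
        (1, 100, 3, "2013-04-09"), (2, 100, 1, "2013-02-22"),
        (2, 100, 2, "2013-03-12"), (2, 100, 3, "2013-07-07"),
        (2, 100, 4, "2013-08-11"), (2, 100, 5, "2013-09-15"),
        (2, 100, 6, "2013-09-29"), (3, 100, 1, "2013-02-14"),
        (3, 100, 2, "2013-09-23"), (4, 100, 1, "2013-04-17"),
    ],
)

result = conn.execute("""
SELECT component_id, MAX(history_id) AS history_id
FROM inventory
WHERE last_change BETWEEN '2013-01-01' AND '2013-12-31'
  AND installation_id = 100
GROUP BY component_id
-- Keep only components with no September row in the filtered set
HAVING MAX(strftime('%m', last_change) = '09') = 0
ORDER BY component_id
""").fetchall()

print(result)
```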

Remove duplicates from one column keeping whole rows

id | userid | total_points_spent
1 | 1 | 10
2 | 2 | 15
3 | 2 | 50
4 | 3 | 5
5 | 1 | 15
With the above table, I would first like to remove duplicates of userid keeping the rows with the largest total_points_spent, like so:
id | userid | total_points_spent
3 | 2 | 50
4 | 3 | 5
5 | 1 | 15
And then I would like to sum the values of total_points_spent, which would be the easy part, resulting in 70.
I am not really sure whether by "remove" you meant delete or select. Here is a query that selects only the row with the maximum total_points_spent for each userid (tblA stands in for your table name):
SELECT tblA.*
FROM ( SELECT userid, MAX(total_points_spent) AS maxtotal
       FROM tblA
       GROUP BY userid ) AS dt
INNER JOIN tblA
    ON tblA.userid = dt.userid
    AND tblA.total_points_spent = dt.maxtotal
ORDER BY tblA.userid
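The question also asks for the sum of the kept values. A minimal sketch on the sample rows, using an in-memory SQLite database as a stand-in for MySQL and the placeholder table name tblA, shows both the rows that survive the per-user maximum and the sum taken directly over the grouped maxima.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; tblA is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblA (id INT, userid INT, total_points_spent INT);
INSERT INTO tblA VALUES (1,1,10), (2,2,15), (3,2,50), (4,3,5), (5,1,15);
""")

# Rows holding the largest total_points_spent for each userid.
kept_rows = conn.execute("""
SELECT tblA.*
FROM (SELECT userid, MAX(total_points_spent) AS maxtotal
      FROM tblA
      GROUP BY userid) AS dt
INNER JOIN tblA
        ON tblA.userid = dt.userid
       AND tblA.total_points_spent = dt.maxtotal
ORDER BY tblA.userid
""").fetchall()

# Sum of the per-user maxima; no join back is needed for the total alone.
total = conn.execute("""
SELECT SUM(maxtotal)
FROM (SELECT userid, MAX(total_points_spent) AS maxtotal
      FROM tblA
      GROUP BY userid) AS dt
""").fetchone()[0]

print(kept_rows)
print(total)
```

Summing over the grouped subquery also stays correct if a user has two rows tied for the maximum, whereas summing over the joined rows would count that maximum twice.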