SQL dividing numbers by different group by results - mysql

Basically I have a table with several columns but I'm only interested in 4. My table looks like this:
accountID productID profit cost date country city quantity
2 10 25 15 2017/03/13 Afghanistan Kabul 330
18 3 45 42 2017/05/14 UK London 5300
25 14 22 17 2017/05/21 UK London 300
3 11 30 26 2017/04/23 Afghanistan Herat 400
What I want to achieve is to get the total quantity of products by city, by country and the ratio of city_quantity/country_quantity:
country city city_quantity country_quantity city_percentage
Afghanistan Kabul 800 1400 0.57
Afghanistan Kandahar 400 1400 0.29
Afghanistan Herat 200 1400 0.14
UK London 6500 10000 0.65
UK Manchester 3000 10000 0.3
UK Newcastle 500 10000 0.05
So far my script looks like this:
select country, city, sum(quantity)
from table
where date > dateadd(month,-1,getdate())
group by country, city
order by country, city
The where condition is because I only want the last month of data so consider it irrelevant.
How can I achieve what I want with a simple script?

You can do this by joining the table to a subquery that sums the quantity per country, then using that total to compute the percentage, like so:
CREATE TABLE testTable (country VARCHAR(2), city CHAR(1), quantity DECIMAL(10,2));
INSERT INTO testTable (country, city, quantity)
VALUES
('Af', 'K', 330),
('Af', 'H', 400),
('Af', 'Q', 700),
('UK', 'L', 1000),
('UK', 'M', 500);
SELECT t.country, t.city, t.quantity AS city_quantity, q.qty AS country_quantity,
CAST(t.quantity / q.qty AS DECIMAL(10,2)) AS city_percentage
FROM testTable t
JOIN
(
SELECT z.country, SUM(z.quantity) qty
FROM testTable z
GROUP BY z.country
) q ON t.country = q.country
This will give you the desired result set:
+---------+------+--------------+-----------------+-----------------+
| country | city | city_quantity| country_quantity| city_percentage |
+---------+------+--------------+-----------------+-----------------+
| Af | K | 330 | 1430 | 0.23 |
| Af | H | 400 | 1430 | 0.28 |
| Af | Q | 700 | 1430 | 0.49 |
| UK | L | 1000 | 1500 | 0.67 |
| UK | M | 500 | 1500 | 0.33 |
+---------+------+--------------+-----------------+-----------------+
Keep in mind that you'll want to check for division by zero in any production query.
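The query above assumes one row per city. If a city can appear on several rows, as in the question, the city totals need a GROUP BY as well. Here is a minimal sketch, assuming MySQL 8.0 or later (window functions are not available before that) and the testTable defined above. NULLIF turns a zero country total into NULL, so the division yields NULL instead of failing.

```sql
SELECT country,
       city,
       SUM(quantity) AS city_quantity,
       -- window over the grouped rows: total of the city sums within each country
       SUM(SUM(quantity)) OVER (PARTITION BY country) AS country_quantity,
       CAST(SUM(quantity)
            / NULLIF(SUM(SUM(quantity)) OVER (PARTITION BY country), 0)
            AS DECIMAL(10,2)) AS city_percentage
FROM testTable
GROUP BY country, city
ORDER BY country, city;
```

The date filter from the question would go in a WHERE clause before the GROUP BY, so that both the city and country totals cover the same rows.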

Related

Conditional Window Functions

I have a sales table that looks like this:
store_id cust_id txn_id txn_date amt industry
200 1 1 20180101 21.01 1000
200 2 2 20200102 20.01 1000
200 2 3 20200103 19 1000
200 3 4 20180103 19 1000
200 4 5 20200103 21.01 1000
300 2 6 20200104 1.39 2000
300 1 7 20200105 12.24 2000
300 1 8 20200105 25.02 2000
400 2 9 20180106 103.1 1000
400 2 10 20200107 21.3 1000
Here's the code to generate this sample table:
CREATE TABLE sales(
store_id INT,
cust_id INT,
txn_id INT,
txn_date bigint,
amt float,
industry INT);
INSERT INTO sales VALUES(200,1,1,20180101,21.01,1000);
INSERT INTO sales VALUES(200,2,2,20200102,20.01,1000);
INSERT INTO sales VALUES(200,2,3,20200103,19.00,1000);
INSERT INTO sales VALUES(200,3,4,20180103,19.00,1000);
INSERT INTO sales VALUES(200,4,5,20200103,21.01,1000);
INSERT INTO sales VALUES(300,2,6,20200104,1.39,2000);
INSERT INTO sales VALUES(300,1,7,20200105,12.24,2000);
INSERT INTO sales VALUES(300,1,8,20200105,25.02,2000);
INSERT INTO sales VALUES(400,2,9,20180106,103.1,1000);
INSERT INTO sales VALUES(400,2,10,20200107,21.3,1000);
What I would like to do is create a new table, results that answers the question: what percentage of my VIP customers have, since January 3rd 2020, shopped i) at my store only; ii) at my store and at other stores in the same industry; iii) at only other stores in the same industry? Define a VIP customer to be someone who has shopped at a given store at least once since 2019.
Here's the target output table:
store industry pct_my_store_only pct_both pct_other_stores_only
200 1000 0.5 0.5 0.0
300 2000 0.5 0.5 0.0
400 1000 0.0 1.0 0.0
I'm trying to use window functions to accomplish this. Here's what I have so far:
CREATE TABLE results as
SELECT s.store_id, s.industry,
COUNT(DISTINCT (CASE WHEN s.txn_date>=20200103 THEN s.cust_id END)) * 1.0 / sum(count(DISTINCT (CASE WHEN s.txn_date>=20200103 THEN s.cust_id END))) OVER (PARTITION BY s.industry) AS pct_my_store_only
...AS pct_both
...AS pct_other_stores_only
FROM sales s
WHERE sales.txn_date>=20190101
GROUP BY s.store_id, s.industry;
The above does not seem to be correct; how can I correct this?
Join the distinct store_ids and industries to the concatenated distinct store_ids and industries for each customer, then use the window function avg() together with find_in_set() to determine what share of customers have or have not shopped at each store:
with
stores as (
select distinct store_id, industry
from sales
where txn_date >= 20190103
),
customers as (
select cust_id,
group_concat(distinct store_id) stores,
group_concat(distinct industry) industries
from sales
where txn_date >= 20190103
group by cust_id
),
cte as (
select *,
avg(concat(s.store_id) = concat(c.stores)) over (partition by s.store_id, s.industry) pct_my_store_only,
avg(find_in_set(s.store_id, c.stores) = 0) over (partition by s.industry) pct_other_stores_only
from stores s inner join customers c
on find_in_set(s.industry, c.industries) and find_in_set(s.store_id, c.stores)
)
select distinct store_id, industry,
pct_my_store_only,
1 - pct_my_store_only - pct_other_stores_only pct_both,
pct_other_stores_only
from cte
order by store_id, industry
Results:
store_id | industry | pct_my_store_only | pct_both | pct_other_stores_only
-------: | -------: | ----------------: | -------: | --------------------:
200 | 1000 | 0.5000 | 0.5000 | 0.0000
300 | 2000 | 0.5000 | 0.5000 | 0.0000
400 | 1000 | 0.0000 | 1.0000 | 0.0000
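The answer relies on two MySQL behaviours that are easy to miss: find_in_set() returns a position (or 0), and a comparison evaluates to 1 or 0, so avg() of a comparison is a proportion. A small sketch of both in isolation, using the sales table from the question; the column aliases are made up for illustration:

```sql
-- FIND_IN_SET returns the 1-based position of a value in a
-- comma-separated list, or 0 when the value is absent.
SELECT FIND_IN_SET('300', '200,300,400') AS found_pos,
       FIND_IN_SET('500', '200,300,400') AS missing_pos;

-- A comparison evaluates to 1 (true) or 0 (false) in MySQL,
-- so AVG() over it gives the share of rows where it holds.
SELECT AVG(amt > 20) AS share_over_20
FROM sales;
```

With the ten sample rows, six have amt above 20, so the second query should come out at 0.6.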

Adding Amt1 and Amt2 values to the output column value of previous record

Input:
dated amount Amt1 Amt2
1/1/2017 100 0 10
1/2/2017 100 10 0
1/4/2017 100 0 0
1/6/2017 100 300 10
1/10/2017 100 0 20
1/11/2017 100 350 650
1/12/2017 100 0 234
Output:
dated amount Amt1 Amt2 Output Column
1/1/2017 100 0 10 100
1/2/2017 100 10 0 110
1/4/2017 100 0 0 120
1/6/2017 100 300 10 120
1/10/2017 100 0 20 430
1/11/2017 100 350 650 450
1/12/2017 100 0 234 1450
Output column is calculated with adding Amt1 and Amt2 values to the Output Column value of previous record.
Example: Output Column of
first record is as it is of Amount column,
second record will get from first record value of output column and Amt1 and Amt2 of first record i.e 100+0+10=110,
third record is from 110+10+0=120
fourth record is from 120+0+0=120
fifth record is from 120+300+10=430 ...
There are lots of examples of how to calculate running totals on this site; here's one that uses a user variable. I am concerned that the purpose of the amount column is not defined, but this solution works with the data provided on MySQL versions below 8 (it also works on version 8 and above, but there are better ways of doing it there). @tcadidot0 no hard coding required.
drop table if exists t;
create table t
( dated date, amount int, Amt1 int, Amt2 int);
insert into t values
(str_to_date('1/1/2017','%d/%m/%Y') , 100 , 0 , 10),
(str_to_date('1/2/2017','%d/%m/%Y') , 100 , 10 , 0),
(str_to_date('1/4/2017','%d/%m/%Y') , 100 , 0 , 0),
(str_to_date('1/6/2017','%d/%m/%Y') , 100 , 300 , 10),
(str_to_date('1/10/2017','%d/%m/%Y') , 100 , 0 , 20),
(str_to_date('1/11/2017','%d/%m/%Y') , 100 , 350 , 650),
(str_to_date('1/12/2017','%d/%m/%Y') , 100 , 0 , 234);
select t.dated,t.amount,t.amt1,t.amt2,
if(t.dated = (select min(t1.dated) from t t1),@op:=amount,
@op:=@op +
(select amt1 + amt2 from t t1 where t1.dated < t.dated order by t1.dated desc limit 1)
) op
from t
cross join (select @op:=0) o
order by dated;
+------------+--------+------+------+------+
| dated | amount | amt1 | amt2 | op |
+------------+--------+------+------+------+
| 2017-01-01 | 100 | 0 | 10 | 100 |
| 2017-02-01 | 100 | 10 | 0 | 110 |
| 2017-04-01 | 100 | 0 | 0 | 120 |
| 2017-06-01 | 100 | 300 | 10 | 120 |
| 2017-10-01 | 100 | 0 | 20 | 430 |
| 2017-11-01 | 100 | 350 | 650 | 450 |
| 2017-12-01 | 100 | 0 | 234 | 1450 |
+------------+--------+------+------+------+
7 rows in set (0.00 sec)
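For MySQL 8.0 and later, one of the better ways mentioned above is a window function with an explicit frame. This is only a sketch, assuming the table t created above and that dated is unique per row: the output is the first row's amount plus the sum of amt1 + amt2 over all earlier rows.

```sql
SELECT dated, amount, amt1, amt2,
       -- amount of the earliest row is the starting value
       FIRST_VALUE(amount) OVER (ORDER BY dated)
       -- add amt1 + amt2 of every row strictly before the current one;
       -- the frame is empty for the first row, so SUM is NULL there
       + COALESCE(SUM(amt1 + amt2) OVER (ORDER BY dated
                                         ROWS BETWEEN UNBOUNDED PRECEDING
                                                  AND 1 PRECEDING), 0) AS op
FROM t
ORDER BY dated;
```

Unlike the variable-based version, this does not depend on the order in which MySQL happens to evaluate user-variable assignments.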

MySQL SUM while keeping Summed Data Separate

Wondering if someone can point me in the right direction.
I am trying to build some reports by querying a sales table. My issue is that I want to use SUM and GROUP BY but still keep the original lines.
For example:
Select StockCode, SUM('Sell' * Qty) as Total from 'Sales'
Group by StockCode
What I would like is to display the results as a SUM, but I also want to retain the Qty.
So the output would be something like:
StockCode Qty Total.
I can obviously run some PHP to do the calculation easily enough, but I was trying to do as much as possible in SQL queries to avoid unnecessarily cluttered code.
Is there a simple way to do this, or would you advise just doing the calculations in PHP?
Table Example:
StockCode Qty Sell
1234 2 1.99
5468 1 0.99
2456 2 2.99
1234 3 1.99
5648 1 2.99
2546 2 4.99
2456 3 2.99
Sell is Per Item
Example:
StockCode Qty Sell Total
1234 2 1.99 3.98
5468 1 0.99 0.99
2456 2 2.99 5.98
1234 3 1.99 5.97
5648 1 2.99 2.99
2546 2 4.99 9.98
2456 3 2.99 8.97
Results example:
1234 5 1.99 9.95
5468 2 0.99 1.98
2456 5 2.99 14.95
If I understand correctly you need to do this:
SELECT StockCode, Sell, SUM(Qty) AS Quantity_Sold_At_This_Price, SUM(Sell * Qty) AS Quantity_x_Price
FROM sales
GROUP BY StockCode, Sell
The result:
| StockCode | Sell | Quantity_Sold_At_This_Price | Quantity_x_Price |
| 1234 | 1.99 | 5 | 9.95 |
| 2456 | 2.99 | 5 | 14.95 |
| 2546 | 4.99 | 2 | 9.98 |
| 5468 | 0.99 | 1 | 0.99 |
| 5648 | 2.99 | 1 | 2.99 |
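The problem may be the quoting: in MySQL, single quotes make 'Sell' and 'Sales' string literals rather than column and table names. Use backticks for identifiers instead: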
Select StockCode, SUM(`Sell` * `Qty`) as Total from `Sales`
Group by StockCode
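If the goal is to keep every original line and still show the per-StockCode total next to it, a window function is one option. This is a sketch, assuming MySQL 8.0 or later and the sales table from the question; the aliases LineTotal and StockTotal are made up for illustration.

```sql
SELECT StockCode,
       Qty,
       Sell,
       Sell * Qty AS LineTotal,                                  -- value of this line
       SUM(Sell * Qty) OVER (PARTITION BY StockCode) AS StockTotal -- total for the stock code
FROM sales
ORDER BY StockCode;
```

Each input row stays in the output, and StockTotal repeats on every row that shares a StockCode, so no extra calculation is needed in PHP.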

MySQL inner join to sort parent, child and grand-child from the same table

A MySQL database contains countries, towns and areas within the towns, all in the "mailshot" table. I want to return the whole set in order of descending granularity, using an inner join. I actually want to present the user with a drop-down list so they can choose a country, a town or an area within a town.
The data looks like this:
mailshot_id mailshot_parent mailshot_name mailshot_level
49 0 England 0
56 0 Scotland 0
140 49 London 1
149 49 York 1
191 56 Glasgow 1
300 140 Wimbledon 2
310 140 Westminster 2
493 56 Edinburgh 1
and I want it output like this:
mailshot_id mailshot_parent mailshot_name mailshot_level
49 0 England 0
149 49 York 1
140 49 London 1
300 140 Wimbledon 2
310 140 Westminster 2
56 0 Scotland 0
191 56 Glasgow 1
493 56 Edinburgh 1
I've almost got it with this:
SELECT
p.mailshot_id as p_id,
p.mailshot_name as p_name,
p.mailshot_level as p_level,
p.mailshot_parent as p_parent,
c.mailshot_id as c_id,
c.mailshot_parent as c_parent,
c.mailshot_level as c_level,
c.mailshot_name as c_name,
case
WHEN p.mailshot_parent = 0 THEN
p.mailshot_id
ELSE
p.mailshot_parent
END AS calcOrder
FROM
mailshot p LEFT JOIN mailshot c
ON p.mailshot_id = c.mailshot_parent
ORDER BY calcOrder, p_id
but it's not grouping the grandchild records (level 2) close to the child records (level 1). I think the "case" part must be wrong, and I need some relationship between mailshot_id and mailshot_parent that depends on the level, but I can't think of it.
Any suggestions? Thanks in advance.
Unfortunately, MySQL versions before 8.0 do not support hierarchical queries (no START WITH...CONNECT BY or recursive CTE equivalents). Because of that, you need to do this the hard and ugly way.
The following will work for your 3 levels, but gets pretty cumbersome if you need a lot more depth in the tree:
SELECT C.MAILSHOT_ID
,C.MAILSHOT_PARENT
,C.MAILSHOT_NAME
,C.MAILSHOT_LEVEL
,CASE WHEN C.MAILSHOT_LEVEL = 0
THEN CAST(C.MAILSHOT_ID AS CHAR(4))
WHEN C.MAILSHOT_LEVEL = 1
THEN CONCAT(CAST(C.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_ID AS CHAR(4)))
ELSE CONCAT(CAST(P.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_ID AS CHAR(4)))
END AS SORT_ORDER
FROM MAILSHOT C
LEFT OUTER JOIN
MAILSHOT P
ON P.MAILSHOT_ID = C.MAILSHOT_PARENT
ORDER BY CASE WHEN C.MAILSHOT_LEVEL = 0
THEN CAST(C.MAILSHOT_ID AS CHAR(4))
WHEN C.MAILSHOT_LEVEL = 1
THEN CONCAT(CAST(C.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_ID AS CHAR(4)))
ELSE CONCAT(CAST(P.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_PARENT AS CHAR(4)),"..",CAST(C.MAILSHOT_ID AS CHAR(4)))
END
This is a typical example of a hierarchical table, which is easier to query in Oracle, but that's beside the point. @Declan_K gave you a good answer to achieve what you want. If you are looking for an alternative that gives you a slightly different but still well-organized output, you could try this approach:
SELECT m1.mailshot_name AS lev1n ,
m1.mailshot_id AS lev1,
m1.mailshot_parent AS lev1p,
m2.mailshot_name AS lev2n,
m2.mailshot_id AS lev2,
m2.mailshot_parent AS lev2p,
m3.mailshot_name lev3n,
m3.mailshot_id lev3,
m3.mailshot_parent AS lev3p
FROM mailshot m1
LEFT JOIN mailshot m2 ON m2.mailshot_parent = m1.mailshot_id
LEFT JOIN mailshot m3 ON m3.mailshot_parent = m2.mailshot_id
WHERE m1.mailshot_parent = 0;
Gives output:
+----------+------+-------+-----------+------+-------+-------------+------+-------+
| lev1n | lev1 | lev1p | lev2n | lev2 | lev2p | lev3n | lev3 | lev3p |
+----------+------+-------+-----------+------+-------+-------------+------+-------+
| England | 49 | 0 | London | 140 | 49 | Wimbledon | 300 | 140 |
| England | 49 | 0 | London | 140 | 49 | Westminster | 310 | 140 |
| England | 49 | 0 | York | 149 | 49 | NULL | NULL | NULL |
| Scotland | 56 | 0 | Glasgow | 191 | 56 | NULL | NULL | NULL |
| Scotland | 56 | 0 | Edinburgh | 493 | 56 | NULL | NULL | NULL |
+----------+------+-------+-----------+------+-------+-------------+------+-------+
Good summaries on how to deal with hierarchical data in MySQL can be found here:
http://mikehillyer.com/articles/managing-hierarchical-data-in-mysql/
http://explainextended.com/2009/03/17/hierarchical-queries-in-mysql/
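On MySQL 8.0 and later, a recursive CTE can walk the tree to any depth. The following is a sketch rather than a drop-in replacement: it assumes top-level rows have mailshot_parent = 0 as in the sample data and that ids have at most 6 digits. The names tree and sort_path are made up for illustration.

```sql
WITH RECURSIVE tree AS (
  -- anchor: top-level rows; ids are zero-padded so string order matches numeric order
  SELECT mailshot_id, mailshot_parent, mailshot_name, mailshot_level,
         CAST(LPAD(mailshot_id, 6, '0') AS CHAR(255)) AS sort_path
  FROM mailshot
  WHERE mailshot_parent = 0
  UNION ALL
  -- recursive step: append each child's padded id to its parent's path
  SELECT c.mailshot_id, c.mailshot_parent, c.mailshot_name, c.mailshot_level,
         CONCAT(t.sort_path, '.', LPAD(c.mailshot_id, 6, '0'))
  FROM mailshot c
  JOIN tree t ON c.mailshot_parent = t.mailshot_id
)
SELECT mailshot_id, mailshot_parent, mailshot_name, mailshot_level
FROM tree
ORDER BY sort_path;
```

The CAST to CHAR(255) in the anchor matters because the column type of a recursive CTE is taken from its non-recursive part; without it, the longer paths built in the recursive step may not fit.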

Empty set returned from query

Any help is greatly appreciated.
I have a table hospital:
Nurse + Year + No.Patients
A001 |2000 | 23
A001 |2001 | 30
A001 |2002 | 35
B001 |2000 | 12
B001 |2001 | 15
B001 |2002 | 45
C001 |2000 | 50
C002 |2001 | 59
C003 |2002 | 69
etc
What I am trying to do is work out which nurse
had the greatest increase of patients for the years 2000 - 2002.
Clearly B001 did, as her patients increased from 12 to 45, an increase of 33,
and what I am trying to produce is the result B001 | 33.
This is what I have so far:
select a.nurse,a.nopats from hospital as a
join
( select nurse,max(nopats)-min(nopats) as growth
from hospital where year between 2000 and 2002 group by nurse ) as s1
on a.nurse = s1.nurse and a.nopats = s1.growth
where year between 2000 and 2002;
but all I get returned is an empty set.
I think I need an overall max(nopats) after the join.
Any help here would be great.
Thanks!
Try this:
SELECT nurse, (max(nopats) - min(nopats)) AS growth
FROM hospital
WHERE year BETWEEN 2000 AND 2002
GROUP BY nurse
ORDER BY growth DESC
LIMIT 1;
Result: B001 | 33, because of LIMIT 1; just omit it if you want more results.
SELECT nurse, MAX(nopats) - MIN(nopats) AS Growth
FROM hospital
WHERE year BETWEEN 2000 AND 2002
GROUP BY nurse
ORDER BY Growth DESC
That should do it. Let me know if that's what you needed.
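The original query returns an empty set because it joins on a.nopats = s1.growth, which only matches when some row's patient count happens to equal the growth figure. Note also that MAX(nopats) - MIN(nopats) measures the spread between the highest and lowest counts, so a nurse whose patients fell over the period would score just as highly. If the increase should be measured strictly from 2000 to 2002, conditional aggregation is one way to sketch it, assuming one row per nurse per year and the column names nurse, year and nopats used in the queries above:

```sql
SELECT nurse,
       -- patients in the last year minus patients in the first year
       MAX(CASE WHEN year = 2002 THEN nopats END)
     - MAX(CASE WHEN year = 2000 THEN nopats END) AS growth
FROM hospital
WHERE year IN (2000, 2002)
GROUP BY nurse
ORDER BY growth DESC
LIMIT 1;
```

Nurses missing either year get a NULL growth, which MySQL sorts last in descending order. For the rows shown in the question, this should again return B001 with 33.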