Group data by age, gender and report requested in MySQL - mysql

I'm having trouble formulating my MySQL statement / query string. I have a table (see sample below)
It's a sample table containing gender, age and report_requested fields. What I would like to have as an output is something like this:
20 years old, male | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
20 years old female | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
20 years old combined | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
21 years old, male | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
21 years old female | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
21 years old combined | Total number of users | Total who requested min 1 report | total who requested 2 reports | total 3 or more reports
... etc.
But I'm having a hard time with it. All I know how to do is determine the number of users who requested (1, 2, 3, ... etc.) credit reports for a given gender and age.
Here's what I used:
SELECT COUNT(*) as cnt, report_requested
FROM sampletable WHERE age = '39'
AND gender = 'M' GROUP BY report_requested
Here's the result:
It just returns the data for 20 yrs old male: the number of users who requested 1 credit report, 2 credit reports, and so on up to 8 (but this is also wrong, since it should combine the number of users who requested 3 or more credit reports).
Can anybody here help me or give me an idea on how I could accomplish this?

Your GROUP BY clause should actually be on both age and gender, since those are the two columns you are aggregating by. The way to think about it is that you want exactly one row per age and gender, i.e. 1 row for male/20 yrs, 1 row for female/20 yrs, 1 row for male/21 yrs, etc. So you would do:
GROUP BY age, gender
And instead of grouping by the report_requested column, count the users conditionally on the number of reports they requested. In SQL this is done with a CASE expression inside SUM: each row contributes 1 when the condition holds and 0 otherwise. So your query would look like this:
SELECT AGE, GENDER,
SUM(CASE WHEN report_requested = 1 THEN 1 ELSE 0 END) AS 'Total who requested 1 report',
SUM(CASE WHEN report_requested = 2 THEN 1 ELSE 0 END) AS 'Total who requested 2 reports',
SUM(CASE WHEN report_requested >= 3 THEN 1 ELSE 0 END) AS 'Total who requested 3 or more reports'
FROM sampletable
GROUP BY AGE, GENDER
Let me know how it goes. I removed the WHERE clause because I assumed that was for testing only.
EDIT: Updated after the comments below, that it's not the total requested rather total who requested 1 report, total who requested 2 reports, etc.
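The answer's query leaves out two things the question asked for: the total number of users and the "combined" row per age. Below is a minimal sketch of one way to add them, run through Python's sqlite3 module so it can be tried without a MySQL server. The sample rows and the short column aliases are made up for illustration, and the UNION ALL branch is only one option for the combined row.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sampletable (age INTEGER, gender TEXT, report_requested INTEGER)"
)
# Hypothetical sample rows, not taken from the question.
conn.executemany(
    "INSERT INTO sampletable VALUES (?, ?, ?)",
    [
        (20, "M", 1), (20, "M", 2), (20, "M", 5),
        (20, "F", 1), (20, "F", 3),
        (21, "M", 2),
    ],
)

QUERY = """
SELECT age, gender,
       COUNT(*) AS total_users,
       SUM(CASE WHEN report_requested = 1 THEN 1 ELSE 0 END) AS one_report,
       SUM(CASE WHEN report_requested = 2 THEN 1 ELSE 0 END) AS two_reports,
       SUM(CASE WHEN report_requested >= 3 THEN 1 ELSE 0 END) AS three_or_more
FROM sampletable
GROUP BY age, gender
UNION ALL
-- one extra row per age with both genders combined
SELECT age, 'combined',
       COUNT(*),
       SUM(CASE WHEN report_requested = 1 THEN 1 ELSE 0 END),
       SUM(CASE WHEN report_requested = 2 THEN 1 ELSE 0 END),
       SUM(CASE WHEN report_requested >= 3 THEN 1 ELSE 0 END)
FROM sampletable
GROUP BY age
ORDER BY age, gender
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In MySQL itself, GROUP BY age, gender WITH ROLLUP is another way to get per-age subtotal rows; the UNION ALL form is used here only because SQLite has no WITH ROLLUP.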

Related

Refine SQL Query given list of ids

I am trying to improve this query given that it takes a while to run. The difficulty is that the data is coming from one large table and I need to aggregate a few things. First I need to define the ids that I want to get data for. Then I need to aggregate total sales. Then I need to find metrics for some individual sales. This is what the final table should look like:
ID | Product Type | % of Call Sales | % of In Person Sales | Avg Price | Avg Cost | Avg Discount
A | prod 1 | 50 | 25 | 10 | 7 | 1
A | prod 2 | 50 | 75 | 11 | 4 | 2
So % of Call Sales for each product and ID adds up to 100. The column sums to 100, not the row. Likewise for % of In Person Sales. I need to define the IDs separately because I need it to be Region Independent. Someone could make sales in Region A or Region B, but it does not matter. We want aggregate across Regions. By aggregating the subqueries and using a where clause to get the right ids, it should cut down on memory required.
IDs Query
select distinct ids from tableA as t where year>=2021 and team = 'Sales'
This should be a unique list of ids
Aggregate Call Sales and Person Sales
select ids
,sum(case when sale = 'call' then 1 else 0 end) as call_sales
,sum(case when sale = 'person' then 1 else 0 end) as person_sales
from tableA
where
ids in t.ids
group by ids
This will be as follows with the unique ids, but the total sales are from everything in that table, essentially ignoring the where clause from the first query.
ids| call_sales | person_sales
A | 100 | 50
B | 60 | 80
C | 100 | 200
Main Table as shown above
select ids
,prod_type
,cast(sum(case when sale = 'call' then 1 else 0 end)/CAST(call_sales AS DECIMAL(10, 2)) * 100 as DECIMAL(10,2)) as call_sales_percentage
,cast(sum(case when sale = 'person' then 1 else 0 end)/CAST(person_sales AS DECIMAL(10, 2)) * 100 as DECIMAL(10,2)) as person_sales_percentage
,mean(price) as price
,mean(cost) as cost
,mean(discount) as discount
from tableA as A
where
...conditions...
group by
...conditions...
You can combine the first two queries as:
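select ids, sum(sale = 'call') as call_sales,
sum(sale = 'person') as person_sales
from tableA
-- no "where ids in ..." needed: the having clause below keeps only
-- the ids that have at least one row with year >= 2021 and team = 'Sales'
group by ids
having sum(year >= 2021 and team = 'Sales') > 0;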
I'm not exactly sure what the third is doing, but you can use the above as a CTE and just plug it in.
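To see how the HAVING condition stands in for the separate IDs query, here is a small sketch using Python's sqlite3 module. SQLite, like MySQL, treats a comparison as 0 or 1, so SUM(condition) counts matching rows. The sample rows are invented for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (ids TEXT, year INTEGER, team TEXT, sale TEXT)")
# Hypothetical rows, not from the question.
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?, ?, ?)",
    [
        ("A", 2021, "Sales", "call"),
        ("A", 2019, "Sales", "person"),    # older row, still counted for A
        ("A", 2022, "Support", "call"),
        ("B", 2020, "Sales", "call"),      # B has no 2021+ Sales row
        ("B", 2022, "Support", "person"),
        ("C", 2023, "Sales", "person"),
    ],
)

rows = conn.execute(
    """
    SELECT ids,
           SUM(sale = 'call')   AS call_sales,
           SUM(sale = 'person') AS person_sales
    FROM tableA
    GROUP BY ids
    -- keep an id only if at least one of its rows is a 2021+ Sales row
    HAVING SUM(year >= 2021 AND team = 'Sales') > 0
    ORDER BY ids
    """
).fetchall()
print(rows)
```

Note that the totals for a qualifying id still count all of its rows, including rows from before 2021 or from other teams, which matches the behaviour described in the question. If only the qualifying rows should be counted, the conditions would go in a WHERE clause instead.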

Need validation that interpretation for a Grouping Query is correct

I am running the following query, and at first it appears to give the subtotals for customers and to show, by date, each customer's payment amounts only if the total for all payments is greater than $90,000.
SELECT
Customername,
Date(paymentDate),
CONCAT('$', Round(SUM(amount),2)) AS 'High $ Paying Customers'
FROM Payments
JOIN Customers
On payments.customernumber = customers.customernumber
Group by customername, Date(paymentDate) WITH ROLLUP
having sum(amount)> 90000;
But upon looking at the records for Dragon Souveniers, Ltd. and Euro+ Shopping Channel, it is actually showing the pay dates that individually have amounts over $90,000, as well as the subtotal for that customer as a rollup. For all other customers, their individual payment dates are not reported in the result set; only their sum is, if it is over $90,000. For example, Anna's Decorations has 4 payment records and none of them is over 90000, but the sum is reported as the total payments value in the rollup row. Is this the correct interpretation?
The HAVING clause works correctly: it filters out every row whose total is not above 90000, and it does this for the rollup totals as well.
When using GROUP BY ... WITH ROLLUP, you can detect the generated rollup rows with the GROUPING() function.
You should add a condition so that the rows you want to keep are not filtered out.
Simple example:
select a, sum(a), grouping(a<3)
from (select 1 as a
union
select 2
union select 3) x
group by a<3 with rollup;
output:
+---+--------+---------------+
| a | sum(a) | grouping(a<3) |
+---+--------+---------------+
| 3 | 3 | 0 |
| 1 | 3 | 0 |
| 1 | 6 | 1 |
+---+--------+---------------+
This shows that the last line (with grouping(a<3) == 1) is the rollup line containing the totals.
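The behaviour described in the question (detail rows survive only when they are over the threshold on their own, subtotal rows survive when the customer's sum is over it) can be reproduced with a small sketch. SQLite has neither WITH ROLLUP nor GROUPING(), so this emulates them with UNION ALL and a literal is_rollup flag column; that is a swapped-in technique, not what MySQL does internally, and it omits the grand-total row that a real rollup would add. Table, column names and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (customer TEXT, pay_date TEXT, amount REAL)")
# Hypothetical payments, not from the question's database.
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?)",
    [
        ("Big Co", "2004-01-05", 95000.0),
        ("Big Co", "2004-02-10", 20000.0),
        ("Small Co", "2004-01-07", 50000.0),
        ("Small Co", "2004-03-01", 45000.0),
    ],
)

rows = conn.execute(
    """
    SELECT customer, pay_date, total, is_rollup FROM (
        -- detail rows: one per customer and date
        SELECT customer, pay_date, SUM(amount) AS total, 0 AS is_rollup
        FROM payments GROUP BY customer, pay_date
        UNION ALL
        -- emulated rollup rows: one subtotal per customer
        SELECT customer, NULL, SUM(amount), 1
        FROM payments GROUP BY customer
    )
    WHERE total > 90000          -- same filter applied to both kinds of row
    ORDER BY customer, is_rollup
    """
).fetchall()

# Keeping only the subtotal rows is then a matter of testing the flag.
subtotals_only = [r for r in rows if r[3] == 1]
print(rows)
print(subtotals_only)
```

Here "Small Co" appears only as a subtotal because neither of its payments exceeds 90000 individually, while "Big Co" appears with one detail row and its subtotal.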

Combine results from 3 sql queries to calculate running stock

I am trying to calculate the stock per product that a warehouse had over time. I have the information about today's stock, and also the quantity of products sold and purchased per day. So, the calculation for yesterday's values would be:
Yesterday_stock = Stock - yesterday_sold_quantity + yesterday_purchased_quantity. My problem is that I need to keep each day's stock somewhere in order to calculate the stock of the previous day. I found that I could do this with the SQL OVER clause with ORDER BY, but unfortunately I have SQL Server 2008, so that is not an option.
The tables are:
Prdamount which holds the current stock per product (StuPrdID ) and if it is blocked for some reason.
|----------|------------|-------------------|
| StuPrdID | StuQAmount | prdBlockingReason |
|----------|------------|-------------------|
| 12345    | 16         |                   |
| 08889    | 12         | expired           |
|----------|------------|-------------------|
Table Moves which holds information about inserts and outputs of products. If MoveCase field has value equal 1 it is an output move, if it is a 2 it is a purchased quantity. Moves table dummy data:
|--------------|-------------------------|----------|-------------|
| MoveItemCode | MoveDate                | MoveCase | MoveRealQty |
|--------------|-------------------------|----------|-------------|
| 12345        | 2018-06-24 00:00:00.000 | 1        | 14          |
| 08889        | 2018-06-24 00:00:00.000 | 2        | 578         |
|--------------|-------------------------|----------|-------------|
and table Product with information related with data:
|---------|----------------|
| PrdCode | PrdDescription |
|---------|----------------|
| 12345   | Orange juice   |
| 08889   | Chocolate      |
|---------|----------------|
I want an output like this:
|---------|----------------|-------|----------------|---------------|
| Prdcode | PrdDescription | Stock | Stock 18/07/03 | Stock 18/7/02 |
|---------|----------------|-------|----------------|---------------|
| 12345   | Orange Juice   | 80    | 50             | 34            |
| 08889   | Chocolate      | 45    | 82             | 17            |
|---------|----------------|-------|----------------|---------------|
This query gives me the current stock:
select
product.PrdCode,
product.PrdDescr,
SUM(StuQAmount) as Stock
from prdamount
left join product on (product.PrdID=prdamount.StuPrdID)
where prdamount.prdBlockingReason=' '
group by product.PrdCode,product.PrdDescr
order by product.PrdCode asc
This query gives me the quantity sold by product per day:
select
moves.MoveItemCode,
prd.PrdDescr,
moves.MoveDate,
SUM(MoveRealQty) as 'sold_quantity'
from moves
left join prd on (moves.MoveItemCode=product.PrdCode)
where (moves.MoveDate>'2018-06-01' and moves.MoveCase=1)
group by moves.MoveItemCode,product.PrdDescr,moves.MoveDate
order by moves.MoveItemCode asc,moves.MoveDate asc
And this query gives me the quantity purchases by product per day:
select
moves.MoveItemCode,
prd.PrdDescr,
moves.MoveDate,
SUM(MoveRealQty) as 'Purchased_Quantity'
from Moves
left join product on (moves.MoveItemCode=product.PrdCode)
where (moves.MoveDate>'2018-06-01' and moves.MoveCase=2)
group by moves.MoveItemCode,product.PrdDescr,moves.MoveDate
order by moves.MoveItemCode asc,moves.MoveDate asc
I tried to combine these 3 queries into one using subqueries, but it didn't work. So how can I accomplish the result that I want? Sorry if the question is silly, I am a beginner in SQL.
Try this:
select
product.PrdCode,
moves.MoveItemCode,
product.PrdDescr,
moves.MoveDate,
SUM( case when moves.MoveCase=1 then MoveRealQty else 0 end) as 'sold_quantity',
SUM( case when moves.MoveCase=2 then MoveRealQty else 0 end) as 'Purchased_Quantity',
-- current unblocked stock for the product, repeated on every row
(select SUM(StuQAmount) from prdamount where StuPrdID = product.PrdID and prdBlockingReason=' ') as 'Stock'
from moves
left join product on (moves.MoveItemCode=product.PrdCode)
where (moves.MoveDate>'2018-06-01')
-- product.PrdID is grouped because the correlated subquery references it
group by moves.MoveItemCode,product.PrdDescr,moves.MoveDate, product.PrdCode, product.PrdID
order by moves.MoveItemCode asc,moves.MoveDate asc
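The query above returns the daily sold and purchased quantities next to the current stock, but it does not yet walk the stock back through time. One way to do that without the OVER clause is a correlated subquery that sums every move from a given day onward. The sketch below runs the idea in SQLite through Python. It rests on several assumptions that are not in the question: one prdamount row per product, the product code used directly as the join key, invented sample quantities, and the convention that a sale (MoveCase 1) lowered the stock while a purchase (MoveCase 2) raised it, so going backward adds sales back and removes purchases. Flip the signs if your convention differs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE prdamount (StuPrdID TEXT, StuQAmount INTEGER);
    CREATE TABLE moves (MoveItemCode TEXT, MoveDate TEXT,
                        MoveCase INTEGER, MoveRealQty INTEGER);
    -- hypothetical data: current stock is 80
    INSERT INTO prdamount VALUES ('12345', 80);
    INSERT INTO moves VALUES
        ('12345', '2018-07-02', 2, 30),
        ('12345', '2018-07-02', 1, 14),
        ('12345', '2018-07-03', 1, 10),
        ('12345', '2018-07-03', 2, 24);
    """
)

rows = conn.execute(
    """
    SELECT d.MoveItemCode, d.MoveDate,
           p.StuQAmount + (
               -- undo every move made on or after this day:
               -- add sales back, take purchases away
               SELECT SUM(CASE m.MoveCase WHEN 1 THEN m.MoveRealQty
                                          WHEN 2 THEN -m.MoveRealQty
                                          ELSE 0 END)
               FROM moves AS m
               WHERE m.MoveItemCode = d.MoveItemCode
                 AND m.MoveDate >= d.MoveDate
           ) AS stock_before_day
    FROM (SELECT DISTINCT MoveItemCode, MoveDate FROM moves) AS d
    JOIN prdamount AS p ON p.StuPrdID = d.MoveItemCode
    ORDER BY d.MoveItemCode, d.MoveDate DESC
    """
).fetchall()
print(rows)
```

Each result row is the stock the product had before that day's moves. Turning the dates into columns, as in the desired output, would be a separate pivot step on top of this.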

SQL Sum not returning the correct data

An example of my Progress database, opdetail table
invoice invline article size qty
----------------------------------------
905155 1 Shoe 10 5
905155 2 Slipper 3 2
905155 2 Slipper 4 6
905155 2 Slipper 5 1
905156 1 Boot 10 1
905156 1 Boot 11 1
905157 1 Slipper 5 4
905157 2 Shoe 8 6
a simple SQL select statement, run from the OpenEdge editor returns just what I need, a list of invoices with their total quantities:-
SELECT invoice, sum(qty) FROM opdetail GROUP BY qty ORDER BY invoice ASC
905155 14
905156 2
905157 10
HOWEVER:-
When run from an ASP page via DSN I have to list both fields in the GROUP BY otherwise progress returns a GROUP BY error
SELECT invoice, sum(qty) FROM opdetail GROUP BY qty, invoice ORDER BY invoice ASC
905155 5
905155 9
905156 2
905157 4
905157 6
It's not summarizing the qty, and seems to be taking into account the line number even though the line number plays no part in my SQL statement. Can anyone throw any light on this, or on how I can do a sum of the total qty taking into account the line number? Thanks!
You are using qty in the aggregate function and also in the GROUP BY, which makes no sense. Group by the other column instead, something like:
SELECT
invoice,
sum(qty) FROM opdetail
GROUP BY invoice ORDER BY invoice ASC
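As a quick check of the grouping, the sample rows from the question can be loaded into SQLite through Python. This is only a sketch on a stand-in database, so it says nothing about how the Progress ODBC driver behaves. It also shows a per-line total (grouping by invoice and invline), which is what the second half of the question asks about.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE opdetail (invoice INTEGER, invline INTEGER, "
    "article TEXT, size INTEGER, qty INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO opdetail VALUES (?, ?, ?, ?, ?)",
    [
        (905155, 1, "Shoe", 10, 5),
        (905155, 2, "Slipper", 3, 2),
        (905155, 2, "Slipper", 4, 6),
        (905155, 2, "Slipper", 5, 1),
        (905156, 1, "Boot", 10, 1),
        (905156, 1, "Boot", 11, 1),
        (905157, 1, "Slipper", 5, 4),
        (905157, 2, "Shoe", 8, 6),
    ],
)

# One total per invoice.
per_invoice = conn.execute(
    "SELECT invoice, SUM(qty) FROM opdetail GROUP BY invoice ORDER BY invoice"
).fetchall()

# One total per invoice line.
per_line = conn.execute(
    "SELECT invoice, invline, SUM(qty) FROM opdetail "
    "GROUP BY invoice, invline ORDER BY invoice, invline"
).fetchall()

print(per_invoice)
print(per_line)
```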

Select with a where clause and without it in the same query

I'm trying to find a way to sum amounts that match a specific term, and also amounts that don't match it. For example, if my table looks like this
user amount description
1 34 bike
1 78 toys
2 3 fuel
2 12 bike
I'm trying to get a table that will look like this in the end:
user amount spent on bike amount spent total
1 34 112
2 12 15
I'm using mysql
You can use a CASE expression inside SUM, grouped by user:
SELECT user,
SUM(CASE WHEN description = 'bike' THEN amount ELSE 0 END) bike_amount,
SUM(amount) total_amount
FROM mytable
GROUP BY user
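For anyone who wants to try it, here is the same query run against the question's four rows in SQLite through Python (a stand-in for MySQL, which is an assumption of this sketch).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (user INTEGER, amount INTEGER, description TEXT)")
# Rows copied from the question.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 34, "bike"), (1, 78, "toys"), (2, 3, "fuel"), (2, 12, "bike")],
)

rows = conn.execute(
    """
    SELECT user,
           SUM(CASE WHEN description = 'bike' THEN amount ELSE 0 END) AS bike_amount,
           SUM(amount) AS total_amount
    FROM mytable
    GROUP BY user
    ORDER BY user
    """
).fetchall()
print(rows)
```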