COUNT() returns the total instead of a distinct count - MySQL

I have qualified risks, each with a description and a creation date. Risks are attached to risk subcategories, which in turn are attached to risk categories. Each risk has a name like 'Risk_1'. My aim is to count the number of risks by month and risk category, including zero counts.
I have this query:
SELECT DISTINCT risk_names.type as risk_name, MONTH(risk.creation_date) as month, count(risk.id) as number FROM risk As risk , risk_category
JOIN (
SELECT risk_category.name as type
FROM
risk_category
) as risk_names on risk_names.type = risk_category.name
where risk.creation_date >= (NOW()-INTERVAL 3 MONTH) GROUP BY MONTH(risk.creation_date), risk_names.type;
Which returns this result:
Risk_name month number
---------------------------------
Risk_1 1 10 ---> instead 8
Risk_2 1 10 ---> instead 1
Risk_3 1 10 ---> instead 1
Risk_1 2 12 ......
Risk_2 2 12
Risk_3 2 12
Risk_1 12 4
Risk_2 12 4
Risk_3 12 4
As you can see, the number returned is the total for each month, but my aim is to get the total for each distinct risk.
Can you help me? Thanks.

The comma in your FROM is doing a CROSS JOIN. A Cartesian product is unnecessary and throws all the counts off.
I suspect you want something like this:
SELECT rc.name as risk_name, MONTH(r.creation_date) as month,
count(r.id) as number
FROM risk_category rc LEFT JOIN
risk r
ON r.?? = rc.?? AND
r.creation_date >= (NOW()-INTERVAL 3 MONTH)
GROUP BY rc.name, MONTH(r.creation_date);
I don't know what the JOIN criterion is between risk and risk_category.
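To illustrate why the comma join inflates the counts, here is a minimal sketch using Python's sqlite3 module rather than MySQL. The join column `risk.category_id` and the `creation_month` column are assumptions made up for the example, since the real link between risk and risk_category is not given in the question.

```python
import sqlite3

# Hypothetical schema: the question does not say how risk links to
# risk_category, so a direct risk.category_id column is assumed here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE risk_category (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE risk (id INTEGER PRIMARY KEY, category_id INTEGER, creation_month INTEGER);
INSERT INTO risk_category VALUES (1, 'Risk_1'), (2, 'Risk_2'), (3, 'Risk_3');
INSERT INTO risk (category_id, creation_month) VALUES (1, 1), (1, 1), (2, 1);
""")

# Comma join: every risk is paired with every category, so each
# category reports the total number of risks.
cross_counts = conn.execute("""
    SELECT rc.name, COUNT(r.id)
    FROM risk r, risk_category rc
    GROUP BY rc.name
    ORDER BY rc.name
""").fetchall()

# LEFT JOIN on the (assumed) key: each category counts only its own
# risks, and a category without risks still appears with 0.
joined_counts = conn.execute("""
    SELECT rc.name, COUNT(r.id)
    FROM risk_category rc
    LEFT JOIN risk r ON r.category_id = rc.id
    GROUP BY rc.name
    ORDER BY rc.name
""").fetchall()

print(cross_counts)
print(joined_counts)
```

With the three sample risks, the comma join reports 3 for every category, while the LEFT JOIN reports 2, 1 and 0.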

Then try using the DISTINCT keyword with count(), like count(distinct risk.id) as number, instead.

Related

Multiple Select Count in Different Condition

I have a table that contains client_status, and my query looks like this:
SELECT ROUND((SUM(client_status='BAD DEBT')/COUNT(*))*100) AS BAD,
ROUND((SUM(client_status='ALERT')/COUNT(*))*100) AS ALERT,
ROUND((SUM(client_status='REMIND BAYAR')/COUNT(*))*100) AS REMIND,
ROUND((SUM(client_status='RUTIN BAYAR')/COUNT(*))*100) AS RUTIN,
ROUND((SUM(client_status='POTENSI KOREKSI')/COUNT(*))*100) AS POTENSI,
ROUND((SUM(client_status='TOP 10')/COUNT(*))*100) AS TOP FROM clients
The output of my query is a single row, with one column per status.
What should I do to make my output look like this:
Data value
===============
Bad 29
Alert 29
Remind 14
Rutin 14
Potensi 0
Top 15
Please help guys... thanks,
Aggregate by status and cross join a subquery that gets the total count to calculate the percentage. For the renaming and the custom order you can use CASE expressions.
SELECT cs.s data,
       cs.c / ca.c * 100 value
FROM (SELECT CASE client_status
               WHEN 'BAD DEBT' THEN 1
               WHEN 'ALERT' THEN 2
               WHEN 'REMIND BAYAR' THEN 3
               WHEN 'RUTIN BAYAR' THEN 4
               WHEN 'POTENSI KOREKSI' THEN 5
               WHEN 'TOP 10' THEN 6
             END o,
             CASE client_status
               WHEN 'BAD DEBT' THEN 'Bad'
               WHEN 'ALERT' THEN 'Alert'
               WHEN 'REMIND BAYAR' THEN 'Remind'
               WHEN 'RUTIN BAYAR' THEN 'Rutin'
               WHEN 'POTENSI KOREKSI' THEN 'Potensi'
               WHEN 'TOP 10' THEN 'Top'
             END s,
             count(*) c
      FROM clients
      GROUP BY client_status) cs
CROSS JOIN (SELECT count(*) c
            FROM clients) ca
ORDER BY cs.o;
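Here is a small runnable sketch of the group-then-cross-join idea, using Python's sqlite3 module instead of MySQL. The four sample rows are invented for the example, and the renaming/ordering CASE expressions are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clients (client_status TEXT)")

# Invented sample data: 2 x BAD DEBT, 1 x ALERT, 1 x TOP 10.
sample = ["BAD DEBT"] * 2 + ["ALERT"] + ["TOP 10"]
conn.executemany("INSERT INTO clients VALUES (?)", [(s,) for s in sample])

# One row per status; the single-row subquery supplies the grand total.
percentages = conn.execute("""
    SELECT cs.client_status, ROUND(cs.c * 100.0 / ca.c)
    FROM (SELECT client_status, COUNT(*) AS c
          FROM clients
          GROUP BY client_status) cs
    CROSS JOIN (SELECT COUNT(*) AS c FROM clients) ca
    ORDER BY cs.client_status
""").fetchall()

print(percentages)
```

Note that a status with no rows in the table produces no output row at all with this approach, so a line like "Potensi 0" would need a list of all statuses to join against.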

Get 1st and 2nd highest value rows in case of similar values

I have a table with the columns : id, status, value.
id status value
-- ------ -----
1 10 100
2 10 100
3 10 60
4 11 20
5 11 15
6 12 100
7 12 50
8 12 50
I would like to get the id and value of the first and second highest valued rows, from each status group. My table should have the following columns:
status, id of the first highest value, first highest value, id of second highest value, second highest value.
I should get:
status 1stID 1stValue 2ndID 2ndValue
------ ----- -------- ----- --------
10 1/2 100 2/1 100
11 4 20 5 15
12 6 100 7/8 50
I tried all kinds of solutions, but I couldn't find one that handles same-value firsts (two rows sharing the highest value in their status group) or same-value seconds.
For example, in case of two rows sharing the highest value in their status group, this not-so-elegant query will return two rows with the same status, different 1sts and same 2nd:
SELECT 2nds.status, 1sts.id AS "1stID",1sts.value AS "1stValue",
2nds.id AS "2ndID",2nds.value AS "2ndValue"
FROM
(SELECT v.* FROM
(SELECT status, MAX(value) AS "SecMaxValue" FROM table o
WHERE value < (SELECT MAX(value) FROM table
WHERE status = o.status)
GROUP BY status) AS m
INNER JOIN table v
ON v.status = m.status AND v.value = m.SecMaxValue) AS 2nds
INNER JOIN
(SELECT v.* FROM
(SELECT status, MAX(value) AS maxValue FROM table
GROUP BY status) AS m
INNER JOIN table v
ON v.status = m.status AND v.value = m.MaxValue) AS 1sts
ON 1sts.status = 2nds.status ;
This query will give me:
status 1stID 1stValue 2ndID 2ndValue
------ ----- -------- ----- --------
10 1 100 3 60
10 2 100 3 60
11 4 20 5 15
12 6 100 7 50
12 6 100 8 50
To conclude, I would like to find a solution in which:
a. if there are two rows with the highest value, the query puts the details of one of them in the 1st columns and the details of the other in the 2nd columns (no matter which)
b. if there are two rows with the second-highest value, it puts the highest in 1st place and one of the seconds in 2nd place.
Is there a way to change the query above? Does someone have a nicer solution?
I came across several 1st-and-2nd queries, but they had the same problem. For example, this solution: Finding the highest n values of each group in MySQL. It does not deliver the 1st and 2nd in the same row, but the main problem is that it returns only one of the firsts.
Thanks
After spending a lot of time, I finally found a solution to the above problem. Please try it out:
select 1st.status as Status,
SUBSTRING_INDEX(1st.id,'/',1) as 1stID,
1st.value as 1stValue,
(case when locate('/',1st.id) > 0 then SUBSTRING_INDEX(1st.id,'/',-1)
else 2nd.id
end) as 2ndID,
(case when locate('/',1st.id) > 0 then 1st.value
else 2nd.value
end) as 2ndValue
from
(
(select status, SUBSTRING_INDEX(Group_concat(id separator '/'),'/',2) as id,value
from t1
where (status,value) in (select status,value
from t1
group by status
having max(value))
group by status) 1st
inner join
(select status,id,value
from t1
where (status,value) not in (select status,value
from t1
group by status
having max(value))
group by status,value
order by status,value desc) 2nd
on 1st.status = 2nd.status)
group by 1st.status;
Just replace t1 with your tablename and it should work like a charm.
If you have any doubt(s), feel free to ask.
Hope it helps!
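For comparison, here is a different technique from the GROUP_CONCAT query above: rank the rows inside each status with a correlated count, breaking ties by id, and then pivot ranks 1 and 2 into one row. It is only a sketch, run through Python's sqlite3 module rather than MySQL, with the sample table from the question; the names rn and ranked are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY, status INTEGER, value INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (1, 10, 100), (2, 10, 100), (3, 10, 60),
    (4, 11, 20), (5, 11, 15),
    (6, 12, 100), (7, 12, 50), (8, 12, 50),
])

# rn = 1 + number of rows in the same status that sort ahead of this one
# (higher value first, lower id first on ties), so ranks are unique.
top_two = conn.execute("""
    SELECT status,
           MAX(CASE WHEN rn = 1 THEN id END),
           MAX(CASE WHEN rn = 1 THEN value END),
           MAX(CASE WHEN rn = 2 THEN id END),
           MAX(CASE WHEN rn = 2 THEN value END)
    FROM (
        SELECT t.status, t.id, t.value,
               1 + (SELECT COUNT(*) FROM t1 x
                    WHERE x.status = t.status
                      AND (x.value > t.value
                           OR (x.value = t.value AND x.id < t.id))) AS rn
        FROM t1 t
    ) ranked
    WHERE rn <= 2
    GROUP BY status
    ORDER BY status
""").fetchall()

print(top_two)
```

Because ties are broken by id, two rows sharing the highest value land in the 1st and 2nd columns, and a tie for second place yields just one of the tied rows.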

MySQL - How to sum up first occurrences from many different products

I have a big view called: how_many_per_month
name_of_product | how_many_bought | year | month
p1 20 2012 1
p2 7 2012 1
p1 10 2012 2
p2 5 2012 2
p1 3 2012 3
p2 20 2012 3
p3 66 2012 3
How do I write a MySQL query to get only the first few occurrences of products p1, p2, p3 at once?
To get it one product at a time for the first 3 months I can write:
SELECT name_of_product , sum(how_many_bought) FROM
(SELECT name_of_product, how_many_bought FROM `how_many_per_month`
WHERE name_of_product= 'p1' LIMIT 3) t
How can I do this for all products at once, so that my result when taking only the first month looks like:
p1 20
p2 7
p3 66
For two months:
p1 30
p2 12
p3 66
The problem is that products are published in different months, and I have to compute statistics on how many of each were sold in the first month, first 3 months, 6 months, and 1 year, divided by the total.
Example using union
select
name_of_product,
sum(how_many_bought) as bought,
"first month" as period
from how_many_per_month
where month = 1
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 2 month" as period
from how_many_per_month
where month <= 2
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 6 month" as period
from how_many_per_month
where month <= 6
group by name_of_product
union
select
name_of_product,
sum(how_many_bought) as bought,
"first 12 month" as period
from how_many_per_month
where month <= 12
group by name_of_product
Demo: http://www.sqlfiddle.com/#!2/788ea/11
The results differ a little from your expectation. Are you sure you wrote them correctly? If you need faster query times you can use GROUP BY with CASE, as I've already said.
I'm not quite sure what you're trying to achieve as the description of your question is a bit unclear. From what I've read so far, I understand you want to show the total of how many ITEM_X, ITEM_Y, ITEM_Z were sold for the past 1,3,6 months.
Based on the data you've provided, I've created this sqlfiddle that sums all results and groups them by item. This is the query:
SELECT
name_of_product,
sum(how_many_bought) as how_many_bought
FROM how_many_per_month
WHERE year = 2012
AND month BETWEEN 1 AND 3
GROUP BY name_of_product
-- NOTE: Not specifying a year will include all "months"
-- which are between the values 1 and 3 for all years. Remove the
-- year condition in case you need that effect.
In the example above the database will sum all sold items between months 1 and 3 (including) for 2012. When you execute this query in your application just change the range in the BETWEEN X AND X and you'll be good to go.
Additional tip:
Avoid using sub-queries, or use them only as a last resort (in case there's simply no other way to do it). They are often slower than plain queries and even join queries. Usually a sub-query can be transformed into a join.
SELECT
hmpm.name_of_product , SUM(hmpm.how_many_bought)
FROM (
SELECT name_of_product
FROM how_many_per_month
/* WHERE ... */
/* ORDER BY ... */
) sub
INNER JOIN how_many_per_month hmpm
ON hmpm.name_of_product = sub.name_of_product
GROUP BY hmpm.name_of_product
/* LIMIT ... */
MySQL does not support LIMIT in IN/ALL/ANY/SOME subqueries, but you still need an ordering and a condition. And why not have an id_of_product field?
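Since the products start selling in different months, one possible reading of the question is "sum the first n months counted from each product's own first month". Here is a sketch of that reading, run through Python's sqlite3 module rather than MySQL, with the sample data from the question stored in a plain table instead of a view. The helper first_months_total and the alias first_ym are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE how_many_per_month "
    "(name_of_product TEXT, how_many_bought INTEGER, year INTEGER, month INTEGER)"
)
conn.executemany("INSERT INTO how_many_per_month VALUES (?, ?, ?, ?)", [
    ("p1", 20, 2012, 1), ("p2", 7, 2012, 1),
    ("p1", 10, 2012, 2), ("p2", 5, 2012, 2),
    ("p1", 3, 2012, 3), ("p2", 20, 2012, 3), ("p3", 66, 2012, 3),
])


def first_months_total(n):
    """Sum sales over the first n months of each product's own history."""
    return conn.execute("""
        SELECT h.name_of_product, SUM(h.how_many_bought)
        FROM how_many_per_month h
        JOIN (SELECT name_of_product, MIN(year * 12 + month) AS first_ym
              FROM how_many_per_month
              GROUP BY name_of_product) f
          ON f.name_of_product = h.name_of_product
        -- keep months whose offset from the product's first month is < n
        WHERE h.year * 12 + h.month < f.first_ym + ?
        GROUP BY h.name_of_product
        ORDER BY h.name_of_product
    """, (n,)).fetchall()


print(first_months_total(1))
print(first_months_total(2))
```

On the sample data this gives p1 20, p2 7, p3 66 for one month and p1 30, p2 12, p3 66 for two months, which matches the results listed in the question.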

Get the last 2 rows of a table while grouping one of the column. MySQL

Consider Facebook. Facebook displays the latest 2 comments of any status. I want to do something similar.
I have a table with e.g. status_id, comment_id, comment and timestamp.
Now I want to fetch the latest 2 comments for each status_id.
Currently I am first doing a GROUP_CONCAT of all columns, group by status_id and then taking the SUBSTRING_INDEX with -2.
This fetches the latest 2 comments, however the GROUP_CONCAT of all the records for a status_id is an overhead.
SELECT SUBSTRING_INDEX(GROUP_CONCAT('~', comment_id,
                                    '~', comment,
                                    '~', timestamp
                                    SEPARATOR '|~|'),
                       '|~|', -2)
FROM commenttable
GROUP BY status_id;
Can you help me with better approach?
My table looks like this -
status_id comment_id comment timestamp
1 1 xyz1 3 hour
1 2 xyz2 2 hour
1 3 xyz3 1 hour
2 4 xyz4 2 hour
2 6 xyz6 1 hour
3 5 xyz5 1 hour
So I want the output as -
1 2 xyz2 2 hour
1 3 xyz3 1 hour
2 4 xyz4 2 hour
2 6 xyz6 1 hour
3 5 xyz5 1 hour
Here is a great answer I came across here:
select status_id, comment_id, comment, timestamp
from commenttable
where (
select count(*) from commenttable as f
where f.status_id = commenttable.status_id
and f.timestamp > commenttable.timestamp
) < 2;
This is not very efficient (O(n^2)) but it's a lot more efficient than concatenating strings and using substrings to isolate your desired result. Some would say that reverting to string operations instead of native database indexing robs you of the benefits of using a database in the first place.
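Here is a runnable sketch of the correlated-count technique: a comment is kept when fewer than two comments of the same status are newer than it. It uses Python's sqlite3 module rather than MySQL, and the timestamp column holds invented integers (a larger number means a newer comment) in place of real timestamps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commenttable "
    "(status_id INTEGER, comment_id INTEGER, comment TEXT, timestamp INTEGER)"
)
# Invented integer timestamps: larger means newer.
conn.executemany("INSERT INTO commenttable VALUES (?, ?, ?, ?)", [
    (1, 1, "xyz1", 1), (1, 2, "xyz2", 2), (1, 3, "xyz3", 3),
    (2, 4, "xyz4", 1), (2, 6, "xyz6", 2),
    (3, 5, "xyz5", 1),
])

# Keep a row when fewer than 2 comments on the same status are newer.
latest_two = conn.execute("""
    SELECT c.status_id, c.comment_id
    FROM commenttable c
    WHERE (SELECT COUNT(*)
           FROM commenttable f
           WHERE f.status_id = c.status_id
             AND f.timestamp > c.timestamp) < 2
    ORDER BY c.status_id, c.timestamp
""").fetchall()

print(latest_two)
```

Changing the 2 in the comparison gives the latest n comments per status. An index on (status_id, timestamp) would likely help the inner count, though that is not measured here.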
After some struggle I found this solution -
The following gives me the row_id -
SELECT a.status_id,
a.comments_id,
COUNT(*) AS row_num
FROM comments a
JOIN comments b
ON a.status_id = b.status_id AND a.comments_id >= b.comments_id
GROUP BY a.status_id , a.comments_id
ORDER BY row_num DESC
This gives me the total rows -
SELECT com.status_id, COUNT(*) total
FROM comments com
GROUP BY com.status_id
In the where clause of the main select -
row_num = total OR row_num = total - 1
This gives the latest 2 rows. You can modify the where clause to fetch more than 2 latest rows.

Query to add missing rows using values from prior period

I have a record set for inspections of many pieces of equipment. The four cols of interest are equip_id, month, year, myData.
My requirement is to have EXACTLY ONE record per month for each piece of equipment.
I have a query that makes the data unique over equip_id, month, year. So there is no more than one record for each month/year for a piece of equipment. But now I need to simulate data for the missing month. I want to simply go back in time to get the last piece of my data.
So that may seem confusing, so I'll show by example.
Given this sample data:
equip_id month year myData
-----------------------------
1 1 2010 500
1 2 2010 600
1 5 2010 800
2 2 2010 300
2 4 2010 400
2 6 2010 500
I want this output:
equip_id month year myData
-----------------------------
1 1 2010 500
1 2 2010 600
1 3 2010 600
1 4 2010 600
1 5 2010 800
2 2 2010 300
2 3 2010 300
2 4 2010 400
2 5 2010 400
2 6 2010 500
Notice that I'm filling in missing data with the data from the month (or two months etc.) before. Also note that if the first record for equip 2 is in 2/2010, then I don't need a record for 1/2010 even though I have one for equip 1.
I just need exactly one record for each month/year for each piece of equipment. So if the record does not exist I just want to go back in time and grab the data for that record.
Thanks!
By no means perfect:
SELECT equip_id, month, mydata
FROM (
SELECT equip_id, month, mydata FROM equip
UNION ALL
SELECT EquipNum.equip_id, EquipNum.Num,
(SELECT Top 1 mydata
FROM equip
WHERE equip.month<EquipNum.Num And equip.equip_id=EquipNum.equip_id
ORDER BY equip.month desc) AS Data
FROM
(SELECT e.equip_id, n.Num
FROM
(SELECT DISTINCT equip_id FROM equip) AS e,
Numbers AS n) AS EquipNum
LEFT JOIN equip
ON (EquipNum.Num = equip.month)
AND (EquipNum.equip_id = equip.equip_id)
WHERE EquipNum.Num<DMax("month","equip")
AND
(SELECT top 1 mydata
FROM equip
WHERE equip.month<EquipNum.Num And equip.equip_id=EquipNum.equip_id
ORDER BY equip.month desc) Is Not Null
AND equip.equip_id Is Null AND equip.Month Is Null) AS x
ORDER BY equip_id, month
For this to work you need a Numbers table, in this case it needs only hold integers from 1 to 12. The numbers table I used is called Numbers and the field is called Num.
EDIT re years comment
SELECT equip_id, year, month, mydata
FROM (
SELECT equip_id, year, month, mydata FROM equip
UNION ALL
SELECT en.equip_id, en.year, en.Num, (SELECT Top 1 mydata
FROM equip e
WHERE e.month<en.Num And e.year=en.year And e.equip_id=en.equip_id
ORDER BY e.month desc) AS Data
FROM (SELECT e.equip_id, n.Num, y.year
FROM
(SELECT DISTINCT equip_id FROM equip) AS e,
Numbers AS n,
(SELECT DISTINCT year FROM equip) AS y) AS en
LEFT JOIN equip AS e ON en.equip_id = e.equip_id
AND en.year = e.year
AND en.Num = e.month
WHERE en.Num<DMax("month","equip") AND
(SELECT Top 1 mydata
FROM equip e
WHERE e.month<en.Num And e.year=en.year And e.equip_id=en.equip_id
ORDER BY e.month desc) Is Not Null
AND e.equip_id Is Null
AND e.Month Is Null) AS x
ORDER BY equip_id, year, month
I've adjusted this to account for year and month. The primary principles remain the same as in the original queries, which were presented for just the month. However, to apply a month and year, you need to test the SET of YEAR + MONTH. For example, what happens if you have Nov/2009 and then jump to Feb/2010? You can't rely on just one month being less than another, only on the "set". So I apply year * 12 + month. Simply adding year + month gives a false ordering: Nov/2009 = 2009 + 11 = 2020, while Feb/2010 = 2010 + 2 = 2012. With the multiplication, 2009 * 12 = 24108, + 11 for Nov = 24119, compared to 2010 * 12 = 24120, + 2 for Feb = 24122, which retains the proper sequence per year/month combination. The rest of the principles apply. One addition: I created a table to represent the span of years to consider (table C_Years, column year, with values 2009, 2010, 2011). For my testing, I added a sample Equip_ID = 1 entry for Nov-2009 and an Equip_ID = 2 entry for Feb-2011, and the proper roll-over works too.
SELECT
PYML.Equip_ID,
PYML.Year,
PYML.Mth,
P1.MyData
FROM
( SELECT
PAll.Equip_ID,
PAll.Year,
PAll.Mth,
( SELECT MAX( P1.Year*12+P1.Mth )
FROM C_Preset P1
WHERE PAll.Equip_ID = P1.Equip_ID
AND P1.Year*12+P1.Mth <= PAll.CurYrMth) as MaxYrMth
FROM
( SELECT
PYM1.Equip_ID,
Y1.Year,
M1.Mth,
Y1.Year*12+M1.Mth as CurYrMth
FROM
( SELECT p.equip_id,
MIN( p.year*12+p.mth ) as MinYrMth,
MAX( p.year*12+p.mth ) as MaxYrMth
FROM
C_Preset p
group by
1
) PYM1,
C_Years Y1,
C_Months M1
WHERE
Y1.Year*12+M1.Mth >= PYM1.MinYrMth
AND Y1.Year*12+M1.Mth <= PYM1.MaxYrMth
) PAll
) PYML,
C_Preset P1
WHERE
PYML.Equip_ID = P1.Equip_ID
AND PYML.MaxYrMth = P1.Year*12+P1.Mth
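The year * 12 + month key used in the query above can be checked with a few lines of Python. The function name year_month_key is made up for this illustration.

```python
def year_month_key(year, month):
    """Map (year, month) to one integer that sorts chronologically."""
    return year * 12 + month


nov_2009 = year_month_key(2009, 11)
feb_2010 = year_month_key(2010, 2)

# Plain addition would put Nov/2009 after Feb/2010.
naive_nov_2009 = 2009 + 11
naive_feb_2010 = 2010 + 2

print(nov_2009, feb_2010)              # 24119 24122
print(naive_nov_2009, naive_feb_2010)  # 2020 2012
```

The key also makes month arithmetic easy: the difference between two keys is the number of months between the two dates.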
If this is going to be a repetitive thing/report, I would just create a temporary table with 12 months, then use that as the primary table and do a LEFT OUTER join to the rest of your data. This way, you know you'll always get every month, and wherever a valid join to the "other side" is identified, you'll get that data too. Ooops... missed your point about filling in missing elements from the last element... Thinking...
The following works... and I'll describe the elements to what is going on. First, I created a temp table "C_Months" with a column Mth (month) with numbers 1-12. I used "Mth" as an abbreviation of Month to not cause possible conflict with POSSIBLE reserved word MONTH. Additionally, in my query, the table reference "C_Preset" is the prepared set of data you mentioned you already have of distinct elements.
SELECT
LVM.Equip_ID,
LVM.Mth,
P1.Year,
P1.MyData
FROM
( SELECT
JEM.Equip_ID,
JEM.Mth,
( SELECT MAX( P.Mth )
FROM C_Preset P
WHERE P.Equip_ID = JEM.Equip_ID
AND P.Mth <= JEM.Mth ) as MaxMth
FROM
( SELECT distinct
p.equip_id,
c.mth
FROM
C_months c,
C_Preset p
group by
1, 2
HAVING
c.mth >= MIN( p.Mth )
and c.mth <= MAX( p.Mth )
ORDER BY
1, 2 ) JEM
) LVM,
C_Preset P1
WHERE
LVM.Equip_ID = P1.Equip_ID
AND LVM.MaxMth = P1.Mth
ORDER BY
1, 2
The innermost query is a query of the available months (C_Months) associated with a given equipment ID. In your example, equipment ID 1 had values of 1, 2, 5, so this would return 1, 2, 3, 4, 5. Equipment ID 2 started with 2 and ended with 6, so it would return 2, 3, 4, 5, 6. Hence the aliased reference JEM (Just Equipment Months).
Then, the field selection for MaxMth (Maximum month)... This is the TRICKY ONE
( SELECT MAX( P.Mth )
FROM C_Preset P
WHERE P.Equip_ID = JEM.Equip_ID
AND P.Mth <= JEM.Mth ) as MaxMth
This says I want the maximum month AVAILABLE in C_Preset for the given equipment that is AT OR LESS than the JEM month in question, that is, the highest "valid" equipment item/month within the qualified list. The result would be...
Equip_ID Mth MaxMth
1 1 1
1 2 2
1 3 2
1 4 2
1 5 5
2 2 2
2 3 2
2 4 4
2 5 4
2 6 6
So, for your example of ID = 1, you had months 1, 2, 5 (3 and 4 were missing), so the last valid month that 3 and 4 would refer to is sequence #2's month. Likewise for ID = 2, you had months 2, 4 and 6... Here, 3 would refer back to 2, 5 would refer back to 4.
The rest is the easy part. Now, we join the LVM (Last Valid Month) result shown above to your original C_Preset (fewer records). But since we now have the last valid month to directly associate with an existing record in C_Preset, we join by equipment id and the MaxMth column, and NOT THE ACTUAL month.
Hope this helps... Again, you'll probably have to change my "mth" column references to "month" to match your format.
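For anyone who wants to try the last-valid-month idea end to end, here is a sketch run through Python's sqlite3 module rather than the original database. It is a variation, not the exact query above: the month range per equipment comes from a MIN/MAX derived table (alias r, made up here) instead of the GROUP BY ... HAVING form, and the table contents are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE c_months (mth INTEGER)")
conn.executemany("INSERT INTO c_months VALUES (?)", [(m,) for m in range(1, 13)])
conn.execute(
    "CREATE TABLE c_preset (equip_id INTEGER, mth INTEGER, year INTEGER, mydata INTEGER)"
)
conn.executemany("INSERT INTO c_preset VALUES (?, ?, ?, ?)", [
    (1, 1, 2010, 500), (1, 2, 2010, 600), (1, 5, 2010, 800),
    (2, 2, 2010, 300), (2, 4, 2010, 400), (2, 6, 2010, 500),
])

filled = conn.execute("""
    SELECT lvm.equip_id, lvm.mth, p1.year, p1.mydata
    FROM (
        -- for every month in range, find the last month that has real data
        SELECT jem.equip_id, jem.mth,
               (SELECT MAX(p.mth)
                FROM c_preset p
                WHERE p.equip_id = jem.equip_id
                  AND p.mth <= jem.mth) AS max_mth
        FROM (
            -- every month between each equipment's first and last record
            SELECT r.equip_id, c.mth
            FROM (SELECT equip_id, MIN(mth) AS lo, MAX(mth) AS hi
                  FROM c_preset
                  GROUP BY equip_id) r
            JOIN c_months c ON c.mth BETWEEN r.lo AND r.hi
        ) jem
    ) lvm
    JOIN c_preset p1
      ON p1.equip_id = lvm.equip_id
     AND p1.mth = lvm.max_mth
    ORDER BY lvm.equip_id, lvm.mth
""").fetchall()

for row in filled:
    print(row)
```

The output has one row per month for each equipment between its first and last record, with gaps filled from the most recent earlier month, as in the desired output of the question.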