COUNT for every GROUP BY with non existent values - mysql

I have the following table
id rid eId vs isN
24 3 22 2 1
25 3 21 2 1
26 60 21 2 1
27 60 21 2 1
28 60 21 2 1
29 60 21 2 1
30 60 21 2 1
31 60 21 2 1
32 81 21 2 1
35 60 22 2 1
36 81 22 2 1
37 0 22 2 1
38 60 22 2 1
39 81 22 2 1
40 0 22 2 1
41 60 22 2 1
42 81 22 2 1
43 3 22 2 1
The eId column can hold 8 different numbers.
I want a count for each of these eId values, even when that count is 0. The result should be an array with one entry per eId value, keyed by that value. "vs" takes one of 3 different numbers each time I count.
I want this for a specific rid and a specific vs ("rid" = %d and "vs" = %d).
SELECT count(*) as count FROM notification
WHERE rid = 60 AND vs = 2 AND isN = 1 GROUP BY eId
rid=>60,21=>6,22=>3,vs=>2,isN=>1
(This is what I get with the query above.)
rid=>60,21=>6,22=>3,23=>0,33=>0,34=>0,35=>0,36=>0,41=>0,42=>0,vs=>2,isN=>1
(This is what I want: every eId value counted. Some of these values do not occur in eId for this rid, so I want them returned as zero.)

Here's one way to get the specified resultset:
SELECT d.rid AS `rid`
, SUM(n.eid<=>21) AS `21`
, SUM(n.eid<=>22) AS `22`
, SUM(n.eid<=>23) AS `23`
, SUM(n.eid<=>33) AS `33`
, SUM(n.eid<=>34) AS `34`
, SUM(n.eid<=>35) AS `35`
, SUM(n.eid<=>36) AS `36`
, SUM(n.eid<=>41) AS `41`
, SUM(n.eid<=>42) AS `42`
, d.vs AS `vs`
, d.isN AS `isN`
FROM ( SELECT %d AS rid, %d AS vs, 1 AS isN ) d
LEFT
JOIN notification n
ON n.rid = d.rid
AND n.vs = d.vs
AND n.isN = d.isN
GROUP
BY d.rid
, d.vs
, d.isN
Note: <=> is MySQL's NULL-safe equality operator, so the expression (n.eid<=>21) always returns a 0 or a 1 and never a NULL. Here it behaves like IF(n.eid=21,1,0), or the more ANSI-standard CASE WHEN n.eid = 21 THEN 1 ELSE 0 END. The resulting 0s and 1s can then be aggregated with a SUM function.
You could get equivalent results using any of these forms:
, SUM(n.eid<=>21) AS `21`
, COUNT(IF(n.eid=22,1,NULL)) AS `22`
, SUM(IF(n.eid=23,1,0)) AS `23`
, COUNT(CASE WHEN n.eid = 33 THEN 1 END) AS `33`
, SUM(CASE WHEN n.eid = 34 THEN 1 ELSE 0 END) AS `34`
The "trick" here is that the inline view aliased as d is guaranteed to return exactly one row. The LEFT JOIN then picks up all "matching" rows from the notification table, and the GROUP BY collapses (aggregates) those rows back down to a single row. A conditional test on each row decides whether it is included in a given count: the expression returns a 0 or a 1 for each row, and adding up all the 0s and 1s gives the count.
NOTE: If you use a COUNT(expr) aggregate, you want that expr to return a non-NULL when the row is to be included in the count, and a NULL when the row is not to be included in the count.
If you use a SUM(expr), then you want expr to return a 1 when the row is to be included in the count, and a 0 when it's not. We want a 0 rather than a NULL so that SUM(expr) is guaranteed to return a "zero count" (i.e. a 0 rather than a NULL) when there are no rows to be included. (Of course, we could use an IFNULL function to replace a NULL with a 0, but in this case it's simple enough to avoid the need for that.)
Note that one advantage of this approach to "counting" is that it can easily be extended to get "combined" counts, or to include a row in several different counts, e.g.
, SUM(IF(n.eid IN (41,42),1,0)) AS `total_41_and_42`
would get us a total count of eid=41 and eid=42 rows. (That's not such a great example, because we could just as easily calculate that on the client side by adding the two counts together.) But it really becomes an advantage if you are doing more elaborate counts and want to count a single row in multiple columns:
, SUM(IF(n.eid=42,1,0)) AS eid_42
, SUM(IF(n.eid=42 AND foo=1,1,0)) AS eid_42_foo_1
, SUM(IF(n.eid=42 AND foo=2,1,0)) AS eid_42_foo_2
We can get all those separate counts with just "one pass" through notification table. If we tried to do those checks in the WHERE clause, we'd likely need multiple passes through the table.
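As a rough, runnable sketch of the one-row inline view plus conditional SUM idea, the snippet below replays the question's sample rows in SQLite through Python's sqlite3 module. SQLite is only a stand-in for MySQL here, so the portable CASE form replaces the MySQL-specific <=> operator. The helper names make_conn and count_by_eid and the EIDS list are assumptions made for illustration.

```python
import sqlite3

# Rows (id, rid, eId, vs, isN) copied from the question's sample table.
ROWS = [
    (24, 3, 22, 2, 1), (25, 3, 21, 2, 1), (26, 60, 21, 2, 1),
    (27, 60, 21, 2, 1), (28, 60, 21, 2, 1), (29, 60, 21, 2, 1),
    (30, 60, 21, 2, 1), (31, 60, 21, 2, 1), (32, 81, 21, 2, 1),
    (35, 60, 22, 2, 1), (36, 81, 22, 2, 1), (37, 0, 22, 2, 1),
    (38, 60, 22, 2, 1), (39, 81, 22, 2, 1), (40, 0, 22, 2, 1),
    (41, 60, 22, 2, 1), (42, 81, 22, 2, 1), (43, 3, 22, 2, 1),
]
# The eId values listed in the question's desired output.
EIDS = [21, 22, 23, 33, 34, 35, 36, 41, 42]


def make_conn():
    """Build an in-memory table holding the sample rows (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notification "
        "(id INTEGER, rid INTEGER, eId INTEGER, vs INTEGER, isN INTEGER)"
    )
    conn.executemany("INSERT INTO notification VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def count_by_eid(conn, rid, vs):
    """Return {eId: count} for one rid/vs pair, with zeros for absent eIds."""
    # One conditional SUM per eId. EIDS holds trusted integers, so formatting
    # them into the SQL text is acceptable in this sketch.
    sums = ", ".join(
        f"SUM(CASE WHEN n.eId = {e} THEN 1 ELSE 0 END)" for e in EIDS
    )
    sql = f"""
        SELECT {sums}
        FROM (SELECT ? AS rid, ? AS vs, 1 AS isN) d   -- always exactly one row
        LEFT JOIN notification n
               ON n.rid = d.rid AND n.vs = d.vs AND n.isN = d.isN
        GROUP BY d.rid, d.vs, d.isN
    """
    row = conn.execute(sql, (rid, vs)).fetchone()
    return dict(zip(EIDS, row))


conn = make_conn()
print(count_by_eid(conn, 60, 2))
```

For rid 60 and vs 2 this gives 6 for eId 21, 3 for eId 22, and 0 for every other eId. A rid with no rows at all still yields one result row, with every count at 0, because the one-row inline view drives the LEFT JOIN.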

What you need is a driver table that has all the values you want to output. You can then left outer join this to the actual data:
SELECT count(n.eid) as count
FROM (select distinct eid
from notification
) driver left outer join
(select *
from notification
WHERE rid = %d AND vs = %d AND isN = 1
) n
on driver.eid = n.eid
GROUP BY driver.eid
You should also include the eid in the select clause, unless you are depending on the final ordering of the output. (MySQL versions before 8.0 implicitly sorted the results of a GROUP BY; MySQL 8.0 and later do not, so an explicit ORDER BY is the safer choice.)
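A small sketch of the driver-table approach, again using SQLite through Python's sqlite3 as a stand-in for MySQL. The helper names make_conn and counts_with_driver are illustrative assumptions. The sketch selects the eId alongside the count and adds an explicit ORDER BY.

```python
import sqlite3

# Rows (id, rid, eId, vs, isN) copied from the question's sample table.
ROWS = [
    (24, 3, 22, 2, 1), (25, 3, 21, 2, 1), (26, 60, 21, 2, 1),
    (27, 60, 21, 2, 1), (28, 60, 21, 2, 1), (29, 60, 21, 2, 1),
    (30, 60, 21, 2, 1), (31, 60, 21, 2, 1), (32, 81, 21, 2, 1),
    (35, 60, 22, 2, 1), (36, 81, 22, 2, 1), (37, 0, 22, 2, 1),
    (38, 60, 22, 2, 1), (39, 81, 22, 2, 1), (40, 0, 22, 2, 1),
    (41, 60, 22, 2, 1), (42, 81, 22, 2, 1), (43, 3, 22, 2, 1),
]


def make_conn():
    """Build an in-memory table holding the sample rows (illustrative helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notification "
        "(id INTEGER, rid INTEGER, eId INTEGER, vs INTEGER, isN INTEGER)"
    )
    conn.executemany("INSERT INTO notification VALUES (?, ?, ?, ?, ?)", ROWS)
    return conn


def counts_with_driver(conn, rid, vs):
    """Return [(eId, count), ...] with one row per eId present in the table."""
    sql = """
        SELECT driver.eId, COUNT(n.eId) AS cnt      -- COUNT skips NULLs
        FROM (SELECT DISTINCT eId FROM notification) driver
        LEFT JOIN (SELECT * FROM notification
                   WHERE rid = ? AND vs = ? AND isN = 1) n
               ON driver.eId = n.eId
        GROUP BY driver.eId
        ORDER BY driver.eId
    """
    return conn.execute(sql, (rid, vs)).fetchall()


conn = make_conn()
print(counts_with_driver(conn, 0, 2))
```

With the sample rows, rid 0 has no eId 21 rows, yet eId 21 still appears with a count of 0. The driver only knows the eId values that occur somewhere in notification, so values that never occur in the table (23, 33 and so on) would need a separate list or lookup table as the driver.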

So, essentially what you're looking for is this?...
SELECT rid,eid,vs, COUNT(*) FROM notification GROUP BY rid,eid,vs;
+-----+-----+----+----------+
| rid | eid | vs | COUNT(*) |
+-----+-----+----+----------+
| 0 | 22 | 2 | 2 |
| 3 | 21 | 2 | 1 |
| 3 | 22 | 2 | 2 |
| 60 | 21 | 2 | 6 |
| 60 | 22 | 2 | 3 |
| 81 | 21 | 2 | 1 |
| 81 | 22 | 2 | 3 |
+-----+-----+----+----------+
7 rows in set (0.11 sec)

Related

Getting wrong data from DB when joining MySql [duplicate]

I have a table of revenue as
title_id revenue cost
1 10 5
2 10 5
3 10 5
4 10 5
1 20 6
2 20 6
3 20 6
4 20 6
When I execute this query
SELECT SUM(revenue),SUM(cost)
FROM revenue
GROUP BY revenue.title_id
it produces result
title_id revenue cost
1 30 11
2 30 11
3 30 11
4 30 11
which is OK. Now I want to combine the summed result with another table, which has a structure like this
title_id interest
1 10
2 10
3 10
4 10
1 20
2 20
3 20
4 20
When I execute a join with aggregate functions like this
SELECT SUM(revenue),SUM(cost),SUM(interest)
FROM revenue
LEFT JOIN fund ON revenue.title_id = fund.title_id
GROUP BY revenue.title_id,fund.title_id
it doubles the result
title_id revenue cost interest
1 60 22 60
2 60 22 60
3 60 22 60
4 60 22 60
I can't understand why it doubles the result. Please help.
It's doubling because each title_id is repeated in both the fund and revenue tables, so the join multiplies the number of matching rows. This is easy to see if you remove the aggregate functions and look at the raw joined data.
The way to get around this is to create inline views of your aggregates and join on those results.
SELECT r.title_id,
r.revenue,
r.cost,
f.interest
FROM (SELECT title_id,
Sum(revenue) revenue,
Sum(cost) cost
FROM revenue
GROUP BY revenue.title_id) r
LEFT JOIN (SELECT title_id,
Sum(interest) interest
FROM fund
GROUP BY title_id) f
ON r.title_id = f.title_id
output
| TITLE_ID | REVENUE | COST | INTEREST |
----------------------------------------
| 1 | 30 | 11 | 30 |
| 2 | 30 | 11 | 30 |
| 3 | 30 | 11 | 30 |
| 4 | 30 | 11 | 30 |
This happens because you joined revenue to fund before grouping either of them, so every revenue row is paired with every fund row that has the same title_id. To solve the problem, aggregate each table in its own derived table and then combine the two derived tables with a LEFT JOIN.
SELECT b.title_id,
b.TotalRevenue,
b.TotalCost,
d.TotalInterest
FROM
(
SELECT a.title_id,
SUM(a.revenue) TotalRevenue,
SUM(a.cost) TotalCost
FROM revenue a
GROUP BY a.title_id
) b LEFT JOIN
(
SELECT c.title_id,
SUM(c.interest) TotalInterest
FROM fund c
GROUP BY c.title_id
) d ON b.title_id = d.title_id
There are two rows for each title_id in revenue table.
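To make the row multiplication concrete, here is a minimal sketch that loads the question's two tables into SQLite (via Python's sqlite3, standing in for MySQL) and runs both the naive join and the pre-aggregated join. The variable names naive and fixed are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE revenue (title_id INTEGER, revenue INTEGER, cost INTEGER);
    CREATE TABLE fund (title_id INTEGER, interest INTEGER);
    INSERT INTO revenue VALUES
        (1, 10, 5), (2, 10, 5), (3, 10, 5), (4, 10, 5),
        (1, 20, 6), (2, 20, 6), (3, 20, 6), (4, 20, 6);
    INSERT INTO fund VALUES
        (1, 10), (2, 10), (3, 10), (4, 10),
        (1, 20), (2, 20), (3, 20), (4, 20);
""")

# Naive version: each of the 2 revenue rows per title pairs with each of the
# 2 fund rows, so every SUM sees 4 joined rows instead of 2.
naive = conn.execute("""
    SELECT r.title_id, SUM(r.revenue), SUM(r.cost), SUM(f.interest)
    FROM revenue r
    LEFT JOIN fund f ON r.title_id = f.title_id
    GROUP BY r.title_id
    ORDER BY r.title_id
""").fetchall()

# Pre-aggregated version: collapse each table to one row per title_id first,
# then join the two one-row-per-title results.
fixed = conn.execute("""
    SELECT r.title_id, r.revenue, r.cost, f.interest
    FROM (SELECT title_id, SUM(revenue) AS revenue, SUM(cost) AS cost
          FROM revenue GROUP BY title_id) r
    LEFT JOIN (SELECT title_id, SUM(interest) AS interest
               FROM fund GROUP BY title_id) f
           ON r.title_id = f.title_id
    ORDER BY r.title_id
""").fetchall()

print(naive[0])  # doubled totals for title_id 1
print(fixed[0])  # correct totals for title_id 1
```

For title_id 1 the naive query returns (1, 60, 22, 60), while the pre-aggregated query returns (1, 30, 11, 30), matching the tables shown above.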

How to rank by counting data in multiple rows in a column with certain conditions on it in MySQL

I have a table something like the one below.
Edit 1: column id is the primary key.
id ref_id count_value
10 34 5
11 34 2
12 36 3
13 30 1
14 25 20
15 34 15
15 36 10
What I want is to fetch the data in such a manner that
the values in the count_value field are added up for each corresponding ref_id.
So here in the example:
ref_id 34 has three entries and a total count_value of 22
ref_id 36 has two entries and a total count_value of 13
ref_id 25 has one entry and a total count_value of 20
So what I am expecting is the ref_ids ranked by that total, in this manner:
ref_id
34
25
36
30
I tried using GROUP BY, but I guess that isn't going to solve this, because I want to add up the values in the cells and then rank the ref_ids by the final total.
Regarding the condition part of the question: there is a timestamp column, and I only need the rows created after a certain datetime.
You can group by ref_id, and then order the records by descending sum() of count_value:
select ref_id
from mytable
group by ref_id
order by sum(count_value) desc
You can add a where clause to the query to implement the filter on the timestamp column (which you did not show in your sample data): it goes between the from clause and the group by clause.
Demo on DB Fiddle:
| ref_id |
| -----: |
| 34 |
| 25 |
| 36 |
| 30 |
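A minimal sketch of the same query with the timestamp filter in place, run against SQLite through Python's sqlite3 as a stand-in for MySQL. The question does not show the timestamp column, so the column name created_at and all of its values are invented for illustration. The id column is left out because the query does not use it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (ref_id INTEGER, count_value INTEGER, created_at TEXT)"
)
# (ref_id, count_value) pairs come from the question; the created_at values
# are hypothetical and exist only to exercise the WHERE clause.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (34, 5, "2020-01-02 00:00:00"),
        (34, 2, "2020-01-02 00:00:00"),
        (36, 3, "2020-01-02 00:00:00"),
        (30, 1, "2020-01-02 00:00:00"),
        (25, 20, "2020-01-01 00:00:00"),  # older row, filtered out below
        (34, 15, "2020-01-02 00:00:00"),
        (36, 10, "2020-01-02 00:00:00"),
    ],
)


def ranked_ref_ids(conn, created_after):
    """Rank ref_ids by summed count_value, using only rows newer than the cutoff."""
    sql = """
        SELECT ref_id
        FROM mytable
        WHERE created_at > ?          -- filter goes between FROM and GROUP BY
        GROUP BY ref_id
        ORDER BY SUM(count_value) DESC
    """
    return [row[0] for row in conn.execute(sql, (created_after,))]


print(ranked_ref_ids(conn, "1970-01-01 00:00:00"))  # no rows filtered out
print(ranked_ref_ids(conn, "2020-01-01 00:00:00"))  # the ref_id 25 row is excluded
```

With no rows filtered out, the ranking is 34, 25, 36, 30, as in the demo output. With the hypothetical cutoff, ref_id 25 drops out because its only row is too old.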

How to count the total amount of similar rows in a MySQL query

I need help with a MySQL query to count the number of times a combination of the same age_from and age_to occur.
Sample data:
+----------+--------+
| age_from | age_to |
+----------+--------+
| 18 | 100 |
| 30 | 75 |
| 18 | 50 |
| 18 | 100 |
| 30 | 75 |
| 18 | 50 |
| 30 | 75 |
+----------+--------+
Desired result:
18 to 100 = 2
30 to 75 = 3
18 to 50 = 2
I have already tried this:
SELECT
`p_age_from` AS `age_from`,
`p_age_to` AS `age_to`,
COUNT(`p_age_from`) AS `user_count`
FROM `user`
GROUP BY `p_age_from`, `p_age_to`
ORDER BY `p_age_from`
You can use CONCAT to join the age_from and age_to values. Then use this to group and count the data:
SELECT
CONCAT(`age_from`,' to ',`age_to`) AS `age_range`,
COUNT(*) AS `user_count`
FROM `user`
GROUP BY `age_range`
You can get your result using following query
SELECT CONCAT(age_from,' to ',age_to,' = ',count(age_to)) as 'age_from to age_to'
FROM user
GROUP BY age_from, age_to
ORDER BY age_from;
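Grouping on the two columns directly can be checked with a short sketch. It uses SQLite through Python's sqlite3 as a stand-in for MySQL and leaves the "18 to 100 = 2" formatting to the client, which is a design choice of this sketch and not something the answers require. The names counts and labels are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (age_from INTEGER, age_to INTEGER)')
# Sample rows copied from the question.
conn.executemany(
    'INSERT INTO "user" VALUES (?, ?)',
    [(18, 100), (30, 75), (18, 50), (18, 100), (30, 75), (18, 50), (30, 75)],
)

# One output row per distinct (age_from, age_to) combination.
counts = conn.execute("""
    SELECT age_from, age_to, COUNT(*) AS user_count
    FROM "user"
    GROUP BY age_from, age_to
    ORDER BY age_from, age_to
""").fetchall()

# Build the display strings on the client side.
labels = [f"{lo} to {hi} = {n}" for lo, hi, n in counts]
for label in labels:
    print(label)
```

Keeping age_from and age_to as separate GROUP BY columns avoids building a string per row just to group on it, and the numeric columns remain available for sorting.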

Use mysql SUM() and generate a random number in a WHERE clause

Suppose I have this table :
+------------------------------------+
| T_BOULEVERSEMENT |
+---------------------+--------------+
| PK_A_BOULEVERSEMENT | I_OCCURRENCE |
+---------------------+--------------+
| 1 | 3 |
+---------------------+--------------+
| 2 | 5 |
+---------------------+--------------+
| 3 | 1 |
+---------------------+--------------+
| ... | ... |
+---------------------+--------------+
| X | Y |
+---------------------+--------------+
I want to return the first row for which the running total of I_OCCURRENCE (that row plus all previous rows) is greater than or equal to a random value.
The random value lies in the range [1, SUM(I_OCCURRENCE)].
The following statement seems to work fine.
SELECT y.`PK_A_BOULEVERSEMENT`,
y.`I_OCCURRENCE`
FROM (SELECT t.`PK_A_BOULEVERSEMENT`,
t.`I_OCCURRENCE`,
(SELECT SUM(x.`I_OCCURRENCE`)
FROM `T_BOULEVERSEMENT` x
WHERE x.`PK_A_BOULEVERSEMENT` <= t.`PK_A_BOULEVERSEMENT`) AS running_total
FROM `T_BOULEVERSEMENT` t
ORDER BY t.`PK_A_BOULEVERSEMENT`) y
WHERE y.running_total >= ROUND(RAND() * ((SELECT SUM(z.`I_OCCURRENCE`) FROM `T_BOULEVERSEMENT` z) - 1) + 1)
ORDER BY y.`PK_A_BOULEVERSEMENT`
LIMIT 1
But in practice it mainly returns rows where PK_A_BOULEVERSEMENT is less than 10.
However, if I execute the following statement :
SELECT ROUND(RAND() * ((SELECT SUM(z.`I_OCCURRENCE`) FROM `T_BOULEVERSEMENT` z) - 1) + 1)
The result seems to be uniform in the range [1, SUM(I_OCCURRENCE)].
What could be wrong?
Thanks
EDIT :
SQL Fiddle : http://sqlfiddle.com/#!2/b37d6/2
The desired result must be uniform in the range 1 - MAX(PK_A_BOULEVERSEMENT)
Try this:
SET @random_sum = (SELECT ROUND(RAND() * ((SELECT SUM(z.`I_OCCURRENCE`) FROM `T_BOULEVERSEMENT` z) - 1) + 1));
SELECT y.PK_A_BOULEVERSEMENT, SUM(x.I_OCCURRENCE) AS tot_occurence
FROM T_BOULEVERSEMENT AS x, T_BOULEVERSEMENT AS y
WHERE x.PK_A_BOULEVERSEMENT <= y.PK_A_BOULEVERSEMENT
GROUP BY y.PK_A_BOULEVERSEMENT
HAVING tot_occurence >= @random_sum
ORDER BY y.PK_A_BOULEVERSEMENT
LIMIT 1
I had to use a user variable because MySQL re-evaluates RAND() for every row when it appears in a WHERE clause, so every row is compared to a different value.
With the variable, the random number is evaluated once, just before the query runs.
The cause of your problem is that the random number is being regenerated for each row in the subquery. Chances are that within the first 10 rows, you'll get a random number that's less than that row's running total. If we add the RAND() call and look at the subquery, it will look like this:
PK_A_.. I_OCC.. RUNNING_TOTAL RNDM
1 3 3 58
2 1 4 30
3 3 7 38
4 1 8 33
5 3 11 53
6 3 14 40
7 3 17 37
8 3 20 1
9 3 23 21
10 1 24 39
11 3 27 3
12 1 28 23
We only have to go as far as row 8 to find a running_total that exceeds the random value. The solution is to get the random value once, as suggested in the other answer.
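The "draw the random value once" fix can be sketched as follows, using SQLite through Python's sqlite3 as a stand-in for MySQL and drawing the threshold in Python rather than in SQL. The helper names pick_row and weighted_pick are assumptions, and the three data rows are the ones shown at the top of the question.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE T_BOULEVERSEMENT "
    "(PK_A_BOULEVERSEMENT INTEGER PRIMARY KEY, I_OCCURRENCE INTEGER)"
)
conn.executemany(
    "INSERT INTO T_BOULEVERSEMENT VALUES (?, ?)", [(1, 3), (2, 5), (3, 1)]
)


def pick_row(conn, threshold):
    """Return the first key whose running total reaches a fixed threshold."""
    sql = """
        SELECT y.PK_A_BOULEVERSEMENT
        FROM (SELECT t.PK_A_BOULEVERSEMENT,
                     (SELECT SUM(x.I_OCCURRENCE)
                      FROM T_BOULEVERSEMENT x
                      WHERE x.PK_A_BOULEVERSEMENT <= t.PK_A_BOULEVERSEMENT
                     ) AS running_total
              FROM T_BOULEVERSEMENT t) y
        WHERE y.running_total >= ?      -- same value for every row
        ORDER BY y.PK_A_BOULEVERSEMENT
        LIMIT 1
    """
    return conn.execute(sql, (threshold,)).fetchone()[0]


def weighted_pick(conn, rng=random):
    """Draw one threshold in [1, SUM(I_OCCURRENCE)], then look up its row."""
    total = conn.execute(
        "SELECT SUM(I_OCCURRENCE) FROM T_BOULEVERSEMENT"
    ).fetchone()[0]
    threshold = rng.randint(1, total)   # evaluated once, before the query
    return pick_row(conn, threshold)


print(weighted_pick(conn))
```

With running totals of 3, 8 and 9, thresholds 1 to 3 select key 1, thresholds 4 to 8 select key 2, and threshold 9 selects key 3. Each row is therefore chosen in proportion to its I_OCCURRENCE.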

MySQL Left Join: if primary condition does not exist use backup condition return only one

I have seen variations of this question asked, but either they didn't apply or I didn't understand the answers.
I have two tables: one of charge types with an additional cost, and one of charges. I want to join them to get the appropriate values. I want to join the tables where the charge date is between the startDate and the endDate and the types match. If there is no match, I want it to choose the type -1 (with the same condition on dates). If there is still no match, I don't want the charge to show up in the results.
I was initially going to do a normal left join ordered by 'type' desc and then group by 'type', believing that it would leave me with only the first type, but I read that MySQL advises against this because the group by can be unpredictable and will not always return the first match.
Tables (charges_types first, then charges):
startDate | endDate | type | addCost
--------------------------------------
2010-01-01 2010-12-31 1 100
2010-01-01 2010-12-31 2 200
2010-01-01 2010-12-31 -1 50
2011-01-01 2012-02-20 3 350
2011-01-01 2012-02-20 1 150
2011-01-01 2012-02-20 -1 75
chargeDate | type | cost
---------------------------
2010-10-01 1 10
2010-11-01 2 20
2010-12-01 4 40
2011-02-01 3 60
2011-03-01 2 25
2011-04-01 4 25
Desired Results:
chargeDate | type | cost | addCost
---------------------------------
2010-10-01 1 10 100
2010-11-01 2 20 200
2010-12-01 4 40 50
2011-02-01 3 60 350
2011-03-01 2 25 75
I'm using a subquery in which I try to join charges with charges_types. If the join doesn't succeed, type is null, and with coalesce I set type_c to -1; otherwise I set it to type. Then I join this subquery with charges_types again, and in the join clause I use type_c instead of type:
select c.chargeDate, c.type, c.cost, ct.addCost
from
(select
charges.chargeDate,
charges.type,
coalesce(charges_types.type, -1) as type_c,
charges.cost
from
charges left join charges_types
on charges.chargeDate between charges_types.startDate and charges_types.endDate
and charges.type = charges_types.type) c
inner join charges_types ct
on c.chargeDate between ct.startDate and ct.endDate
and c.type_c = ct.type
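To check the fallback behaviour, the sketch below runs this query against the question's two tables in SQLite (through Python's sqlite3, as a stand-in for MySQL). Dates are stored as ISO-8601 text, which compares correctly with BETWEEN. An ORDER BY is added so the output is deterministic, and the name rows is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE charges_types
        (startDate TEXT, endDate TEXT, type INTEGER, addCost INTEGER);
    CREATE TABLE charges (chargeDate TEXT, type INTEGER, cost INTEGER);
    INSERT INTO charges_types VALUES
        ('2010-01-01', '2010-12-31',  1, 100),
        ('2010-01-01', '2010-12-31',  2, 200),
        ('2010-01-01', '2010-12-31', -1,  50),
        ('2011-01-01', '2012-02-20',  3, 350),
        ('2011-01-01', '2012-02-20',  1, 150),
        ('2011-01-01', '2012-02-20', -1,  75);
    INSERT INTO charges VALUES
        ('2010-10-01', 1, 10), ('2010-11-01', 2, 20), ('2010-12-01', 4, 40),
        ('2011-02-01', 3, 60), ('2011-03-01', 2, 25), ('2011-04-01', 4, 25);
""")

rows = conn.execute("""
    SELECT c.chargeDate, c.type, c.cost, ct.addCost
    FROM (SELECT charges.chargeDate,
                 charges.type,
                 -- no exact type match in the period: fall back to type -1
                 COALESCE(charges_types.type, -1) AS type_c,
                 charges.cost
          FROM charges
          LEFT JOIN charges_types
                 ON charges.chargeDate BETWEEN charges_types.startDate
                                           AND charges_types.endDate
                AND charges.type = charges_types.type) c
    INNER JOIN charges_types ct
            ON c.chargeDate BETWEEN ct.startDate AND ct.endDate
           AND c.type_c = ct.type
    ORDER BY c.chargeDate
""").fetchall()

for row in rows:
    print(row)
```

On this data the first five rows match the desired results. The 2011-04-01 charge of type 4 also falls back to type -1 and is returned with addCost 75. A charge dated outside every period would be dropped by the inner join.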