sum of count(*) for all rows in MySQL - mysql

I'm stuck on a sum() query: I want the sum of the count(*) values to appear in every row of a GROUP BY result.
Here is the query:
select
u.user_type as user,
u.count,
sum(u.count)
FROM
(
select
DISTINCT
user_type,
count(*) as count
FROM
users
where
(user_type = "driver" OR user_type = "passenger")
GROUP BY
user_type
) u;
Current Output:
----------------------------------
| user | count | sum |
----------------------------------
| driver | 58 | 90 |
----------------------------------
Expected Output:
----------------------------------
| user | count | sum |
----------------------------------
| driver | 58 | 90 |
| passenger | 32 | 90 |
----------------------------------
If I remove sum(u.count) from the query, the output looks like this:
--------------------------
| user | count |
--------------------------
| driver | 58 |
| passenger | 32 |
--------------------------

You need a subquery:
SELECT user_type,
Count(*) AS count,
(SELECT COUNT(*)
FROM users
WHERE user_type IN ("driver","passenger" )) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
GROUP BY user_type ;
Note that you don't need DISTINCT here.
OR
SELECT user_type,
Count(*) AS count,
c.sum
FROM users
CROSS JOIN (
SELECT COUNT(*) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
) as c
WHERE user_type IN ("driver","passenger" )
GROUP BY user_type ;
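To try the CROSS JOIN variant without a MySQL server, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The sample rows (58 drivers, 32 passengers, plus 5 hypothetical "admin" rows to show the filter at work) are assumptions chosen to match the question's numbers, and SQLite only stands in for MySQL here.

```python
import sqlite3

# Hypothetical sample data: 58 drivers, 32 passengers, 5 unrelated rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_type TEXT)")
sample = [("driver",)] * 58 + [("passenger",)] * 32 + [("admin",)] * 5
conn.executemany("INSERT INTO users (user_type) VALUES (?)", sample)

# The derived table c has exactly one row (the overall total),
# so the cross join attaches that total to every grouped row.
query = """
SELECT u.user_type, COUNT(*) AS count, c.total
FROM users AS u
CROSS JOIN (
    SELECT COUNT(*) AS total
    FROM users
    WHERE user_type IN ('driver', 'passenger')
) AS c
WHERE u.user_type IN ('driver', 'passenger')
GROUP BY u.user_type, c.total
ORDER BY u.user_type
"""
result = conn.execute(query).fetchall()
print(result)  # [('driver', 58, 90), ('passenger', 32, 90)]
```

The total column is named total rather than sum only to avoid confusion with the SUM() function.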

You can use the WITH ROLLUP modifier:
select coalesce(user_type, 'total') as user, count(*) as count
from users
where user_type in ('driver', 'passenger')
group by user_type with rollup
This will return the same information but in a different format:
user | count
----------|------
driver | 32
passenger | 58
total | 90
In MySQL 8 you can use COUNT() as a window function:
select distinct
user_type,
count(*) over (partition by user_type) as count,
count(*) over () as sum
from users
where user_type in ('driver', 'passenger');
Result:
user_type | count | sum
----------|-------|----
driver | 32 | 90
passenger | 58 | 90
Or use a CTE (common table expression):
with cte as (
select user_type, count(*) as count
from users
where user_type in ('driver', 'passenger')
group by user_type
)
select user_type, count, (select sum(count) from cte) as sum
from cte

I would be tempted to ask: are you sure you need this at the DB level?
Unless you are working purely in the database layer, any processing of these results will be built into an application layer and will presumably require some form of looping through the results.
It could be easier, simpler, and more readable to run:
SELECT user_type,
COUNT(*) AS count
FROM users
WHERE user_type IN ("driver", "passenger")
GROUP BY user_type
... and simply add up the total count in the application layer.
As pointed out by Juan in another answer, the DISTINCT is redundant, as the GROUP BY ensures that each resultant row is different.
Like Juan, I also prefer an IN here, rather than an OR condition, for the user_type, as I find it more readable. It also reduces the likelihood of confusion if further AND conditions are combined in the future.
As an aside, I would consider moving the names of the user types, "driver" and "passenger", into a separate user_types table and referencing them by an ID column from your users table.
N.B. If you absolutely do need this at the DB level, I would advocate using one of Paul's excellent options, or the CROSS JOIN approach proffered by Tom Mac, and by Juan as his second suggested solution.
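The application-layer approach described in this answer could look roughly like the following. This is only a sketch: it uses Python with an in-memory SQLite database as a stand-in for the real application and MySQL connection, and the sample rows are assumptions matching the question's counts.

```python
import sqlite3

# Hypothetical sample data standing in for the real users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_type TEXT)")
sample = [("driver",)] * 58 + [("passenger",)] * 32
conn.executemany("INSERT INTO users (user_type) VALUES (?)", sample)

# Plain GROUP BY query: one row per user type, no total.
counts = conn.execute(
    """
    SELECT user_type, COUNT(*) AS count
    FROM users
    WHERE user_type IN ('driver', 'passenger')
    GROUP BY user_type
    ORDER BY user_type
    """
).fetchall()

# Add up the total in the application, then attach it to every row.
total = sum(count for _, count in counts)
report = [(user_type, count, total) for user_type, count in counts]
print(report)  # [('driver', 58, 90), ('passenger', 32, 90)]
```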

Try this. Inline view gets the overall total :
SELECT a.user_type,
count(*) AS count,
b.sum
FROM users a
JOIN (SELECT COUNT(*) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
) b ON TRUE
WHERE a.user_type IN ("driver","passenger" )
GROUP BY a.user_type;

You could simply combine SUM() OVER() with COUNT(*):
SELECT user_type, COUNT(*) AS cnt, SUM(COUNT(*)) OVER() AS total
FROM users WHERE user_type IN ('driver', 'passenger') GROUP BY user_type;
Output:
+------------+------+-------+
| user_type | cnt | total |
+------------+------+-------+
| passenger | 58 | 90 |
| driver | 32 | 90 |
+------------+------+-------+
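For anyone who wants to run the SUM(COUNT(*)) OVER() form locally, here is a small sketch using Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window function support), and the sample counts are hypothetical values taken from the question rather than from the fiddle above.

```python
import sqlite3

# Hypothetical sample data: 58 drivers and 32 passengers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_type TEXT)")
sample = [("driver",)] * 58 + [("passenger",)] * 32
conn.executemany("INSERT INTO users (user_type) VALUES (?)", sample)

# COUNT(*) is evaluated per group first; SUM(...) OVER () then runs
# over the grouped rows, so each row also carries the grand total.
query = """
SELECT user_type, COUNT(*) AS cnt, SUM(COUNT(*)) OVER () AS total
FROM users
WHERE user_type IN ('driver', 'passenger')
GROUP BY user_type
ORDER BY user_type
"""
result = conn.execute(query).fetchall()
print(result)
```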

Add a group by clause at the end for user-type, e.g:
select
u.user_type as user,
u.count,
sum(u.count)
FROM
(
select
DISTINCT
user_type,
count(*) as count
FROM
users
where
(user_type = "driver" OR user_type = "passenger")
GROUP BY
user_type
) u GROUP BY u.user_type;

Tom Mac explained his answer properly. Here is another way you can do that.
I checked the query performance and found no difference with 1000 records.
select user_type,Countuser,(SELECT COUNT(*)
FROM users
WHERE user_type IN ('driver','passenger') )as sum from (
select user_type,count(*) as Countuser from users a
where a.user_type='driver'
group by a.user_type
union
select user_type,count(*) as Countuser from users b
where b.user_type='passenger'
group by b.user_type
)c
group by user_type,Countuser

Try this:
WITH SUB_Q AS (
SELECT USER_TYPE, COUNT(*) AS CNT
FROM USERS
WHERE USER_TYPE = 'passenger' OR USER_TYPE = 'driver'
GROUP BY USER_TYPE
),
SUB_Q2 AS (
SELECT SUM(CNT) AS SUM_OF_COUNT
FROM SUB_Q
)
SELECT A.USER_TYPE, A.CNT AS COUNT, B.SUM_OF_COUNT AS SUM
FROM SUB_Q A JOIN SUB_Q2 B ON (TRUE);
I used the PostgreSQL dialect, but you can easily change it to a subquery.

select
u.user_type as user,
u.count,
sum(u.count)
FROM users group by user

Related

SQL: aggregate over aggregate (max over sums)

I have a problem creating a valid query that aggregates over an aggregate subquery.
MySQL allows some non-ANSI constructs, but they give incorrect results.
CREATE TABLE `log` (
`id` int NOT NULL,
`id_user` varchar(32) NOT NULL,
`datastamp` datetime NOT NULL DEFAULT now(),
`processed` int NOT NULL DEFAULT '0',
PRIMARY KEY (`id`));
I want a result table consisting of the "best" user for every year (where "best" means having the highest total of the processed field), like:
source table:
2010 | u1 | 1
2010 | u1 | 3
2010 | u2 | 2
2011 | u1 | 1
2011 | u1 | 1
2011 | u2 | 5
result:
2010 | u1 | 4
2011 | u2 | 5
simple query
select year(datastamp) as y, id_user, sum(processed) as ps from log group by id_user, y
gives all sums per user and year:
2010 | u1 | 4
2010 | u2 | 2
2011 | u1 | 2
2011 | u2 | 5
but I can't select rows with highest sum for every year.
Trying something like
select y, max(ps), id_user from(...) group by y
Although MySQL accepts this, it returns an incorrect id_user value. Other solutions I found on Stack Overflow suggest joining the base table with a subquery, but I cannot use aggregate results (sum(processed) as ps) inside the ON condition.
I think window functions might help you in this case. You can query the data like this:
select *
from
(
select y, id_user, ps, rank() over (partition by y order by ps desc) as ranks_per_year
from
(
select year(datastamp) as y, id_user, sum(processed) as ps
from log
group by 1,2
) A
) B
where ranks_per_year = 1
Both rank() and dense_rank() return every tied user for a year when there is a tie for first place.
If rank() does not work in your engine, as you mentioned, you can use max() as a window function instead. Here is the query:
with tbl as
(
select '2010' as year,'u1' as id_user,1 as processed union all
select '2010','u1',3 union all
select '2010','u2',2 union all
select '2011','u1',1 union all
select '2011','u1',1 union all
select '2011','u2',5
)
select *
from
(
select year, id_user, ps,
max(ps) over (partition by year) as max_ps_per_year
from
(
select year, id_user, sum(processed) as ps
from tbl
group by 1,2
) A
) B
where ps = max_ps_per_year
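As a quick check of the rank() approach against the question's sample rows, here is a sketch using Python's sqlite3 module. SQLite stands in for MySQL here, so strftime('%Y', ...) replaces year(), and the exact datastamp values are made up since the question only gives the years.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (id INTEGER PRIMARY KEY, id_user TEXT NOT NULL, "
    "datastamp TEXT NOT NULL, processed INTEGER NOT NULL DEFAULT 0)"
)
# Same year/user/processed values as the question; dates are hypothetical.
rows = [
    ("u1", "2010-01-05 10:00:00", 1),
    ("u1", "2010-03-07 10:00:00", 3),
    ("u2", "2010-06-01 10:00:00", 2),
    ("u1", "2011-02-02 10:00:00", 1),
    ("u1", "2011-04-04 10:00:00", 1),
    ("u2", "2011-08-08 10:00:00", 5),
]
conn.executemany(
    "INSERT INTO log (id_user, datastamp, processed) VALUES (?, ?, ?)", rows
)

# Inner query: sum per (year, user). Middle query: rank users within each
# year by that sum. Outer query: keep only the top-ranked rows.
query = """
SELECT y, id_user, ps
FROM (
    SELECT y, id_user, ps,
           RANK() OVER (PARTITION BY y ORDER BY ps DESC) AS rank_in_year
    FROM (
        SELECT strftime('%Y', datastamp) AS y, id_user, SUM(processed) AS ps
        FROM log
        GROUP BY strftime('%Y', datastamp), id_user
    ) AS sums
) AS ranked
WHERE rank_in_year = 1
ORDER BY y
"""
best = conn.execute(query).fetchall()
print(best)  # [('2010', 'u1', 4), ('2011', 'u2', 5)]
```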

Can this query, which groups users by amount of comments posted, be simplified?

Two tables are used in this query, and all that matters in the result is the number of users which have or haven't posted any comments so far. The table user of course has the column id, which is the foreign key in the table comment, identified by the column user_id.
The first super-simple query groups users by whether or not they have any comments so far. It outputs two rows (a row with the user count who have comments, and a row with the user count who have no comments), with two columns (number of users, and whether or not they have posted any comments).
SELECT
COUNT(id) AS user_count,
IF( id IN ( SELECT user_id FROM `comment` ), 1, 0) AS has_comment
FROM `user`
GROUP BY has_comment
An example of what the output would look like:
+------------+-------------+
| user_count | has_comment |
+------------+-------------+
| 150 | 0 |
| 140 | 1 |
+------------+-------------+
Now here comes my question. I want slightly more information here, by grouping these users into 3 groups instead:
Users that have posted no comments
Users that have posted fewer than 10 comments
Users that have posted 10 or more comments
And the best query that I know how to write for this purpose is as follows, which works, but unfortunately runs 4 subqueries and has 2 derived tables:
SELECT
COUNT(id) AS user_count,
CASE
WHEN id IN ( SELECT user_id FROM ( SELECT COUNT(user_id) AS comment_count, user_id FROM `comment` GROUP BY user_id HAVING comment_count >= 10 ) AS a) THEN '10 or more'
WHEN id IN ( SELECT user_id FROM ( SELECT COUNT(user_id) AS comment_count, user_id FROM `comment` GROUP BY user_id HAVING comment_count < 10 ) AS b) THEN 'less than 10'
ELSE 'none'
END AS has_comment
FROM `user`
GROUP BY has_comment
An example of the output here would be something like:
+------------+-------------+
| user_count | has_comment |
+------------+-------------+
| 150 | none |
| 130 | less than 10|
| 100 | 10 or more |
+------------+-------------+
This second query; can it be written more simply and efficiently, and still produce the same kind of result? (potentially maybe even be expanded into more of these kinds of "groups")
You can use two levels of aggregation:
select
count(*) no_users,
case
when no_comments = 0 then 'none'
when no_comments < 10 then 'less than 10'
else '10 or more'
end has_comment
from (
select
u.id,
(select count(*) from `comment` c where c.user_id = u.id) no_comments
from `user` u
) t
group by has_comment
order by min(no_comments)
The subquery counts how many comments each user has (you could also express this with a left join and aggregation); then, the outer query classifies and counts the users per number of comments.
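The left join variant mentioned above might look like the following sketch, run here against an in-memory SQLite database from Python. The five users and their comment counts (0, 0, 3, 10, 12) are hypothetical test data, and the column name comment_count is my own choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER PRIMARY KEY)')
conn.execute(
    "CREATE TABLE comment (id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL)"
)
conn.executemany('INSERT INTO "user" (id) VALUES (?)', [(i,) for i in range(1, 6)])
# Hypothetical data: users 1 and 2 have no comments, user 3 has 3,
# user 4 has 10, user 5 has 12.
comments = [(3,)] * 3 + [(4,)] * 10 + [(5,)] * 12
conn.executemany("INSERT INTO comment (user_id) VALUES (?)", comments)

# First level: one row per user with their comment count (0 when the
# left join finds no match). Second level: bucket users and count them.
query = """
SELECT COUNT(*) AS user_count,
       CASE
           WHEN comment_count = 0 THEN 'none'
           WHEN comment_count < 10 THEN 'less than 10'
           ELSE '10 or more'
       END AS has_comment
FROM (
    SELECT u.id, COUNT(c.id) AS comment_count
    FROM "user" AS u
    LEFT JOIN comment AS c ON c.user_id = u.id
    GROUP BY u.id
) AS per_user
GROUP BY has_comment
ORDER BY MIN(comment_count)
"""
buckets = {label: n for n, label in conn.execute(query)}
print(buckets)
```

Adding another group only takes one more WHEN branch in the CASE expression.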

Finding users with at least one of every item

For example, I have the following table called Information:
user_id | item
-------------------------
45 | camera
36 | smartphone
23 | camera
1 | glucose monitor
3 | smartwatch
2 | smartphone
7 | smartphone
2 | camera
2 | glucose monitor
2 | smartwatch
How can I check which user_id has at least one of every item?
The set of items is not static and may be different every time. In this example, however, there are 4 unique items: camera, smartphone, smartwatch, glucose monitor.
Expected Result:
Because user_id : 2 has at least one of every item, the result will be:
user_id
2
Here is what I have attempted so far; however, if the list of items changes from 4 unique items to 3, I don't think it works anymore.
SELECT *
FROM Information
GROUP BY Information.user_id
having count(DISTINCT item) >= 4
One approach would be to aggregate by user_id, and then assert that the user's distinct item count matches the total distinct item count from the entire table.
SELECT
user_id
FROM Information
GROUP BY
user_id
HAVING
COUNT(DISTINCT item) = (SELECT COUNT(DISTINCT item) FROM Information);
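To see that comparing against the table-wide distinct count adapts when the item list changes, here is a sketch using Python's sqlite3 module with the question's rows. The extra 'tablet' row added afterwards is a hypothetical fifth item, and SQLite is only a stand-in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Information (user_id INTEGER, item TEXT)")
rows = [
    (45, "camera"), (36, "smartphone"), (23, "camera"),
    (1, "glucose monitor"), (3, "smartwatch"), (2, "smartphone"),
    (7, "smartphone"), (2, "camera"), (2, "glucose monitor"),
    (2, "smartwatch"),
]
conn.executemany("INSERT INTO Information VALUES (?, ?)", rows)

# Keep users whose distinct item count equals the distinct item count
# of the whole table; no item count is hard-coded.
query = """
SELECT user_id
FROM Information
GROUP BY user_id
HAVING COUNT(DISTINCT item) = (SELECT COUNT(DISTINCT item) FROM Information)
"""
before = [user_id for (user_id,) in conn.execute(query)]
print(before)  # [2]

# A new item (hypothetical) that user 2 does not own: nobody qualifies now.
conn.execute("INSERT INTO Information VALUES (45, 'tablet')")
after = [user_id for (user_id,) in conn.execute(query)]
print(after)  # []
```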
You can try joining the per-user distinct count to the total distinct count:
SELECT t1.user_id
FROM (
SELECT user_id,COUNT(DISTINCT item) cnt
FROM T
GROUP BY user_id
) t1 JOIN (SELECT COUNT(DISTINCT item) cnt FROM T) t2
WHERE t1.cnt = t2.cnt
Or use EXISTS:
Query 1:
SELECT t1.user_id
FROM (
SELECT user_id,COUNT(DISTINCT item) cnt
FROM T
GROUP BY user_id
) t1
WHERE exists(
SELECT 1
FROM T tt
HAVING COUNT(DISTINCT tt.item) = t1.cnt
)
Results:
| user_id |
|---------|
| 2 |
Another way to solve this problem is with a CTE and the dense_rank function.
This may also perform better on MySQL. The Dense_Rank function ranks the items within each user. I count the number of distinct items in the table and pick the users whose highest rank equals that count.
With Main as (
Select user_id
,item
,Dense_Rank () over (
Partition by user_id
Order by item
) as Dense_item
From information
)
Select
user_id
From Main
Where
Dense_item = (
Select
Count(Distinct item)
from
information);

MySQL Query with the count, group by

Table: statistics
id | user | Message
----------------------
1 | user1 |message1
2 | user2 |message2
3 | user1 |message3
I am able to find the count of messages sent by each user using this query:
select user, count(*) from statistics group by user;
How can I show the message column data along with the count? For example:
user | count | message
------------------------
user1| 2 |message1
|message3
user2| 1 |message2
You seem to want to show the count per user along with the messages that user sent.
If your MySQL version doesn't support window functions, you can compute a row number with a correlated subquery in the SELECT list, then display the user and count only on the row where rn = 1.
CREATE TABLE T(
id INT,
user VARCHAR(50),
Message VARCHAR(100)
);
INSERT INTO T VALUES(1,'user1' ,'message1');
INSERT INTO T VALUES(2,'user2' ,'message2');
INSERT INTO T VALUES(3,'user1' ,'message3');
Query 1:
SELECT (case when rn = 1 then user else '' end) 'users',
(case when rn = 1 then cnt else '' end) 'count',
message
FROM (
select
t1.user,
t2.cnt,
t1.message,
(SELECT COUNT(*) from T tt WHERE tt.user = t1.user and t1.id >= tt.id) rn
from T t1
join (
select user, count(*) cnt
from T
group by user
) t2 on t1.user = t2.user
) t1
order by user,message
Results:
| users | count | message |
|-------|-------|----------|
| user1 | 2 | message1 |
| | | message3 |
| user2 | 1 | message2 |
select user, count(*) as 'total' , group_concat(message) from statistics group by user;
You could join the result of your group by with the full table (or vice versa)?
Or, depending on what you want, you could use group_concat() using \n as separator.
Use Group_concat
select user, count(0) as ct,group_concat(Message) from statistics group by user;
This will give you the messages in CSV format.
NOTE: GROUP_CONCAT has a default size limit of 1024 bytes in MySQL.
With multi-byte character sets this means fewer characters: roughly 1024/3 for utf8 and 1024/4 (256) for utf8mb4.
You can use the group_concat_max_len system variable to raise the maximum length as needed, but take memory considerations into account in a production environment.
SET group_concat_max_len=100000000
Update:
You can use any separator in group_concat
Group_concat(Message SEPARATOR '----')
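Here is a runnable sketch of the grouped-concatenation idea using Python's sqlite3 module. Note the assumption: SQLite spells the separator as a second argument, group_concat(message, '----'), whereas MySQL uses the SEPARATOR keyword shown above. The order of the concatenated messages is not guaranteed here, so the code sorts them after splitting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statistics (id INTEGER PRIMARY KEY, user TEXT, message TEXT)"
)
conn.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?)",
    [(1, "user1", "message1"), (2, "user2", "message2"), (3, "user1", "message3")],
)

# One row per user: the message count plus all messages joined by '----'.
query = """
SELECT user, COUNT(*) AS total, GROUP_CONCAT(message, '----') AS messages
FROM statistics
GROUP BY user
ORDER BY user
"""
summary = {
    user: (total, sorted(messages.split("----")))
    for user, total, messages in conn.execute(query)
}
print(summary)
```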
Try joining the table to its own grouped counts:
select s1.user, s2.cnt, s1.message
from statistics s1
join (
select user, count(*) cnt
from statistics
group by user
) s2 on s1.user = s2.user

SELECT visitors that have visited more than one place in a day along with the details

My MySQL table looks like this:
+---------+---------+------------+-----------------------+---------------------+
| visitId | userId | locationId | comments | time |
+---------+---------+------------+-----------------------+---------------------+
| 1 | 3 | 12 | It's a good day here! | 2012-12-12 11:50:12 |
+---------+---------+------------+-----------------------+---------------------+
| 2 | 3 | 23 | very beautiful | 2012-12-12 12:50:12 |
+---------+---------+------------+-----------------------+---------------------+
| 3 | 3 | 52 | nice | 2012-12-12 13:50:12 |
+---------+---------+------------+-----------------------+---------------------+
which records visitors' trajectories and some comments on the places visited.
I want to find visitors who visited more than one place in a day, along with the specific day and the places, not only the count.
I tried the subquery:
mysql> SELECT userId, locationId, time FROM visits
WHERE (userId,DATE(time)) in (
SELECT userId, DATE(time) from visits GROUP BY userId, DATE(time)
Having COUNT(*)>1);
And the joint query:
mysql> select v2.userId, v1.locationId, v1.time from visits as v1, visits as
v2 where v1.userId=v2.userId GROUP BY v2.userId, Date(v2.time) HAVING
COUNT(DISTINCT v2.locationId);
I am not sure whether the second one is correct, and both of them take too long. Any suggestions for what I should do?
UPDATE
mysql> SELECT t.userId, locationId, t.time FROM (
SELECT userId, time
FROM visits GROUP BY userId,Date(time)
HAVING COUNT(*) > 1) AS t, visits
WHERE t.userId=visits.userId AND t.time=visits.time;
I hope this makes my question clearer.
Your queries are including locationId, but your stated goal is to get user/date combos that had more than 1 visit in a day. Here's the sql to get that:
select userId, date(time), count(*)
from visits
group by userId, date(time)
having count(*) > 1;
Update:
To get all visits for user/day combos with more than one visit:
select *
from visits
where (userId, date(time)) in (
select userId, date(time)
from visits
group by userId, date(time)
having count(*) > 1);
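A small way to sanity-check the row-value IN query is to run it against an in-memory SQLite database from Python (SQLite 3.15 or newer supports row values). Visits 1 to 3 come from the question; visits 4 and 5 for a second user are hypothetical rows added to show that single-visit days are filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (visitId INTEGER PRIMARY KEY, userId INTEGER, "
    "locationId INTEGER, comments TEXT, time TEXT)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [
        (1, 3, 12, "It's a good day here!", "2012-12-12 11:50:12"),
        (2, 3, 23, "very beautiful", "2012-12-12 12:50:12"),
        (3, 3, 52, "nice", "2012-12-12 13:50:12"),
        # Hypothetical user 4: one visit per day, so no row should match.
        (4, 4, 12, "ok", "2012-12-12 09:00:00"),
        (5, 4, 23, "fine", "2012-12-13 09:00:00"),
    ],
)

# The subquery finds (user, day) pairs with more than one visit;
# the outer query returns every visit belonging to such a pair.
query = """
SELECT visitId, userId, locationId, DATE(time) AS day
FROM visits
WHERE (userId, DATE(time)) IN (
    SELECT userId, DATE(time)
    FROM visits
    GROUP BY userId, DATE(time)
    HAVING COUNT(*) > 1
)
ORDER BY visitId
"""
matches = conn.execute(query).fetchall()
print(matches)
```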
I think you'd be better off using a GROUP BY in a subquery with your count and joining it back to the table. From MySQL's count documentation, you can do something like:
mysql> SELECT v.userId, v.locationId, v.time, c.visitCount
FROM visits v
JOIN (SELECT userId, DATE(time) AS visitDate, COUNT(*) AS visitCount
FROM visits
GROUP BY userId, DATE(time)) c
ON c.userId = v.userId AND c.visitDate = DATE(v.time)
WHERE c.visitCount > 1;
I'd assume the slowness you're encountering comes from the HAVING and DISTINCT in your queries.