MySQL Query with the count, group by


Table: statistics
id | user | Message
----------------------
1 | user1 |message1
2 | user2 |message2
3 | user1 |message3
I am able to find the count of messages sent by each user using this query:
select user, count(*) from statistics group by user;
How can I show the message column data along with the count? For example:
user  | count | message
------|-------|---------
user1 | 2     | message1
      |       | message3
user2 | 1     | message2

It seems you want to show, for each user, the message count together with the messages that user sent.
If your MySQL version doesn't support window functions, you can emulate ROW_NUMBER() with a correlated subquery in the select list, then display the user and count only on the row where rn = 1.
CREATE TABLE T(
id INT,
user VARCHAR(50),
Message VARCHAR(100)
);
INSERT INTO T VALUES(1,'user1' ,'message1');
INSERT INTO T VALUES(2,'user2' ,'message2');
INSERT INTO T VALUES(3,'user1' ,'message3');
Query 1:
SELECT (case when rn = 1 then user else '' end) 'users',
(case when rn = 1 then cnt else '' end) 'count',
message
FROM (
select
t1.user,
t2.cnt,
t1.message,
(SELECT COUNT(*) from T tt WHERE tt.user = t1.user and t1.id >= tt.id) rn
from T t1
join (
select user, count(*) cnt
from T
group by user
) t2 on t1.user = t2.user
) t1
order by user,message
Results:
| users | count | message |
|-------|-------|----------|
| user1 | 2 | message1 |
| | | message3 |
| user2 | 1 | message2 |

select user, count(*) as 'total' , group_concat(message) from statistics group by user;

You could join the result of your group by with the full table (or vice versa)?
Or, depending on what you want, you could use group_concat() using \n as separator.

Use Group_concat
select user, count(0) as ct,group_concat(Message) from statistics group by user;
This will give you the messages in CSV format.
NOTE: By default, GROUP_CONCAT truncates its result at 1024 bytes in MySQL.
With multi-byte character sets that means fewer characters: as few as about 1024/3 for 3-byte utf8 and 1024/4 = 256 for utf8mb4.
You can use the group_concat_max_len system variable to raise the maximum length as needed, but take memory considerations into account in a production environment:
SET group_concat_max_len=100000000;
Update:
You can use any separator in group_concat
Group_concat(Message SEPARATOR '----')
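As a rough, runnable illustration of the GROUP_CONCAT approach, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL, which is an assumption made only so the example is self-contained. SQLite passes the separator as a second argument, group_concat(Message, '----'), where MySQL writes GROUP_CONCAT(Message SEPARATOR '----'). The function name messages_per_user is hypothetical.

```python
import sqlite3

def messages_per_user(separator="----"):
    """Return {user: (count, sorted messages)} for the sample statistics table."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute("CREATE TABLE statistics (id INTEGER, user TEXT, Message TEXT)")
    conn.executemany(
        "INSERT INTO statistics VALUES (?, ?, ?)",
        [(1, "user1", "message1"), (2, "user2", "message2"), (3, "user1", "message3")],
    )
    rows = conn.execute(
        # SQLite syntax; in MySQL: GROUP_CONCAT(Message SEPARATOR '----')
        "SELECT user, COUNT(*), group_concat(Message, ?) "
        "FROM statistics GROUP BY user",
        (separator,),
    ).fetchall()
    conn.close()
    # Split the concatenated string back into a list; sorted because the
    # concatenation order is not guaranteed without an explicit ordering.
    return {user: (cnt, sorted(joined.split(separator))) for user, cnt, joined in rows}

print(messages_per_user())
```

Each user comes back as a single row, so the count and all of that user's messages travel together. Splitting the string again only works if the separator never occurs inside a message.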

Try grouping with self-join:
select s1.user, s2.cnt, s1.message
from statistics s1
join (
select user, count(*) cnt
from statistics
group by user
) s2 on s1.user = s2.user

Related

Update duplicate email addresses on mysql database table

I have a large database with a little over 10k rows in my user table, and about 2700 of the email addresses are duplicated.
Basically, the application did not stop users from registering accounts with the same email address over and over again. I have manually cleaned up the addresses that occurred more than twice (there weren't many), but there are still 2700 email addresses that occur at least twice. So I want to update the duplicates and change the email address on the row with the smaller id, e.g. from "email@mail.com" to "1email@mail.com", basically prepending "1" to the duplicate addresses. I can select and display the duplicate email addresses, but I could not find a way to update only one of them and leave the other one untouched.
My table structure is: id, username, email, password.

If you do not have MySQL 8:
Here I am just prepending the id of the row to the email address:
UPDATE my_table JOIN (
SELECT email, MAX(id) AS max_id, COUNT(*) AS cnt FROM my_table
GROUP BY email
HAVING cnt > 1
) sq ON my_table.email = sq.email AND my_table.id <> sq.max_id
SET my_table.email = CONCAT( my_table.id, my_table.email)
;
See DB-Fiddle
The inner query:
SELECT email, MAX(id) AS max_id, COUNT(*) AS cnt FROM my_table
GROUP BY email
HAVING cnt > 1
looks for all emails that are duplicated (i.e. there is more than one row with the same email address) and computes the maximum id value for each such email address. For the sample data in my DB-Fiddle demo, it would return the following:
| email            | max_id | cnt |
| ---------------- | ------ | --- |
| emaila@dummy.com | 3      | 3   |
| emailb@dummy.com | 5      | 2   |
The above inner query is aliased as table sq.
Now if I join my_table with the above query as follows:
SELECT my_table.* from my_table join (
SELECT email, MAX(id) AS max_id, COUNT(*) AS cnt FROM my_table
GROUP BY email
HAVING cnt > 1
) sq on my_table.email = sq.email and my_table.id <> sq.max_id
I get:
| id  | email            |
| --- | ---------------- |
| 1   | emaila@dummy.com |
| 2   | emaila@dummy.com |
| 4   | emailb@dummy.com |
because I am selecting from my_table all rows that have duplicate email addresses (condition my_table.email = sq.email) except for the rows that have the highest value of id for each email address (condition my_table.id <> sq.max_id).
It is the ids from the above join whose email addresses are to be modified.
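To see the "keep the row with the highest id, prefix the others" rule in action, here is a minimal Python sketch using the standard-library sqlite3 module as a stand-in for MySQL (an assumption for the sake of a self-contained example). SQLite has no UPDATE ... JOIN, so the same condition is expressed with a correlated subquery instead. That is a swapped-in technique, not the statement from the answer. The function name prefix_duplicate_emails and the sample rows are hypothetical.

```python
import sqlite3

def prefix_duplicate_emails(rows):
    """rows: iterable of (id, email). Returns {id: email} after the update."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?)", rows)
    # SQLite has no UPDATE ... JOIN, so "this row is not the max id for its
    # email" is written as a correlated subquery. Rows with a unique email
    # are their own max id and are therefore left untouched.
    conn.execute(
        "UPDATE my_table SET email = id || email "
        "WHERE id <> (SELECT MAX(t2.id) FROM my_table AS t2 "
        "             WHERE t2.email = my_table.email)"
    )
    result = dict(conn.execute("SELECT id, email FROM my_table"))
    conn.close()
    return result

sample = [
    (1, "emaila@dummy.com"), (2, "emaila@dummy.com"), (3, "emaila@dummy.com"),
    (4, "emailb@dummy.com"), (5, "emailb@dummy.com"), (6, "unique@dummy.com"),
]
for row_id, email in sorted(prefix_duplicate_emails(sample).items()):
    print(row_id, email)
```

Rows 1, 2 and 4 get their id prepended, while rows 3, 5 and 6 keep their original address.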

WITH cte AS ( SELECT id,
email,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) rn
FROM sourcetable )
UPDATE sourcetable src, cte
SET src.email = CONCAT(rn - 1, src.email)
WHERE src.id = cte.id
AND cte.rn > 1;
fiddle
You wrote: "I want to update the duplicate email addresses and change the email address with a smaller id number".
If so, the ordering in the window function must be reversed:
WITH cte AS ( SELECT id,
email,
ROW_NUMBER() OVER (PARTITION BY email ORDER BY id DESC) rn
FROM sourcetable )
UPDATE sourcetable src, cte
SET src.email = CONCAT(rn - 1, src.email)
WHERE src.id = cte.id
AND cte.rn > 1;
fiddle

sum of count(*) for all rows in MySQL

I'm stuck with sum() query where I want the sum of count(*) values in all rows with group by.
Here is the query:
select
u.user_type as user,
u.count,
sum(u.count)
FROM
(
select
DISTINCT
user_type,
count(*) as count
FROM
users
where
(user_type = "driver" OR user_type = "passenger")
GROUP BY
user_type
) u;
Current Output:
----------------------------------
| user | count | sum |
----------------------------------
| driver | 58 | 90 |
----------------------------------
Expected Output:
----------------------------------
| user | count | sum |
----------------------------------
| driver | 58 | 90 |
| passenger | 32 | 90 |
----------------------------------
If I remove sum(u.count) from the query, the output looks like this:
--------------------------
| user | count |
--------------------------
| driver | 58 |
| passenger | 32 |
--------------------------

You need a subquery:
SELECT user_type,
Count(*) AS count,
(SELECT COUNT(*)
FROM users
WHERE user_type IN ("driver","passenger" )) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
GROUP BY user_type ;
Note that you don't need DISTINCT here.
OR
SELECT user_type,
Count(*) AS count,
c.sum
FROM users
CROSS JOIN (
SELECT COUNT(*) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
) as c
WHERE user_type IN ("driver","passenger" )
GROUP BY user_type ;

You can use WITH ROLLUP modifier:
select coalesce(user_type, 'total') as user, count(*) as count
from users
where user_type in ('driver', 'passenger')
group by user_type with rollup
This will return the same information but in a different format:
user | count
----------|------
driver | 32
passenger | 58
total | 90
db-fiddle
In MySQL 8 you can use COUNT() as window function:
select distinct
user_type,
count(*) over (partition by user_type) as count,
count(*) over () as sum
from users
where user_type in ('driver', 'passenger');
Result:
user_type | count | sum
----------|-------|----
driver | 32 | 90
passenger | 58 | 90
db-fiddle
or use CTE (Common Table Expressions):
with cte as (
select user_type, count(*) as count
from users
where user_type in ('driver', 'passenger')
group by user_type
)
select user_type, count, (select sum(count) from cte) as sum
from cte
db-fiddle

I would be tempted to ask: are you sure you need this at the DB level?
Unless you are working purely in the database layer, any processing of these results will be built into an application layer and will presumably require some form of looping through the results.
It could be easier, simpler, and more readable to run
SELECT user_type,
COUNT(*) AS count
FROM users
WHERE user_type IN ("driver", "passenger")
GROUP BY user_type
... and simply add up the total count in the application layer.
As pointed out by Juan in another answer, the DISTINCT is redundant, as the GROUP BY ensures that each resultant row is different.
Like Juan, I also prefer an IN here, rather than an OR condition, for the user_type, as I find it more readable. It also reduces the likelihood of confusion if you combine further AND conditions in the future.
As an aside, I would consider moving the names of the user types, "driver" and "passenger", into a separate user_types table and referencing them by an ID column from your users table.
N.B. If you absolutely do need this at the DB level, I would advocate using one of Paul's excellent options, or the CROSS JOIN approach proffered by Tom Mac, and by Juan as his second suggested solution.
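For what it's worth, the "add it up in the application layer" idea could look roughly like the following Python sketch. It uses the standard-library sqlite3 module with an in-memory table purely as an assumption so the example runs on its own; with MySQL you would swap in your own connection. The function name counts_with_total and the sample data (58 drivers, 32 passengers, 5 rows of another type) are hypothetical.

```python
import sqlite3

def counts_with_total(conn):
    """Return [(user_type, count, total)] with the total computed in Python."""
    rows = conn.execute(
        "SELECT user_type, COUNT(*) AS count FROM users "
        "WHERE user_type IN ('driver', 'passenger') "
        "GROUP BY user_type ORDER BY user_type"
    ).fetchall()
    total = sum(count for _, count in rows)  # the "sum" column, done in the app
    return [(user_type, count, total) for user_type, count in rows]

# Hypothetical sample data matching the counts in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, user_type TEXT)")
sample = ["driver"] * 58 + ["passenger"] * 32 + ["admin"] * 5
conn.executemany("INSERT INTO users (user_type) VALUES (?)", [(t,) for t in sample])
print(counts_with_total(conn))
```

The query stays a plain GROUP BY, and the grand total costs one extra pass over a two-row result.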

Try this. The inline view gets the overall total:
SELECT a.user_type,
count(*) AS count,
b.sum
FROM users a
JOIN (SELECT COUNT(*) as sum
FROM users
WHERE user_type IN ("driver","passenger" )
) b ON TRUE
WHERE a.user_type IN ("driver","passenger" )
GROUP BY a.user_type;

You could simply combine SUM() OVER() with COUNT(*):
SELECT user_type, COUNT(*) AS cnt, SUM(COUNT(*)) OVER() AS total
FROM users WHERE user_type IN ('driver', 'passenger') GROUP BY user_type;
db<>fiddle demo
Output:
+------------+------+-------+
| user_type | cnt | total |
+------------+------+-------+
| passenger | 58 | 90 |
| driver | 32 | 90 |
+------------+------+-------+

Add a group by clause at the end for user-type, e.g:
select
u.user_type as user,
u.count,
sum(u.count)
FROM
(
select
DISTINCT
user_type,
count(*) as count
FROM
users
where
(user_type = "driver" OR user_type = "passenger")
GROUP BY
user_type
) u GROUP BY u.user_type;

Tom Mac explained his answer properly. Here is another way you can do it.
I checked the query performance and found no difference with 1000 records.
select user_type,Countuser,(SELECT COUNT(*)
FROM users
WHERE user_type IN ('driver','passenger') )as sum from (
select user_type,count(*) as Countuser from users a
where a.user_type='driver'
group by a.user_type
union
select user_type,count(*) as Countuser from users b
where b.user_type='passenger'
group by b.user_type
)c
group by user_type,Countuser

Try this:
WITH SUB_Q AS (
SELECT USER_TYPE, COUNT(*) AS CNT
FROM USERS
WHERE USER_TYPE = 'passenger' OR USER_TYPE = 'driver'
GROUP BY USER_TYPE
),
SUB_Q2 AS (
SELECT SUM(CNT) AS SUM_OF_COUNT
FROM SUB_Q
)
SELECT A.USER_TYPE, A.CNT AS COUNT, B.SUM_OF_COUNT AS SUM
FROM SUB_Q A JOIN SUB_Q2 B ON (TRUE);
I used the PostgreSQL dialect, but you can easily rewrite the CTEs as subqueries.

select
u.user_type as user,
u.count,
sum(u.count)
FROM users group by user

SQL writing custom query

I need to write a SQL Query which generates the name of the most popular story for each user (according to total reading counts). Here is some sample data:
story_name | user | age | reading_counts
-----------|-------|-----|---------------
story 1 | user1 | 4 | 12
story 2 | user2 | 6 | 14
story 4 | user1 | 4 | 15
This is what I have so far but I don't think it's correct:
Select *
From mytable
where (story_name,reading_counts)
IN (Select id, Max(reading_counts)
FROM mytable
Group BY user
)

In a derived table, you can first determine the maximum reading_counts for every user (GROUP BY with MAX()).
Now, simply join this result set to the main table on user and reading_counts to get the row with the maximum reading_counts for each user.
Try the following query:
SELECT
t1.*
FROM mytable AS t1
JOIN
(
SELECT t2.user,
MAX(t2.reading_counts) AS max_count
FROM mytable AS t2
GROUP BY t2.user
) AS dt
ON dt.user = t1.user AND
dt.max_count = t1.reading_counts
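To check the derived-table join against the sample data from the question, here is a minimal Python sketch. It assumes the standard-library sqlite3 module as a stand-in for MySQL; the function name most_popular_story_per_user is hypothetical.

```python
import sqlite3

def most_popular_story_per_user(rows):
    """rows: (story_name, user, age, reading_counts). Returns {user: story_name}."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE mytable "
        "(story_name TEXT, user TEXT, age INTEGER, reading_counts INTEGER)"
    )
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        # Derived table dt holds each user's maximum reading_counts;
        # joining back on (user, reading_counts) recovers the full row.
        "SELECT t1.user, t1.story_name FROM mytable AS t1 "
        "JOIN (SELECT t2.user, MAX(t2.reading_counts) AS max_count "
        "      FROM mytable AS t2 GROUP BY t2.user) AS dt "
        "ON dt.user = t1.user AND dt.max_count = t1.reading_counts"
    ).fetchall()
    conn.close()
    return dict(result)

sample = [
    ("story 1", "user1", 4, 12),
    ("story 2", "user2", 6, 14),
    ("story 4", "user1", 4, 15),
]
print(most_popular_story_per_user(sample))
```

If two stories tie for a user's maximum, the join returns both rows. The dict in this sketch would keep only one of them, so handle ties explicitly if they matter.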

SELECT *
FROM mytable
WHERE (user, reading_counts) IN
(SELECT user, max(reading_counts)
FROM mytable
GROUP BY user)

Mysql: 3 types of users, row with multiple entries, cannot separate

I am working on an old MySQL database (not created by me). It has 300k users, and each user has either flag=0, flag=1, or both flag=1 and flag=0. The latter means that the user was flagged in the past and is currently no longer flagged. The table looks like:
user_id | log_date | action    | data   |
001     | 1-1-2002 | flip-flag | flag=0 |
002     | 2-2-2003 | flip-flag | flag=1 |
002     | 2-3-2003 | flip-flag | flag=0 |
003     | 3-3-2003 | flip-flag | flag=1 |
I am trying to create a list containing only the users that were flagged in the past and are no longer flagged (flag=1 and flag=0, user_id=002 in the table above). I tried:
select user_id, data
from table_name
where (data = 'flag=1' and data = 'flag=0')
limit 50;
but it does not return any result. Doing:
select user_id, data
from table_name
where data = 'flag=1' limit 50;
gives the list of all users flagged with flag=1 (current flagged and past flagged). Does anybody know what to do in this case?

Hmmm . . . I'm thinking aggregation and having:
select user_id
from table_name
group by user_id
having max(case when data = 'flag=1' then log_date end) <
max(case when data = 'flag=0' then log_date end);
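Here is a minimal Python sketch of the same conditional-aggregation idea, run against the sample rows from the question. It assumes the standard-library sqlite3 module as a stand-in for MySQL, and it assumes the dates are stored as ISO strings (YYYY-MM-DD) so that they compare correctly as text; strings such as 2-3-2003 would not. The function name formerly_flagged_users is hypothetical.

```python
import sqlite3

def formerly_flagged_users(rows):
    """rows: (user_id, log_date, action, data). Returns sorted user_ids."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE table_name "
        "(user_id TEXT, log_date TEXT, action TEXT, data TEXT)"
    )
    conn.executemany("INSERT INTO table_name VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        # A user qualifies when the latest flag=1 entry is older than the
        # latest flag=0 entry. If either kind is missing, its MAX is NULL,
        # the comparison is NULL, and the group is filtered out.
        "SELECT user_id FROM table_name GROUP BY user_id "
        "HAVING MAX(CASE WHEN data = 'flag=1' THEN log_date END) "
        "     < MAX(CASE WHEN data = 'flag=0' THEN log_date END) "
        "ORDER BY user_id"
    ).fetchall()
    conn.close()
    return [user_id for (user_id,) in result]

sample = [
    ("001", "2002-01-01", "flip-flag", "flag=0"),
    ("002", "2003-02-02", "flip-flag", "flag=1"),
    ("002", "2003-03-02", "flip-flag", "flag=0"),
    ("003", "2003-03-03", "flip-flag", "flag=1"),
]
print(formerly_flagged_users(sample))
```

Only user 002 has both kinds of entry with the flag=0 one being the more recent, so it is the only id returned.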

Just to be clear, the following query that you wrote will never return any data - the same row cannot have data = 'flag=1' and data='flag=0' which is why you'll have to use aggregation or a self join to get this right.
select user_id, data
from table_name
where (data = 'flag=1' and data = 'flag=0')
limit 50;
Another solution (probably less efficient than Gordon's version):
select t0.user_id
from table_name t0
where (select t1.log_date from table_name t1 where t1.user_id = t0.user_id
and t1.data = 'flag=1' order by 1 desc limit 1) <
(select t2.log_date from table_name t2 where t2.user_id = t0.user_id
and t2.data = 'flag=0' order by 1 desc limit 1)
group by t0.user_id

MySQL getting the lowest ID for a certain user -or- the ID of the entry with the highest urgency for each row

I have the following database
id | user | urgency | problem | solved
The information in there has different users, but these users all have multiple entries
1 | marco | 0 | MySQL problem | n
2 | marco | 0 | Email problem | n
3 | eddy | 0 | Email problem | n
4 | eddy | 1 | MTV doesn't work | n
5 | frank | 0 | out of coffee | y
What I want to do is this: Normally I would check everybody's oldest problem first. I use this query to get the ID's of the oldest problem.
select min(id) from db group by user
This gives me a list of the oldest problem IDs. But I want people to be able to make a certain problem more urgent. So for each user I want either the lowest ID or, if one of their problems has been marked as more urgent, the ID of the problem with the highest urgency.
Getting the max(urgency) won't give the ID of the problem, it will give me the max urgency.
To be clear: I want to get this as a result
row | id
0 | 1
1 | 4
The last entry should not be in the results since it's solved.

Select ...
From SomeTable As T
Join (
Select T1.User, Min( T1.Id ) As Id
From SomeTable As T1
Join (
Select T2.User, Max( T2.Urgency ) As Urgency
From SomeTable As T2
Where T2.Solved = 'n'
Group By T2.User
) As MaxUrgency
On MaxUrgency.User = T1.User
And MaxUrgency.Urgency = T1.Urgency
Where T1.Solved = 'n'
Group By T1.User
) As Z
On Z.User = T.User
And Z.Id = T.Id
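To try the nested-join idea on the sample rows from the question, here is a minimal Python sketch. It assumes the standard-library sqlite3 module as a stand-in for MySQL, and the table name problems and the function name next_problem_ids are hypothetical. It returns only the ids, i.e. the inner part of the query above aliased as Z.

```python
import sqlite3

def next_problem_ids(rows):
    """rows: (id, user, urgency, problem, solved). Returns sorted problem ids."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    conn.execute(
        "CREATE TABLE problems (id INTEGER PRIMARY KEY, user TEXT, "
        "urgency INTEGER, problem TEXT, solved TEXT)"
    )
    conn.executemany("INSERT INTO problems VALUES (?, ?, ?, ?, ?)", rows)
    result = conn.execute(
        # Inner query: highest urgency among each user's unsolved problems.
        # Outer query: the lowest (oldest) id at that urgency level.
        "SELECT MIN(t1.id) FROM problems AS t1 "
        "JOIN (SELECT user, MAX(urgency) AS urgency FROM problems "
        "      WHERE solved = 'n' GROUP BY user) AS mu "
        "ON mu.user = t1.user AND mu.urgency = t1.urgency "
        "WHERE t1.solved = 'n' GROUP BY t1.user ORDER BY 1"
    ).fetchall()
    conn.close()
    return [problem_id for (problem_id,) in result]

sample = [
    (1, "marco", 0, "MySQL problem", "n"),
    (2, "marco", 0, "Email problem", "n"),
    (3, "eddy", 0, "Email problem", "n"),
    (4, "eddy", 1, "MTV doesn't work", "n"),
    (5, "frank", 0, "out of coffee", "y"),
]
print(next_problem_ids(sample))
```

With this data the result is ids 1 and 4: marco's oldest problem, and eddy's more urgent one. Frank's only problem is already solved, so it is excluded.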

There are lots of esoteric ways to do this, but here's one of the clearer ones.
First build a query to get your min id and max urgency:
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
Then incorporate that as a logical table into a larger query for your answers:
SELECT
user,
min_id,
max_urgency,
( SELECT MIN(id) FROM db
WHERE user = a.user
AND urgency = a.max_urgency
) AS max_urgency_min_id
FROM
(
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
) AS a
Given the obvious indexes, this should be pretty efficient.

The following will get you exactly one row back -- the most urgent, probably oldest problem in your table.
select id from my_table where id = (
select min(id) from my_table where urgency = (
select max(urgency) from my_table
)
)
I was about to suggest adding a create_date column to your table so that you could get the oldest problem first for those problems of the same urgency level. But I'm now assuming you're using the lowest ID for that purpose.
But now I see you wanted a list of them. For that, you'd sort the results by ID:
select id from my_table where urgency = (
select max(urgency) from my_table
) order by id;
[Edit: Left out the order by!]
I forget, honestly, how to get the row number. Someone on the interwebs suggests something like this, but no idea if it works:
select @rownum:=@rownum+1 'row', id from my_table where ...
