Problem with this SQL that finds duplicate records

Please help me with a problem in this SQL, run in phpMyAdmin, that finds duplicate records for cleanup:
SELECT userid, MIN(logged_time),`logged_time`, count(email) as duplicated_count
FROM tbl_users
GROUP BY email HAVING count(email) > 1
ORDER BY `duplicated_count` DESC
Here is the GOAL:
I am trying to list the duplicate email records in a user database that also tracks when each account last used the system.
I want to identify the records with the oldest logged_time so that ONLY those duplicates can be deleted.
PROBLEM:
The SQL above WORKS to identify the duplicates, and it WORKS to display the oldest (MIN) login. BUT: The one userID it returns is NOT the user id with the oldest login.
In other words, the values returned come from different records among the duplicates for the same email.
The results look like:
userid  MIN(logged_time)     logged_time
111     2013-04-10 22:35:21  2017-10-01 04:17:49
So user "111" is not the one I want to delete! That userid matches the most RECENT logged time. I want the userid for the record that matches MIN(logged_time).
Thanks for the help! I know this may be confusing.
Update: my MySQL version is 5.5, so the code posted in the answer doesn't seem to work with it. Any other suggestions?

The one userID it returns is NOT the user id with the oldest login.
Selecting min(logged_time) does not say to only return the oldest login. It just gives you the oldest logged_time for each email.
What you asked for is invalid SQL, but MySQL allows it. It's invalid because it leads to exactly the trouble you're having.
Say we have this.
userid  email   logged_time
---------------------------
1       email1  2021-01-01
2       email2  2021-01-01
3       email1  2021-02-01
4       email3  2021-01-01
If we select userid, min(logged_time), count(email) from tbl_users group by email, MySQL can only show you one row per email. For email2 and email3 that's fine, there's only one option. But for email1 it has to choose either 1 or 3.
Normally this would make the query invalid and you would get an error. You can't select a non-aggregated column (userid) which is not in the group by.
But MySQL allows this. It picks an arbitrary userid from each group, with no guarantee that it comes from the row holding the minimum. And we get the problem you're having.
userid  min(logged_time)  count(email)
--------------------------------------
2       2021-01-01        1
3       2021-01-01        2
4       2021-01-01        1
See "MySQL Handling of GROUP BY" in the MySQL manual for more. I'd suggest you turn this behavior off by enabling the ONLY_FULL_GROUP_BY SQL mode.
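As a minimal sketch of that suggestion, the mode can be enabled for the current session only. Note the assumption spelled out in the comment: assigning sql_mode this way replaces any other modes the session had.

```sql
-- Sketch only. This assignment replaces every other mode set for the
-- current session; check SELECT @@SESSION.sql_mode first if you rely on them.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- With the mode enabled, MySQL should reject this query with an error
-- instead of guessing, because userid is neither aggregated nor in GROUP BY.
SELECT userid, MIN(logged_time), COUNT(email)
FROM tbl_users
GROUP BY email;
```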
What you're asking for is a bit tricky.
You need to both find emails with duplicates AND pick the oldest row for each email. A single GROUP BY can't do both; you need two queries joined together.
You know how to find duplicate emails.
-- Note that if you have two entries with the same userid and email
-- it will count that as a duplicate.
select email
from tbl_users
group by email
having count(email) > 1
To rank the rows for each email by their logged_time, use the row_number window function. (Window functions and the WITH clause used below require MySQL 8.0 or later.)
select
userid,
email,
logged_time,
row_number() over(partition by email order by logged_time asc) as login_order
from tbl_users
That will return all the rows, each ranked within its email by logged_time.
userid  email   logged_time  login_order
----------------------------------------
1       email1  2021-01-01   1
2       email2  2021-01-01   1
3       email1  2021-02-01   2
4       email3  2021-01-01   1
Join them together and select only where login_order = 1.
-- Get the emails with dups.
with users_with_duplicates as (
select email
from tbl_users
group by email
having count(email) > 1
),
-- Get the logins by email ranked by ascending logged_time
users_in_login_order as (
select
userid,
email,
logged_time,
row_number() over(partition by email order by logged_time asc) as login_order
from tbl_users
)
select
users_in_login_order.*
from users_in_login_order
-- The join will constrain users_in_login_order to only emails with dups
join users_with_duplicates
on users_in_login_order.email = users_with_duplicates.email
where login_order = 1
Try it.
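The question mentions MySQL 5.5, which has neither WITH nor window functions, so the query above will not run there. A sketch that should work on older versions, assuming the same tbl_users columns, joins each row back to the minimum logged_time of its email. The aliases u, d and oldest_logged_time are made up for illustration.

```sql
-- Sketch for MySQL versions without CTEs or window functions.
-- The derived table d keeps one row per duplicated email with its oldest login.
SELECT u.userid, u.email, u.logged_time
FROM tbl_users AS u
JOIN (
    SELECT email, MIN(logged_time) AS oldest_logged_time
    FROM tbl_users
    GROUP BY email
    HAVING COUNT(*) > 1
) AS d
  ON d.email = u.email
 AND d.oldest_logged_time = u.logged_time;
```

One caveat: if two rows for the same email share the exact same oldest logged_time, both are returned, so check the output before deleting anything.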

How to count number of user chat with another user

I tried a few MySQL statements, but none of them gave the result I expected.
How do I get the total number of users each to_user has chatted with, ordered from the lowest total?
Say, in this case:
id 7 chats with 2 users
id 6 chats with only 1 user
so the minimum will be id 6.
Can someone help me with the SQL statement?
This is my expected result:
count  to_user
1      7
2      6

I think your problem will be solved with the following code:
SELECT COUNT(DISTINCT(from_user)) AS total,to_user
FROM chats
GROUP BY to_user
ORDER BY total ASC

Make sure you index the junction in both directions, otherwise it will come back to bite you later!
ALTER TABLE messages
ADD INDEX idx_from_to (from_user, to_user),
ADD INDEX idx_to_from (to_user, from_user);
You may want to take into account the fact that the full list of chats for a given user is (from any user, to given user) UNION (to any user, from given user).
Consider user 1 in your sample data, who has sent messages to 5, 6, & 7 but not received any. And user 5 has sent to user 7 and received from user 1.
SELECT COUNT(DISTINCT(from_user)) AS count, to_user
FROM messages
GROUP BY to_user
ORDER BY count ASC
returns the following (which matches the expected result detailed in your question, errors aside)
count  to_user
1      5
1      6
2      7
whereas
SELECT COUNT(*) AS count, from_user AS user
FROM (
SELECT from_user, to_user FROM messages
UNION
SELECT to_user, from_user FROM messages
) t
GROUP BY from_user
ORDER BY count ASC;
returns
count  user
1      6
2      5
2      7
3      1
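To try both queries, here is a minimal sketch of a table that matches the description above (user 1 sent to 5, 6 and 7; user 5 sent to 7). The id column and the INT types are assumptions; only from_user and to_user appear in the queries.

```sql
-- Hypothetical schema: the id column and column types are assumed.
CREATE TABLE messages (
    id        INT AUTO_INCREMENT PRIMARY KEY,
    from_user INT NOT NULL,
    to_user   INT NOT NULL
);

-- Rows inferred from the description of the sample data.
INSERT INTO messages (from_user, to_user) VALUES
    (1, 5),
    (1, 6),
    (1, 7),
    (5, 7);
```

With these four rows, the first query counts one sender for users 5 and 6 and two for user 7, while the UNION query counts three chat partners for user 1. The order of rows with equal counts is not guaranteed.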

MySQL - Get the latest record for a given list of column values

I have the following table structure:
Request table:
id  insret_time          account_id
-----------------------------------
1   2018-04-05 08:06:23  abc
2   2018-09-03 08:14:45  abc
3   2018-08-13 09:23:34  xyz
4   2018-08-04 09:25:37  def
5   2018-08-24 11:45:37  def
I need to find the latest records for account IDs abc and def. I don't care about xyz.
I tried to get the results using GROUP BY and INNER JOIN, but could not limit the results to just the accounts I care about.
Please advise.
Update:
Thanks everyone for your feedback. I appreciate it! I needed the entire row as the output. I used the id column instead of the timestamp to get the latest record, since it's auto-incremented. This is what I finally came up with, which gave me the output I need:
select t.* FROM table t
join (select max(table.id) as maxNum from table
where account_id in ('abc','def') group by account_id) tm on t.id = tm.maxNum;

I think this is what you were looking for
select account_id,max(insret_time)
from table where account_id in ('abc', 'def')
group by account_id

You can use NOT IN to exclude the xyz records, and order the result by the latest time, descending:
select account_id, max(insert_time) from table where account_id not in ('xyz') group by account_id order by max(insert_time) desc
You can also use the != operator when there is just one value to exclude:
select account_id, max(insert_time) from table where account_id != 'xyz' group by account_id order by max(insert_time) desc
I hope it helps :)

MySQL group multiple rows based on DISTINCT value

I need to display the last 2 results from a table (results). Each result is made up of several rows with a matching submissionId. The number of rows per submission is unknown, and of course I would prefer a single query.
Here is the DB table structure
submissionId  input  value
1             name   jay
1             phone  123-4567
1             email  test#gmail.com
2             name   mo
2             age    32
3             name   abe
3             email  abe#gmail.com
4             name   jack
4             phone  123-4567
4             email  jack#gmail.com
Desired results:
submissionId  input  value
3             name   abe
3             email  abe#gmail.com
4             name   jack
4             phone  123-4567
4             email  jack#gmail.com
Or even better, if I can combine the rows like this:
3 name abe 3 email abe#gmail.com
4 name jack 4 phone 123-4567 4 email jack#gmail.com

One option here is to use a subquery to identify the most recent and next to most recent submissionId:
SELECT submissionId, input, value
FROM yourTable
WHERE submissionId >= (SELECT MAX(submissionId) FROM yourTable) - 1
ORDER BY submissionId
Demo here:
SQLFiddle
Update:
If your submissionId column were really a date type, and you wanted the most recent two dates in your result set, then the following query will achieve that. Note that the subquery in the WHERE clause, while ugly, is not correlated to the outer query. This means that the MySQL optimizer should be able to figure out that it only needs to run it once.
SELECT submissionDate, input, value
FROM yourTable
WHERE submissionDate >=
(SELECT MAX(CASE WHEN submissionDate = (SELECT MAX(submissionDate) FROM yourTable)
THEN '1000-01-01'
ELSE submissionDate
END) FROM yourTable)
ORDER BY submissionDate
SQLFiddle

You can use limit in subqueries in the from clause, so a typical way to write this is:
SELECT submissionDate, input, value
FROM t join
(select distinct submissionDate
from t
order by submissionDate desc
limit 2
) sd
on t.submissionDate = sd.submissionDate;
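The question also asks whether each submission's rows can be combined into one line. A minimal sketch, assuming the table is named results with the submissionId, input and value columns shown in the question, applies the same derived-table join and collapses each submission with MySQL's GROUP_CONCAT. The aliases latest and combined are made up for illustration.

```sql
-- Sketch only: one output row per submission for the two highest submissionIds.
SELECT t.submissionId,
       GROUP_CONCAT(CONCAT(t.`input`, ' ', t.`value`)
                    ORDER BY t.`input`
                    SEPARATOR ' ') AS combined
FROM results AS t
JOIN (
    SELECT DISTINCT submissionId
    FROM results
    ORDER BY submissionId DESC
    LIMIT 2
) AS latest
  ON latest.submissionId = t.submissionId
GROUP BY t.submissionId
ORDER BY t.submissionId;
```

The pairs inside each line come out in alphabetical order of input, not in insertion order, since the table has no column that records the latter. GROUP_CONCAT output is also truncated at group_concat_max_len (1024 by default), which matters for long submissions.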

This is what the query looks like now, so I can get the results with a LIMIT, RANGE, and id/timestamp (with the help of Tim and Gordon):
SELECT *
FROM myTable t
JOIN
(SELECT DISTINCT sd.submissionId
FROM myTable sd
WHERE sd.questionId = yourId
ORDER BY sd.submissionId DESC
LIMIT 2
) t2
ON t.submissionId = t2.submissionId
WHERE t.formId = yourId
AND dateTime BETWEEN 0000 AND 1111

update query problem

I have three tables in a MySQL database:
RECHARGE with these columns: rid, uid, res_id, agent_id, batch_id, rcard_order, serialno, email, units, bankid, paydate, slipno, rpin, amtpd, bonus, description, crSender, crSenderId, transaction_ref, rechargeDate, processed
SENT with these columns: sendid, uid, res_id, recipients, volume, ffdaily, message, sender, msgtype, flash, mob_field, wapurl, date
BILL with these columns: bid, uid, email, unitBals, lastusedate
The question is this: I want a query that subtracts the sum of volume in the SENT table from the sum of units in the RECHARGE table and uses the result to update the unitBals column in the BILL table, where the key joining the three tables is uid.
I used this query, but it does not give the same answer as when I compute sum(volume) and sum(units) separately and subtract them myself:
update bill set unitbals = (SELECT sum( recharge.units ) - sum( sent.volume )
FROM sent, recharge
WHERE sent.uid = recharge.uid)
where email = 'info#dunmininu.com'

There are two problems here. First, from the fact that you are using sum, I take it that there can be more than one Recharge record for a given Uid and more than one Sent record for a given Uid. If this is true, then when you do the join, you are not getting all the Recharges plus all the Sents, you are getting every combination of a Recharge and a Sent.
For example, suppose for a given Uid you have the following records:
Recharge:
Uid  Units
42   2
42   3
42   4
Sent:
Uid  Volume
42   1
42   6
Then a query
select recharge.units, sent.volume
from recharge, sent
where recharge.uid=sent.uid
will give
Units  Volume
2      1
2      6
3      1
3      6
4      1
4      6
So doing sum(units)-sum(volume) will give 18-21 = -3, whereas summing each table separately gives (2+3+4) - (1+6) = 9-7 = 2.
Also, you're doing nothing to connect the Uid of the Sent and Recharge to the Uid of the Bill. Thus, for any given Bill, you're processing records for ALL uids. The Uid of the Bill is never considered.
I think what you want is something more like:
update bill
set unitbals = (SELECT sum( recharge.units ) from recharge where recharge.uid=bill.uid)
- (select sum(sent.volume) from sent where sent.uid=bill.uid)
where email='info#dunmininu.com';
That is, take the sum of all the recharges for this uid, minus the sum of all the sents.
Note that this replaces the old value of Unitbals. It's also possible that you meant to say "unitbals=unitbals +" etc.
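One more thing to watch for: SUM() over zero rows returns NULL, so a uid with recharges but no sent messages (or the reverse) would end up with unitbals set to NULL. A hedged variant of the query above, using the same table and column names from the question, wraps each sum in COALESCE:

```sql
-- Sketch only: COALESCE turns the NULL that SUM() yields for "no rows" into 0,
-- so a missing side of the subtraction counts as zero instead of nulling the result.
UPDATE bill
SET unitbals =
      COALESCE((SELECT SUM(recharge.units) FROM recharge WHERE recharge.uid = bill.uid), 0)
    - COALESCE((SELECT SUM(sent.volume)    FROM sent     WHERE sent.uid     = bill.uid), 0)
WHERE email = 'info#dunmininu.com';
```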

I think you need a separate sum over each of the two tables, each matched to the bill on uid:
update bill
set unitbals =
( SELECT sum( recharge.units )
FROM recharge
WHERE bill.uid = recharge.uid
) -
( SELECT sum( sent.volume )
FROM sent
WHERE bill.uid = sent.uid
)
where email = 'info#dunmininu.com'

Counting the particular field from table

Can I get the count of rows for each value of a particular field in a table? For example, I'm using this query:
select id,retailer,email from tab
and I get this result set:
1  ret1  test1#test.com
2  ret2  test1#test.com
3  ret3  test1#test.com
4  ret1  test2#test.com
5  ret2  test2#test.com
6  ret6  test2#test.com
What I need is the count for each email, e.g. 3 for test1#test.com, and likewise for the others. Thanks.

This will give you the count for each email address in that table:
SELECT email, COUNT(*) FROM tab GROUP BY email;
If you want the count for only one particular address, use this:
SELECT COUNT(*) FROM tab WHERE email = 'test#example.com';

To count a single email:
select count(id)
from tab
where email = 'test1#test.com'
or to count all email values:
select email, count(email)
from tab
group by email

To group all of your emails together to count them:
SELECT email
, COUNT(*) AS 'count'
FROM `tab`
GROUP BY email
If you're looking for just a single email address:
SELECT email
, COUNT(*) AS 'count'
FROM `tab`
WHERE email = 'test#example.com'
