Query to find the duplicates between the name and number in table - mysql

SELECT count(*), lower(name), number
FROM tbl
GROUP BY lower(name), number
HAVING count(*) > 1;
Input table tb1:
slno | name | number
1 | aaa | 111
2 | Aaa | 111
3 | abb | 221
4 | Abb | 121
5 | cca | 131
6 | cca | 141
7 | abc | 222
8 | cse | 222
This query only finds rows where both the name (ignoring case) and the number match, so it cannot find the duplicate names in rows 3 and 4.
SELECT count(*), lower(name)
FROM tbl
GROUP BY lower(name)
HAVING count(lower(name)) > 1
This query finds all the duplicates in name, and it works perfectly.
SELECT count(*), number
FROM tbl
GROUP BY number
HAVING count(number) > 1
This query finds all the duplicates in number, and it works perfectly.
I want a query that finds all the duplicates in both name and number, regardless of whether the name is in lower case or upper case.
Desired output:
count | number | name
2 | 111 | aaa
2 | --- | abb
2 | --- | cca
2 | 222 | ---

Updated question
"Get duplicate on both number and name" ... "name and number as different column"
Rows can be counted twice here!
SELECT lower(name), NULL AS number, count(*) AS ct
FROM tbl
GROUP BY lower(name)
HAVING count(*) > 1
UNION ALL
SELECT NULL, number, count(*) AS ct
FROM tbl
GROUP BY number
HAVING count(*) > 1;
Original question
The problem is that the query groups by
GROUP BY lower(name), number
Since rows 3 and 4 have different numbers, this query does not treat them as duplicates.
If you want to ignore the number for this query, try something like:
SELECT lower(name)
, count(*) AS ct
FROM tbl
GROUP BY lower(name)
HAVING count(*) > 1;
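As a rough check of the UNION ALL query from the updated answer, here is a minimal sketch that runs it against the sample rows from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only to keep the example self-contained), and the helper name find_duplicates is illustrative, not something from the original posts.

```python
import sqlite3

# Sample rows from the question: (slno, name, number)
ROWS = [
    (1, "aaa", 111), (2, "Aaa", 111), (3, "abb", 221), (4, "Abb", 121),
    (5, "cca", 131), (6, "cca", 141), (7, "abc", 222), (8, "cse", 222),
]

# Same shape as the UNION ALL query above: one branch per column.
QUERY = """
SELECT lower(name) AS name, NULL AS number, count(*) AS ct
FROM tbl
GROUP BY lower(name)
HAVING count(*) > 1
UNION ALL
SELECT NULL, number, count(*)
FROM tbl
GROUP BY number
HAVING count(*) > 1
"""


def find_duplicates(rows):
    """Return (name, number, count) tuples for duplicated names and numbers."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute("CREATE TABLE tbl (slno INTEGER, name TEXT, number INTEGER)")
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in find_duplicates(ROWS):
        print(row)
```

Each branch reports its own duplicates, so a row such as slno 1 contributes to both the 'aaa' line and the 111 line, which is the double counting mentioned above.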

With a little work we can show counts for both name and number in one column:
select NameOrNumber, count(*) as Count
from (
select name as NameOrNumber from tb1
union all
select number from tb1
) a
group by NameOrNumber
having count(NameOrNumber) > 1
Output #1:
| NAMEORNUMBER | COUNT |
------------------------
| 111 | 2 |
| aaa | 2 |
| abb | 2 |
| cca | 2 |
If you want the output in separate columns, you can do something like this:
select distinct if(t1.name = t2.name, t1.name, null) as DUPLICATE_Name,
if(t1.number = t2.number, t1.number, null) as DUPLICATE_Number
from tb1 t1
inner join tb1 t2 on (t1.name = t2.name or t1.number = t2.number)
and t1.slno <> t2.slno
Output #2:
| DUPLICATE_NAME | DUPLICATE_NUMBER |
-------------------------------------
| Aaa | 111 |
| Abb | (null) |
| cca | (null) |

Related

Get latest record from each group and number of group rows in MySQL

I got a table like this:
id | column_a | column_value
1 | x | 5
2 | y | 7
3 | z | 4,7
4 | x | 3,6
5 | y | 2
6 | w | 5,8,9,11
I would like to get the column_value from the latest record in each group AND the number of rows in that group.
So the result should be this:
count(id) | column_value
2 | 3,6
2 | 2
1 | 4,7
1 | 5,8,9,11
I tried to reach this in the following two ways:
select count(id), column_value
from table
group by column_a
This version returns the first record from each group, so it does not work for me.
select count(id), column_value
from table
where id in (select max(id)
from table
group by column_a)
This version is also wrong, because count does not work as intended without GROUP BY.
I cannot figure out how to combine the advantages of the two versions.
Any help is appreciated.

Try this
Select cnt, column_value
from tst t inner join (
Select column_a, count(id) cnt, max(id) as max_id
from tst
group by column_a ) x on (t.column_a= x.column_a and t.id = x.max_id)
order by cnt desc
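To see the join against the per-group MAX(id) at work on the question's data, here is a minimal sketch. It runs in Python's built-in sqlite3 module rather than MySQL (an assumption for the sake of a self-contained example); the table name tst comes from the answer, while latest_per_group and the extra t.id tie-breaker in ORDER BY are illustrative additions.

```python
import sqlite3

# Sample rows from the question: (id, column_a, column_value)
ROWS = [
    (1, "x", "5"), (2, "y", "7"), (3, "z", "4,7"),
    (4, "x", "3,6"), (5, "y", "2"), (6, "w", "5,8,9,11"),
]

QUERY = """
SELECT x.cnt, t.column_value
FROM tst AS t
JOIN (
    -- one row per group: its size and the id of its latest record
    SELECT column_a, count(id) AS cnt, max(id) AS max_id
    FROM tst
    GROUP BY column_a
) AS x ON t.column_a = x.column_a AND t.id = x.max_id
ORDER BY x.cnt DESC, t.id  -- t.id only makes the order deterministic
"""


def latest_per_group(rows):
    """Return (group size, latest column_value) for every column_a group."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute("CREATE TABLE tst (id INTEGER, column_a TEXT, column_value TEXT)")
    con.executemany("INSERT INTO tst VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for cnt, value in latest_per_group(ROWS):
        print(cnt, "|", value)
```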

Select all rows with multiple occurrences - on same day

I have a single MySQL table with the name 'checkins' and 4 columns.
id | userIDFK | checkin_datetime | shopId
------------------------------------------------
1 | 1 | 2018-01-18 09:44:00 | 3
2 | 2 | 2018-01-18 10:32:00 | 3
3 | 3 | 2018-01-18 11:19:00 | 3
4 | 1 | 2018-01-18 17:57:00 | 3
5 | 1 | 2018-01-18 16:31:00 | 1
6 | 1 | 2018-01-19 08:31:00 | 3
Basically I want to find rows where users have checked in more than once (>=2) on the same day at the same shop. For instance, if a user checks in as in the rows with ids 1 and 4 (same user, same day, same shop), the query should return the entire rows (id, userIDFK, checkin_datetime, shopId). Hope this makes sense.
I already tried using
SELECT id, userIDFK, checkin_datetime, shopId
FROM (
SELECT * FROM 'checkins' WHERE COUNT(userIDFK)>=2 AND COUNT(shopId)>=2
)
I have no clue how to do the same-day part, and I know this query is way off, but this is the best I could do.

You can try grouping by userIDFK, the check-in date and shopId:
SELECT userIDFK, DATE(checkin_datetime) AS checkin_date, shopId, COUNT(*) AS checkins
FROM checkins
GROUP BY userIDFK, DATE(checkin_datetime), shopId
HAVING COUNT(*) > 1
EDIT
You can include a subquery to get all lines:
select b.id, b.userIDFK, b.checkin_datetime, b.shopId
from checkins b
where (SELECT COUNT(shopId)
FROM checkins a
where a.userIDFK = b.userIDFK
and date(a.checkin_datetime) = date(b.checkin_datetime)
and a.shopId = b.shopId
GROUP BY userIDFK, DATE(checkin_datetime), shopId) > 1

GROUP BY can be used to find the multiple occurrences (this returns one row per group, not every matching row).
SELECT id, userIDFK, checkin_datetime, shopId
FROM checkins
GROUP BY userIDFK, DATE(checkin_datetime), shopId
HAVING count(id) > 1;
Hope it helps!
EDIT:
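Using an inner join you can get the entire rows. DISTINCT keeps a row from being returned more than once when a user has three or more matching check-ins. Here is the query:
SELECT DISTINCT c1.* FROM checkins c1 INNER JOIN checkins c2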
ON c1.userIDFK = c2.userIDFK
AND date(c1.checkin_datetime) = date(c2.checkin_datetime)
AND c1.shopId = c2.shopId
AND c1.id != c2.id
Cheers!!
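For comparison, here is a minimal sketch of a variant that joins each row to the grouped (user, day, shop) result, so every qualifying row comes back exactly once. This is a different formulation from the self-join above, not the answerers' exact query. It runs in Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; SQLite's date() plays the role of MySQL's DATE() here), and repeated_checkins is an illustrative name.

```python
import sqlite3

# Sample rows from the question: (id, userIDFK, checkin_datetime, shopId)
ROWS = [
    (1, 1, "2018-01-18 09:44:00", 3),
    (2, 2, "2018-01-18 10:32:00", 3),
    (3, 3, "2018-01-18 11:19:00", 3),
    (4, 1, "2018-01-18 17:57:00", 3),
    (5, 1, "2018-01-18 16:31:00", 1),
    (6, 1, "2018-01-19 08:31:00", 3),
]

QUERY = """
SELECT c.id, c.userIDFK, c.checkin_datetime, c.shopId
FROM checkins AS c
JOIN (
    -- (user, day, shop) combinations that occur more than once
    SELECT userIDFK, date(checkin_datetime) AS checkin_date, shopId
    FROM checkins
    GROUP BY userIDFK, date(checkin_datetime), shopId
    HAVING count(*) > 1
) AS d
  ON c.userIDFK = d.userIDFK
 AND date(c.checkin_datetime) = d.checkin_date
 AND c.shopId = d.shopId
ORDER BY c.id
"""


def repeated_checkins(rows):
    """Return full rows for users with 2+ check-ins on one day at one shop."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute(
        "CREATE TABLE checkins "
        "(id INTEGER, userIDFK INTEGER, checkin_datetime TEXT, shopId INTEGER)"
    )
    con.executemany("INSERT INTO checkins VALUES (?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in repeated_checkins(ROWS):
        print(row)
```

On the sample data only ids 1 and 4 qualify: id 5 is the same user and day but a different shop, and id 6 is a different day.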

Can you include totals and detail rows in a single mySQL query?

None of the questions on file seems to be about my precise problem.
Is it possible, in a single query, to order detail lines by the number of detail lines in the order they belong to?
Consider the following simplified table:
Order number Article number
------------ --------------
123 1
123 2
123 3
234 1
234 2
345 1
456 1
456 2
456 3
456 4
456 5
The number of detail lines for each order would be
Order number Number of lines
------------ ---------------
123 3
234 2
345 1
456 5
Is it possible to select the order number and article number, sorted in descending order by the total number of detail lines in each order? In other words, the desired results are
Order number Article number
------------ --------------
456 1
456 2
456 3
456 4
456 5
123 1
123 2
123 3
234 1
234 2
345 1
I can do it with multiple queries and temporary tables or extra columns. Neither simple SELECTS, SELF JOINs nor UNIONs appear to give me the results I want. Is it possible to do with a single query?

This query should work as it's getting the count from the join and then ordering it by the count.
select t1.orderNumber, t1.articleNumber from myTable t1
inner join
(
select orderNumber, count(articleNumber) as count from myTable
group by orderNumber
) t2
on t1.orderNumber = t2.orderNumber
order by t2.count desc, t1.orderNumber, t1.articleNumber
To explain it better:
We first select all the data, then inner join a derived table that holds the count for each order number. Once we have that, we can order by the count DESC so the order number with the highest count comes first, and then add any additional sorting in the ORDER BY.
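The steps just described can be tried out with a minimal sketch like the one below. It uses Python's built-in sqlite3 module in place of MySQL (an assumption for a self-contained example); myTable, orderNumber and articleNumber follow the answer, while the alias line_count and the helper lines_by_order_size are illustrative.

```python
import sqlite3

# Sample rows from the question: (orderNumber, articleNumber)
ROWS = (
    [(123, n) for n in (1, 2, 3)]
    + [(234, n) for n in (1, 2)]
    + [(345, 1)]
    + [(456, n) for n in (1, 2, 3, 4, 5)]
)

QUERY = """
SELECT t1.orderNumber, t1.articleNumber
FROM myTable AS t1
JOIN (
    -- number of detail lines per order
    SELECT orderNumber, count(articleNumber) AS line_count
    FROM myTable
    GROUP BY orderNumber
) AS t2 ON t1.orderNumber = t2.orderNumber
ORDER BY t2.line_count DESC, t1.orderNumber, t1.articleNumber
"""


def lines_by_order_size(rows):
    """Return detail lines, largest orders first."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute("CREATE TABLE myTable (orderNumber INTEGER, articleNumber INTEGER)")
    con.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for order_number, article_number in lines_by_order_size(ROWS):
        print(order_number, article_number)
```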

For future readers: MySQL 8.0 and later support window functions, so you can avoid the intermediate subquery and join by using COUNT() OVER(partition by ...) like this:
select t1.orderNumber, t1.articleNumber f
from myTable t1
order by
count(*) over(partition by t1.orderNumber) DESC
, t1.orderNumber
, t1.articleNumber
;
orderNumber | f
----------: | -:
456 | 1
456 | 2
456 | 3
456 | 4
456 | 5
123 | 1
123 | 2
123 | 3
234 | 1
234 | 2
345 | 1
ps: The example above was run in MariaDB 10.2 to show MySQL(ish) syntax at work (and it's standards-compliant SQL anyway).
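The window-function ordering can also be tried without a MySQL or MariaDB server. The sketch below assumes Python's built-in sqlite3 module linked against SQLite 3.25 or newer (the first release with window functions); the helper name lines_by_window_count is illustrative.

```python
import sqlite3

# Sample rows from the question: (orderNumber, articleNumber)
ROWS = (
    [(123, n) for n in (1, 2, 3)]
    + [(234, n) for n in (1, 2)]
    + [(345, 1)]
    + [(456, n) for n in (1, 2, 3, 4, 5)]
)

# No derived table: the per-order count is computed by a window function
# directly in ORDER BY (requires SQLite 3.25+ in this stand-in setup).
QUERY = """
SELECT orderNumber, articleNumber
FROM myTable
ORDER BY count(*) OVER (PARTITION BY orderNumber) DESC,
         orderNumber,
         articleNumber
"""


def lines_by_window_count(rows):
    """Return detail lines ordered by the size of their order, largest first."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL 8 here
    con.execute("CREATE TABLE myTable (orderNumber INTEGER, articleNumber INTEGER)")
    con.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for order_number, article_number in lines_by_window_count(ROWS):
        print(order_number, article_number)
```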

If I understand the problem correctly, following query should work:
select t1.order_number, t1.Article_number
from t1
inner join (select order_number,count(*) as cnt from t1 group by order_number) as t2
on t1.order_number = t2.order_number
order by t2.cnt desc,t1.order_number,t1.Article_number;
Hope it helps!

retrieve value of maximum occurrence in a table

I have a very complicated problem. Let me first explain what I am doing right now:
I have a table named feedback in which I store grades against course ids. The table looks like this:
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 6 | 12 | B | 3 | 3 | 3
| 7 | 11 | B+ | 2 | 7 | 8
| 8 | 11 | A+ | 1 | 1 | 2
g_point has just specific values for the grades, thus I can use these values to show the user courses sorted by grades.
Okay, now first my task is to print out the grade of each course. The grade can be calculated by the maximum occurrence against each course. For example from this table we can see the result of cid = 10 will be A+, because it is present two times there. This is simple. I have already implemented this query which I will write here in the end.
The main problem is a course like cid = 11, where two different grades occur equally often. In that situation the client asks me to take the average of workload and easiness for each of these grades, and whichever grade has the greater average should be shown. The average would be computed like this:
(average of all workload values of the grade against the course
+ average of all easiness values of the grade against the course)
/ 2
In this example cid = 11 has four entries, with an equal number of B+ and A+ grades:
B+ grade average
avg workload x = (2 + 7)/2 = 4.5
avg easiness y = (3 + 8)/2 = 5.5
answer (x + y)/2 = 5
A+ grade average
avg workload x = (5 + 1)/2 = 3
avg easiness y = (4 + 2)/2 = 3
answer (x + y)/2 = 3
so the grade should be B+.
This is the query which I am running to get the max occurrence grade
SELECT
f3.coursecodeID cid,
f3.grade_point p,
f3.grade g
FROM (
SELECT
coursecodeID,
MAX(mode_qty) mode_qty
FROM (
SELECT
coursecodeID,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f1
GROUP BY coursecodeID
) f2
INNER JOIN (
SELECT
coursecodeID,
grade_point,
grade,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f3
ON
f2.coursecodeID = f3.coursecodeID AND
f2.mode_qty = f3.mode_qty
GROUP BY f3.coursecodeID
ORDER BY f3.grade_point

I added a table Courses with the list of all course IDs, to make the main idea of the query easier to see. Most likely you have it in the real database. If not, you can generate it on the fly from feedback by grouping by cid.
For each cid we need to find the grade. Group feedback by cid, grade to get a list of all grades for the cid. We need to pick only one grade for a cid, so we use LIMIT 1. To determine which grade to pick we order them. First, by occurrence (a simple COUNT). Second, by the average score. Finally, if there are several grades that have the same occurrence and the same average score, pick the grade with the smallest g_point. You can adjust the rules by tweaking the ORDER BY clause.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
FROM courses
ORDER BY courses.cid
result set
cid CourseGrade
10 A+
11 B+
12 B
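The ordering rules above can be checked against the question's data with a minimal sketch like this one. It runs in Python's built-in sqlite3 module rather than MySQL (an assumption for a self-contained example), builds the course list on the fly from feedback instead of using a separate courses table, and uses min(g_point) as the final tie-breaker so that every ORDER BY term is an aggregate. The helper name course_grades is illustrative.

```python
import sqlite3

# Sample rows from the question: (id, cid, grade, g_point, workload, easiness)
ROWS = [
    (1, 10, "A+", 1, 5, 4), (2, 10, "A+", 1, 2, 4), (3, 10, "B", 3, 3, 3),
    (4, 11, "B+", 2, 2, 3), (5, 11, "A+", 1, 5, 4), (6, 12, "B", 3, 3, 3),
    (7, 11, "B+", 2, 7, 8), (8, 11, "A+", 1, 1, 2),
]

QUERY = """
SELECT c.cid,
       (SELECT f.grade
        FROM feedback AS f
        WHERE f.cid = c.cid
        GROUP BY f.grade
        ORDER BY count(*) DESC,                                   -- most frequent grade
                 (avg(f.workload) + avg(f.easiness)) / 2 DESC,    -- then best average
                 min(f.g_point)                                   -- then smallest g_point
        LIMIT 1) AS course_grade
FROM (SELECT DISTINCT cid FROM feedback) AS c   -- stand-in for a courses table
ORDER BY c.cid
"""


def course_grades(rows):
    """Return (cid, grade) with one chosen grade per course."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute(
        "CREATE TABLE feedback (id INTEGER, cid INTEGER, grade TEXT, "
        "g_point INTEGER, workload INTEGER, easiness INTEGER)"
    )
    con.executemany("INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for cid, grade in course_grades(ROWS):
        print(cid, grade)
```

For cid = 11 both grades occur twice, so the average decides: B+ scores (4.5 + 5.5)/2 = 5 against 3 for A+.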
UPDATE
Older MySQL versions don't have lateral joins (LATERAL derived tables only arrived in MySQL 8.0.14), so one possible way to get the second column g_point is to repeat the correlated sub-query.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
FROM courses
ORDER BY CourseGPoint
result set
cid CourseGrade CourseGPoint
10 A+ 1
11 B+ 2
12 B 3
Update 2: Added average score into ORDER BY
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
,(
SELECT (AVG(workload) + AVG(easiness))/2
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS AvgScore
FROM courses
ORDER BY CourseGPoint, AvgScore DESC
result
cid CourseGrade CourseGPoint AvgScore
10 A+ 1 3.75
11 B+ 2 5
12 B 3 3

If I understood well you need an inner select to find the average, and a second outer select to find the maximum values of the average
select cid, grade, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid, grade
This solution has been tested on your data using SQL Fiddle.
If you change the previous query to
select cid, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid
You will find the max average for each cid.
As mentioned in the comments, you have to choose which strategy to use if more than one grade meets the max average. For example if you have
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 9 | 11 | C | 1 | 3 | 6
You will have grades A+ and C both satisfying the maximum average of 4.5 for cid 11.

Mysql - Select at least one or select none

I have a table as so...
----------------------------------------
| id | name | group | number |
----------------------------------------
| 1 | joey | 1 | 2 |
| 2 | keidy | 1 | 3 |
| 3 | james | 2 | 2 |
| 4 | steven | 2 | 5 |
| 5 | jason | 3 | 2 |
| 6 | shane | 3 | 3 |
----------------------------------------
I'm running a select like so:
SELECT * FROM table WHERE number IN (2,3);
The problem I'm trying to solve is that I only want results from groups that have one or more rows for each number. For instance, the above query returns ids 1, 2, 3, 5 and 6, but I'd like the results to exclude id 3: group 2 has a row with the number 2 but no row with the number 3, so I'd like it not to select id 3 at all.
Any help would be great.

Try it this way
SELECT *
FROM table1 t
WHERE number IN(2, 3)
AND EXISTS
(
SELECT *
FROM table1
WHERE number IN(2, 3)
AND `group` = t.`group`
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
)
or
SELECT *
FROM table1 t JOIN
(
SELECT `group`
FROM table1
WHERE number IN(2, 3)
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
) q
ON t.`group` = q.`group`;
or
SELECT *
FROM table1
WHERE `group` IN
(
SELECT `group`
FROM table1
WHERE number IN(2, 3)
GROUP BY `group`
HAVING MAX(number = 2) > 0
AND MAX(number = 3) > 0
);
Sample output (for all three queries):
| ID | NAME | GROUP | NUMBER |
|----|-------|-------|--------|
| 1 | joey | 1 | 2 |
| 2 | keidy | 1 | 3 |
| 5 | jason | 3 | 2 |
| 6 | shane | 3 | 3 |

You can approach this with multiple joins for exactly what you WANT qualified, OR apply a prequery to get all qualified groups as others have suggested, though I find that a bit less readable.
Anyhow, here's an approach that uses joins instead of a prequery:
select DISTINCT
T.id,
T.Name,
T.`Group`,
T.Number
from
YourTable T
Join YourTable T2
on T.`Group` = T2.`Group` AND T2.Number = 2
Join YourTable T3
on T.`Group` = T3.`Group` AND T3.Number = 3
where
T.Number IN ( 2, 3 )
So each record is joined through its own group to a T2 row whose number is specifically 2, and then again to a T3 row in the same group whose number is 3.
If the join to either the T2 or T3 instance cannot be completed, the record is dropped from consideration. Since indexes work great for joins like this, make sure you have one index on your NUMBER criteria and another index on (GROUP, NUMBER) for those comparisons and for the next query sample.
If you need to match more than these two numbers, prequery the qualified groups and then join to that:
select
YT2.*
from
( select YT1.`group`
from YourTable YT1
where YT1.Number in (2, 3)
group by YT1.`group`
having count( DISTINCT YT1.Number ) = 2 ) PreQualified
JOIN YourTable YT2
on PreQualified.`group` = YT2.`group`
AND YT2.Number in (2,3)

Maybe this, if I understand you:
SELECT id FROM table WHERE `group` IN
(SELECT `group` FROM table WHERE number IN (2,3)
GROUP BY `group`
HAVING COUNT(DISTINCT number)=2)
This will return all ids in groups where BOTH numbers exist. If you remove DISTINCT, a group with two rows of the same number (for example 2 and 2) would also match.
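The COUNT(DISTINCT number) idea can be tried on the question's table with a minimal sketch like the following. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for a self-contained example). The table name people and the helper ids_in_complete_groups are illustrative, the reserved word group is quoted with double quotes as SQLite expects, and the outer number IN (2, 3) filter is added to match the asker's original SELECT.

```python
import sqlite3

# Sample rows from the question: (id, name, group, number)
ROWS = [
    (1, "joey", 1, 2), (2, "keidy", 1, 3),
    (3, "james", 2, 2), (4, "steven", 2, 5),
    (5, "jason", 3, 2), (6, "shane", 3, 3),
]

QUERY = """
SELECT id
FROM people
WHERE number IN (2, 3)
  AND "group" IN (
      -- groups that contain both number 2 and number 3
      SELECT "group"
      FROM people
      WHERE number IN (2, 3)
      GROUP BY "group"
      HAVING count(DISTINCT number) = 2
  )
ORDER BY id
"""


def ids_in_complete_groups(rows):
    """Return ids with number 2 or 3 whose group has both numbers."""
    con = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
    con.execute(
        'CREATE TABLE people (id INTEGER, name TEXT, "group" INTEGER, number INTEGER)'
    )
    con.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)
    result = [row[0] for row in con.execute(QUERY)]
    con.close()
    return result


if __name__ == "__main__":
    print(ids_in_complete_groups(ROWS))
```

Group 2 has a row with number 2 but none with number 3, so id 3 is left out, which matches the sample output shown earlier.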
