MySQL Group where any 3 of 5 columns match - mysql

I am searching an addresses table for duplicates, using SOUNDEX to find them. This works fine, but it requires all 5 SOUNDEX columns to match in order to group.
However, I want to GROUP where ANY 3 of my 5 SOUNDEX columns match.
Here is my current query:
SELECT `Address`.`id`,
SOUNDEX(`Address`.`address_company_name`) as soundex_address_company_name,
SOUNDEX(`Address`.`contact_name`) as soundex_contact_name,
SOUNDEX(`Address`.`street_address`) as soundex_street_address,
SOUNDEX(`Address`.`suburb`) as soundex_suburb,
SOUNDEX(`Address`.`city`) as soundex_city,
`Address`.`address_country_id`,
`Address`.`address_zone_id`,
`Address`.`postcode`,
COUNT(*)
FROM
`addresses` AS `Address`
WHERE
((`Address`.`address_company_name` IS NOT NULL)
OR (`Address`.`contact_name` IS NOT NULL))
GROUP BY
SOUNDEX(address_company_name),
SOUNDEX(contact_name),
SOUNDEX(street_address),
SOUNDEX(suburb),
SOUNDEX(city),
address_country_id,
address_zone_id,
postcode
HAVING
COUNT(*) > 1
I understand how to do this with multiple queries, i.e. loop through each address in our database and then re-query the database for addresses which match any 3 of the 5 columns. However, I am hoping to do this in fewer queries, as the above query executes very quickly.
I also understand that, were this possible, some records may be grouped multiple times. I don't mind if this is the case, but I am unsure whether this flies in the face of MySQL logic.

You can try something like this
SELECT a.id, b.id id2, COUNT(*) no_matches
FROM
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) a CROSS JOIN
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) b
WHERE a.column_id = b.column_id -- compare like with like (e.g. suburb with suburb)
AND a.column_value = b.column_value
AND a.id < b.id
GROUP BY a.id, b.id
HAVING COUNT(*) > 2
Sample output:
| ID | ID2 | NO_MATCHES |
|----|-----|------------|
| 1 | 2 | 4 |
| 4 | 5 | 3 |
Here is SQLFiddle demo
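For comparison, the same "any 3 of 5" test can also be written as a column-by-column self-join that adds up boolean comparisons. The following is only a rough sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL. The codes table, its columns c1..c5 and the sample values are made up for illustration, and the precomputed codes stand in for SOUNDEX(), which a stock SQLite build usually does not provide.

```python
import sqlite3

# Hypothetical table: c1..c5 hold precomputed sound codes for the five columns.
COLUMNS = ["c1", "c2", "c3", "c4", "c5"]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codes (id INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in COLUMNS)
    + ")"
)
conn.executemany(
    "INSERT INTO codes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "A251", "J500", "M500", "P525", "A245"),
        (2, "A251", "J500", "M500", "P525", "W452"),  # 4 of 5 equal to id 1
        (3, "B620", "S530", "Q500", "R163", None),    # matches nobody
        (4, "T600", "K400", "H200", "N100", "D500"),
        (5, "T600", "K400", "H200", "X000", "Y000"),  # 3 of 5 equal to id 4
    ],
)

# Each comparison yields 1 or 0 (NULL is turned into 0), so the sum is the
# number of columns on which the two rows agree.
match_count = " + ".join(f"COALESCE(a.{c} = b.{c}, 0)" for c in COLUMNS)

pairs = conn.execute(
    f"""
    SELECT a.id, b.id, {match_count} AS no_matches
    FROM codes a
    JOIN codes b ON a.id < b.id
    WHERE {match_count} >= 3
    ORDER BY a.id, b.id
    """
).fetchall()

print(pairs)  # [(1, 2, 4), (4, 5, 3)]
```

In MySQL a comparison also evaluates to 1 or 0, so the same expression should carry over with SOUNDEX(...) in place of the code columns. Note that a self-join compares every pair of rows, so it may be slow on a large table.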

Related

How to delete repeated value from all the row present in a column of mysql [duplicate]

This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 3 years ago.
I am stuck in a situation where my table has multiple duplicate values in one column for each row. The table looks like this:
User_Id | Color
-----------+-------------------------------
1 | Red, Blue, Red,Green
2 | Green,Green,Blue,Blue, Red
3 | Black, White
4 | Red,Red,Red
I want to remove each duplicate value from the Color column, so that each User_Id holds only unique values, like this:
User_Id | Color
-----------+--------------------
1 | Red, Blue,Green
2 | Green, Blue, Red
3 | Black,White
4 | Red
Is there any way to achieve the desired output? I searched a lot but got nothing.
Your valuable comment will be highly appreciated.
As I already said, you should normalize. A delimited list can't enforce uniqueness: there is no way to prevent Red, Blue, Red, Green, Blue when inserting and updating without writing application code or a trigger, which also means fetching the complete data.
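To make the normalization point concrete, here is a minimal sketch, assuming a child table with one row per user and color. It is written in Python with the standard-library sqlite3 module as a stand-in for MySQL; the user_colors table name is hypothetical. The composite primary key is what lets the database itself reject repeated colors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical normalized table: the primary key enforces uniqueness per user.
conn.execute(
    """
    CREATE TABLE user_colors (
        user_id INTEGER NOT NULL,
        color   TEXT    NOT NULL,
        PRIMARY KEY (user_id, color)
    )
    """
)

# The delimited lists from the question.
raw = {
    1: "Red, Blue, Red,Green",
    2: "Green,Green,Blue,Blue, Red",
    3: "Black, White",
    4: "Red,Red,Red",
}

for user_id, color_list in raw.items():
    for color in color_list.split(","):
        # INSERT OR IGNORE is SQLite syntax; duplicates are silently skipped.
        conn.execute(
            "INSERT OR IGNORE INTO user_colors VALUES (?, ?)",
            (user_id, color.strip()),
        )

unique = {}
for user_id, color in conn.execute(
    "SELECT user_id, color FROM user_colors ORDER BY user_id, color"
):
    unique.setdefault(user_id, []).append(color)

print(unique)
```

In MySQL the closest equivalent of INSERT OR IGNORE would be INSERT IGNORE, and GROUP_CONCAT can rebuild a delimited string for display when one is needed.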
If you don't normalize this, you are going to need to be creative: use a SQL number generator, nested SUBSTRING_INDEX() functions and a CROSS JOIN to split the string, then GROUP BY and GROUP_CONCAT(DISTINCT ..) to build the unique values.
You don't want to do this. The query below shows how hard it is to work with a delimited list.
Query
SELECT
DISTINCT
t.User_Id
, GROUP_CONCAT(DISTINCT TRIM(SUBSTRING_INDEX(
SUBSTRING_INDEX(
t.Color
, ','
, sql_number_generator.number
)
, ','
, -1
)
)) AS color
FROM (
SELECT
@row := @row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT @row := 0
) init_user_params
) AS sql_number_generator
CROSS JOIN
t
GROUP BY
t.User_Id
Result
| User_Id | color |
| ------- | -------------- |
| 1 | Blue,Green,Red |
| 2 | Blue,Green,Red |
| 3 | Black,White |
| 4 | Red |
see demo
But the problem is how to update the Color column with this result. I tried, but it throws an error [ Operand should contain 1 column(s) ], and I can't understand how it should be done.
I still can't believe you are willing to continue with this approach after the warnings.
UPDATE
t
INNER JOIN (
SELECT
DISTINCT
t.User_Id
, GROUP_CONCAT(DISTINCT TRIM(SUBSTRING_INDEX(
SUBSTRING_INDEX(
t.Color
, ','
, sql_number_generator.number
)
, ','
, -1
)
)) AS color
FROM (
SELECT
@row := @row + 1 AS number
FROM (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row1
CROSS JOIN (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
) row2
CROSS JOIN (
SELECT @row := 0
) init_user_params
) AS sql_number_generator
CROSS JOIN
t
GROUP BY
t.User_Id
) AS records_to_updated
SET t.Color = records_to_updated.color
WHERE
t.User_Id = records_to_updated.User_Id
see demo

MySQL: Get count of data in comma separated column [duplicate]

This question already has answers here:
Search with comma-separated value mysql
(1 answer)
MySql PHP select count of distinct values from comma separated data (tags)
(5 answers)
Closed 4 years ago.
I have a table that stores data like:
userid books
ym0001 dictionary,textbooks,notebooks
ym0002 textbooks,dictionary
I want to count number of times each book occurs. I want my result to be in this format.
books Counts
dictionary 2
notebooks 1
textbooks 2
This is mysql. Please help
The following approach builds a result set of 1000 integers, then uses those integers (n) to locate segments within the comma-separated string. For each segment it creates a new row, so that the derived table looks like this:
userid | book
:----- | :---------
ym0001 | dictionary
ym0002 | textbooks
ym0001 | textbooks
ym0002 | dictionary
ym0001 | notebooks
Once that exists it is a simple matter of grouping by book to arrive at the counts.
select
book, count(*) Counts
from (
select
t.userid
, SUBSTRING_INDEX(SUBSTRING_INDEX(t.books, ',', numbers.n), ',', -1) book
from (
select @rownum:=@rownum+1 AS n
from
(
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) a
cross join (
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) b
cross join (
select 0 union all select 1 union all select 2 union all select 3
union all select 4 union all select 5 union all select 6
union all select 7 union all select 8 union all select 9
) c
cross join (select @rownum:=0) r
) numbers
inner join mytable t
on CHAR_LENGTH(t.books)
-CHAR_LENGTH(REPLACE(t.books, ',', '')) >= numbers.n-1
) d
group by
book
order by
book
book | Counts
:--------- | -----:
dictionary | 2
notebooks | 1
textbooks | 2
If you already have a table of numbers, use that instead.
The cross joins of a, b and c dynamically produce 1000 rows. If you need more, add further cross joins similar to c, i.e. the number of generated numbers should be at least the maximum number of items in your comma-separated data.
db<>fiddle here
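The splitting step in the answer above can be hard to follow in SQL, so here is a small Python sketch of the same logic. The substring_index helper is my own rough emulation of MySQL's SUBSTRING_INDEX for illustration, not a library function, and the loop over n plays the role of the generated numbers table.

```python
from collections import Counter


def substring_index(text, delim, count):
    """Rough emulation of MySQL's SUBSTRING_INDEX(text, delim, count)."""
    if count == 0:
        return ""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right, counting from the end.
    return delim.join(parts[count:])


rows = [
    ("ym0001", "dictionary,textbooks,notebooks"),
    ("ym0002", "textbooks,dictionary"),
]

derived = []
for userid, books in rows:
    # Same bound as the join condition: n - 1 <= number of commas.
    segments = books.count(",") + 1
    for n in range(1, segments + 1):
        # Inner call keeps the first n items, outer call keeps the last of those.
        book = substring_index(substring_index(books, ",", n), ",", -1)
        derived.append((userid, book))

counts = Counter(book for _, book in derived)
print(sorted(counts.items()))
```

The inner call trims the list down to its first n items and the outer call picks the last of them, which is why one row per (string, n) pair yields one row per item.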

Select duplicates while concatenating every one except the first

I am trying to write a query that will select all of the numbers in my table, but for numbers with duplicates I want to append something to the end that marks them as duplicates. However, I am not sure how to do this.
Here is an example of the table
TableA
ID Number
1 1
2 2
3 2
4 3
5 4
SELECT statement output would be like this.
Number
1
2
2-dup
3
4
Any insight on this would be appreciated.
If your MySQL version doesn't support window functions, you can write a correlated subquery to produce a row number, then use CASE WHEN to check whether rn > 1 and mark the row as a duplicate.
create table T (ID int, Number int);
INSERT INTO T VALUES (1,1);
INSERT INTO T VALUES (2,2);
INSERT INTO T VALUES (3,2);
INSERT INTO T VALUES (4,3);
INSERT INTO T VALUES (5,4);
Query 1:
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,(SELECT COUNT(*)
FROM T tt
where tt.Number = t1.Number and tt.id <= t1.id
) rn
FROM T t1
)t1
Results:
| id | Number |
|----|--------|
| 1 | 1 |
| 2 | 2 |
| 3 | 2-dup |
| 4 | 3 |
| 5 | 4 |
If you can use window functions, you can use ROW_NUMBER() with PARTITION BY Number to generate the row number within each Number.
select t1.id,
(CASE WHEN rn > 1 then CONCAT(Number,'-dup') ELSE Number END) Number
from (
SELECT *,row_number() over(partition by Number order by id) rn
FROM T t1
)t1
sqlfiddle
I made a list of all the IDs that weren't dups (left join select) and then compared them to the entire list(case when):
select
case when a.id <> b.min_id then cast(a.Number as varchar(6)) + '-dup' else cast(a.Number as varchar(6)) end as Number
from table_a a
left join (select MIN(b.id) min_id, Number from table_a b group by b.number)b on b.number = a.number
I did this in MS SQL 2016, hope it works for you.
This populates the table used:
insert into table_a (ID, Number)
select 1,1
union all
select 2,2
union all
select 3,2
union all
select 4,3
union all
select 5,4
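All of the answers above rest on the same idea: number the rows within each Number in id order and tag every row after the first. A minimal Python sketch of that logic, with a made-up mark_duplicates helper, looks like this:

```python
from collections import defaultdict


def mark_duplicates(rows):
    """rows is a list of (id, number); returns the labels in id order."""
    seen = defaultdict(int)  # how many times each number has appeared so far
    labels = []
    for _id, number in sorted(rows):
        seen[number] += 1  # plays the role of rn in the SQL answers
        labels.append(str(number) if seen[number] == 1 else f"{number}-dup")
    return labels


table_a = [(1, 1), (2, 2), (3, 2), (4, 3), (5, 4)]
labels = mark_duplicates(table_a)
print(labels)  # ['1', '2', '2-dup', '3', '4']
```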

How to make query return 0 instead of empty set if there is no result

How can I make this query return a row with the value 0 for each date that has no matching rows?
SELECT COUNT(id) FROM `panel_messages` WHERE `sent_by` = 'root'
AND `send_date` IN ("1395-4-25","1395-4-24","1395-4-23","1395-4-22","1395-4-21","1395-4-20","1395-4-19")
GROUP BY `send_date`
ORDER BY `send_date` DESC
My expected result is 7 rows like this :
| row1 |
| row2 |
| row3 |
| row4 |
| row5 |
| row6 |
| row7 |
and if there is no result for one of the rows, I want it to be 0, which is the default value:
| 2 |
| 0 |
| 0 |
| 2 |
| 0 |
| 3 |
| 1 |
But right now I only get 4 rows, because my query doesn't return anything for dates without results:
| 2 |
| 2 |
| 3 |
| 1 |
SQL fiddle : http://sqlfiddle.com/#!9/a07486/3
Please give it a try:
SELECT
COALESCE(YT.total,t.total) AS cnt
FROM
(SELECT 0 AS total) t
LEFT JOIN
(
SELECT
COUNT(id) AS total
FROM `panel_messages`
WHERE `sent_by` = 'root'
AND `send_date` IN ("1395-4-25","1395-4-24","1395-4-23","1395-4-22","1395-4-21","1395-4-20","1395-4-19")
GROUP BY `send_date`
ORDER BY `send_date` DESC
) YT
ON 1=1;
Note:
A dummy row has been created with value 0.
Later doing a LEFT JOIN between this dummy table and your query
And finally using COALESCE you can achieve the default count 0 if your main query doesn't return anything.
EDIT:
Query:
SELECT
COALESCE(YT.count,0) AS count
FROM
(
SELECT ADDDATE('1395-01-01', INTERVAL @i:=@i+1 DAY) AS DAY
FROM (
SELECT a.a
FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
) a
JOIN (SELECT @i := -1) r1
) dateTable
LEFT JOIN
(
SELECT
send_date,
COUNT(id) AS count
FROM
`panel_messages`
WHERE
`sent_by` = 'root'
AND `send_date` IN (
"1395-4-25",
"1395-4-24",
"1395-4-23",
"1395-4-20"
)
GROUP BY
`send_date`
ORDER BY
`send_date` DESC
) AS YT
ON dateTable.DAY = YT.send_date
WHERE dateTable.DAY IN ('1395-04-25','1395-04-24','1395-04-23','1395-04-20');
In order to get a zero count for the dates which don't exist, you need to create a temporary table which holds all the dates in a certain range.
A LEFT JOIN between the date field of this temporary table and the send_date field of your table then almost does the job.
Finally, you need to use COALESCE to get 0 if the count is NULL.
WORKING DEMO
Try this:
SELECT sent_by ,"1395-4-25" as `SEND DATE`,COUNT(*) FROM `panel_messages` WHERE `sent_by` = 'root' AND `send_date` = "1395-4-25"
union
SELECT sent_by ,"1395-4-24" as `SEND DATE`,COUNT(*) FROM `panel_messages` WHERE `sent_by` = 'root' AND `send_date` = "1395-4-24"
union
SELECT sent_by ,"1395-4-23" as `SEND DATE`,COUNT(*) FROM `panel_messages` WHERE `sent_by` = 'root' AND `send_date` = "1395-4-23"
ORDER BY `SEND DATE` DESC
In this case, when a date is not found, COUNT(*) returns 0, although the first column (sent_by) is NULL for that row. Add the other 4 SELECT statements and it will return 7 rows. This works, but there may be a better way; if I find another solution I will come back here.
Another answer: what you are trying to do is impossible without the UNION, but you can try something else.
Create a temporary table that contains your dates:
CREATE TABLE temporary (
send_date date
);
INSERT INTO temporary VALUES ("1395-4-25"),("1395-4-24"),("1395-4-23"),("1395-4-22"),("1395-4-21"),("1395-4-20"),("1395-4-19");
Then do a SELECT with a RIGHT JOIN between your table and this one. Now you will have a record for every date, including the dates that have no sent_by:
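| panel_messages.sent_by | panel_messages.send_date | temporary.send_date |
|------------------------|--------------------------|---------------------|
| root | "1395-4-25" | "1395-4-25" |
| root | "1395-4-25" | "1395-4-25" |
| null | null | "1395-4-24" |
| null | null | "1395-4-23" |
| root | "1395-4-19" | "1395-4-19" |
| ... | ... | ... |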
Now you can count how many messages there are for every day. All I did is create a result set that can return what you need.
Try this SELECT after you create the temporary table:
SELECT temporary.send_date, COUNT(panel_messages.sent_by)
FROM panel_messages RIGHT JOIN temporary
ON temporary.send_date = panel_messages.send_date
AND panel_messages.sent_by = 'root' -- filter in ON so dates without messages are kept
GROUP BY temporary.send_date
ORDER BY temporary.send_date DESC;
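The calendar-table idea from the answers above can be tried end to end with a small sketch. It uses Python's standard-library sqlite3 module as a stand-in for MySQL; the wanted_dates table name and the sample messages are assumptions chosen to reproduce the counts the question expects, and the dates are stored as zero-padded text so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE panel_messages (
        id INTEGER PRIMARY KEY, sent_by TEXT, send_date TEXT
    );
    CREATE TABLE wanted_dates (send_date TEXT PRIMARY KEY);
    """
)

wanted = [f"1395-04-{day}" for day in range(19, 26)]
conn.executemany("INSERT INTO wanted_dates VALUES (?)", [(d,) for d in wanted])

# Made-up sample data: 2, 2, 3 and 1 messages from root, plus one from guest.
messages = (
    [("root", "1395-04-25")] * 2
    + [("root", "1395-04-22")] * 2
    + [("root", "1395-04-20")] * 3
    + [("root", "1395-04-19")]
    + [("guest", "1395-04-24")]
)
conn.executemany(
    "INSERT INTO panel_messages (sent_by, send_date) VALUES (?, ?)", messages
)

# The sent_by filter lives in the ON clause; in a WHERE clause it would throw
# away the NULL-extended rows and the empty dates would disappear again.
counts = conn.execute(
    """
    SELECT d.send_date, COUNT(m.id)
    FROM wanted_dates d
    LEFT JOIN panel_messages m
           ON m.send_date = d.send_date AND m.sent_by = 'root'
    GROUP BY d.send_date
    ORDER BY d.send_date DESC
    """
).fetchall()

print([count for _, count in counts])  # [2, 0, 0, 2, 0, 3, 1]
```

COUNT(m.id) only counts non-NULL values, so dates without a matching message come out as 0 without needing COALESCE.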

Count duplicates when the same id is passed multiple times for one column

Hi there, I'm trying to calculate the row count for the same value.
id,value
1 | a
2 | b
3 | c
4 | d
5 | e
and my query is
select value, count(*) as Count from mytable where id in('4','2','4','1','4') group by value having count(*) > 1
for which my expected output will be,
value,Count
d | 3
b | 1
a | 1
Thanks, any help will be appreciated
Try that:
SELECT value, count(value) AS Count
FROM mytable m
WHERE value = m.value
GROUP BY value
SELECT t.id, t.value, COUNT(t.id)
FROM
test t
JOIN
( SELECT 1 AS id
UNION ALL SELECT 3
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 1
UNION ALL SELECT 1 ) AS tmp
ON t.id = tmp.id
GROUP BY t.id
Sample on sqlfiddle.com
See also: Force MySQL to return duplicates from WHERE IN clause without using JOIN/UNION?
Of course, your IN parameter will be dynamic, and thus you will have to generate the corresponding SQL statement for the tmp table.
That's the SQL-only way to do it. Another possibility is to have the query like you have it in your question and afterwards programmatically associate the rows to the count passed to the IN parameter.
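The programmatic alternative mentioned above could look roughly like the following sketch: query each distinct id once, then attach the number of times it appeared in the requested list. It uses Python with the standard-library sqlite3 module as a stand-in for MySQL, and the variable names are made up for illustration.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")],
)

requested_ids = [4, 2, 4, 1, 4]
occurrences = Counter(requested_ids)  # how often each id was requested

# IN ignores repeats, so only the distinct ids are sent to the database.
placeholders = ", ".join("?" for _ in occurrences)
rows = conn.execute(
    f"SELECT id, value FROM mytable WHERE id IN ({placeholders})",
    list(occurrences),
).fetchall()

# Attach the request count to each returned row, highest count first.
counts = sorted(
    ((value, occurrences[row_id]) for row_id, value in rows),
    key=lambda pair: (-pair[1], pair[0]),
)

print(counts)  # [('d', 3), ('a', 1), ('b', 1)]
```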