Grouping fields that partially match in MySQL - mysql

I'm trying to return duplicate records in a user table where the fields only partially match, and the matching field contents are arbitrary. I'm not sure if I'm explaining it well, so here is the query I might run to get the duplicate members by some unique field:
SELECT MAX(id)
FROM members
WHERE 1
GROUP BY some_unique_field
HAVING COUNT(some_unique_field) > 1
I want to apply this same idea to an email field, but unfortunately our email field can contain multiple e-mails separated by a comma. For example, I want a member with his email set to "user#someaddress.com" to be returned as a duplicate of another member that has "user#someaddress.com","someotheruser#someaddress.com" in their field. GROUP BY obviously will not accomplish this as-is.

Something like this might work for you:
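-- MySQL concatenates strings with CONCAT(); the + operator is numeric addition
SELECT *
FROM members m1
inner join members m2 on m1.id <> m2.id
and (
m1.email = m2.email
or m1.email like CONCAT('%,', m2.email)
or m1.email like CONCAT(m2.email, ',%')
or m1.email like CONCAT('%,', m2.email, ',%')
)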
It depends on how consistently your email addresses are formatted when there is more than one. You might need to modify the query slightly if, for example, there is always a space after the comma, or if the quotes are actually part of your data.
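If the formatting turns out to be too inconsistent for LIKE patterns, the comparison can also be done in application code. The following is a minimal Python sketch, not the answerer's method: it assumes the rows have already been fetched as id/email-field pairs, and the names `split_emails`, `find_duplicates`, and the sample rows are hypothetical. Addresses are written as they appear in the question.

```python
from itertools import combinations

def split_emails(field):
    """Split a comma-separated email field into a set of normalized addresses."""
    # Strip whitespace and surrounding double quotes, and compare case-insensitively.
    parts = (p.strip().strip('"').lower() for p in field.split(","))
    return {p for p in parts if p}

def find_duplicates(members):
    """Return (id, id) pairs of members that share at least one whole address."""
    pairs = []
    for (id1, f1), (id2, f2) in combinations(sorted(members.items()), 2):
        if split_emails(f1) & split_emails(f2):
            pairs.append((id1, id2))
    return pairs

# Hypothetical sample rows: id -> raw email field
members = {
    1: "user#someaddress.com",
    2: "user#someaddress.com, someotheruser#someaddress.com",
    3: "unrelated#elsewhere.com",
}
print(find_duplicates(members))
```

Because whole addresses are compared, "user#someaddress.com" does not match "someotheruser#someaddress.com" by accident, which a plain substring test would do.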

This works for me; may not do what you want:
SELECT MAX(ID)
FROM members
WHERE Email like "%someuser%"
GROUP BY Email
HAVING COUNT(Email) > 1

Related

Replace text in SQL

I have a field called EMAIL_ADDRESS. One of the records would be:
john#gmail.com, mike#gmail.com, joe#yahoo.com, george#yahoo.com, fred#gmail.com
I want to remove all yahoo addresses in my SELECT query to get:
john#gmail.com, mike#gmail.com, fred#gmail.com
If I use
REPLACE(SM.SCORECARD_EMAIL_ADDRESS, 'joe#yahoo.com,', '')
this works.
If I want to remove ALL yahoo email addresses this doesn't work:
REPLACE(SM.SCORECARD_EMAIL_ADDRESS, '%#yahoo.com,', '')
because wildcards don't seem to work as it's looking for % in the string.
You should probably fix your table design and stop storing CSV lists of email addresses. Instead, get each email onto a separate record. As a short term fix, if you're running MySQL 8+, you may use REGEXP_REPLACE():
UPDATE yourTable
SET EMAIL_ADDRESS = REGEXP_REPLACE(
REGEXP_REPLACE(EMAIL_ADDRESS, '(, )?\\S+#yahoo\\.com,?', ','), '^,+|,+$', '')
WHERE EMAIL_ADDRESS LIKE '%#yahoo.com%';
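If you don't need to update the records and only want to filter them in the SELECT query, you can use the NOT LIKE operator:
SELECT * FROM your_table WHERE email NOT LIKE '%yahoo.com%'
This returns only the rows that don't match the pattern. Note that it excludes whole rows containing a yahoo address rather than removing the yahoo addresses from the list.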
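If the filtering has to happen per address rather than per row, and the server is older than MySQL 8, one option is to do it after fetching the value. This is a rough Python sketch under that assumption; `drop_domain` and its `at` parameter are hypothetical names, and `at` defaults to "#" only because the addresses in the question are written that way.

```python
def drop_domain(field, domain, at="#"):
    """Remove every address at `domain` from a comma-separated list."""
    suffix = at + domain.lower()
    # Split on commas and trim the whitespace around each address.
    addresses = [a.strip() for a in field.split(",")]
    kept = [a for a in addresses if a and not a.lower().endswith(suffix)]
    return ", ".join(kept)

row = "john#gmail.com, mike#gmail.com, joe#yahoo.com, george#yahoo.com, fred#gmail.com"
print(drop_domain(row, "yahoo.com"))
```

Splitting first avoids the leftover-comma cleanup that a pattern replacement on the raw string needs.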

search in mysql fields

I need to have a select statement that matches records partially from a field containing full names.
If a record contains john, it will display records with john, johnson, johnsmith etc. john could be anywhere within name
select
name_field
from
my_table
where
name_field like '%john%'
Update:
For the question
Do you mean to ask that "How to find, if name_field value of row1 matches full or partial with the same field value from other row2 to rowN?"
You replied "this is exactly what i need".
The following solution may be helpful to you:
select
t.name_field_id, t.name_field as 'name_value',
d.name_field_id as 'id_of_dup', d.name_field as 'dup_in'
from
my_table t,
my_table d
where
d.name_field != t.name_field
and d.name_field like concat( '%', t.name_field, '%' )
order by name_value, dup_in;

Ordering a Union Query in MS Access SQL

OK I have a particularly nasty union ordering problem so any help would be appreciated.
The scenario is this:
Member Table with the following records (actual data):
REI882
YUI987
POBO37
NUBS26
BTBU12
MZBY10
TYBW54
(These are listed in the order I want them back from my query.)
There are a number of business rules about the construction of these MemberIDs which I believe are unrelated to the sort. They're historic and set in stone. I'm stuck with them. They indicate seniority of the member.
The ordering is done from the last 4 characters in the ID, ascending. The first two characters of the ID are completely meaningless as far as the sort is concerned.
So the topmost possible record is ??A001 (most senior) and the lowest possible record is ??ZZ99 (least senior).
When I query my member table the list I get back must display most senior at top... Obviously a standard sort does not work. This is what I have to date:
The first of these queries deals with sorting members whose ID only has 1 leading letter. The second deals with those with 2 leading letters.
SELECT * FROM (
SELECT Member.ID
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],4,1)))=-1)) **check the 4th character is a digit
ORDER BY (Mid([Member.ID],3,1)), (Mid([Member.ID],4,1)), (Mid([Member.ID],5,1)), (Mid([Member.ID],6,1))
) t1
UNION
SELECT * FROM (
SELECT Member.ID
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],4,1)))=0)) **check the 4th character is a letter
ORDER BY (Mid([Member.ID],3,1)), (Mid([Member.ID],4,1)), (Mid([Member.ID],5,1)), (Mid([Member.ID],6,1))
) t2
But I get CRAZY results with the union! If I run each of the selects individually - no problem my funky (heavily reliant on some nasty string manipulation in access!) sort works exactly as I want it.
I understand this is pretty complicated but I hope I've explained it clearly and that someone is up for some kudos for figuring it out!!!
edit: The result from my query is seemingly random:
YUI987
MZBY10
NUBS26
BTBU12
REI882
POBO37
TYBW54
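An ORDER BY inside a SELECT that is combined with another SELECT through UNION is not reliable: the UNION does not preserve the order of its inputs, so the sort has to be applied to the combined result.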
See Specifying a conditional order here
You can use this:
SELECT ID FROM(
(SELECT Member.ID,1 AS T,Left([Member.ID],2) AS Part1, Right([Member.ID],4) AS Part2
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],3,1)))=-1)))
UNION
(SELECT Member.ID,2 AS T,Left([Member.ID],3) AS Part1, Right([Member.ID],3) AS Part2
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],4,1)))=-1) and ((IsNumeric(Mid([Member.ID],3,1)))=0)))
UNION
(SELECT Member.ID,3 AS T,Left([Member.ID],4) AS Part1, Right([Member.ID],2) AS Part2
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],5,1)))=-1) and ((IsNumeric(Mid([Member.ID],4,1)))=0)))
ORDER BY T,Part1,Part2)
@Justin Kirk: I don't know exactly what your problem is, but I hope this helps.
Why are you not using the RIGHT function?
Something like
SELECT ID
FROM (
SELECT ID
FROM (
SELECT Member.ID
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],4,1)))=-1)) **check the 4th character is a digit
) t1
UNION
SELECT ID
FROM (
SELECT Member.ID
FROM Member
WHERE (((IsNumeric(Mid([Member.ID],4,1)))=0)) **check the 4th character is a letter
) t2
) t3
ORDER BY RIGHT(ID,4)
How about skipping the UNION?
SELECT members.ID
FROM members
ORDER BY Right([ID],3), Right(id,4)
Based on the new rules, this mess may work.
SELECT
Len(IIf([textId] Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Left([textid],2),
IIf([textId] Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Left([textid],3),
IIf([textId] Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Left([textid],4),"_")))) AS Ln,
IIf(textId Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Left(textid,2),
IIf(textId Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Left(textid,3),
IIf(textId Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Left(textid,4),"_"))) AS Alpha,
IIf(textId Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Val(Right(textid,4)),
IIf(textId Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Val(Right(textid,3)),
IIf(textId Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Val(Right(textid,2)),0))) AS Numbr,
table.textid
FROM table
ORDER BY
Len(IIf([textId] Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Left([textid],2),
IIf([textId] Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Left([textid],3),
IIf([textId] Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Left([textid],4),"_")))),
IIf(textId Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Left(textid,2),
IIf(textId Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Left(textid,3),
IIf(textId Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Left(textid,4),"_"))),
IIf(textId Like "[a-z][a-z][0-9][0-9][0-9][0-9]",Val(Right(textid,4)),
IIf(textId Like "[a-z][a-z][a-z][0-9][0-9][0-9]",Val(Right(textid,3)),
IIf(textId Like "[a-z][a-z][a-z][a-z][0-9][0-9]",Val(Right(textid,2)),0)))
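For comparison, the rule stated in the question (sort on the last four characters, with the single-letter IDs ahead of the two-letter ones) can be written as a single sort key. The Python sketch below is only one reading of that rule, and `seniority_key` is a hypothetical name. On the question's sample data it places TYBW54 before MZBY10, which differs from the order listed in the question, so the rule as written may not be the whole story.

```python
def seniority_key(member_id):
    """Sort key: fewer letters in the last four characters sorts first,
    ties are broken by the last four characters themselves."""
    tail = member_id[-4:].upper()
    letters = sum(ch.isalpha() for ch in tail)
    return (letters, tail)

# The IDs from the question, in the scrambled order the UNION returned
ids = ["YUI987", "MZBY10", "NUBS26", "BTBU12", "REI882", "POBO37", "TYBW54"]
print(sorted(ids, key=seniority_key))
```

The first two characters never enter the key, which matches the statement that they are meaningless for the sort.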

MYSQL special Char

I have in my table this value ART(\'O\') in the field Subject.
How do I check if this subject exists?
I tried:
select * from table1 where Subject = 'ART(\'O\')';
select * from table1 where Subject = "ART(\'O\')";
Both failed at picking up the record.
How should I phrase the query so that the record containing ART(\'O\') will be picked?
Note: Please do not suggest the query: select * from table1 where Subject like '%ART(%';
because there may be other records such as ART(EX), ART(NA), etc. existing
Need to know how to use the Subject = '' method.
Thanks.
If the value contains the backslashes, you probably need to escape them. Otherwise you're looking for value ART('O').
SELECT * FROM table1 WHERE Subject = "ART(\\'O\\')";
Also make sure you don't have trailing whitespace.
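When the query is issued from application code, a parameterized query sidesteps the escaping question, because the driver passes the value through unchanged. The sketch below uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server. That is an assumption, not the asker's setup; MySQL drivers such as PyMySQL or mysql-connector-python use %s placeholders instead of ?.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Subject TEXT)")

# The stored value contains two literal backslashes: ART(\'O\')
stored = r"ART(\'O\')"
conn.executemany(
    "INSERT INTO table1 (Subject) VALUES (?)",
    [(stored,), ("ART(EX)",), ("ART(NA)",)],
)

# The placeholder hands the value to the database untouched,
# so no manual escaping of quotes or backslashes is needed.
rows = conn.execute(
    "SELECT Subject FROM table1 WHERE Subject = ?", (stored,)
).fetchall()
print(rows)
```

Only the exact value matches; ART(EX) and ART(NA) are not returned.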

Grouping multiple MySql queries

I have a simple query as listed below
SELECT id, name, email FROM users WHERE group_id = 1
This works great until I start adding LIKE conditions, chained with OR, to the end.
SELECT id, name, email FROM users
WHERE group_id = 1
AND id LIKE $searchterm
OR name LIKE $searchterm
OR email LIKE $searchterm
Suddenly my WHERE clause is no longer upheld and results with a 'group_id' of 2 or 3 are retrieved.
Is there a way I can group WHERE clauses so that they are always upheld or am I missing something obvious?
Dealing with the query first - you need to use brackets for the WHERE clause to be interpreted correctly:
SELECT id, name, email
FROM users
WHERE group_id = 1
AND ( id LIKE $searchterm
OR name LIKE $searchterm
OR email LIKE $searchterm)
I'd be looking at using Full Text Search (FTS) instead, so you could use:
SELECT id, name, email
FROM users
WHERE group_id = 1
AND MATCH(id, name, email) AGAINST ($searchterm)
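Mind that before MySQL 5.6 the USERS table needs to be MyISAM; from 5.6 onward InnoDB supports FULLTEXT indexes as well. The columns named in MATCH() must be covered by a FULLTEXT index, which can only include CHAR, VARCHAR, or TEXT columns.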
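I assume you want
SELECT id, name, email FROM users WHERE group_id = 1 AND (id LIKE $searchterm OR name LIKE $searchterm OR email LIKE $searchterm)
AND binds more tightly than OR, so without the parentheses the group_id condition only applies to the first LIKE. See the MySQL operator precedence table.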
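The effect of the parentheses is easy to reproduce on a small table. The following Python sketch uses the built-in sqlite3 module as a stand-in for MySQL (both give AND higher precedence than OR); the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT, group_id INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", [
    (1, "john", "john#example.com", 1),
    (2, "johnny", "jj#example.com", 2),
    (3, "mary", "mary#example.com", 1),
])

term = "%john%"
params = (term, term, term)

# Without parentheses this is parsed as
# (group_id = 1 AND id LIKE ?) OR name LIKE ? OR email LIKE ?
loose = conn.execute(
    "SELECT id FROM users WHERE group_id = 1 AND id LIKE ? "
    "OR name LIKE ? OR email LIKE ? ORDER BY id", params).fetchall()

# With parentheses the group filter applies to every alternative
strict = conn.execute(
    "SELECT id FROM users WHERE group_id = 1 AND (id LIKE ? "
    "OR name LIKE ? OR email LIKE ?) ORDER BY id", params).fetchall()

print(loose)
print(strict)
```

The first query also returns the user from group 2, which is the behavior described in the question; the second keeps only group 1.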