I'm trying to write a query that will return partial matches against stored name values.
My database looks as follows:
FirstName | Middle Name | Surname
----------------------------------
Joe | James | Bloggs
J | J | Bloggs
Joe | | Bloggs
Jane | | Bloggs
Now if a user enters their name as
J Bloggs
my query should return all 4 rows, as they are all potential matches.
Similarly if a user enters the name
J J Bloggs
all rows should be returned.
If a user enters their name as
Joe Bloggs
only the first three should be returned.
I have tried the following
SELECT *
FROM PERSON
WHERE CONCAT(' ',FirstName,' ',MiddleName,' ', Surname) LIKE '% Joe%'
AND CONCAT(' ',FirstName,' ',MiddleName,' ', Surname, ' ') LIKE '% Bloggs%';
But this doesn't return 'J J Bloggs'.
Any ideas?
If I understand your logic correctly, any of the three input name components is considered to be a match if it either is a substring of a value in the table, or vice-versa. That is, J matches Joe, but also Joe matches to J. Using this logic, we can write the following query:
SELECT *
FROM yourTable
WHERE
(INSTR(FirstName, 'J') > 0 OR INSTR('J', FirstName) > 0) AND
(INSTR(MiddleName, 'J') > 0 OR INSTR('J', MiddleName) > 0 OR MiddleName IS NULL) AND
(INSTR(Surname, 'Bloggs') > 0 OR INSTR('Bloggs', Surname) > 0);
Note that the middle name has some additional logic. If the middle name is missing in a record (i.e. it is NULL), then we waive the requirement for the middle names to match.
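For anyone who wants to check the matching rule outside the database, here is a minimal Python sketch of the same logic. It is not the query itself. The helper names (part_matches, find_people) and the in-memory rows are made up for illustration, the comparison is case-folded on the assumption that the columns use a case-insensitive collation, and treating a name entered without a middle part as "no constraint on the middle name" is my own assumption.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, Optional[str], str]

# Sample rows from the question; None stands for a missing (NULL) middle name.
ROWS: List[Row] = [
    ("Joe", "James", "Bloggs"),
    ("J", "J", "Bloggs"),
    ("Joe", None, "Bloggs"),
    ("Jane", None, "Bloggs"),
]


def part_matches(entered: str, stored: Optional[str]) -> bool:
    """True if either value is a substring of the other; a NULL stored value always matches."""
    if stored is None:
        return True
    a, b = entered.casefold(), stored.casefold()
    return a in b or b in a


def find_people(first: str, middle: Optional[str], surname: str) -> List[Row]:
    """Return the rows whose three name parts are all potential matches."""
    return [
        row
        for row in ROWS
        if part_matches(first, row[0])
        # Assumption: no middle name entered means the middle name is not checked.
        and (middle is None or part_matches(middle, row[1]))
        and part_matches(surname, row[2])
    ]


print(len(find_people("J", None, "Bloggs")))    # 4
print(len(find_people("J", "J", "Bloggs")))     # 4
print(len(find_people("Joe", None, "Bloggs")))  # 3
```

Because the rule is "substring in either direction" rather than "prefix", an input such as oe would also match Joe. Whether that is acceptable depends on the application.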
I think you might need OR instead of AND...
SELECT *
FROM PERSON
WHERE CONCAT(' ',FirstName,' ',MiddleName,' ', Surname) LIKE '% Joe%'
OR CONCAT(' ',FirstName,' ',MiddleName,' ', Surname, ' ') LIKE '% Bloggs%';
Within a table like this:
ID| ph_number
-----------
1 | 51231234
2 | 5123 1234
3 | 51231234; 61231234
4 | 5123 1234; 61231234
5 | 5123 1934; 6123 1234
6 | 5123 1234; 6123 1234
7 | aargh; 5123 1234; 6123 1234
A user needs to find a phone number (e.g. 51231234) without knowing where the spaces are, or whether a field holds several numbers. I can find the numbers without spaces with a query like this:
SELECT ID, ph_number FROM test WHERE REPLACE(ph_number, ' ', '') LIKE REPLACE('51231234', ' ', '')
that returns IDs 1 and 2, or
SELECT ID, ph_number FROM test WHERE ph_number LIKE '%51231234%'
that returns IDs 1 and 3. But the IDs I need are 1, 2, 3, 4, 6 and 7. I'm not able to combine the two queries. I have tried:
SELECT ID, ph_number FROM test WHERE REPLACE(ph_number, ' ', '') LIKE ('%' + REPLACE('51231234', ' ', '') + '%') -- returns 1 & 2
SELECT ID, ph_number FROM test WHERE REPLACE(ph_number, ' ', '') LIKE '%' + REPLACE('51231234', ' ', '') + '%' -- returns ERROR
How could I achieve this? I wouldn't want to tell users that they can't have multiple numbers on the field.
In MySQL "+" is exclusively an arithmetic operator. Use the CONCAT() function to concatenate strings:
....WHERE REPLACE(ph_number, ' ', '') LIKE CONCAT('%', REPLACE('51231234', ' ', ''), '%')
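To see why stripping spaces on both sides and then doing a substring match gives the wanted IDs, here is a small Python sketch of the same comparison. It only models the REPLACE plus LIKE '%...%' logic. The dictionary and the function names are invented for the example and are not part of any MySQL API.

```python
# Rows copied from the question's table: ID -> ph_number.
ROWS = {
    1: "51231234",
    2: "5123 1234",
    3: "51231234; 61231234",
    4: "5123 1234; 61231234",
    5: "5123 1934; 6123 1234",
    6: "5123 1234; 6123 1234",
    7: "aargh; 5123 1234; 6123 1234",
}


def strip_spaces(text: str) -> str:
    """Counterpart of REPLACE(text, ' ', '')."""
    return text.replace(" ", "")


def find_ids(wanted: str) -> list:
    """Counterpart of: REPLACE(ph_number, ' ', '') LIKE CONCAT('%', REPLACE(wanted, ' ', ''), '%')."""
    needle = strip_spaces(wanted)
    return [row_id for row_id, stored in ROWS.items() if needle in strip_spaces(stored)]


print(find_ids("51231234"))   # [1, 2, 3, 4, 6, 7]
print(find_ids("5123 1234"))  # [1, 2, 3, 4, 6, 7]
```

Row 5 is correctly left out because its first number is 5123 1934. Note that a plain substring match would also find a number embedded inside a longer one, so this is a loose search rather than an exact one.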
I create this table:
create table if not exists `example`(
`firstNames` varchar(45) not null,
`secondNames` varchar(45) not null)
ENGINE = InnoDB;
Now I insert one row:
insert into example values('Jose Alonzo', 'Pena Palma');
And I check that it is correct:
select * from example;
| firstNames | secondNames |
----------------------------
| Jose Alonzo| Pena Palma |
It's OK!
Easy.
Now I create a statement to search for this row:
set @search = 'jose alonzo pena';
select * from example
where concat(firstNames, ' ', secondNames) like concat('%',@search,'%');
This returns:
| firstNames | secondNames |
----------------------------
| Jose Alonzo| Pena Palma |
Now I change the value of @search to 'jose pena':
set @search = 'jose pena';
select * from example
where concat(firstNames, ' ', secondNames) like concat('%',@search,'%');
And this returns nothing!
| firstNames | secondNames |
What is happening?
Can't I use LIKE to match characters that are in the middle of the varchar?
No, LIKE cannot skip over characters in the middle of the string unless the pattern has a wildcard there. In other words, a space character in the pattern matches a space character, not an arbitrary string of characters. The following would match:
where concat(firstNames, ' ', secondNames) like concat('%', replace(@search, ' ', '%'), '%')
The order would be important, so this would match concat(firstNames, ' ', secondNames) but not concat(secondNames, ' ', firstNames).
If you are interested in these types of searches, you should investigate full text indexes. In addition to being more powerful, they are also faster.
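As a rough illustration of what replacing the spaces with % does, here is a Python sketch that treats the search text as a list of terms that must appear in order, with anything in between. It uses a regular expression instead of LIKE, so it is an analogy and not MySQL behaviour. The function name is made up, and unlike the SQL version it escapes any % or _ typed by the user.

```python
import re


def matches_terms_in_order(full_name: str, search: str) -> bool:
    """True if every whitespace-separated term of `search` occurs in `full_name`, in order."""
    # 'jose pena' becomes the regex 'jose.*pena', the counterpart of LIKE '%jose%pena%'.
    pattern = ".*".join(re.escape(term) for term in search.split())
    return re.search(pattern, full_name, flags=re.IGNORECASE | re.DOTALL) is not None


full_name = "Jose Alonzo" + " " + "Pena Palma"  # concat(firstNames, ' ', secondNames)

print(matches_terms_in_order(full_name, "jose alonzo pena"))  # True
print(matches_terms_in_order(full_name, "jose pena"))         # True
print(matches_terms_in_order(full_name, "pena jose"))         # False: order matters
```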
I'm trying to replace a bunch of characters in a MySQL field. I know the REPLACE function but that only replaces one string at a time. I can't see any appropriate functions in the manual.
Can I replace or delete multiple strings at once? For example I need to replace spaces with dashes and remove other punctuation.
You can chain REPLACE functions:
select replace(replace('hello world','world','earth'),'hello','hi')
This will print hi earth.
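The chaining idea is just "apply each replacement to the result of the previous one". A minimal Python sketch of that, with a made-up helper name, shows the same result and also why the order of the pairs can matter:

```python
from functools import reduce
from typing import Iterable, Tuple


def replace_all(text: str, pairs: Iterable[Tuple[str, str]]) -> str:
    """Apply each (old, new) replacement in turn, like nested REPLACE() calls."""
    return reduce(lambda acc, pair: acc.replace(pair[0], pair[1]), pairs, text)


# Innermost REPLACE first: 'world' -> 'earth', then 'hello' -> 'hi'.
print(replace_all("hello world", [("world", "earth"), ("hello", "hi")]))  # hi earth

# Later replacements see the output of earlier ones.
print(replace_all("a", [("a", "b"), ("b", "c")]))  # c
print(replace_all("a", [("b", "c"), ("a", "b")]))  # b
```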
You can even use subqueries to replace multiple strings!
select replace(london_english,'hello','hi') as warwickshire_english
from (
select replace('hello world','world','earth') as london_english
) sub
Or use a JOIN to replace them:
select group_concat(newword separator ' ')
from (
select 'hello' as oldword
union all
select 'world'
) orig
inner join (
select 'hello' as oldword, 'hi' as newword
union all
select 'world', 'earth'
) trans on orig.oldword = trans.oldword
I'll leave translation using common table expressions as an exercise for the reader ;)
Cascading is the only simple and straightforward solution in MySQL for multiple character replacement.
UPDATE table1
SET column1 = replace(replace(REPLACE(column1, '\r\n', ''), '<br />',''), '<\r>','')
REPLACE does a good simple job of replacing characters or phrases everywhere they appear in a string. But when cleansing punctuation you may need to look for patterns - e.g. a sequence of whitespace or characters in the middle of a word or after a full stop. If that's the case, a regular expression replace function would be much more powerful.
UPDATE: If using MySQL version 8+, a REGEXP_REPLACE function is provided and can be invoked as follows:
SELECT txt,
REGEXP_REPLACE(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'') AS `reg_replaced`
FROM test;
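For comparison, the same two steps (spaces to dashes, then drop everything that is not a letter, digit or dash) can be sketched in Python. This mirrors the pattern used in the query but runs on Python's re module, not on MySQL's regex engine, and the function name is only illustrative.

```python
import re


def dash_and_strip(txt: str) -> str:
    """Replace spaces with dashes, then remove any other punctuation."""
    dashed = txt.replace(" ", "-")                 # REPLACE(txt, ' ', '-')
    return re.sub(r"[^a-zA-Z0-9-]+", "", dashed)   # REGEXP_REPLACE(..., '[^a-zA-Z0-9-]+', '')


print(dash_and_strip("Hello, World!"))    # Hello-World
print(dash_and_strip("I am a test 100%")) # I-am-a-test-100
```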
See this DB Fiddle online demo.
PREVIOUS ANSWER - only read on if using a version of MySQL before version 8:
The bad news is MySQL doesn't provide such a thing but the good news is it's possible to provide a workaround - see this blog post.
Can I replace or delete multiple strings at once? For example I need
to replace spaces with dashes and remove other punctuation.
The above can be achieved with a combination of the regular expression replacer and the standard REPLACE function. It can be seen in action in this online Rextester demo.
SQL (excluding the function code for brevity):
SELECT txt,
reg_replace(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'',
TRUE,
0,
0
) AS `reg_replaced`
FROM test;
CREATE FUNCTION IF NOT EXISTS num_as_word (name TEXT) RETURNS TEXT RETURN
(
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(IFNULL(name, ''),
'1', 'one'),
'2', 'two'),
'3', 'three'),
'4', 'four'),
'5', 'five'),
'6', 'six'),
'7', 'seven'),
'8', 'eight'),
'9', 'nine')
);
I've been using lib_mysqludf_preg for this which allows you to:
Use PCRE regular expressions directly in MySQL
With this library installed you could do something like this:
SELECT preg_replace('/(\\.|com|www)/','','www.example.com');
Which would give you:
example
In PHP:
$dataToReplace = [1 => 'one', 2 => 'two', 3 => 'three'];
$sqlReplace = '';
foreach ($dataToReplace as $key => $val) {
$sqlReplace = 'REPLACE(' . ($sqlReplace ? $sqlReplace : 'replace_field') . ', "' . $key . '", "' . $val . '")';
}
echo $sqlReplace;
Result:
REPLACE(
REPLACE(
REPLACE(replace_field, "1", "one"),
"2", "two"),
"3", "three");
UPDATE schools SET
slug = lower(name),
slug = REPLACE(slug, '|', ' '),
slug = replace(slug, '.', ' '),
slug = replace(slug, '"', ' '),
slug = replace(slug, '#', ' '),
slug = replace(slug, ',', ' '),
slug = replace(slug, '\'', ''),
slug = trim(slug),
slug = replace(slug, ' ', '-'),
slug = replace(slug, '--', '-');
UPDATE schools SET
slug = replace(slug, '--', '-');
If you are using MySQL Version 8+ then below is the built-in function that might help you better.
| String                     | Replace  | Output          |
|----------------------------|----------|-----------------|
| w"w\'w. ex%a&m:p l–e.c)o(m | "'%&:)(– | www.example.com |
MySQL Query:
SELECT REGEXP_REPLACE('`w"w\'w. ex%a&m:p l–e.c)o(m`', '[("\'%[:blank:]&:–)]', '');
For almost all troublesome characters:
SELECT REGEXP_REPLACE(column, '[\("\'%[[:blank:]]&:–,#$#!;\\[\\]\)<>\?\*\^]+','')
Real-life scenario.
I had to update all the file names saved in the 'demo' table that contain special characters.
SELECT * FROM demo;
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers upload winners .png |
| private://webform/applicant_details/129/student : class & teacher data.pdf |
| private://webform/applicant_details/130/tax---user's---data__upload.pdf |
| private://webform/applicant_details/130/Applicant Details _ report_0_2.pdf |
| private://webform/applicant_details/131/india&asia%population huge.pdf |
Test case:
The table has multiple rows with special characters in the file name.
Goal:
Remove all special characters from the file name, keeping only a-z, A-Z, 0-9, dot and underscore, and convert the file name to lower case.
Expected result is:
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers_upload_winners_.png |
| private://webform/applicant_details/129/student_class_teacher_data.pdf |
| private://webform/applicant_details/130/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/130/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/131/india_asia_population_huge.pdf |
Okay, let's plan step by step
1st - let's find the file name
2nd - run all the find replace on that file name part only
3rd - replace the old file name with the new one
How can we do this?
Let's break down the whole action in chunks for a better understanding.
The function below extracts only the file name from the full path, e.g. "Applicant Details _ report_0_2.pdf"
SELECT -- MySQL SELECT statement
SUBSTRING_INDEX -- MySQL built-in function
( -- Function start Parentheses
uri, -- my table column
'/', -- delimiter (the last / in full path; left to right ->)
-1 -- start from the last and find the 1st one (from right to left <-)
) -- Function end Parentheses
from -- MySQL FROM statement
demo; -- My table name
#1 Query result
| uri |
|------------------------------------|
| offers upload winners .png |
| student : class & teacher data.pdf |
| tax---user's---data__upload.pdf |
| Applicant Details _ report_0_2.pdf |
| india&asia%population huge.pdf |
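If it helps to picture what SUBSTRING_INDEX(uri, '/', -1) returns, the closest everyday counterpart in Python is splitting once from the right and keeping the last piece. This is only an analogy for the step above, and the function name is made up.

```python
def file_name(uri: str) -> str:
    """Return everything after the last '/', or the whole string if there is no '/'."""
    return uri.rsplit("/", 1)[-1]


print(file_name("private://webform/applicant_details/130/tax---user's---data__upload.pdf"))
# tax---user's---data__upload.pdf
print(file_name("no-slash.pdf"))
# no-slash.pdf
```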
Now we have to find and replace within the generated file name result.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]+', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#2 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population_huge.pdf |
FYI - the trailing '+' in the pattern handles runs of repeated characters such as ---- or multiple spaces, replacing each run with a single underscore. Notice the result without the '+' in the regex pattern below.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#3 Query result
| uri |
|------------------------------------|
| offers___upload__winners_.png |
| student___class___teacher_data.pdf |
| tax___user_s___data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population__huge.pdf |
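The effect of the '+' quantifier is easy to reproduce outside MySQL. Here is a short Python sketch using one of the sample file names. It uses Python's re module, so it only illustrates the quantifier, not MySQL's regex engine.

```python
import re

name = "tax---user's---data__upload.pdf"

# With '+': each run of disallowed characters becomes one underscore.
with_plus = re.sub(r"[^a-zA-Z0-9_.]+", "_", name)

# Without '+': every single disallowed character becomes its own underscore.
without_plus = re.sub(r"[^a-zA-Z0-9_.]", "_", name)

print(with_plus)     # tax_user_s_data__upload.pdf
print(without_plus)  # tax___user_s___data__upload.pdf
```

In both cases the double underscore before "upload" survives, because underscores are in the allowed set.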
Now we have a file name without special characters (only . and _ are allowed). But the file name still has capital letters and runs of multiple underscores.
Let's lower the file name first.
SELECT
LOWER(
REGEXP_REPLACE(
SUBSTRING_INDEX(uri, '/', -1),
'[^a-zA-Z0-9_.]+',
'_'
)
) AS uri
from
demo;
#4 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| applicant_details___report_0_2.pdf |
| india_asia_population_huge.pdf |
Now everything is in lower case, but the repeated underscores are still there. So we wrap the whole REGEXP_REPLACE call in one more REGEXP_REPLACE.
SELECT
LOWER(
REGEXP_REPLACE( -- this wrapper will solve the multiple underscores issue
REGEXP_REPLACE(
SUBSTRING_INDEX(uri, '/', -1),
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+', -- if 1st regex action has multiple __ then find it
'_' -- and replace them with single _
)
) AS uri
from
demo;
#5 Query result
| uri |
|----------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data_upload.pdf |
| applicant_details_report_0_2.pdf |
| india_asia_population_huge.pdf |
Congratulations! We have found what we were looking for. Now it's UPDATE time!
UPDATE -- run a MySQL UPDATE statement
demo -- tell MySQL to which table you want to update
SET -- put SET statement to set the updated values in desire column
uri = REPLACE( -- tell MySQL to which column you want to update,
-- I am also putting REPLACE function to replace existing values with new one
-- REPLACE (string, replace, with-this)
uri, -- my column to replace
SUBSTRING_INDEX(uri, '/', -1), -- my file name part "Applicant Details _ report_0_2.pdf"
-- without doing any action
LOWER( -- "applicant_details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details___report_0_2.pdf"
SUBSTRING_INDEX(uri, '/', -1), -- "Applicant Details _ report_0_2.pdf"
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+',
'_'
)
)
);
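To sanity-check the whole transformation before running the UPDATE, the same pipeline can be sketched in Python. The function names are made up, and there is one deliberate difference: the sketch rewrites only the part after the last '/', whereas REPLACE(uri, old, new) replaces every occurrence of the old file name inside uri.

```python
import re


def clean_name(name: str) -> str:
    """Replace disallowed runs with '_', collapse repeated '_', then lower-case."""
    replaced = re.sub(r"[^a-zA-Z0-9_.]+", "_", name)
    collapsed = re.sub(r"_+", "_", replaced)
    return collapsed.lower()


def clean_uri(uri: str) -> str:
    """Clean only the file name (the part after the last '/') and keep the path as is."""
    head, sep, name = uri.rpartition("/")
    return head + sep + clean_name(name)


print(clean_uri("private://webform/applicant_details/154/Applicant Details _ report_0_2.pdf"))
# private://webform/applicant_details/154/applicant_details_report_0_2.pdf
print(clean_uri("private://webform/applicant_details/153/tax---user's---data__upload.pdf"))
# private://webform/applicant_details/153/tax_user_s_data_upload.pdf
```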
After the UPDATE query, the result looks like this:
| uri |
|--------------------------------------------------------------------------|
| private://webform/applicant_details/152/offers_upload_winners_.png |
| private://webform/applicant_details/153/student_class_teacher_data.pdf |
| private://webform/applicant_details/153/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/154/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/154/india_asia_population_huge.pdf |
Sample data script
DROP TABLE IF EXISTS `demo`;
CREATE TABLE `demo` (
`uri` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin NOT NULL DEFAULT '' COMMENT 'The S3 URI of the file.',
`filesize` bigint unsigned NOT NULL DEFAULT '0' COMMENT 'The size of the file in bytes.',
`timestamp` int unsigned NOT NULL DEFAULT '0' COMMENT 'UNIX timestamp for when the file was added.',
`dir` int NOT NULL DEFAULT '0' COMMENT 'Boolean indicating whether or not this object is a directory.',
`version` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin DEFAULT '' COMMENT 'The S3 VersionId of the object.'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `demo` (`uri`, `filesize`, `timestamp`, `dir`, `version`) VALUES
('private://webform/applicant_details/152/offers upload winners .png', 14976905, 1658397516, 0, ''),
('private://webform/applicant_details/153/student : class & teacher data.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/153/tax---user\'s---data__upload.pdf', 98449, 1658397516, 0, ''),
('private://webform/applicant_details/154/Applicant Details _ report_0_2.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/154/india&asia%population huge.pdf', 13301, 1658397517, 0, '');
Big Thanks:
MySQL: SELECT, UPDATE, REPLACE, SUBSTRING_INDEX, LOWER, REGEXP_REPLACE
MySQL Query Formatter: Thanks to CodeBeautify for such an awesome tool.
Will the following query evaluate to true (1), false (0), or NULL?
SELECT '%' LIKE ' % ';
The answer provided is:
The '%' character is matched by '%', but not by the space characters surrounding it, so the expression evaluates to false.
+----------------+
| '%' LIKE ' % ' |
+----------------+
| 0 |
+----------------+
But I thought % can match zero or more characters? So shouldn't % match % plus the spaces? Or do "characters" not include wildcards?
UPDATE:
Oh, but if the comparison happens the other way around, it is true... hmm...
SELECT ' % ' LIKE '%';
Any non-NULL string is matched by the '%' metacharacter, so the expression evaluates to true.
+----------------+
| ' % ' LIKE '%' |
+----------------+
| 1 |
+----------------+
The logic is the wrong way around. You would have to write
select ' % ' like '%'
Writing LIKE ' % ' means that the first string must start with a space, followed by any characters, and end with one more space. Wildcards only apply to the pattern on the right-hand side of LIKE; in the first string, % is not a wildcard but a literal character.
Not entirely sure what your question is, but example time:
Sample table, mytbl:
col1
----
abc
def
feh
zba
a b
Query1
------
select * from mytbl where col1 like '%b%'
Result1
-------
abc
zba
a b
Query2
------
select * from mytbl where col1 like '%b'
Result2
------
a b
Query3
------
select * from mytbl where col1 like 'a%'
Result3
-------
abc
a b
Query4
------
select * from mytbl where col1 like '% b%'
Result4
-------
a b
Query5
------
select * from mytbl where col1 like '% b %'
Result5
-------
(no rows)
As you can see, the % matches zero or more characters. Non-% characters are treated as literals. So that means % b % is looking for anything + space + b + space + anything.
Hopefully, this helps.
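If you want to experiment with these rules without a database, here is a small Python sketch that translates a LIKE pattern into a regular expression. It is a simplified model and not MySQL's implementation: it handles only % and _, ignores the backslash escape character, and assumes a case-insensitive comparison. The function name like is made up for the example.

```python
import re


def like(value: str, pattern: str) -> bool:
    """Simplified LIKE: '%' matches zero or more characters, '_' exactly one, the rest is literal."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    # The whole value must match the pattern, as with LIKE.
    return re.fullmatch(regex, value, flags=re.DOTALL | re.IGNORECASE) is not None


print(like("%", " % "))  # False: the value has no leading or trailing space
print(like(" % ", "%"))  # True: the pattern '%' matches any string

rows = ["abc", "def", "feh", "zba", "a b"]
print([r for r in rows if like(r, "%b%")])    # ['abc', 'zba', 'a b']
print([r for r in rows if like(r, "% b %")])  # []
```

Only the right-hand argument is treated as a pattern. A % in the value on the left is an ordinary character.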