How to replace a regex pattern in MySQL - mysql

I have a table called myTable which has a column called col1. This column contains data in this format: (1 or 2 digits)(hyphen)(8 digits).
I want to update every value in this column so that everything before the hyphen is replaced with 4. For example:
| Old value    | New value   |
|--------------|-------------|
| 1-654283568  | 4-654283568 |
| 2-467862833  | 4-467862833 |
| 8-478934293  | 4-478934293 |
| 12-573789475 | 4-573789475 |
| 16-574738575 | 4-574738575 |
I am using MySQL 5.7.19. I believe REGEXP_REPLACE is only available in MySQL 8.0 and later, so I am not sure how this can be achieved.

You don't need regex; you can use SUBSTRING_INDEX to extract everything after the hyphen and concatenate 4- to that:
UPDATE myTable
SET col1 = CONCAT('4-', SUBSTRING_INDEX(col1, '-', -1))
This will work regardless of the number of characters after the hyphen.
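Before running the UPDATE it may be worth previewing the result. The sketch below reuses the table and column names from the question; the WHERE clause is my own addition, on the assumption that some rows might not contain a hyphen (for such rows SUBSTRING_INDEX returns the whole string, so '4-' would simply be prepended).

```sql
-- Preview the change first; a SELECT modifies nothing.
SELECT col1 AS old_value,
       CONCAT('4-', SUBSTRING_INDEX(col1, '-', -1)) AS new_value
FROM myTable
WHERE col1 LIKE '%-%';   -- assumption: skip rows without a hyphen

-- The same expression on a literal value:
SELECT CONCAT('4-', SUBSTRING_INDEX('12-573789475', '-', -1)) AS new_value;
-- returns '4-573789475'
```

The same WHERE clause can be appended to the UPDATE if you want to leave hyphen-less rows untouched.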

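Given your pattern, you could also avoid regexp by taking a fixed number of characters from the right. This relies on every value having the same number of digits after the hyphen. For exactly 8 digits, as in the format you describe:
update myTable
set col1 = concat('4-', right(col1,8))
or
update myTable
set col1 = concat('4', right(col1,9))
Note that the sample values you show have 9 digits after the hyphen; for those, the lengths would need to be 9 and 10 respectively.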

Try this:
UPDATE testing SET val=REPLACE(val,SUBSTRING(val,1,LOCATE('-',val)),'4-');
Fiddle here :https://www.db-fiddle.com/f/4mU5ctLh8NB9iKSKZF9Ue2/2
This uses LOCATE to find the position of '-', SUBSTRING to take everything up to and including it, and REPLACE to swap that prefix for '4-'.

SELECT CONCAT( @new_prefix, SUBSTRING(old_value FROM LOCATE('-', old_value)) ) AS new_value
UPDATE sourcetable
SET fieldname = CONCAT( '4', SUBSTRING(fieldname FROM LOCATE('-', fieldname)) )
WHERE LOCATE('-', fieldname)
/* AND other conditions */
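To see what the CONCAT/SUBSTRING/LOCATE expression does, here is a small sketch on a literal value. It assumes the new prefix is held in a user variable; the variable name and the sample value are only for illustration.

```sql
-- Assumption: the new prefix is stored in a user variable.
SET @new_prefix = '4';

-- LOCATE finds the hyphen (position 3 here); SUBSTRING ... FROM keeps
-- everything from the hyphen onward, so the hyphen itself is preserved.
SELECT CONCAT(@new_prefix,
              SUBSTRING('12-573789475' FROM LOCATE('-', '12-573789475'))) AS new_value;
-- returns '4-573789475'
```

LOCATE returns 0 when there is no hyphen, and SUBSTRING from position 0 yields an empty string, which is why the UPDATE filters on LOCATE('-', fieldname).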

Related

Insert a "-" after third character for whole column in SQL

I have a column (Name is the header of the column) with 8 character numbers. I am looking for a query to insert a '-' after the third character of every row of data.
For example if I have:
| Name |
|----------|
| 99912345 |
I want to get:
| Name |
|-----------|
| 999-12345 |
I have tried the following:
SELECT INSERT(name, 3, 0, "-");
The database I am using is called temp.Test1, on MySQL.
You were close:
SELECT INSERT(name, 4, 0, '-') from mytable
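The reason position 4 works is that INSERT(str, pos, len, newstr) replaces len characters starting at pos; with len = 0 nothing is removed and newstr lands in front of the character at pos. A quick sketch on literal values (the UPDATE at the end is only an assumption about how you might persist the change, and supposes name is a string column of 8-character values in a table called mytable):

```sql
SELECT INSERT('99912345', 4, 0, '-') AS formatted;  -- returns '999-12345'
SELECT INSERT('99912345', 3, 0, '-') AS formatted;  -- returns '99-912345'

-- Hypothetical: store the formatted value instead of just selecting it.
UPDATE mytable
SET name = INSERT(name, 4, 0, '-')
WHERE CHAR_LENGTH(name) = 8;   -- guard so already formatted rows are skipped
```

Note that the original attempt also lacked a FROM clause, so MySQL had no table to read name from.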
In MySQL, use SUBSTRING to split the value and CONCAT to put it back together.
set @test = 99912345;
select concat(
  substring(@test, 1, 3),
  '-',
  substring(@test, 4)
);
This gives 999-12345.
Edit: You can also make a virtual column which does this for you, and just retrieve the column in your application.
alter table `test1`
add `formattedName` varchar(9) as (
concat(
substring(`name`, 1, 3),
'-',
substring(`name`, 4)
)
);
select `formattedName` from `test1`;


MySQL Select and Remove JSON Characters from a Column [duplicate]

I'm trying to replace a bunch of characters in a MySQL field. I know the REPLACE function but that only replaces one string at a time. I can't see any appropriate functions in the manual.
Can I replace or delete multiple strings at once? For example I need to replace spaces with dashes and remove other punctuation.
You can chain REPLACE functions:
select replace(replace('hello world','world','earth'),'hello','hi')
This will print hi earth.
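Applied to the case in the question (spaces to dashes, other punctuation removed), a chained call could look like the sketch below. The sample string and the particular punctuation characters are just assumptions for illustration.

```sql
-- Innermost call runs first: drop commas, then full stops,
-- then turn the remaining spaces into dashes.
SELECT REPLACE(
         REPLACE(
           REPLACE('Hello, world. Again', ',', ''),
           '.', ''),
         ' ', '-') AS cleaned;
-- returns 'Hello-world-Again'
```

Each extra character to strip needs one more level of nesting, which is why this gets unwieldy for long lists.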
You can even use subqueries to replace multiple strings!
select replace(london_english,'hello','hi') as warwickshire_english
from (
select replace('hello world','world','earth') as london_english
) sub
Or use a JOIN to replace them:
select group_concat(newword separator ' ')
from (
select 'hello' as oldword
union all
select 'world'
) orig
inner join (
select 'hello' as oldword, 'hi' as newword
union all
select 'world', 'earth'
) trans on orig.oldword = trans.oldword
I'll leave translation using common table expressions as an exercise for the reader ;)
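For what it's worth, one possible take on that exercise is sketched below. It needs MySQL 8.0 or later for the WITH clause. The pos column and the ORDER BY inside GROUP_CONCAT are my own additions, so that the word order does not depend on how the join happens to return rows.

```sql
WITH orig (pos, oldword) AS (
    SELECT 1, 'hello'
    UNION ALL
    SELECT 2, 'world'
),
trans (oldword, newword) AS (
    SELECT 'hello', 'hi'
    UNION ALL
    SELECT 'world', 'earth'
)
SELECT GROUP_CONCAT(trans.newword ORDER BY orig.pos SEPARATOR ' ') AS translated
FROM orig
INNER JOIN trans ON orig.oldword = trans.oldword;
-- returns 'hi earth'
```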
Nesting (cascading) REPLACE calls is the only simple, straightforward way to replace multiple characters in MySQL.
UPDATE table1
SET column1 = replace(replace(REPLACE(column1, '\r\n', ''), '<br />',''), '<\r>','')
REPLACE does a good simple job of replacing characters or phrases everywhere they appear in a string. But when cleansing punctuation you may need to look for patterns - e.g. a sequence of whitespace or characters in the middle of a word or after a full stop. If that's the case, a regular expression replace function would be much more powerful.
UPDATE: If using MySQL version 8+, a REGEXP_REPLACE function is provided and can be invoked as follows:
SELECT txt,
REGEXP_REPLACE(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'') AS `reg_replaced`
FROM test;
See this DB Fiddle online demo.
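To try the same expression without a table, it can be run on a literal (MySQL 8.0 or later; the sample string is only an assumption):

```sql
-- REPLACE turns spaces into dashes first; REGEXP_REPLACE then removes
-- every run of characters that is not a letter, a digit or a dash.
SELECT REGEXP_REPLACE(REPLACE('Hello, world! 2024', ' ', '-'),
                      '[^a-zA-Z0-9-]+',
                      '') AS reg_replaced;
-- returns 'Hello-world-2024'
```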
PREVIOUS ANSWER - only read on if you are using a version of MySQL before version 8:
The bad news is MySQL doesn't provide such a thing but the good news is it's possible to provide a workaround - see this blog post.
Can I replace or delete multiple strings at once? For example I need
to replace spaces with dashes and remove other punctuation.
The above can be achieved with a combination of the regular expression replacer and the standard REPLACE function. It can be seen in action in this online Rextester demo.
SQL (excluding the function code for brevity):
SELECT txt,
reg_replace(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'',
TRUE,
0,
0
) AS `reg_replaced`
FROM test;
CREATE FUNCTION IF NOT EXISTS num_as_word (name TEXT) RETURNS TEXT RETURN
(
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(IFNULL(name, ''),
'1', 'one'),
'2', 'two'),
'3', 'three'),
'4', 'four'),
'5', 'five'),
'6', 'six'),
'7', 'seven'),
'8', 'eight'),
'9', 'nine')
);
I've been using lib_mysqludf_preg for this which allows you to:
Use PCRE regular expressions directly in MySQL
With this library installed you could do something like this:
SELECT preg_replace('/(\\.|com|www)/','','www.example.com');
Which would give you:
example
In PHP, you can build the nested REPLACE expression in a loop:
$dataToReplace = [1 => 'one', 2 => 'two', 3 => 'three'];
$sqlReplace = '';
foreach ($dataToReplace as $key => $val) {
$sqlReplace = 'REPLACE(' . ($sqlReplace ? $sqlReplace : 'replace_field') . ', "' . $key . '", "' . $val . '")';
}
echo $sqlReplace;
Result:
REPLACE(
REPLACE(
REPLACE(replace_field, "1", "one"),
"2", "two"),
"3", "three");
UPDATE schools SET
slug = lower(name),
slug = REPLACE(slug, '|', ' '),
slug = replace(slug, '.', ' '),
slug = replace(slug, '"', ' '),
slug = replace(slug, '#', ' '),
slug = replace(slug, ',', ' '),
slug = replace(slug, '\'', ''),
slug = trim(slug),
slug = replace(slug, ' ', '-'),
slug = replace(slug, '--', '-');
UPDATE schools SET
slug = replace(slug, '--', '-');
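The second UPDATE is there because a single REPLACE pass only halves a run of dashes, so longer runs survive. On MySQL 8.0 or later, a regular expression can collapse a run of any length in one go. This is a sketch of an alternative, not part of the statements above:

```sql
-- One REPLACE pass over five dashes still leaves three:
SELECT REPLACE('a-----b', '--', '-') AS one_pass;             -- returns 'a---b'

-- '-{2,}' matches any run of two or more dashes:
SELECT REGEXP_REPLACE('a-----b', '-{2,}', '-') AS collapsed;  -- returns 'a-b'

-- Applied to the slug column from the statements above:
UPDATE schools
SET slug = REGEXP_REPLACE(slug, '-{2,}', '-');
```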
If you are using MySQL 8.0 or later, the built-in REGEXP_REPLACE function may serve you better.
| String                     | Replace  | Output          |
|----------------------------|----------|-----------------|
| w"w\'w. ex%a&m:p l–e.c)o(m | "'%&:)(– | www.example.com |
MySQL Query:
SELECT REGEXP_REPLACE('`w"w\'w. ex%a&m:p l–e.c)o(m`', '[("\'%[:blank:]&:–)]', '');
A broader pattern that covers almost all troublesome characters (column stands for your own column name):
SELECT REGEXP_REPLACE(column, '[\("\'%[[:blank:]]&:–,#$#!;\\[\\]\)<>\?\*\^]+','')
Real-life scenario.
I had to rename all the files stored in the 'demo' table whose names contained special characters.
SELECT * FROM demo;
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers upload winners .png |
| private://webform/applicant_details/129/student : class & teacher data.pdf |
| private://webform/applicant_details/130/tax---user's---data__upload.pdf |
| private://webform/applicant_details/130/Applicant Details _ report_0_2.pdf |
| private://webform/applicant_details/131/india&asia%population huge.pdf |
Test case: the table has multiple rows with special characters in the file name.
Goal: remove all special characters from the file name, keeping only a-z, A-Z, 0-9, dot and underscore, and convert the name to lower case.
Expected result:
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers_upload_winners_.png |
| private://webform/applicant_details/129/student_class_teacher_data.pdf |
| private://webform/applicant_details/130/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/130/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/131/india_asia_population_huge.pdf |
Okay, let's plan this step by step:
1st - extract the file name
2nd - run all the find-and-replace operations on the file name part only
3rd - replace the old file name with the new one
How can we do this?
Let's break the whole action down into chunks for a better understanding.
The query below extracts only the file name from the full path, e.g. "Applicant Details _ report_0_2.pdf":
SELECT -- MySQL SELECT statement
SUBSTRING_INDEX -- MySQL built-in function
( -- Function start Parentheses
uri, -- my table column
'/', -- delimiter (the last / in full path; left to right ->)
-1 -- start from the last and find the 1st one (from right to left <-)
) -- Function end Parentheses
from -- MySQL FROM statement
demo; -- My table name
#1 Query result
| uri |
|------------------------------------|
| offers upload winners .png |
| student : class & teacher data.pdf |
| tax---user's---data__upload.pdf |
| Applicant Details _ report_0_2.pdf |
| india&asia%population huge.pdf |
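The sign of the third argument decides which side SUBSTRING_INDEX counts from. A few literal examples as a sketch (the path and file name here are made up for illustration):

```sql
-- Negative count: keep everything to the right of the last '/'.
SELECT SUBSTRING_INDEX('private://webform/applicant_details/129/report.pdf', '/', -1) AS file_name;
-- returns 'report.pdf'

-- Positive count: keep everything to the left of the Nth '/'.
SELECT SUBSTRING_INDEX('a/b/c', '/', 2) AS head;   -- returns 'a/b'

-- Negative count of 2: keep everything right of the 2nd '/' from the end.
SELECT SUBSTRING_INDEX('a/b/c', '/', -2) AS tail;  -- returns 'b/c'
```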
Now we have to find and replace within the generated file name result.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]+', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#2 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population_huge.pdf |
FYI - the trailing '+' in the pattern makes a whole run of consecutive disallowed characters (such as '----' or several spaces) collapse into a single '_'. Compare the result without the '+' in the regex pattern below.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#3 Query result
| uri |
|------------------------------------|
| offers___upload__winners_.png |
| student___class___teacher_data.pdf |
| tax___user_s___data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population__huge.pdf |
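The difference is easy to reproduce on a single literal. This is a sketch using a shortened version of one of the file names above:

```sql
-- With '+': the run '---' is one match, so it becomes a single '_'.
SELECT REGEXP_REPLACE('tax---user''s', '[^a-zA-Z0-9_.]+', '_') AS with_plus;
-- returns 'tax_user_s'

-- Without '+': every disallowed character is replaced on its own.
SELECT REGEXP_REPLACE('tax---user''s', '[^a-zA-Z0-9_.]', '_') AS without_plus;
-- returns 'tax___user_s'
```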
Now we have a file name without special characters (only . and _ are allowed). The problem is that the file name still contains capital letters and runs of multiple underscores.
Let's convert the file name to lower case first.
SELECT
LOWER(
REGEXP_REPLACE(
SUBSTRING_INDEX(uri, '/', -1),
'[^a-zA-Z0-9_.]+',
'_'
)
) AS uri
from
demo;
#4 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| applicant_details___report_0_2.pdf |
| india_asia_population_huge.pdf |
Now everything is in lower case, but runs of underscores are still there. So we will wrap the whole REGEXP_REPLACE call in one more REGEXP_REPLACE.
SELECT
LOWER(
REGEXP_REPLACE( -- this wrapper will solve the multiple underscores issue
REGEXP_REPLACE(
SUBSTRING_INDEX(uri, '/', -1),
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+', -- if 1st regex action has multiple __ then find it
'_' -- and replace them with single _
)
) AS uri
from
demo;
#5 Query result
| uri |
|----------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data_upload.pdf |
| applicant_details_report_0_2.pdf |
| india_asia_population_huge.pdf |
Congratulations! we have found what we were looking for. Now UPDATE TIME! Yeah!!
UPDATE -- run a MySQL UPDATE statement
demo -- tell MySQL to which table you want to update
SET -- put SET statement to set the updated values in desire column
uri = REPLACE( -- tell MySQL to which column you want to update,
-- I am also putting REPLACE function to replace existing values with new one
-- REPLACE (string, replace, with-this)
uri, -- my column to replace
SUBSTRING_INDEX(uri, '/', -1), -- my file name part "Applicant Details _ report_0_2.pdf"
-- without doing any action
LOWER( -- "applicant_details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details___report_0_2.pdf"
SUBSTRING_INDEX(uri, '/', -1), -- "Applicant Details _ report_0_2.pdf"
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+',
'_'
)
)
);
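One thing to be aware of: REPLACE(uri, file_name, new_name) rewrites every occurrence of the file name text inside uri, so a directory segment that happened to contain the same text would change as well. If that is a concern, a possible variant (a sketch, not what I actually ran) rebuilds the value from the directory part plus the cleaned file name:

```sql
UPDATE demo
SET uri = CONCAT(
    -- directory part: everything up to and including the last '/'
    LEFT(uri, CHAR_LENGTH(uri) - CHAR_LENGTH(SUBSTRING_INDEX(uri, '/', -1))),
    -- cleaned file name: same two-step regex as above, then lower case
    LOWER(
        REGEXP_REPLACE(
            REGEXP_REPLACE(SUBSTRING_INDEX(uri, '/', -1), '[^a-zA-Z0-9_.]+', '_'),
            '_+',
            '_'
        )
    )
);
```

Running the two expressions in a SELECT first, as in the earlier steps, is a cheap way to check the output before touching the table.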
After the UPDATE query, the result looks like this:
| uri |
|--------------------------------------------------------------------------|
| private://webform/applicant_details/152/offers_upload_winners_.png |
| private://webform/applicant_details/153/student_class_teacher_data.pdf |
| private://webform/applicant_details/153/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/154/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/154/india_asia_population_huge.pdf |
Sample data script
DROP TABLE IF EXISTS `demo`;
CREATE TABLE `demo` (
`uri` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin NOT NULL DEFAULT '' COMMENT 'The S3 URI of the file.',
`filesize` bigint unsigned NOT NULL DEFAULT '0' COMMENT 'The size of the file in bytes.',
`timestamp` int unsigned NOT NULL DEFAULT '0' COMMENT 'UNIX timestamp for when the file was added.',
`dir` int NOT NULL DEFAULT '0' COMMENT 'Boolean indicating whether or not this object is a directory.',
`version` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin DEFAULT '' COMMENT 'The S3 VersionId of the object.'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `demo` (`uri`, `filesize`, `timestamp`, `dir`, `version`) VALUES
('private://webform/applicant_details/152/offers upload winners .png', 14976905, 1658397516, 0, ''),
('private://webform/applicant_details/153/student : class & teacher data.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/153/tax---user\'s---data__upload.pdf', 98449, 1658397516, 0, ''),
('private://webform/applicant_details/154/Applicant Details _ report_0_2.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/154/india&asia%population huge.pdf', 13301, 1658397517, 0, '');
Big Thanks:
MySQL: SELECT, UPDATE, REPLACE, SUBSTRING_INDEX, LOWER, REGEXP_REPLACE
MySQL Query Formatter: Thanks to CodeBeautify for such an awesome tool.

SQL, replace part of a string with email

In my DB I have a column user_email with values:
aaa#test.com
bbb#test.com
ccc#test.com
I would only like to change part of email address that comes after #, so that the resulting column would have values:
aaa#other.net
bbb#other.net
ccc#other.net
How could I achieve that?
I've found following solution that seems to do the trick:
UPDATE table_name SET user_email = REPLACE(user_email, '#test.com', '#other.net');
Use the REPLACE function:
select replace(name,substring(name,position('#' in name),length(name)-position('#' in name)+1),'#other.net')
select replace('aaa#test.com',substring('aaa#test.com',position('#' in 'aaa#test.com'),
length('aaa#test.com')-position('#' in 'aaa#test.com')+1),'#other.net')
output:
val n
aaa#test.com aaa#other.net
Use the SUBSTRING_INDEX and CONCAT functions:
select concat(SUBSTRING_INDEX("aaa#test.com", "#", 1),'#other.net')
output aaa#other.net
so for your column user_email it would be
select concat(SUBSTRING_INDEX(user_email, "#", 1),'#other.net')
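To persist the change rather than just select it, the same expression can go into an UPDATE. The sketch below borrows table_name from the question's own statement and keeps the separator character exactly as written there; the WHERE clause is an assumption that only addresses on the old domain should be touched.

```sql
-- Preview which rows would change and what they would become.
SELECT user_email AS old_value,
       CONCAT(SUBSTRING_INDEX(user_email, '#', 1), '#other.net') AS new_value
FROM table_name
WHERE user_email LIKE '%#test.com';

-- Apply it: keep the part before the separator, append the new domain.
UPDATE table_name
SET user_email = CONCAT(SUBSTRING_INDEX(user_email, '#', 1), '#other.net')
WHERE user_email LIKE '%#test.com';
```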
select concat
(substring
('bbb#test.com',1,char_length
('bbb#test.com')-
char_length
(substring_index
('bbb#test.com','.',-1))),'net') x;
| x |
| ------------ |
| bbb#test.net |
You might use replace, substr and instr together as :
SELECT replace( 'aaa#test.com',
substr('aaa#test.com',instr('aaa#test.com','#'),length('aaa#test.com'))
,'#other.net') as result_str;
result_str
-------------
aaa#other.net
or, from your table (tab) with a column called email:
select replace(email,substr(email,instr(email,'#'),length(email)),'#other.net') result_str
from tab;
result_str
-------------
aaa#other.net
bbb#other.net
ccc#other.net
Rextester Demo
