How to create JSON format with GROUP_CONCAT in MySQL?

How can I create JSON-formatted output with GROUP_CONCAT in MySQL?
(I use MySQL)
Example:
table1:
email         | name | phone
-------------------------------------
my1@gmail.com | Ben  | 6555333
my2@gmail.com | Tom  | 2322452
my2@gmail.com | Dan  | 8768768
my1@gmail.com | Joi  | 3434356
Something like the following query, which does not give me the format I need:
select email, group_concat(name, phone) as list from table1
group by email
Output that I need:
email         | list
------------------------------------------------
my1@gmail.com | {name:"Ben",phone:"6555333"},{name:"Joi",phone:"3434356"}
my2@gmail.com | {name:"Tom",phone:"2322452"},{name:"Dan",phone:"8768768"}
Thanks

With MySQL 5.7.8 and later, you can use the JSON_OBJECT function to achieve the desired result, like so:
GROUP_CONCAT(
  JSON_OBJECT(
    'name', name,
    'phone', phone
  )
) AS list
To get the SQL response ready to be parsed as an array:
CONCAT(
  '[',
  GROUP_CONCAT(
    JSON_OBJECT(
      'name', name,
      'phone', phone
    )
  ),
  ']'
) AS list
This will give you a string like: [{"name": "ABC", "phone": "111"},{"name": "DEF", "phone": "222"}] which can be JSON parsed.
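On the application side, the string produced by wrapping GROUP_CONCAT(JSON_OBJECT(...)) in brackets can be decoded with any JSON parser. A minimal Python sketch, assuming the `list` column has already been fetched as a string; the sample value and the `parse_contacts` helper are hand-written for illustration, not captured from a server:

```python
import json

# Hypothetical value of the `list` column for one email group.
list_column = '[{"name": "Ben", "phone": "6555333"},{"name": "Joi", "phone": "3434356"}]'


def parse_contacts(raw):
    """Decode the JSON array string into a list of dicts."""
    return json.loads(raw)


contacts = parse_contacts(list_column)
names = [contact["name"] for contact in contacts]
```

Keep in mind that GROUP_CONCAT output is cut off at group_concat_max_len (1024 by default), so a large group can come back as a truncated string that no longer parses; raising that session variable is the usual remedy.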

Try this query:
SELECT
  email,
  GROUP_CONCAT(CONCAT('{name:"', name, '", phone:"', phone, '"}')) list
FROM
  table1
GROUP BY
  email;
Result (in the format requested in the question):
+---------------+-------------------------------------------------------------+
| email         | list                                                        |
+---------------+-------------------------------------------------------------+
| my1@gmail.com | {name:"Ben", phone:"6555333"},{name:"Joi", phone:"3434356"} |
| my2@gmail.com | {name:"Tom", phone:"2322452"},{name:"Dan", phone:"8768768"} |
+---------------+-------------------------------------------------------------+
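The string built this way has unquoted keys, which matches the format requested in the question but is not strict JSON. A small Python sketch of the difference, using hand-written sample strings and a hypothetical `is_strict_json` helper:

```python
import json


def is_strict_json(text):
    """Return True if a strict JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Same data, once with bare keys (as concatenated above) and once with quoted keys.
hand_rolled = '[{name:"Ben", phone:"6555333"},{name:"Joi", phone:"3434356"}]'
quoted_keys = '[{"name":"Ben", "phone":"6555333"},{"name":"Joi", "phone":"3434356"}]'
```

If the consumer is a strict parser, quoting the keys inside the CONCAT literal (or using JSON_OBJECT where available) avoids the problem.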

For MySQL 5.7.22+:
SELECT
  email,
  JSON_ARRAYAGG(
    JSON_OBJECT(
      'name', name,
      'phone', phone
    )
  ) AS list
FROM table1
GROUP BY email;
Result:
+---------------+-------------------------------------------------------------------+
| email         | list                                                              |
+---------------+-------------------------------------------------------------------+
| my1@gmail.com | [{"name":"Ben", "phone":6555333},{"name":"Joi", "phone":3434356}] |
| my2@gmail.com | [{"name":"Tom", "phone":2322452},{"name":"Dan", "phone":8768768}] |
+---------------+-------------------------------------------------------------------+
The only difference is that the list column is now valid JSON, so you can parse it directly as JSON.

You can use the JSON aggregate functions:
For arrays:
JSON_ARRAYAGG(col_or_expr) AS ...
For objects:
JSON_OBJECTAGG(key, value) AS ...

Devart's answer above is great, but K2xL's question is valid. The answer I found was to hexadecimal-encode the name column using HEX(), which ensures that the result is valid JSON. Then, in the application, convert the hexadecimal back into a string.
(Sorry for the self-promotion, but) I wrote a little blog post about this with a little more detail:
http://www.alexkorn.com/blog/2015/05/hand-rolling-valid-json-in-mysql-using-group_concat/
[Edit for Oriol] Here's an example:
SELECT email,
  CONCAT(
    '[',
    COALESCE(
      GROUP_CONCAT(
        CONCAT(
          '{',
          '\"name\": \"', HEX(name), '\", ',
          '\"phone\": \"', HEX(phone), '\"',
          '}')
        ORDER BY name ASC
        SEPARATOR ','),
      ''),
    ']') AS bData
FROM table1
GROUP BY email
Also note I've added a COALESCE in case there are no items for that email.
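The application-side half of this approach, turning the hexadecimal back into text, might look like the following Python sketch. It assumes both columns are string columns (HEX() of a numeric column encodes the number itself, not its digits) and that the data is UTF-8; `decode_hex_fields` and the sample `b_data` value are hypothetical.

```python
import json


def decode_hex_fields(raw, fields=("name", "phone")):
    """Parse the bData JSON string and hex-decode the listed fields."""
    rows = json.loads(raw)
    for row in rows:
        for field in fields:
            # bytes.fromhex reverses MySQL's HEX() for string values.
            row[field] = bytes.fromhex(row[field]).decode("utf-8")
    return rows


# Hand-written sample: HEX('Ben') and HEX('6555333').
b_data = '[{"name": "42656E", "phone": "36353535333333"}]'
decoded = decode_hex_fields(b_data)
```

Because hexadecimal output only contains the characters 0-9 and A-F, nothing in the encoded value can break the surrounding JSON quoting.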

Similar to Madacol's answer above, but slightly different. Instead of JSON_ARRAYAGG, you could also CAST AS JSON:
SELECT
  email,
  CAST( CONCAT(
    '[',
    GROUP_CONCAT(
      JSON_OBJECT(
        'name', name,
        'phone', phone
      )
    ),']') AS JSON ) AS list
FROM table1
GROUP BY email;
Result:
+---------------+-------------------------------------------------------------------+
| email         | list                                                              |
+---------------+-------------------------------------------------------------------+
| my1@gmail.com | [{"name":"Ben", "phone":6555333},{"name":"Joi", "phone":3434356}] |
| my2@gmail.com | [{"name":"Tom", "phone":2322452},{"name":"Dan", "phone":8768768}] |
+---------------+-------------------------------------------------------------------+

Going off of @Devart's answer: if a field contains line breaks or double quotation marks, the result will not be valid JSON.
So, if we know the "phone" field occasionally contains double quotes and line breaks, our SQL would look like this (the keys are quoted here so that a strict JSON parser accepts the result):
SELECT
  email,
  CONCAT(
    '[',
    GROUP_CONCAT(CONCAT(
      '{"name":"',
      name,
      '", "phone":"',
      REPLACE(REPLACE(phone, '"', '\\"'), '\n', '\\n'),
      '"}'
    )),
    ']'
  ) AS list
FROM table1 GROUP BY email;
If Ben's phone has a quote in the middle of it, and Joi's has a newline, the SQL would give (valid JSON) results like:
[{"name":"Ben", "phone":"655\"5333"},{"name":"Joi", "phone":"343\n4356"}]
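The escaping idea can be checked outside the database. The Python sketch below mirrors the nested REPLACE calls, quotes the keys, and adds a backslash step that the SQL above does not have; `escape_json_string` and `build_item` are hypothetical helpers, and the phone values are made up to match the scenario described.

```python
import json


def escape_json_string(value):
    """Escape the characters handled by the nested REPLACE approach."""
    # Backslashes go first so the ones added by later steps are not doubled.
    value = value.replace("\\", "\\\\")
    value = value.replace('"', '\\"')
    value = value.replace("\n", "\\n")
    return value


def build_item(name, phone):
    """Hand-roll one JSON object, as the CONCAT call does."""
    return ('{"name":"' + escape_json_string(name)
            + '", "phone":"' + escape_json_string(phone) + '"}')


items = [build_item("Ben", '655"5333'), build_item("Joi", "343\n4356")]
raw = "[" + ",".join(items) + "]"
parsed = json.loads(raw)
```

Other control characters (tabs, carriage returns) would still need their own REPLACE, which is one reason JSON_OBJECT is the safer route on servers that have it.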

Use it like this:
SELECT email, GROUP_CONCAT(CONCAT('{name:"', ur_name_column, '",phone:"', ur_phone_column, '"}')) AS list FROM table1 GROUP BY email;

Related

How to replace a regex pattern in MySQL

I have a table called myTable which has a column called col1. This column contains data in this format: (1 or 2 digits)(hyphen)(8 digits).
I want to replace all the data in this column and replace everything before hyphen with 4, so this is an example:
--------------------------------
| old values | New Values |
--------------------------------
| 1-654283568 => 4-654283568 |
| 2-467862833 => 4-467862833 |
| 8-478934293 => 4-478934293 |
| 12-573789475 => 4-573789475 |
| 16-574738575 => 4-574738575 |
--------------------------------
I am using MySQL 5.7.19, and I believe REGEXP_REPLACE is only available in MySQL 8+. How can this be achieved?
You don't need regex; you can use SUBSTRING_INDEX to extract everything after the hyphen and concatenate 4- to that:
UPDATE myTable
SET col1 = CONCAT('4-', SUBSTRING_INDEX(col1, '-', -1))
This will work regardless of the number of characters after the hyphen.
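To make the behaviour of SUBSTRING_INDEX with a negative count concrete, here is a rough Python imitation. It is only a sketch: `substring_index` and `new_value` are hypothetical helpers, and the imitation assumes a non-empty delimiter and a nonzero count.

```python
def substring_index(text, delim, count):
    """Imitate SUBSTRING_INDEX for a non-empty delim and nonzero count."""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    # Negative count: everything to the right, counting from the end.
    return delim.join(parts[count:])


def new_value(old_value):
    """Same result as CONCAT('4-', SUBSTRING_INDEX(col1, '-', -1))."""
    return "4-" + substring_index(old_value, "-", -1)
```

Because the split happens on the hyphen rather than at a fixed position, the length of the prefix and of the digit run does not matter.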
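Looking at your pattern, it seems you could avoid regexp. This assumes exactly 8 digits after the hyphen, as described; the sample values actually show 9 digits, in which case use 9 and 10 below:
update myTable
set col1 = concat('4-', right(col1, 8))
or
update myTable
set col1 = concat('4', right(col1, 9))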
Try this:
UPDATE testing SET val=REPLACE(val,SUBSTRING(val,1,LOCATE('-',val)),'4-');
Fiddle here: https://www.db-fiddle.com/f/4mU5ctLh8NB9iKSKZF9Ue2/2
This uses LOCATE to find the position of '-', SUBSTRING to extract the prefix up to and including the '-', and REPLACE to swap that prefix for '4-'.
SELECT CONCAT( @new_prefix, SUBSTRING(old_value FROM LOCATE('-', old_value)) ) AS new_value
UPDATE sourcetable
SET fieldname = CONCAT( '4', SUBSTRING(fieldname FROM LOCATE('-', fieldname)) )
WHERE LOCATE('-', fieldname)
/* AND other conditions */

MySQL Select and Remove JSON Characters from a Column [duplicate]

I'm trying to replace a bunch of characters in a MySQL field. I know the REPLACE function but that only replaces one string at a time. I can't see any appropriate functions in the manual.
Can I replace or delete multiple strings at once? For example I need to replace spaces with dashes and remove other punctuation.
You can chain REPLACE functions:
select replace(replace('hello world','world','earth'),'hello','hi')
This will print hi earth.
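For longer lists of replacements, the nested expression can be generated instead of typed by hand. A Python sketch, with assumptions stated up front: `sql_quote` is a hypothetical helper that only handles single quotes and backslashes under MySQL's default SQL mode, and `replace_field` is a placeholder column name.

```python
def sql_quote(value):
    """Quote a string literal (sketch only: handles ' and \\ and nothing else)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def build_replace_expr(column, replacements):
    """Wrap the column in one REPLACE call per (old, new) pair."""
    expr = column
    for old, new in replacements.items():
        # Each pass nests the previous expression as the first argument.
        expr = "REPLACE({}, {}, {})".format(expr, sql_quote(old), sql_quote(new))
    return expr


expr = build_replace_expr("replace_field", {"world": "earth", "hello": "hi"})
```

The innermost REPLACE runs first, so the order of the pairs matters whenever one replacement produces text that a later one would match.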
You can even use subqueries to replace multiple strings!
select replace(london_english,'hello','hi') as warwickshire_english
from (
select replace('hello world','world','earth') as london_english
) sub
Or use a JOIN to replace them:
select group_concat(newword separator ' ')
from (
select 'hello' as oldword
union all
select 'world'
) orig
inner join (
select 'hello' as oldword, 'hi' as newword
union all
select 'world', 'earth'
) trans on orig.oldword = trans.oldword
I'll leave translation using common table expressions as an exercise for the reader ;)
Cascading is the only simple and straight-forward solution to mysql for multiple character replacement.
UPDATE table1
SET column1 = replace(replace(REPLACE(column1, '\r\n', ''), '<br />',''), '<\r>','')
REPLACE does a good simple job of replacing characters or phrases everywhere they appear in a string. But when cleansing punctuation you may need to look for patterns - e.g. a sequence of whitespace or characters in the middle of a word or after a full stop. If that's the case, a regular expression replace function would be much more powerful.
UPDATE: If using MySQL version 8+, a REGEXP_REPLACE function is provided and can be invoked as follows:
SELECT txt,
REGEXP_REPLACE(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'') AS `reg_replaced`
FROM test;
See this DB Fiddle online demo.
PREVIOUS ANSWER - only read on if using a version of MySQL before version 8:
The bad news is MySQL doesn't provide such a thing but the good news is it's possible to provide a workaround - see this blog post.
Can I replace or delete multiple strings at once? For example I need
to replace spaces with dashes and remove other punctuation.
The above can be achieved with a combination of the regular expression replacer and the standard REPLACE function. It can be seen in action in this online Rextester demo.
SQL (excluding the function code for brevity):
SELECT txt,
reg_replace(REPLACE(txt, ' ', '-'),
'[^a-zA-Z0-9-]+',
'',
TRUE,
0,
0
) AS `reg_replaced`
FROM test;
CREATE FUNCTION IF NOT EXISTS num_as_word (name TEXT) RETURNS TEXT RETURN
(
SELECT
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(
REPLACE(IFNULL(name, ''),
'1', 'one'),
'2', 'two'),
'3', 'three'),
'4', 'four'),
'5', 'five'),
'6', 'six'),
'7', 'seven'),
'8', 'eight'),
'9', 'nine')
);
I've been using lib_mysqludf_preg for this which allows you to:
Use PCRE regular expressions directly in MySQL
With this library installed you could do something like this:
SELECT preg_replace('/(\\.|com|www)/','','www.example.com');
Which would give you:
example
In PHP:
$dataToReplace = [1 => 'one', 2 => 'two', 3 => 'three'];
$sqlReplace = '';
foreach ($dataToReplace as $key => $val) {
    $sqlReplace = 'REPLACE(' . ($sqlReplace ? $sqlReplace : 'replace_field') . ', "' . $key . '", "' . $val . '")';
}
echo $sqlReplace;
Result (reformatted for readability):
REPLACE(
  REPLACE(
    REPLACE(replace_field, "1", "one"),
  "2", "two"),
"3", "three")
UPDATE schools SET
slug = lower(name),
slug = REPLACE(slug, '|', ' '),
slug = replace(slug, '.', ' '),
slug = replace(slug, '"', ' '),
slug = replace(slug, '#', ' '),
slug = replace(slug, ',', ' '),
slug = replace(slug, '\'', ''),
slug = trim(slug),
slug = replace(slug, ' ', '-'),
slug = replace(slug, '--', '-');
UPDATE schools SET
slug = replace(slug, '--', '-');
If you are using MySQL version 8+, the built-in REGEXP_REPLACE function may serve you better.
| String                     | Replace  | Output          |
|----------------------------|----------|-----------------|
| w"w\'w. ex%a&m:p l–e.c)o(m | "'%&:)(– | www.example.com |
MySQL Query:
SELECT REGEXP_REPLACE('w"w\'w. ex%a&m:p l–e.c)o(m', '[("\'%[:blank:]&:–)]', '');
For almost all troublesome characters:
SELECT REGEXP_REPLACE(column, '[\("\'%[[:blank:]]&:–,#$#!;\\[\\]\)<>\?\*\^]+','')
Real-life scenario:
I had to rename all the files stored in the 'demo' table whose names contain special characters.
SELECT * FROM demo;
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers upload winners .png |
| private://webform/applicant_details/129/student : class & teacher data.pdf |
| private://webform/applicant_details/130/tax---user's---data__upload.pdf |
| private://webform/applicant_details/130/Applicant Details _ report_0_2.pdf |
| private://webform/applicant_details/131/india&asia%population huge.pdf |
Test case:
The table has multiple rows with special characters in the file name.
Goal:
Remove all special characters from the file name, keeping only a-z, A-Z, 0-9, dot and underscore, and lowercase the result.
Expected result is:
| uri |
|------------------------------------------------------------------------------|
| private://webform/applicant_details/129/offers_upload_winners_.png |
| private://webform/applicant_details/129/student_class_teacher_data.pdf |
| private://webform/applicant_details/130/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/130/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/131/india_asia_population_huge.pdf |
Okay, let's plan step by step:
1st - find the file name
2nd - run all the find-and-replace steps on that file name part only
3rd - replace the old file name with the new one
How can we do this?
Let's break the whole action down into chunks for a better understanding.
The query below extracts only the file name from the full path, e.g. "Applicant Details _ report_0_2.pdf"
SELECT -- MySQL SELECT statement
SUBSTRING_INDEX -- MySQL built-in function
( -- Function start Parentheses
uri, -- my table column
'/', -- delimiter (the last / in full path; left to right ->)
-1 -- start from the last and find the 1st one (from right to left <-)
) -- Function end Parentheses
from -- MySQL FROM statement
demo; -- My table name
#1 Query result
| uri |
|------------------------------------|
| offers upload winners .png |
| student : class & teacher data.pdf |
| tax---user's---data__upload.pdf |
| Applicant Details _ report_0_2.pdf |
| india&asia%population huge.pdf |
Now we have to find and replace within the generated file name result.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]+', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#2 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population_huge.pdf |
FYI - the trailing '+' in the pattern handles runs of repeated characters such as ---- or multiple spaces, so that each run becomes a single underscore. Notice the result without '+' in the regex pattern below.
SELECT
REGEXP_REPLACE( -- MySQL REGEXP_REPLACE built-in function (string, pattern, replace)
SUBSTRING_INDEX(uri, '/', -1), -- File name only
'[^a-zA-Z0-9_.]', -- Find everything which is not a-z, A-Z, 0-9, . or _.
'_' -- Replace with _
) AS uri -- Give a alias column name for whole result
from
demo;
#3 Query result
| uri |
|------------------------------------|
| offers___upload__winners_.png |
| student___class___teacher_data.pdf |
| tax___user_s___data__upload.pdf |
| Applicant_Details___report_0_2.pdf |
| india_asia_population__huge.pdf |
Now we have a file name without special characters (only . and _ are allowed). But the file name still has capital letters and, in some cases, several underscores in a row.
Let's lowercase the file name first.
SELECT
  LOWER(
    REGEXP_REPLACE(
      SUBSTRING_INDEX(uri, '/', -1),
      '[^a-zA-Z0-9_.]+',
      '_'
    )
  ) AS uri
from
  demo;
#4 Query result
| uri |
|------------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data__upload.pdf |
| applicant_details___report_0_2.pdf |
| india_asia_population_huge.pdf |
Now everything is in lower case, but the repeated underscores are still there. So we will wrap the whole REGEXP_REPLACE call in one more REGEXP_REPLACE.
SELECT
LOWER(
REGEXP_REPLACE( -- this wrapper will solve the multiple underscores issue
REGEXP_REPLACE(
SUBSTRING_INDEX(uri, '/', -1),
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+', -- if 1st regex action has multiple __ then find it
'_' -- and replace them with single _
)
) AS uri
from
demo;
#5 Query result
| uri |
|----------------------------------|
| offers_upload_winners_.png |
| student_class_teacher_data.pdf |
| tax_user_s_data_upload.pdf |
| applicant_details_report_0_2.pdf |
| india_asia_population_huge.pdf |
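The same three steps (strip disallowed characters, collapse underscores, lowercase) can be cross-checked in application code before running anything against the table. A Python sketch using the same two patterns; `sanitize_filename` and `sanitize_uri` are hypothetical helper names, and unlike the SQL REPLACE approach this version only ever touches the last path segment.

```python
import re


def sanitize_filename(name):
    """Apply the two regex passes and the lowercase step to a file name."""
    # Pass 1: any run of characters outside a-z, A-Z, 0-9, _ and . becomes "_".
    name = re.sub(r"[^a-zA-Z0-9_.]+", "_", name)
    # Pass 2: collapse runs of underscores left over from pass 1.
    name = re.sub(r"_+", "_", name)
    return name.lower()


def sanitize_uri(uri):
    """Sanitize only the part after the last '/', keeping the path as-is."""
    head, sep, filename = uri.rpartition("/")
    return head + sep + sanitize_filename(filename)
```

For example, `sanitize_uri` leaves the `private://webform/applicant_details/154/` prefix untouched and rewrites only `Applicant Details _ report_0_2.pdf`.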
Congratulations! we have found what we were looking for. Now UPDATE TIME! Yeah!!
UPDATE -- run a MySQL UPDATE statement
demo -- tell MySQL to which table you want to update
SET -- put SET statement to set the updated values in desire column
uri = REPLACE( -- tell MySQL to which column you want to update,
-- I am also putting REPLACE function to replace existing values with new one
-- REPLACE (string, replace, with-this)
uri, -- my column to replace
SUBSTRING_INDEX(uri, '/', -1), -- my file name part "Applicant Details _ report_0_2.pdf"
-- without doing any action
LOWER( -- "applicant_details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details_report_0_2.pdf"
REGEXP_REPLACE( -- "Applicant_Details___report_0_2.pdf"
SUBSTRING_INDEX(uri, '/', -1), -- "Applicant Details _ report_0_2.pdf"
'[^a-zA-Z0-9_.]+',
'_'
),
'[_]+',
'_'
)
)
);
After the UPDATE query, the result looks like this:
| uri |
|--------------------------------------------------------------------------|
| private://webform/applicant_details/152/offers_upload_winners_.png |
| private://webform/applicant_details/153/student_class_teacher_data.pdf |
| private://webform/applicant_details/153/tax_user_s_data_upload.pdf |
| private://webform/applicant_details/154/applicant_details_report_0_2.pdf |
| private://webform/applicant_details/154/india_asia_population_huge.pdf |
Sample data script
DROP TABLE IF EXISTS `demo`;
CREATE TABLE `demo` (
`uri` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin NOT NULL DEFAULT '' COMMENT 'The S3 URI of the file.',
`filesize` bigint unsigned NOT NULL DEFAULT '0' COMMENT 'The size of the file in bytes.',
`timestamp` int unsigned NOT NULL DEFAULT '0' COMMENT 'UNIX timestamp for when the file was added.',
`dir` int NOT NULL DEFAULT '0' COMMENT 'Boolean indicating whether or not this object is a directory.',
`version` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8_bin DEFAULT '' COMMENT 'The S3 VersionId of the object.'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `demo` (`uri`, `filesize`, `timestamp`, `dir`, `version`) VALUES
('private://webform/applicant_details/152/offers upload winners .png', 14976905, 1658397516, 0, ''),
('private://webform/applicant_details/153/student : class & teacher data.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/153/tax---user\'s---data__upload.pdf', 98449, 1658397516, 0, ''),
('private://webform/applicant_details/154/Applicant Details _ report_0_2.pdf', 0, 1659525447, 1, ''),
('private://webform/applicant_details/154/india&asia%population huge.pdf', 13301, 1658397517, 0, '');
Big Thanks:
MySQL: SELECT, UPDATE, REPLACE, SUBSTRING_INDEX, LOWER, REGEXP_REPLACE
MySQL Query Formatter: Thanks to CodeBeautify for such an awesome tool.

How to create SQL Query that concatenates and parses value in result set

How to make an SQL query for a table based on the following conditions:
Result is a single column that concatenates all fields delimited by a dash into a single string (ex: FieldA-FieldB-FieldC-FieldD-FieldE)
If a given field is NULL or if the field's value is a string such as "EMPTY" or "NA", do not concatenate that field's value into the result string
Example Table Person (FirstName, LastName, Street, City, State):
Bob | Dylan | 555 Street | Mountain View | California
Ally | M | NULL | Seattle | Washington
Jan | Van | EMPTY | EMPTY | Oregon
Nancy | Finn | EMPTY | EMPTY | NA
Don | William | NULL | EMPTY | Illinois
Result:
Bob-Dylan-555 Street-Mountain View-California
Ally-M-Seattle-Washington
Jan-Van-Oregon
Nancy-Finn
Don-William-Illinois
I know this can be done programmatically, but I wanted to know whether it can be done in SQL and whether it would be more efficient to do so in the query itself.
Fully-baked solution for SQL Server 2017 and above:
SELECT *
FROM Person p
OUTER APPLY (
SELECT STRING_AGG(NULLIF(NULLIF(val, 'EMPTY'), 'NA'), '-')
WITHIN GROUP (ORDER BY n) AS val
FROM (VALUES (1, p.FirstName), (2, p.LastName),(3, p.Street),
(4,p.City), (5, p.State)) z(n, val)
)sub;
DBFiddle Demo
MySQL version using CONCAT_WS:
CONCAT_WS() stands for Concatenate With Separator and is a special form of CONCAT(). The first argument is the separator for the rest of the arguments. The separator is added between the strings to be concatenated. The separator can be a string, as can the rest of the arguments. If the separator is NULL, the result is NULL.
CONCAT_WS() does not skip empty strings. However, it does skip any NULL values after the separator argument.
SELECT CONCAT_WS('-',
NULLIF(NULLIF(FirstName, 'EMPTY'), 'NA'),
NULLIF(NULLIF(LastName, 'EMPTY'), 'NA'),
NULLIF(NULLIF(Street, 'EMPTY'), 'NA'),
NULLIF(NULLIF(City, 'EMPTY'), 'NA'),
NULLIF(NULLIF(State, 'EMPTY'), 'NA')) AS r
FROM Person p;
DBFiddle Demo2
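For comparison with the programmatic route mentioned in the question, the CONCAT_WS plus NULLIF logic corresponds roughly to the Python sketch below. `SKIP_VALUES` and `join_fields` are hypothetical names; note that the Python comparison is case-sensitive, whereas NULLIF follows the column's collation.

```python
SKIP_VALUES = {"EMPTY", "NA"}


def join_fields(fields, separator="-"):
    """Join the fields, skipping None and the placeholder strings."""
    kept = [f for f in fields if f is not None and f not in SKIP_VALUES]
    return separator.join(kept)


# Rows from the example Person table; None stands in for SQL NULL.
people = [
    ["Bob", "Dylan", "555 Street", "Mountain View", "California"],
    ["Ally", "M", None, "Seattle", "Washington"],
    ["Nancy", "Finn", "EMPTY", "EMPTY", "NA"],
]
results = [join_fields(person) for person in people]
```

Doing this in the query avoids a second pass over the rows in the application, but either approach does the same small amount of work per row.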
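First, use CONCAT to concatenate the fields. CONCAT returns NULL as soon as any argument is NULL, so wrap each field in IFNULL.
Then use REPLACE to strip the placeholder value (the separators of skipped fields stay in place):
SELECT REPLACE( CONCAT( IFNULL(field1, ''), '-', IFNULL(field2, ''), '-', IFNULL(field3, '') ), 'EMPTY', '' )
FROM `table`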
Try this:
SELECT ISNULL(FirstName,'') + '-' +
  ISNULL(LastName,'') + '-' +
  ISNULL(City,'') + '-' +
  ISNULL(State,'')
FROM Person
Or like this:
SELECT CASE WHEN ISNULL(FirstName,'') = '' THEN '' ELSE FirstName + '-' END +
  CASE WHEN ISNULL(LastName,'') = '' THEN '' ELSE LastName + '-' END +
  CASE WHEN ISNULL(City,'') = '' THEN '' ELSE City + '-' END +
  CASE WHEN ISNULL(State,'') = '' THEN '' ELSE State END AS ColumnName
FROM Person
Your select should be something like this:
select isnull(FieldA,'') + '-' + isnull(FieldB,'') + '-' + isnull(FieldC,'') ....
and so on.
This will work on MS SQL Server. If you don't want the '-' when the previous field is NULL, then you should use a CASE expression.
If you also want to remove 'Empty' or 'Null' strings, then you should use:
select replace(replace( isnull(FieldA+'-','') , 'Empty' , ''),'Null', '')
I have modified isnull() following Nitin_g3's observation.

Selecting rows with GROUP_CONCAT in MySQL in a specific order

I have a table called "tblVersion" that looks something like...
| key | value |
|-------------------------|
| buildVersion | 5 |
| minorVersion | 4 |
| majorVersion | 2 |
I want to build a query that will return the string "2.4.5", i.e. majorVersion.minorVersion.buildVersion.
So far I have
SELECT GROUP_CONCAT(tblVersion.value SEPARATOR '.' ) AS softwareVersion
FROM tblVersion
WHERE tblVersion.key = 'majorVersion'
OR tblVersion.key = 'minorVersion' OR tblVersion.key = 'buildVersion'
This returns "5.2.4" and I can't seem to get the string in the correct order.
Is it possible to be specific about the order the values are displayed?
Use order by FIELD
SELECT GROUP_CONCAT(value order by FIELD(tblVersion.key , 'majorVersion', 'minorVersion' , 'buildVersion') SEPARATOR '.' ) AS softwareVersion
FROM tblVersion
WHERE tblVersion.key = 'majorVersion'
OR tblVersion.key = 'minorVersion' OR tblVersion.key = 'buildVersion';
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_field
demo: http://sqlfiddle.com/#!2/8ef367/4
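ORDER BY FIELD(...) sorts each row by the position of its key in the argument list. The Python sketch below imitates that ordering for the sample table; `ORDER` and `software_version` are hypothetical names, and the filter plays the role of the WHERE clause.

```python
ORDER = ["majorVersion", "minorVersion", "buildVersion"]


def software_version(rows, order=ORDER):
    """Join the values in the order given by `order`, like ORDER BY FIELD."""
    # Keep only the keys we care about (the WHERE clause).
    wanted = [(key, value) for key, value in rows if key in order]
    # Sort by each key's position in the list (what FIELD returns).
    wanted.sort(key=lambda pair: order.index(pair[0]))
    return ".".join(str(value) for _, value in wanted)


rows = [("buildVersion", 5), ("minorVersion", 4), ("majorVersion", 2)]
version = software_version(rows)
```

Changing the order of the list changes the order of the output, exactly as reordering the FIELD arguments does in the query.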
You should be able to use a CASE expression in the ORDER BY of the GROUP_CONCAT function:
select
  group_concat(`value`
    order by
      case `key`
        when 'majorVersion' then 0
        when 'minorVersion' then 1
        else 2 end SEPARATOR '.') SoftwareVersion
from tblVersion
See SQL Fiddle with Demo
If DhruvPathak is on the right lines in regards to what you're actually after, then that can be achieved this way...
SELECT GROUP_CONCAT(x.value ORDER BY FIELD(x.key,'minorversion','majorversion','buildversion') DESC SEPARATOR '.') softwareVersion
FROM tblversion x
WHERE x.key IN('minorVersion','majorVersion','buildVersion');
