How to split everything after - (dash) using MySQL

I need to split data within a cell separated by - (dash) and put into separate columns. The problem I am having is there may be more than one -.
So using the table below with the original data coming from sic_orig, I need to put everything before the first - in sic_num and everything after the first - in sic_desc. I'm sure this is really easy, but I can't seem to find anything clear on this.
This is what my table should look like with sic_orig being the source and sic_num and sic_desc being data pulled from sic_orig:
sic_orig                          | sic_num | sic_desc
----------------------------------|---------|-------------------------
509406 - Jewelers-Wholesale       | 509406  | Jewelers-Wholesale
506324 - Burglar Alarm Systems    | 506324  | Burglar Alarm Systems
502317 - Picture Frames-Wholesale | 502317  | Picture Frames-Wholesale
This code works, but only when there are exactly two dashes, and some cells may have 1, 2 or 3 dashes:
UPDATE test_tbl_1
SET sic_num = SUBSTRING_INDEX(`sic_orig`, '-', 1),
sic_desc = SUBSTRING_INDEX(`sic_orig`, '-', -2);
How do I split everything before first - and everything after first -?

One method is to take the length of the first part and use it as the starting offset for SUBSTR(), skipping the dash itself:
UPDATE test_tbl_1
   SET sic_num = TRIM(SUBSTRING_INDEX(`sic_orig`, '-', 1)),
       sic_desc = TRIM(SUBSTR(`sic_orig`, CHAR_LENGTH(SUBSTRING_INDEX(`sic_orig`, '-', 1)) + 2));

You can use a combination of the SUBSTR() and LOCATE() functions to slice the string:
UPDATE test_tbl_1
   SET sic_num = TRIM(SUBSTR(sic_orig, 1, LOCATE('-', sic_orig) - 1)),
       sic_desc = TRIM(SUBSTR(sic_orig, LOCATE('-', sic_orig) + 1));
See the MySQL documentation on string functions for details.
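To make the "split at the first dash" logic easy to check outside the database, here is a minimal Python sketch of the same idea. It is only a model of the SUBSTR() + LOCATE() approach, not MySQL code; the function name split_first_dash is hypothetical, and keeping the whole value as the number part when there is no dash is an assumption.

```python
def split_first_dash(value):
    """Split at the first '-' and trim both halves (model of SUBSTR + LOCATE + TRIM)."""
    before, dash, after = value.partition("-")
    if not dash:
        # Assumption: with no dash at all, treat the whole value as the number part.
        return value.strip(), ""
    return before.strip(), after.strip()


rows = [
    "509406 - Jewelers-Wholesale",
    "506324 - Burglar Alarm Systems",
    "502317 - Picture Frames-Wholesale",
]

for row in rows:
    sic_num, sic_desc = split_first_dash(row)
    print(sic_num, "|", sic_desc)
```

Only the first dash acts as the separator, so any later dashes stay inside the description.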

Another alternative is to get a count of the dashes in the string. We can get a count of the number of dash characters by doing a replacement of all dash characters with an empty string, and then subtracting the length from the length of the original string.
As a demonstration:
SELECT `sic_orig`
     , CHAR_LENGTH(`sic_orig`)-CHAR_LENGTH(REPLACE(`sic_orig`,'-','')) AS cnt_dashes
  FROM ( SELECT '509406 - Jewelers-Wholesale ' AS sic_orig
         UNION ALL SELECT '506324 - Burglar Alarm Systems'
         UNION ALL SELECT '502317 - Picture Frames-Wholesale'
         UNION ALL SELECT 'lots-of - -dashes- --everywhere-- --'
         UNION ALL SELECT ' zero dashes '
       ) t
returns:
sic_orig                              cnt_dashes
------------------------------------- ----------
509406 - Jewelers-Wholesale                    2
506324 - Burglar Alarm Systems                 1
502317 - Picture Frames-Wholesale              2
lots-of - -dashes- --everywhere-- --          10
zero dashes                                    0
We can use the expression that returns the count of dashes as the third argument of SUBSTRING_INDEX, multiplying it by negative 1 to get a negative value:
SELECT `sic_orig`
     , TRIM(
         SUBSTRING_INDEX(`sic_orig`,'-'
           , 1
         )
       ) AS before_first_dash
     , TRIM(
         SUBSTRING_INDEX(`sic_orig`,'-'
           , -1*(CHAR_LENGTH(`sic_orig`)-CHAR_LENGTH(REPLACE(`sic_orig`,'-','')))
         )
       ) AS after_first_dash
  FROM ( SELECT '509406 - Jewelers-Wholesale ' AS sic_orig
         UNION ALL SELECT '506324 - Burglar Alarm Systems'
         UNION ALL SELECT '502317 - Picture Frames-Wholesale'
         UNION ALL SELECT 'lots-of - -dashes- - -every-where-'
         UNION ALL SELECT ' zero dashes '
       ) t
returns:
sic_orig                           before_first_dash after_first_dash
---------------------------------- ----------------- -----------------------------
509406 - Jewelers-Wholesale        509406            Jewelers-Wholesale
506324 - Burglar Alarm Systems     506324            Burglar Alarm Systems
502317 - Picture Frames-Wholesale  502317            Picture Frames-Wholesale
lots-of - -dashes- - -every-where- lots              of - -dashes- - -every-where-
zero dashes                        zero dashes
The extra line breaks and formatting are intended to make deciphering the expressions easier, making sure parens balance, etc.
I always test my expressions with a SELECT statement first, before I put those expressions into an UPDATE statement.
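The dash-counting trick can also be checked with a rough Python model. The helper substring_index below is a hypothetical stand-in that imitates how SUBSTRING_INDEX behaves for a non-empty delimiter (positive count from the left, negative count from the right, zero gives an empty string); it is a sketch for testing the reasoning, not MySQL itself.

```python
def substring_index(s, delim, count):
    """Rough model of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything left of the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything right of the count-th delimiter from the end
    return ""                              # a count of 0 yields an empty string


def split_by_dash_count(s):
    """Use the number of dashes as a negative count to keep everything after the first dash."""
    dashes = len(s) - len(s.replace("-", ""))
    before = substring_index(s, "-", 1).strip(" ")
    after = substring_index(s, "-", -dashes).strip(" ")
    return before, after


for value in ["509406 - Jewelers-Wholesale ",
              "lots-of - -dashes- - -every-where-",
              " zero dashes "]:
    print(split_by_dash_count(value))
```

A value without any dash produces a count of 0, which is why the second column comes back empty for that row.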

Related

How to truncate double precision value in PostgreSQL by keeping exactly first two decimals?

I'm trying to truncate a double precision value while building JSON with the json_build_object() function in PostgreSQL 11.8, but with no luck. To be more precise, I'm trying to truncate the number 19.9899999999999984 to ONLY two decimals, making sure it is NOT rounded to 20.00 (which is what happens now) but kept at 19.98.
BTW, what I've tried so far was to use:
1) TRUNC(found_book.price::numeric, 2) and I get value 20.00
2) ROUND(found_book.price::numeric, 2) and I get value 19.99 -> so far this is the closest value but not what I need
3) ROUND(found_book.price::double precision, 2) and I get
[42883] ERROR: function round(double precision, integer) does not exist
Also here is whole code I'm using:
create or replace function public.get_book_by_book_id8(b_id bigint) returns json as
$BODY$
declare
    found_book book;
    book_authors json;
    book_categories json;
    book_price double precision;
begin
    -- Load book data:
    select * into found_book
    from book b2
    where b2.book_id = b_id;
    -- Get assigned authors
    select case when count(x) = 0 then '[]' else json_agg(x) end into book_authors
    from (select aut.*
          from book b
          inner join author_book as ab on b.book_id = ab.book_id
          inner join author as aut on ab.author_id = aut.author_id
          where b.book_id = b_id) x;
    -- Get assigned categories
    select case when count(y) = 0 then '[]' else json_agg(y) end into book_categories
    from (select cat.*
          from book b
          inner join category_book as cb on b.book_id = cb.book_id
          inner join category as cat on cb.category_id = cat.category_id
          where b.book_id = b_id) y;
    book_price = trunc(found_book.price, 2);
    -- Build the JSON response:
    return (select json_build_object(
        'book_id', found_book.book_id,
        'title', found_book.title,
        'price', book_price,
        'amount', found_book.amount,
        'is_deleted', found_book.is_deleted,
        'authors', book_authors,
        'categories', book_categories
    ));
end
$BODY$
language 'plpgsql';
select get_book_by_book_id8(186);
How do I keep EXACTLY the FIRST two decimal digits, i.e. 19.98 (any suggestion/help is greatly appreciated)?
P.S. PostgreSQL version is 11.8
In PostgreSQL 11.8 or 12.3 I cannot reproduce:
# select trunc('19.9899999999999984'::numeric, 2);
trunc
-------
19.98
(1 row)
# select trunc(19.9899999999999984::numeric, 2);
trunc
-------
19.98
(1 row)
# select trunc(19.9899999999999984, 2);
trunc
-------
19.98
(1 row)
Actually I can reproduce with the right type and a special setting:
# set extra_float_digits=0;
SET
# select trunc(19.9899999999999984::double precision::text::numeric, 2);
trunc
-------
19.99
(1 row)
And a possible solution:
# show extra_float_digits;
extra_float_digits
--------------------
3
(1 row)
# select trunc(19.9899999999999984::double precision::text::numeric, 2);
trunc
-------
19.98
(1 row)
But note that:
Note: The extra_float_digits setting controls the number of extra
significant digits included when a floating point value is converted
to text for output. With the default value of 0, the output is the
same on every platform supported by PostgreSQL. Increasing it will
produce output that more accurately represents the stored value, but
may be unportable.
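The effect of the number of output digits can be reproduced outside PostgreSQL. The following Python sketch is only a model: it assumes that converting the double to text with 15 significant digits corresponds to extra_float_digits=0 and 17 significant digits to extra_float_digits=3 on PostgreSQL 11, and the helper name trunc2 is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def trunc2(text):
    """Truncate a decimal string to two places without rounding (like trunc(x::numeric, 2))."""
    return Decimal(text).quantize(Decimal("0.01"), rounding=ROUND_DOWN)


# The double precision value from the question.
price = float("19.9899999999999984")

short_text = format(price, ".15g")  # assumed to model extra_float_digits = 0
long_text = format(price, ".17g")   # assumed to model extra_float_digits = 3

print(short_text, "->", trunc2(short_text))
print(long_text, "->", trunc2(long_text))
```

With 15 digits the text form is already rounded up to 19.99 before the truncation happens, while 17 digits preserve enough of the stored value for the truncation to yield 19.98.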
As @pifor suggested, I've managed to get it done by directly passing trunc(found_book.price::double precision::text::numeric, 2) as the value in json_build_object, like this:
json_build_object(
'book_id', found_book.book_id,
'title', found_book.title,
'price', trunc(found_book.price::double precision::text::numeric, 2),
'amount', found_book.amount,
'is_deleted', found_book.is_deleted,
'authors', book_authors,
'categories', book_categories
)
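Using book_price = trunc(found_book.price::double precision::text::numeric, 2); and passing book_price as the value for the 'price' key didn't work, presumably because book_price is declared as double precision, so the truncated numeric is converted back to a floating-point value.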
Thank you for your help. :)

MySQL Question - I want to eliminate all text within any parenthesis

Just checking to see if any of you would have a solution for this – from the below text like this I want to eliminate all text within any parenthesis.
Input –
PAY - addition,FILES (aaaaaaaaaaaaaa/bbbbbbbbbbbs i.e. ssss,ffff – i.e. cccccc),DED (ppppppp, llllll, fffff gggg),LOSS (ddddd, hhhhhh – i.e.),F TO G ( “F” is switching to “G”)
Output –
PAY - addition,FILES,DED,LOSS,F TO G
If you are running MySQL 8.0, you can do this with regexp_replace():
regexp_replace(mytext, '\\([^)]*\\)', '')
This works as long as there are no nested parentheses in the expression (which is consistent with your sample data).
Demo on DB Fiddle:
select regexp_replace(
'PAY - addition,FILES (aaaaaaaaaaaaaa/bbbbbbbbbbbs i.e. ssss,ffff – i.e. cccccc),DED (ppppppp, llllll, fffff gggg),LOSS (ddddd, hhhhhh – i.e.),F TO G ( “F” is switching to “G”)',
'\\([^)]*\\)',
''
) val
| val |
| :--------------------------------------- |
| PAY - addition,FILES ,DED ,LOSS ,F TO G |
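The same pattern can be tried in Python's re module, which makes it easy to see why a space is left before each comma: the space in front of every opening parenthesis is not part of the match. The second pattern below, which also consumes that leading whitespace, is an assumption about what the asker wants rather than part of the original answer; in MySQL it would presumably be written as ' *\\([^)]*\\)'.

```python
import re

text = ("PAY - addition,FILES (aaaaaaaaaaaaaa/bbbbbbbbbbbs i.e. ssss,ffff – i.e. cccccc),"
        "DED (ppppppp, llllll, fffff gggg),LOSS (ddddd, hhhhhh – i.e.),"
        "F TO G ( “F” is switching to “G”)")

# Same pattern as the regexp_replace() call: drop each (...) group, no nesting.
plain = re.sub(r"\([^)]*\)", "", text)

# Variant (assumption): also drop the whitespace right before each group.
tight = re.sub(r"\s*\([^)]*\)", "", text)

print(repr(plain))
print(repr(tight))
```

The variant yields exactly the output requested in the question, without the stray spaces.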
Another one for MySQL 8.0:
SET @input:="PAY - addition,FILES (aaaaaaaaaaaaaa/bbbbbbbbbbbs i.e. ssss,ffff – i.e. cccccc),DED (ppppppp, llllll, fffff gggg),LOSS (ddddd, hhhhhh – i.e.),F TO G ( “F” is switching to “G”)";
with recursive cte as (
    select
        0 i,
        @input as text
    union all
    select
        i+1,
        CASE WHEN instr(text,'(') >0 AND instr(text,')')>instr(text,'(') THEN REPLACE(text, substring(text,instr(text,'('),instr(text,')')-instr(text,'(')+1), '') ELSE '' END
    from cte
    where i<10
) select text from cte where text<>'' order by i desc limit 1;
output:
+------------------------------------------+
| text |
+------------------------------------------+
| PAY - addition,FILES ,DED ,LOSS ,F TO G |
+------------------------------------------+
1 row in set (0.00 sec)

Order by for column in varchar type

I have the following column, Strand, which is sorted in ascending order, but it puts 3.10 right after the 3.1 entries instead of after 3.9.
The column is of type varchar.
Strand
3.1
3.1.1
3.1.1.1
3.1.1.2
3.1.2
3.1.2.1
3.10 # wrong
3.10.1 # wrong
3.10.1.1 # wrong
3.2 <- this should have been after 3.1.2.1
3.2.1
3.2.1.1
..
3.9
3.9.1.1
<- here is where 3.10 , 3.10.1 and 3.10.1.1 should reside
I used the following query to order it:
SELECT * FROM [table1]
ORDER BY RPAD(Strand,4,'.0') ;
How can I make sure it is ordered correctly, so that 3.10, 3.10.1 and 3.10.1.1 come last?
Try this:
DROP TABLE IF EXISTS T1;
CREATE TABLE T1 (Strand VARCHAR(20));
INSERT INTO T1 VALUES ('3.1');
INSERT INTO T1 VALUES('3.1.1');
INSERT INTO T1 VALUES('3.1.1.1');
INSERT INTO T1 VALUES('3.1.1.2');
INSERT INTO T1 VALUES('3.2');
INSERT INTO T1 VALUES('3.2.1');
INSERT INTO T1 VALUES('3.10');
INSERT INTO T1 VALUES('3.10.1');
-- Plain string ordering: 3.10 sorts before 3.2
SELECT * FROM T1
ORDER BY Strand;
-- Numeric ordering: pad every value to four levels, then weight each level.
-- The weights assume that every level is below 100.
SELECT *
FROM T1
ORDER BY
    CAST(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',1) AS UNSIGNED INTEGER) * 1000000 +
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',2),'.',-1) AS UNSIGNED INTEGER) * 10000 +
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',3),'.',-1) AS UNSIGNED INTEGER) * 100 +
    CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(Strand,'.0.0.0.0'),'.',4),'.',-1) AS UNSIGNED INTEGER);
Output, not ordered:
Strand
1 3.1
2 3.1.1
3 3.1.1.1
4 3.1.1.2
5 3.10
6 3.10.1
7 3.2
8 3.2.1
Output Ordered:
Strand
1 3.1
2 3.1.1
3 3.1.1.1
4 3.1.1.2
5 3.2
6 3.2.1
7 3.10
8 3.10.1
You can order the result based on the integer value of your field. Your code will look like this:
select [myfield] from [mytable] order by
convert(RPAD(replace([myfield],'.',''),4,0),UNSIGNED INTEGER);
In this code the replace function removes the dots (.).
Note that this only orders correctly while every level is a single digit: 3.1 and 3.10 both become 3100.
Hope this helps.
You must normalize each group of digits:
SELECT * FROM [table1]
ORDER BY CONCAT(
    LPAD(SUBSTRING_INDEX(Strand,'.',1),3,'0'), '-',
    LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',2),'.',-1),3,'0'), '-',
    LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',3),'.',-1),3,'0'), '-',
    LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX(Strand,'.',4),'.',-1),3,'0'));
Sample:
mysql> SELECT CONCAT(
-> LPAD(SUBSTRING_INDEX('3.10.1.1','.',1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',2),'.',-1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',3),'.',-1),3,'0'), '-',
-> LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',4),'.',-1),3,'0'));
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| CONCAT(
LPAD(SUBSTRING_INDEX('3.10.1.1','.',1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',2),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRING_INDEX('3.10.1.1','.',3),'.',-1),3,'0'), '-',
LPAD(SUBSTRING_INDEX(SUBSTRI |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 003-010-001-001 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0,00 sec)
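The normalization idea can be sketched in Python to see how the padded keys compare. This is an illustration only: the function name padded_key and its levels/width parameters are hypothetical, and treating missing levels as 0 (so that 3.2 sorts before 3.2.1) is an assumption made for this sketch.

```python
def padded_key(strand, levels=4, width=3):
    """Build a zero-padded sort key such as '003-010-001-001' from '3.10.1.1'."""
    parts = strand.split(".")
    # Assumption: levels that are missing count as 0.
    parts += ["0"] * (levels - len(parts))
    return "-".join(part.zfill(width) for part in parts[:levels])


strands = ["3.10.1.1", "3.2", "3.1", "3.10", "3.2.1", "3.9.1.1", "3.1.2.1"]
ordered = sorted(strands, key=padded_key)

print(padded_key("3.10.1.1"))
print(ordered)
```

Because every group has the same width, plain string comparison of the keys agrees with numeric comparison of the levels.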
Because the "strand" column is text data, it is ordered alphabetically. To get the order you want, you should format your data before inserting or updating it. Supposing the maximum number of digits for each level is 3, your data should be formatted like this:
003.001
003.001.001
003.001.001.001
003.002
003.002.001
003.002.001.001
003.010
010.001
The alternative way is to split the "strand" column into multiple columns. Each column stores the data for one level, such as:
Level1 | Level2 | Level3 ...
3 | 1 | 0
3 | 1 | 1
3 | 2 | 0
...
3 | 10 | 0
These columns should have a numeric data type, so that you can simply order by them.
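As a small illustration of why numeric levels sort correctly, here is a Python sketch that splits each value into a tuple of integers and sorts by that tuple. The function name levels is hypothetical, and the sample values are taken from the question.

```python
def levels(strand):
    """Turn '3.10.1' into (3, 10, 1) so that each level compares as a number."""
    return tuple(int(part) for part in strand.split("."))


strands = ["3.10", "3.2", "3.1", "3.10.1", "3.9", "3.1.1", "3.2.1"]
ordered = sorted(strands, key=levels)

print(ordered)
```

Comparing level by level as numbers puts 10 after 9, and a shorter value such as 3.1 sorts before its children such as 3.1.1.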
If there are no more than 3 dots (.) in your data, you can try this:
select *
from demo
order by replace(Strand, '.', '') * pow(10, (3 + length(replace(Strand, '.', '')) - length(Strand)))
If the number of dots is not known in advance, you can use a subquery to get the maximum number of dots:
select demo.Strand
from demo
cross join (
select max(length(Strand) - length(replace(Strand, '.', ''))) as num from demo
) t
order by replace(Strand, '.', '') * pow(10, (num + length(replace(Strand, '.', '')) - length(Strand)))
See demo in Rextester.
As you can see, I've used the functions replace, length and pow in the order by clause.
1) replace(Strand, '.', '') gives us an integer, like:
replace('3.10.1.1', '.', '') => 31011;
2) (3 + length(replace(Strand, '.', '')) - length(Strand)) gives us the maximum number of dots (3) minus the number of dots in Strand, like:
3.1 => 2;
3) pow returns the value of X raised to the power of Y;
so the sample data will be calculated as:
3100
3110
3111
3112
3120
3121
31000
31010
31011
3200
3210
3211
3900
3911
Sorting by these numbers gives the right order.

MySQL Multiple IF CASE Statements

I have a temporary table "productsTmp", in which I have a column "footprintSize".
The column contains strings in the following formats:
a) 12 x 34
b) v 11
c) 12 x 34 (v 12)
I want to extract the numbers only, in order to obtain something like:
a) v1 = 12 ; v2 = 34
b) v3 = 11
c) v1 = 12 ; v2 = 34 ; v3 = 12
Note: The values are from a Rectangular Prism, v1 = width, v2 = length, v3 = height. The height always comes after the character "v" (which in my case is ø).
To extract these values, I thought of using a subquery loop, but I've only come up with the following idea:
IF footprintSize LIKE '%x%'
-- Example: 24 x 24
SELECT SUBSTRING_INDEX(footprintSize, 'x', 1) AS lval;
SELECT SUBSTRING_INDEX(footprintSize, 'x', -1) AS rval;
-- Example: 8 x 8 (Ø 10)
IF rval LIKE '%)%'
SELECT SUBSTRING_INDEX(SELECT SUBSTRING_INDEX(SELECT SUBSTRING_INDEX(footprintSize, '(', -1), ' ', -1), ')', 1) AS dval;
ELSE
-- Example: ø 11
SELECT SUBSTRING_INDEX(footprintSize, ' ', -1) AS dval;
END IF;
However, I've been told that "IF" only works in stored procedures, which is not something I'm looking for. So I tried the following:
SELECT
CASE TRUE
WHEN footprintSize LIKE "%x%" THEN (SUBSTRING_INDEX(footprintSize,'x', 1))
WHEN footprintSize LIKE "%)%" THEN (SUBSTRING_INDEX(footprintSize,')', -1))
END as "footprintSize"
FROM productsTmp
But I'm not close to achieving what I want.
In the end, I want to have something like:
footprintSize lval rval dval
24 x 24 24 24
8 x 8 (Ø 10) 8 8 10
ø 11 11
For the empty spaces, I can have null or even add a zero, but I'm more concerned of how can I split this data into three columns.
Thank you.

Parse text using substring in mysql

I want to parse a text using substring. The format we have for the text is like this:
N, Adele, A, 18
And the substring we do is like this:
SUBSTRING_INDEX(SUBSTRING_INDEX(text, ',', 2), ', ', -1) as 'Name',
SUBSTRING_INDEX(SUBSTRING_INDEX(text, ',', 4), ', ', -1) as 'Age',
The output we get is:
| Name | Age |
| Adele | 18 |
But we want to change the text format to:
N Adele, A 18
What would be the correct syntax so that I can parse the text in position 1 (N Adele), use the space as the delimiter and get just Adele? And then the same for the next text (A 18)?
I tried doing
SUBSTRING_INDEX(SUBSTRING_INDEX(text, ' ', 1), ', ', -1) as 'Name',
But the output I got is just
| Name |
| N |
The output I was hoping for is like this:
| Name |
| Adele |
Presuming here that you want to change your original data structure and still be able to get the results out. You change your data structure to:
N Adele, A 18 -- etc
With the potential to have multiple names as the name (space separated), my previous example is not correct.
You could trim off the N and A directly with their space, knowing that they will only ever be two characters long and that they will always be there, like this:
SUBSTRING(TRIM(SUBSTRING_INDEX(`text`, ',', 1)), 3) AS 'Name',
SUBSTRING(TRIM(SUBSTRING_INDEX(`text`, ',', -1)), 3) AS 'Age'
To get:
Name | Age
--------------------
Adele | 18
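The prefix-trimming logic can be checked with a short Python sketch. It is only a model of the two SQL expressions above, under the same assumption that each field starts with a one-letter descriptor followed by a space; the function name parse_record is hypothetical.

```python
def parse_record(text):
    """Model of the SQL above: take the first and last comma-separated fields,
    trim them, and drop the two-character descriptor prefix ('N ' or 'A ')."""
    fields = text.split(",")
    name = fields[0].strip()[2:]   # like SUBSTRING(TRIM(SUBSTRING_INDEX(text, ',', 1)), 3)
    age = fields[-1].strip()[2:]   # like SUBSTRING(TRIM(SUBSTRING_INDEX(text, ',', -1)), 3)
    return name, age


print(parse_record("N Adele, A 18"))
print(parse_record("N Mary Ann, A 7"))
```

Dropping a fixed two-character prefix, rather than splitting on spaces, is what keeps names that contain spaces intact.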
You can use
SELECT
    SUBSTRING(text, 3, INSTR(text, ',') - 3) AS Name,
    SUBSTRING(text, INSTR(text, ',') + 4) AS Age
FROM your_table;
as the positions of the field descriptors (N and A) are fixed (relative to the start of the string and to the comma). You can check the working query in this fiddle.