remove last character if it is not number - mysql

I would like to remove the last character of a string if it is not a number.
Input column
atg167d
atg645
msc587f
atg6692
Output column
atg167
atg645
msc587
atg6692

You can use SUBSTR(inputColumn, -1) to get the last character, and then check to see if it's between '0' and '9'. If it is, the column ends with a number. If not, use SUBSTR to remove the last character:
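SELECT
  CASE
    -- Last character is a digit: keep the value unchanged
    WHEN SUBSTR(inCol, -1) BETWEEN '0' AND '9' THEN inCol
    -- Otherwise drop the last character (CHAR_LENGTH counts characters, LENGTH counts bytes)
    ELSE SUBSTR(inCol, 1, CHAR_LENGTH(inCol) - 1)
  END AS outCol
FROM myTable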
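If it helps to sanity-check the rule outside MySQL, here is a minimal Python sketch of the same CASE logic, run against the sample values from the question. The function name strip_trailing_non_digit is made up for illustration, and the digit test is limited to ASCII '0' to '9' to mirror the BETWEEN check.

```python
def strip_trailing_non_digit(value: str) -> str:
    """Drop the last character unless it is an ASCII digit."""
    # Mirrors the WHEN branch: last character between '0' and '9' -> keep as-is
    if value and "0" <= value[-1] <= "9":
        return value
    # Mirrors the ELSE branch: keep everything except the last character
    return value[:-1]


rows = ["atg167d", "atg645", "msc587f", "atg6692"]
result = [strip_trailing_non_digit(v) for v in rows]
print(result)
```

An empty string simply stays empty in this sketch.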

Related

Mysql query replace first empty string

I have many fields that contain an extra space at the start, like:
" RTX 3060"
-^
I want to remove the first extra space from all fields of this table.
What I tried:
UPDATE table set field = concat( '', substring(field , 1)) where left(field ,1)=' ';
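It returns 0 results!
Your statement changes nothing because SUBSTRING(field, 1) starts at position 1 and so returns the whole string. Try this statement instead; it removes the first character of the field only if that character is a space: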
UPDATE `table`
SET `column` = SUBSTRING(`column`, 2)
WHERE SUBSTRING(`column`, 1, 1) = ' ';
Alternatively, try using the LTRIM function. Note that it removes all leading spaces, not just the first one. The syntax looks something like this:
UPDATE `table` SET `column` = LTRIM(`column`);
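The two answers behave differently when a value starts with more than one space. A small Python sketch of both behaviours (the function names are invented for this illustration, not MySQL functions):

```python
def remove_one_leading_space(value: str) -> str:
    # Same idea as SUBSTRING(column, 2) guarded by the WHERE clause:
    # drop exactly one character, and only if it is a space
    return value[1:] if value.startswith(" ") else value


def ltrim(value: str) -> str:
    # Same idea as LTRIM(column): strip every leading space
    return value.lstrip(" ")


samples = [" RTX 3060", "  RTX 3060", "RTX 3060"]
for s in samples:
    print(repr(s), "->", repr(remove_one_leading_space(s)), "|", repr(ltrim(s)))
```

Which one you want depends on whether a second leading space counts as data or as more noise.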

SQL: Substring until second character starting from right keeping left values

I have this filename AAAA_BBBBB_CC_HDDD_HGGG.csv and I'm trying to keep the values before the second underscore, counting from the right.
So I want to keep everything just before _HDDD_HGGG.csv
This is my code:
SET @NFileN = REVERSE(SUBSTRING(REVERSE(@source_filename),1,CHARINDEX('_',REVERSE(@source_filename), CHARINDEX('_', REVERSE (@source_filename), 0) + 1)))
And this is the returned value:
(6 rows affected)
_HDDD_HGGG.csv
Instead of being AAAA_BBBBB_CC.
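Does anyone have a clue about this?
You are taking a SUBSTRING from 1 up to your CHARINDEX while your string is reversed, which returns the end of the original filename. Instead, start the SUBSTRING just after the second underscore and use LEN as the length so that it runs to the end of the reversed string: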
REVERSE(
  SUBSTRING(
    REVERSE(@source_filename),
    CHARINDEX('_',
      REVERSE(@source_filename),
      CHARINDEX('_',
        REVERSE(@source_filename),
        0) + 1) + 1,
    LEN(@source_filename)
  )
)
p.s.: Added a second +1 to remove the "_" between CC and HDDD
p.p.s: CHARINDEX is a SQL Server function, which I assume is what you are actually using. The closest MySQL equivalent would be LOCATE (POSITION does not take a start position); the equivalent for LEN would be CHAR_LENGTH (LENGTH counts bytes).
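To see the index arithmetic without a database, here is a Python sketch of the same idea in two forms: a direct split from the right, and a step-by-step mirror of the reverse / find / substring chain. Both function names are hypothetical, and both assume the filename contains at least two underscores.

```python
def keep_before_second_last_underscore(filename: str) -> str:
    # Split at most twice from the right; the first piece is what we keep
    return filename.rsplit("_", 2)[0]


def keep_via_reverse(filename: str) -> str:
    # Mirrors the REVERSE / CHARINDEX / SUBSTRING chain step by step
    rev = filename[::-1]
    first = rev.index("_")               # first underscore in the reversed string
    second = rev.index("_", first + 1)   # second underscore in the reversed string
    # Take everything after the second underscore, then reverse it back
    return rev[second + 1:][::-1]


name = "AAAA_BBBBB_CC_HDDD_HGGG.csv"
print(keep_before_second_last_underscore(name))
print(keep_via_reverse(name))
```

Note that Python indexes from 0 while CHARINDEX is 1-based, so the offsets look slightly different from the SQL even though the steps are the same.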

Adding a single Digit to a 4-digits long number in mysql

I'm trying to write a MySQL script that turns every 4-digit number into a 5-digit one by adding a "0" at the start of each number. This is what I tried:
SELECT * FROM `customer_address_entity_text` WHERE CHAR_LENGTH(value) < 5;
SELECT CONCAT("0", CAST(value as CHAR(50)) AS value;
but it shows an error saying that no field "value" was found:
#1054 - Unknown field 'value' in field list (translated)
It would be nice if someone could help me with this.
(It also gives this error when I'm not trying to cast 'value' to a CHAR.)
tl;dr: I want 'value = "0" + value' in MySQL
example:
'value = 1234; value = "0" + value; value = 01234' and that in mysql
Two issues:
There is a missing closing parenthesis for CONCAT(
Your second SELECT has no FROM clause, so indeed there is no value field there.
So move that CONCAT expression inside the first SELECT clause and balance the parentheses:
SELECT c.*, CONCAT("0", CAST(value as CHAR(50))) AS value
FROM `customer_address_entity_text` c
WHERE CHAR_LENGTH(value) < 5;
If your purpose is to pad all values with zeroes so they get 5 digits, so that it also transforms 1 to "00001" and 12 to "00012", then use LPAD:
SELECT c.*, LPAD(value, 5, "0") AS value
FROM `customer_address_entity_text` c;
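To update the value field (LPAD shortens strings that are longer than the target length, so the WHERE clause leaves longer values alone):
UPDATE `customer_address_entity_text`
SET value = LPAD(value, 5, "0")
WHERE CHAR_LENGTH(value) < 5;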
Or, with your original concat version:
UPDATE `customer_address_entity_text`
SET value = CONCAT("0", CAST(value as CHAR(50)))
WHERE CHAR_LENGTH(value) < 5;
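The difference between the LPAD version and the CONCAT version is easier to see side by side. Below is a rough Python model of both; lpad and prepend_zero_if_short are names invented for this sketch, and lpad only approximates MySQL's LPAD for a non-empty pad string.

```python
def lpad(value: str, length: int, pad: str) -> str:
    """Rough model of MySQL LPAD for a non-empty pad string."""
    if len(value) >= length:
        # LPAD shortens values that are longer than the target length
        return value[:length]
    missing = length - len(value)
    # Repeat the pad string and cut it to the exact number of missing characters
    return (pad * missing)[:missing] + value


def prepend_zero_if_short(value: str) -> str:
    # Mirrors CONCAT("0", value) combined with WHERE CHAR_LENGTH(value) < 5
    return "0" + value if len(value) < 5 else value


for v in ["1234", "12", "1", "123456"]:
    print(v, "->", lpad(v, 5, "0"), "|", prepend_zero_if_short(v))
```

So the CONCAT version adds exactly one zero to anything shorter than 5 characters, while the LPAD version always produces 5 characters, cutting longer values down.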

substring_index skips delimiter from right

I have a table 'car_purchases' with a 'description' column. The column is a string that includes first name initial followed by full stop, space and last name.
An example of the Description column is
'Car purchased by J. Blow'
I am using 'substring_index' function to extract the letter preceding the '.' in the column string. Like so:
SELECT
Description,
SUBSTRING_INDEX(Description, '.', 1) as TrimInitial,
SUBSTRING_INDEX(
SUBSTRING_INDEX(Description, '.', 1),' ', -1) as trimmed,
length(SUBSTRING_INDEX(
SUBSTRING_INDEX(Description, '.', 1),' ', -1)) as length
from car_purchases;
I will call this query 1.
picture of the result set (Result 1) is as follows
As you can see, the problem is that the 'trimmed' column in the select statement splits at the 2nd delimiter ' ' instead of the first from the right and produces the result 'by J' instead of just 'J'. Furthermore, the length column indicates that the string length is 5 instead of 4, so WTF?
However when I perform the following select statement;
select SUBSTRING_INDEX(
SUBSTRING_INDEX('Car purchased by J. Blow', '.', 1),' ', -1); -- query 2
Result = 'J' as 'Result 2'.
As you can see from result 1 the string in column 'Description' is exactly (as far as I can tell) the same as the string from 'Result 2'. But when the substring_index is performed on the column (instead of just the string itself) the result ignores the first delimiter and selects a string from the 2nd delimiter from the right of the string.
I've racked my brains over this and have tried 'by ' and ' by' as delimiters but both options do not produce the desired result of a single character. I do not want to add further complexity to query 1 by using a trim function. I've also tried the cast function on result column 'trimmed' but still no success. I do not want to concat it either.
There is an anomaly in the 'length' column of query 1 where if I change the length function to char_length function like so:
select length(SUBSTRING_INDEX(
SUBSTRING_INDEX(Description, '.', 1),' ', -1)) as length -- result = 5
select char_length(SUBSTRING_INDEX(
SUBSTRING_INDEX(Description, '.', 1),' ', -1)) as length -- result = 4
Can anyone please explain to me why the above select statement would produce 2 different results? I think this is the reason why I am not getting my desired result.
But just to be clear my desired outcome is to get 'J' not 'by J'.
I guess I could try reverse, but I don't think this is an acceptable compromise. Also, I am not familiar with collation and charset principles; I just use the defaults.
Cheers Players!!!!
CHAR_LENGTH returns the length in characters, so a string of 4 2-byte characters returns 4. LENGTH, however, returns the length in bytes, so the same string returns 8. The discrepancy in your results (including SUBSTRING_INDEX) says that the "space" between by and J is not actually a single-byte space (ASCII 0x20) but a 2-byte character that looks like a space; a no-break space (U+00A0), for example, takes 2 bytes in UTF-8.
To work around this, you could try replacing all non-ASCII characters with spaces using CONVERT and REPLACE. In this example, I have an en-space Unicode character in the string between by and J. The CONVERT changes that to a ?, and the REPLACE then converts that to a space (along with any genuine question marks in the data):
SELECT SUBSTRING_INDEX( SUBSTRING_INDEX("Car purchased by J. Blow", '.', 1),' ', -1)
Output:
by J
With CONVERT and REPLACE:
SELECT SUBSTRING_INDEX( SUBSTRING_INDEX(REPLACE(CONVERT("Car purchased by J. Blow" USING ASCII), '?', ' '), '.', 1),' ', -1)
Output:
J
For your query, you would replace the string with your column name, i.e.:
SELECT SUBSTRING_INDEX( SUBSTRING_INDEX(REPLACE(CONVERT(description USING ASCII), '?', ' '), '.', 1),' ', -1)
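The character-versus-byte discrepancy can be reproduced outside MySQL. The Python sketch below assumes the culprit is a no-break space (U+00A0); that is only a guess that happens to match the 4 versus 5 result in the question. substring_index is a hand-written rough model of the MySQL function, not a library call, and the encode/replace line stands in for CONVERT ... USING ASCII followed by REPLACE.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Rough model of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])
    if count < 0:
        return delim.join(parts[count:])
    return ""


NBSP = "\u00a0"  # assumption: the look-alike "space" is a no-break space
description = f"Car purchased by{NBSP}J. Blow"

before_dot = substring_index(description, ".", 1)
trimmed = substring_index(before_dot, " ", -1)
print(repr(trimmed))
print(len(trimmed))                  # character count, like CHAR_LENGTH
print(len(trimmed.encode("utf-8")))  # byte count, like LENGTH

# Non-ASCII characters become '?', which is then turned into a plain space
cleaned = description.encode("ascii", errors="replace").decode("ascii").replace("?", " ")
initial = substring_index(substring_index(cleaned, ".", 1), " ", -1)
print(initial)
```

Splitting on an ordinary space does not split at the no-break space, which is why the uncleaned value keeps the "by" prefix.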

Teradata Masking - Retain all characters at position 1,4,8,12,16 .... in a string and mask remaining characters with 'X'

I have a requirement to mask, with 'X', every character of a variable-length string except those at positions 1,4,8,12,16...
For example:
Input string - 'John Doe'
Output String - 'JXXn XXe'
SPACE between the two strings must be retained.
Kindly help or reach out for more details if required.
I think maybe an external function would be best here, but if that's too much to bite off, you can get crafty with strtok_split_to_table, xml_agg and regexp_replace to rip the string apart, replace out characters using your criteria, and stitch it back together:
WITH cte AS (SELECT REGEXP_REPLACE('this is a test of this functionality', '(.)', '\1,') AS fullname FROM Sys_Calendar.calendar WHERE calendar_date = CURRENT_DATE)
SELECT
  REGEXP_REPLACE(REGEXP_REPLACE((XMLAGG(tokenout ORDER BY tokennum) (VARCHAR(200))), '(.) (.)', '\1\2') , '(.) (.)', '\1\2')
FROM
(
  SELECT
    tokennum,
    outkey,
    CASE WHEN tokennum = 1 OR tokennum mod 4 = 0 OR token = ' ' THEN token ELSE 'X' END AS tokenout
  FROM TABLE (strtok_split_to_table(cte.fullname, cte.fullname, ',')
    RETURNS (outkey VARCHAR(200), tokennum integer, token VARCHAR(200) CHARACTER SET UNICODE)) AS d
) stringshred
GROUP BY outkey
This won't be fast on a large data set, but it might suffice depending on how much data you have to process.
Breaking this down:
WITH cte AS (SELECT REGEXP_REPLACE('this is a test of this functionality', '(.)', '\1,') AS fullname FROM Sys_Calendar.calendar WHERE calendar_date = CURRENT_DATE)
This CTE is just adding a comma between every character of our incoming string using that regexp_replace function. Your name will come out like J,o,h,n, ,D,o,e. You can ignore the sys_calendar part, I just put that in so it would spit out exactly 1 record for testing.
SELECT
  tokennum,
  outkey,
  CASE WHEN tokennum = 1 OR tokennum mod 4 = 0 OR token = ' ' THEN token ELSE 'X' END AS tokenout
FROM TABLE (strtok_split_to_table(cte.fullname, cte.fullname, ',')
  RETURNS (outkey VARCHAR(200), tokennum integer, token VARCHAR(200) CHARACTER SET UNICODE)) AS d
This subquery is the important bit. Here we create a record for every character in your incoming name. strtok_split_to_table is doing the work here, splitting that incoming name by comma (which we added in the CTE).
The CASE expression applies your criteria: it keeps the character in record 1, in every record whose number is a multiple of 4, and any space, and swaps in 'X' everywhere else.
SELECT
REGEXP_REPLACE(REGEXP_REPLACE((XMLAGG(tokenout ORDER BY tokennum) (VARCHAR(200))), '(.) (.)', '\1\2') , '(.) (.)', '\1\2')
Finally we use XMLAGG to combine the many records back into one string in a single record. Because XMLAGG adds a space in between each character we have to hit it a couple of times with regexp_replace to flip those spaces back to nothing.
So... it's ugly, but it does the job.
The code above spits out:
tXXs XX X XeXX oX XhXX fXXXtXXXaXXXy
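For comparison, the same per-character rule is only a few lines in a general-purpose language, which is roughly what an external function would do. This is a Python sketch, not Teradata code; the function name mask and its parameters are made up here.

```python
def mask(text: str, keep_every: int = 4, mask_char: str = "X") -> str:
    """Keep position 1, every multiple of keep_every, and spaces; mask the rest."""
    out = []
    for pos, ch in enumerate(text, start=1):  # 1-based positions, as in the question
        keep = pos == 1 or pos % keep_every == 0 or ch == " "
        out.append(ch if keep else mask_char)
    return "".join(out)


print(mask("this is a test of this functionality"))
print(mask("John Doe"))
```

The condition inside the loop is the same test as the CASE expression in the subquery.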
I couldn't think of a solution, but then @JNevill inspired me with his idea to add a comma to each character :-)
SELECT
  RegExp_Replace(
    RegExp_Replace(
      RegExp_Replace(inputString, '(.)(.)?(.)?(.)?', '(\1(\2[\3(\4', 2)
      ,'(\([^ ])', 'X')
    ,'(\(|\[)')
  ,'this is a test of this functionality' AS inputString
tXXs XX X XeXX oX XhXX fXXXtXXXaXXXy
The 1st RegExp_Replace starts at the 2nd character (keeping the 1st character as-is) and processes groups of (up to) 4 characters, prefixing each one with either a ( (characters #1, #2 and #4 of the group, to be replaced by X unless they are a space) or a [ (character #3, no replacement), which results in:
t(h(i[s( (i(s[ (a( (t[e(s(t( [o(f( (t[h(i(s( [f(u(n(c[t(i(o(n[a(l(i(t[y(
Of course this assumes that neither of these two characters exists in your input data; otherwise you have to choose different ones.
The 2nd RegExp_Replace replaces the ( and the following character with X unless it's a space, which results in:
tXX[s( XX[ X( X[eXX( [oX( X[hXX( [fXXX[tXXX[aXXX[y(
Now there are some ( and [ characters left, which are removed by the 3rd RegExp_Replace.
As I still consider myself a beginner in regular expressions, there are probably better solutions :-)
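The three passes can be traced step by step with Python's re module. This is a port for illustration, not Teradata code, and mask_with_regex is a made-up name. It makes one deliberate deviation: the second pattern also excludes the two marker characters, because in this Python port the literal pattern produced an extra X whenever the final group held only one character (for example the input "ab").

```python
import re

GROUP = re.compile(r"(.)(.)?(.)?(.)?")


def mask_with_regex(text: str) -> str:
    if not text:
        return text
    # Pass 1: keep character 1 as-is, then tag each group of up to four
    # characters: "(" marks a maskable character, "[" marks the one to keep.
    marked = text[0] + GROUP.sub(r"(\1(\2[\3(\4", text[1:])
    # Pass 2: "(" plus a following non-space character becomes X.
    # Excluding "(" and "[" here is the deviation from the SQL pattern.
    masked = re.sub(r"\([^ (\[]", "X", marked)
    # Pass 3: drop the leftover markers.
    return re.sub(r"[(\[]", "", masked)


print(mask_with_regex("this is a test of this functionality"))
print(mask_with_regex("John Doe"))
```

As with the SQL version, this assumes the input contains neither ( nor [.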
Edit:
In older Teradata versions not all parameters were optional, so you might have to add values for them:
RegExp_Replace(
  RegExp_Replace(
    RegExp_Replace(inputString, '(.)(.)?(.)?(.)?', '(\1(\2[\3(\4', 2, 0, 'c')
    ,'(\([^ ])', 'X', 1, 0, 'c')
  ,'(\(|\[)', '', 1, 0, 'c')