Extract only characters from a string - mysql

I have a column with values like:
lut00006300.txt
sand2a0000300.raw
I need to extract only the character data from the column values above. I tried the query below and was able to get the first three characters.
select filesize,
substring(Filename FROM 1 FOR 3) AS Instrument from Collection;
Is there any approach to extract only the characters from the column value, leaving out the extension?
The results should be:
LUT
SAND2A

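I think the query below will help you. Note that it only filters for rows whose Filename contains an alphabetic character; it does not extract anything.
select filesize,Filename from Collection where Filename REGEXP '[[:alpha:]]';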
Refer:- http://dev.mysql.com/doc/refman/5.1/en/regexp.html

SELECT
filesize,
UPPER(SUBSTRING_INDEX(SUBSTRING_INDEX(Filename, '.', 1), '0', 1)) AS Instrument
FROM Collection;
This is a dirty solution: it cuts the name at the first '0' so that the 2 in SAND2A is kept, which means it breaks as soon as the prefix itself contains a 0.
Read more about the functions here.
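For comparison, the same two-step split can be sketched in a host language. This is a minimal illustration, not part of the original answer; the function name instrument_code is hypothetical, and it shares the query's assumption that the prefix never contains a 0.

```python
def instrument_code(filename):
    """Mirror UPPER(SUBSTRING_INDEX(SUBSTRING_INDEX(Filename, '.', 1), '0', 1))."""
    stem = filename.split(".", 1)[0]   # drop the extension: text before the first '.'
    prefix = stem.split("0", 1)[0]     # keep the text before the first '0'
    return prefix.upper()


for name in ("lut00006300.txt", "sand2a0000300.raw"):
    print(instrument_code(name))
```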

Related

Mysql: extract a string from field between delimiters (backwards)

I have a column 'ACCOUNT_NUMBER' in a table 'BankingActivity' which contains data as follows:
example:
ManualBanking-BankDeposit-350-1006590343--INTERNAL_A
or
MyPayCard-MyPayDeposit-620-989228234--TL
I need to extract the number '1006590343' or '989228234'
Initially I executed the following query:
select substr( `BankingActivity`.`ACCOUNT_NUMBER`,(
locate( '--', `BankingActivity`.`ACCOUNT_NUMBER` ) - 9 ),9 ) * 1
from BankingActivity
This works fine as long as the number is no longer than 9 digits. Over 9 digits, I obviously have issues and cannot get the full string.
How can I look backwards for the delimiter '--' and then extract the value between the '--' delimiter and the previous '-' delimiter?
I tried some regex, but I am not familiar enough with it to get a correct result.
Try this (REGEXP_SUBSTR requires MySQL 8.0 or later):
SELECT regexp_substr(
regexp_substr(acct, '-\\d+--'), '\\d+')
FROM (
SELECT 'ManualBanking-BankDeposit-350-1006590343--INTERNAL_A' as acct
UNION
SELECT 'MyPayCard-MyPayDeposit-620-989228234--TL'
) accounts;
The inner regexp_substr extracts a substring that begins with a dash followed by one or more digits and ends with two dashes, e.g. '-1006590343--'. From this, the outer regexp_substr extracts the consecutive digits, that is '1006590343'.
More detailed information about regular expressions in MySQL can be found in the documentation.
If I have understood your question correctly then you can try something like this -
select SUBSTRING_INDEX(SUBSTRING_INDEX('ManualBanking-BankDeposit-350-1006590343--INTERNAL_A', '-' ,-3), '--', 1);
select SUBSTRING_INDEX(SUBSTRING_INDEX('MyPayCard-MyPayDeposit-620-989228234--TL', '-' ,-3), '--', 1);
This is probably a job for SUBSTRING_INDEX().
Check it out. Fiddle here.
SET @s = 'ManualBanking-BankDeposit-350-1006590343--INTERNAL_A';
SELECT SUBSTRING_INDEX(@s, '-', -3);
This splits your string on '-'. It takes everything after the third '-' delimiter from the end, and gives you back 1006590343--INTERNAL_A.
Then we use SUBSTRING_INDEX() again on that.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(@s, '-', -3), '-', 1);
Lo and behold, this gets us 1006590343.
But. This is a brittle way to do it. MySQL's string processing isn't easy to program in detailed ways. This solution doesn't take into account things like missing dashes at the end of the string. Garbage in, garbage out. Use a host language like C# / php / nodejs / Java etc to do this kind of string analysis if you want it to be super-robust for real world data.
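As a rough sketch of what the host-language route could look like, here is a Python version. The helper name extract_account_number is hypothetical, and the pattern assumes the number is always a run of digits directly followed by '--'. Values that do not have that shape return None instead of a misleading substring.

```python
import re

# Digits that sit between a single '-' and the '--' delimiter.
ACCOUNT_RE = re.compile(r"-(\d+)--")


def extract_account_number(value):
    """Return the digits before '--', or None when the value has no such part."""
    match = ACCOUNT_RE.search(value)
    return match.group(1) if match else None


samples = [
    "ManualBanking-BankDeposit-350-1006590343--INTERNAL_A",
    "MyPayCard-MyPayDeposit-620-989228234--TL",
    "Broken-Value-Without-Delimiter",
]
for sample in samples:
    print(sample, "->", extract_account_number(sample))
```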

Finding exact value in mysql

I'm trying to work out how to find an exact value in a string.
The problem is that when searching column StringB for the value 1, it finds all rows containing 1. The idea is that if I look for the value 1 in StringB, it should only find rows where the value matches exactly.
Using LIKE is not a good option since it returns all rows that contain 1; using = is not an option either, since it compares against the whole value.
I also tried INSTR, but it works almost the same as LIKE.
Same with LOCATE.
These are the formats currently stored:
number (example: "2" without "")
number. (example: "2." without "")
number.number (example: "2.23.52.12.35" without "")
And they don't change.
This column only stores numbers: no letters or other kinds of string, ONLY numbers (integer type).
Is there any way to search strictly for a value?
My database is InnoDB. Thank you for your time.
Try using REGEXP:
SELECT *
FROM yourTable
WHERE CONCAT('.', StringB, '.') REGEXP CONCAT('[.]', '2', '[.]');
Demo
We could also use LIKE instead of REGEXP:
SELECT *
FROM yourTable
WHERE CONCAT('.', StringB, '.') LIKE CONCAT('%.', '2', '.%');
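The idea behind both queries is to wrap the stored value and the search value in the delimiter, so that only whole dot-separated segments can match. A minimal Python sketch of that same trick follows; the function name contains_segment is hypothetical.

```python
def contains_segment(stored, wanted):
    """True when `wanted` is one complete dot-separated segment of `stored`."""
    # Same idea as CONCAT('.', StringB, '.') LIKE CONCAT('%.', wanted, '.%')
    return f".{wanted}." in f".{stored}."


for stored in ("2", "2.", "2.23.52.12.35", "12.35"):
    print(repr(stored), contains_segment(stored, "2"))
```

The first three values contain 2 as a full segment; "12.35" does not, because the 2 there is only part of the segment 12.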
If you do:
where stringB = 1
Then MySQL has to figure out what types to use. By the rules of SQL, it will convert '1.00' to a number -- and they match.
If you do
where stringB = '1'
Then the types do what you intend. And the values are compared as strings.
More: Keep the types consistent. Don't ever depend on implicit conversion.

How to adjust phone number to same format?

I want to extract data from my database, and when I extract it I want all the h/p numbers to be in the same format.
sample:
+60161234567
016-1234567
0161234567
To:
+6016-1234567
The following simple query will give you the expected output.
SELECT
(CASE
WHEN contact_number RLIKE "^[+]([0-9]){10}" THEN CONCAT(SUBSTRING(contact_number,1,5), "-", SUBSTRING(contact_number,6,7))
WHEN contact_number RLIKE "^([0-9]){3}[-]([0-9]){7}" THEN CONCAT("+6",contact_number)
WHEN contact_number RLIKE "^([0-9]){10}" THEN CONCAT("+6",SUBSTRING(contact_number,1,3), "-", SUBSTRING(contact_number,4,7))
ELSE 'Number is not in correct format'
END) AS 'Phone Number'
FROM table_name;
We have to use regular expressions to check the number format. I wrote the query based on the number formats that you have given. If other formats exist, add a corresponding regular expression for each.
In the query, the first case handles numbers like "+60161234567", the second case handles numbers like "016-1234567", and the last case handles numbers like "0161234567".
I have attached my SQLFiddle to this solution. You can check it. Thank you!
You can use the CONCAT and SUBSTR functions to change the format of the phone number column while extracting the data, as below. This only handles values already stored as +60161234567:
SELECT CONCAT(SUBSTR( contact_number, 1, 5 ) , "-", SUBSTR( contact_number, 6 ) ) AS phone_no FROM test_table;
Try this:
SELECT CONCAT("+6",SUBSTRING(REPLACE(REPLACE(phone, '+6', ''), '-', ''),1,3), "-", SUBSTRING(REPLACE(REPLACE(phone, '+6', ''), '-', ''),4,7)) FROM `phone`
It seems that an h/p number is a phone number?
This isn't a number in the data type sense, so it shouldn't be stored as one. Numbers can have math applied to them and represent some finite value. Varchar or char is probably the appropriate data type here. When inserting the data, standardize the way you record it and fix it in your application before inserting. In the US, phone numbers can be shown as (555) 123-4567 or 555-123-4567, so you could just pick one format. What I would do is store it as 5551234567 and then format it in the application when displaying it to the user (that way, the user can prefer one format or the other). At a minimum, though, you should standardize the format when inserting the string so that you don't have to deal with a variety of formats later.
For data that already exists, I think your best bet is to fix the whole database. I would use some programming or scripting language that's good at manipulating data to fix it in the database (personally I think Python is a good choice).
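A minimal sketch of such a cleanup function in Python is shown below. The name normalize_phone is hypothetical, and the rules are assumptions taken only from the three sample formats in the question (an optional +6 prefix followed by a 10-digit number starting with 0). Anything else returns None so it can be reviewed by hand.

```python
import re


def normalize_phone(raw):
    """Return the number as +6xxx-xxxxxxx, or None if it is not a known format."""
    digits = re.sub(r"\D", "", raw)          # drop '+', '-', spaces, etc.
    if len(digits) == 11 and digits.startswith("6"):
        digits = digits[1:]                  # strip the leading country-code digit
    if len(digits) != 10 or not digits.startswith("0"):
        return None                          # unknown format, leave for manual review
    return f"+6{digits[:3]}-{digits[3:]}"


for raw in ("+60161234567", "016-1234567", "0161234567", "12345"):
    print(raw, "->", normalize_phone(raw))
```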

MySQL substring between two strings

I need a hand to solve a problem with my column field.
I need to extract the string in between these two different "patterns" of strings for example:
[...string] contract= 1234567890123350566076070666 issued= [string
...]
I want to extract the string in between 'contract=' and 'issued='
At the present moment I'm using
SELECT substring(substring_index(licence_key,'contract=',-1),1,40) FROM table
The problem is that the string in between is not always 40 characters long, so it has no fixed length, and neither does the data that comes before and after it. The data is variable.
Do you know how I can handle that?
Just use substring_index() twice:
SELECT substring_index(substring_index(licence_key, 'contract=', -1),
'issued=', 1)
FROM table;
If the value does not contain the delimiters, substring_index() returns the whole value.
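The behaviour of the nested substring_index() call, including the fallback when a marker is missing, can be sketched in Python like this. The function name between is hypothetical. Since the sample value has spaces around the contract number, the caller would probably want to strip the result (TRIM() in MySQL).

```python
def between(text, start, end):
    """Mimic substring_index(substring_index(text, start, -1), end, 1)."""
    tail = text.rsplit(start, 1)[-1]   # everything after the last `start` marker
    return tail.split(end, 1)[0]       # everything before the first `end` marker


value = "foo contract= 1234567890123350566076070666 issued= bar"
print(between(value, "contract=", "issued=").strip())
print(between("no markers here", "contract=", "issued="))
```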
If you want to remove the matched text instead, you can use REPLACE like this:
UPDATE questions set question= REPLACE(question, '<xml></xml>', '') WHERE question like '%<xml>%';
UPDATE questions set question= REPLACE(question, substring_index(substring_index(question, '<xml>', -1), '</xml>', 1), '') WHERE question like '%<xml>%';

MySQL sort by name

Is it possible to sort a column alphabetically while ignoring certain words, e.g. 'The'?
e.g.
A normal query would return
string 1
string 3
string 4
the string 2
I would like to return
string 1
the string 2
string 3
string 4
Is this possible?
EDIT
Please note I am looking to replace multiple words like The, A, etc... Can this be done?
You can try
SELECT id, text FROM table ORDER BY TRIM(REPLACE(LOWER(text), 'the ', ''))
but note that it will be very slow for large datasets as it has to recompute the new string for every row.
IMO you're better off with a separate column with an index on it.
For multiple stopwords just keep nesting REPLACE calls. :)
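As an example, this removes "The " before sorting. Note that REPLACE is case-sensitive and removes every occurrence of "The ", not only a leading one.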
SELECT *
FROM YourTable
ORDER BY REPLACE(Val,'The ', '')
Yes, it should be possible to use expressions in the ORDER BY clause:
SELECT * FROM yourTable ORDER BY REPLACE(yourField, "the ", "")
I have a music listing of well over 75,000 records and ran into a similar situation. I wrote a PHP script that checked for all strings that began with 'A ', 'An ' or 'The ' and cut that part off the string. I also converted all uppercase letters to lowercase and stored the resulting string in a new column. After setting an index on that column, I was done.
Obviously you display the initial column but sort by the newly-created indexed column. I get results in a second or so now.
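The script itself was not posted, so the following is only a sketch of how such a sort key could be computed, written in Python rather than PHP. The name sort_key and the list of articles are assumptions. The result is what would be stored in the extra indexed column.

```python
import re

# Leading articles to ignore; extend the alternation for more stopwords.
LEADING_ARTICLE = re.compile(r"^(?:the|an|a)\s+", re.IGNORECASE)


def sort_key(title):
    """Lowercase the title and drop one leading article, if present."""
    return LEADING_ARTICLE.sub("", title.strip()).lower()


titles = ["string 1", "string 3", "string 4", "the string 2"]
print(sorted(titles, key=sort_key))
```

Only a leading article is removed, so a title such as "Another Day" or a "the" in the middle of a title is left alone.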