So I have a column with strings that are multiples of 5 characters, such as "12345" or "abcde12345" or "asdfghjkli12345". What I'm trying to do is write a query to split each of these strings into 5 character chunks and return the distinct ones.
So with "12345" , "abcde12345" , "asdfghjkli12345"
I would get back "12345" "abcde" "asdfg" and "hjkli"
Is this possible?
MySQL does not include a function to split a delimited string. Although delimited data would normally be split into separate fields in a relational design, splitting it can be useful either during initial data load/validation or where such data is held in a text field.
The following formula can be used to extract the Nth item in a delimited list, in this case the 3rd item "ccccc" in the example comma separated list.
select replace(substring(substring_index('aaa,bbbb,ccccc', ',', 3), length(substring_index('aaa,bbbb,ccccc', ',', 3 - 1)) + 1), ',', '') ITEM3
The above formula does not need the first item to be handled as a special case and returns empty strings correctly when the item count is less than the position requested.
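To see why the formula needs no special case for the first item, here is a minimal Python sketch that mirrors it step by step. The helpers `substring_index` and `nth_item` are illustrative stand-ins written for this example (they are not MySQL or library functions), and they assume a non-empty delimiter and plain ASCII data.

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Rough stand-in for MySQL's SUBSTRING_INDEX (assumes a non-empty delim)."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])  # text before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])  # text after the count-th delimiter from the right
    return ""


def nth_item(s: str, delim: str, n: int) -> str:
    """Mirror of replace(substring(SI(s, d, n), length(SI(s, d, n - 1)) + 1), d, '')."""
    head = substring_index(s, delim, n)
    prev = substring_index(s, delim, n - 1)
    # SUBSTRING(head, LENGTH(prev) + 1) is 1-based, so it corresponds to head[len(prev):].
    return head[len(prev):].replace(delim, "")


print(nth_item("aaa,bbbb,ccccc", ",", 3))
```

For n = 1 the "previous" prefix is empty, so the whole first item is kept. When n exceeds the item count, both prefixes are the full string and nothing is left over.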
You can also create your own split function and use it; see "Split value from one field to two".
Source: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html
This sounds like it would be outside the limitations of SQL and would certainly require a programming language. It would be VERY easy in any 3GL.
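As a rough illustration of the programming-language route, here is a small Python sketch. The function name `distinct_chunks` and the idea of fetching the column values into a list first are assumptions made for this example.

```python
def distinct_chunks(values, size=5):
    """Split each string into fixed-size chunks and return the distinct
    chunks in the order they are first seen."""
    seen = {}
    for value in values:
        for start in range(0, len(value), size):
            # dict keys keep insertion order, so this doubles as an ordered set
            seen.setdefault(value[start:start + size], None)
    return list(seen)


print(distinct_chunks(["12345", "abcde12345", "asdfghjkli12345"]))
```

With the three sample strings from the question this yields "12345", "abcde", "asdfg" and "hjkli".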
I want to split the string in MySQL query using a delimiter. However, I want to split on the last occurrence of the string and get the first part of the string.
For example:-
'apple - banana - grape'
The result after splitting should be 'apple - banana'. The important thing is that we do not know how many occurrences of '-' will be there in the string.
On MySQL 8+, the REGEXP_REPLACE function works well for your requirement:
SELECT fruits, REGEXP_REPLACE(fruits, '\\s*-\\s*[^-]+$', '') AS fruits_out
FROM yourTable;
By the way, a much better table design would be to store each CSV fruit value in a separate row. This would alleviate the need to use regex to manipulate the list of fruits.
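If you want to try the pattern outside the database, a quick Python check is sketched below. It assumes Python's `re` module treats this particular pattern the same way MySQL 8's regex engine does, which should hold for such a simple expression. The helper name `drop_last_part` is made up for the example.

```python
import re

# Same pattern as in the REGEXP_REPLACE call, minus the SQL string-literal escaping.
LAST_PART = re.compile(r"\s*-\s*[^-]+$")


def drop_last_part(text: str) -> str:
    """Remove the final '-'-separated part, including the delimiter before it."""
    return LAST_PART.sub("", text)


print(drop_last_part("apple - banana - grape"))
```

Because `[^-]+$` cannot cross another '-', only the last delimiter and what follows it are removed, however many parts the string has. A string with no '-' is returned unchanged.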
I have a dataset where the values are in different formats, and I want to bring them into a single format. The values are stored as varchar.
For ex.
1st Case: 1.23.45 should be 123.45
2nd Case: 125.45 should be 125.45
The first one has two decimal points. I want to remove only the first decimal point (if there are two); otherwise the value should stay as it is.
How do I do this?
I tried using replace(Qty,'.',''), but this removes both of them.
I think this would do it (although I am not 100% sure about corner cases). It removes the first '.' only in rows that contain exactly two:
SET Qty = CONCAT(SUBSTRING(Qty, 1, LOCATE('.', Qty) - 1), SUBSTRING(Qty, LOCATE('.', Qty) + 1))
WHERE LENGTH(Qty) - LENGTH(REPLACE(Qty, '.', '')) = 2
You can use a regular expression to handle this case.
Assuming there are only two decimals in your string the below query should be able to handle the case.
select regexp_replace(value,'^(\d+)\.?(\d+\.\d+)$',concat('$1','$2')) as a
Here we are matching a regular expression pattern and capturing the following
digits before first decimal occurrence in group one
digits before and after last decimal occurrence including the last decimal in group two.
Following that we are concatenating the two captured groups.
Note that the first decimal has been made optional using ? character and hence we are able to handle both type of cases.
Even if there are more than two decimal cases, I believe a properly constructed regular expression should be able to handle it.
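The two-group idea described above can be tried quickly in Python. This is only a sketch: it uses Python's `re` module rather than any SQL dialect, the helper name `normalize_qty` is made up, and it assumes the optional first decimal point sits outside the capture groups so that the two groups are exactly the ones described.

```python
import re

# Group 1: digits before the first decimal point.
# Group 2: digits around the last decimal point, including that point.
PATTERN = re.compile(r"^(\d+)\.?(\d+\.\d+)$")


def normalize_qty(value: str) -> str:
    """Drop the first decimal point when there are two; leave other values alone."""
    return PATTERN.sub(r"\1\2", value)


for sample in ("1.23.45", "125.45"):
    print(sample, "->", normalize_qty(sample))
```

Values that do not match the pattern at all (for example, a value with no decimal point) are returned unchanged.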
I have a MySQL table setup where one column's values are a string of comma-separated True/False values (1s or 0s). For example, in the column, one field's value may be "0,1,0,0,0,0,1,1,0" and another may be "1,0,0,1,1,1,0,0,0" (note: these are NOT 9 separate columns, but a string in one column). I need to QUERY the MySQL table for elements that are "true"(1) for the "nth element" of that column's value/string.
So, if I was looking for rows, with a specific column, where the 3rd element of the column's value was 1, it would produce a list of results. So, in this case, I would only be searching for "1" in the 5th place (12345 = X,X,X...) of the string (X,X,1,X,X,X,X,X,X,X). How can I query this?
This is a crude example of what I am trying to do ...
"SELECT tfcolumn FROM mytable WHERE substr({tfcolumn}, 0, 5)=1"
{tfcolumn} represents the column value
5 represents the 5th position of the string
=1 represents what I need that position to equal to.
Please help. Thanks
You can't. Once you put a serialized data type into a column in SQL (like comma separated lists, or JSON objects) you are preventing yourself from performing any query on the data in those columns. You have to pull the data in a different way and then use a program like python, VB, etc to get the comma separated values you are looking for.
Unless you want to deal with trying to make this mess of a query work...
I would recommend changing your table structure before it's too late. Although it is possible, it is not optimized in a format that a DBMS recognizes. Because of that the DBMS will spend a significant amount of time going through every record to parse the csv values which is something that it was not meant to be doing. Doing the query in SQL will take as much time (if not more time) than just pulling all the records and searching with a tool that can do it properly.
If the column contains values exactly like the ones you posted, then the Nth element is at the 2 * N - 1 position in the comma separated list.
So do this:
SELECT tfcolumn
FROM tablename
WHERE substr(tfcolumn, 2 * 5 - 1, 1) = '1'
Replace 5 with the index that you search for.
Or remove all commas and get the Nth char:
SELECT tfcolumn
FROM tablename
WHERE substr(replace(tfcolumn, ',', ''), 5, 1) = '1'
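Both queries can be tried locally with the sketch below. It runs them against an in-memory SQLite database through Python's `sqlite3` module, purely as a stand-in for MySQL: `substr` and `replace` behave the same way for this case. The helper name `rows_with_nth_true` is made up for the example, and the two sample rows come from the question.

```python
import sqlite3


def rows_with_nth_true(values, n):
    """Return the rows whose n-th element is '1', once per query variant."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (tfcolumn TEXT)")
    conn.executemany("INSERT INTO tablename (tfcolumn) VALUES (?)",
                     [(v,) for v in values])
    # Variant 1: the n-th element sits at character position 2 * n - 1.
    by_position = conn.execute(
        "SELECT tfcolumn FROM tablename "
        "WHERE substr(tfcolumn, 2 * ? - 1, 1) = '1' ORDER BY rowid",
        (n,),
    ).fetchall()
    # Variant 2: strip the commas, then take the n-th character.
    by_stripped = conn.execute(
        "SELECT tfcolumn FROM tablename "
        "WHERE substr(replace(tfcolumn, ',', ''), ?, 1) = '1' ORDER BY rowid",
        (n,),
    ).fetchall()
    conn.close()
    return [r[0] for r in by_position], [r[0] for r in by_stripped]


print(rows_with_nth_true(["0,1,0,0,0,0,1,1,0", "1,0,0,1,1,1,0,0,0"], 5))
```

Both variants rely on every element being exactly one character long. With multi-character elements the position arithmetic no longer holds.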
Try this
if substring_index(substring_index('0,1,0,0,0,0,1,1,0',',',3),',',-1)='1'
The first argument can be your column name. The second argument (',') tells the function that the string is comma-separated. The third argument takes the first 3 elements of the string. So, the output of inner substring_index is '0,1,0'.
The outer substring_index has -1 as the last argument. So, it starts counting in the reverse direction and takes only 1 element starting from the right.
For example, if the value in a particular row is '2,682,7003,14,185', then the value of substring_index(substring_index('2,682,7003,14,185',',',3),',',-1) is '7003'.
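The two steps can be mirrored in a few lines of Python to make the behaviour easier to follow. This is a sketch only: `nth_element` is a hypothetical helper, it imitates the nested substring_index calls with `split` and `join`, and it assumes n is at least 1.

```python
def nth_element(value: str, n: int, delim: str = ",") -> str:
    """Imitate substring_index(substring_index(value, delim, n), delim, -1)."""
    # Inner call: keep the first n elements (or the whole string if there are fewer).
    head = delim.join(value.split(delim)[:n])
    # Outer call with -1: keep only the last element of that prefix.
    return head.split(delim)[-1]


print(nth_element("2,682,7003,14,185", 3))
```

One thing to watch: when the list has fewer than n elements, the inner step returns the whole string, so the result is the last element rather than an empty string.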
I have a field containing comma-separated values. I need to extract the last element in the list.
I have tried with this:
select list_field, LTRIM(RTRIM(right(list_field, len(list_field) - CHARINDEX(',',list_field))))
But it returns the last part of the list just starting after the first comma occurrence.
For example,
a,b returns b
a,b,c returns b,c
I would like to use a regex like pattern. Is it possible in TSQL (sql server 2008)?
Any other clues?
Find the last , by reversing the string and looking for the first occurrence, then read that many characters from the right of the string;
rtrim(right(list_field, charindex(',', reverse(list_field)) - 1))
(Use reverse(list_field) + ',' if there is the possibility of no delimiters in the field & you want the single value)
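The reverse-and-search trick, including the appended ',' for fields without a delimiter, can be traced in Python. This sketch only mirrors the logic of the T-SQL expression; `last_element` is a made-up name and no SQL Server is involved.

```python
def last_element(list_field: str) -> str:
    """Mirror of rtrim(right(f, charindex(',', reverse(f) + ',') - 1))."""
    reversed_field = list_field[::-1] + ","  # REVERSE(list_field) + ','
    pos = reversed_field.find(",") + 1       # CHARINDEX is 1-based
    # RIGHT(list_field, pos - 1): the last pos - 1 characters
    tail = list_field[len(list_field) - (pos - 1):]
    return tail.rstrip(" ")                  # RTRIM removes trailing spaces


for sample in ("a,b", "a,b,c", "abc"):
    print(sample, "->", last_element(sample))
```

Without the appended ',', a field with no delimiter would give a position of 0 and therefore a length of -1 for RIGHT, which is why the extra comma is suggested for that case.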
Consider the string "55,33,255,66,55"
I am looking for a way to count the number of occurrences of a specific value ("55" in this case) in this string using a MySQL SELECT query.
Currently I am using the logic below to count:
select CAST((LENGTH("55,33,255,66,55") - LENGTH(REPLACE("55,33,255,66,55", "55", ""))) / LENGTH("55") AS UNSIGNED)
But the issue with this one is that it counts every occurrence of 55, including the one inside 255, so the result is 3,
but the desired output is 2.
Is there any way I can make this work correctly? Please suggest.
NOTE : "55" is the input we are giving and consider the value "55,33,255,66,55" is from a database field.
Regards,
Balan
You want to match on ',55,', but there's the first and last position to worry about. You can use the trick of adding commas to the front and back of the input to get around that:
select LENGTH('55,33,255,66,55') + 2 -
LENGTH(REPLACE(CONCAT(',', '55,33,255,66,55', ','), ',55,', 'xxx'))
Returns 2
I've used CONCAT to pre- and post-pend the commas (rather than adding a literal into the text) because I assume you'll be using this on a column not a literal.
Note also these improvements:
Removal of the cast - it is already numeric
By replacing with a string one less in length (ie ',55,' length 4 to 'xxx' length 3), the result doesn't need to be divided - it's already the correct result
2 is added to the length because of the two commas added front and back (no need to use CONCAT to calculate the pre-replace length)
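The length arithmetic can be checked with a short Python sketch. Both function names are made up for this example. `count_by_replace` mirrors the query above, while `count_by_split` is a plain split-and-count included only for comparison.

```python
def count_by_replace(csv: str, item: str) -> int:
    """Mirror of LENGTH(csv) + 2 - LENGTH(REPLACE(CONCAT(',', csv, ','), ',item,', filler))."""
    wrapped = "," + csv + ","
    needle = "," + item + ","
    filler = "x" * (len(needle) - 1)  # one character shorter than the needle
    return len(wrapped) - len(wrapped.replace(needle, filler))


def count_by_split(csv: str, item: str) -> int:
    """Reference count: split on commas and count exact matches."""
    return csv.split(",").count(item)


print(count_by_replace("55,33,255,66,55", "55"))
print(count_by_replace("55,55", "55"), count_by_split("55,55", "55"))
```

One caveat shown by the last line: two adjacent matches share a comma, so a single left-to-right replace pass counts only one of them. As far as I know MySQL's REPLACE also works without overlapping matches, so the same limitation would likely apply to the query when equal values can sit next to each other.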
Try this:
select CAST((LENGTH("55,33,255,66,55") + 2 - LENGTH(REPLACE(concat(",","55,33,255,66,55",","), ",55,", ",,"))) / LENGTH("55") AS UNSIGNED)
I would use a subselect. In that subselect I would replace every 255 with some other unique marker, and then count the new markers and the remaining 55s.
If(row = '255') then '1337'
for example.