I need to find the first and second "_" and extract whatever is between.
example data
doc_856_abc_123
doc_876_xyz_999
So far I have the following substring query, but I need help:
select SUBSTRING_INDEX( column, '_', 2 )
It is outputting
doc_856
doc_876
How do I combine the above query with, maybe, another substring call to get the desired results? Which would be:
856
876
Just apply SUBSTRING_INDEX again to the resulting string:
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(column, '_', 2 ), '_', -1)
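If you want to check the logic outside the database, here is a rough Python sketch of the two nested calls. The substring_index helper is my own stand-in for the MySQL function, not part of any library, and it only mimics the simple cases (a non-overlapping delimiter, and an assumed empty result for a count of 0 or an empty delimiter).

```python
def substring_index(text, delim, count):
    """Rough Python stand-in for MySQL's SUBSTRING_INDEX (illustration only)."""
    if count == 0 or delim == "":
        # Assumed edge-case behaviour; str.split also rejects "" as a separator.
        return ""
    parts = text.split(delim)
    # A positive count keeps the first `count` pieces, a negative one keeps the last.
    kept = parts[:count] if count > 0 else parts[count:]
    return delim.join(kept)


for value in ["doc_856_abc_123", "doc_876_xyz_999"]:
    first_two = substring_index(value, "_", 2)    # e.g. "doc_856"
    middle = substring_index(first_two, "_", -1)  # e.g. "856"
    print(value, "->", first_two, "->", middle)
```

The inner call cuts the string off before the second "_", and the outer call with -1 keeps only what follows the last remaining "_".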
Related
I have JSON stored in a MySQL database (version 5.6.17) that I'm trying to regex into a column to retrieve a list of campaign IDs. My query is as follows:
SELECT JSON REGEXP '"id":([0-9]*)' AS id
FROM PROD_APPNEXUS.dimension_json_creatives;
where JSON is a column containing the data I need to parse as ID. I know REGEXP can be used for strings in SELECT queries (i.e. SELECT 'foobar' REGEXP '([a-z]+)' AS foobar) but can columns be pattern matched in the same way?
Would there be a way to cast the JSON column as string and then regex?
Any help would be appreciated!
Thanks,
Sam
You can use replace and substring_index to split your column, like this:
SELECT replace(substring_index(JSON, ':', -1), '"', '') AS id
FROM PROD_APPNEXUS.dimension_json_creatives;
When I run the SQL below, it returns aaaa:
select replace(substring_index('"id":"aaaa"', ':', -1), '"', '');
I assumed that the values in your JSON do not contain a ':' character.
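To see why that assumption matters, here is a small Python sketch of the same split-and-strip idea. The function name and the second sample value are made up for illustration.

```python
def last_part_without_quotes(text):
    # Same idea as replace(substring_index(text, ':', -1), '"', ''):
    # keep what follows the last ':' and drop the double quotes.
    return text.rsplit(":", 1)[-1].replace('"', "")


print(last_part_without_quotes('"id":"aaaa"'))                 # aaaa
print(last_part_without_quotes('"url":"http://example.com"'))  # //example.com
```

The second call shows the failure mode: as soon as the value itself contains a ':', only the part after the last one survives.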
I want to find common parent folder in my table images with field path.
For example, my table has these 5 rows:
1 c:\images\site1\root\img\logo.png
2 c:\images\site1\root\resource\test1.png
3 c:\images\site1\root\resource\test2.png
4 c:\images\site1\root\resource\test3.png
5 c:\images\site1\root\images\background.png
And I want to have "root" folders:
c:\images\site1\root\img\
c:\images\site1\root\resource\
c:\images\site1\root\images\
What is the SQL request to get this, please?
Many thanks.
This can be a bit tricky. If you know that the path does not share the name of the file, you could replace the last part with an empty string:
select distinct replace(path, substring_index(path, '\\', -1), '')
from images;
However, this may not always be true.
What you want is everything before the last backslash (written '\\' in a MySQL string literal). Here is a method using substring_index():
select distinct substring_index(path, '\\',
    length(path) - length(replace(path, '\\', ''))
    )
from images;
The difference in lengths simply counts the number of separator characters in the string. If the path has one separator (e.g. 'a\b'), then the argument to substring_index() is 1, which is what you want: everything before that separator.
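Both approaches can be tried out in a few lines of Python. This is only a sketch: parent_folder and strip_file_name are names I made up, and the path c:\test\test is an invented example of a file that shares its name with a folder.

```python
def parent_folder(path, sep="\\"):
    # Number of separators, computed the same way as in the query:
    # length(path) - length(replace(path, sep, '')).
    separator_count = len(path) - len(path.replace(sep, ""))
    # Keep everything before the last separator.
    return sep.join(path.split(sep)[:separator_count])


def strip_file_name(path, sep="\\"):
    # The first approach: blank out every occurrence of the last path component.
    file_name = path.split(sep)[-1]
    return path.replace(file_name, "")


print(parent_folder(r"c:\images\site1\root\img\logo.png"))  # c:\images\site1\root\img
print(parent_folder(r"c:\test\test"))                        # c:\test
print(strip_file_name(r"c:\test\test"))                      # c: plus two backslashes, folder name lost
```

The last line is the case where the replace-based version goes wrong, because the file name also occurs earlier in the path.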
I am querying a table and selecting one column whose values match the regular expression (\/.+\/\?).
Content of the resulted column is like:
/Anything here/?
Example output:
\abc\cdf\?....
\ab\?....
\abc\cdf\?....
\sb\?....
where '....' can be anything
The result I want is the unique values before \?, so that rows whose regexp-matched content is duplicated are shown only once (above, \abc\cdf\?.... appears twice instead of once):
\abc\cdf\?....
\ab\?....
\sb\?....
OR
\abc\cdf\?
\ab\?
\sb\?
I have searched a lot but couldn't find anything. Oracle has regexp_substr, but that does not work in the SQL I'm using.
If someone could help me with the SQL query, that would be awesome.
If you want everything before the last \, then you can use substring_index() and some string manipulation:
select substring_index(col, '\\',
           length(col) - length(replace(col, '\\', ''))
       ) as firstpart,
       count(*)
from table t
group by substring_index(col, '\\',
             length(col) - length(replace(col, '\\', ''))
         );
I'd like to extract the number between NUMBER= and ;. So far I can extract the data from the number onwards, but I don't want anything after the number, e.g.:
SELECT
    SUBSTRING(field, LOCATE('NUMBER=', field) + 7)
FROM table
Data field:
DATA:PASS=X12;NUMBER=331;FIELD=1
DATA:PASS=X12;NUMBER=2;FOO=BAR;FIELD=1
Desired Output:
331
2
You can use a combination of SUBSTRING_INDEX functions:
SELECT
    SUBSTRING_INDEX(
        SUBSTRING_INDEX(field, 'NUMBER=', -1),
        ';',
        1)
FROM
    tablename
The inner SUBSTRING_INDEX returns everything after the NUMBER= string, while the outer one returns everything before the first ; in the string returned by the inner function.
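The same two steps, sketched in Python for anyone who wants to test the idea quickly. extract_number is a made-up name, and the third row (without NUMBER=) is an invented input added to show an edge case.

```python
def extract_number(field):
    after_key = field.rsplit("NUMBER=", 1)[-1]  # text after the last "NUMBER="
    return after_key.split(";", 1)[0]           # text before the first ";"


rows = [
    "DATA:PASS=X12;NUMBER=331;FIELD=1",
    "DATA:PASS=X12;NUMBER=2;FOO=BAR;FIELD=1",
    "DATA:PASS=X12;FIELD=1",  # no NUMBER= at all
]
for row in rows:
    print(extract_number(row))
```

Note the third row: when the key is missing, the first step leaves the string untouched, so you get DATA:PASS=X12 back rather than an empty value. As far as I know SUBSTRING_INDEX behaves the same way when the delimiter is not found, so it may be worth adding a WHERE field LIKE '%NUMBER=%' filter.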
I'm building a basic search functionality, using LIKE (I'd be using fulltext but can't at the moment) and I'm wondering if MySQL can, on searching for a keyword (e.g. WHERE field LIKE '%word%') return 20 words either side of the keyword, as well?
You can do it all in the query using SUBSTRING_INDEX:
CONCAT_WS(
    ' ',
    -- 20 words before
    TRIM(
        SUBSTRING_INDEX(
            SUBSTRING(field, 1, INSTR(field, 'word') - 1 ),
            ' ',
            -20
        )
    ),
    -- your word
    'word',
    -- 20 words after
    TRIM(
        SUBSTRING_INDEX(
            SUBSTRING(field, INSTR(field, 'word') + LENGTH('word') ),
            ' ',
            20
        )
    )
)
Use the INSTR() function to find the position of the word in the string, and then use SUBSTRING() function to select a portion of characters before and after the position.
You'd have to make sure that your SUBSTRING calls don't use negative values, or you'll get weird results.
Try that, and report back.
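For comparison, here is roughly the same find-then-cut idea as a Python sketch, in case you end up doing it in application code. The excerpt function and its n parameter are my own names; it splits on any whitespace, so it will not match the SQL output character for character.

```python
def excerpt(text, word, n=20):
    """Return up to n words on each side of the first occurrence of `word`."""
    position = text.find(word)
    if position == -1:
        return None  # keyword not present, nothing to cut around
    before = text[:position].split()[-n:]            # last n words before the keyword
    after = text[position + len(word):].split()[:n]  # first n words after it
    return " ".join(before + [word] + after)


print(excerpt("a b c d e f g", "d", n=2))  # b c d e f
```

Checking for the not-found case first is what keeps the positions from going negative.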
I don't think it's possible to limit the number of words returned; however, to limit the number of chars returned you could do something like:
SELECT SUBSTRING(field_name, LOCATE('keyword', field_name) - chars_before, total_chars) FROM table_name WHERE field_name LIKE "%keyword%"
chars_before - the number of chars you wish to select before the keyword(s)
total_chars - the total number of chars you wish to select
For example, the following would return 30 chars of data starting from 15 chars before the keyword:
SUBSTRING(field_name, LOCATE('keyword', field_name) - 15, 30)
Note: as aryeh pointed out, any negative values in SUBSTRING() buggers things up considerably - for example if the keyword is found within the first [chars_before] chars of the field, then the last [chars_before] chars of data in the field are returned.
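One way around the negative start is to clamp it. Here is a Python sketch of that idea; snippet is a made-up name and the defaults just mirror the 15/30 example. In the query itself the same clamp could presumably be written with GREATEST(LOCATE('keyword', field_name) - 15, 1), though I have not tested that.

```python
def snippet(text, keyword, chars_before=15, total_chars=30):
    position = text.find(keyword)
    if position == -1:
        return ""  # keyword not found
    # Clamp the start so a keyword near the beginning never yields a negative offset.
    start = max(position - chars_before, 0)
    return text[start:start + total_chars]


print(snippet("keyword here and more text after it", "keyword", 15, 10))  # keyword he
print(snippet("0123456789abcdefghijKEYxyz", "KEY", 5, 10))                # fghijKEYxy
```

Python slices also treat negative offsets specially (they count from the end), so the clamp is needed there for the same reason.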
I think your best bet is to get the result via the SQL query and apply a regular expression programmatically that will allow you to retrieve a group of words before and after the searched word.
I can't test it now, but the regular expression should be something like:
.*(\w+)\s*WORD\s*(\w+).*
where you replace WORD with the searched word and use regex group 1 as the before-words and group 2 as the after-words.
I will test it later when I can ask my RegexBuddy if it will work :) and I will post it here
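A possible variant in Python, as a sketch only. With the pattern as written above, the greedy leading .* leaves just one character for group 1, and group 2 captures a single word, so this version uses bounded repetition instead. The words_around name and the n parameter are my own; it treats anything separated by whitespace as a word.

```python
import re


def words_around(text, word, n=20):
    """Return (words_before, words_after), up to n each, around the first match."""
    pattern = re.compile(
        r"((?:\S+\s+){0,%d})%s((?:\s+\S+){0,%d})" % (n, re.escape(word), n)
    )
    match = pattern.search(text)
    if match is None:
        return None
    # Group 1 holds the words before the keyword, group 2 the words after it.
    return match.group(1).split(), match.group(2).split()


print(words_around("one two three four five six seven", "four", n=2))
# (['two', 'three'], ['five', 'six'])
```

re.escape keeps any special characters in the search word from being read as regex syntax.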