Find values that are start of SearchString - mysql

For a given search string s, I want to find values from an indexed varchar(255) field (~1m rows), so that s.startsWith(value) == true.
Example:
s = "hello world"
matches: "h", "hello", "hello world"
Is that possible?

You can use INSTR with its arguments in the opposite order from the usual one:
SELECT *
FROM mytable
WHERE INSTR('hello world', mycol) = 1
This returns records where mycol is a substring of "hello world" that starts at position 1, that is, a prefix of it. So any of the following will match:
h
he
hel
hell
hello
hello (with trailing space)
hello w
hello wo
hello wor
hello worl
hello world
You might get better performance by adding the following redundant condition, which can hint the SQL engine to use an index on mycol:
SELECT *
FROM mytable
WHERE INSTR('hello world', mycol) = 1
AND mycol like 'h%'
Be aware that even when the index is used, this does not guarantee a faster result. Imagine a table with these values:
hel
helaaaaaaa
helaaaaaab
helaaaaaac
...(1000 more records like that, and finally:)
hello world
... then the engine would still scan all of these records and return only the first and the last.
If you have an application executing this query, you could let it build the SQL dynamically, so that it looks like this:
SELECT *
FROM mytable
WHERE mycol IN ('h', 'he', 'hel', 'hell', 'hello', 'hello ', 'hello w',
'hello wo', 'hello wor', 'hello worl', 'hello world')
This potentially has the best performance. If the engine still does not use the index for the IN list, this more verbose form, with one equality lookup per prefix, should do it:
SELECT * FROM mytable WHERE mycol = 'h'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'he'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hel'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hell'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello '
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello w'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello wo'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello wor'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello worl'
UNION ALL
SELECT * FROM mytable WHERE mycol = 'hello world'
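The "build the SQL dynamically" step mentioned above could look roughly like this on the application side. This is only a sketch: the function names are made up for illustration, the table and column names are the ones used in the answer, and the %s placeholder style is an assumption (common MySQL drivers for Python use it, but yours may differ).

```python
def prefixes(s):
    """Return every non-empty prefix of s, shortest first."""
    return [s[:i] for i in range(1, len(s) + 1)]


def build_prefix_query(s, table="mytable", column="mycol"):
    """Build a parameterized IN query that matches any prefix of s.

    table/column default to the names from the answer. The %s
    placeholders are an assumed driver convention, not part of MySQL.
    """
    values = prefixes(s)
    if not values:
        # An empty search string would produce "IN ()", which is invalid SQL.
        raise ValueError("search string must not be empty")
    placeholders = ", ".join(["%s"] * len(values))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return sql, values


sql, params = build_prefix_query("hello")
print(sql)
print(params)
```

Passing the prefixes as parameters rather than pasting them into the SQL text keeps quotes and other special characters in the search string from breaking the statement.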

You want to use like for this:
where col like 'hello%' -- or whatever
or
where col like concat(@s, '%')
The reason for using like is that it can make use of an index, because the pattern does not start with wildcard characters.
EDIT:
I might have the logic backwards. If so, you can still use like:
where @s like concat(col, '%')
In this case, though, an index cannot be used, because the column is an argument to a function.

You can use LIKE in SQL:
select * from tableName
where columnName like 'h%'

Related

Return list of words that appear directly before "word of interest"

Essentially I have a column of long text values, ex:
ID | text
0 | Hi my name is Brian.
1 | I think Brian sucks.
And I want to write a function that returns a list of the words that come right before a "word of interest". So if I searched for "Brian", the function would return "is" and "think", because both words appear right before "Brian".
I have this code so far but it is not working:
select case when (select w.t regexp concat('[[:<:]]', w.v)) = 1
then substr(w.t, 1, locate(w.v, w.t)-1) else null end as 'left_word',
w.v as word
from (
select text from table as t, "Brian" as v
) as w;
Any ideas?
You have a syntax error in your query:
select text from table as t, "Brian" as v
should be:
select text as t, "Brian" as v from table
Once you fix that, your output is:
Hi my name is
I think
You can then use SUBSTRING_INDEX to extract the last word out of those strings:
select case when w.t regexp concat('[[:<:]]', w.v, '[[:>:]]')
then substring_index(substr(w.t, 1, locate(w.v, w.t)-2), ' ', -1)
else null
end as 'left_word',
w.v as word
from (
select text as t, "Brian" as v
from `table`
) as w;
Output:
left_word word
is Brian
think Brian
Note that for MySQL 8.0.4 and later, you need to use \\b as the word boundary instead of [[:<:]] and [[:>:]], so your query becomes:
select case when w.t regexp concat('\\b', w.v, '\\b')
then substring_index(substr(w.t, 1, locate(w.v, w.t)-2), ' ', -1)
else null
end as 'left_word',
w.v as word
from (
select text as t, "Brian" as v
from `table`
) as w;
You can try this query:
select
SUBSTRING_INDEX(SUBSTRING_INDEX(text, 'Brian', 1),' ',-2)
from `table`
where text LIKE '%Brian%'
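For comparison, the same "word right before the word of interest" logic can be sketched outside the database. The function name is hypothetical, and the definition of a word (a run of \w characters) is an assumption that may not match your data.

```python
import re


def words_before(text, word):
    """Return the words that appear directly before each whole-word
    occurrence of `word` in `text`."""
    # (\w+)  the preceding word we want to capture
    # \W+    whatever separates it from the word of interest
    # \b     makes sure we do not match "Brian" inside "Brianna"
    pattern = r"(\w+)\W+" + re.escape(word) + r"\b"
    return re.findall(pattern, text)


rows = ["Hi my name is Brian.", "I think Brian sucks."]
for row in rows:
    print(words_before(row, "Brian"))
```

Unlike the LOCATE-based queries, this reports every occurrence in a row, not just the first one.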

Including local variable in concat string in MySQL

I'm looking to take a specified string and query a table where a concat of 2 fields is equal to the string.
set @fab = "36013-601301-11";
set @job = substring_index(@fab, '-', 1);
set @fabnumba = trim(leading LEFT(@fab,char_length(@job)+1) from @fab);
select * from (select JobNumber, concat(JobNumber, '-', LotNumber) as bomfab from qiw_powerbi) base
where bomfab LIKE concat(@job,"-", @fabnumba)
If I try the following it fails:
WHERE bomfab LIKE "36013-601301-11"
However, this attempt works:
WHERE bomfab LIKE "36013-%601301-11"
How can I concat() with the variables @job and @fabnumba to do this?
Are you sure that the LotNumber values from qiw_powerbi are what you are expecting? They don't have any leading spaces?
What happens if you try adding a TRIM function to LotNumber:
select * from (select JobNumber, concat(JobNumber, '-', TRIM(LotNumber)) as bomfab from qiw_powerbi) base
where bomfab LIKE concat(@job,"-", @fabnumba)

MySQL Regex Replace Query

I have a field with this value:
TEST:ATEST:TESTA
And I want to replace "TEST" with "NEW", I have tried this query:
UPDATE `table` SET `field` = REPLACE(`field`, 'TEST', 'NEW') WHERE `field` REGEXP 'TEST';
The result was:
NEW:ANEW:NEWA
Q: How could I do the replacement query so the result would be like this:
NEW:ATEST:TESTA
It is a bit of a pain, but you can do it this way:
UPDATE `table`
SET field = substr(REPLACE(concat(':', field, ':'), ':TEST:', ':NEW:'),
2, length(REPLACE(concat(':', field, ':'), ':TEST:', ':NEW:')) - 2)
WHERE concat(':', field, ':') LIKE '%:TEST:%';
I prefer LIKE to REGEXP because LIKE at least has a chance of using an index. That is not possible in this case, because the pattern starts with a wildcard and the column is wrapped in concat().
This is delimiting the values with colons at the beginning and the end, and only replacing fully delimited values. The trick is to then remove the additional colons.
You can try http://sqlfiddle.com/#!9/4e66b/3
so the update query is (if table name = table1, field name = field1, and there is unique column id):
UPDATE `table1`
INNER JOIN
(SELECT id,
@set_idx:=FIND_IN_SET('TEST',REPLACE(field1,':',',')),
@set_size:=LENGTH(field1)-LENGTH(REPLACE(field1,':',''))+1,
CASE
WHEN @set_idx=1 THEN CONCAT('NEW',SUBSTRING(field1, 5)) -- position 5 skips the 4 characters of 'TEST'
WHEN @set_idx>1 THEN CONCAT(SUBSTRING_INDEX(field1, ':',@set_idx-1),':NEW', IF(@set_size>@set_idx,CONCAT(':',SUBSTRING_INDEX(field1, ':',-(@set_size-@set_idx))),''))
END as new
FROM table1
WHERE `field1` REGEXP '(^TEST$)|(^TEST:)|(:TEST$)|(:TEST:)'
) t
ON t.id = table1.id
SET table1.field1 = t.new;
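The idea behind both answers, replacing only whole colon-delimited values, can be sketched in application code like this. The function name and its parameters are hypothetical and only illustrate the logic; it is not a drop-in replacement for the UPDATE statements.

```python
def replace_token(value, old, new, sep=":"):
    """Replace items of a delimited string that are exactly equal to `old`.

    Items that merely contain `old` (such as ATEST or TESTA) are left alone.
    """
    return sep.join(new if part == old else part for part in value.split(sep))


print(replace_token("TEST:ATEST:TESTA", "TEST", "NEW"))
```

Because the string is split first, every exact item is replaced, including adjacent ones such as TEST:TEST.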

SQL: select unique substrings from the table by mask

There is a SQL table mytable that has a column mycolumn.
That column has text inside each cell. Each cell may contain substrings such as "this.text/31/" or "this.text/72/" (the number in those substrings can be anything) as part of the string.
What SQL query should be executed to display a list of unique such substrings?
P.S. Of course, some cells may contain several such substrings.
And here are the answers to the questions from the comments:
The query is supposed to work on SQL Server.
The preferred output should contain the whole substring, not only the numeric part. The part between the first "/" and the second "/" might actually not be just a number.
And the column is (probably) of varchar type.
Example:
mycolumn contains such values:
abcd/eftthis.text/31/sadflh adslkjh
abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh
ljgkhjgadsvlkgnl
uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg
khjgbkjyghbk
The query should display:
this.text/31/
this.text/44/
this.text/447/
How about using a recursive CTE:
CREATE TABLE #myTable
(
myColumn VARCHAR(100)
)
INSERT INTO #myTable
VALUES
('abcd/eftthis.text/31/sadflh adslkjh'),
('abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh'),
('ljgkhjgadsvlkgnl'),
('uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg'),
('khjgbkjyghbk')
;WITH CTE
AS
(
SELECT MyColumn,
CHARINDEX('this.text/', myColumn, 0) AS startPos,
CHARINDEX('/', myColumn, CHARINDEX('this.text/', myColumn, 1) + 10) AS endPos
FROM #myTable
WHERE myColumn LIKE '%this.text/%'
UNION ALL
SELECT T1.MyColumn,
CHARINDEX('this.text/', T1.myColumn, C.endPos) AS startPos,
CHARINDEX('/', T1.myColumn, CHARINDEX('this.text/', T1.myColumn, c.endPos) + 10) AS endPos
FROM #myTable T1
INNER JOIN CTE C
ON C.myColumn = T1.myColumn
WHERE SUBSTRING(T1.MyColumn, C.EndPos, 100) LIKE '%this.text/%'
)
SELECT DISTINCT SUBSTRING(myColumn, startPos, endPos - startPos + 1) -- + 1 keeps the closing '/'
FROM CTE
Having a table named test with the following data:
COLUMN1
aathis.text/31/
this.text/1/
bbbthis.text/72/sksk
could this be what you are looking for?
select SUBSTR(COLUMN1,INSTR(COLUMN1,'this.text', 1 ),INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 )+10) - INSTR(COLUMN1,'this.text', 1 )+1) from test;
result:
this.text/31/
this.text/1/
this.text/72/
I see your problem:
Assume the same table as above but now with the following data:
this.text/77/
xxthis.text/33/xx
xthis.text/11/xxthis.text/22/x
xthis.text/1/x
The following might help you:
SELECT SUBSTR(COLUMN1, INSTR(COLUMN1,'this.text', 1 ,1), INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 ,1)+10) - INSTR(COLUMN1,'this.text', 1 ,1)+1) FROM TEST
UNION
SELECT CASE WHEN (INSTR(COLUMN1,'this.text', 1,2 ) >0) THEN
SUBSTR(COLUMN1, INSTR(COLUMN1,'this.text', 1,2 ), INSTR(COLUMN1,'/',INSTR(COLUMN1,'this.text', 1 ,2),2) - INSTR(COLUMN1,'this.text', 1,2 )+1) end FROM TEST;
it will generate the following result:
this.text/1/
this.text/11/
this.text/22/
this.text/33/
this.text/77/
The downside is that you need to add a SELECT statement for every occurrence of "this.text" you might have. If the same cell can contain 100 occurrences of "this.text", that might be a problem.
SQL> select SUBSTR(column_name,1,9) from tablename;
column_name
this.text
SELECT REGEXP_SUBSTR(column_name,'this.text/[[:digit:]]+/')
FROM table_name
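If the rows can be processed outside the database, the extraction of unique substrings can be sketched with a regular expression. The function name is hypothetical, and the pattern [^/]+ is an assumption based on the remark in the question that the part between the slashes may not be just a number.

```python
import re


def unique_matches(rows, pattern=r"this\.text/[^/]+/"):
    """Collect the distinct substrings matching `pattern` across all rows."""
    found = set()
    for row in rows:
        # findall returns every non-overlapping match in the row,
        # so cells with several occurrences are handled too.
        found.update(re.findall(pattern, row))
    return sorted(found)


rows = [
    "abcd/eftthis.text/31/sadflh adslkjh",
    "abcd/eftthis.text/44/khjgb ljgnkhj this.text/447/lhkjgnkjh",
    "ljgkhjgadsvlkgnl",
    "uygouyg/this.text/31/luinluinlugnthis.text/31/ouygnouyg",
    "khjgbkjyghbk",
]
for match in unique_matches(rows):
    print(match)
```

With the sample data from the question this prints this.text/31/, this.text/44/ and this.text/447/, each once.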

Detect if value is number in MySQL

Is there a way to detect if a value is a number in a MySQL query? Such as
SELECT *
FROM myTable
WHERE isANumber(col1) = true
You can use Regular Expression too... it would be like:
SELECT * FROM myTable WHERE col1 REGEXP '^[0-9]+$';
Reference:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
This should work in most cases.
SELECT * FROM myTable WHERE concat('',col1 * 1) = col1
It doesn't work for non-standard numbers like
1e4
1.2e5
123. (trailing decimal)
If your data is 'test', 'test0', 'test1111', '111test', '111'
To select all records where the data is a simple int:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+$';
Result: '111'
(In regex, ^ means begin, and $ means end)
To select all records where an integer or decimal number exists:
SELECT *
FROM myTable
WHERE col1 REGEXP '^[0-9]+\\.?[0-9]*$'; -- also matches e.g. 123.12
Result: '111' (same as last example)
Finally, to select all records where a number exists anywhere in the value, use this:
SELECT *
FROM myTable
WHERE col1 REGEXP '[0-9]+';
Result: 'test0' and 'test1111' and '111test' and '111'
SELECT * FROM myTable
WHERE col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$'
Will also match signed decimals (like -1.2, +0.2, 6., 2e9, 1.2e-10).
Test:
drop table if exists myTable;
create table myTable (col1 varchar(50));
insert into myTable (col1)
values ('00.00'),('+1'),('.123'),('-.23e4'),('12.e-5'),('3.5e+6'),('a'),('e6'),('+e0');
select
col1,
col1 + 0 as casted,
col1 REGEXP '^[+-]?[0-9]*([0-9]\\.|[0-9]|\\.[0-9])[0-9]*(e[+-]?[0-9]+)?$' as isNumeric
from myTable;
Result:
col1 | casted | isNumeric
-------|---------|----------
00.00 | 0 | 1
+1 | 1 | 1
.123 | 0.123 | 1
-.23e4 | -2300 | 1
12.e-5 | 0.00012 | 1
3.5e+6 | 3500000 | 1
a | 0 | 0
e6 | 0 | 0
+e0 | 0 | 0
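The same pattern can be tried out quickly outside MySQL. The sketch below assumes Python's re module; the anchors are replaced by fullmatch, and re.IGNORECASE is passed as an approximation of MySQL's usually case-insensitive matching on non-binary strings. The name is_numeric is made up for illustration.

```python
import re

# Same pattern as in the query above, without the ^ and $ anchors
# because fullmatch() already requires the whole string to match.
NUMERIC = re.compile(
    r"[+-]?[0-9]*([0-9]\.|[0-9]|\.[0-9])[0-9]*(e[+-]?[0-9]+)?",
    re.IGNORECASE,
)


def is_numeric(value):
    """Return True if the whole string looks like a signed decimal number."""
    return NUMERIC.fullmatch(value) is not None


samples = ["00.00", "+1", ".123", "-.23e4", "12.e-5", "3.5e+6", "a", "e6", "+e0"]
for value in samples:
    print(value, is_numeric(value))
```

The first six samples are accepted and the last three rejected, matching the isNumeric column in the result table.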
Returns numeric rows
I found a solution with the following query, and it works for me:
SELECT * FROM myTable WHERE col1 > 0;
This query returns only the rows where col1 holds a number greater than zero.
Returns non numeric rows
If you want to find the rows where the column is not numeric, try this one, which uses the trick (!col1 > 0):
SELECT * FROM myTable WHERE !col1 > 0;
This answer is similar to Dmitry's, and is meant to allow for decimals as well as positive and negative numbers. Note that the pattern as shown here matches unsigned integers only:
select * from table where col1 REGEXP '^[[:digit:]]+$'
Use a UDF (user-defined function):
CREATE FUNCTION isnumber(inputValue VARCHAR(50))
RETURNS INT
DETERMINISTIC -- same input always gives the same result; required when binary logging is enabled
BEGIN
IF (inputValue REGEXP ('^[0-9]+$'))
THEN
RETURN 1;
ELSE
RETURN 0;
END IF;
END;
Then when you query
select isnumber('383XXXX')
--returns 0
select isnumber('38333434')
--returns 1
select isnumber(mycol) mycol1, col2, colx from tablex;
-- will return 1s and 0s for column mycol1
--you can enhance the function to take decimals, scientific notation , etc...
The advantage of using a UDF is that you can use it on the left or the right side of the comparison in your WHERE clause. This greatly simplifies your SQL before it is sent to the database:
SELECT * from tablex where isnumber(columnX) = isnumber('UnkownUserInput');
hope this helps.
Another alternative that seems faster than REGEXP on my computer is
SELECT * FROM myTable WHERE col1*0 != col1;
This will select all rows where col1 starts with a numeric value.
Still missing this simple version:
SELECT * FROM myTable WHERE `col1` + 0 = `col1`
(addition should be faster than multiplication)
Or the slowest version, for further experimenting:
SELECT *,
CASE WHEN `col1` + 0 = `col1` THEN 1 ELSE 0 END AS `IS_NUMERIC`
FROM `myTable`
HAVING `IS_NUMERIC` = 1
You can use a regular expression. For more detail, see https://dev.mysql.com/doc/refman/8.0/en/regexp.html
I used ^([,|.]?[0-9])+$. This also handles decimal and float numbers:
SELECT
*
FROM
mytable
WHERE
myTextField REGEXP "^([,|.]?[0-9])+$"
If your search is simple, I recommend comparing column*1 with column. Interestingly, it works and is fast on varchar/char fields:
SELECT * FROM myTable WHERE column*1 = column;
ABC*1 => 0 (not equal to ABC)
AB15*1 => 0 (not equal to AB15)
15AB*1 => 15 (not equal to 15AB)
15*1 => 15 (equal to 15)
SELECT * FROM myTable WHERE sign (col1)!=0
Of course, sign(0) is zero, but then you could extend your query to:
SELECT * FROM myTable WHERE sign (col1)!=0 or col1=0
UPDATE: This is not 100% reliable, because "1abc" would return a sign of 1, while "ab1c" would return zero, so this only works for text that does not begin with a number.
You can do it using CAST:
SELECT * from tbl where col1 = concat(cast(col1 as decimal), "")
I have found that this works quite well
if(col1/col1= 1,'number',col1) AS myInfo
Try dividing by 1:
select if(value/1>0 or value=0,'its a number', 'its not a number') from table