I was looking for a way to exclude values containing a '_' from the result set of a MySQL query.
Why would the following sql statement return no results?
select questionKey
from labels
where set_id = 674
and questionKey like 'Class%'
and questionKey not like '%_%' ;
which was the first SQL I tried, whereas
select questionKey
from labels
where set_id = 674
and questionKey like 'Class%'
and locate('_',questionKey) = 0 ;
returns
questionKey
ClassA
ClassB
ClassC
ClassD
ClassE
ClassF
ClassG
ClassNPS
ClassDis
which is the result I wanted. Both SQL statements appear to me to be logically equivalent though they are not.
As tadman and PM77 already pointed out, it's a special character. If you want to use the first query, try to escape it like this (note the backslash):
select questionKey
from labels
where set_id = 674
and questionKey like 'Class%'
and questionKey not like '%\_%' ;
In the LIKE context _ takes on special meaning and represents any single character. It's the only one other than % that means something here.
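To see the wildcard at work outside MySQL, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. This rests on some assumptions: SQLite treats _ and % in LIKE the same way, but it has no default escape character, so the sketch names one explicitly with ESCAPE '!' (that form also works in MySQL). The table contents are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE labels (set_id INTEGER, questionKey TEXT)")
conn.executemany(
    "INSERT INTO labels VALUES (674, ?)",
    [("ClassA",), ("ClassB",), ("Class_A_1",), ("ClassNPS",)],  # made-up rows
)

def keys(condition):
    """Run the question's query with one extra condition appended."""
    sql = ("SELECT questionKey FROM labels "
           "WHERE set_id = 674 AND questionKey LIKE 'Class%' AND "
           + condition + " ORDER BY questionKey")
    return [row[0] for row in conn.execute(sql)]

# Unescaped: '_' means "any single character", so '%_%' matches every
# non-empty value and NOT LIKE rejects all of them.
unescaped = keys("questionKey NOT LIKE '%_%'")

# Escaped: '!_' is a literal underscore.
escaped = keys("questionKey NOT LIKE '%!_%' ESCAPE '!'")

print(unescaped)
print(escaped)
```

Only the escaped form keeps the keys without an underscore; the unescaped form returns nothing, which is the behavior described in the question.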
Your LOCATE() version is probably the best here, though it's worth noting that doing table scans like this can get cripplingly slow on large amounts of data. If underscore represents something important you might want to have a flag field you can set and index.
You could also use a regular expression to try and match records with a single condition:
REGEXP '^Class[^_]+$'
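Whether the pattern carries an end anchor ($) matters: REGEXP succeeds if the pattern matches anywhere in the value, so an unanchored '^Class[^_]+' still accepts ClassA_1 on the strength of its ClassA prefix. A rough illustration with Python's re module follows (an assumption: re.search stands in for REGEXP's unanchored matching, and the sample keys are made up).

```python
import re

keys = ["ClassA", "ClassNPS", "ClassA_1", "Class_B"]  # made-up sample keys

# No end anchor: only the start of the value has to fit the pattern.
unanchored = [k for k in keys if re.search(r"^Class[^_]+", k)]

# With '$': the whole value after "Class" must be free of underscores.
anchored = [k for k in keys if re.search(r"^Class[^_]+$", k)]

print(unanchored)
print(anchored)
```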
I am trying to pull a product code from a long set of strings formatted like URLs. The pattern is always 3 letters followed by 3 or 4 numbers (ex. ???### or ???####). I have tried using REGEXP and LIKE syntax, but my results are off for both, and I am not sure which operators to use.
The first select statement is close to trimming the URL to show just the code, but oftentimes will show a random string of numbers it may find in the URL string.
The second select statement is more rudimentary, but I am unsure which operators to use.
Which would be the quickest solution?
SELECT columnName, SUBSTR(columnName, LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", columnName), LENGTH(columnName) - LOCATE(columnName REGEXP "[^=\-][a-zA-Z]{3}[\d]{3,4}", REVERSE(columnName))) AS extractedData FROM tableName
SELECT columnName FROM tableName WHERE columnName LIKE '%___###%' OR columnName LIKE '%___####%'
-- Will take a substring of this result as well
Example Data:
randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc
In this case, the desired string is "xyz123" and the location of said pattern is variable based on each entry.
EDIT
SELECT column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), SUBSTR(column, LOCATE(column REGEXP "([a-zA-Z]{3}[0-9]{3,4}$)", column), LENGTH(column) - LOCATE(column REGEXP "^.*[a-zA-Z]{3}[0-9]{3,4}", REVERSE(column))) AS extractData From mainTable
This expression is still not grabbing the right data, but I feel like it may get me closer.
I suggest using
REGEXP_SUBSTR(column, '(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])')
Details:
(?<=[&?]random_code=[^&#]{0,256}-) - immediately on the left, there must be ? or &, random_code=, and then zero to 256 chars other than & and #, followed by a - char
[a-zA-Z]{3} - three ASCII letters
[0-9]{3,4} - three to four ASCII digits
(?![^&#]) - that are followed either with &, # or end of string.
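The pieces above can be cross-checked outside the database. The sketch below is a Python translation, not the MySQL expression itself: Python's re module rejects variable-width lookbehinds, so the prefix is matched normally and the code is pulled out with a capture group instead. The helper name extract_code is made up for illustration.

```python
import re

# Prefix matched as ordinary text; the product code is capture group 1.
PATTERN = re.compile(
    r"[&?]random_code=[^&#]*-([a-zA-Z]{3}[0-9]{3,4})(?![^&#])"
)

def extract_code(url):
    """Return the 3-letter + 3-4-digit code after the last '-', or None."""
    m = PATTERN.search(url)
    return m.group(1) if m else None

base = ("randomwebsite.com/3982356923abcd1ab"
        "?random_code=12480712_ABC_DEF_ANOTHER_CODE-")
tail = "&hello_world=us&etc_etc"
samples = ["xyz123", "xyz4567", "xyz89", "xyz00000", "aaaaa11111"]

results = {s: extract_code(base + s + tail) for s in samples}
for sample, code in results.items():
    print(sample, "->", code)
```

Under this translation only the first two samples yield a code; the others have too few digits, too many digits, or too many letters, so the pattern finds nothing.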
See the online demo:
WITH cte AS ( SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz123&hello_world=us&etc_etc' val
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz4567&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz89&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-xyz00000&hello_world=us&etc_etc'
UNION ALL
SELECT 'randomwebsite.com/3982356923abcd1ab?random_code=12480712_ABC_DEF_ANOTHER_CODE-aaaaa11111&hello_world=us&etc_etc')
SELECT REGEXP_SUBSTR(val,'(?<=[&?]random_code=[^&#]{0,256}-)[a-zA-Z]{3}[0-9]{3,4}(?![^&#])') output
FROM cte
Output:
I'd make use of capture groups:
(?<=[=\-\\])([a-zA-Z]{3}[\d]{3,4})(?=[&])
I assume that with [^=\-] you wanted to match a code that has "-", "\" or "=" in front of it without including those characters in the result. To do that, use a positive lookbehind, (?<=...).
I also added a lookahead, (?=...), for "&".
If you'd like to fidget more with regex I recommend RegExr
I'd like to use an SQL query to find and replace multiple values. I've had a look at this question that shows the following answer:
UPDATE
YourTable
SET
Column1 = REPLACE(Column1,'a','b')
WHERE
Column1 LIKE '%a%'
How can I find and replace multiple values instead of just the one?
My data is like the following, there's hundreds of rows, I'm specifically wanting to target each product_id:123:
subscription_id,products
"128","product_id:268|quantity:1|total:3.15|meta:|tax:0;product_id:267|quantity:1|total:2.97|meta:|tax:0"
I need to replace the product id's with new products id's. So it'll be "everything matching 268 will become 195" and "everything matching 267 will become 194".
Is there an efficient way to do it other than taking the code block above and repeating it for each product? Can it be done in one sweep?
Simplest possible way would be to chain REPLACEs together, but considering the concatenated nature of the field you need to be sure you don't inadvertently target something that's not actually a product_id value. You can mitigate this by including some contextual content from the string value itself:
UPDATE YourTable
SET products = REPLACE(REPLACE(products, "product_id:267|", "product_id:194|"), "product_id:268|", "product_id:195|");
DBFiddle | MySQL 5.6 Reference Manual :: 13.2.8 REPLACE Statement
If there's some variability in how these strings might appear in a given field and you're running MySQL >=8.0, you can leverage something like REGEXP_REPLACE() to perform this same replacement using a defined RegExp pattern.
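As a rough check of the chained REPLACE approach, the sketch below runs the same UPDATE through Python's built-in sqlite3 module (an assumption: SQLite's REPLACE() behaves like MySQL's for this case). The sample row is altered from the question so that 267 also appears as a quantity, to show that the surrounding "product_id:" and "|" context keeps that value untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (subscription_id TEXT, products TEXT)")
conn.execute(
    "INSERT INTO YourTable VALUES (?, ?)",
    ("128",
     "product_id:268|quantity:267|total:3.15|meta:|tax:0;"   # quantity is made up
     "product_id:267|quantity:1|total:2.97|meta:|tax:0"),
)

# Both replacements in a single pass over the table.
conn.execute(
    "UPDATE YourTable SET products = "
    "REPLACE(REPLACE(products, 'product_id:267|', 'product_id:194|'), "
    "'product_id:268|', 'product_id:195|')"
)

updated = conn.execute("SELECT products FROM YourTable").fetchone()[0]
print(updated)
```

One thing to watch when chaining: if a new id from an inner REPLACE is also an old id in an outer one, it will be replaced a second time, so the order of the chain can matter.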
Yes, there are ways. For example, you can create a table like
replacements(id, oldval, newval)
and do the following:
UPDATE
Yourtable
JOIN
replacements
ON
Yourtable.Column1 LIKE CONCAT('%', replacements.oldval, '%')
SET
Yourtable.Column1 = REPLACE(Yourtable.Column1, replacements.oldval, replacements.newval);
The problem is that you would need to fill replacements with the pairs of oldval-newval, but MySQL cannot guess that. Insertion is as simple (assuming that id is auto_increment) as
INSERT INTO replacements(oldval, newval) VALUES
('a', 'b'),
('c', 'd'),
...
;
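One caveat with the joined UPDATE: MySQL's multiple-table UPDATE changes each matching row only once, even when it joins to several rows of replacements, so a value containing two old ids gets just one of them swapped per statement. A hedged workaround is to apply the pairs one at a time from application code. The sketch below does that with Python's built-in sqlite3 module standing in for MySQL (an assumption; the sample data is made up, and instr() is SQLite's counterpart to LOCATE()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Yourtable (Column1 TEXT)")
conn.execute("CREATE TABLE replacements "
             "(id INTEGER PRIMARY KEY, oldval TEXT, newval TEXT)")
conn.execute(
    "INSERT INTO Yourtable VALUES ('product_id:268|x;product_id:267|y')"
)
conn.executemany(
    "INSERT INTO replacements (oldval, newval) VALUES (?, ?)",
    [("product_id:268|", "product_id:195|"),
     ("product_id:267|", "product_id:194|")],
)

# Read the pairs first, then run one UPDATE per pair so that a row
# containing several old values has every one of them replaced.
pairs = conn.execute(
    "SELECT oldval, newval FROM replacements ORDER BY id"
).fetchall()
for oldval, newval in pairs:
    conn.execute(
        "UPDATE Yourtable SET Column1 = REPLACE(Column1, ?, ?) "
        "WHERE instr(Column1, ?) > 0",
        (oldval, newval, oldval),
    )

final = conn.execute("SELECT Column1 FROM Yourtable").fetchone()[0]
print(final)
```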
I am a beginner so please help me.
There are 2 things you need to combine in this case.
Because you didn't provide enough information in your question we have to guess what you mean by name. I'm going to assume that you have a single name column, but that would be unusual.
With strings, to match a character column that is not an exact match, you need to use LIKE which allows for wildcards.
You also need to negate the match, or in other words show things that are NOT (something).
First to match names that START with 'A'.
SELECT * FROM table_name WHERE name LIKE 'A%';
This should get you all the PEOPLE who have names that "Start with A".
Some databases are case sensitive. I'm not going to deal with that issue in depth. With MySQL's default (case-insensitive) collations it is not an issue, but that behavior is not universal: in some RDBMSs, like Oracle, you have to take extra steps to deal with mixed case in a column.
Now to deal with what you actually want, which is NOT (starting with A).
SELECT * FROM table_name WHERE name NOT LIKE 'A%';
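A small sketch of both queries, run through Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, with made-up names). It also shows one detail that is easy to miss: a row whose name is NULL is returned by neither LIKE nor NOT LIKE, so it needs its own IS NULL test if you want it included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (name TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?)",
    [("Alice",), ("Bob",), ("Andrew",), ("Carol",), (None,)],  # made-up rows
)

def names(where):
    """Return the names matching a WHERE condition, sorted by name."""
    sql = "SELECT name FROM table_name WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

starts_with_a = names("name LIKE 'A%'")
not_a = names("name NOT LIKE 'A%'")
# NULL compared with anything is NULL, so ask for it explicitly.
not_a_or_null = names("name NOT LIKE 'A%' OR name IS NULL")

print(starts_with_a)
print(not_a)
print(not_a_or_null)
```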
Your question should have more detail; however, you can use the SUBSTR function:
SELECT name FROM yourtable
WHERE SUBSTR(name,1,1) <> 'A'
A complete list of MySQL string functions is in the MySQL docs.
NOT REGEXP operator
In MySQL, NOT REGEXP negates a pattern match of a string expression expr against a pattern pat: it returns 1 when expr does not match. The pattern can be an extended regular expression.
Syntax:
expr NOT REGEXP pat
Query:
SELECT * FROM emp_table WHERE emp_name NOT REGEXP '^[a]';
or
SELECT * FROM emp_table WHERE emp_name NOT REGEXP '^a';
I am trying to do a simple query that has a where clause stating there is no match for 2 items:
where l.country not like \"%USA%\" or \"%CA%\" ORDER BY l.state
I also tried:
where l.country not like \"%USA%\" or l.country not like \"%CA%\" ORDER BY l.state
also tried:
where l.country not like (\"%USA%\", \"%CA%\") ORDER BY l.state
is there a way to use "not like" with more than one match?
This is your original condition:
where l.country not like "%USA%" or "%CA%" ORDER BY l.state
I assume you intend this to mean "the country is neither the USA nor CA."
If so, you would write it this way:
where l.country not like '%USA%' and l.country not like '%CA%' ORDER BY l.state
But there's no syntax in SQL for NOT LIKE 'X' OR 'Y'. The LIKE predicate has a left operand and a single right operand, no more.
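The second attempt in the question (two NOT LIKE terms joined by OR) fails for a different reason: almost every country lacks at least one of the two substrings, so the OR is true for nearly every row. A minimal sketch of the difference, using Python's built-in sqlite3 module as a stand-in for the real database (an assumption, with made-up rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE l (country TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO l VALUES (?, ?)",
    [("USA", "NY"), ("CA", "ON"), ("Mexico", "JAL"), ("UK", "KEN")],  # made up
)

def countries(where):
    """Return the countries matching a WHERE condition, sorted."""
    sql = "SELECT country FROM l WHERE " + where + " ORDER BY country"
    return [row[0] for row in conn.execute(sql)]

# OR: a row passes if it fails to match EITHER pattern, so all rows pass.
with_or = countries("country NOT LIKE '%USA%' OR country NOT LIKE '%CA%'")

# AND: a row passes only if it matches NEITHER pattern.
with_and = countries("country NOT LIKE '%USA%' AND country NOT LIKE '%CA%'")

print(with_or)
print(with_and)
```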
The expression you wrote is valid, but it doesn't do what you think it does. It's as if you had written this:
where (l.country not like "%USA%") or ("%CA%") ORDER BY l.state
That is, two terms, separated by OR, the first is a LIKE comparison, and the second is just a single string literal on its own. That's a valid term in an expression, but it doesn't do anything useful. It's like writing:
x = 6 * 8 + 0
What effect does the zero have in that expression? None.
Update: it is worth spelling out how the stray string literal is evaluated. When MySQL uses a string in a boolean context it converts it to a number, and '%CA%' has no leading digits, so it converts to 0, which is false.
WHERE (some expression) OR (FALSE)
This is equivalent to WHERE (some expression).
So in your original query the '%CA%' term contributes nothing: the WHERE clause filters out only the countries matching '%USA%', and countries matching '%CA%' are still returned.
You could use REGEXP with an alternation (a regular expression matches anywhere in the value, so no % wildcards or inner quotes are needed):
SELECT *
FROM yourTable
WHERE country NOT REGEXP 'USA|CA'
Notes:
You don't need to escape double quotes which appear inside of string literals in single quotes. Your original query would not run, I think, because you need to compare a column using LIKE against either another column or a string literal, normally in single quotes.
REGEXP is not case sensitive, so we could have used usa and ca for the same result, though this does not appear to matter in your case.
What's the difference between
SELECT foo FROM bar WHERE foobar='$foo'
AND
SELECT foo FROM bar WHERE foobar LIKE '$foo'
= in SQL does exact matching.
LIKE does wildcard matching, using '%' as the multi-character match symbol and '_' as the single-character match symbol. '\' is the default escape character.
foobar = '$foo' and foobar LIKE '$foo' will behave the same, because neither string contains a wildcard.
foobar LIKE '%foo' will match anything ending in 'foo'.
LIKE also has an ESCAPE clause so you can set an escape character. This will let you match literal '%' or '_' within the string. You can also do NOT LIKE.
The MySQL site has documentation on the LIKE operator. The syntax is
expression [NOT] LIKE pattern [ESCAPE 'escape']
LIKE can do wildcard matching:
SELECT foo FROM bar WHERE foobar LIKE "Foo%"
If you don't need pattern matching, then use = instead of LIKE. It's faster and more secure. (You are using parameterized queries, right?)
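To make the parenthetical concrete, here is a minimal sketch of a parameterized query, written with Python's built-in sqlite3 module as a stand-in for whatever driver you use (an assumption; placeholder syntax differs between drivers, and the rows are made up). It also shows why = is the safer choice when no pattern is intended: a bound value is never parsed as SQL, but under LIKE any % or _ inside it still acts as a wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bar (foo INTEGER, foobar TEXT)")
conn.executemany(
    "INSERT INTO bar VALUES (?, ?)",
    [(1, "Foo"), (2, "Foobar"), (3, "Foo%")],  # made-up rows
)

user_input = "Foo%"  # imagine this arrived from a form field

# '=' treats the bound value as plain text: only the literal 'Foo%' row.
exact = conn.execute(
    "SELECT foo FROM bar WHERE foobar = ? ORDER BY foo", (user_input,)
).fetchall()

# LIKE treats the '%' in the bound value as a wildcard: every 'Foo...' row.
pattern = conn.execute(
    "SELECT foo FROM bar WHERE foobar LIKE ? ORDER BY foo", (user_input,)
).fetchall()

print(exact)
print(pattern)
```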
Please bear in mind as well that MySQL casts differently depending on the operator: LIKE compares its operands as strings, whereas = between a number and a string compares them as numbers. Consider this table:
    (int)   (vchar2)
id  field1  field2
1   1       1
2   1       1,2
SELECT *
FROM test AS a
LEFT JOIN test AS b ON a.field1 LIKE b.field2
will produce
id  field1  field2  id  field1  field2
1   1       1       1   1       1
2   1       1,2     1   1       1
whereas
SELECT *
FROM test AS a
LEFT JOIN test AS b ON a.field1 = b.field2
will produce
id  field1  field2  id  field1  field2
1   1       1       1   1       1
1   1       1       2   1       1,2
2   1       1,2     1   1       1
2   1       1,2     2   1       1,2
According to the MySQL Reference page, trailing spaces are significant in LIKE but not =, and you can use wildcards, % for any characters, and _ for exactly one character.
I think in terms of speed = is faster than LIKE. As stated, = does an exact match and LIKE can use a wildcard if needed.
I always use = sign whenever I know the values of something. For example
select * from state where state='PA'
Then for likes I use things like:
select * from person where first_name like 'blah%' and last_name like 'blah%'
If you use Oracle Developers Tool, you can test it with Explain to determine the impact on the database.
The end result will be the same, but the query engine uses different logic to get to the answer. Generally, LIKE queries burn more cycles than "=" queries. But when no wildcard character is supplied, I'm not certain how the optimizer may treat that.
With the example in your question there is no difference.
But, like Jesse said you can do wildcard matching
SELECT foo FROM bar WHERE foobar LIKE "Foo%"
SELECT foo FROM bar WHERE foobar NOT LIKE "%Foo%"
More info:
http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html
A little bit of Googling doesn't hurt...
A WHERE clause with an equals sign (=) works fine if we want an exact match. But there may be a requirement to return all the results where 'foobar' contains "foo". This can be handled using the SQL LIKE clause along with the WHERE clause.
If the LIKE clause is used with % characters, they act as wildcards.
SELECT foo FROM bar WHERE foobar LIKE '$foo%'
Without a % character, the LIKE clause behaves very much like an equals sign in the WHERE clause.
In your example, they are semantically equal and should return the same output.
However, LIKE will give you the ability of pattern matching with wildcards.
You should also note that = might give you a performance boost on some systems, so if you are, for instance, searching for an exact number, = would be the preferred method.
This looks very much like it was taken from a PHP script. The intention was to pattern-match the contents of variable $foo against the foo database field, but I bet it was supposed to be written in double quotes, so the contents of $foo would be fed into the query.
As you put it, there is NO difference.
It could potentially be slower, but I bet MySQL realises there are no wildcard characters in the search string, so it will not do LIKE pattern-matching after all; so really, no difference.
In my case I find LIKE to be faster than =.
LIKE fetched a number of rows in 0.203 secs the first time, then 0.140 secs.
= fetched the same rows in 0.156 secs consistently.
Take your pick.
I found an important difference between LIKE and equal sign = !
Example: I have a table with a field "ID" (type: int(20) ) and a record that contains the value "123456789"
If I do:
SELECT ID FROM example WHERE ID = '123456789-100'
Record with ID = '123456789' is found (is an incorrect result)
If I do:
SELECT ID FROM example WHERE ID LIKE '123456789-100'
No record is found (this is correct)
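So, at least for INTEGER fields, it is an important difference: = converts the string to a number and keeps only its leading numeric part (123456789), while LIKE compares the two values as strings.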