I have a field where I save strings, like:
"one two-tree"
"one-two-tree"
"one-two tree"
"one two tree"
When I do a SELECT, I want to retrieve strings that have either "-" or " " (space). Example:
When I do:
Select name from table where name="one two tree"
I want it to also return rows that have either a space or a - in those positions, in this case all of the strings shown above.
Is there a wildcard for this?
In standard SQL you have to use OR or LIKE, depending on exactly what you want. Example, using _ (which matches any single character, not only a space or a dash): Select name from table where name like "one_two_tree".
However, MySQL supports a REGEXP extension that will give you what you want:
http://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html
One option is to use replace:
select name
from yourtable
where replace(name, '-', ' ') = 'one two tree'
There is, but it is slow: you can use REGEXP:
SELECT name
FROM table
WHERE name REGEXP 'one[- ]two[- ]tree'
or you can use replacements:
SELECT name
FROM table
WHERE REPLACE(name, '-', ' ') = 'one two tree'
but your best bet is to make an additional column where you will have a normalised name (with dashes always replaced with spaces, for example) so you can take advantage of indices.
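The normalised-column idea can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name items, the column name_norm and the helper normalise are hypothetical names, not anything from the question.

```python
import sqlite3

# Hypothetical table and column names; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT, name_norm TEXT)")
conn.execute("CREATE INDEX idx_items_name_norm ON items (name_norm)")


def normalise(name: str) -> str:
    """Replace dashes with spaces so all variants share one stored form."""
    return name.replace("-", " ")


names = ["one two-tree", "one-two-tree", "one-two tree", "one two tree", "one two"]
conn.executemany(
    "INSERT INTO items (name, name_norm) VALUES (?, ?)",
    [(n, normalise(n)) for n in names],
)

# The equality test runs against the indexed, normalised column,
# so no function has to be applied to every row at query time.
rows = conn.execute(
    "SELECT name FROM items WHERE name_norm = ? ORDER BY name",
    (normalise("one two tree"),),
).fetchall()
matches = [r[0] for r in rows]
print(matches)
```

The search term goes through the same normalise function as the stored values, so the comparison stays a plain equality that an index can serve.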
You can use the LIKE condition with '%' (wildcard operator) for this.
e.g.
SELECT name from table_name WHERE name LIKE '%-%' OR name LIKE '% %'
-- will return all names that have `-` or ` `.
Related
I have a table with a varchar column that represents a path. I want to search for rows that have a path that follow a pattern like name.name[*] where name can be anything. I am looking for repeated strings contained anywhere in the path column that are separated by a period and have a square bracket after them.
This seems to call for Regexp, so through python I have something like https://regex101.com/r/apS20a/4
However, trying to implement this with MySQL Regexp is not working. I have been able to translate the shorthand into REGEXP '([A-Za-z_]+).(\1[[0-9]+])', but it seems that MySql Regex does not support capture groups. Is there a way to accomplish what I am trying to do with mysql regexp? Thank you
I don't think that MySQL supports capture groups. But if you only have one example of .name[ in the string between the first . and the first [, you can hack your way around it. This is not a general solution, just a specific approach in this case.
You can get the name with:
select substring_index(substring_index(url, '[', 1), '.', -1) as name
And then incorporate this into a pattern match:
select t.*
from (select t.*,
substring_index(substring_index(url, '[', 1), '.', -1) as name
from t
) t
where url like concat('%', name, '.', name, '[%');
This just uses like instead of regexp, because [ and . are special characters in regular expressions and would need escaping. Of course, this assumes that name does not contain _ or %, which are wildcards for like.
EDIT:
Here is a method that actually identifies when this occurs -- and works even if there are multiple patterns.
The idea is to construct the regular expression based on what happens between the . and [ -- and then to apply it. Delightfully self-referential:
select t.*,
(url regexp regex)
from (select t.*,
substr(regexp_replace(url, '[^.]*[.]([^\\[]*)\\[[^.]*', '|$1[.]$1\\\\['), 2) as regex
from (select 'abcde.de[12345.345[ABC' as url union all
select 'abcdefdef[[[[..123.124['
) t
) t;
Here is the above in a db<>fiddle.
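If the check can be done in application code instead of in MySQL, a regex engine with backreferences handles this directly. The following is a rough sketch in Python; the pattern and the function name has_repeated_name are my own, and it assumes a "name" is any run of characters other than . and [.

```python
import re

# Assumption: a "name" is any run of characters other than '.' and '['.
# \1 refers back to whatever the first group captured.
REPEATED_NAME = re.compile(r"([^.\[]+)\.\1\[")


def has_repeated_name(path: str) -> bool:
    """True when some text X appears as 'X.X[' anywhere in the path."""
    return REPEATED_NAME.search(path) is not None


samples = ["abcde.de[12345.345[ABC", "abcdefdef[[[[..123.124["]
results = {s: has_repeated_name(s) for s in samples}
print(results)
```

The two sample strings are the ones from the query above: the first contains de.de[ and the second has no repeated name.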
I have a search module with SQL query like this:
SELECT * FROM trilers WHERE title LIKE '%something%'
And when I search for keyword for example like "spiderman" it returns not found, but when I search for "spider-man" it returns my content (original row in MySQL is "spider-man").
How can I ignore all symbols like -, #, !, : and return content with "spiderman" and "spider-man" keywords at the same time?
What you can do is replace the characters you don't care about before the search takes place.
First iteration would look like this:
SELECT * FROM trilers WHERE REPLACE(title, '-', '') LIKE '%spiderman%'
This would ignore any '-'.
Next you would wrap that in another REPLACE to include '#' like this:
SELECT * FROM trilers WHERE REPLACE(REPLACE(title, '-', ''), '#', '') LIKE '%spiderman%'
For all 3 ('!','-','#') you would just wrap the REPLACE in another REPLACE like this:
SELECT * FROM trilers WHERE REPLACE(REPLACE(REPLACE(title, '-', ''), '#', ''),'!','') LIKE '%spiderman%'
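When the list of symbols grows, writing the nested REPLACE calls by hand gets error-prone, so the expression can be generated. This is a minimal sketch in Python using sqlite3 as a stand-in for MySQL; the helper strip_symbols_expr is a hypothetical name, and it assumes the column name and symbol list come from your own code, never from user input.

```python
import sqlite3


def strip_symbols_expr(column: str, symbols: str) -> str:
    """Build a nested REPLACE(...) expression that removes each symbol from column."""
    expr = column
    for ch in symbols:
        escaped = ch.replace("'", "''")  # double single quotes for SQL literals
        expr = f"REPLACE({expr}, '{escaped}', '')"
    return expr


# Column name and symbols are trusted constants, not user input.
expr = strip_symbols_expr("title", "-#!:")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trilers (title TEXT)")
conn.executemany(
    "INSERT INTO trilers VALUES (?)",
    [("spider-man",), ("spiderman 2",), ("spider#man!",), ("batman",)],
)

# The search keyword itself is still passed as a bound parameter.
rows = conn.execute(
    f"SELECT title FROM trilers WHERE {expr} LIKE ? ORDER BY title",
    ("%spiderman%",),
).fetchall()
found = [r[0] for r in rows]
print(expr)
print(found)
```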
You could try something like
SELECT * FROM trilers WHERE replace(title, '-', '') LIKE '%spiderman%'
The other answers involving using REPLACE are great, but if you don't care what characters appear between "spider" and "man" or how many characters there are between the two strings, you can use an additional wildcard in your expression:
SELECT * FROM Superheroes WHERE HeroName LIKE '%spider%man%';
If you want to match only one character, but allow any character, you can use the _ wildcard, which matches only one character:
SELECT * FROM Superheroes WHERE HeroName LIKE '%spider_man%';
This will match "spideryman" and "spideryman in la la land" but not "spiderysupereliteuberheroman".
If you have a limited number of possible symbols, a way to do it without REPLACE is to use a disjunctive expression:
SELECT * FROM Superheroes WHERE
HeroName LIKE '%spiderman%'
OR
HeroName LIKE '%spider-man%'
OR
HeroName LIKE '%spider#man%'
OR
HeroName LIKE '%spider!man%';
WHERE title REGEXP '[[:<:]]spider[-#!:]?man[[:>:]]'
Some discussion:
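[[:<:]] -- word boundary (start of word; [[:>:]] is the end of word). MySQL 8.0 uses the ICU regex library, where \b (written '\\b' in a string literal) replaces both markers.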
[-#!:] -- character set, matches any of them. ('-' must be first)
[-#!:]? -- optional -- so that 'spiderman' will still match
This, unlike the rest of the answers, will avoid matching
spidermaniac
Also, consider using FULLTEXT.
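To see how the optional character set and the word boundaries behave, here is a small sketch in Python rather than MySQL. It assumes \b as the equivalent of [[:<:]] and [[:>:]], and uses re.IGNORECASE to mimic a case-insensitive collation; the sample titles are made up.

```python
import re

# \b plays the role of [[:<:]] and [[:>:]];
# IGNORECASE mimics a case-insensitive collation.
PATTERN = re.compile(r"\bspider[-#!:]?man\b", re.IGNORECASE)

titles = [
    "spider-man",
    "Spiderman returns",
    "spider:man",
    "spidermaniac",   # no word boundary after "man"
    "spider--man",    # two symbols, but only one is allowed
]
matched = [t for t in titles if PATTERN.search(t)]
print(matched)
```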
You should be able to use your search with a small update, something like: SELECT * FROM trilers WHERE title LIKE '%spider%'. This matches any title that contains spider, whatever comes before or after it, including the hyphen (-).
I'd like to make an SQL query where the condition is that column1 contains three or more words. Is there something to do that?
Maybe try counting spaces?
SELECT *
FROM table
WHERE (LENGTH(column1) - LENGTH(replace(column1, ' ', ''))) > 1
This assumes the number of words is the number of spaces + 1, so more than one space means three or more words.
If you want a condition that a column contains three or more words and you want it to work in a bunch of databases and we assume that words are separated by single spaces, then you can use like:
where column1 like '% % %'
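Both space-based conditions above can be tried side by side. This is a rough sketch using Python's sqlite3 module as a stand-in for whichever database you use; the table name docs and the sample rows are hypothetical, and it assumes words are separated by single spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (column1 TEXT)")  # hypothetical table name
conn.executemany(
    "INSERT INTO docs VALUES (?)",
    [("one",), ("one two",), ("one two three",), ("a b c d",)],
)

# Three or more words means two or more spaces.
by_length = [r[0] for r in conn.execute(
    "SELECT column1 FROM docs "
    "WHERE LENGTH(column1) - LENGTH(REPLACE(column1, ' ', '')) > 1 "
    "ORDER BY column1"
)]

# LIKE '% % %' also requires at least two spaces somewhere in the value.
by_like = [r[0] for r in conn.execute(
    "SELECT column1 FROM docs WHERE column1 LIKE '% % %' ORDER BY column1"
)]
print(by_length, by_like)
```

Both conditions only count spaces, so a value with leading, trailing or doubled spaces is over-counted by either of them.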
I think David nailed it above. However, as a more complete answer:
LENGTH(RTRIM(LTRIM(REPLACE(column1,'  ', ' ')))) - LENGTH(REPLACE(RTRIM(LTRIM(REPLACE(column1, '  ', ' '))), ' ', '')) + 1 AS number_of_words
This will remove double spaces, as well as leading and trailing spaces in your string.
Of course, you may go further by adding replacements for more than 2 spaces in a row...
In Postgres you can use regexp_split_to_array() for this:
select *
from the_table
where array_length(regexp_split_to_array(the_column, '\s+'), 1) >= 3;
This will split the contents of the column the_column into array elements. One or more whitespace characters are used as the delimiter. It won't respect "quoted" spaces though. The value 'one "two three" four' will be counted as four words.
The best way to do this, is to NOT do this.
Instead, you should use the application layer to count the words during INSERT and save the word count into its own column.
While I like, and upvoted, some of the answers here, all of them will be very slow and not 100% accurate.
I know people want a simple answer to SELECT the word count, but it just is NOT POSSIBLE with accuracy and speed.
If you want it to be 100% accurate, and very fast, then use this solution.
Steps to solve:
Add a column to your table and index it: ALTER TABLE tablename ADD COLUMN wordcount INT UNSIGNED NULL, ADD INDEX idxtablename_count (wordcount ASC);.
Before doing your INSERT, count the number of words using your application. For example in PHP: $count = str_word_count($somevalue);
During the INSERT, include the value of $count for the column wordcount like insert into tablename (col1, col2, col3, wordcount) values (val1, val2, val3, $count);
Then your select statement becomes super easy, clean, uber-fast, and 100% accurate.
select * from tablename where wordcount >= 3;
Also remember when you are updating any rows that you will need to recount the words for that column.
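The steps above can be sketched end to end as follows. This is an illustration only: it uses Python and sqlite3 in place of PHP and MySQL, count_words, insert_row and update_row are hypothetical helpers, and "word" is taken to mean a whitespace-separated token.

```python
import sqlite3


def count_words(text: str) -> int:
    """Count whitespace-separated words; str.split() ignores repeated spaces."""
    return len(text.split())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, wordcount INTEGER)")
conn.execute("CREATE INDEX idxtablename_count ON tablename (wordcount)")


def insert_row(text: str) -> int:
    """Store the text together with its precomputed word count."""
    cur = conn.execute(
        "INSERT INTO tablename (col1, wordcount) VALUES (?, ?)",
        (text, count_words(text)),
    )
    return cur.lastrowid


def update_row(row_id: int, text: str) -> None:
    """Recount the words whenever the text changes."""
    conn.execute(
        "UPDATE tablename SET col1 = ?, wordcount = ? WHERE id = ?",
        (text, count_words(text), row_id),
    )


first = insert_row("just two")
insert_row("this  has   four words")
update_row(first, "now it has five words")

rows = conn.execute(
    "SELECT col1 FROM tablename WHERE wordcount >= 3 ORDER BY id"
).fetchall()
long_rows = [r[0] for r in rows]
print(long_rows)
```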
For "n" or more words
select *
from table
where (length(column)- length(replace(column, " ", "")) + 1) >= n
PS: This would not work if words have multiple spaces between them.
With ClickHouse you can use the splitByWhitespace() function.
Refer : https://clickhouse.com/docs/en/sql-reference/functions/splitting-merging-functions#splitbywhitespaces
None of the other answers seem to take multiple spaces into account. For example, a lot of people use two spaces between sentences; these space-counters would count an extra word per sentence. "Also, scenarios such as spaces around a hyphen - like that. "
For my purposes, this was far more accurate:
SELECT
LENGTH(REGEXP_REPLACE(myText, '[ \n\t\|\-]{1,}',' ')) -
LENGTH(REGEXP_REPLACE(myText, '[ \n\t\|\-]{1,}', '')) wordCount FROM myTable;
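It treats any run of one or more consecutive characters from [space, linefeed, tab, pipe, or hyphen] as a single separator. The difference between the two lengths is the number of separator runs, so for text without leading or trailing separators the word count is that number plus 1.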
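The same separator set can be checked quickly outside the database. Here is a minimal sketch in Python, assuming the separator characters listed above; the function name word_count is hypothetical. It splits on separator runs instead of subtracting lengths, which sidesteps the leading/trailing separator question.

```python
import re

# One or more of: space, linefeed, tab, pipe, hyphen.
SEPARATORS = re.compile(r"[ \n\t|\-]+")


def word_count(text: str) -> int:
    """Split on runs of separators and count the non-empty pieces."""
    return len([part for part in SEPARATORS.split(text) if part])


sample = "Two  spaces. After a sentence - like that. "
print(word_count(sample))
```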
This can work:
SUM(LENGTH(a) - LENGTH(REPLACE(a, ' ', '')) + 1)
Where a is the string column. It will count the number of spaces, which is 1 less than the number of words.
To handle multiple spaces too, use the method shown here
Declare @s varchar(100)
set @s=' See how many words this has '
set @s=ltrim(rtrim(@s))
-- collapse runs of spaces: repeat until no double space is left
while charindex('  ',@s)>0
Begin
set @s=replace(@s,'  ',' ')
end
select len(@s)-len(replace(@s,' ',''))+1 as word_count
https://exploresql.com/2018/07/31/how-to-count-number-of-words-in-a-sentence/
I want to use the LIKE operator to match possible values in a column.
If the value begins with "CU" followed by a digit (e.g. "3") followed by anything else, I would like to return it. There only seems to be a wildcard for any single character using underscore, however I need to make sure it is a digit and not a-z.
I have tried these to no avail:
select name from table1 where name like 'CU[0-9]%'
select name from table1 where name like 'CU#%'
Preferably this could be case sensitive i.e. if cu or Cu or cU then this would not be a match.
You need to use regexp:
select name
from table1
where name regexp binary '^CU[0-9]'
The documentation for regexp is here.
EDIT: binary is required to ensure case-sensitive matching
The like operator only has the % and _ wildcards in MySQL, but you can use a regular expression with the rlike operator:
select name from table1 where name rlike '^CU[0-9]'
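The anchored pattern can be tried without a MySQL server. The sketch below uses Python's sqlite3 module, which has no built-in REGEXP, so a small helper named regexp (my own, hypothetical) is registered to back the operator. Python's re.search is case-sensitive by default, which corresponds to the regexp binary form rather than to MySQL's default collation.

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backs SQLite's REGEXP operator; re.search is case-sensitive by default."""
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
# SQLite evaluates "X REGEXP Y" as regexp(Y, X), i.e. pattern first.
conn.create_function("REGEXP", 2, regexp)

conn.execute("CREATE TABLE table1 (name TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("CU3abc",), ("CU9",), ("cu3abc",), ("CUx1",), ("ACU3",)],
)

# ^ anchors the match at the start; [0-9] requires a digit right after "CU".
names = [r[0] for r in conn.execute(
    "SELECT name FROM table1 WHERE name REGEXP ? ORDER BY name",
    ("^CU[0-9]",),
)]
print(names)
```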
You can use REGEXP operator, see http://dev.mysql.com/doc/refman/5.1/en/regexp.html#operator_regexp
so your query would be:
select name from table where name regexp '^CU[0-9]';
Have you tried with:
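select name from table where name between 'CU0' and 'CU9'
Note that values such as 'CU9A' sort after 'CU9', so this range misses names that start with CU9 and continue with more characters.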
In my table I have firstname and last name. Few names are upper case ( ABRAHAM ), few names are lower case (abraham), few names are character starting with ucword (Abraham).
So when I filter with REGEXP '^[abc]' in the WHERE condition, I am not getting the right records. How can I convert the names to lower case within the SELECT query?
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abc]';
This is my query. It works fine if the records are lower case, but my records are mixed: not all cus_name values are lower case, and many are capitalised like ucwords output.
So the query above does not return the records I expect.
I think you should query the database with the names lowercased. Suppose name is the name you wish to find and your application has already lowercased it, e.g. 'abraham'. Your query should then look like this:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) = name
Since I don't know what language you use, I've just written name as a placeholder, but make sure that it is lowercased and you should retrieve Abraham, ABRAHAM or any variation of the name!
Hope it helps!
Have you tried:
SELECT * FROM `test_tbl` WHERE LOWER(cus_name) REGEXP '^[abc]';
I don't know since when, but nowadays MySql REGEXP is case insensitive.
https://dev.mysql.com/doc/refman/5.7/en/pattern-matching.html
You don't need regexp to search for names starting with a specific string or character.
SELECT * FROM `test_tbl` WHERE cus_name LIKE 'abc%' ;
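% is the wildcard character. Note that 'abc%' matches names beginning with the string abc; to match names beginning with a, b, or c, as '^[abc]' does, combine LIKE 'a%', LIKE 'b%' and LIKE 'c%' with OR. The search is case insensitive unless you set the binary attribute for column cus_name or you use the binary operator: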
SELECT * FROM `test_tbl` WHERE BINARY cus_name LIKE 'abc%' ;
A few valid options already presented, but here's one more with just regex:
SELECT * FROM `test_tbl` WHERE cus_name REGEXP '^[abcABC]';
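As one more variation that avoids regular expressions, the first letter can be lowercased and compared against a list. This is only a sketch, run here through Python's sqlite3 module as a stand-in for MySQL; the sample names are made up, and SUBSTR and LOWER are assumed to behave the same way in your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_tbl (cus_name TEXT)")
conn.executemany(
    "INSERT INTO test_tbl VALUES (?)",
    [("ABRAHAM",), ("abraham",), ("Abraham",), ("Bob",), ("carl",), ("David",)],
)

# Lowercase only the first character, then compare it with the wanted letters.
rows = conn.execute(
    "SELECT cus_name FROM test_tbl "
    "WHERE LOWER(SUBSTR(cus_name, 1, 1)) IN ('a', 'b', 'c')"
).fetchall()
selected = {r[0] for r in rows}
print(sorted(selected))
```

Like the LOWER(cus_name) REGEXP form, this applies a function to the column, so an ordinary index on cus_name will not help with it.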