MySQL substring search in TEXT column - mysql

I have a column in an SQL table populated with the contents of an entire text file. Is there a way to get all substring matches of a particular term? For example, I want to get all occurrences of the words "bright star" starting 10 characters before the first occurrence and ending 10 characters after.
I was trying things like this with no success:
SELECT SUBSTRING (PARAGRAPH, -10, 10)
WHERE MATCH (paragraph) AGAINST ("+bright +star" IN BOOLEAN MODE);
I know that MySQL queries can be deeply nested, but I don't know if it's even possible to perform such a search.
Many thanks,
A.

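You can use a regular expression to pull out the phrase together with its surrounding context. REGEXP_SUBSTR requires MySQL 8.0 or later (or MariaDB) and returns only the first match per row unless you pass its occurrence argument. Using .{0,10} rather than .{10} keeps matches that sit within 10 characters of the start or end of the text. Here your_table is a placeholder name:
SELECT REGEXP_SUBSTR(paragraph, '.{0,10}bright star.{0,10}')
FROM your_table;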
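If you need every occurrence rather than one match per call, it may be simpler to fetch the text and cut out the windows on the client side. The sketch below is Python, not SQL, and is only an illustration: the function name context_windows and the sample sentence are made up for this example.

```python
import re

def context_windows(text, term, width=10):
    """Return each occurrence of `term` with up to `width` characters
    of context on either side (fewer near the ends of the text)."""
    windows = []
    for m in re.finditer(re.escape(term), text):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        windows.append(text[start:end])
    return windows

# Hypothetical sample text standing in for the TEXT column's contents.
sample = "We saw a bright star rising, then another bright star set."
for window in context_windows(sample, "bright star"):
    print(window)
```

Because the windows are computed from match positions, two occurrences that are close together each get their own window, even if the windows overlap.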

Related

How do I create a SELECT conditional in MySQL where the conditional is the character length of the LIKE match?

I am working on a search function, where the matches are weighted based on certain conditions. One of the conditions I want to add weight to is matches where the character length of the query string in a LIKE match is longer than 4.
This is roughly what I want the query to look like. %s is meant to represent the actual match found by LIKE, but I don't think it does. I'm wondering if there is a special variable in MySQL that represents the precise character match found by LIKE.
SELECT help.*,
IF(CHAR_LENGTH(%s) > 4, 2, 0) w
FROM help
WHERE (
(title LIKE '%this%' OR title LIKE '%testy%' OR title LIKE '%test%') OR
(content LIKE '%this%' OR content LIKE '%testy%' OR content LIKE '%test%')
) LIMIT 1000
edit: I could in the PHP split the search string array into two arrays based on the character length of the elements, with two separate queries that return different values for 'w', then combine the results, but I'd rather not do that, as it seems to me that would be awkward, messy, and slow.
Check out FULLTEXT as another way to discover rows. It will be faster, but won't address your question.
This probably has the effect you want.
SELECT ....
IF ( (title LIKE '%testy%' OR
content LIKE '%testy%'), 2, 0)
....
Note that the "match" in your LIKEs includes the %, so it is the entire length of the string. I don't think that is what you wanted.
REGEXP "(this|testy|that)" will match either 4 or 5 characters (in this example). It may be possible to do something with REGEXP_REPLACE to replace that with the empty string, then see how much it shrank.
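A related way to reason about this is to look at the matched text itself and weight the row by the longest token that was found. The following is a client-side Python sketch rather than MySQL code, under the assumption that the rows have already been fetched; the names longest_match_length and weight are hypothetical.

```python
import re

def longest_match_length(text, tokens):
    """Length of the longest search token found in `text` (0 if none match)."""
    # Try longer tokens first so 'testy' wins over its prefix 'test'.
    ordered = sorted(tokens, key=len, reverse=True)
    pattern = "|".join(re.escape(t) for t in ordered)
    matches = re.findall(pattern, text, flags=re.IGNORECASE)
    return max((len(m) for m in matches), default=0)

def weight(text, tokens):
    """Mirror the IF(CHAR_LENGTH(...) > 4, 2, 0) rule from the question."""
    return 2 if longest_match_length(text, tokens) > 4 else 0

tokens = ["this", "testy", "test"]
print(weight("A testy reply", tokens))  # longest match is 'testy' (5 characters)
print(weight("Run this test", tokens))  # longest match has 4 characters
```

Sorting the tokens longest-first matters because regex alternation takes the first alternative that matches, not the longest one.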
I think the answer to my question is that what I wanted to do isn't possible. There is no special variable in MySQL representing the core character match in a WHERE conditional where LIKE is the operator. The match is the contents of the returned data row.
To reach my objective, I took the original dynamic list of search tokens, iterated through that list, and performed a search on each token, with the SQL tailored to the conditions that matched each token.
As I did this I built an array of the search results, using the id for the database row as the index for the array. This allowed me to perform calculations with the array elements, while avoiding duplicates.
I'm not posting the PHP code because the original question was about the SQL.

MySql Specific Search - Replace String

I need to search words that contain multiple number prefixes.
Example:
0119
0129
0139
0149
But there are other prefixes too, such as 0155859, 0128889, etc.
If I search for 0%9, I get results I don't want, because it also includes the 0155859 and 0128889 ones.
I need to search for and list ONLY the ones that have 0119, etc.
How do I do it?
I want 0XX9 (where XX is any two characters, so 0119, 0129, etc.). % matches any run of characters until a 9 appears, and I don't want that.
I'm doing my best with my English, so correct me if I didn't express myself right!
In a LIKE pattern, the _ character matches any single character. So you can do:
WHERE word LIKE '0__9%'
This matches a word that begins with 0, then any two characters, then 9, then anything after that.
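The difference between % and _ can be checked with a small self-contained script. This is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, on the assumption that both treat % and _ the same way in LIKE; the table name words and the sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("0119",), ("0129555",), ("0155859",), ("0128889",)],
)

def search(pattern):
    """Return the words matching a LIKE pattern, in sorted order."""
    rows = conn.execute(
        "SELECT word FROM words WHERE word LIKE ? ORDER BY word", (pattern,)
    )
    return [r[0] for r in rows]

print(search("0%9"))    # % spans any number of characters
print(search("0__9%"))  # each _ matches exactly one character
```

With this data the first pattern also picks up 0155859 and 0128889, while the second keeps only the values whose fourth character is 9.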
My gut feeling at seeing your question was to consider using REGEXP, which is MySQL's regex matching operator. Try the following query:
SELECT *
FROM yourTable
WHERE word REGEXP '0[0-9][0-9]9'
This pattern matches any word containing a zero, followed by any two digits, followed by a 9, anywhere in the string. To match only at the beginning of the word, anchor the pattern: '^0[0-9][0-9]9'.

mysql how to detect if column has no integer

Hi, I have been trying to select the records whose column contains no integer. I have this piece of code and have tried it different ways, but I still get back rows like these:
P992142
P301716
P301716
P307162
P306522
which I don't want
select practitioner_id
from claimsprofinload
WHERE practitioner_id not like '%[0-9]%';
You're using regular-expression syntax with LIKE, which LIKE does not support: its only wildcards are % and _. What you want is the REGEXP (or RLIKE) operator, for example WHERE practitioner_id NOT REGEXP '[0-9]'.
With LIKE, the brackets are treated as literal characters, and since none of your rows contain the text [0-9] literally, NOT LIKE matches all rows.
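The effect is easy to reproduce. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, which is an assumption: SQLite also treats brackets literally in LIKE, but it has no built-in REGEXP, so the script registers one backed by Python's re module. The table name claims and the row UNKNOWN are invented for the example.

```python
import re
import sqlite3

def regexp(pattern, value):
    """SQLite evaluates "X REGEXP Y" as regexp(Y, X); return 1 on a match."""
    return int(re.search(pattern, value) is not None)

conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE claims (practitioner_id TEXT)")
conn.executemany(
    "INSERT INTO claims VALUES (?)",
    [("P992142",), ("P301716",), ("UNKNOWN",)],
)

def ids(where_clause):
    """Run the query with the given WHERE clause and return the ids, sorted."""
    sql = f"SELECT practitioner_id FROM claims WHERE {where_clause} ORDER BY 1"
    return [r[0] for r in conn.execute(sql)]

print(ids("practitioner_id NOT LIKE '%[0-9]%'"))  # brackets are literal: every row
print(ids("practitioner_id NOT REGEXP '[0-9]'"))  # only ids without any digit
```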

Improving Mysql Match Against search

I've been looking into MySQL's MATCH ... AGAINST search. The results are strange. For example, if I have a table attribute with an entry "education" and do a search (using MATCH ... AGAINST) for "edu", then it finds it. But if I search for "educ", no results are returned. Everything up to "educatio" returns no results. So it only matches whole words, or cases where 3 letters or fewer of a word match.
Is there a way to improve it so that results are returned when a search term is a subset of a word in the attribute? E.g. using the example above, searching "educat" would return rows containing "Education"
You can do exactly what you want by matching IN BOOLEAN MODE and using the * operator.
For example:
... MATCH(thing) AGAINST ('+educat*' IN BOOLEAN MODE)...
The + tells the match to include only the values of thing that contain the match term, which in this case is all indexed values beginning with "educat" (the MySQL manual's section on Boolean full-text searches covers the operators in detail).
As an aside, full-text search on MyISAM tables does not index words of 3 or fewer characters by default, so I suspect your match with "edu" is not working the way you think. Look at the value of your ft_min_word_len variable to see if that's the case (InnoDB tables use innodb_ft_min_token_size instead).
You can also use LIKE with the % wildcard (here a stands for your word or letter). a% finds values that start with the word or letter,
%a% finds values that contain it anywhere, at the start, in the middle, or at the end,
and %a finds values that end with the word or letter.

MySQL query - select postcode matches

I need to make a selection based on the first 2 characters of a field, so for example
SELECT * from table WHERE postcode LIKE 'rh%'
But this would select any record that contains those 2 characters at any point in the "postcode" field, right? I need a query that matches on just the first 2 characters. Any pointers?
Thanks
Your query is correct. It searches for postcodes starting with "rh".
In contrast, if you wanted to search for postcodes containing the string "rh" anywhere in the field, you would write:
SELECT * from table WHERE postcode LIKE '%rh%'
Edit:
To answer your comment, you can use either or both % and _ for relatively simple searches. As you have noticed already, % matches any number of characters whereas _ matches a single character.
So, in order to match postcodes starting with "RHx " (where x is any character) your query would be:
SELECT * from table WHERE postcode LIKE 'RH_ %'
(mind the space after _). For more complex search patterns, you need to read about regular expressions.
Further reading:
http://dev.mysql.com/doc/refman/5.1/en/pattern-matching.html
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
LIKE '%rh%' will return all rows with 'rh' anywhere
LIKE 'rh%' will return all rows with 'rh' at the beginning
LIKE '%rh' will return all rows with 'rh' at the end.
If you want to extract only the first two characters, for example 'rh', use MySQL's SUBSTR() function:
http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_substr
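The three placements of % can be compared side by side. This is a minimal sketch using Python's sqlite3 module in place of MySQL; it assumes a case-insensitive comparison (SQLite's default for ASCII in LIKE, and what MySQL does under a case-insensitive collation). The table name addresses and the postcodes are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (postcode TEXT)")
conn.executemany(
    "INSERT INTO addresses VALUES (?)",
    [("RH1 2AB",), ("RH10 9XY",), ("BN1 3RH",), ("SW1A 1AA",)],
)

def postcodes(pattern):
    """Return the postcodes matching a LIKE pattern, in sorted order."""
    rows = conn.execute(
        "SELECT postcode FROM addresses WHERE postcode LIKE ? ORDER BY postcode",
        (pattern,),
    )
    return [r[0] for r in rows]

print(postcodes("rh%"))    # 'rh' at the beginning
print(postcodes("%rh%"))   # 'rh' anywhere
print(postcodes("%rh"))    # 'rh' at the end
print(postcodes("RH_ %"))  # 'RH', one character, then a space

# SUBSTR(postcode, 1, 2) extracts the first two characters of each value.
prefixes = [r[0] for r in conn.execute(
    "SELECT DISTINCT SUBSTR(postcode, 1, 2) FROM addresses ORDER BY 1"
)]
print(prefixes)
```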
Dave, your way seems correct to me (and works on my test data). Using a leading % as well will match anywhere in the string which obviously isn't desirable when dealing with postcodes.