Wildcard searches w/MySQL - mysql

I have a situation where my wildcard search is not working the way I thought it should.
This is the MySQL query I am running, and it returns only whole-word matches.
E.g. if I enter "phone" as a search word, it returns all rows with "phone" in them but not rows with "phones" (note the added 's').
mysql_query(" SELECT * FROM posts WHERE permalink = '$permalink' AND LOWER(raw_text) LIKE '%$str%' " );
How can I get it to return all variations of the search word? I know this could cause problems, since it could return all sorts of matches if the user enters a common run of letters, so I thought I could make this part of the advanced search option.
ADDENDUM
I have narrowed it down: the problem is not the wildcard but what I'm doing with the returned data. It is in the regex that I am running against the data.
$pattern= "/(?:[\w\"',.-]+\s*){0,5}[\"',.-]?\S*".$str."(\s|[,.!?])(\s*[A-Za-z0-9.,-]+){0,5}/";
preg_match_all($pattern, $row["raw_text"], $matches);
The regex is not finding the string in the raw data that I am returning, so it gives me null. This is a new problem; I'm not that familiar with regex, so I will have to figure this one out as well. Maybe I'll be posting a new question soon!
Thanks,
M
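One likely reason the regex in the addendum finds nothing for "phones": right after the search term the pattern demands whitespace or one of `,.!?`, so a trailing "s" breaks the match. Below is a minimal sketch of one possible fix, written in Python rather than PHP so it can be run standalone (the regex syntax used here is the same in both). The function name `context_pattern` and the `allow_suffix` flag are made up for illustration, and the `$` alternative is an addition that the original pattern does not have.

```python
import re

def context_pattern(term, allow_suffix=True):
    """Build a pattern capturing up to five words on either side of term."""
    # \w* lets the term carry a suffix, e.g. the "s" in "phones".
    suffix = r"\w*" if allow_suffix else ""
    return re.compile(
        r"(?:[\w\"',.-]+\s*){0,5}[\"',.-]?\S*"
        + re.escape(term)        # treat the user input as literal text
        + suffix
        + r"(?:\s|[,.!?]|$)"     # "$" added so a term at the very end still matches
        + r"(?:\s*[A-Za-z0-9.,-]+){0,5}",
        re.IGNORECASE,
    )

text = "I bought two phones yesterday"
strict = context_pattern("phone", allow_suffix=False).search(text)
loose = context_pattern("phone").search(text)
print(strict is None)     # the strict form fails because "s" follows "phone"
print(loose is not None)  # the suffix-tolerant form matches
```

In PHP the same idea would amount to adding `\w*` directly after `$str` in the pattern, ideally with `preg_quote()` applied to `$str` first.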

I think something else is going on. The % part of the query seems correct. And this seems to be confirmation:
select 'phones' LIKE '%phone%';
+-------------------------+
| 'phones' LIKE '%phone%' |
+-------------------------+
|                       1 |
+-------------------------+
1 row in set (0.00 sec)
By the way, I really hope you are doing rigorous sanitization on $str (and $permalink too, if it comes from user input). You should only allow alphanumerics and a small number of other safe characters (spaces, probably), and you should be running it through mysql_real_escape_string() before using it in mysql_query(). Better yet, have a look at PDO.
Anyway, back to troubleshooting this: one thing to try might be to have the program log the string you're sending to mysql_query(). Basically, change the call to mysql_query() to a call to error_log() or echo or something like that. Then copy and paste the resulting query into the MySQL command-line tool or phpMyAdmin or whatever. If that query doesn't work the way you expect, at least you can look at it and tweak it to figure out what's up. And who knows, maybe it will be super obvious once you see the query spelled out.

I suspect you have a trailing space in your $str variable, which would explain what you're seeing: Your LIKE criterion would be '%phone %', which matches "... phone ...", but not "... phones ...".
Try trimming your value first.
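The trailing-space theory is easy to check in isolation. The sketch below uses Python with an in-memory SQLite table as a stand-in for MySQL, an assumption made purely so the example is self-contained; for this simple ASCII case LIKE behaves the same way in both. The `posts` and `raw_text` names come from the question, while `search` and the sample rows are invented. It also binds the pattern as a parameter instead of interpolating it into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (raw_text TEXT)")
conn.executemany(
    "INSERT INTO posts (raw_text) VALUES (?)",
    [("my phone is new",), ("two phones here",)],
)

def search(term, trim=True):
    """Return rows whose raw_text contains term, optionally trimming it first."""
    if trim:
        term = term.strip()
    pattern = "%" + term.lower() + "%"
    rows = conn.execute(
        "SELECT raw_text FROM posts WHERE LOWER(raw_text) LIKE ? ORDER BY raw_text",
        (pattern,),
    )
    return [row[0] for row in rows]

print(search("phone ", trim=False))  # only the row where "phone" is followed by a space
print(search("phone "))              # both rows once the input is trimmed
```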

Related

MySQL MATCH AGAINST not find results if an underscore is attached to the search term

I use full-text indexing to find results faster, and it works well except when the term I search for is attached to an underscore inside the database record.
My database records:
article.title
++++++++++++++++++++++++++++++
My article 123456 created
------------------------------
My article new_123456 created
------------------------------
My article 123456_new created
My match against query:
MATCH(article.title) AGAINST ( "123456*" IN BOOLEAN MODE )
This query returns only the first record and ignores the others. Whenever the term "123456" is attached to an underscore ( _ ), either before or after the term, the query ignores the record.
What did I do wrong, and how can I fix this problem?
There are many things that can mess up FULLTEXT:
Punctuation
"stop words"
min word "length"
Language
It is sometimes best to edit the data before storing it. In your case, replacing "_" with " " might be the 'right' solution. That could be done either in your application code as you insert strings, or by using MySQL's REPLACE() as the string is INSERTed.
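If the replacement is done in application code, it could look like the following minimal sketch. It is shown in Python; the function name `normalize_for_fulltext` is hypothetical, and the sample titles are the ones from the question.

```python
def normalize_for_fulltext(title):
    """Replace underscores with spaces before the title is stored."""
    return title.replace("_", " ")

titles = [
    "My article 123456 created",
    "My article new_123456 created",
    "My article 123456_new created",
]
for title in titles:
    print(normalize_for_fulltext(title))
```

The intent is that "123456" ends up as a separate word in every row. Whether that is acceptable depends on whether the original underscored form needs to be preserved elsewhere.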

Equal (=) and like in mysql giving different result for same string with no wildcards

Maybe the question was not clear: comparing character by character, what is the difference between the two mail strings below that explains why some strings fail and others don't, given that both contain only [a-z] characters and are in the same table (meaning the same collation)? Does anyone have a clue?
I've found several discussions explaining the use of LIKE and = in MySQL, but no satisfying answer to this issue:
I've found that searching for whole mailboxes (complete addresses, not truncated, so no wildcards) using LIKE does not return a few of them in my script, but they do match when using the = sign instead.
(Of course the function has since been updated and works properly with the equals sign, but I would appreciate it if someone could help me shed some light on this.)
I can't reproduce the mailboxes for obvious security reasons, but I can reproduce the structure of the last one I noticed failing (each "x" represents a lowercase Latin non-special character [a-z]):
select id from table_name where email = "xxx.xxxxxxxxx.xxxxxx@gmail.com"
Returns the id
select id from table_name where email like "xxx.xxxxxxxxx.xxxxxx@gmail.com"
Returns NULL
And this is driving me nuts, because normally it works fine. For example, with a structure like this:
select id from table_name where email = "xxxxx.xxxxxxxxx@gmail.com"
Returns the id
select id from table_name where email like "xxxxx.xxxxxxxxx@gmail.com"
Returns the id too
@_@
Maybe there's something wrong with the LIKE matching when there is more than one dot in the mailbox structure? Any other idea?
Thanks in advance for your time fellows.

Mysql, dealing with String Regex

I'm developing a Java desktop application that connects to a database, and I would like to know the following. As far as I know, prepared statements prevent SQL injection as long as you don't concatenate user data directly into the query. But today I found out that they don't escape string "regex" characters (like '%' for the LIKE operator), because they only escape characters that could break out of the string itself and alter the query. So, if the user does:
Search = "%Dogs"; // User input
Query = "SELECT * FROM Table WHERE Field LIKE ?";
blah.setString(1, Search);
It will return all the rows where Field ends with 'Dogs', because the injected '%' is treated as a wildcard.
Now I ask:
1-) Is this bad or dangerous from a broader point of view?
2-) Is there a full list of regex characters that MySQL interprets inside a string? If so, can you please share it with me?
Thank you.
If the user uses such meta characters in their search, the results may or may not be catastrophic, but a search for %% could be bad. A valid search for %Dogs may also not return the results the user was expecting which affects their experience.
LIKE only offers two meta characters, % and _, so you can escape them both yourself when they come from users (simply using something akin to Search = Search.replaceAll("%", "\\\\%")).
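A small sketch of that escaping step, covering both meta characters plus the escape character itself. It is written in Python for brevity; the helper name `escape_like` is hypothetical, and the backslash is assumed as the escape character because that is MySQL's default for LIKE.

```python
def escape_like(term, escape_char="\\"):
    """Escape LIKE meta characters so they are matched literally."""
    # Escape the escape character first, so the backslashes added
    # below are not themselves doubled.
    term = term.replace(escape_char, escape_char * 2)
    for meta in ("%", "_"):
        term = term.replace(meta, escape_char + meta)
    return term

user_input = "%Dogs"
# Wrap the escaped text in wildcards for a literal substring search.
pattern = "%" + escape_like(user_input) + "%"
print(pattern)
```

The resulting pattern should still be bound through the prepared statement's parameter. The escaping only neutralizes the wildcards; it is not a replacement for parameter binding.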

MySQL - Confusing RegEx Variable Issue

I need some help with a RegEx. The concept is simple, but the actual solution is well beyond anything I know how to figure out. If anyone could explain how I could achieve my desired effect (and provide an explanation with any example code) it'd be much appreciated!
Basically, imagine a database table that stores the following string:
'My name is $1. I wonder who $2 is.'
First, bear in mind that the dollar sign-number format IS set in stone. That's not just for this example--that's how these wildcards will actually be stored. I would like an input like the following to be able to return the above string.
'My name is John. I wonder who Sarah is.'
How would I create a query that searches with wildcards in this format, and then returns the applicable rows? I imagine a regular expression would be the best way. Bear in mind that, theoretically, any number of wildcards should be acceptable.
Right now, this is the part of my existing query that drags the content out of the database. The concatenation, et cetera, is there because in a single database cell, there are multiple strings concatenated by a vertical bar.
AND CONCAT('|', content, '|')
LIKE CONCAT('%|', '" . mysql_real_escape_string($in) . "', '|%')
I need to modify ^this line to work with the variables that are a part of the query, while keeping the current effect (vertical bars, etc) in place. If the RegEx also takes into account the bars, then the CONCAT() functions can be removed.
Here is an example string with concatenation as it might appear in the database:
Hello, my name is $1.|Hello, I'm $1.|$1 is my name!
The query should be able to match with any of those chunks in the string, and then return that row if there is a match. The variables $1 should be treated as wildcards. Vertical bars will always delimit chunks.
For MySQL, this article is a nice guide which should help you. The Regexp would be "(\$)(\d+)". Here's a query I ripped off the article:
SELECT * FROM posts WHERE content REGEXP '(\\$)(\\d+)';
After retrieving data, use this handy function:
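function ParseData($query,$data) {
    $matches=array();
    // Keep going while the query still contains a $N placeholder
    while(preg_match("/(\\$)(\\d+)/",$query,$matches)) {
        // $matches[0] is e.g. "$1"; strip the "$" to get the key into $data
        if (array_key_exists(substr($matches[0],1),$data))
            $query=str_replace($matches[0],"'".mysql_real_escape_string($data[substr($matches[0],1)])."'",$query);
        else
            // No value supplied: substitute an empty SQL string
            $query=str_replace($matches[0],"''",$query);
    }
    return $query;
}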
Usage:
$query="$1 went to $2's house";
$data=array(
'1' => 'Bob',
'2' => 'Joe'
);
echo ParseData($query,$data); // Prints: 'Bob' went to 'Joe''s house (each value is escaped and wrapped in single quotes)
If you aren't set on using $1 and $2 and could change them around a bit, you could take a look at this:
http://php.net/manual/en/function.sprintf.php
E.G.
<?php
$num = 5;
$location = 'tree';
$format = 'There are %d monkeys in the %s';
printf($format, $num, $location);
?>
If you want to find entries in the database, then you can use a LIKE statement:
SELECT statement FROM myTable WHERE statement LIKE '%$1%'
Which will find all statements that include $1. I'm assuming that the first number to replace will always be $1 - it doesn't matter, in that case, that the total number of wildcards is arbitrary, as we're just looking for the first one.
The PHP replacement is a little trickier. You could probably do something like:
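$count = 1;
// strpos() returns 0 for a match at the very start, so compare against false explicitly
while (strpos($statement, '$' . $count) !== false) {
    $statement = str_replace('$' . $count, $array[$count], $statement);
    $count++; // move on to $2, $3, ...
}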
(I've not tested that, so there might be typos in there, but it should be enough to give the general idea.)
The one downside is that it will fail if you have ten or more parameters in your string to replace: the first run-through will replace the first two characters of $10, as it's looking for $1.
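One way around the $1/$10 collision is to let a regex consume the whole number in one go, so $10 is never mistaken for $1 followed by a 0. A minimal sketch, in Python rather than PHP; the name `fill_placeholders` is invented for illustration, and placeholders without a supplied value are replaced with an empty string here, which is an arbitrary choice.

```python
import re

def fill_placeholders(template, values):
    """Replace each $N in template with values[N]."""
    def substitute(match):
        index = int(match.group(1))  # \d+ is greedy, so "$10" yields 10, not 1
        return str(values.get(index, ""))
    return re.sub(r"\$(\d+)", substitute, template)

print(fill_placeholders("$1 went to $2's house", {1: "Bob", 2: "Joe"}))
print(fill_placeholders("$1 and $10", {1: "first", 10: "tenth"}))
```

In PHP, `preg_replace_callback()` with the same pattern would be the closest equivalent.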
I asked a different, but similar, question, and I think the solution applies to this question just as well.
https://stackoverflow.com/a/10763476/1382779
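For the matching direction the question asks about (does a given input fit one of the stored templates?), one option is to do the comparison in application code after fetching candidate rows: turn each chunk into a regex in which every $N becomes a wildcard group. This is a sketch under that assumption, not a pure-SQL solution, and the function names are hypothetical. It is written in Python; note that a repeated placeholder such as two $1 in one chunk is treated as two independent wildcards.

```python
import re

def template_to_regex(template):
    """Turn "Hello, I'm $1." into a regex where each $N matches any text."""
    literal_parts = re.split(r"\$\d+", template)
    # Escape the literal text and join the pieces with a lazy wildcard group.
    return re.compile("(.+?)".join(re.escape(part) for part in literal_parts), re.DOTALL)

def matching_chunks(cell, text):
    """Return the chunks of a '|'-delimited cell that text fits completely."""
    return [chunk for chunk in cell.split("|")
            if template_to_regex(chunk).fullmatch(text)]

cell = "Hello, my name is $1.|Hello, I'm $1.|$1 is my name!"
print(matching_chunks(cell, "Hello, I'm John."))
print(matching_chunks(cell, "John is my name!"))
```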

MySQL Find similar strings

I have an InnoDB database table of 12 character codes which often need to be entered by a user.
Occasionally, a user will enter the code incorrectly (for example typing a lower case L instead of a 1 etc).
I'm trying to write a query that will find similar codes to the one they have entered but using LIKE '%code%' gives me way too many results, many of which contain only one matching character.
Is there a way to perform a more detailed check?
Edit - Case sensitive not required.
Any advice appreciated.
Thanks.
Have a look at soundex. Commonly misspelled strings have the same soundex code, so you can query for:
where soundex(Code) like soundex(UserInput)
Use LIKE without the % wildcard for that:
SELECT `code` FROM table where code LIKE 'user_input'
This will also take trailing spaces into account:
SELECT 'a' = 'a ' returns 1, whereas SELECT 'a' LIKE 'a ' returns 0.
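Since the typos described in the question are look-alike characters (a lower-case L typed instead of a 1), one option not mentioned in the answers is to normalize both the stored code and the user input before comparing them. This is a rough sketch in Python. The `CONFUSABLE` mapping and the function name are assumptions, and the mapping only makes sense if real codes never need to distinguish these characters.

```python
# Hypothetical mapping of characters that are easily mistaken for digits.
CONFUSABLE = str.maketrans({"L": "1", "I": "1", "O": "0"})

def normalize_code(code):
    """Upper-case the code and fold look-alike letters into digits."""
    return code.strip().upper().translate(CONFUSABLE)

stored = "AB1O"
typed = "abl0 "
print(normalize_code(stored) == normalize_code(typed))
```

In practice the normalized form could be kept in an extra indexed column, so the lookup stays a plain equality comparison.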