Escape string for use in MySQL fulltext search - mysql

I am using Laravel 4 and have set up the following query:
if(Input::get('keyword')) {
$keyword = Input::get('keyword');
$search = DB::connection()->getPdo()->quote($keyword);
$query->whereRaw("MATCH(resources.name, resources.description, resources.website, resources.additional_info) AGAINST(? IN BOOLEAN MODE)",
array($search)
);
}
This query runs fine under normal use; however, if the user enters a string such as ++, an error is thrown. Looking at the MySQL docs, there are some operators, such as + and -, which have specific purposes. Is there a function that will escape these special characters so the string can be used in a fulltext search like the one above without throwing errors?
Here is an example of an error which is thrown:
{"error":{"type":"Illuminate\\Database\\QueryException","message":"SQLSTATE[42000]: Syntax error or access violation: 1064 syntax error, unexpected '+' (SQL: select * from `resources` where `duplicate` = 0 and MATCH(resources.name, resources.description, resources.website, resources.additional_info) AGAINST('c++' IN BOOLEAN MODE))","file":"\/var\/www\/html\/[...]\/vendor\/laravel\/framework\/src\/Illuminate\/Database\/Connection.php","line":555}}
Solutions I've tried:
$search = str_ireplace(['+', '-'], ' ', $keyword);
$search = filter_var($keyword, FILTER_SANITIZE_STRING);
$search = DB::connection()->getPdo()->quote($keyword);
I'm assuming I will need to use regex. What's the best approach here?

Only words and operators have meaning in Boolean search mode. The operators are: +, -, > <, ( ), ~, *, ", @distance. After some research I found which characters count as word characters: upper-case and lower-case letters, digits and _. I think you can use one of two approaches:
Replace all non word characters with spaces (I prefer this approach). This can be accomplished with regex:
$search = preg_replace('/[^\p{L}\p{N}_]+/u', ' ', $keyword);
Replace only the operator characters with spaces:
$search = preg_replace('/[+\-><\(\)~*\"@]+/', ' ', $keyword);
Only words are indexed by the full-text search engine and can be searched. Non-word characters are not indexed, so it does not make sense to leave them in the search string.
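As a rough illustration of the first approach, here is a minimal JavaScript sketch of the same replacement. The answer's own code is PHP; the function name sanitizeKeyword is made up for this example, and the extra trim() is an assumption that the original one-liner does not make.

```javascript
// Hypothetical helper mirroring the PHP preg_replace call for approach 1.
// Every run of characters that is not a letter, digit or underscore
// becomes a single space; leading and trailing spaces are then trimmed.
function sanitizeKeyword(keyword) {
  return keyword.replace(/[^\p{L}\p{N}_]+/gu, ' ').trim();
}

console.log(sanitizeKeyword('c++'));             // c
console.log(sanitizeKeyword('+foo -bar "baz"')); // foo bar baz
```

Note that a keyword made only of operators, such as ++, ends up as an empty string, so the caller may want to skip the MATCH clause in that case.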
References:
Boolean Full-Text Searches
Fine-Tuning MySQL Full-Text Search (see: "Character Set Modifications")
PHP: preg_replace
PHP: Unicode character properties
PHP: Possible modifiers in regex patterns

While the answer from Rimas is technically correct, it will suit you only if you do not want users to use the MATCH operators, because it strips them all completely. I, for example, want to allow all of them except @distance in search forms on my site, so I've come up with this:
#Trim first
$newValue = preg_replace('/^\p{Z}+|\p{Z}+$/u', '', $string);
#Remove all symbols except allowed operators and space. @distance is not included, since it's unlikely a human will be using it through a UI form
$newValue = preg_replace('/[^\p{L}\p{N}_+\-<>~()"* ]/u', '', $newValue);
#Remove all operators, that can only precede a text and that are not preceded by either beginning of string or space
$newValue = preg_replace('/(?<!^| )[+\-<>~]/u', '', $newValue);
#Remove all double quotes and asterisks, that are not preceded by either beginning of string, letter, number or space
$newValue = preg_replace('/(?<![\p{L}\p{N}_ ]|^)[*"]/u', '', $newValue);
#Remove all double quotes and asterisks that are inside text (lookarounds keep the surrounding characters intact)
$newValue = preg_replace('/(?<=[\p{L}\p{N}_])[*"](?=[\p{L}\p{N}_])/u', '', $newValue);
#Remove all opening parentheses that are not preceded by beginning of string or space
$newValue = preg_replace('/(?<!^| )\(/u', '', $newValue);
#Remove all closing parentheses that are not preceded by a letter, number or underscore, or that are not followed by end of string or space
$newValue = preg_replace('/(?<![\p{L}\p{N}_])\)|\)(?! |$)/u', '', $newValue);
#Remove all double quotes if the count is not even
if (substr_count($newValue, '"') % 2 !== 0) {
$newValue = preg_replace('/"/u', '', $newValue);
}
#Remove all parentheses if the count of closing ones does not match the count of opening ones
if (substr_count($newValue, '(') !== substr_count($newValue, ')')) {
$newValue = preg_replace('/[()]/u', '', $newValue);
}
Unfortunately, I was not able to figure out a way to do this in one regex, hence the multiple passes. It's also possible that I am missing some edge cases. Any additions or corrections are appreciated: either here, or create an issue for https://github.com/Simbiat/database where I implement this.
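One edge case in the final step: equal counts of ( and ) do not guarantee correct nesting. For example, a) (b survives the earlier passes and also passes the count check. A minimal JavaScript sketch of a stricter, order-aware check follows; the function names are hypothetical and this is not taken from the linked library.

```javascript
// Hypothetical helper: true only if every ')' closes an earlier '('.
function hasBalancedParens(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === '(') {
      depth += 1;
    } else if (ch === ')') {
      depth -= 1;
      if (depth < 0) return false; // a ')' appeared before its '('
    }
  }
  return depth === 0; // anything left open is also unbalanced
}

// Mirrors the last step above, but uses the nesting check instead of counts.
function stripUnbalancedParens(text) {
  return hasBalancedParens(text) ? text : text.replace(/[()]/g, '');
}

console.log(stripUnbalancedParens('+(a b) -c')); // +(a b) -c
console.log(stripUnbalancedParens('a) (b'));     // a b
```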

Related

JSON data with double quotes

I am getting data from an API call that contains doubled double quotes, for example data = '{"firstName":""John""}'.
How can I parse this data as JSON?
Expected output: after result = JSON.parse(data), result.firstName should give "John" (with the quotes), not John.
As @Cid points out, that is invalid JSON.
You will need to sanitize it first:-
var json = data.replace(/""/g, '"');
var x = JSON.parse(json);
If you want to keep the inner quotes, you'll need to use something like this:-
var json = data.replace(/(\".*\":)\"\"(.*)\"\"/g, '$1 "\\"$2\\""');
var x = JSON.parse(json);
However, you may need to fiddle with the regex if it conflicts with other parameters.
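For instance, when the input has more than one property, the greedy .* in the first group runs up to the last property, so only that one gets fixed. A possible variant is sketched here, under the assumption that the doubled-quote values contain no quote characters themselves; the function name parseDoubledQuotes is hypothetical.

```javascript
// Hypothetical helper: rewrites every :""value"" into :"\"value\"" and parses.
// Assumes the wrapped values never contain a double quote of their own.
function parseDoubledQuotes(data) {
  const json = data.replace(/:\s*""([^"]*)""/g, ':"\\"$1\\""');
  return JSON.parse(json);
}

const result = parseDoubledQuotes('{"firstName":""John"","lastName":""Smith""}');
console.log(result.firstName); // "John" (quote characters included)
console.log(result.lastName);  // "Smith" (quote characters included)
```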
You can review the regex above at https://regex101.com/ to get an explanation of how the regex matches:-
/(\".*\":)\"\"(.*)\"\"/g
1st Capturing Group (\".*\":)
\" matches the character " literally (case sensitive)
.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\" matches the character " literally (case sensitive)
: matches the character : literally (case sensitive)
\" matches the character " literally (case sensitive)
\" matches the character " literally (case sensitive)
2nd Capturing Group (.*)
.* matches any character (except for line terminators)
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\" matches the character " literally (case sensitive)
\" matches the character " literally (case sensitive)
Global pattern flags
g modifier: global. All matches (don't return after first match)
The $1 and $2 in the replacement text correspond to the capture groups in the regex. See String.prototype.replace() for details.
Try this:
var json = '{"firstName":""John""}'; //Let's say you got this
json = json.replace(/\"([^(\")"]+)\":/g,"$1:"); //This removes the quotes around property names
json;

Replace &amp; with &

When running the following code, I'm encountering an error message saying that the semicolon that I used on this line:
$select_stock->addExpression("REPLACE(b.corporateName, '&amp;', '&')");
for the ampersand is incorrectly placed
InvalidArgumentException: ; is not supported in SQL strings. Use only one statement at a time.
Is there another way to solve this?
public function c_form_db_2($cName) {
$select_stock = $this->connection->select('stock', 'a');
$select_stock->fields('a', ['high', 'low', 'stockname']);
$select_stock->innerJoin('stockdetails', 'b', 'b.high = a.high');
$select_stock->condition('a.isCurrentPrice', 'Yes');
$select_stock->condition('a.isActive', 'Yes');
$select_stock->condition('b.status', 'Closing');
$select_stock->addExpression("REPLACE(b.corporateName, '&amp;', '&')");
$select_stock->escapeLike($cName);
$select_stock->orderBy('a.tickerId', 'DESC');
$select_stock->orderBy('a.volId', 'DESC');
$select_stock_rows = $select_stock->execute()
->fetchAll(\PDO::FETCH_ASSOC);
return $select_stock_rows;
}
I do not know Drupal, but I assume REPLACE is the standard MySQL function and that Drupal supports all of them. In that case, if you happen to be running MySQL 8, use REGEXP_REPLACE instead of REPLACE and match against the regular expression '&amp.', where the wildcard '.' stands in for the ';' character, on the assumption that ';' is the only character the wildcard will ever match.

preg_replace : capturing single quote inside single quote escaped expression

In a WordPress theme, I am using the "posts_where" filter to add search on the "excerpt" field. It works except when there is a single quote in the search string, which leads to a SQL syntax error.
It seems to be a bug in the preg_replace call of the posts_where filter.
For example, for the string "o'kine" , the $where string received in the posts_where filter is :
"AND (((cn_posts.post_title LIKE '%o\'kine%') OR (cn_posts.post_content LIKE '%o\'kine%')))"
Then this is my preg_replace to add the post_excerpt field :
$where = preg_replace(
"/post_title\s+LIKE\s*(\'[^\']+\')/",
"post_title LIKE $1) OR (post_excerpt LIKE $1", $where );
And the value of the $where after :
"AND (((cn_posts.post_title LIKE '%o\') OR (post_excerpt LIKE '%o\'kine%') OR (cn_posts.post_content LIKE '%o\'kine%')))"
See the '%o\' part that is causing the SQL syntax error.
The expected result would be :
"AND (((cn_posts.post_title LIKE '%o\'kine%') OR (post_excerpt LIKE '%o\'kine%') OR (cn_posts.post_content LIKE '%o\'kine%')))"
The bug is clearly in my regular expression, more precisely in my capturing parentheses. How do I deal with the possibility of zero or more single quotes in my search string?
EDIT : With Casimir et Hippolyte answer, this is the working filter with single quote in the search string :
function cn_search_where( $where ) {
$where = preg_replace(
"/post_title\s+LIKE\s*('[^'\\\\]*+(?s:\\\\.[^'\\\\]*)*+')/",
"post_title LIKE $1) OR (post_excerpt LIKE $1", $where );
return $where;
}
The subpattern to match a quoted string that may contain escaped quotes (or other escaped characters) is:
'[^'\\]*+(?s:\\.[^'\\]*)*+'
(note that to represent a literal backslash in a regex pattern, it must be escaped, since the backslash is a special character)
So in a php string (backslashes need to be escaped one more time):
$pattern = "~'[^'\\\\]*+(?s:\\\\.[^'\\\\]*)*+'~";
With this information, I think you can build the pattern yourself.
details:
' # a literal single quote
[^'\\]*+ # zero or more characters that are not a single quote or a backslash
(?s: # open a non-capture group with the s modifier (the dot can match newlines)
\\. # an escaped character
[^'\\]*
)*+ # repeat the group zero or more times
'
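To see the subpattern at work outside PHP, here is a rough JavaScript port applied to the $where string from the question. JavaScript regular expressions have no possessive quantifiers, so *+ becomes * here, and [\s\S] stands in for the dot under the s modifier. The function name addExcerptClause is hypothetical.

```javascript
// JavaScript port of the quoted-string subpattern (no possessive quantifiers).
const likeClause = /post_title\s+LIKE\s*('[^'\\]*(?:\\[\s\S][^'\\]*)*')/;

// Hypothetical helper: duplicates the post_title condition for post_excerpt.
function addExcerptClause(where) {
  return where.replace(likeClause, 'post_title LIKE $1) OR (post_excerpt LIKE $1');
}

// String.raw keeps the backslashes exactly as they appear in the SQL fragment.
const where = String.raw`AND (((cn_posts.post_title LIKE '%o\'kine%') OR (cn_posts.post_content LIKE '%o\'kine%')))`;
console.log(addExcerptClause(where));
```

The escaped quote is now kept inside the captured group, so the output matches the expected result given in the question.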

Mysql - how to strip certain characters from the end of the string

I have strings like this :
column:
----------
word[1]
word[2]
word
word[2]
word
word[3]
Where word is a variable length random characters string.
How would I remove square brackets with numbers in them from the end of these strings in mysql table?
Does mysql allow regexes?
update test
set name = SUBSTRING_INDEX(name,'[',1)
where name=name
You could use the following expression in a select:
IF(RIGHT(myColumn, 1) = ']', LEFT(myColumn, CHAR_LENGTH(myColumn) - 3), myColumn)
RIGHT(myColumn, 1) = ']' checks whether your entry ends with a closing bracket.
LEFT(myColumn, CHAR_LENGTH(myColumn) - 3) returns the string without the trailing three-character suffix such as [1], if there is one. This assumes the number inside the brackets is a single digit.
myColumn returns the full string if there is no closing bracket.

How do I escape special characters in MySQL?

For example:
select * from tablename where fields like "%string "hi" %";
Error:
You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'hi" "' at line 1
How do I build this query?
The information provided in this answer can lead to insecure programming practices.
The information provided here depends highly on MySQL configuration, including (but not limited to) the program version, the database client and character-encoding used.
See http://dev.mysql.com/doc/refman/5.0/en/string-literals.html
MySQL recognizes the following escape sequences.
\0 An ASCII NUL (0x00) character.
\' A single quote (“'”) character.
\" A double quote (“"”) character.
\b A backspace character.
\n A newline (linefeed) character.
\r A carriage return character.
\t A tab character.
\Z ASCII 26 (Control-Z). See note following the table.
\\ A backslash (“\”) character.
\% A “%” character. See note following the table.
\_ A “_” character. See note following the table.
So you need
select * from tablename where fields like "%string \"hi\" %";
Although as Bill Karwin notes below, using double quotes for string delimiters isn't standard SQL, so it's good practice to use single quotes. This simplifies things:
select * from tablename where fields like '%string "hi" %';
I've developed my own MySQL escape method in Java (if useful for anyone).
See class code below.
Warning: wrong if NO_BACKSLASH_ESCAPES SQL mode is enabled.
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

private static final HashMap<String,String> sqlTokens;
private static Pattern sqlTokenPattern;
static
{
//MySQL escape sequences: http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html
String[][] search_regex_replacement = new String[][]
{
// columns: search string | search regex | sql replacement regex
{ "\u0000" , "\\x00" , "\\\\0" },
{ "'" , "'" , "\\\\'" },
{ "\"" , "\"" , "\\\\\"" },
{ "\b" , "\\x08" , "\\\\b" },
{ "\n" , "\\n" , "\\\\n" },
{ "\r" , "\\r" , "\\\\r" },
{ "\t" , "\\t" , "\\\\t" },
{ "\u001A" , "\\x1A" , "\\\\Z" },
{ "\\" , "\\\\" , "\\\\\\\\" }
};
sqlTokens = new HashMap<String,String>();
String patternStr = "";
for (String[] srr : search_regex_replacement)
{
sqlTokens.put(srr[0], srr[2]);
patternStr += (patternStr.isEmpty() ? "" : "|") + srr[1];
}
sqlTokenPattern = Pattern.compile('(' + patternStr + ')');
}
public static String escape(String s)
{
Matcher matcher = sqlTokenPattern.matcher(s);
StringBuffer sb = new StringBuffer();
while(matcher.find())
{
matcher.appendReplacement(sb, sqlTokens.get(matcher.group(1)));
}
matcher.appendTail(sb);
return sb.toString();
}
You should use single-quotes for string delimiters. The single-quote is the standard SQL string delimiter, and double-quotes are identifier delimiters (so you can use special words or characters in the names of tables or columns).
In MySQL, double-quotes work (nonstandardly) as a string delimiter by default (unless you set ANSI SQL mode). If you ever use another brand of SQL database, you'll benefit from getting into the habit of using quotes standardly.
Another handy benefit of using single-quotes is that the literal double-quote characters within your string don't need to be escaped:
select * from tablename where fields like '%string "hi" %';
MySQL has the string function QUOTE, which should solve the problem.
For strings like that, for me the most comfortable way to do it is doubling the ' or ", as explained in the MySQL manual:
There are several ways to include quote characters within a string:
A “'” inside a string quoted with “'” may be written as “''”.
A “"” inside a string quoted with “"” may be written as “""”.
Precede the quote character by an escape character (“\”).
A “'” inside a string quoted with “"” needs no special treatment and need not be doubled or escaped. In the same way, “"” inside a string quoted with “'” needs no special treatment.
It is from http://dev.mysql.com/doc/refman/5.0/en/string-literals.html.
You can use mysql_real_escape_string. mysql_real_escape_string() does not escape % and _, so you should escape MySQL wildcards (% and _) separately.
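A minimal JavaScript sketch of that separate wildcard step, assuming the default backslash escape character for LIKE. The name escapeLikeWildcards is hypothetical, and the result still has to go through parameter binding or string escaping before it reaches the query.

```javascript
// Hypothetical helper: prefixes \, % and _ with a backslash so that LIKE
// treats them as literal characters rather than wildcards.
function escapeLikeWildcards(term) {
  return term.replace(/[\\%_]/g, '\\$&');
}

console.log(escapeLikeWildcards('50%_off')); // 50\%\_off
```

The backslash itself is escaped as well, so that a backslash typed by the user cannot act as the LIKE escape character.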
For testing how to insert the double quotes in MySQL using the terminal, you can use the following way:
TableName(Name, DString) -> schema
insert into TableName values("Name", "My QQDoubleQuotedStringQQ");
After inserting the value, you can update it in the database to contain double quotes or single quotes:
update TableName set DString = replace(DString, "QQ", "\"");
If you're using a variable when searching in a string, mysql_real_escape_string() is good for you. Just my suggestion:
$char = "and way's 'hihi'";
$myvar = mysql_real_escape_string($char);
select * from tablename where fields like "%string $myvar %";