I'm trying to save a char which contains, among others, %. While other characters such as ' / or " seem to work just fine, I just can't figure out how to escape %. I tried many things, but mysql_real_escape_string and similar functions just don't work.
At the moment, I'm replacing % with 'percent' (it didn't work with backslash + %), saving that to the database, and then replacing it back before echoing, but as you may notice, it's not really optimal. Please let me know if there's a better way to do it.
Don't try to put variable values in-line with the query. Use placeholders and value binding.
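As a minimal sketch of what placeholders and value binding look like, here is a Python example. It uses the standard-library sqlite3 module purely as a stand-in for a MySQL driver (an assumption, so that the snippet runs anywhere), and the table name test and column a are only illustrative. MySQL drivers follow the same idea, though their placeholder syntax may differ (often %s instead of ?).

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a VARCHAR(50))")

value = "10% off, 'quoted' and \"double-quoted\""

# The value is bound to the ? placeholder instead of being spliced into the
# SQL text, so %, ' and " need no manual escaping.
conn.execute("INSERT INTO test (a) VALUES (?)", (value,))

stored = conn.execute("SELECT a FROM test").fetchone()[0]
print(stored)
```

The stored value comes back unchanged, so no 'percent' substitution is needed on the way in or out.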
In PHP, you need to use addslashes on the percent character before inserting.
For MySQL it is:
'10\% off' //you need to escape % character
MySQL will not change % character while doing an insert/update. If you are having problems with it, it must be some other layer in your setup which is doing the conversion.
create table test ( a varchar(10));
insert into test values ('abc%def');
select * from test;
+---------+
| a |
+---------+
| abc%def |
+---------+
1 row in set (0.00 sec)
Apparently, the % can be escaped by making it double, so a quite simple str_replace did it for me.
I inherited a MySQL table (MyISAM, utf8_general_ci encoding) that has a strange character that looks like this in phpMyAdmin: •
I assume this is a bullet point of some type?
When rendered on a HTML page it looks like this: �
How do I replace this value with a <BR><LI> so I can turn it into a line break with a properly formatted list item?
I've tried a standard UPDATE query but it does not replace these values? I assume I need to escape them somehow?
Query attempted:
UPDATE `FL_Regs` SET `Remarks` = "<BR><LI>" WHERE `Remarks` = "•"
You did not show your query, so I'm only guessing.
If you're having a hard time with your client encoding characters for you (I imagine you may be using phpMyAdmin, which involves a lot of steps between your browser and the actual server), you may try giving the string to search for as a sequence of bytes.
It happens that • is U+2022, a character named "BULLET" in Unicode, which is encoded as e2 80 a2 in UTF-8. So you can use X'E280A2' instead of '•' in your query.
Typically:
> select X'E280A2';
+-----------+
| X'E280A2' |
+-----------+
| • |
+-----------+
If you want to better understand what's happening, you can use the HEX() function, first to check what MySQL is receiving when you're sending a bullet:
SELECT HEX('•');
Typically I'm getting E280A2 which is, as previously seen, the UTF-8 encoding of the BULLET character.
And to see what's actually stored in your table:
SELECT HEX(your_column) FROM your_table;
Try to limit the search to a single row to make it almost readable.
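To double-check the byte values outside the database, a small Python sketch can reproduce them. It also shows one plausible explanation for the garbled display: the three UTF-8 bytes being read as Windows-1252. That a cp1252 mix-up is the cause here is an assumption, although it matches the symptom.

```python
bullet = "\u2022"  # U+2022 BULLET

# UTF-8 bytes of the bullet, as an uppercase hex string (what HEX() would show).
utf8_hex = bullet.encode("utf-8").hex().upper()
print(utf8_hex)  # E280A2

# The same three bytes misread as Windows-1252 give three separate characters.
misread = bullet.encode("utf-8").decode("cp1252")
print(misread)

# Going the other way: the hex string decodes back to the bullet.
roundtrip = bytes.fromhex("E280A2").decode("utf-8")
```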
I need to select from fields that can contain special characters, for example:
+--------------+
| code |
+--------------+
| **4058947"_\ |
| **4123/"_\ |
| sew'-8947"_\ |
+--------------+
I tried this:
select code from table where code REGEXP '[(|**4058947"_\|)]';
select code from table where code REGEXP '[(**4058947"_\)]';
select code from table where code REGEXP '^[(**4058947"_\)]';
but these queries return all rows, and this query returns nothing:
select code from table where code REGEXP '^[(**4058947"_\)]$';
I need it to return only the first row, or whichever one I specify.
To select only one row, you could just do this if it doesn't matter which one.
SELECT code FROM table LIMIT 1
If it does matter, drop the regex.
SELECT code FROM table WHERE code = "**4058947\"_\\"
To match those special characters (in this case, " and \), you need to "escape" them. (That's what it's called. I didn't make that up.) In most mainstream languages this is done by putting a backslash in front of the character (MySQL does it this way too). The backslash is the escape character; a backslash with another character behind it is called an escape sequence. As you see, I escaped the quote and the backslash in the code value I want to match, so it should work now.
If you need to keep the regexes (which I hope is not the case, since you have the literal string you want to match against), the same thing applies. Escape quotes and backslashes and you'll be fine, provided you drop the parentheses and brackets. Note that in a regex, you need to escape far more characters. This is because some characters (for example | [ ] ( ) * +) have a special function in a regex. This is very handy, but becomes a bit of a problem when you need to match a string with that character in it. In that case, you need to escape it, but with a double backslash! This is because MySQL first parses the query as a string literal, and a backslash in front of a character that MySQL does not recognize as an escape sequence is simply dropped. Only then is the result parsed as a regex, by which point each double backslash has become a single backslash. This gets ugly very quickly, since it means matching a backslash with a MySQL regex requires 4 backslashes: two in the regex, doubled again because MySQL parses it as a string first!
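The two layers of escaping can be sketched in Python. The helper names below are hypothetical, the list of metacharacters is a conservative assumption rather than MySQL's exact set, and the sketch assumes the default SQL mode where backslash is the string escape character. In real code, binding the pattern as a parameter avoids the second layer entirely.

```python
# Characters treated as regex metacharacters in this sketch (assumption).
REGEX_SPECIALS = set("\\^$.|?*+()[]{}")

def escape_for_regex(text):
    """Layer 1: put a backslash before every regex metacharacter."""
    return "".join("\\" + ch if ch in REGEX_SPECIALS else ch for ch in text)

def quote_for_mysql(text):
    """Layer 2: escape backslashes and single quotes for a '...' literal."""
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"

def mysql_regex_literal(text):
    """Literal text -> regex that matches it -> SQL string literal."""
    return quote_for_mysql(escape_for_regex(text))

# One backslash in the data ends up as four between the quotes.
print(mysql_regex_literal("\\"))
print(mysql_regex_literal('**4058947"_\\'))
```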
I am struggling with this query and want to know if I am wasting my time and need to write a php script or is something like the following actually possible?
UPDATE my_table
SET @userid = user_id
AND SET filename('http://pathto/newfilename_'@userid'.jpg')
FROM my_table
WHERE filename
LIKE '%_%' AND filename
LIKE '%jpg'AND filename
NOT LIKE 'http%';
Basically I have 700 odd files that need renaming in the database as they do not match the filenames as I am changing system, they are called in the database.
The format is 2_gfhgfhf.jpg which translates to userid_randomjumble.jpg
But not all files in the database are in this format only about 700 out of thousands. So I want to identify names that contain _ but don't contain http (thats the correct format that I don't want to touch).
I can do that fine but now comes the tricky bit!!
I want to replace that file name userid_randomjumble.jpg with http://pathto/filename_userid.jpg So I want to set the column user_id in that row to a variable and insert it into my new filename.
The above doesn't work for obvious reasons but I am not sure if there is a way round what I'm trying to do. I have no idea if it's possible? Am I wasting my time with this and should I turn to PHP with mysql and stop being lazy? Or is there a way to get this to work?
Yes it is possible without the php. Here is a simple example
SET @a:=0;
SELECT * FROM table WHERE field_name = @a;
Yes you can do it using straightforward SQL:
UPDATE my_table
SET filename = CONCAT('http://pathto/newfilename_', user_id, '.jpg')
WHERE filename LIKE '%\_%jpg'
AND filename NOT LIKE 'http%';
Notes:
No need for variables. Any column of the row being updated may be referenced directly.
In MySQL, use CONCAT() to join text values together.
With LIKE, an underscore (_) has a special meaning: it matches any single character. If you want to match a literal underscore, you must escape it with a backslash (\).
Your two LIKE predicates may be safely merged into one for a simpler query.
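If the text to search for comes from user input, the same escaping can be applied programmatically before building the LIKE pattern. A minimal Python sketch, where escape_like is a hypothetical helper name and backslash is assumed to be the LIKE escape character (MySQL's default):

```python
def escape_like(text, escape_char="\\"):
    """Escape the LIKE wildcards % and _ (and the escape character itself)."""
    return (
        text.replace(escape_char, escape_char + escape_char)
            .replace("%", escape_char + "%")
            .replace("_", escape_char + "_")
    )

# Build the pattern from the query above: literal underscore, ends in jpg.
pattern = "%" + escape_like("_") + "%jpg"
print(pattern)  # %\_%jpg
```

The resulting pattern would then be passed to the query as a bound parameter rather than concatenated into the SQL text.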
I'm importing some data from a CSV file, and numbers that are larger than 1000 get turned into "1,100" etc.
What's a good way to remove both the quotes and the comma from this so I can put it into an int field?
Edit:
The data is actually already in a MySQL table, so I need to be able to do this using SQL. Sorry for the mixup.
My guess is that, because the data imported at all, the field is actually a varchar or some other character field; importing into a numeric field would probably have failed. Here is a test case I ran as a pure MySQL solution.
The table is just a single column (alpha) that is a varchar.
mysql> desc t;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| alpha | varchar(15) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
Add a record
mysql> insert into t values('"1,000,000"');
Query OK, 1 row affected (0.00 sec)
mysql> select * from t;
+-------------+
| alpha |
+-------------+
| "1,000,000" |
+-------------+
Update statement.
mysql> update t set alpha = replace( replace(alpha, ',', ''), '"', '' );
Query OK, 1 row affected (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> select * from t;
+---------+
| alpha |
+---------+
| 1000000 |
+---------+
So in the end the statement I used was:
UPDATE table
SET field_name = replace( replace(field_name, ',', ''), '"', '' );
I looked at the MySQL Documentation and it didn't look like I could do the regular expressions find and replace. Although you could, like Eldila, use a regular expression for a find and then an alternative solution for replace.
Also be careful with s/"(\d+),(\d+)"/$1$2/ because the number may have more than a single comma, for instance "1,000,000". You're going to want to do a global replace (in perl that is s///g). But even with a global replace, the replacement starts where the last match left off (unless perl is different) and would miss every other comma-separated group. A possible solution would be to make the first (\d+) optional, like so: s/(\d+)?,(\d+)/$1$2/g, and in this case I would need a second find and replace to strip the quotes.
Here are some ruby examples of the regular expressions acting on just the string "1,000,000". Notice there are NO double quotes inside the string; this is just a string of the number itself.
>> "1,000,000".sub( /(\d+),(\d+)/, '\1\2' )
# => "1000,000"
>> "1,000,000".gsub( /(\d+),(\d+)/, '\1\2' )
# => "1000,000"
>> "1,000,000".gsub( /(\d+)?,(\d+)/, '\1\2' )
# => "1000000"
>> "1,000,000".gsub( /[,"]/, '' )
# => "1000000"
>> "1,000,000".gsub( /[^0-9]/, '' )
# => "1000000"
Here is a good case for regular expressions. You can run a find and replace on the data either before you import (easier) or later on if the SQL import accepted those characters (not nearly as easy). But in either case, you have any number of methods to do a find and replace, be it editors, scripting languages, GUI programs, etc. Remember that you're going to want to find and replace all of the bad characters.
A typical regular expression to find the comma and quotes (assuming just double quotes) is: (Blacklist)
/[,"]/
Or, if you think something might change in the future, this regular expression matches anything except a digit or decimal point. (Whitelist)
/[^0-9\.]/
What has been discussed by the people above is that we don't know all of the data in your CSV file. It sounds like you want to remove the commas and quotes from all of the numbers in the CSV file. But because we don't know what else is in the CSV file we want to make sure that we don't corrupt other data. Just blindly doing a find/replace could affect other portions of the file.
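As a small illustration of the difference between the two patterns, here is a Python sketch applied to single field values (the sample strings are made up for the example). On a clean value both give the same result; they differ once other characters appear.

```python
import re

raw = '"1,000,000"'

# Blacklist: remove only commas and double quotes.
blacklist = re.sub(r'[,"]', "", raw)
# Whitelist: remove everything that is not a digit or a decimal point.
whitelist = re.sub(r"[^0-9.]", "", raw)
print(blacklist, whitelist)  # 1000000 1000000

# A messier (hypothetical) value shows where the two approaches diverge.
messy = '"$1,234.50 "'
messy_blacklist = re.sub(r'[,"]', "", messy)     # keeps the $ and the space
messy_whitelist = re.sub(r"[^0-9.]", "", messy)  # keeps only 1234.50
```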
You could use this perl command.
perl -lne 's/[,"]//g; print' file.txt > newfile.txt
You may need to play around with it a bit, but it should do the trick.
Here's the PHP way:
$stripped = str_replace(array(',', '"'), '', $value);
Link to W3Schools page
Actually nlucaroni, your case isn't quite right. Your example doesn't include double-quotes, so
id,age,name,...
1,23,phil,
won't match my regex. It requires the format "XXX,XXX". I can't think of an example of when it will match incorrectly.
None of the following examples will include the delimiter in the regex match:
"111,111",234
234,"111,111"
"111,111","111,111"
Please let me know if you can think of a counter-example.
Cheers!
The solution to the changed question is basically the same.
You will have to run select query with the regex where clause.
Something like
Select *
FROM SOMETABLE
WHERE SOMEFIELD REGEXP '"[0-9]+,[0-9]+"'
For each of these rows, you want to do the following regex substitution s/"(\d+),(\d+)"/$1$2/ and then update the field with the new value.
Please take Joseph Pecoraro seriously and have a backup before doing mass changes to any files or databases. Whenever you use regexes, you can seriously mess up data if there are cases that you have missed.
My command does remove all ',' and '"'.
In order to convert the string "1,000" more strictly, you will need the following command.
perl -lne 's/"(\d+),(\d+)"/$1$2/; print' file.txt > newfile.txt
Daniel's and Eldila's answers have one problem: they remove all quotes and commas in the whole file.
What I usually do when I have to do something like this is to first replace all separating quotes and (usually) semicolons by tabs.
Search: ";"
Replace: \t
Since I know in which column my affected values will be I then do another search and replace:
Search: ^([^\t]+)\t([^\t]+)\t([0-9]+),([0-9]+)\t
Replace: \1\t\2\t\3\4\t
... given the value with the comma is in the third column.
You need to start with an "^" to make sure that it starts at the beginning of a line. Then you repeat ([^\t]+)\t as often as there are columns that you just want to leave as they are.
([0-9]+),([0-9]+) searches for values where there is a number, then a comma and then another number.
In the replace string we use \1 and \2 to just keep the values from the edited line, separating them with \t (tab). Then we put \3\4 (no tab between) to put the two components of the number without the comma right after each other. All values after that will be left alone.
If you need your file to have semicolon to separate the elements, you then can go on and replace the tabs with semicolons. However then - if you leave out the quotes - you'll have to make sure that the text values do not contain any semicolons themselves. That's why I prefer to use TAB as column separator.
I usually do that in an ordinary text editor (EditPlus) that supports RegExp, but the same regexps can be used in any programming language.
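The same column-targeted replacement can be sketched in Python. This assumes the leading columns are matched as runs of non-tab characters, written [^\t]+, and that the value in the third column contains a single comma, as the search pattern implies. The function name and the sample lines are made up for the example.

```python
import re

# Two untouched columns, then digits-comma-digits in the third column.
PATTERN = re.compile(r"^([^\t]+)\t([^\t]+)\t([0-9]+),([0-9]+)\t")

def strip_comma_in_third_column(line):
    """Join the two halves of the number in column 3; leave the rest alone."""
    return PATTERN.sub(r"\1\t\2\t\3\4\t", line)

line = "a,b\tfoo\t1,234\tx,y"
fixed = strip_comma_in_third_column(line)
print(fixed.split("\t"))  # ['a,b', 'foo', '1234', 'x,y']
```

Commas in the other columns are left alone, which is the point of anchoring the pattern to a specific column.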
I have a table which is full of arbitrarily formatted phone numbers, like this
027 123 5644
021 393-5593
(07) 123 456
042123456
I need to search for a phone number in a similarly arbitrary format (e.g. 07123456 should find the entry (07) 123 456).
The way I'd do this in a normal programming language is to strip all the non-digit characters out of the 'needle', then go through each number in the haystack, strip all non-digit characters out of it, then compare against the needle, eg (in ruby)
digits_only = lambda{ |n| n.gsub /[^\d]/, '' }
needle = digits_only[input_phone_number]
haystack.map(&digits_only).include?(needle)
The catch is, I need to do this in MySQL. It has a host of string functions, none of which really seem to do what I want.
Currently I can think of 2 'solutions'
Hack together a franken-query of CONCAT and SUBSTR
Insert a % between every character of the needle ( so it's like this: %0%7%1%2%3%4%5%6% )
However, neither of these seem like particularly elegant solutions.
Hopefully someone can help or I might be forced to use the %%%%%% solution
Update: This is operating over a relatively fixed set of data, with maybe a few hundred rows. I just didn't want to do something ridiculously bad that future programmers would cry over.
If the dataset grows I'll take the 'phoneStripped' approach. Thanks for all the feedback!
could you use a "replace" function to strip out any instances of "(", "-" and " ",
I'm not concerned about the result being numeric.
The main characters I need to consider are +, -, (, ) and space
So would that solution look like this?
SELECT * FROM people
WHERE
REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(phonenumber, '(', ''), ')', ''), '-', ''), ' ', ''), '+', '')
LIKE '123456'
Wouldn't that be terribly slow?
This looks like a problem from the start. Any kind of searching you do will require a table scan and we all know that's bad.
How about adding a column with a hash of the current phone numbers after stripping out all formatting characters. Then you can at least index the hash values and avoid a full blown table scan.
Or is the amount of data small and not expected to grow much?
Then maybe just sucking all the numbers into the client and running a search there.
I know this is ancient history, but I found it while looking for a similar solution.
A simple REGEXP may work:
select * from phone_table where phone1 REGEXP "07[^0-9]*123[^0-9]*456"
This would match the phonenumber column with or without any separating characters.
As John Dyer said, you should consider fixing the data in the DB and storing only numbers. However, if you are facing the same situation as me (I cannot run an update query), the workaround I found was combining 2 queries.
The "inside" query will retrieve all the phone numbers and format them removing the non-numeric characters.
SELECT REGEXP_REPLACE(column_name, '[^0-9]', '') phone_formatted FROM table_name
The result will be all phone numbers without any special characters. After that, the "outside" query just needs to get the entry you are looking for.
The 2 queries will be:
SELECT phone_formatted FROM (
SELECT REGEXP_REPLACE(column_name, '[^0-9]', '') phone_formatted FROM table_name
) AS result WHERE phone_formatted = '9999999999'
Important: the AS result alias is not referenced anywhere, but it must be there because MySQL requires every derived table to have an alias.
An out-of-the-box idea, but could you use a "replace" function to strip out any instances of "(", "-" and " ", and then use an "isnumeric" function to test whether the resulting string is a number?
Then you could do the same to the phone number string you're searching for and compare them as integers.
Of course, this won't work for numbers like 1800-MATT-ROCKS. :)
Is it possible to run a query to reformat the data to match a desired format and then just run a simple query? That way, even if the initial reformatting is slow, it doesn't really matter.
My solution would be something along the lines of what John Dyer said. I'd add a second column (e.g. phoneStripped) that gets stripped on insert and update. Index this column and search on it (after stripping your search term, of course).
You could also add a trigger to automatically update the column, although I've not worked with triggers. But like you said, it's really difficult to write the MySQL code to strip the strings, so it's probably easier to just do it in your client code.
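A minimal sketch of the stripped-column approach with the stripping done in client code. It uses Python's standard-library sqlite3 module as a stand-in for MySQL (an assumption so the example runs on its own); the table name people and the index name are illustrative, and phoneStripped is the column name suggested above.

```python
import re
import sqlite3

def digits_only(phone):
    """Strip every non-digit character from a phone number."""
    return re.sub(r"[^0-9]", "", phone)

conn = sqlite3.connect(":memory:")  # stand-in for the real MySQL database
conn.execute("CREATE TABLE people (phonenumber TEXT, phoneStripped TEXT)")
conn.execute("CREATE INDEX idx_people_stripped ON people (phoneStripped)")

# Populate the stripped column at insert time.
for raw in ["027 123 5644", "021 393-5593", "(07) 123 456", "042123456"]:
    conn.execute("INSERT INTO people VALUES (?, ?)", (raw, digits_only(raw)))

def find_by_phone(needle):
    """Strip the search term the same way, then do an indexable equality lookup."""
    rows = conn.execute(
        "SELECT phonenumber FROM people WHERE phoneStripped = ?",
        (digits_only(needle),),
    ).fetchall()
    return [row[0] for row in rows]

print(find_by_phone("07-123456"))  # ['(07) 123 456']
```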
(I know this is late, but I just started looking around here :)
I suggest using PHP functions rather than MySQL patterns, so you will have some code like this:
$tmp_phone = '';
for ($i=0; $i < strlen($phone); $i++)
if (is_numeric($phone[$i]))
$tmp_phone .= '%'.$phone[$i];
$tmp_phone .= '%';
$search_condition .= " and phone LIKE '" . $tmp_phone . "' ";
This is a problem with MySQL - the regex function can match, but it can't replace. See this post for a possible solution.
See
http://www.mfs-erp.org/community/blog/find-phone-number-in-database-format-independent
It is not really an issue that the regular expression would become visually appalling, since only mysql "sees" it. Note that instead of '+' (cfr. post with [\D] from the OP) you should use '*' in the regular expression.
Some users are concerned about performance (non-indexed search), but in a table with 100000 customers, this query, when issued from a user interface returns immediately, without noticeable delay.
Here is a working Solution for PHP users.
This uses a loop in PHP to build the Regular Expression. Then searches the database in MySQL with the RLIKE operator.
$phone = '(456) 584-5874'; // can be any format
$phone = preg_replace('/[^0-9]/', '', $phone); // strip non-numeric characters
$len = strlen($phone); // get length of phone number
$regex = ''; // initialize before appending in the loop
for ($i = 0; $i < $len - 1; $i++) {
$regex .= $phone[$i] . "[^[:digit:]]*";
}
$regex .= $phone[$len - 1];
This creates a Regular Expression that looks like this: 4[^[:digit:]]*5[^[:digit:]]*6[^[:digit:]]*5[^[:digit:]]*8[^[:digit:]]*4[^[:digit:]]*5[^[:digit:]]*8[^[:digit:]]*7[^[:digit:]]*4
Now formulate your MySQL something like this:
$sql = "SELECT Client FROM tb_clients WHERE Phone RLIKE '$regex'";
NOTE: I tried several of the other posted answers but found performance issues. For example, on our large database, it took 16 seconds to run the IsNumeric example. But this solution ran instantly. And this solution is compatible with older MySQL versions.
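For comparison, the same regex-building idea as a short Python sketch. The function name phone_to_regex is hypothetical. Python's re module does not support the POSIX class [[:digit:]], so the default gap here is [^0-9]*, which lets the pattern be tried locally; the MySQL-style gap can be passed in when generating a pattern for RLIKE.

```python
import re

def phone_to_regex(phone, gap="[^0-9]*"):
    """Keep only the digits and allow any run of non-digits between them."""
    digits = re.sub(r"[^0-9]", "", phone)
    if not digits:
        raise ValueError("phone number contains no digits")
    return gap.join(digits)

regex = phone_to_regex("(456) 584-5874")

# The pattern matches the same digits regardless of formatting.
print(bool(re.search(regex, "456.584.5874")))  # True
print(bool(re.search(regex, "456 584 5875")))  # False

# Same digits, with the POSIX-class gap for use in a MySQL RLIKE.
mysql_regex = phone_to_regex("(456) 584-5874", gap="[^[:digit:]]*")
```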
MySQL can search based on regular expressions.
Sure, but given the arbitrary formatting, if my haystack contained "(027) 123 456" (bear in mind the position of spaces can change, it could just as easily be 027 12 3456) and I wanted to match it with 027123456, would my regex therefore need to be this?
"^[\D]+0[\D]+2[\D]+7[\D]+1[\D]+2[\D]+3[\D]+4[\D]+5[\D]+6$"
(actually it'd be worse as the mysql manual doesn't seem to indicate it supports \D)
If that is the case, isn't it more or less the same as my %%%%% idea?
Just an idea, but couldn't you use Regex to quickly strip out the characters and then compare against that like @Matt Hamilton suggested?
Maybe even set up a view (not sure of mysql on views) that would hold all phone numbers stripped by regex to a plain phone number?
Woe is me. I ended up doing this:
mre = mobile_number && ('%' + mobile_number.gsub(/\D/, '').scan(/./m).join('%'))
find(:first, :conditions => ['trim(mobile_phone) like ?', mre])
If this is something that is going to happen on a regular basis, perhaps modifying the data to be all one format and then setting up the search form to strip out any non-alphanumeric characters (if you allow numbers like 310-BELL) would be a good idea. Having data in an easily searched format is half the battle.
A possible solution can be found at http://udf-regexp.php-baustelle.de/trac/
An additional package needs to be installed, then you can play with REGEXP_REPLACE.
Create a user-defined function that dynamically creates the regex.
DELIMITER //
CREATE FUNCTION udfn_GetPhoneRegex
(
var_Input VARCHAR(25)
)
RETURNS VARCHAR(200)
BEGIN
DECLARE iterator INT DEFAULT 1;
DECLARE phoneregex VARCHAR(200) DEFAULT '';
DECLARE output VARCHAR(25) DEFAULT '';
WHILE iterator < (LENGTH(var_Input) + 1) DO
IF SUBSTRING(var_Input, iterator, 1) IN ( '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' ) THEN
SET output = CONCAT(output, SUBSTRING(var_Input, iterator, 1));
END IF;
SET iterator = iterator + 1;
END WHILE;
SET output = RIGHT(output,10);
SET iterator = 1;
WHILE iterator < (LENGTH(output) + 1) DO
SET phoneregex = CONCAT(phoneregex,'[^0-9]*',SUBSTRING(output, iterator, 1));
SET iterator = iterator + 1;
END WHILE;
SET phoneregex = CONCAT(phoneregex,'$');
RETURN phoneregex;
END//
DELIMITER ;
Call that User Defined Function in your stored procedure.
DECLARE var_PhoneNumberRegex VARCHAR(200);
SET var_PhoneNumberRegex = udfn_GetPhoneRegex('+ 123 555 7890');
SELECT * FROM Customer WHERE phonenumber REGEXP var_PhoneNumberRegex;
I would use Google's libPhoneNumber to format a number to E164 format. I would add a second column called "e164_number" to store the e164 formatted number and add an index on it.
In my case, I needed to identify Swiss (CH) mobile phone numbers in the phone column and move them in mobile column.
As all mobile phone numbers start with 07x or +417x, here is the regex to use:
/^(\+[0-9][0-9]\s*|0|)7.*/mgix
It finds all numbers like the following:
+41 79 123 456 78
+417612345678
076 123 456 78
07812345678
7712345678
and ignores all others like these:
+41 47 123 456 78
+413212345678
021 123 456 78
02212345678
3412345678
In MySQL it gives the following code :
UPDATE `contact`
SET `mobile` = `phone`,
`phone` = ''
WHERE `phone` REGEXP '^(\\+[0-9][0-9][[:space:]]*|0)?7.*$'
You'll need to clean your number from special chars like -/.() before.
https://regex101.com/r/AiWFX8/1
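The pattern can also be checked locally against the sample numbers listed above. This Python sketch uses the regex as written for regex101 (without the flags, which are not needed for single-line strings); the variable names are illustrative.

```python
import re

# Optional "+NN" country prefix (with optional spaces) or a leading 0, then a 7.
MOBILE = re.compile(r"^(\+[0-9][0-9]\s*|0|)7.*")

should_match = [
    "+41 79 123 456 78", "+417612345678", "076 123 456 78",
    "07812345678", "7712345678",
]
should_not_match = [
    "+41 47 123 456 78", "+413212345678", "021 123 456 78",
    "02212345678", "3412345678",
]

# Keep only the numbers the pattern accepts.
matched = [n for n in should_match + should_not_match if MOBILE.match(n)]
print(matched == should_match)  # True
```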