I am a novice programmer and I'm currently working with functions and stored procedures in MySQL using Workbench 5.6. I've been searching for some time now here on SO and on the Web for a formal definition of the "@" operator in MySQL and its proper use, but I wasn't able to find a concrete explanation.
Let's say that I have this:
/*..... Stored Procedure... */
declare i int ;
set @i = 1 ;
select @i ;
/* do some other stuff */
End;
The result of the select will be 1. If instead I do:
select i ;
I will get a NULL result.
My intuition so far is that it accesses the address in memory of a stored variable and prints/modifies its content, but I'm not quite sure. Could you shed some more light?
Are there any other uses of it?
Thanks in advance.
It isn't an operator (I suspect you come from PHP, where it is an operator). It's the syntax for user-defined variables:
User variables are written as @var_name, where the variable name
var_name consists of alphanumeric characters, “.”, “_”, and “$”. A
user variable name can contain other characters if you quote it as a
string or identifier (for example, @'my-var', @"my-var", or
@`my-var`).
The @ denotes a variable: you prefix your variables with @ to prevent confusing them with column names and other schema objects, and it also makes life a lot easier when looking at code. When you enter select I from x;, you're looking for column I, which doesn't exist in the table, hence the null.
Related
I have large tables of freely formatted text strings stored in a MySQL database. Within each of those strings I have to find three substrings which are specifically formatted. This problem looks like an ideal fit for MySQL REGEXP pattern matching.
I know that the MySQL REGEXP operator returns only True or False. Moreover, because I need to process large tables, I would need to achieve the goal within MySQL and not involve PHP or any other server-side language.
Example of source data:
FirstEntry_somestring_202320047A_210991957_700443250_Lieferadresse:_modified string c/o Logistics, some address and another text
SecondEntry_hereisanothertext_210991957_text_202320047A_and_700443250_another text which does not have any predefined structure
ThirdEntry_700443250_210991957_202320047A_Lieferadresse:_here some address, Logistics, and some another text with address.
FourthEntry some very long text before numbers__202320047A-700443250-210991957-Lieferadresse:, another text with address and company name. None of this text has predefined structure
The examples above are four strings stored as TEXT datatypes within a MySQL table. They do not have any specific structure. I know, however, that somewhere in each record there must be three numbers, freely delimited, each with a specific format:
Regex Format: '(\d{3}(30|31|32)\d{4}[A-Z])'
Regex Format: '(\d{3}(99)\d{4})'
Regex Format: '((700)\d{6})'
Could you please help me get the substrings matching the regex patterns in the text above?
The Server runs on:
Windows OS
IIS 7
MySQL for Windows
PHP
...
Thank you!
MariaDB 10.0.5 (from 2013) is virtually the same as MySQL, but it uses the PCRE library for REGEXP and adds functions such as REGEXP_SUBSTR() and REGEXP_REPLACE(), which return the matched text rather than only true or false.
See https://mariadb.com/kb/en/mariadb/pcre/
For those interested in this question, I have developed my own solution using MySQL stored procedures.
I think this is the most valuable solution on this subject on Stack Overflow, as it provides a real solution, whereas the other answers offered only vague ideas:
-- Return REGEX Value
DELIMITER $$
DROP PROCEDURE IF EXISTS RETURNREGEX$$
CREATE PROCEDURE RETURNREGEX(IN strSentence VARCHAR(1024), IN regex_str VARCHAR(1024), IN length_str INT)
BEGIN
    -- SUBSTRING() positions are 1-based, so start the window at 1
    DECLARE index_str INT DEFAULT 1;
    DECLARE match_str VARCHAR(1024) DEFAULT '';
    DECLARE result BOOL DEFAULT FALSE;
    REPEAT
        -- Get substring with predefined length
        SELECT SUBSTRING(strSentence, index_str, length_str) INTO match_str;
        -- Compare this substring against the REGEX to see if we have a match
        SELECT match_str REGEXP regex_str INTO result;
        SET index_str = index_str + 1;
        -- Evaluate result (TRUE / FALSE)
    UNTIL result OR index_str > LENGTH(strSentence)
    END REPEAT;
    IF result = TRUE THEN SELECT match_str;
    ELSE SELECT NULL;
    END IF;
END$$
DELIMITER ;
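For anyone who wants to check the logic outside the database first, here is a rough Python sketch of the same sliding-window idea. The function name first_window_match and the shortened sample string are my own, used only for illustration; the three patterns follow the formats given in the question, and the window length is assumed to equal the length of the code being searched for. Unlike the procedure, this sketch only tests full-length windows.

```python
import re

def first_window_match(text, pattern, length):
    """Slide a fixed-length window over text and return the first window
    in which the pattern is found, or None if no window matches."""
    regex = re.compile(pattern)
    for start in range(len(text) - length + 1):
        window = text[start:start + length]
        if regex.search(window):
            return window
    return None

# Shortened version of the second sample row from the question.
sample = "SecondEntry_hereisanothertext_210991957_text_202320047A_and_700443250_another text"

codes = [
    first_window_match(sample, r"\d{3}(30|31|32)\d{4}[A-Z]", 10),
    first_window_match(sample, r"\d{3}99\d{4}", 9),
    first_window_match(sample, r"700\d{6}", 9),
]
print(codes)  # ['202320047A', '210991957', '700443250']
```

Because each window is exactly as long as the code it is looking for, a hit means the whole window is the code, which is why the window itself can be returned as the match.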
MySQL has a function CONCAT_WS that I use to export multiple fields with a delimiter into a single field. Works great!
A database I query stores several fields whose values I need to extract individually, but within each field the data needs to include a delimiter. I could certainly concatenate substrings, but that takes a while to set up if my data requires up to 100 unique values. Below is an example of what I am talking about:
Stored Data 01020304050607
End Result 01,02,03,04,05,06,07
Stored Data 01101213
End Result 01,10,12,13
Is there a function in MySQL that does the above?
I am not that familiar with MySQL, but I have seen questions like this come up before where a regular expression function would be useful. There are user-defined functions available that add Oracle-like regular expression functions, since built-in support is weak in MySQL. See here: https://github.com/hholzgra/mysql-udf-regexp
So you could do something like this:
select trim(TRAILING ',' FROM regexp_replace(your_column, '(.{2})', '\1,') )
from your_table;
This adds a comma after every 2 characters, then chops off the last one. Maybe this will give you some ideas.
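If doing it outside the database is acceptable, the same "split into fixed-width pieces and join with a delimiter" idea is only a few lines. This is just a sketch in Python; the function name insert_delimiter and its parameters are made up for the example, and it assumes every item is exactly two characters wide, as in the sample data.

```python
def insert_delimiter(value, width=2, delimiter=","):
    """Split value into chunks of `width` characters and join them with `delimiter`."""
    chunks = [value[i:i + width] for i in range(0, len(value), width)]
    return delimiter.join(chunks)

print(insert_delimiter("01020304050607"))  # 01,02,03,04,05,06,07
print(insert_delimiter("01101213"))        # 01,10,12,13
```

Joining the chunks avoids the trailing comma, so there is nothing to trim afterwards.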
Originally I thought this problem was a general failure of my understanding or could be generalized to a case that would be useful to others. I still haven't solved the problem, but upon learning more about the boundaries of the problem I see it is probably something highly specific and not of use to the community. Thank you for any and all help and time taken to help.
Summary
When I pass the value (Unit-101|00-102|Unit_103|Unit 104) to my stored procedure (as a VARCHAR, to use as comparison in RLIKE in a WHERE clause) - it will generate the error 'parentheses not balanced'. However, the stored procedure works perfectly when other values are passed in - three or less in the capture group (e.g. (Unit-101|00-102|Unit_103)), wildcard (.*) or wildcard-component (Unit-1.*) values work perfectly. Also, a very similar stored procedure works perfectly when more than three values are in the capture group... (Please see below for more details. Thank you.)
Edit: At least that's what I thought. I tried (a|b|c|d) (I've tried other values before which did not work) - and it worked. So I'm once again at a complete loss.
Problem Context
I have a stored procedure which includes some regex in a WHERE clause - it aggregates some totals for entries only where a Unit name matches the regex passed in to the procedure (_r_unit, an IN VARCHAR(100)).
This has worked as expected for many cases of _r_unit - .*, Unit-B.*, Unit-101, etc. - however in one case, when I used parentheses to capture Unit-101, or 00-102, or Unit_103, or Unit 104 - (Unit-101|00-102|Unit_103|Unit 104) - then the query fails with the error #1139 - Got error 'parentheses not balanced' from regexp.
Steps attempted to find solution so far
I first discovered this problem while passing in the regexp value from php, using preg_quote to escape the - characters in the unit names. However, a commenter helpfully pointed out that the - character should not need to be escaped here, and php will mean it needs to be 'double-escaped' anyway. So, I have tried some php-related things, but this does not appear to be the issue - now that I've eliminated that as the cause I'm just passing in values by hand using phpmyadmin to examine the conditions which cause an error. For reference, the variable which was passed in as a value to the stored procedure from php was set to -
"(" . preg_quote("Unit-101") . "|" . preg_quote("00-102") . "|"
. preg_quote("Unit_103") . "|" . preg_quote("Unit 104") . ")";
The literal contents of the variable (examined by echoing it out) was (Unit\-101|00\-102|Unit_103|Unit 104), as expected.
I have also examined the regex in regex101 to check it 'does what it says on the tin' - it's as expected, looking for Unit-101, 00-102, Unit_103, or Unit 104.
Variations on regex input I have tried.
Regex which does not generate the error
CALL StoredProcedure('(Unit-101)'); // Or any other of the four units, non-escaped dash
CALL StoredProcedure('(Unit\-101)'); // Or any other of the four units, escaped dash
CALL StoredProcedure('(Unit-101|00-102)')
CALL StoredProcedure('(Unit_103|Unit 104)')
CALL StoredProcedure('(Unit-101|Unit_103|Unit 104)')
CALL StoredProcedure('(00-102|Unit_103|Unit 104)')
CALL StoredProcedure('(Unit-101|00-102|Unit 104)')
Regex which generates the error
CALL StoredProcedure('(Unit-101|00-102|Unit_103|Unit 104)')
CALL StoredProcedure('(Unit 104|00-102|Unit-101|Unit_103)')
CALL StoredProcedure('(Unit-101|00-102)|(Unit_103|Unit 104)')
I tried this with some random values too - and it seems to break with 4 values, regardless of what they are.
I just tried running another stored procedure which takes in the same value (i.e. _r_unit) and it works correctly with 4 values. They are very similar queries so I'm trying to find a difference in the WHERE clauses where the regex is used but I can't find any ...
Stored Procedure 1 WHERE clauses - Generates Error
/* first WHERE clause, to retrieve results FROM one database */
WHERE
`Date` BETWEEN _startDate AND _endDate
AND `Unit Type` = _unitType
AND `Unit Serial` RLIKE _r_unit
AND `Driver` RLIKE _r_driver
AND `Error` != ""
AND `Error` RLIKE _r_errorCode
/* second WHERE clause, to retrieve results FROM a second (relational) database */
/* (there are two WHERE clauses because the results are unioned together) */
WHERE
DB._Records.Date BETWEEN _startDate AND _endDate
AND DB.UnitTypes.Module = _unitType
AND DB.Units.Serial RLIKE _r_unit
AND DB.Drivers.Driver RLIKE _r_driver
AND DB.ErrorCodes.ErrorCode != ""
AND DB.ErrorCodes.ErrorCode RLIKE _r_maintCode
Stored Procedure 2 WHERE clauses - Does not generate error
/* first WHERE clause, to retrieve results FROM one database */
WHERE
`Date` BETWEEN _startDate AND _endDate
AND `Unit Type` = _unitType
AND `Unit Serial` RLIKE _r_unit
AND `Driver` RLIKE _r_driver
/* second WHERE clause, to retrieve results FROM a second (relational) database */
/* (there are two WHERE clauses because the results are unioned together) */
WHERE
DB._Records.Date BETWEEN _startDate AND _endDate
AND DB.UnitTypes.Module = _unitType
AND DB.Units.Serial RLIKE _r_unit
AND DB.Drivers.Driver RLIKE _r_driver
(I am aware some fields are badly named however this is an old system and a lot of things are dependent on it so they would be difficult to change)
I'm at my wit's end! Any advice much appreciated!
This answer does not address the issue raised in the question, but it is too long to fit in a comment.
A small quote from MySQL RLIKE documentation:
Because MySQL uses the C escape syntax in strings (for example, “\n” to represent the newline character), you must double any “\” that you use in your REGEXP strings.
...
To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters. The MySQL parser interprets one of the backslashes, and the regular expression library interprets the other.
This means the backslashes (\) are swallowed by MySQL and the regular expression engine does not see them. However, in the regex from the question, the dashes (-) have no special meaning and it's OK to leave them unquoted.
Unfortunately, this does not explain why your query fails with that strange error message.
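To make the backslash handling above concrete, here is a deliberately simplified Python model of what MySQL does to the body of a quoted string literal before the regex library sees it. It assumes the default SQL mode (NO_BACKSLASH_ESCAPES not enabled); the function name mysql_literal_value is hypothetical, and the model ignores character sets and quote doubling, so treat it as an illustration rather than a reimplementation of the parser.

```python
import re

# Escape sequences that MySQL gives a special meaning inside string literals.
_SPECIAL = {"0": "\0", "b": "\b", "n": "\n", "r": "\r", "t": "\t", "Z": "\x1a"}

def mysql_literal_value(body):
    """Return the string value MySQL would store for a literal with this body."""
    def unescape(match):
        ch = match.group(1)
        if ch in "%_":
            # \% and \_ keep their backslash (it matters in LIKE patterns).
            return "\\" + ch
        # Known sequences are translated; for anything else the backslash is dropped.
        return _SPECIAL.get(ch, ch)
    return re.sub(r"\\(.)", unescape, body, flags=re.DOTALL)

typed = r"(Unit\-101|00\-102|Unit_103|Unit 104)"
seen_by_regex = mysql_literal_value(typed)
print(seen_by_regex)  # (Unit-101|00-102|Unit_103|Unit 104)
```

In this model the single backslash in front of each dash simply disappears, and a doubled backslash is needed for one backslash to reach the regex engine.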
Hope you have already solved your problem! In case not, here is how I fixed the problem of a query working perfectly at the MySQL prompt but giving the 'regexp brackets not balanced' MySQL error when running it through PHP.
The simple solution is that one needs to add one extra backslash when going through PHP!
So, as an example, to match any of {[( MySQL will be happy with REGEXP [{\\[(] but the same query executed via php needs REGEXP [{\\\[(]
Hope this helps!
I am having the following problem:
I have a table T which has a column Name with names. The names have the following structure:
A\\B\C
You can create one yourself like this:
create table T ( Name varchar(10));
insert into T values ('A\\\\B\\C');
select * from T;
Now if I do this:
select Name from T where Name = 'A\\B\C';
That doesn't work, I need to escape the \ (backslash):
select Name from T where Name = 'A\\\\B\\C';
Fine.
But how do I do this automatically to a string Name?
Something like the following won't do it:
select replace('A\\B\C', '\\', '\\\\');
I get: A\\\BC
Any suggestions?
Many thanks in advance.
You have to use a "verbatim string". With it, your Replace call will look like this:
Replace(@"\", @"\\")
I hope this helps.
The literal A\\B\C must be coded as A\\\\B\\C, and the parameters of replace() need escaping too:
select 'A\\\\B\\C', replace('A\\\\B\\C', '\\', '\\\\');
output (see this running on SQLFiddle):
A\\B\C A\\\\B\\C
So there is little point in using replace. These two statements are equivalent:
select Name from T where Name = replace('A\\\\B\\C', '\\', '\\\\');
select Name from T where Name = 'A\\\\B\\C';
Usage of regular expression will solve your problem.
This below query will solve the given example.
1) S\\D\B
select * from T where Name REGEXP '[A-Z]\\\\\\\\[A-Z]\\\\[A-Z]$';
In case the values can have more than one character per segment:
2) D\\B\ACCC
select * from T where Name REGEXP '[A-Z]{1,5}\\\\\\\\[A-Z]{1,5}\\\\[A-Z]{1,5}$';
Note: I have used 5 as the maximum number of characters per segment, considering the field size is 10, as stated in the create table query.
This can be generalized further. If it still does not meet your expectations, feel free to ask for my help.
You're confusing what's IN the database with how you represent that data in SQL statements. When a string in the database contains a special character like \, you have to type \\ to represent that character, because \ is a special character in SQL syntax. You have to do this in INSERT statements, but you also have to do it in the parameters to the REPLACE function. There are never actually any double slashes in the data, they're just part of the UI.
Why do you think you need to double the slashes in the SQL expression? If you're typing queries, you should just double the slashes in your command line. If you're generating the query in a programming language, the best solution is to use prepared statements; the API will take care of proper encoding (prepared statements usually use a binary interface, which deals with the raw data). If, for some reason, you need to perform queries by constructing strings, the language should hopefully provide a function to escape the string. For instance, in PHP you would use mysqli_real_escape_string.
But you can't do it by SQL itself -- if you try to feed the non-escaped string to SQL, data is lost and it can't reconstruct it.
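To illustrate the prepared-statement route, here is a small sketch using Python's built-in sqlite3 module as a stand-in for a MySQL driver. That substitution is an assumption made only so the example runs without a server, and note that SQLite, unlike MySQL, does not treat the backslash as an escape character in SQL literals. The point carries over: the value is bound through a placeholder, so it is never parsed as part of the SQL text and nothing has to be doubled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Name VARCHAR(10))")

# Raw string: exactly the six characters A \ \ B \ C
name = r"A\\B\C"

# The driver sends the value separately from the SQL text.
conn.execute("INSERT INTO T (Name) VALUES (?)", (name,))
rows = conn.execute("SELECT Name FROM T WHERE Name = ?", (name,)).fetchall()

print(len(name))  # 6
print(rows[0][0] == name)  # True
```

The only escaping left is whatever the host language needs for its own string literals, which is why a raw string is used here.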
You could use LIKE:
SELECT NAME FROM T WHERE NAME LIKE '%\\\\%';
Not exactly sure what you mean, but this should work.
select replace('A\\B\C', '\', '\\');
It's basically going to replace \ wherever encountered with \\ :)
Is this what you wanted?
I am new to SQL and I have several large databases with upper-case first and last names that I need to convert to proper case in SQL Server 2008.
I am using the following to do this:
update database
Set FirstNames = upper(substring(FirstNames, 1, 1))
+ lower(substring(FirstNames, 2, (len(FirstNames) - 1) ))
I was wondering if there was any way to adapt this so that a field with two first names is also updated (currently I make the change and then go through and manually change the second name).
I have looked over the other answers on this topic and they all seem quite long compared to the query above.
Also, is there any way to assist with converting the Mc surnames (I will manually change the others)? For example, MCDONALD to McDonald; again, I am just using the above query but replacing FirstNames with LastName.
This is probably best done outside of SQL. However, if there is a requirement to do it on the server, or if speed isn't an issue (it will be slow on large tables, so you need to decide whether you care), the way you are going about it is probably the best way of doing so. If you want, you could create a UDF that puts all of the logic in one place.
Here is some code I came across (with attribution and more information below it):
CREATE FUNCTION dbo.fCapFirst(@input NVARCHAR(4000)) RETURNS NVARCHAR(4000)
AS
BEGIN
    DECLARE @position INT
    WHILE IsNull(@position,Len(@input)) > 1
        SELECT @input = Stuff(@input,IsNull(@position,1),1,upper(substring(@input,IsNull(@position,1),1))),
               @position = charindex(' ',@input,IsNull(@position,1)) + 1
    RETURN (@input)
END
--Call it like so
select dbo.fCapFirst(Lower(Column)) From MyTable
I got this code from http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=37760 There is more information and other suggestions in this forum as well.
As for dealing with cases like the McDonald, I would suggest one of two ways to handle this. One would be to put a search in the above UDF for key names ('McDonald', 'McGrew', etc.) or for patterns (the first two letters are Mc then make the next one capital, etc.) The second way would be to put these cases (the full names) in a table and have their replacement value in a second column. Then simply do a replace. Most likely, however, it will be easiest to identify rules like Mc then capitalize instead of trying to list every last-name possibility.
Don't forget you may want to modify the above UDF to include dashes, not just spaces.
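If you do end up doing this outside of SQL, the rules discussed above (capitalize each part of the name, treat dashes like spaces, and special-case the Mc prefix) might look roughly like this in Python. The function name proper_case is made up for this sketch, and the Mc rule is the simple pattern-based variant, so names that should not follow it would still need a lookup table.

```python
import re

def proper_case(name):
    """Capitalize each run of letters, then fix up the 'Mc' surname prefix."""
    # Any non-letter (space, hyphen, apostrophe) acts as a separator between parts.
    cased = re.sub(r"[^\W\d_]+", lambda m: m.group(0).capitalize(), name)
    # "Mc" at the start of a part: capitalize the letter that follows it.
    return re.sub(r"\bMc([^\W\d_])", lambda m: "Mc" + m.group(1).upper(), cased)

print(proper_case("MCDONALD"))               # McDonald
print(proper_case("MARY ANN"))               # Mary Ann
print(proper_case("ARTHUR BENTLEY-SMYTHE"))  # Arthur Bentley-Smythe
```

The cleaned values could then be written back with an ordinary parameterized UPDATE.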
Maybe this is too long but it is very easy and can be adapted for -, ', etc:
UPDATE tbl SET LastName = Case when (CharIndex(' ',lastname,1)<>0) then (Upper(Substring(lastname,1,1))+Lower(Substring(lastname,2,CharIndex(' ',lastname,1)-1)))+
(Upper(Substring(lastname,CharIndex(' ',lastname,1)+1,1))+
Lower(Substring(lastname,CharIndex(' ',lastname,1)+2,Len(lastname)-(CharIndex(' ',lastname,1)-1))))
else (Upper(Substring(lastname,1,1))+Lower(Substring(lastname,2,Len(lastname)-1))) end,
FirstName = Case when (CharIndex(' ',firstname,1)<>0) then (Upper(Substring(firstname,1,1))+Lower(Substring(firstname,2,CharIndex(' ',firstname,1)-1)))+
(Upper(Substring(firstname,CharIndex(' ',firstname,1)+1,1))+
Lower(Substring(firstname,CharIndex(' ',firstname,1)+2,Len(firstname)-(CharIndex(' ',firstname,1)-1))))
else (Upper(Substring(firstname,1,1))+Lower(Substring(firstname,2,Len(firstname)-1))) end;
Tony Rogerson has code that deals with:
double barrelled names eg Arthur Bentley-Smythe
Control characters
I haven't used it myself though...