ANTLR: problem differentiating unary and binary operators (e.g. minus sign)

I'm using ANTLR (3.2) to parse a rather simple grammar. Unfortunately, I ran into a small problem. Take the following rule:
exp
: NUM
| '(' expression OPERATOR expression ')' -> expression+
| '(' (MINUS | '!') expression ')' -> expression
;
OPERATOR contains the same minus sign ('-') as is defined with MINUS. ANTLR seems unable to deal with these two alternatives together; if I remove either one, everything works fine.
Any ideas?

Make the unary expression the one with the highest precedence. I'd also use a different token for the unary minus, so that it is clearly distinguished from the binary minus in the AST. A demo:
grammar Exp;
options {
output=AST;
}
tokens {
UNARY;
}
parse
: exp EOF
;
exp
: additionExp
;
additionExp
: multiplyExp ('+'^ multiplyExp | '-'^ multiplyExp)*
;
multiplyExp
: unaryExp ('*'^ unaryExp | '/'^ unaryExp)*
;
unaryExp
: '-' atom -> ^(UNARY atom)
| '!' atom -> ^('!' atom)
| atom
;
atom
: '(' exp ')' -> exp
| Number -> Number
;
Number : ('0'..'9')+ ('.' ('0'..'9')+)? ;
Spaces : (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;} ;
A quick test with the source:
3 * -4 + 7 / 6 * -(3 + -7 * (4 + !2))
produced the following AST:
[Image: AST diagram, created using http://graph.gafol.net/]
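Since the image is not reproduced here, the precedence layering of the demo grammar can also be sketched as a small hand-written recursive-descent parser. This is only an illustration in Python, not ANTLR output: the names `tokenize`, `Parser` and `parse` are hypothetical, and the AST is represented as nested tuples, with `UNARY` standing in for the imaginary token of the grammar.

```python
import re

# One token per match: a number, or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/!()])")


def tokenize(src):
    """Split the source text into a flat list of token strings."""
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Each method mirrors one rule of the demo grammar."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def addition(self):  # additionExp: lowest precedence
        node = self.multiply()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.multiply())
        return node

    def multiply(self):  # multiplyExp
        node = self.unary()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.unary())
        return node

    def unary(self):  # unaryExp: binds tighter than any binary operator
        if self.peek() == "-":
            self.take()
            return ("UNARY", self.atom())
        if self.peek() == "!":
            self.take()
            return ("!", self.atom())
        return self.atom()

    def atom(self):  # atom: parenthesised expression or number
        if self.peek() == "(":
            self.take("(")
            node = self.addition()
            self.take(")")
            return node
        tok = self.take()
        if not tok[0].isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        return tok


def parse(src):
    parser = Parser(tokenize(src))
    tree = parser.addition()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Because a minus sign is only treated as unary where an operand is expected, `parse("1 - -2")` yields a binary `-` node whose right child is a `UNARY` node, which is the same separation the grammar achieves with its rule layering.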

Apostrophe at beginning/end of search string not treated as part of a word by RegEx

We run a dictionary and have run into a problem with searches that contain an apostrophe at the start of a search string. In English words like 'twas are quite rare but in the language we're dealing with, ' is considered a word character and extremely common at the start of a phrase (for instance 's) and also at the end of words (for instance a').
Oddly enough, RegEx searches don't seem to struggle with this if it's in the middle (for example air a' bhòrd gets all the desired results) but ' at beginning or end of a search string is not treated as part of a word by RegEx.
We've ascertained this is part of the RegEx specification (only alphanumeric characters and _ are treated as part of a word), but we're wondering whether it is possible to write a RegEx expression that also treats apostrophes as part of a word.
This is what we're currently getting:
-- Demonstration on MySQL 5.6.21 Community
Select ('cat''s' REGEXP CONCAT('[[:<:]]', 'cat''s', '[[:>:]]'));
-- returns 1
Select ('''cat''s' REGEXP CONCAT('[[:<:]]' ,'''cat''s' ,'[[:>:]]' ));
-- returns 0
Select ('_cat''s' REGEXP CONCAT('[[:<:]]' ,'_cat''s' ,'[[:>:]]' ));
-- returns 1
Select ('-cat''s' REGEXP CONCAT('[[:<:]]' ,'-cat''s' ,'[[:>:]]' ));
-- returns 0
Select (' cat''s' REGEXP CONCAT('[[:<:]]' ,' cat''s' ,'[[:>:]]' ));
-- returns 0
Select ('cat''' REGEXP CONCAT('[[:<:]]' ,'cat''' ,'[[:>:]]' ));
-- returns 0
Any suggestions greatly welcomed :)
I think that you should provide your own definition of what a word character is, instead of relying on the default word boundaries ([[:<:]], [[:>:]]). From the MySQL 5.6 documentation:
A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).
With that definition, the beginning of a word is equivalent to: '^|[^[:alnum:]_]'
^ -- the beginning of the string
| -- OR
[^ -- any character OTHER than
[:alnum:] -- an alphanumeric character
_ -- an underscore
]
And the end of a word would be: '[^[:alnum:]_]|$', where $ represents the end of the string.
You could just modify this to add the single quote in the character class, like :
beginning : '^|[^[:alnum:]_'']'
end : '[^[:alnum:]_'']|$'
Here is your regex :
SELECT (val REGEXP CONCAT('(^|[^[:alnum:]_''])', 'cat''s', '([^[:alnum:]_'']|$)'));
See the demo on dbfiddle
Schema (MySQL v5.6)
Query #1
Select ('cat''s'
REGEXP CONCAT('(^|[^[:alnum:]_''])', 'cat''s', '([^[:alnum:]_'']|$)')) res;
| res |
| --- |
| 1 |
Query #2
Select ('''cat''s'
REGEXP CONCAT('(^|[^[:alnum:]_''])', '''cat''s', '([^[:alnum:]_'']|$)' )) res;
| res |
| --- |
| 1 |
Query #3
Select ('_cat''s'
REGEXP CONCAT('(^|[^[:alnum:]_''])', '_cat''s' , '([^[:alnum:]_'']|$)' )) res;
| res |
| --- |
| 1 |
Query #4
Select ('-cat''s'
REGEXP CONCAT('(^|[^[:alnum:]_''])', '-cat''s' , '([^[:alnum:]_'']|$)' )) res;
| res |
| --- |
| 1 |
Query #5
Select (' cat''s'
REGEXP CONCAT('(^|[^[:alnum:]_''])', ' cat''s' , '([^[:alnum:]_'']|$)' )) res;
| res |
| --- |
| 1 |
Query #6
Select ('cat'''
REGEXP CONCAT('(^|[^[:alnum:]_''])', 'cat''' , '([^[:alnum:]_'']|$)' )) res;
| res |
| --- |
| 1 |
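
The same idea (define the word boundary yourself and add the apostrophe to the set of word characters) can be sketched outside MySQL too. Below is a minimal Python illustration, not a drop-in replacement for the query above: the function name `contains_word` is hypothetical, and Python's `\w` is Unicode-aware, so it is broader than MySQL's `[:alnum:]` plus underscore.

```python
import re


def contains_word(text, term):
    """Return True if `term` occurs in `text` as a whole word,
    where apostrophes count as word characters."""
    pattern = (
        r"(?<![\w'])"      # not preceded by a word character or apostrophe
        + re.escape(term)  # the search term, taken literally
        + r"(?![\w'])"     # not followed by a word character or apostrophe
    )
    return re.search(pattern, text) is not None


print(contains_word("'cat's", "'cat's"))    # term with a leading apostrophe
print(contains_word("'cat's", "cat's"))     # apostrophe before the term belongs to the word
print(contains_word("air a' bhòrd", "a'"))  # trailing apostrophe
```

Lookarounds are used here instead of the consuming groups of the MySQL pattern, so the boundary characters themselves are not part of the match.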

Using ANTLR4 to create functions with no argument

I am still new to ANTLR4 and I am trying to achieve the following.
I have business rules which consist of logical operations, for example:
(A= 'text' or B < 1) and getDataDB
The function getDataDB does not take any arguments. It retrieves some data, validates it, and returns either true or false.
My grammar is below:
/*
* Test grammar
*/
grammar FunctionRule;
parse: expr EOF
;
expr
: expr binop expr #logicalExpression
| lhs=VARIABLE compop rhs=VARIABLE #variableExpression
| lhs=VARIABLE compop rhs=STRING #stringExpression
| lhs=VARIABLE compop rhs=NUMBER #numberExpression
| TRUE #booleanTrue
| FALSE #booleanFalse
| function #functionExpression
| VARIABLE #booleanVariable
| LEFTPAREN expr RIGHTPAREN #enclosedExpression
;
binop : AND | OR
;
compop: EQUAL | LT | GT | LTE | GTE | NE
;
function : ID {System.out.println("HELLLL");};
TRUE: 'true' | 'TRUE' ;
FALSE: 'false' | 'FALSE';
STRING: '"' ~([\t\n\r]| '"')* '"'
;
ID : [getDataDB];
LEFTPAREN: '(';
RIGHTPAREN: ')';
EQUAL : '=' | 'EQ';
LT : '<' | 'LT';
GT : '>' | 'GT';
LTE : '<=' | 'LE';
GTE : '>=' | 'GE';
NE : '!=' | 'NE';
AND : 'AND' | '&' | 'and';
OR : 'OR' | 'or' | '|';
VARIABLE : [a-zA-Z]+[a-zA-Z0-9_.-]*;
NUMBER : [0-9]+ ('.'[0-9]+)?;
SPACE : [ \t\r\n] -> skip;
When I generate classes from the grammar, I do not see anything related to the function.
1. How do I define a function correctly in the grammar file?
2. Where can I put the code for this function after generating the classes? Is it only in the action clause, or is there a way to name a class in the grammar where I can put the implementation?
Thanks for the help!
ID : [getDataDB];
This means that ID matches a single letter that could be either one of g, e, t, D, a or B. What you likely wanted is ID: 'getDataDB'; which matches the string getDataDB. Note that calling this ID is highly misleading.
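The difference between a character set and a string literal behaves much like it does in ordinary regular expressions, so it can be illustrated with Python's `re` module. This is only an analogy, not ANTLR code; the variable names are made up for the example.

```python
import re

# Analogue of the lexer rule  ID : [getDataDB];  (a set of single characters)
char_set = re.compile(r"[getDataDB]")

# Analogue of the lexer rule  ID : 'getDataDB';  (one literal string)
literal = re.compile(r"getDataDB")

print(char_set.fullmatch("g") is not None)          # a single letter from the set matches
print(char_set.fullmatch("getDataDB") is not None)  # the whole word does not
print(literal.fullmatch("getDataDB") is not None)   # the literal matches the whole word
```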
where i can put the code for this function
Are you writing an interpreter using a visitor? Then you'd put the code into the visitFunction method or rather in a getDataDB method that you call from visitFunction if the function name was equal to getDataDB (right now that would always be the case, but I'm assuming you eventually want to introduce more than one function).
Alternatively you could also structure your grammar slightly differently like this (removing the ID rule):
function : 'getDataDB' # GetDataDB
| 'otherFunction' # OtherFunction
;
Then you could define the functions in visitGetDataDB and visitOtherFunction respectively.
All that's assuming that you want function names to be keywords (which implies that there can't be user-definable functions). If you don't, you should not have separate tokens for function names, so zero-argument functions and variables become indistinguishable syntactically (unless you add a requirement to add () for functions, but it doesn't look like that's what you want). So you should just have one rule that could be either a variable or a zero-argument function and then check whether the given identifier is the name of a function in visitVariableOrNullaryFunction (which maybe you'd just call visitVariable for brevity).
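The last approach (one rule for identifiers, with the function check done in the visitor) boils down to a lookup table. Here is a rough Python sketch of just that lookup, independent of ANTLR's generated classes; `get_data_db`, `NULLARY_FUNCTIONS` and `resolve_identifier` are hypothetical names, and the function body is a placeholder for the real database check.

```python
def get_data_db():
    """Placeholder for the real check: fetch data, validate it, return a bool."""
    return True


# Registry of zero-argument functions, keyed by the name used in the rules.
NULLARY_FUNCTIONS = {
    "getDataDB": get_data_db,
}


def resolve_identifier(name, variables):
    """What a visitVariable-style method could do with the identifier text:
    call it if it names a known function, otherwise treat it as a variable."""
    if name in NULLARY_FUNCTIONS:
        return NULLARY_FUNCTIONS[name]()
    if name in variables:
        return variables[name]
    raise NameError(f"unknown variable or function: {name}")


print(resolve_identifier("getDataDB", {}))    # dispatches to get_data_db
print(resolve_identifier("A", {"A": False}))  # plain variable lookup
```

Adding another function then only means adding an entry to the registry, with no grammar change.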

Got error 'repetition-operator operand invalid' from regexp (Error #1139)

I have a column phone_number in a database table in which an entry may contain more than one phone number. The plan is to identify entries which do not pass a regular expression validation.
This is the query I am using to accomplish my objective:
SELECT id, phone_number FROM store WHERE phone_number NOT REGEXP '^\s*\(?(020[78]?\)? ?[1-9][0-9]{2,3} ?[0-9]{4})|(0[1-8][0-9]{3}\)? ?[1-9][0-9]{2} ?[0-9]{3})\s*$';
Problem is, every time I run the code, I get an error:
Error Code: 1139. Got error 'repetition-operator operand invalid' from regexp
Thanks in advance.
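The regex you are using has at least 2 issues: 1) the escapes should be doubled, and 2) there are 2 groups separated with | that makes the ^ and $ apply to the two branches separately. The first issue is what triggers the error: inside a MySQL string literal a single backslash is consumed by the string parser, so \(? reaches the regex engine as (? and the ? has nothing valid to repeat. The \s shorthand is also replaced with [[:space:]] below, since the regex engine that reports this error does not support it.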
'^\s*\(?(020[78]?\)? ?[1-9][0-9]{2,3} ?[0-9]{4})|(0[1-8][0-9]{3}\)? ?[1-9][0-9]{2} ?[0-9]{3})\s*$'
        ^--------------------------------------^ ^------------------------------------------^
You can use
'^[[:space:]]*\\(?(020[78]?\\)? ?[1-9][0-9]{2,3} ?[0-9]{4}|0[1-8][0-9]{3}\\)? ?[1-9][0-9]{2} ?[0-9]{3})[[:space:]]*$'
Breakdown:
^ - start of string
[[:space:]]* - 0+ whitespaces
\\(? - 1 or 0 ( chars
(020[78]?\\)? ?[1-9][0-9]{2,3} ?[0-9]{4}|0[1-8][0-9]{3}\\)? ?[1-9][0-9]{2} ?[0-9]{3}) - An alternation group matching 2 alternatives:
020[78]?\\)? ?[1-9][0-9]{2,3} ?[0-9]{4} - 020 + optional 7 or 8 + an optional ) + an optional space + a digit from 1 to 9 + 2 or 3 digits + an optional space + 4 digits
| - or
0[1-8][0-9]{3}\\)? ?[1-9][0-9]{2} ?[0-9]{3} - 0 + a digit from 1 to 8 + 3 digits + an optional ) + an optional space + a digit from 1 to 9 + 2 digits + an optional space + 3 digits
[[:space:]]* - 0+ whitespaces
$ - end of string
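
The anchoring problem is not specific to MySQL, so it can be reproduced with Python's `re` module. This is a sketch for illustration only: `LOOSE` and `STRICT` are hypothetical names, and since Python raw strings are used, the backslashes are not doubled and `\s` is available.

```python
import re

# Original structure: two groups separated by |, so ^ belongs only to the
# first branch and $ only to the second.
LOOSE = re.compile(
    r"^\s*\(?(020[78]?\)? ?[1-9][0-9]{2,3} ?[0-9]{4})"
    r"|(0[1-8][0-9]{3}\)? ?[1-9][0-9]{2} ?[0-9]{3})\s*$"
)

# Fixed structure: one group around both alternatives, anchored on both sides.
STRICT = re.compile(
    r"^\s*\(?(020[78]?\)? ?[1-9][0-9]{2,3} ?[0-9]{4}"
    r"|0[1-8][0-9]{3}\)? ?[1-9][0-9]{2} ?[0-9]{3})\s*$"
)

sample = "020 7946 0958 and some junk"
print(LOOSE.search(sample) is not None)    # first branch matches, trailing text ignored
print(STRICT.search(sample) is not None)   # rejected: nothing may follow the number

print(STRICT.search("(0207) 123 4567") is not None)
print(STRICT.search("01234 567 890") is not None)
```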

create ++ operator in VHDL

I would like to have a new C++ style operator for the STD_LOGIC_VECTOR type. So far I managed to create and use the following function:
FUNCTION PLUS_ONE ( a : STD_LOGIC_VECTOR) RETURN STD_LOGIC_VECTOR is
BEGIN
RETURN std_logic_vector( unsigned( a ) + 1);
END FUNCTION;
Now if I create this:
FUNCTION "++" ( a : STD_LOGIC_VECTOR) RETURN STD_LOGIC_VECTOR is
BEGIN
RETURN std_logic_vector( unsigned( a ) + 1);
END FUNCTION;
ISE throws the following error:
"++" is not a predefined operator.
Now the question is: is it possible to create new operators in VHDL, and am I missing something?
You can only overload operators in VHDL, you cannot create new operator symbols. Quoting the LRM (section 4.5.2):
The declaration of a function whose designator is an operator symbol
is used to overload an operator. The sequence of characters of the
operator symbol shall be one of the operators in the operator classes
defined in 9.2.
And the corresponding section of the manual says:
condition_operator ::= ??
logical_operator ::= and | or | nand | nor | xor | xnor
relational_operator ::= = | /= | < | <= | > | >= | ?= | ?/= | ?< | ?<= | ?> | ?>=
shift_operator ::= sll | srl | sla | sra | rol | ror
adding_operator ::= + | - | &
sign ::= + | -
multiplying_operator ::= * | / | mod | rem
miscellaneous_operator ::= ** | abs | not
As much as I like brevity, I must admit that choosing shorthand operators over standard ways of writing expressions is "syntactic sugar", and has the potential to obfuscate the code. It is interesting to note that "trendier" languages like Python and Ruby don't have a ++ operator either.
Could VHDL support the ++ operator? I'm currently working on a VHDL parser, and I risk saying that adding a postfix ++ operator would break quite a few rules of the language grammar, especially because unary operators expect to take an operand to the right of the symbol. Owing to this and to the fact that there aren't many strong arguments in favor of such a change, I don't expect to see it anytime soon. All things considered, my personal choice has been to stick with value := value + 1 for standard data types.

MySQL Remove Trailing Zero

Is there a built-in function in MySQL that removes trailing zeros on the right?
I have some samples, and I want my output to look like this:
1.0 ==> 1
1.50 ==> 1.5
10.030 ==> 10.03
0.50 ==> 0.5
0.0 ==> 0
Easiest way by far, just add zero!
Examples:
SET
@yournumber1="1.0",
@yournumber2="1.50",
@yournumber3="10.030",
@yournumber4="0.50",
@yournumber5="0.0"
;
SELECT
(@yournumber1+0),
(@yournumber2+0),
(@yournumber3+0),
(@yournumber4+0),
(@yournumber5+0)
;
+------------------+------------------+------------------+------------------+------------------+
| (@yournumber1+0) | (@yournumber2+0) | (@yournumber3+0) | (@yournumber4+0) | (@yournumber5+0) |
+------------------+------------------+------------------+------------------+------------------+
| 1 | 1.5 | 10.03 | 0.5 | 0 |
+------------------+------------------+------------------+------------------+------------------+
1 row in set (0.00 sec)
If the column your value comes from is of DECIMAL or NUMERIC type, then cast it to a string first to make sure the conversion takes place. For example:
SELECT (CAST(`column_name` AS CHAR)+0) FROM `table_name`;
For a shorter way, just use any built-in string function to do the cast:
SELECT TRIM(`column_name`)+0 FROM `table_name`;
This solved my problem:
(TRIM(TRAILING '.' FROM(CAST(TRIM(TRAILING '0' FROM setpoint)AS char)))) AS setpoint
example:
mysql> SELECT testid, designationid, test, measure,
(TRIM(TRAILING '.' FROM(CAST(TRIM(TRAILING '0' FROM setpoint)AS char)))) AS setpoint,
(TRIM(TRAILING '.' FROM(CAST(TRIM(TRAILING '0' FROM tmin)AS char)))) AS tmin,
(TRIM(TRAILING '.' FROM(CAST(TRIM(TRAILING '0' FROM tmax)AS char)))) AS tmax
FROM tests
This is my method:
SELECT TRIM(TRAILING '.' FROM TRIM(TRAILING '0' FROM `table`.`column`)) FROM `table`
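The two-step trim used above (first the trailing zeros, then a dangling decimal point) can be sketched in Python on plain strings. The helper name `strip_trailing_zeros` is hypothetical, and the guard on the decimal point is an addition of this sketch: without it, a value with no fractional part such as "100" would lose significant zeros.

```python
def strip_trailing_zeros(text):
    """Remove trailing zeros after the decimal point, then a dangling point."""
    if "." not in text:
        return text  # nothing to strip; "100" must stay "100"
    return text.rstrip("0").rstrip(".")


for sample in ["1.0", "1.50", "10.030", "0.50", "0.0", "100"]:
    print(sample, "==>", strip_trailing_zeros(sample))
```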
I had a similar problem in a situation where I could modify neither the code nor the SQL query, but I was allowed to modify the database structure. So I changed the column type from DECIMAL to FLOAT, and that solved my problem.