I am trying to extract an email address from user input text in Watson Conversation. First things first: I need to trigger a particular node using an if condition like this:
input.text.contains('\^(([^<>()[].,;:s@\"]+(.[^<>()[].,;:s@\"]+)*)|(\".+\"))@(([[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}.[0-9]{1,3}])|(([a-zA-Z-0-9]+.)+[a-zA-Z]{2,}))$\')
But it doesn't work. I tried a lot of regexes that I found on the internet, but none of them work. Does anyone know how to write a proper regex?
I suggest using a much simpler, approximate regex to match emails, together with the String.matches(String regex) method. Unlike contains(), which only looks for a literal substring, matches() accepts a regex and requires it to match the entire string:
input.text.matches('^\\S+@\\S+\\.\\S+$')
Do not forget to double the backslashes so that the pattern receives literal backslashes.
Pattern details:
^ - start of string
\\S+ - one or more non-whitespace chars
@ - an @ symbol
\\S+ - one or more non-whitespace chars
\\. - a literal dot
\\S+ - one or more non-whitespace chars
$ - end of string.
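To try the same idea outside Watson, here is a minimal sketch in Python, assuming @ as the separator of a real address. The names EMAIL_RE, looks_like_email and extract_email are made up for this illustration. fullmatch() plays the role of matches() (the whole string must match), while search() shows how the address could be pulled out of a longer sentence.

```python
import re

# Approximate email shape: non-whitespace, "@", non-whitespace, ".", non-whitespace.
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")

def looks_like_email(text):
    """True only when the whole string has the email shape (like String.matches)."""
    return EMAIL_RE.fullmatch(text) is not None

def extract_email(text):
    """Return the first email-shaped token in the text, or None."""
    found = EMAIL_RE.search(text)
    return found.group() if found else None

print(looks_like_email("jane.doe@example.com"))         # True
print(looks_like_email("my mail is jane@example.com"))  # False (contains spaces)
print(looks_like_email("jane@example"))                 # False (no dot after the @)
print(extract_email("my mail is jane@example.com"))     # jane@example.com
```

This is deliberately loose: it accepts things that are not valid addresses, which is the trade-off the answer above proposes.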
$.validator.addMethod('AZ09_', function (value) {
return /^[a-zA-Z0-9.-_]+$/.test(value);
}, 'Only letters, numbers, and _-. are allowed');
When I use something like test-123, it still triggers as if the hyphen is invalid. I tried \- and --.
Escaping using \- should be fine, but you can also try putting it at the beginning or the end of the character class. This should work for you:
/^[a-zA-Z0-9._-]+$/
Escaping the hyphen using \- is the correct way.
I have verified that the expression /^[a-zA-Z0-9.\-_]+$/ does allow hyphens. You can also use the \w class to shorten it to /^[\w.\-]+$/.
(Putting the hyphen last in the expression actually causes it to not require escaping, as it then can't be part of a range, however you might still want to get into the habit of always escaping it.)
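A quick way to see why the original class rejects test-123 is to run the three variants side by side. This is a sketch using Python's re module rather than JavaScript; the character class rules involved are the same in both. The variable names are only for illustration.

```python
import re

# ".-_" in the middle of the class is a range from "." (46) to "_" (95).
# The hyphen itself is code 45, so it falls just outside that range.
broken = re.compile(r"[a-zA-Z0-9.-_]+")
escaped = re.compile(r"[a-zA-Z0-9.\-_]+")   # "\-" is a literal hyphen
trailing = re.compile(r"[a-zA-Z0-9._-]+")   # a hyphen at the end is literal too

value = "test-123"
print(broken.fullmatch(value) is not None)    # False
print(escaped.fullmatch(value) is not None)   # True
print(trailing.fullmatch(value) is not None)  # True

# The accidental range also lets through characters nobody asked for, such as "=" (61).
print(broken.fullmatch("a=b") is not None)    # True
```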
The \- may not have been working because you passed the whole expression from the server as a string. If that's the case, you should first escape the \ so that the server-side program handles it too.
In a server-side string: \\-
On the client side: \-
In the regex (matches): -
Or you can simply put the hyphen at the end of the [] brackets.
With the hyphen (-) character in a regex, it is important to note the difference between escaping it (\-) and not escaping it (-), because inside a character class a hyphen is not only a character itself but is also parsed as a range operator.
In the first case, with an escaped hyphen (\-), the regex matches only the literal hyphen, as in /^[+\-.]+$/.
In the second case, without escaping, as in /^[+-.]+$/, the hyphen sits between the plus and the dot, so the class matches every character with an ASCII value from 43 (plus) to 46 (dot). That includes the comma (ASCII 44) as a side effect.
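The range behaviour described above can be checked directly. The sketch below uses Python's re module for convenience (the names ranged and literal are illustrative); the same two classes behave the same way in JavaScript.

```python
import re

ranged = re.compile(r"[+-.]+")    # range from "+" (43) to "." (46)
literal = re.compile(r"[+\-.]+")  # exactly "+", "-" and "."

# The characters covered by the range 43..46:
print([chr(code) for code in range(ord("+"), ord(".") + 1)])  # ['+', ',', '-', '.']

print(ranged.fullmatch("+,-.") is not None)   # True: the comma sneaks in
print(literal.fullmatch("+,-.") is not None)  # False: the comma is rejected
print(literal.fullmatch("+-.") is not None)   # True
```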
\- should work to escape the - in the character class. Can you quote what you tested when it didn't seem to? Because it seems to work: http://jsbin.com/odita3
A more generic way of matching hyphens is to use the Unicode category for dash punctuation ("\p{Pd}" without quotes). If you are dealing with text from various cultures and sources, you might find that there are more types of hyphens out there, not just one character. You can add it inside the [] expression. Note that JavaScript only understands \p{...} when the regex has the u flag.
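Python's built-in re module has no \p{Pd}, so the sketch below approximates it by collecting every code point whose Unicode category is Pd and building a character class from them. This is an assumption-laden workaround, not the \p{Pd} syntax itself, and the names dashes, plain and with_dashes are made up for the example.

```python
import re
import sys
import unicodedata

# Every code point in the Unicode "dash punctuation" (Pd) category.
dashes = "".join(
    chr(cp)
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Pd"
)

plain = re.compile(r"[\w.\-]+")                            # ASCII hyphen only
with_dashes = re.compile("[\\w." + re.escape(dashes) + "]+")  # any Pd dash

sample = "2019\u20132020"  # the two years are joined by an en dash (U+2013)
print(plain.fullmatch(sample) is not None)        # False
print(with_dashes.fullmatch(sample) is not None)  # True
print("\u2212" in dashes)                         # False: MINUS SIGN is a math symbol, not Pd
```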
I have an HTML5 input element with a pattern attribute. I'm having some trouble with an optional group.
The (relative) URL must start with a forward slash (I have this working).
The total (relative) URL may contain a total of up to 255 characters.
All characters from 2-255 must be (lowercase) alpha-numeric or a forward slash.
Separately the forward slash regex works and the 2-255 part works for alpha-numeric and forward slashes. However I'm having trouble allowing both groups with the second group being optional.
What I have confirmed to work:
pattern="^\/"
pattern="[a-z0-9\/]"
However, I can't determine how to make the second group optional (I've tried adding a ? after the closing square bracket in the example, without luck).
I'm also not sure how to combine the length ({255,}) bit with the total pattern expression.
How do I combine all three aspects of the regular expression?
You can use
pattern="/[a-z0-9/]{0,254}"
By the way, you need neither ^ nor $ in the pattern regex: it must match the whole string anyway, because it is parsed as the ^(?:/[a-z0-9/]{0,254})$ pattern. That is, it matches a string that starts with / and then contains 0 to 254 lowercase ASCII letters, digits or slashes up to the end of the string, for a total length of 1 to 255 characters.
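Note that / only has to be escaped in regex literals, where / is used as the delimiter char; pattern regexps are defined with plain strings. One caveat: current browsers compile pattern with the v flag (earlier ones used u), under which a / inside a character class may have to be written as \/, so test the pattern in the browsers you target.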
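The implicit anchoring can be imitated outside the browser. The sketch below is a rough Python equivalent, not what the browser runs: fullmatch() stands in for the ^(?:...)$ wrapper, and URL_RE and is_valid_relative_url are hypothetical names.

```python
import re

# Same expression as pattern="/[a-z0-9/]{0,254}".
URL_RE = re.compile(r"/[a-z0-9/]{0,254}")

def is_valid_relative_url(value):
    # fullmatch() requires the whole string to match, like the pattern attribute.
    return URL_RE.fullmatch(value) is not None

print(is_valid_relative_url("/"))                 # True: the second part is optional
print(is_valid_relative_url("/blog/2016/post1"))  # True
print(is_valid_relative_url("blog/post"))         # False: no leading slash
print(is_valid_relative_url("/Blog"))             # False: uppercase letter
print(is_valid_relative_url("/" + "a" * 254))     # True: 255 characters in total
print(is_valid_relative_url("/" + "a" * 255))     # False: 256 characters
```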
When I create a string containing backslashes, they get duplicated:
>>> my_string = "why\does\it\happen?"
>>> my_string
'why\\does\\it\\happen?'
Why?
What you are seeing is the representation of my_string created by its __repr__() method. If you print it, you can see that you've actually got single backslashes, just as you intended:
>>> print(my_string)
why\does\it\happen?
The string below has three characters in it, not four:
>>> 'a\\b'
'a\\b'
>>> len('a\\b')
3
You can get the standard representation of a string (or any other object) with the repr() built-in function:
>>> print(repr(my_string))
'why\\does\\it\\happen?'
Python represents backslashes in strings as \\ because the backslash is an escape character - for instance, \n represents a newline, and \t represents a tab.
This can sometimes get you into trouble:
>>> print("this\text\is\not\what\it\seems")
this ext\is
ot\what\it\seems
Because of this, there needs to be a way to tell Python you really want the two characters \n rather than a newline, and you do that by escaping the backslash itself, with another one:
>>> print("this\\text\is\what\you\\need")
this\text\is\what\you\need
When Python returns the representation of a string, it plays safe, escaping all backslashes (even if they wouldn't otherwise be part of an escape sequence), and that's what you're seeing. However, the string itself contains only single backslashes.
More information about Python's string literals can be found at: String and Bytes literals in the Python documentation.
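A raw string literal is another way to write the same text without doubling anything. The short sketch below compares the two spellings; the variable names are only for illustration.

```python
escaped = "why\\does\\it\\happen?"  # each backslash doubled in the literal
raw = r"why\does\it\happen?"        # raw literal: backslashes taken as written

print(escaped == raw)  # True: both literals produce the same string
print(raw)             # why\does\it\happen?
print(repr(raw))       # 'why\\does\\it\\happen?'

# The doubling exists only in the source code and in the repr, not in the string:
print(len("a\\b"), len(r"a\b"))    # 3 3
# With a real escape sequence the two spellings differ:
print(len("a\tb"), len(r"a\tb"))   # 3 4
```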
As Zero Piraeus's answer explains, using single backslashes like this (outside of raw string literals) is a bad idea.
But there's an additional problem: in the future, it will be an error to use an undefined escape sequence like \d, instead of meaning a literal backslash followed by a d. So, instead of just getting lucky that your string happened to use \d instead of \t so it did what you probably wanted, it will definitely not do what you want.
As of 3.6, it already raises a DeprecationWarning, although most people don't see those. Since 3.12 it is a SyntaxWarning, which is shown by default, and it will become a SyntaxError in some future version.
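To see which of these your own interpreter does, you can compile a literal containing \d at run time and record what happens. This is a small sketch; source, "<demo>" and outcome are illustrative names, and the printed warning category depends on the Python version, so no expected output is shown.

```python
import warnings

# The text of a Python string literal that contains the unknown escape \d.
source = r'"\d"'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    try:
        value = eval(compile(source, "<demo>", "eval"))
        outcome = "compiled"
    except SyntaxError:
        # What a future version that rejects unknown escapes would do.
        value = None
        outcome = "syntax error"

print(outcome)
print([w.category.__name__ for w in caught])
print(repr(value))
```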
In many other languages, including C, using a backslash that doesn't start an escape sequence means the backslash is ignored.
In a few languages, including Python, a backslash that doesn't start an escape sequence is a literal backslash.
In some languages, to avoid confusion about whether the language is C-like or Python-like, and to avoid the problem with \Foo working but \foo not working, a backslash that doesn't start an escape sequence is illegal.
I'm trying to make a number input field using the pattern attribute since the regular type number didn't support the validations I needed.
Essentially, I want to allow any numbers that make sense, including $, + or - at the start and a % at the end. Also, users should be able to separate their numbers with commas to avoid mistakes on long numbers, but this is not necessary and they should still be able to submit a long number without any type of separation. The field should also allow for decimals.
<input required pattern="[+-]?\$?\d+(,\d{3})*(\.\d+)?%?" type="text" />
I need to allow for the following examples:
Pass:
2000
-20%
2,000
$2,000.00
999,999,999,999,999,999,999.99
Fail:
123e9
Anything that has letters in it
This is the regex that I have so far, but it doesn't seem to work, even for the most basic numbers. I've been using scriptular to test my regex, but that doesn't seem to reflect the results of the actual HTML validation.
Regex: [+-]?\$?\d+(,\d{3})*(\.\d+)?%?
EDIT: For any Ruby on Rails devs, I realized one of my mistakes is that you must escape any backslashes in your regex when you are generating your text_field. So for example, the regex in the answer should look like (?:\\+|\\-|\\$)?\\d{1,}(?:\\,?\\d{3})*(?:\\.\\d+)?%?
Try the following regex.
Regex: (?:\+|\-|\$)?\d{1,}(?:\,?\d{3})*(?:\.\d+)?%?
Explanation:
(?:\+|\-|\$)? matches one of +, - or $ in front of the number; the ? quantifier makes it optional.
\d{1,} matches the leading digits of the integer part, even if there is no , separator.
(?:\,?\d{3})* matches any number of further three-digit groups, each optionally preceded by a comma.
(?:\.\d+)? matches the optional decimal part.
%? matches an optional % character at the end.
?: marks a non-capturing group: it matches, but does not store the text for back-referencing.
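Both expressions can be checked against the pass/fail list from the question. The sketch below uses Python's re with fullmatch() to imitate the whole-string matching of the pattern attribute; it is not the browser's engine, so treat it as a sanity check of the expressions only. ASKER, ANSWER and the extra inputs "12a", "abc" and "-$5" are additions made for this illustration.

```python
import re

ASKER = re.compile(r"[+-]?\$?\d+(,\d{3})*(\.\d+)?%?")
ANSWER = re.compile(r"(?:\+|\-|\$)?\d{1,}(?:\,?\d{3})*(?:\.\d+)?%?")

should_pass = ["2000", "-20%", "2,000", "$2,000.00",
               "999,999,999,999,999,999,999.99"]
should_fail = ["123e9", "12a", "abc"]

for name, rx in (("asker", ASKER), ("answer", ANSWER)):
    accepted = all(rx.fullmatch(s) for s in should_pass)
    rejected = not any(rx.fullmatch(s) for s in should_fail)
    print(name, accepted, rejected)
# asker True True
# answer True True

# One difference: the answer allows a single leading symbol, the question's regex allows two.
print(ASKER.fullmatch("-$5") is not None)   # True
print(ANSWER.fullmatch("-$5") is not None)  # False
```

Since the question's own expression already accepts every listed example here, the backslash escaping described in the EDIT is a plausible cause of the original failure.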
I want to replace "\cite{foo123a}" with "[1]" and backwards. So far I was able to replace text with the following command
body.replaceText('.cite{foo}', '[1]');
but I did not manage to use
body.replaceText('\cite{foo}', '[1]');
body.replaceText('\\cite{foo}', '[1]');
Why?
The back conversion I cannot get to work at all:
body.replaceText('[1]', '\\cite{foo}');
This replaces only the "1", not the [ ], which means the [] is interpreted as a regex character set. Escaping the brackets does not help:
body.replaceText('\[1\]', '\\cite{foo}');//no effect, still a char set
body.replaceText('/\[1\]/', '\\cite{foo}');//no matches
The documentation states
A subset of the JavaScript regular expression features are not fully supported, such as capture groups and mode modifiers.
Can I find a full description of what is supported and what not somewhere?
I'm not familiar with Google Apps Script, but this looks like ordinary regular expression troubles.
Your second conversion is not working because the string literal '\[1\]' is just the same as '[1]'. You want to quote the text \[1\] as a string literal, which means '\\[1\\]'. Forward slashes inside a string literal have no special meaning, so with '/\[1\]/' you have written a pattern which matches the text /1/.
Your first conversion is not working because {...} denotes a quantifier, not literal braces, so you need \\\\cite\\{foo\\}. (The four backslashes are needed because matching a literal \ in a regular expression takes \\, and writing that as a string literal takes \\\\, that is, two escaped backslashes.)
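The two layers of escaping (string literal first, regex second) can be tried out in any language with the same literal rules. Here is a sketch in Python rather than Apps Script: the regex engine behind replaceText is a different one, but the layering works the same way. The variable names are illustrative.

```python
import re

text = r"See \cite{foo} for details."

# Regex wanted: \\cite\{foo\}
# Written as an ordinary string literal, every backslash is doubled.
pattern = "\\\\cite\\{foo\\}"
print(pattern)                       # \\cite\{foo\}

forward = re.sub(pattern, "[1]", text)
print(forward)                       # See [1] for details.

# Back conversion: the regex \[1\] is written as the literal "\\[1\\]".
# A function supplies the replacement so that its backslash is taken literally.
backward = re.sub("\\[1\\]", lambda match: "\\cite{foo}", forward)
print(backward == text)              # True

# Unescaped, [1] is a character class that matches only the "1".
print(re.sub("[1]", "X", "[1]"))     # [X]
```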