What is the following regular expression trying to match? - tcl

What is the following regular expression trying to match?
expect -re "classType=(.{3})"
What does (.{3}) mean in a regular expression?

In regular expressions, . matches any character, {3} is a suffix to make something be repeated three times (i.e., ...) and the parentheses around it make it a capturing group. (That means that the matched piece of input will be available as $expect_out(1,string) afterwards.)

It is the same as (...): any three characters, captured as a group.
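To see the capturing group in action outside Expect, here is a minimal sketch using Python's re module. Python is an assumption made for illustration (the question is about Tcl/Expect), but this particular pattern should behave the same way in both engines. The sample input string is made up.

```python
import re

# Hypothetical input; only the three characters after "classType=" are captured.
sample = "classType=abcdef"

match = re.search(r"classType=(.{3})", sample)

# group(1) is the text matched by the first pair of parentheses.
captured = match.group(1) if match else None
print(captured)
```

In Expect, the equivalent of group(1) is the value left in $expect_out(1,string) after the match.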

Related

Regexp to match JSON key:value pairs with commas in value [duplicate]

I can't work out why this regex (regex101)
/[\|]?([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures all the input, while this (regex101)
/[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g
captures only |Func
Input string is |Func(param1, param2, param32, param54, param293, par13am, param)|
Also, how can I match a repeated capturing group properly? E.g. I have this regex
/\(\(\s*([a-z\_]+){1}(?:\s+\,\s+(\d+)*)*\s*\)\)/gui
And input string is (( string , 1 , 2 )).
Regex101 says "a repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations...". I tried to follow this tip, but it didn't help.
Your /[\|]+([a-z0-9A-Z]+)(?:[\(]?[,][\)]?)?[\|]?/g regex does not match because you did not define a pattern to match the words inside parentheses. You might fix it as \|+([a-z0-9A-Z]+)(?:\(?(\w+(?:\s*,\s*\w+)*)\)?)?\|?, but all the values inside parentheses would be matched into one single group that you would have to split later.
It is not possible to get an arbitrary number of captures with a PCRE regex: when a capturing group is repeated, only the last captured value is stored in the group buffer.
What you can do is get multiple matches with preg_match_all, capturing the initial delimiter.
So, to match the second string, you may use
(?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\()\K\w+
See the regex demo.
Details:
(?:\G(?!\A)\s*,\s*|\|+([a-z0-9A-Z]+)\() - either the end of the previous match (\G(?!\A)) and a comma surrounded by 0+ whitespace characters (\s*,\s*), or 1+ | symbols (\|+), followed by 1+ alphanumeric chars (captured into Group 1, ([a-z0-9A-Z]+)) and a ( symbol (\()
\K - omit the text matched so far
\w+ - 1+ word chars.
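The "one group that you split later" route mentioned above can be sketched as follows. This uses Python's re module as a stand-in, which is an assumption: the answer targets PCRE, and Python's re does not support \G or \K, so only the capture-then-split approach is shown. The variable names are illustrative.

```python
import re

text = "|Func(param1, param2, param32, param54, param293, par13am, param)|"

# A repeated capturing group keeps only its last iteration.
repeated = re.search(r"\((?:(\w+)(?:\s*,\s*)?)+\)", text)
last_only = repeated.group(1)

# Workaround: capture the whole parameter list once, then split it.
pattern = re.compile(r"\|+([a-z0-9A-Z]+)(?:\(?(\w+(?:\s*,\s*\w+)*)\)?)?\|?")
m = pattern.search(text)
name = m.group(1)
params = re.split(r"\s*,\s*", m.group(2))

print(last_only)
print(name, params)
```

Here last_only holds just the final parameter, while params holds all seven values after the split.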

Regex / Pattern HTML email

Is there a way to combine two regexes?
I have this one, which prevents the user from entering this email (test#test.com)
pattern="^((?!test#test.com).)*$"
I also have one which validates email syntax
pattern="[a-z0-9._%+-]{3,}#[a-z]{3,}([.]{1}[a-z]{2,}|[.]{1}[a-z]{2,}[.]{1}[a-z]{2,})"
How can I merge these two regexes so that the user cannot enter test#test.com and the email syntax is still validated?
I tried to use an OR operator (single pipe), but I am missing something; it doesn't work.
Thanks!
It seems you may use
pattern="(?!test#test\.com$)[a-z0-9._%+-]{3,}#[a-z]{3,}\.[a-z]{2,}(?:\.[a-z]{2,})?"
Note that the HTML5 patterns are automatically anchored as they are wrapped with ^(?: and )$ at the start/end, so no need adding ^ and $ at the start/end of the pattern.
The (?!test#test\.com$) negative lookahead will fail the match if the input string is equal to the test#test.com string (unlike your first regex, which fails any input that contains that email anywhere).
The rest is your second pattern. I only removed the {1} quantifiers, which are implicit, and contracted the alternation group to \.[a-z]{2,}(?:\.[a-z]{2,})?, where (?:\.[a-z]{2,})? is an optional non-capturing group matching 1 or 0 sequences of . and 2 or more lowercase ASCII letters.
Add A-Z to the character classes to also support uppercase ASCII letters.
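A quick way to check the combined pattern is to emulate the implicit anchoring with a full-string match. The sketch below uses Python's re.fullmatch as a stand-in for the browser's pattern validation; that substitution is an assumption (browsers use the JavaScript regex engine), and is_valid and the sample addresses are hypothetical.

```python
import re

PATTERN = r"(?!test#test\.com$)[a-z0-9._%+-]{3,}#[a-z]{3,}\.[a-z]{2,}(?:\.[a-z]{2,})?"

def is_valid(value: str) -> bool:
    # fullmatch stands in for the implicit ^(?: ... )$ wrapping of HTML5 patterns.
    return re.fullmatch(PATTERN, value) is not None

for candidate in ("user#example.com", "test#test.com", "mytest#test.com", "ab#example.com"):
    print(candidate, is_valid(candidate))
```

Note that mytest#test.com is accepted: the lookahead only rejects an input that is exactly test#test.com, not one that merely contains it.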

Regex that allows numbers with commas and two decimals

I'm trying to make a number input field using the pattern attribute since the regular type number didn't support the validations I needed.
Essentially, I want to allow any numbers that make sense, including $, + or - at the start and a % at the end. Also, users should be able to separate their numbers with commas to avoid mistakes on long numbers, but this is not necessary and they should still be able to submit a long number without any type of separation. The field should also allow for decimals.
<input required pattern="[+-]?\$?\d+(,\d{3})*(\.\d+)?%?" type="text" />
I need to allow for the following examples:
Pass:
2000
-20%
2,000
$2,000.00
999,999,999,999,999,999,999.99
Fail:
123e9
Anything that has letters on it
This is the regex that I have so far, but it doesn't seem to work, even for the most basic numbers. I've been using scriptular to test my regex, but that doesn't seem to reflect the results of the actual HTML validation.
Regex: [+-]?\$?\d+(,\d{3})*(\.\d+)?%?
EDIT: For any Ruby on Rails devs, I realized one of my mistakes is that you must escape any backslashes in your regex when you are generating your text_field. So for example, the regex in the answer should look like (?:\\+|\\-|\\$)?\\d{1,}(?:\\,?\\d{3})*(?:\\.\\d+)?%?
Try the following regex.
Regex: (?:\+|\-|\$)?\d{1,}(?:\,?\d{3})*(?:\.\d+)?%?
Explanation:
(?:\+|\-|\$)? matches one of +, - or $ in front of the number; the ? quantifier makes it optional.
\d{1,} matches the integer part, even if it contains no commas.
(?:\,?\d{3})* matches any further groups of three digits, each optionally preceded by a comma.
(?:\.\d+)? matches the optional decimal part.
%? matches an optional % character at the end.
(?:...) is a non-capturing group: it groups the pattern but does not store the matched text for back-referencing.
Regex101 Demo
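The pass/fail examples from the question can be checked against this regex with a small script. This is a sketch in Python, used here as an assumption in place of the browser's pattern validation; re.fullmatch emulates the implicit anchoring of the pattern attribute, and accepts is a hypothetical helper name.

```python
import re

NUMBER = re.compile(r"(?:\+|\-|\$)?\d{1,}(?:\,?\d{3})*(?:\.\d+)?%?")

def accepts(value: str) -> bool:
    # The whole input must match, as with an HTML pattern attribute.
    return NUMBER.fullmatch(value) is not None

should_pass = ["2000", "-20%", "2,000", "$2,000.00",
               "999,999,999,999,999,999,999.99"]
should_fail = ["123e9", "abc", "12a"]

for value in should_pass + should_fail:
    print(value, accepts(value))
```

One thing this makes visible: the leading group allows only one of +, - or $, so an input such as -$5 is rejected, whereas the pattern in the question ([+-]?\$?) would allow a sign followed by $.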

How to search for pattern in multiple lines using a regular expression

Consider the pattern is:
PPP(GJ) {
__hj_o:
}
What regular expression matches the above pattern?
Tcl's regular expressions can contain newlines just fine, but for anything complicated it can help to put it in its own variable instead of having it as an inline literal:
set RE {PPP\(GJ\) \{
__hj_o:
\}}
if {[regexp $RE $someString]} {
    # We got a match!
}
Indeed, regexp would also match the above with this:
set RE {PPP\(GJ\)\s+\{\s+__hj_o:\s+\}}
because newlines are just ordinary whitespace characters (i.e., are matched by \s and .) by default. The parentheses and braces are backslash-escaped so that they match literally; an unescaped (GJ) would be a capturing group that matches GJ without the parentheses. (The above REs are probably not exactly what you want; they likely need suitable patterns for the non-whitespace portions as well.)
However, you need to ensure that the string you are matching against has the whole thing that you want to match. If you're just feeding through one line at a time, that multiline pattern will consistently fail. This sounds obvious, but it is the easiest mistake to make.
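The line-at-a-time pitfall is easy to reproduce. The sketch below uses Python's re module rather than Tcl, which is an assumption made for illustration; the regex escapes the parentheses and braces so that they match literally. One difference to keep in mind: in Python, \s matches newlines but . does not unless re.DOTALL is set, whereas Tcl matches newlines with both by default.

```python
import re

# Literal parentheses and braces are escaped; \s+ spans the newlines.
RE = re.compile(r"PPP\(GJ\)\s+\{\s+__hj_o:\s+\}")

text = "PPP(GJ) {\n__hj_o:\n}\n"

# Matching against the whole buffer succeeds.
whole = RE.search(text) is not None

# Feeding one line at a time can never satisfy a multi-line pattern.
per_line = any(RE.search(line) for line in text.splitlines())

print(whole, per_line)
```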

How does pattern matching in Expect (TCL) work (without -re flag)?

Suppose I expect one of these strings: 100:~# or 100:~/tmp
This really means, I need to match the terminal prompt for a machine (which may or may not contain the path). Normally, with this regex pattern:
100:(~|/)(/+[a-zA-Z0-9]*)*#
It works for an input string such as: 100:~/foo/bar/foo/baz#
You can test it here: Regex Pal
But using Expect in TCL, I have to add -re to match such a pattern. However, I am not allowed to do so. I tried the above pattern without -re, and it failed.
The current pattern for matching 100:~# or 100:~/tmp is very simple: 100:[~/]*#, and I was told that it is a shell expression for matching strings, not a regular expression. The 100:[~/]*# pattern means it matches anything between 100:[~/] (~ and / are optional) and #. The * character is meant to match anything, as opposed to the regular * which is zero or more in the traditional regex sense.
What exactly is a pattern matching expression in Expect without the -re flag?
They are known as "glob" patterns. They are styled after the shell's pattern matching. The documentation is here: http://tcl.tk/man/tcl8.5/TclCmd/string.htm#M40
*
    Matches any sequence of characters in string, including a null string.
?
    Matches any single character in string.
[chars]
    Matches any character in the set given by chars. If a sequence of the form x-y appears in chars, then any character between x and y, inclusive, will match. When used with -nocase, the end points of the range are converted to lower case first. Whereas {[A-z]} matches “_” when matching case-sensitively (since “_” falls between the “Z” and “a”), with -nocase this is considered like {[A-Za-z]} (and probably what was meant in the first place).
\x
    Matches the single character x. This provides a way of avoiding the special interpretation of the characters *?[]\ in pattern.
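To get a feel for these rules, here is a rough sketch using Python's fnmatch.fnmatchcase as an analogue of glob matching. This is an assumption for illustration only: fnmatch shares the *, ? and [chars] rules quoted above and, like Tcl's string match, must match the whole string, but it does not implement the \x escape, so it is not a drop-in model of Tcl or Expect.

```python
from fnmatch import fnmatchcase

PROMPT = "100:[~/]*#"

# [~/] consumes exactly one character (~ or /); * then matches any run of
# characters, including none; the pattern must end with #.
samples = ("100:~#", "100:~/foo/bar#", "100:#", "100:~/tmp")
results = {s: fnmatchcase(s, PROMPT) for s in samples}

# The [A-z] range includes "_" because it sits between "Z" and "a".
underscore_in_range = fnmatchcase("_", "[A-z]")

for sample, matched in results.items():
    print(sample, matched)
print(underscore_in_range)
```

Under these rules [~/] is not optional: 100:# is rejected because one of ~ or / must follow the colon, and 100:~/tmp is rejected because the pattern requires a trailing #.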