How to include a multiline text in regex pattern - json

I have a pattern ^[0-9]+$, but I want it to also allow the \n newline character, so that a string like the one below would be valid:
123\n345\n678\n9752\n or in other words:
123
345
678
9752

Assuming you don't want to include leading/trailing newlines, try:
\A[0-9]+(?:\n[0-9]+)*\Z
\A - Start-string anchor;
[0-9]+ - 1+ digits;
(?:\n[0-9]+)* - A non-capturing group, repeated 0+ times, that matches a single newline character followed by 1+ digits;
\Z - End-string anchor.
Note: ^[0-9]+(?:\n[0-9]+)*$ would also work, provided multiline mode is turned off so that ^ and $ anchor to the whole string rather than to each line.
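As a quick check of the pattern, here is a minimal Python sketch. The helper name is_digit_lines is made up for illustration. Anchors differ between engines: in Python's re module \Z matches only at the very end of the string, so a trailing newline is rejected, whereas in PCRE \Z also matches just before a final newline.

```python
import re

# Same pattern as above; \A and \Z anchor to the whole string, not to lines.
DIGIT_LINES = re.compile(r"\A[0-9]+(?:\n[0-9]+)*\Z")


def is_digit_lines(text: str) -> bool:
    """Return True if text is one or more newline-separated runs of digits."""
    return DIGIT_LINES.search(text) is not None


print(is_digit_lines("123\n345\n678\n9752"))  # digits on every line
print(is_digit_lines("123\n\n345"))            # contains an empty line
print(is_digit_lines("123\n345\n"))            # trailing newline
```

In Python only the first call returns True. If a trailing newline should be accepted as well, appending \n? before \Z is one option.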

Related

RegEx replace only occurrences outside of <h> html tags

I would like to regex replace Plus in the below text, but only when it's not wrapped in a header tag:
<h4 class="Somethingsomething" id="something">Plus plan</h4>The <b>Plus</b> plan starts at $14 per person per month and comes with everything from Basic.
In the above I would like to replace the second "Plus" but not the first.
My regex attempt so far is:
(?!<h\d*>)\bPlus\b(?!<\\h>)
Meaning:
Do not capture the following if it is inside a <h + 1 digit and 0 or more characters, ending in a closing <\h>
Capture only if the group "Plus" is surrounded by spaces or white space
However - this captures both occurrences. Can someone point out my mistake and correct this?
I want to use this in VBA but should be a general regex question, as far as I understand.
You can use
\bPlus\b(?![^>]*<\/h\d+>)
To use the match inside the replacement pattern, use the $& backreference in your VBA code.
Details:
\bPlus\b - a whole word Plus
(?![^>]*<\/h\d+>) - a negative lookahead that fails the match if, immediately to the right of the current location, there are
[^>]* - zero or more chars other than >
<\/h - </h string
\d+ - one or more digits
> - a > char.
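To see the lookahead at work outside VBA, here is a small Python sketch run against the sample from the question. The replacement word "Premium" and the function name replace_plus are assumptions chosen only for the demonstration.

```python
import re

HTML = ('<h4 class="Somethingsomething" id="something">Plus plan</h4>'
        'The <b>Plus</b> plan starts at $14 per person per month '
        'and comes with everything from Basic.')

# Whole word "Plus" that is NOT followed by non-'>' characters and then a
# closing heading tag such as </h4>.
PLUS_OUTSIDE_HEADING = re.compile(r"\bPlus\b(?![^>]*</h\d+>)")


def replace_plus(text: str, replacement: str) -> str:
    """Replace "Plus" everywhere except directly inside a heading's text."""
    # A function replacement keeps the replacement text literal.
    return PLUS_OUTSIDE_HEADING.sub(lambda match: replacement, text)


result = replace_plus(HTML, "Premium")
print(result)
```

The "Plus" inside the h4 stays as it is, while the one inside the b element is replaced. The approach relies on the heading text containing no nested tags between "Plus" and the closing tag.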

Html5 input pattern check

I'm not good at HTML pattern validation.
I have this problem: my text input is valid only if it has min 3 and max 30 chars,
white space at the start and at the end of the string is not allowed, white space between one word and another is allowed, only A-Za-z is allowed, the first char of each word must be uppercase and the other chars must be lowercase.
Thanks.
--UPDATE--
input#name
Valid Examples:
'Mario Giovanni'
'Maria'
'Jacopo Karol Pio'
'Jacopo K'
Invalid Examples:
' Mario Giovanni'
'Mario Giovanni '
' Mario Giovanni '
'Mario  Giovanni'
'maria'
'mAria'
'Antonio mario'
If you need pure regex then this should work for you:
<input type="text" pattern="(?=^.{3,30}$)^[A-Z][a-z]*(?: [a-z]+)*$">
(?=^.{3,30}$) - use a positive lookahead to make sure we have between 3 and 30 chars
^[A-Z] - require start with a capital letter
[a-z]* - optionally allow lowercase letters to follow
(?: [a-z]+)* - optionally allow a repeating group of a space char followed by one or more lowercase letters
$ - end of string anchor
You will want to use a Regular Expression pattern to check whether the input is valid or not, as well as the maxlength and minlength attributes to ensure that the input is between 3 and 30 characters.
Regarding the RegEx pattern, we must:
Start at the beginning of the input: ^
Verify that the first character is between A and Z: [A-Z]
Verify that the following characters before the last one are lowercase letters or spaces: [a-z ]*, where * indicates that there might be multiple characters matching that part of the pattern; if you only want to allow one space between words, then use ([a-z]* ?)
Verify that the last character is a lowercase letter: [a-z]$, where $ indicates the end of the input
Below is the code I would use.
<input type="text" minlength=3 maxlength=30 pattern="^[A-Z][a-z ]*[a-z]$">
Looks like what you want is:
<input type="text" pattern="(?=^.{3,30}$)^[A-Z][a-z]+( [A-Z][a-z]+)*$">
Note that this is validated in the user's browser and does not amount to secure input validation. You should check the input again on the server side before using it anywhere.
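For the server-side re-check, a minimal Python sketch could look like the one below. The rule it encodes is an assumption read off the question's examples (every word is one uppercase letter followed by zero or more lowercase letters, words are separated by exactly one space), and the name is_valid_name is hypothetical.

```python
import re

# Assumption: each word is [A-Z][a-z]*, separated by single spaces.
NAME_RE = re.compile(r"[A-Z][a-z]*(?: [A-Z][a-z]*)*")


def is_valid_name(value: str) -> bool:
    """Check the 3..30 length limit and the capitalised-words shape."""
    return 3 <= len(value) <= 30 and NAME_RE.fullmatch(value) is not None


valid = ["Mario Giovanni", "Maria", "Jacopo Karol Pio", "Jacopo K"]
invalid = [" Mario Giovanni", "Mario Giovanni ", " Mario Giovanni ",
           "maria", "mAria", "Antonio mario"]

for name in valid + invalid:
    print(repr(name), is_valid_name(name))
```

Checking the length separately keeps the regular expression short; the same expression without the length check could also be tried in the pattern attribute, which the browser matches against the whole value.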

Sublime Text regex to find and replace whitespace between two xml or html tags?

I'm using Sublime Text and I need to come up with a regex that will find the whitespaces between a certain opening and closing tag and replace them with commas.
Example: Replace white space in
<tags>This is an example</tags>
so it becomes
<tags>This,is,an,example</tags>
Thanks!
You just have to use a simple regex like:
\s+
And replace it with a comma.
This will find instances of
<tags>...</tags>
with whitespace between the tags
(<tags>\S+)\s(.+</tags>)
This will replace the first whitespace with a comma
\1,\2
Open Find and Replace [OS X Cmd+Opt+F :: Windows Ctrl+H]
Use the two values above to find and replace and use the 'Replace All' option. Repeat until all the whitespaces are converted to commas.
The best answer is probably a quick script but this will get you there fairly fast without needing to do any coding.
You can replace any one or more whitespace chunks in between two tags using a single regular expression:
(?s)(?:\G(?!\A)|<tags>(?=.*?</tags>))(?:(?!</?tags>).)*?\K\s+
Details:
(?s) - a DOTALL inline modifier, makes . match line breaks
(?:\G(?!\A)|<tags>(?=.*?</tags>)) - either the end of the previous successful match (\G(?!\A)) or (|) <tags> substring that is immediately followed with any zero or more chars, as few as possible and then </tags> (see (?=.*?</tags>))
(?:(?!</?tags>).)*? - any char that does not start a <tags> or </tags> substrings, zero or more occurrences but as few as possible
\K - match reset operator
\s+ - one or more whitespaces (NOTE: use \s if each whitespace must be replaced).
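Outside Sublime Text the same result can be reached in two steps. Python's built-in re module does not support \G or \K, so the sketch below swaps in a different technique: it finds each <tags>...</tags> block and rewrites the whitespace inside it with a replacement function. The function name commas_inside_tags is made up for illustration.

```python
import re

# Capture the opening tag, the content (as little as possible) and the
# closing tag; DOTALL lets the content span several lines.
TAG_BLOCK = re.compile(r"(<tags>)(.*?)(</tags>)", re.DOTALL)


def commas_inside_tags(text: str) -> str:
    """Replace each run of whitespace between <tags> and </tags> with a comma."""
    def fix(match):
        inner = re.sub(r"\s+", ",", match.group(2))
        return match.group(1) + inner + match.group(3)

    return TAG_BLOCK.sub(fix, text)


print(commas_inside_tags("<tags>This is an example</tags>"))
```

Whitespace outside the tags is left untouched, which is the same guarantee the single-regex version gives.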

Regex issue on SQL

Why do I get 0 when running this expression?
SELECT 'Nr. 1700-902-8423. asdasdasd' REGEXP '1+[ ,.-/\]*7+[ ,.-/\]*0+[ ,.-/\]*0+[ ,.-/\]*9+[ ,.-/\]*0+[ ,.-/\]*2+[ ,.-/\]*8+[ ,.-/\]*4+[ ,.-/\]*2+[ ,.-/\]*3+';
I need to get true when the text contains the specified number (17009028423). There can be symbols ,.-/\ between digits.
For example, for the number 17009028423, I need to get true when the text contains any of:
1700-902-8423
17-00,902-84.23
170/09-0.28\423
1700..902 842-3
17,.009028 4//2\3
etc.
Thanks.
There are two problems with your regular expression. The first is that the backslash in \] escapes the ], so it loses its special meaning as the end of the character class. You need to escape the backslash: \\]. The other problem is that - denotes a range inside [ and ] (e.g. [a-zA-Z]), so .-/ is read as the range from . to / and a literal - is not matched. You need to escape that too or put it at the end like [a-zA-Z-] (as @tenub said). Plus, the backslashes must themselves be escaped inside the SQL string literal, which makes:
SELECT 'Nr. 1700-902-8423. asdasdasd' REGEXP '1[ ,./\\\\-]*7[ ,./\\\\-]*0[ ,./\\\\-]*0[ ,./\\\\-]*9[ ,./\\\\-]*0[ ,./\\\\-]*2[ ,./\\\\-]*8[ ,./\\\\-]*4[ ,./\\\\-]*2[ ,./\\\\-]*3'
I also removed + signs in case you want to match each number only once.
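Writing the separator class eleven times by hand is error-prone, so the pattern can also be generated from the number. The following Python sketch illustrates the idea; number_pattern is a hypothetical helper. In a Python raw string the backslash is doubled once, not twice as in the SQL literal above.

```python
import re

# Space, comma, dot, slash, backslash and hyphen; the hyphen is placed last
# so that it is a literal character and not a range.
SEPARATORS = r"[ ,./\\-]*"


def number_pattern(number: str):
    """Build a regex matching the digits of number with optional separators."""
    return re.compile(SEPARATORS.join(re.escape(digit) for digit in number))


PATTERN = number_pattern("17009028423")

samples = [
    "Nr. 1700-902-8423. asdasdasd",
    "17-00,902-84.23",
    "170/09-0.28\\423",
    "1700..902 842-3",
    "17,.009028 4//2\\3",
]

for text in samples:
    print(text, PATTERN.search(text) is not None)
```

The generated pattern string could then be passed to the SQL query as a parameter, with whatever extra escaping the database's string literals require.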

Do all kinds of newlines get converted to \r\n when submitted through a html form?

The specification from w3c states the following for forms of enctype=application/x-www-form-urlencoded:
This is the default content type. Forms submitted with this content
type must be encoded as follows:
1) Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described
in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by
`%HH', a percent sign and two hexadecimal digits representing the
ASCII code of the character. Line breaks are represented as "CR LF"
pairs (i.e., `%0D%0A').
2) The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and
name/value pairs are separated from each other by `&'.
There are a few kinds of line terminators in Unicode. Namely:
LF: Line Feed, U+000A
VT: Vertical Tab, U+000B
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
Are all of these converted to CR LF (\r\n)?
Nope. The HTML4 spec here is unclear on what a line break is, but what browsers do, and what HTML5 has gone on to standardise is that only CR and LF are involved:
replace every occurrence of a "CR" (U+000D) character not followed by a "LF" (U+000A) character, and every occurrence of a "LF" (U+000A) character not preceded by a "CR" (U+000D) character, by a two-character string consisting of a "CR" (U+000D) "LF" (U+000A) character pair
(IE doesn't quite conform to this exactly, as it treats LFCR as a single newline. But it's close enough.)
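The quoted normalisation rule can be sketched in a few lines of Python. This illustrates the rule only and is not what any browser actually runs; normalize_newlines is a made-up name, and quote_plus from the standard library stands in for the form's URL encoding.

```python
import re
from urllib.parse import quote_plus

# A CR not followed by LF, or an LF not preceded by CR.
LONE_CR_OR_LF = re.compile(r"\r(?!\n)|(?<!\r)\n")


def normalize_newlines(value: str) -> str:
    """Turn lone CR and lone LF into CRLF; existing CRLF pairs are kept."""
    return LONE_CR_OR_LF.sub("\r\n", value)


print(repr(normalize_newlines("a\nb\rc\r\nd")))
print(repr(normalize_newlines("a\u2028b\x0bc")))   # LS and VT are untouched
print(quote_plus(normalize_newlines("a\nb c")))
```

Only CR and LF take part in the substitution, so VT, FF, NEL, LS and PS pass through unchanged and are simply percent-encoded as ordinary characters.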