HTML regex space - html

I'm trying to bind a regex to my HTML form for an EU bank account. So far, I was able to piece together that:
pattern="[A-Z]{2}[000000000-999999999]{9}"
will accept, for example, UK123456789.
But I also want it to accept UK12 2345 6789.
How do I go about accepting a space at exactly those positions?

pattern="[A-Z]{2}[000000000-999999999]{9}"
This only accidentally does what you want. [000000000-999999999] says "this character should be a 0 or a 0 or a 0 or ... a character in the range 0-9 or a 9 or a 9 or ... a 9". The proper form is:
pattern="[A-Z]{2}[0-9]{9}"
or, equivalently and more concisely:
pattern="[A-Z]{2}\d{9}"
Now that we have something more rational, we can extend that to:
pattern="[A-Z]{2}\d{2}\s?\d{4}\s?\d{4}"
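which allows optional whitespace at the specific locations. Note that this version expects ten digits (2 + 4 + 4), as in the spaced example UK12 2345 6789, whereas the pattern above expects nine, as in UK123456789; adjust the counts to the real account format.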
If you want to allow just spaces rather than any whitespace character, you could do:
pattern="[A-Z]{2}\d{2} ?\d{4} ?\d{4}"
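As a quick way to try these patterns outside a form, here is a minimal sketch in JavaScript. The helper name matchesPattern is made up for illustration; it approximates the browser's behavior by wrapping the pattern in ^(?: and )$ so the whole value has to match.

```javascript
// Hypothetical helper: approximates how a browser applies a pattern attribute,
// i.e. the entire value must match the expression.
function matchesPattern(pattern, value) {
  return new RegExp("^(?:" + pattern + ")$", "u").test(value);
}

// The spaces-only variant from above (backslashes doubled for a JS string).
const accountPattern = "[A-Z]{2}\\d{2} ?\\d{4} ?\\d{4}";

console.log(matchesPattern(accountPattern, "UK12 2345 6789"));  // true
console.log(matchesPattern(accountPattern, "UK1223456789"));    // true
console.log(matchesPattern(accountPattern, "UK12  2345 6789")); // false: two spaces in a row
console.log(matchesPattern(accountPattern, "UK123456789"));     // false: only nine digits
```

The last call shows the digit-count mismatch: the spaced pattern needs ten digits, so the nine-digit example from the question no longer passes.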

You can allow optional whitespace using \s?, though it'll make your regex a little longer. The regex below accepts the value both with and without whitespace:
\w{2}\d{2}\s?\d{4}\s?\d{4}
But be aware that a European IBAN is longer than what you have posted, though I'm not sure how it is in the UK.

If you don't care where the spaces are, as long as there are 9 digits, you can remove all the spaces before checking:
const str = 'UK12 234 5678';
// Strip every space, then require exactly two letters followed by nine digits.
const strToCheck = str.replace(/ /g, '');
const isValid = /^[a-zA-Z]{2}\d{9}$/.test(strToCheck);
if (isValid) {
    console.log('Valid');
}

Related

Can you create a pattern for HTML input fields with a minimum number of letters of a certain type?

I want to create a pattern for an HTML input field that needs to have at least 10 numbers in it and may also have spaces and a plus sign on top of that, but it's not required.
It's important that numbers and spaces can be mixed though. Also, the whole field can only have 17 characters all in all.
I'm not sure if it's even possible. I started doing something like that:
pattern="[0-9+\s]{10,17}*"
But like this, it's not guaranteed that there are at least 10 numbers.
Thanks in advance! Hope the question doesn't exist already, I looked but couldn't find it.
You can use
pattern="(?:[+\s]*\d){10,17}[+\s]*"
The regex matches
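(?:[+\s]*\d){10,17} - ten to seventeen occurrences of zero or more + or whitespace characters followed by a digit
[+\s]* - zero or more + or whitespace characters.
Note the pattern is anchored by default (it is wrapped with ^(?: and )$), so nothing else is allowed. The pattern by itself does not cap the total length at 17 characters; add maxlength="17" to the input for that.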
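To see how the pattern behaves, here is a small JavaScript sketch. The function name isValidPhone and the sample values are illustrative assumptions, and the explicit length check stands in for what maxlength="17" would enforce on the input itself.

```javascript
// The pattern from the answer, anchored the way the browser anchors it.
const phonePattern = new RegExp("^(?:(?:[+\\s]*\\d){10,17}[+\\s]*)$", "u");

// Hypothetical helper: the regex guarantees at least ten digits,
// the length check guarantees at most maxLength characters overall.
function isValidPhone(value, maxLength = 17) {
  return value.length <= maxLength && phonePattern.test(value);
}

console.log(isValidPhone("+49 170 1234567"));          // true: 12 digits, 15 characters
console.log(isValidPhone("123 456 789"));              // false: only 9 digits
console.log(isValidPhone("+1 2 3 4 5 6 7 8 9 0 1 2")); // false: 24 characters
console.log(phonePattern.test("+1 2 3 4 5 6 7 8 9 0 1 2")); // true: the regex alone does not limit length
```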

How to limit simple form input to 50 characters

Is it possible to limit a simple form input to only 50 characters without javascript?
I have used the maxlength attribute, however this includes blank spaces, which is not what I want.
I've attempted to use pattern (as suggested on another post), but I can't seem to get that to work either.
Thanks
I don't know why you don't want it to include blanks.
Usually I use maxlength including blanks and leave it to the user to trim their excess whitespace. I'm not disagreeing, I honestly don't know what your requirement is.
If you want to allow leading and trailing whitespace, but are willing to leave it to the user to collapse excess whitespace within the text, then this is the pattern you want:
<input pattern="^\s*.{0,50}\s*$">
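In some regex flavors, \A is used instead of ^ and \z instead of $ for multiline input, but the pattern attribute uses JavaScript regular expression syntax, which does not support them. The pattern is also anchored to the whole value implicitly, so the ^ and $ here are optional.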
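If the actual requirement is "at most 50 characters, not counting whitespace anywhere in the text", a different pattern is needed than the one above. The following is only a sketch of one possible approach, not something from the original answer: it counts non-whitespace characters with (?:\S\s*){0,50}. The name fitsLimit is made up for the example.

```javascript
// Candidate value for the pattern attribute: optional leading whitespace,
// then up to 50 non-whitespace characters, each followed by any amount of whitespace.
const limitPattern = "\\s*(?:\\S\\s*){0,50}";

// Hypothetical helper that anchors the pattern the way a browser would.
function fitsLimit(value) {
  return new RegExp("^(?:" + limitPattern + ")$", "u").test(value);
}

console.log(fitsLimit("x".repeat(50)));  // true: exactly 50 non-whitespace characters
console.log(fitsLimit("x ".repeat(50))); // true: 100 characters, but only 50 are non-whitespace
console.log(fitsLimit("x".repeat(51)));  // false: 51 non-whitespace characters
```

In markup this would be written as pattern="\s*(?:\S\s*){0,50}" without the doubled backslashes.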

Regex all uppercase with special characters

I have a regex '^[A0-Z9]+$' that works until it reaches strings with 'special' characters like a period or dash.
List:
UPPER
lower
UPPER lower
lower UPPER
TEST
test
UPPER2.2-1
UPPER2
Gives:
UPPER
TEST
UPPER2
How do I get the regex to ignore non-alphanumeric characters also so it includes UPPER2.2-1 also?
I have a link here to show it 'real-time': http://www.rubular.com/r/ev23M7G1O3
This is for MySQL REGEX
EDIT: I didn't specify I wanted all non-alphanumeric characters (including spaces), but with the help of others here it led me to this: '^[A-Z-0-9[:punct:][:space:]]+$' is there anything wrong with this?
Try
'^[A-Z0-9.-]+$'
You just need to add the special characters to the group, optionally escaping them.
Additionally if you choose not to escape the -, be aware that it should be placed at the start or the end of the grouping expression to avoid the chance that it may be interpreted as delimiting a range.
To your updated question, if you want all non-whitespace, try using a group such as:
^[^ ]+$
which will match everything except for a space.
If instead what you wanted is all non-whitespace and non-lowercase, you likely will want to use:
^[^ a-z]+$
The 'trick' used here is adding a caret symbol after the opening [ in the group expression. This indicates that we want the negation of the match.
Following the pattern, we can also apply this 'trick' to get everything but lowercase letters like this:
^[^a-z]+$
I'm not really sure which of the 3 above you want, but if nothing else, this ought to serve as a good example of what you can do with character classes.
I believe you are looking for (one?) uppercase-word match, where word is pretty much anything.
^[^a-z\s]+$
...or if you want to allow more words with spaces, then probably just
^[^a-z]+$
You just need to put in the . and -. In theory, you don't need to escape them because they are inside the brackets, but I like to escape anyway to remind myself for the cases where I have to.
'^[A-Z0-9\.\-]+$'
Try regular expression as below:
'^[A-Z0-9\\.\\-]+$'
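To compare the character classes discussed in this thread, here is a small sketch that runs them against the list from the question. It uses JavaScript's regex engine rather than MySQL's, which is an assumption made purely for illustration; for these simple ASCII classes the bracket semantics shown are the same.

```javascript
const values = ["UPPER", "lower", "UPPER lower", "lower UPPER", "TEST", "test", "UPPER2.2-1", "UPPER2"];

// The original class: 'A', the range '0' through 'Z', and '9'.
// '.' and '-' sort before '0' in ASCII, so they fall outside that range.
const original = /^[A0-Z9]+$/;
// Explicit ranges plus the two extra characters; '-' is last so it is not read as a range.
const withPunctuation = /^[A-Z0-9.-]+$/;
// Negated class: anything except lowercase letters and whitespace.
const noLowercaseOrSpace = /^[^a-z\s]+$/;

const matching = (re) => values.filter((v) => re.test(v));

console.log(matching(original));           // [ 'UPPER', 'TEST', 'UPPER2' ]
console.log(matching(withPunctuation));    // [ 'UPPER', 'TEST', 'UPPER2.2-1', 'UPPER2' ]
console.log(matching(noLowercaseOrSpace)); // [ 'UPPER', 'TEST', 'UPPER2.2-1', 'UPPER2' ]
```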

Inserting HTML tag in the middle of Arabic word breaks word connection (cursive)

From wikipedia:
Cursive (from Latin curro, currere, cucurri, cursum, to run, hasten) is any style of handwriting that is designed for writing notes and letters quickly by hand. In the Arabic, Latin, and Cyrillic writing systems, the letters in a word are connected, making a word one single complex stroke.
In the above languages, when we format part of a single word with e.g. a <span> tag to apply a custom CSS style, it breaks the word connection. Is there any solution for this?
For example, this is a normal Arabic word: كتب
But when we color the last letter differently using a span tag, the connection is lost,
because the first two letters are in one element and the last is in another in order to color it.
Is there something I can do to avoid breaking the word?
Here is the full html:
<p>كت<span style="color: Red;">ب</span></p>
I'm not sure if there's any HTML way to do it, but you can fix it by adding a zero-width joiner Unicode character before the opening span tag:
<p>كت&#x200d;<span style="color: Red;">ب</span></p>
You can use the actual Unicode character instead of the HTML character entity, of course, but that wouldn't be visible here. Or you can use the prettier &zwj; entity.
Here it is in action (using an invisible <b> tag, since I can't do color here), without the joiner:
كتب
and with the joiner:
كت‍ب
It's supposed to work without the joiner as far as I understand it, though, and it does in some browsers, but clearly not all of them.
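The same markup can be generated programmatically. This is a minimal sketch, assuming the goal is to color only the last letter of a word; the function name colorLastLetter is invented for the example, and the joiner is written as the escape \u200D so it stays visible in the source.

```javascript
// U+200D ZERO WIDTH JOINER, the character behind &#x200d; and &zwj;
const ZWJ = "\u200D";

// Hypothetical helper: wraps the last letter of `word` in a colored span and
// puts a joiner before the span so the preceding letter keeps its connected form.
function colorLastLetter(word, color) {
  const chars = Array.from(word); // split by code point rather than UTF-16 unit
  const last = chars.pop();
  return chars.join("") + ZWJ + '<span style="color: ' + color + ';">' + last + "</span>";
}

const html = colorLastLetter("كتب", "Red");
console.log(html.includes(ZWJ + "<span")); // true
```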
Update 2020/5
Google Chrome (checked version 81.0.4044.138) and Firefox (76.0.1) have solved this issue when rendering Arabic and Farsi words, and there is no longer any need to handle the situation manually. Simply wrapping the keyword with <span style="color:red">Keyword</span> works fine with both connecting and non-connecting characters.
For this reason, you probably can not see the difference between Correct and Wrong examples below:
Main post:
Seven years after the accepted answer, I would like to add a new answer with more practical details, as my native language is Farsi. I assume that we want to replace a keyword within a longer word. This answer considers the following details:
1- Sometimes it is not enough to add &zwj; only after the previous character, because the next character should also have a tail to complete the connection.
body{font-size:36pt;}
span{color:red}
Wrong: مک&zwj;<span>انیک</span>
<br>
Correct: مک&zwj;<span>&zwj;انیک</span>
2- We may also need to add &zwj; after the keyword to connect it to the next character.
body{font-size:36pt;}
span{color:red}
Wrong: مک&zwj;<span>&zwj;انیک</span>ی
<br>
Correct: مک&zwj;<span>&zwj;انیک&zwj;</span>&zwj;ی
3- Some characters accept a tail before them but not after, so we have to exclude them from getting a tail after them. These are the characters that do not connect to the following character: ا آ د ذ ر ز ژ و
4- Finally, to respect search engines and scrapers, I recommend using JavaScript (jQuery) to replace keywords after DOM ready, which keeps the page source clean.
This is my final code with regards to all details above:
$(document).ready(function(){
    var tail="\u200D";
    var keyword="ستر";
    $(".searchableContent").each(function(){
        var htm=$(this).html();
        /*
        preserve keywords which have space both before and after
        with a temp sign say #fullHolder#
        */
        htm=htm.split(' '+keyword+' ').join(' #fullHolder# ');
        /*
        preserve keywords which have only space after
        with a temp sign say #preHolder#
        */
        htm=htm.split(keyword+' ').join('#preHolder#'+' ');
        /*
        preserve keywords which have only space before
        with a temp sign say #nextHolder#
        */
        htm=htm.split(' '+keyword).join(' '+'#nextHolder#');
        /*
        replace remaining keywords with marked up span.
        Add tail to both side of span to make sure it is
        connected to both letters before and after
        */
        htm=htm.split(keyword).join(tail+'<span style="color:#ff0000">'+tail+keyword+tail+'</span>'+tail);
        //Deal #preHolder# by adding tail only before the keyword
        htm=htm.split('#preHolder#'+' ').join(tail+'<span style="color:#ff0000">'+tail+keyword+'</span>'+' ');
        //Deal #nextHolder# by adding tail only after the keyword
        htm=htm.split(' '+'#nextHolder#').join(' '+'<span style="color:#ff0000">'+keyword+tail+'</span>'+tail);
        //Deal #fullHolder# by adding markup only without tail
        htm=htm.split(' '+'#fullHolder#'+' ').join(' '+'<span style="color:#ff0000">'+keyword+'</span>'+' ');
        //Remove all possible combination of added tails to non-connecting characters
        var nonConnectings=['ا','آ','د','ذ','ر','ز','ژ','و'];
        //Declare the loop variable locally instead of leaking a global
        for (var x = 0; x < nonConnectings.length; x++) {
            htm=htm.split(nonConnectings[x]+tail).join(nonConnectings[x]);
            htm=htm.split(nonConnectings[x]+'<span style="color:#ff0000">'+tail).join(nonConnectings[x]+'<span style="color:#ff0000">');
            htm=htm.split(nonConnectings[x]+'</span>'+tail).join(nonConnectings[x]+'</span>');
        }
        $(this).html(htm);
    })
})
div{font-size:26pt}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="searchableContent">
سترون - بستری - آستر - بستر - استراحت
</div>
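The rules in points 1 to 3 can also be expressed without placeholders or the DOM. The following is a rough sketch, not the author's method: the names highlightKeyword and NON_CONNECTING are invented, it treats any non-letter neighbor (space, hyphen, start or end of text) as "nothing to connect to", and it works on plain text only, so it should not be fed strings that already contain markup.

```javascript
const ZWJ = "\u200D";
// Letters that do not connect to the character that follows them.
const NON_CONNECTING = new Set(["ا", "آ", "د", "ذ", "ر", "ز", "ژ", "و"]);
const OPEN_TAG = '<span style="color:#ff0000">';

// Hypothetical helper: wraps every occurrence of `keyword` in a span and adds
// joiners on both sides of a tag boundary only where the letters actually connect.
function highlightKeyword(text, keyword) {
  if (!keyword) return text;
  const isLetter = (ch) => ch !== undefined && /\p{L}/u.test(ch);
  const keywordEndsConnecting = !NON_CONNECTING.has(keyword[keyword.length - 1]);
  let result = "";
  let pos = 0;
  let index;
  while ((index = text.indexOf(keyword, pos)) !== -1) {
    const prev = text[index - 1];
    const next = text[index + keyword.length];
    // Connect backwards only if the previous letter can join to what follows it.
    const before = isLetter(prev) && !NON_CONNECTING.has(prev) ? ZWJ : "";
    // Connect forwards only if the keyword's last letter can join to what follows it.
    const after = isLetter(next) && keywordEndsConnecting ? ZWJ : "";
    result += text.slice(pos, index) + before + OPEN_TAG + before + keyword + after + "</span>" + after;
    pos = index + keyword.length;
  }
  return result + text.slice(pos);
}

const highlighted = highlightKeyword("بستری - آستر", "ستر");
console.log(highlighted.split(ZWJ).length - 1); // 2: joiners are added only around the first opening tag
```

In the sample, the first word gets a joiner on each side of the opening tag, while the second word gets none because the letter before the keyword is non-connecting.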

Regex for html style attribute

I'm trying to write a regex that gets a style attribute value. The example below should explain my issue.
source: font-size:11pt;font-color:red;text-align:left;
I want to ask for:
font-size and get back 11pt
font-color and get back red
text-align and get back left
Can someone point me in the right direction
Thanks
Lee
This question reminded me of a Jeff Atwood blog post, Parsing Html The Cthulhu Way. This isn't exactly the same question, but it's the same sentiment. Don't parse CSS with regular expressions! There are tons of libraries out there to do this for you.
Logically you'd want:
[exact phrase] + 1 colon + 0 or more white space characters + 0 or more characters up to the first semicolon or closing quote.
I think this will get you headed in the right direction:
font-size[:][\s]*[^;'"]*
Gotchas:
the closing quote might be single or double and there may be a valid quote within (ie, quoting background image urls, for instance)
this is all dependent on the styles not being written in shorthand
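The description above can be sketched in JavaScript as follows. The function name getStyleValue is made up for the example; the (?:^|;) prefix is an extra assumption added so that asking for color does not accidentally match inside font-color. The gotchas listed above still apply.

```javascript
// Hypothetical helper: returns the value of `property` from an inline style
// string, or null if the property is not present.
function getStyleValue(style, property) {
  // Escape regex metacharacters in the property name before embedding it.
  const escaped = property.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // property + colon + optional whitespace + everything up to ';' or a quote.
  const re = new RegExp("(?:^|;)\\s*" + escaped + "\\s*:\\s*([^;'\"]*)", "i");
  const match = re.exec(style);
  return match ? match[1].trim() : null;
}

const source = "font-size:11pt;font-color:red;text-align:left;";
console.log(getStyleValue(source, "font-size"));  // 11pt
console.log(getStyleValue(source, "font-color")); // red
console.log(getStyleValue(source, "text-align")); // left
console.log(getStyleValue(source, "color"));      // null
```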
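// Requires: using System; using System.Text.RegularExpressions;
// Group 1 captures the property name, group 2 its value (everything up to the next semicolon).
var regex = new Regex(@"([\w-]+)\s*:\s*([^;]+)");
var match = regex.Match("font-size:11pt;font-color:red;text-align:left;");
while (match.Success)
{
    var key = match.Groups[1].Value;
    var value = match.Groups[2].Value;
    Console.WriteLine("{0} : {1}", key, value);
    match = match.NextMatch();
}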
Edit: This is not supposed to be a 'complete' solution. It probably does the job for 80% of cases, and as ever the last 20% would be orders of magnitude more expensive ;-)