I have been studying patterns for a while now and I see it's not an easy thing to understand.
I have been trying to force the input field to accept at least 5 digits at the beginning and at least 3 letters.
I have gotten the pattern for at least 5 digits at the beginning:
pattern="(?=\d{5}).*"
How do I include AT LEAST 3 LETTERS?
I'm trying to cleanse a data set from erroneous phone number entries. Having trouble making the regular expression for the filter in MySQL.
The structure is the following:
First digit is in 2-9
Second and third digits can be any numeral except they may not be the same number
Fourth digit is in 2-9
Fifth and sixth digits can be any numeral except '11'
I've landed on a few rather elaborate regular expressions which didn't quite work, but I'm sure there is a simpler approach.
A "valid" number might look like:
2028658680
7137038891
My filter usually misses cases such as:
6778914351
7777777777
6178116678
Note that these numbers are completely made up.
This is possible, but it will be long and ugly. With a more robust regex engine you can do lookaround and even conditional statements, but MySQL doesn't support such things as far as I know.
^[2-9](?:0[1-9]|1[02-9]|2[013-9]|3[0-24-9]|4[0-35-9]|5[0-46-9]|6[0-57-9]|7[0-689]|8[0-79]|9[0-8])[2-9](?:1[02-9]|[02-9]1|[02-9]{2})[0-9]{4}$
https://regex101.com/r/qPuS5W/1
Explanation:
[2-9] First digit is any number from 2 to 9.
(?:0[1-9]|1[02-9]|2[013-9]|3[0-24-9]|4[0-35-9]|5[0-46-9]|6[0-57-9]|7[0-689]|8[0-79]|9[0-8]) Non capturing group that contains 10 alternatives starting with each number 0 to 9 followed by any number except that number.
(?:1[02-9]|[02-9]1|[02-9]{2}) Non capturing group that matches either 1 followed by a number that isn't 1, a number that isn't 1 followed by 1, or two numbers that aren't 1.
[0-9]{4} 4 of any number.
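As a quick sanity check, the pattern above can be run against the sample numbers from the question. This is only a sketch in Python (whose `re` module accepts the same syntax); the names `PHONE_RE` and `is_valid` are made up for illustration, and whether a given MySQL version accepts `(?:...)` groups is not something this check covers.

```python
import re

# The pattern from the answer, split across lines only for readability.
PHONE_RE = re.compile(
    r"^[2-9]"                                   # first digit: 2-9
    r"(?:0[1-9]|1[02-9]|2[013-9]|3[0-24-9]|4[0-35-9]"
    r"|5[0-46-9]|6[0-57-9]|7[0-689]|8[0-79]|9[0-8])"  # digits 2-3 differ
    r"[2-9]"                                    # fourth digit: 2-9
    r"(?:1[02-9]|[02-9]1|[02-9]{2})"            # digits 5-6: anything but 11
    r"[0-9]{4}$"                                # last four digits
)

def is_valid(number: str) -> bool:
    """Return True if the number satisfies all four rules from the question."""
    return PHONE_RE.match(number) is not None

valid = ["2028658680", "7137038891"]
invalid = ["6778914351", "7777777777", "6178116678"]

for n in valid + invalid:
    print(n, is_valid(n))
```

The two "valid" samples match, and the three numbers the original filter missed are rejected.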
I am trying to write one single formula to identify all the patterns in a column/field. For example: Below are the five different patterns
AG 5643 895468 UWEB
7546 695321 IJJK
PE 45612384
8642567921
16724385
Formula for
First pattern: contains a 4-digit number followed by a 6-digit number
'*[0-9][0-9][0-9][0-9] [0-9][0-9][0-9][0-9][0-9][0-9] *' This is not working. Can we specify the length? Something like [0-9]{4} for a 4-digit number?
The formula for the first pattern should also pick up the second one.
3rd one: the first 2 characters are letters, followed by an 8- or 10-digit number
4th one: a 10-digit number
5th one: an 8-digit number
Thanks in advance!
If you're working in MySQL you can use regular expressions with the RLIKE filter operator.
For example, WHERE text RLIKE '[0-9]{8}' finds all the rows with any consecutive sequence of eight digits in them anywhere. (http://sqlfiddle.com/#!9/44996/1/0)
WHERE text RLIKE '^[0-9]{8}$' finds the rows consisting of nothing but an eight-digit sequence. (http://sqlfiddle.com/#!9/44996/2/0)
WHERE text RLIKE '^[0-9A-Z]{2} ' finds the rows starting with two letters or digits and then a space. (http://sqlfiddle.com/#!9/44996/3/0)
You get the idea. Regular expressions have a lot of power to them, generally beyond the scope of a SO answer to explain. Beware, though. This is a common saying: if you solve a problem with a regular expression, now you have two problems. You need to be careful with them.
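To try the idea outside the database, here is a rough Python sketch that sorts the five sample values from the question into their patterns. The function name `classify` and the labels are invented for this example, and it assumes a single space separates the parts of each value. The same regular expressions could be moved into RLIKE conditions, except that `\b` and `(?:...)` may need to be rewritten depending on the MySQL version.

```python
import re

# (label, pattern) pairs, checked in order; labels are illustrative only.
PATTERNS = [
    ("4 digits + 6 digits", re.compile(r"\b[0-9]{4} [0-9]{6}\b")),
    ("2 letters + 8 or 10 digits",
     re.compile(r"^[A-Z]{2} (?:[0-9]{10}|[0-9]{8})$")),
    ("10 digits", re.compile(r"^[0-9]{10}$")),
    ("8 digits", re.compile(r"^[0-9]{8}$")),
]

def classify(value: str) -> str:
    """Return the label of the first pattern found in value."""
    for label, pattern in PATTERNS:
        if pattern.search(value):
            return label
    return "unknown"

samples = [
    "AG 5643 895468 UWEB",
    "7546 695321 IJJK",
    "PE 45612384",
    "8642567921",
    "16724385",
]

for s in samples:
    print(f"{s!r}: {classify(s)}")
```

The first pattern is deliberately not anchored, so it picks up both the first and the second sample, as the question asks.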
I am trying to reverse engineer an algorithm used to generate a check digit.
Numbers are 8 digits long and the last digit is the check digit. I have thousands of valid numbers to test it on.
I have tried several standard algorithms but came up with nothing.
Here are some examples of valid numbers:
3482145 6
3482146 4
3482147 2
3482148 3
3482149 9
3482150 1
3482151 0
3482152 8
3482153 6
3482154 4
3482155 2
3482156 3
3482157 9
3482158 7
3482159 5
3482160 8
3482161 6
Is it possible to calculate this? Any ideas?
The amount of data you provided is insufficient to adequately assess the algorithm. The only thing I can see right now is that the check-digit sequence 64239xx8 appears twice, and the digit that follows it is 6 both times.
Not an actual answer, I'm afraid, but StackOverflow does not yet allow me to leave comments.
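The algorithm is this: multiply each of the first seven digits by the matching coefficient, add up the products, and take the sum modulo 11.
coef[]={4,2,1,6,3,7,9}
modulus 11
Case 10->0 (a remainder of 10 gives check digit 0)
Case 0->3 (a remainder of 0 gives check digit 3)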
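The rule above can be checked against the numbers listed in the question. This is a small Python sketch; `check_digit` and `SAMPLES` are names chosen here, and the reading of the two special cases (remainder 10 becomes 0, remainder 0 becomes 3) is taken from the answer as written.

```python
COEF = (4, 2, 1, 6, 3, 7, 9)  # one weight per digit, left to right

def check_digit(body: str) -> int:
    """Compute the check digit for a 7-digit body using the weights above."""
    if len(body) != len(COEF) or not body.isdigit():
        raise ValueError("body must be exactly 7 digits")
    remainder = sum(int(d) * c for d, c in zip(body, COEF)) % 11
    if remainder == 10:
        return 0
    if remainder == 0:
        return 3
    return remainder

# (body, check digit) pairs copied from the question.
SAMPLES = [
    ("3482145", 6), ("3482146", 4), ("3482147", 2), ("3482148", 3),
    ("3482149", 9), ("3482150", 1), ("3482151", 0), ("3482152", 8),
    ("3482153", 6), ("3482154", 4), ("3482155", 2), ("3482156", 3),
    ("3482157", 9), ("3482158", 7), ("3482159", 5), ("3482160", 8),
    ("3482161", 6),
]

mismatches = [(b, c) for b, c in SAMPLES if check_digit(b) != c]
print("mismatches:", mismatches)
```

For example, 3482145 gives 12 + 8 + 8 + 12 + 3 + 28 + 45 = 116, and 116 mod 11 = 6, which is the listed check digit.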
Is it just a coincidence that hexadecimal 0xaaaaaaaa represents a binary number with the even positions set to 1?
Similarly, something as elegant as 0x55555555 represents a binary number with the odd positions set to 1?
The binary representation of 5 is 0101. So 0x55555555 has 16 ones and 16 zeros, and the ones and zeros alternate. Similarly, 0x33333333 has 16 ones and 16 zeros, with two consecutive ones and two consecutive zeros alternating.
Nothing special about those numbers per se, other than the fact that their corresponding bit patterns are useful.
I think the key realization here is that it's super easy to come up with a compact hex number to represent any longer bit pattern (even easier if it's repeating), right off the top of your head.
Why? Because it's trivial to convert from hex-to-binary or binary-to-hex - every four bits of the pattern can be neatly represented by one hex digit.
So let's say I wanted this 16-bit mask: 1110111011101110. This is 1110 repeated 4 times, so it's just some hex digit, 4 times. Since 1110 is 14 in decimal, that's gonna be "E", so our mask would be: 0xEEEE.
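The four-bits-per-hex-digit correspondence is easy to check in code. Here is a minimal Python sketch; the helper names `mask_from_bits` and `bits_of` are made up for this example.

```python
def mask_from_bits(bits: str) -> str:
    """Turn a string of 0s and 1s into an uppercase hex literal."""
    return f"0x{int(bits, 2):X}"

def bits_of(value: int, width: int) -> str:
    """Render value as a zero-padded binary string of the given width."""
    return format(value, f"0{width}b")

print(mask_from_bits("1110" * 4))   # the 16-bit mask from the example
print(bits_of(0xAAAAAAAA, 32))      # 1010 repeated
print(bits_of(0x55555555, 32))      # 0101 repeated
print(bits_of(0x33333333, 32))      # 0011 repeated
```

Each hex digit expands to exactly one group of four bits, which is why repeating bit patterns come out as repeating hex digits.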
I was reading an article on binary numbers and it had some practice problems at the end, but it didn't give the solutions. The last one is "How many bits are required to represent the alphabet?". Can someone tell me the answer to that question and briefly explain why?
Thanks.
You would only need 5 bits because you are counting to 26 (if we take only upper or lowercase letters). 5 bits will count up to 31, so you've actually got more space than you need. You can't use 4 because that only counts to 15.
If you want both upper and lowercase then 6 bits is your answer - 6 bits will happily count to 63, while your double alphabet has (2 * 26 = 52) characters, again leaving plenty of headroom.
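The counting argument can be written down directly. This is a short Python sketch, assuming a fixed-width code that gives every symbol its own value; `bits_needed` is a name invented here.

```python
def bits_needed(symbols: int) -> int:
    """Smallest number of bits that can give each symbol a distinct value."""
    if symbols < 1:
        raise ValueError("need at least one symbol")
    # The largest code used is symbols - 1; its bit length is the answer.
    return max(1, (symbols - 1).bit_length())

for n in (26, 52):
    b = bits_needed(n)
    print(f"{n} symbols: {b} bits, which allow {2 ** b} distinct values")
```

With 26 letters this gives 5 bits (32 values), and with 52 letters (upper and lowercase) it gives 6 bits (64 values).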
It depends on your definition of alphabet. If you want to represent one character from the 26-letter Roman alphabet (A-Z), then you need log2(26) = 4.7 bits. Obviously, in practice, you'll need 5 bits.
However, given an infinite stream of characters, you could theoretically come up with an encoding scheme that got close to 4.7 bits (there just won't be a one-to-one mapping between individual characters and bit vectors any more).
If you're talking about representing actual human language, then you can get away with a far lower number than this (in the region of 1.5 bits/character), due to redundancy. But that's too complicated to get into in a single post here... (Google keywords are "entropy", and "information content").
There are 26 letters in the alphabet, so 5 bits (2^5 = 32 values) is the minimum word length that can contain all the letters.
How direct does the representation need to be? If you need 1:1 with no translation layer, then 5 bits will do. But if a translation layer is an option, then you can get away with less. Morse code, for example, can do it in 3 bits. :)