I've created a regex for checking a date format (01-01-0000 to 31-12-9999).
An example regex I tried works on my page, so something must be wrong with my own regex, yet when I test it in a debugger (regexr) it works just fine.
What am I missing?
([0]{1}[1-9]{1}|[1-2]{1}[0-9]{1}|[3]{1}[0-1]{1})(\-)([0]{1}[1-9]{1}|[1]{1}[0-2]{1})(\-)\d{4}
New regex after edit:
(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-\d{4}
I use an HTML input of type text and put the regex in pattern="my pattern".
Thanks in advance (:
Edit: Fixed the regex according to Casimir et Hippolyte's comment, and now it works.
Your regex looks OK; at least it matches both of your sample dates (tested on regex101.com).
You can simplify it a little:
No need for [...] around a single char (e.g. change [0] to 0).
No need for capturing groups around a dash (e.g. change (-) to -).
It is strange that you used capturing groups for day and month but not
for the year field (I added one in the example below).
So try the following regex:
(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-(\d{4})
It is, however, not clear whether you really need capturing groups.
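As a quick way to check the simplified pattern outside the browser, here is a minimal Python sketch. The function name parse_date is made up for illustration. It uses fullmatch because the HTML pattern attribute also has to match the whole value. Note that the pattern only checks the shape of the date, so something like 31-02-2000 still passes.

```python
import re

# Pattern from the answer above: day, month, and year each in a capturing group.
DATE_RE = re.compile(r"(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-(\d{4})")


def parse_date(text):
    """Return (day, month, year) as strings if text looks like DD-MM-YYYY, else None.

    parse_date is a hypothetical helper name, not part of the original question.
    """
    # fullmatch requires the entire string to match, as the pattern attribute does.
    match = DATE_RE.fullmatch(text)
    return match.groups() if match else None


print(parse_date("31-12-9999"))  # a valid-looking date
print(parse_date("32-01-2000"))  # day out of range
print(parse_date("31-02-2000"))  # right shape, but not a real calendar date
```

If real calendar validation matters, the three groups could be handed to a date library afterwards.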
I created this question yesterday
I've since realised there are actually a few other bits of data that cause issues with the solutions I received, so I thought it best to make a new question.
Take the following example data;
"87",0000,0767,"078",0785,"0723",23487, "061 904 5284","17\/10\/2016","some.name.789#hotmail.com"
Using the accepted solution from above (?<!")(\b\d+\b)(?!")
The middle number of the date string, between the two \/, ends up wrapped; the quoted number containing spaces breaks, and so does the email address.
The issues can be seen here: https://regex101.com/r/qVQYA7/6
My Solution
The following does seem to work for me, however it seems a bit messy. I have a feeling there's a much more succinct way to achieve the same result;
,(?<!("|\/|\\))(\b\d+\b)(?!("|\/|\\|( \d)))
Replace with: ,"$2"
https://regex101.com/r/qVQYA7/5
EDIT:
#Federico this screenshot shows that spaces before or after commas break the replace;
Having read both of your questions, what I understand is that you want to wrap in double quotes the numbers that aren't quoted yet, so for this I can come up with a simple regex like this:
(?<=,)(\d+)(?=,)
With the replacement string: "$1"
Update: since you updated the question, here is an updated answer. To tolerate spaces around the numbers, you can use this regex instead:
(?<=,)\s*(\d+)\s*(?=,)
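To see what the updated pattern does to the sample data from the question, here is a small Python sketch. Python writes the replacement as \1 rather than $1, and the variable names are only for illustration. One thing it makes visible: because both lookarounds require a comma, an unquoted number in the very first or very last field would be left alone.

```python
import re

# Sample row from the question, kept verbatim (raw string so the \/ survives).
data = r'"87",0000,0767,"078",0785,"0723",23487, "061 904 5284","17\/10\/2016","some.name.789#hotmail.com"'

# Updated pattern from the answer: a run of digits, optionally padded with
# whitespace, sitting directly between two commas.
NUMBER_BETWEEN_COMMAS = re.compile(r'(?<=,)\s*(\d+)\s*(?=,)')

# Wrap each bare number in double quotes; already quoted fields are untouched.
quoted = NUMBER_BETWEEN_COMMAS.sub(r'"\1"', data)
print(quoted)
```

The date, the phone number with spaces, and the email address stay as they are, since none of them is a pure digit run between two commas.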
I need to write a regular expression to catch the following things in bold
class="something_A211"
style="width:380px;margin-top: 20px;"
I have no idea how to write it, can someone help me?
I need this because, in an HTML file, I have to replace these (with Notepad++) with nothing, so that I am left with a clean < tr > or < td > or anything else.
Thank you
You can use a regex like this to capture the content:
((?:class|style)=".*?")
However, if you just want to match and delete them, you can drop the capturing group:
(?:class|style)=".*?"
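Outside Notepad++, the same idea can be tried in Python. This is only a sketch: the helper name strip_attrs and the sample cell are made up, and the leading \s* is my own addition so the space in front of each attribute is removed too (the pattern above would leave it behind).

```python
import re

# Same alternation as above, plus an optional run of leading whitespace
# (an addition for this sketch) so no stray spaces remain in the tag.
ATTR_RE = re.compile(r'\s*(?:class|style)=".*?"')


def strip_attrs(html):
    """Remove class="..." and style="..." attributes (hypothetical helper)."""
    return ATTR_RE.sub("", html)


cell = '<td class="something_A211" style="width:380px;margin-top: 20px;">cell</td>'
print(strip_attrs(cell))
```

In Notepad++ the equivalent would be putting the pattern in the "Find what" box with regular expression mode on and leaving "Replace with" empty.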
For all constructions like something="data", you can use this.
[^\s]*?\=\".*?\"
https://regex101.com/r/oQ5dR0/1
The link shows you what everything does.
To explain it briefly: a non-space character can occur any number of times before the "=", then come the quotes and the data inside them.
The question mark in .*? (any character, any number of times) makes the match lazy, so it takes as few characters as possible and stops at the first closing quote instead of running on to a later quote further along the line.
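The difference the question mark makes is easy to see side by side. A minimal Python sketch, with a made-up sample tag: the first pattern is the one from this answer (minus the unnecessary backslashes), the second is the same thing with a greedy .* for comparison.

```python
import re

tag = '<td class="a" style="b">'

# Lazy: .*? stops at the first closing quote, so each attribute matches on its own.
lazy = re.findall(r'[^\s]*?=".*?"', tag)

# Greedy: .* runs to the last quote on the line, swallowing both attributes.
greedy = re.findall(r'[^\s]*?=".*"', tag)

print(lazy)
print(greedy)
```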
I'm trying to filter using regex in MySQL.
The field is a text field and I want to find all that match 'MD' or similar ('M.D.', 'M. D.', 'DDS, M.D.' etc.).
I do not want to accept those that contain M and D as a part of another acronym (e.g., 'DMD'). However 'DMD, M.D.' I would want to find.
Apologies if this is a simple task - I read through some regex tutorials and couldn't figure this out! Thanks.
Update:
With help from the suggestions I arrived at the following solution:
(\s|^)M\.?\s*D\.?
which works for all of my cases. The quotes in my questions were to indicate it was a string, they are not a part of the string.
You can use a regex like this:
\b(M\.?\s*D\.?|D\.?\s*D\.?\s*S\.?)
If I have understood your requirement:
'([^'.]*[ ,]*M[. ]*D[. ]*)'
This looks for MD preceded by a space, a comma, or ', with the letters separated by zero or more dots and spaces, and followed by '.
It matches all the content between the ' marks.
test: https://regex101.com/r/oV2kV8/2
In the end I found this solution works:
(\s|^)M\.?\s*D\.?(\s|$)
This allows for the 'MD' to be at the start or after another credential and to have spaces or periods or nothing between the letters.
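For anyone who wants to sanity-check this pattern against the cases in the question, here is a rough Python sketch. The helper name has_md is invented, and Python's regex dialect is not the same as MySQL's, so treat it as a check of the logic only. It also shows one limitation: a credential followed directly by a comma, such as 'M.D., PhD', is not matched, because the pattern expects whitespace or the end of the string after it.

```python
import re

# Final pattern from the question: M and D, with optional periods and spaces,
# bounded by whitespace or the start/end of the string.
MD_RE = re.compile(r"(\s|^)M\.?\s*D\.?(\s|$)")


def has_md(credentials):
    """Return True if the text contains a standalone MD credential (hypothetical helper)."""
    return MD_RE.search(credentials) is not None


for text in ["MD", "M.D.", "M. D.", "DDS, M.D.", "DMD", "DMD, M.D.", "M.D., PhD"]:
    print(text, has_md(text))
```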
I have a page full of html data that I am scraping from.
There is one occurrence of a "gross amount" field that I am trying to extract.
<h3 id="cart_trans_detail_ach_grossamount_lbl">Gross Amount</h3>
<p id="cart_trans_detail_ach_grossamount_txt">$76.99 USD</p>
All I want to get from this is $76.99 USD
I have tried using Regex Buddy to put one together, but regex is not my strong suit. Even something simple like this: <p id="cart_trans_detail_ach_grossamount_txt">(.*)</p> matches the whole string and not just what is between the tags.
Any ideas?
First of all, using a regex to parse HTML is not recommended; you should use an HTML/XML parsing library instead. But if you really feel the need to use a regular expression for that, what you are missing is the lazy (ungreedy) modifier ? after your *, so that your regex stops at the first </p> it finds.
<p id="cart_trans_detail_ach_grossamount_txt">(.*?)</p>
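Here is a small Python sketch of the lazy version applied to the snippet from the question. The second paragraph in the sample is something I added purely to show why lazy matching matters when more than one </p> sits on the same line; the variable names are illustrative.

```python
import re

# Snippet from the question, plus one extra <p> (an assumption for this demo)
# on the same line, so a greedy match would have something to overshoot into.
html = (
    '<h3 id="cart_trans_detail_ach_grossamount_lbl">Gross Amount</h3>'
    '<p id="cart_trans_detail_ach_grossamount_txt">$76.99 USD</p>'
    '<p id="other">unrelated</p>'
)

# Lazy group: stops at the first </p> after the opening tag.
lazy = re.search(r'<p id="cart_trans_detail_ach_grossamount_txt">(.*?)</p>', html)

# Greedy group, for comparison: runs to the last </p> on the line.
greedy = re.search(r'<p id="cart_trans_detail_ach_grossamount_txt">(.*)</p>', html)

print(lazy.group(1))
print(greedy.group(1))
```

Either way, group(1) is what holds the text between the tags; group(0) is always the whole match including the tags.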
Try this pattern:
(?<=grossamount_txt">\$)(\d*\.?\d*) USD
It works in Python and PHP, and it should also work in Java.
group(1) gives you back only the amount, without anything else.
The first set of parentheses is a positive lookbehind, which checks that the amount is preceded by the string grossamount_txt">$.
The second set of parentheses then tries to match a numeric amount, possibly written as an integer part and a decimal part.
Finally, the last part of the pattern is the literal " USD".
You can test it here
https://www.regex101.com/#python
where you can also find some more detailed explanation.
Here is how lookaround works:
http://www.regular-expressions.info/lookaround.html
Hope it helps.
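A short Python sketch of that lookbehind pattern run against the snippet from the question, in case it helps to see which part lands in which group (variable names are only illustrative):

```python
import re

html = '<p id="cart_trans_detail_ach_grossamount_txt">$76.99 USD</p>'

# The lookbehind asserts the prefix without consuming it, so the match itself
# starts at the digits; group 1 holds the number, and " USD" follows it.
AMOUNT_RE = re.compile(r'(?<=grossamount_txt">\$)(\d*\.?\d*) USD')

match = AMOUNT_RE.search(html)
print(match.group(0))  # the whole match: number plus " USD"
print(match.group(1))  # only the number
```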
I'm conducting a mass search of files in Notepad++ and I need to determine if there are no values between a set of tags (i.e. <author></author>).
".*?" will search for 0 or more characters (well, most), which is fine. But I'm looking for a set of tags with at least one character between them.
".+?" is similar to the above and does work in notepad++.
I tried the following, which was unsuccessful:
<author>.{0}?</author>
Thank you for any help.
Since you look for something that doesn't exist you don't have to make it that complicated. Simply searching for <author></author> would do the trick, wouldn't it? If you want to include space-characters as "nothing" you could modify it to the following:
<author>\s*?</author>
Output:
<author></author> Match
<author> </author> Match
<author>something</author> No match
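The table above can be reproduced with a few lines of Python, if you want to try other inputs. This is just a sketch with invented variable names; Notepad++ uses a different regex engine, but for a pattern this simple the behaviour should be the same.

```python
import re

# Pattern from above: an author element containing nothing but optional whitespace.
EMPTY_AUTHOR_RE = re.compile(r"<author>\s*?</author>")

samples = [
    "<author></author>",
    "<author> </author>",
    "<author>something</author>",
]

# Map each sample line to whether the empty-element pattern finds a match in it.
results = {line: bool(EMPTY_AUTHOR_RE.search(line)) for line in samples}

for line, is_empty in results.items():
    print(line, "Match" if is_empty else "No match")
```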
I don't understand why you are using the "?" operator; ".+" should yield the result you need.