Regex find two characters in order, between others, ignoring punctuation - mysql

I'm trying to filter using a regex in MySQL.
The field is a text field and I want to find all values that match 'MD' or similar ('M.D.', 'M. D.', 'DDS, M.D.', etc.).
I do not want to accept values that contain M and D as part of another acronym (e.g., 'DMD'). However, I would want to find 'DMD, M.D.'.
Apologies if this is a simple task - I read through some regex tutorials and couldn't figure this out! Thanks.
Update:
With help from the suggestions I arrived at the following solution:
(\s|^)M\.?\s*D\.?
which works for all of my cases. The quotes in my question were only there to indicate a string; they are not part of the string.

You can use a regex like this:
\b(M\.?\s*D\.?|D\.?\s*D\.?\s*S\.?)

If I have understood your requirement:
'([^'.]*[ ,]*M[. ]*D[. ]*)'
This looks for M and D separated by zero or more dots and spaces, optionally preceded by other text, spaces or commas, with the whole value enclosed in ' marks.
It matches all the contents between the ' marks.
test: https://regex101.com/r/oV2kV8/2

In the end I found this solution works:
(\s|^)M\.?\s*D\.?(\s|$)
This allows for the 'MD' to be at the start or after another credential and to have spaces or periods or nothing between the letters.
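
A quick way to check the final pattern against the cases from the question is sketched below. It uses Python's re module purely for illustration; MySQL's regex dialect differs between versions, so this only checks the logic of the pattern, and the helper name has_md is made up for the example. Note that the trailing (\s|$) means a value such as 'MD, PhD', where a comma follows directly, is not matched.

```python
import re

# Pattern copied from the post above; Python's re is an assumption here,
# used only to exercise the logic outside MySQL.
MD_PATTERN = re.compile(r"(\s|^)M\.?\s*D\.?(\s|$)")


def has_md(credentials: str) -> bool:
    """Return True if the text contains a standalone MD credential."""
    return MD_PATTERN.search(credentials) is not None


samples = ["MD", "M.D.", "M. D.", "DDS, M.D.", "DMD", "DMD, M.D.", "MD, PhD"]
for sample in samples:
    # 'DMD' is rejected because the M is not preceded by whitespace or the start.
    # 'MD, PhD' is rejected because a comma, not whitespace or the end, follows.
    print(f"{sample!r}: {has_md(sample)}")
```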

Related

How to replace a word in html only if some conditions are met with regex

I'm trying to replace every occurrence of a word in a text (an HTML file), together with everything around it up to a " or a ' or a ( when looking backward, or a ) when looking forward, using a regex in Node.js.
My problem is that when I have two words to replace, say 3.png and 13.png, 13.png is also changed when matching 3.png. When I then come to replace 13.png in my text, it is no longer there because it was already replaced by the earlier 3.png match.
My ideal solution would be :
If the matched pattern contains a /,
then the part after the / must match exactly, and everything around it (slash included) is replaced up to, but not including, a " or a ', or a ( when looking backward or a ) when looking forward.
Otherwise, it must be an exact match between "" or '' or ().
You can find here a regex101 example
Currently I'm sorting my words to search like so:
imgjson.sort((a, b) => b.name.length - a.name.length);
in order to replace the longest words first. This solves my problem, because 13.png is replaced before 3.png, but I would like to know whether this can be done with a JS regex alone.
Thanks a lot for your reply and time!
As @PushpeshKumarRajwanshi said, use \b.
If you want to test your regex and learn more about it, you can use https://regex101.com/.
In the bottom-right corner you can find all the special characters and functions of regex you may need to use.
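
As a rough sketch of the \b suggestion: wrapping the escaped file name in word boundaries stops 3.png from matching inside 13.png, so the order of replacement no longer matters. The example is in Python rather than Node.js, the function name replace_exact is hypothetical, and it only covers the exact-match part of the question, not the "replace everything up to the delimiter" part. It also assumes the name starts and ends with a word character, otherwise \b does not behave this way.

```python
import re


def replace_exact(text: str, name: str, replacement: str) -> str:
    """Replace name only where it is not glued to other word characters."""
    # re.escape keeps the dot in "3.png" literal instead of "any character".
    pattern = r"\b" + re.escape(name) + r"\b"
    # A function replacement avoids backslash processing of the new text.
    return re.sub(pattern, lambda _match: replacement, text)


html = '<img src="3.png"> <img src="13.png"> url(img/3.png)'
# Only the standalone 3.png occurrences change; 13.png is left alone.
print(replace_exact(html, "3.png", "c.png"))
```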

My input pattern doesn't work

I've created a regex for checking a date format (01-01-0000 to 31-12-9999).
I tried an example regex, and it works, so there is something wrong with my regex, but when I try it in a debugger (regexr) it works just fine.
What am I missing?
([0]{1}[1-9]{1}|[1-2]{1}[0-9]{1}|[3]{1}[0-1]{1})(\-)([0]{1}[1-9]{1}|[1]{1}[0-2]{1})(\-)\d{4}
New regex after edit:
(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-\d{4}
I use an HTML input of type text, and put the regex in pattern="my pattern".
Thanks in advance (:
Edit: Fixed the regex according to Casimir et Hippolyte's comment, and now it works.
Your regex looks OK; at least it captures both of your sample dates (tested on regex101.com).
You can simplify it a little:
No need for [...] around a single char (e.g. change [0] to 0).
No need for capturing groups around a dash (e.g. change (-) to -).
It is strange that you used capturing groups for day and month, but you didn't for the year field (I added it in the example below).
So try the following regex:
(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-(\d{4})
It is, however, not clear whether you really need capturing groups.
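
To illustrate what the three capturing groups give you, here is a small sketch in Python (not the HTML pattern attribute itself). It uses fullmatch because the pattern attribute also has to match the whole input value. The function name parse_date is an assumption for the example, and the regex only checks the shape of the date, so something like 31-02-2000 still passes.

```python
import re

# Simplified pattern from the answer above, with a group for each field.
DATE_PATTERN = re.compile(r"(0[1-9]|[12][0-9]|3[01])-(0[1-9]|1[0-2])-(\d{4})")


def parse_date(value: str):
    """Return (day, month, year) as integers, or None if the shape is wrong."""
    match = DATE_PATTERN.fullmatch(value)
    if match is None:
        return None
    day, month, year = match.groups()
    return int(day), int(month), int(year)


for candidate in ["01-01-0000", "31-12-9999", "32-01-2000", "1-1-2000"]:
    print(candidate, "->", parse_date(candidate))
```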

Wrap integers in quotes from json data

I created this question yesterday
I've since realised there are actually a few other bits of data that cause issues with the solutions I received. Hence, I thought it best to make a new question.
Take the following example data;
"87",0000,0767,"078",0785,"0723",23487, "061 904 5284","17\/10\/2016","some.name.789#hotmail.com"
Using the accepted solution from above: (?<!")(\b\d+\b)(?!")
The middle number of the date string (between the two \/) ends up wrapped in quotes, and both the quoted number containing spaces and the email address break as well.
The issues can be seen here: https://regex101.com/r/qVQYA7/6
My Solution
The following does seem to work for me, however it seems a bit messy. I have a feeling there's a much more succinct way to achieve the same result;
,(?<!("|\/|\\))(\b\d+\b)(?!("|\/|\\|( \d)))
Replace with: ,"$2"
https://regex101.com/r/qVQYA7/5
EDIT:
@Federico this screenshot shows that spaces before or after commas break the replace;
From reading both of your questions, what I understand is that you want to wrap in double quotes the numbers that aren't already quoted. For this I can come up with a simple regex like this:
(?<=,)(\d+)(?=,)
With the replacement string: "$1"
Update: now that you have updated the question, here is an updated answer. You can use this regex instead:
(?<=,)\s*(\d+)\s*(?=,)
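
The updated regex can be tried out with a short Python sketch like the one below. Python writes the replacement group as \1 instead of $1, and the function name wrap_bare_integers is made up for the example. Two things follow from the pattern itself: whitespace around a wrapped number is dropped, and a bare number at the very start or end of the line is not touched because it lacks a comma on one side.

```python
import re

# Bare digits that sit between two commas, with optional surrounding spaces.
BARE_INTEGER = re.compile(r"(?<=,)\s*(\d+)\s*(?=,)")


def wrap_bare_integers(line: str) -> str:
    """Wrap unquoted integers between commas in double quotes."""
    return BARE_INTEGER.sub(r'"\1"', line)


# Sample data from the question above.
sample = (
    r'"87",0000,0767,"078",0785,"0723",23487, '
    r'"061 904 5284","17\/10\/2016","some.name.789#hotmail.com"'
)
print(wrap_bare_integers(sample))
```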

Regex all uppercase with special characters

I have a regex '^[A0-Z9]+$' that works until it reaches strings with 'special' characters like a period or dash.
List:
UPPER
lower
UPPER lower
lower UPPER
TEST
test
UPPER2.2-1
UPPER2
Gives:
UPPER
TEST
UPPER2
How do I get the regex to ignore non-alphanumeric characters as well, so that it also includes UPPER2.2-1?
I have a link here to show it 'real-time': http://www.rubular.com/r/ev23M7G1O3
This is for MySQL REGEX
EDIT: I didn't specify that I wanted all non-alphanumeric characters (including spaces), but the help from others here led me to this: '^[A-Z-0-9[:punct:][:space:]]+$'. Is there anything wrong with this?
Try
'^[A-Z0-9.-]+$'
You just need to add the special characters to the character class, optionally escaping them.
Additionally, if you choose not to escape the -, be aware that it should be placed at the start or the end of the character class so that it cannot be interpreted as delimiting a range.
To your updated question, if you want all non-whitespace, try using a character class such as:
^[^ ]+$
which will match everything except for a space.
If instead what you wanted is all non-whitespace and non-lowercase, you likely will want to use:
^[^ a-z]+$
The 'trick' used here is adding a caret symbol after the opening [ of the character class. This negates the class, so it matches any character that is not listed.
Following the pattern, we can also apply this 'trick' to get everything but lowercase letters like this:
^[^a-z]+$
I'm not really sure which of the 3 above you want, but if nothing else, this ought to serve as a good example of what you can do with character classes.
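
To see how these options differ on the list from the question, here is a small comparison sketched in Python rather than MySQL. The dictionary labels and the function name matching are made up for the example, and fullmatch stands in for the ^...$ anchors.

```python
import re

# The list from the question.
ITEMS = [
    "UPPER", "lower", "UPPER lower", "lower UPPER",
    "TEST", "test", "UPPER2.2-1", "UPPER2",
]

# Character classes from the answer above, without the ^ and $ anchors.
PATTERNS = {
    "allow-list": r"[A-Z0-9.-]+",   # hyphen last, so it is a literal
    "no spaces": r"[^ ]+",
    "no lowercase": r"[^a-z]+",
}


def matching(pattern):
    """Return the items that the pattern matches from start to end."""
    return [item for item in ITEMS if re.fullmatch(pattern, item)]


for label, pattern in PATTERNS.items():
    print(label, "->", matching(pattern))
```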
I believe you are looking for (one?) uppercase-word match, where word is pretty much anything.
^[^a-z\s]+$
...or if you want to allow more words with spaces, then probably just
^[^a-z]+$
You just need to put in the . and -. In theory, you don't need to escape them because they are inside the brackets, but I like to escape anyway, to remind myself for the cases where I have to.
'^[A-Z0-9\.\-]+$'
Try regular expression as below:
'^[A-Z0-9\\.\\-]+$'

How do I check if values between html tags are blank or empty using regular expressions in notepad plus plus

I'm conducting a mass search of files in Notepad++ and I need to determine if there are no values between a set of tags (i.e. <author></author>).
".*?" will search for 0 or more characters (well, most), which is fine. But I'm looking for a set of tags with at least one character between them.
".+?" is similar to the above and does work in notepad++.
I tried the following, which was unsuccessful:
<author>.{0}?</author>
Thank you for any help.
Since you look for something that doesn't exist you don't have to make it that complicated. Simply searching for <author></author> would do the trick, wouldn't it? If you want to include space-characters as "nothing" you could modify it to the following:
<author>\s*?</author>
Output:
<author></author> Match
<author> </author> Match
<author>something</author> No match
I don't understand why you are using the "?" operator; ".+" should yield the result you need.
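
For a quick check of the empty-tag pattern outside Notepad++, the three lines from the output above can be run through a Python sketch like this. The function name is_empty_author is an assumption for the example, and Notepad++'s regex engine may differ in details from Python's.

```python
import re

# An <author> element containing nothing but optional whitespace.
EMPTY_AUTHOR = re.compile(r"<author>\s*</author>")


def is_empty_author(line: str) -> bool:
    """Return True if the line contains an empty or blank <author> element."""
    return EMPTY_AUTHOR.search(line) is not None


for line in ["<author></author>", "<author> </author>", "<author>something</author>"]:
    print(line, "->", "Match" if is_empty_author(line) else "No match")
```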