I'm building a field that should accept only whole numbers, so I added a regex validation like this: /^\d{1,3}$/. It validates whole-number entry and rejects decimals from .1 upward (e.g. 1.1 is invalid), but when I entered 1.0 it was accepted. Is there a regex that will also check .0?
^\d{1,3}(\.0)?$ accepts one-, two- or three-digit whole numbers, optionally followed by .0.
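To see which inputs that pattern lets through, here is a minimal sketch in Python. The function name is made up for illustration, and the original question is presumably about a browser-side validator, so treat this only as a way to try the pattern out.

```python
import re

# The answer's pattern: one to three digits, optionally followed by ".0".
WHOLE_NUMBER = re.compile(r"^\d{1,3}(\.0)?$")


def is_whole_number(text: str) -> bool:
    """Return True if text is 1-3 digits, optionally ending in '.0'."""
    # fullmatch avoids Python's "$" also matching just before a trailing newline.
    return WHOLE_NUMBER.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["7", "42", "999", "1.0", "1.1", "1000", "1."]:
        print(sample, is_whole_number(sample))
```

If .0 should be rejected instead, dropping the optional group (\.0)? brings you back to the original /^\d{1,3}$/.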
Yeah there is! I made this question to share my knowledge, Q&A style since I had a hard time finding it myself :)
Thanks to https://stackoverflow.com/a/67821482/1561441 (Barbaros Özhan, see comments) for pointing me in the right direction.
The answer is: look here and here
Correct me if I'm wrong, but wow: to my knowledge, a single .java file on GitHub, last committed to in 2017, currently holds the relevant parts of the official documentation of the JOLT syntax. I had to learn the syntax because I'm working with NiFi and its JoltTransformJSON processor (hence the SEO abuse in my question, so more people find the answer).
Here are some of the most relevant parts copied from https://github.com/bazaarvoice/jolt/blob/master/jolt-core/src/main/java/com/bazaarvoice/jolt/Shiftr.java and slightly edited. The documentation itself is more extensive and also shows examples.
'*' Wildcard
Valid only on the LHS ( input JSON keys ) side of a Shiftr Spec
The '*' wildcard can be used by itself or to match part of a key.
'&' Wildcard
Valid on the LHS (left hand side - input JSON keys) and RHS (output data path)
Means, dereference against a "path" to get a value and use that value as if it were a literal key.
The canonical form of the wildcard is "&(0,0)".
The first parameter is where in the input path to look for a value, and the second parameter is which part of the key to use (used with * key).
There are syntactic sugar versions of the wildcard; all of the following mean the same thing. Sugar: '&' = '&0' = '&(0)' = '&(0,0)'
The syntactic sugar versions are nice, as there are a set of data transforms that do not need to use the canonical form, eg if your input data does not have any "prefixed" keys.
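As a rough analogy only (this is plain Python, not Jolt, and not how Jolt is implemented): the two parameters of '&' can be pictured as "how many levels up the input path" and "which regex group of the key matched there", where group 0 is the whole key and group 1 is what the first '*' matched. The helper names and the sample keys "ratings" and "rating-primary" are made up for illustration.

```python
import re


def match_key(spec_key: str, input_key: str):
    """Match an input key against a spec key, treating each '*' as a capture group."""
    regex = "(.*)".join(re.escape(piece) for piece in spec_key.split("*"))
    return re.fullmatch(regex, input_key)


def dereference(path, levels_up: int = 0, part: int = 0) -> str:
    """Analogy for &(levels_up, part): pick a level of the path, then a part of its key."""
    # path[-1] is the current level, path[-2] is one level up, and so on.
    return path[-1 - levels_up].group(part)


# Hypothetical input path: {"ratings": {"rating-primary": ...}}
path = [
    match_key("ratings", "ratings"),
    match_key("rating-*", "rating-primary"),
]

if __name__ == "__main__":
    print(dereference(path))        # analogy for '&' = '&(0,0)'
    print(dereference(path, 0, 1))  # analogy for '&(0,1)'
    print(dereference(path, 1))     # analogy for '&1' = '&(1,0)'
```

Read this way, the sugar forms simply leave the trailing parameters at their default of 0.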
'$' Wildcard
Valid only on the LHS of the spec.
The existence of this wildcard is a reflection of the fact that the "data" of the input JSON, can be both in the "values" and the "keys" of the input JSON
The base case operation of Shiftr is to copy input JSON "values", thus we need a way to specify that we want to copy the input JSON "key" instead.
Thus '$' specifies that we want to use an input key, or input key derived value, as the data to be placed in the output JSON.
'$' has the same syntax as the '&' wildcard, and can be read as, dereference to get a value, and then use that value as the data to be output.
There are two cases where this is useful
when a "key" in the input JSON needs to be a "id" value in the output JSON, see the ' "$": "SecondaryRatings.&1.Id" ' example above.
you want to make a list of all the input keys.
'#' Wildcard
Valid both on the LHS and RHS, but has different behavior / format on either side.
The way to think of it is that it allows you to specify a "synthetic" value, aka a value not found in the input data.
On the RHS of the spec, # is only valid in the context of an array, like "[#2]".
What "[#2]" means is, go up the tree two levels and ask that node how many matches it has had, and then use that as an index in the arrays.
This means that, while Shiftr is doing its parallel tree walk of the input data and the spec, it tracks how many matches it has processed at each level of the spec tree.
This is useful if you want to take a JSON map and turn it into a JSON array, and you do not care about the order of the array.
On the LHS of the spec, # allows you to specify a hard coded String to be placed as a value in the output.
The initial use-case for this feature was to be able to process a Boolean input value, and if the value is boolean true write out the string "enabled". Note, this was possible before, but it required two Shiftr steps.
'@' Wildcard
Valid on both sides of the spec.
The basic '@' on the LHS.
This wildcard is necessary if you want to put both the input value and the input key somewhere in the output JSON.
Thus the '@' wildcard means "copy the value of the data at this level in the tree, to the output".
Advanced '@' sign wildcard
The format looks like "@(3,title)", where "3" means go up the tree 3 levels and then look up the key "title" and use the value at that key.
I would love to know if there is an alternative to JoltTransformJSON simply because I'm struggling a lot with understanding it (not coming from a programming background myself). When it works (thanks to all the help here) it does simplify things a lot!
Here are a few other sites that help:
https://intercom.help/godigibee/en/articles/4044359-transformer-getting-to-know-jolt
https://erbalvindersingh.medium.com/applying-jolttransform-on-json-object-array-and-fetching-specific-fields-48946870b4fc
https://cool-cheng.blogspot.com/2019/12/json-jolt-tutorial.html
I have the following text strings:
"Name":"John"}]
"Age":36
"Address":"ABC,PQR234[]/.,#ANYCHARACTERS"
"Gender":null
I need to get two groups (key value pair) from this such that the output would be only:
Key|Value
Name|John
Age|36
Address|ABC,PQR234[]/.,#ANYCHARACTERS
The requirement is to have a single regex to grab everything in the double quotes if the double quotes are present. If not, take the value without the quotes.
In our example above, 36 and null are the one without the quotes and they need to be captured as well.
I have tried a lot but have failed to do so.
UPDATE:
I don't know why I am getting down votes for this question. Yes this is JSON that I am trying to parse but there is a reason behind why I am doing this and not using any document parser.
I am supposed to use Talend for getting a dynamic JSON converted into Key Value Pair. What I mean by dynamic is the fields of the JSON can vary and hence I do not have a fixed schema and hence cannot use a document parser (which demands a fixed structure of JSON). I am devising a solution to get around this using Normalizer (on comma) and then extracting the key value pair which will be in double quotes using Regular Expressions. I tried many things on my own and since I am not an expert in Regular expressions, I have come here to get inputs.
If you know any better solution to this, I would be very happy to get your inputs.
How about this?
/"?([^\n"]*)"?:"?([^\n"]*)"?/
Explained in detail at:
https://regex101.com/r/UM0rl2/1/
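For anyone who wants to try it outside regex101, here is a small Python sketch that applies the same pattern to the four sample strings from the question. The function name is made up, and the sketch assumes each string holds exactly one key/value pair, as in the question.

```python
import re

# Same pattern as above: optional quotes around the key and around the value.
PAIR = re.compile(r'"?([^\n"]*)"?:"?([^\n"]*)"?')


def extract_pair(line: str):
    """Return (key, value) for the first key:value pair in line, or None."""
    match = PAIR.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


SAMPLES = [
    '"Name":"John"}]',
    '"Age":36',
    '"Address":"ABC,PQR234[]/.,#ANYCHARACTERS"',
    '"Gender":null',
]

if __name__ == "__main__":
    print("Key|Value")
    for sample in SAMPLES:
        key, value = extract_pair(sample)
        print(f"{key}|{value}")
```

One caveat: an unquoted value followed directly by brackets, such as "Age":36}], would be captured together with the trailing }], because the value group only stops at a double quote or a newline.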
I am trying to create a regex to validate usernames which should match the following :
Only one special char (._-) allowed and it must not be at the extremes of the string
The first character cannot be a number
All the other characters allowed are letters and numbers
The total length should be between 3 and 20 chars
This is for an HTML validation pattern, so sadly it must be one big regex.
So far this is what I've got:
^(?=(?![0-9])[A-Za-z0-9]+[._-]?[A-Za-z0-9]+).{3,20}
But the positive lookahead does not stop the rest of the string from containing further special characters, so more than one gets through, which is not what I want. And I don't know how to correct that.
You should split your regex into two parts (not two Expressions!) to make your life easier:
First, match the format the username needs to have:
^[a-zA-Z][a-zA-Z0-9]*[._-]?[a-zA-Z0-9]+$
Now, we just need to validate the length constraint. In order to not mess around with the already found pattern, you can use a non-consuming match that only validates the number of characters (it's literally a hack for creating an "and" pattern for your regular expression): (?=^.{3,20}$)
The regex will only try to match the valid format if the length constraint is matched. It is non-consuming, so after it is successful, the engine still is at the start of the string.
so, all together:
(?=^.{3,20}$)^[a-zA-Z][a-zA-Z0-9]*[._-]?[a-zA-Z0-9]+$
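A quick way to check the combined pattern against the four rules is a few lines of Python. This is only a sketch for experimenting; the function name is invented, and in the real form the same expression would go into the HTML pattern attribute.

```python
import re

# Length check as a lookahead, followed by the format check.
USERNAME = re.compile(r"(?=^.{3,20}$)^[a-zA-Z][a-zA-Z0-9]*[._-]?[a-zA-Z0-9]+$")


def is_valid_username(name: str) -> bool:
    """True if name is 3-20 chars, starts with a letter, has at most one inner . _ or -."""
    # fullmatch avoids Python's "$" also matching just before a trailing newline.
    return USERNAME.fullmatch(name) is not None


if __name__ == "__main__":
    for sample in ["abc", "a.b", "john_doe99", "ab", "1abc", ".abc", "abc.", "a.b.c"]:
        print(sample, is_valid_username(sample))
```

Because the lookahead consumes nothing, the format part still starts matching at the first character, which is why the two conditions behave like a logical "and".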
I think you need to use ? instead of +, so the special character is matched at most once.
^(?=(?![0-9])?[A-Za-z0-9]?[._-]?[A-Za-z0-9]+).{3,20}
How do I make a pattern for an email address that is valid only if it contains at least one character, followed by an @ sign, followed by at least one character, followed by a period (.), followed by at least "co"? (So "a@b.co" is an example of the "least valid" email address.)
You will need to use regex to validate the email address. I suggest you look here if you don't know about regular expressions. The expression for validating an email is this:
/^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$/
You will have to validate it using JavaScript regex. Check that out here.
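To get a feel for what the expression accepts before wiring it into a page, here is a small sketch. It is written in Python rather than JavaScript purely for experimenting, it assumes the separator in the pattern is the ordinary @ sign, and the function name is made up.

```python
import re

# Local part, "@", domain, a dot, then 2 to 4 lowercase letters.
EMAIL = re.compile(r"^[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}$")


def looks_like_email(address: str) -> bool:
    """True if address fits the simple lowercase pattern above."""
    return EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    for sample in ["a@b.co", "a@b.c", "@b.co", "a@.co", "A@B.CO"]:
        print(sample, looks_like_email(sample))
```

Note that the pattern is lowercase only and that {2,4} rejects endings longer than four letters, so add a case-insensitive flag or widen the range if you need those.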
I have a regular expression that I am using for client side HTML5 validation and I need to add a max length element to it. Here is my regular expression :
@pattern = @"^([a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)$"
How would I for example limit it to 50 characters?
EDIT : I need to check the max length in the same regular expression as I am using HTML5 validation which only currently allows checking against required and pattern attributes.
If you absolutely must use a regex, add a lookahead assertion at the start of the regex:
@pattern = @"^(?!.{51})([a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)$"
The (?!.{51}) asserts that it's impossible to match 51 characters starting from the beginning of the string, without actually consuming any of the characters, so they are still available for the actual regex match.
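The same trick can be sketched in Python to check the boundary at 50 and 51 characters. The helper names are invented for this example, the email body is the question's pattern with an ordinary @ as the separator (an assumption on my part), and the test addresses are made up.

```python
import re

# The question's email pattern without the outer anchors and group.
EMAIL_BODY = (
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?"
)


def with_max_length(body: str, max_len: int) -> str:
    """Wrap body so that strings longer than max_len characters never match."""
    # The negative lookahead fails as soon as max_len + 1 characters are available.
    return "^(?!.{" + str(max_len + 1) + "})(" + body + ")$"


LIMITED_EMAIL = re.compile(with_max_length(EMAIL_BODY, 50))


def is_valid(address: str) -> bool:
    return LIMITED_EMAIL.fullmatch(address) is not None


if __name__ == "__main__":
    fifty = "a" * 39 + "@example.co"      # 50 characters
    fifty_one = "a" * 40 + "@example.co"  # 51 characters
    print(len(fifty), is_valid(fifty))
    print(len(fifty_one), is_valid(fifty_one))
```

Building the lookahead from max_len + 1 keeps the limit in one place if it ever changes.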