Confusing RegEx results

why is the following regex:
"_id":"(.+?)"}\],"componentType":"(.+?)"
for this string:
"name":"in","_id":"a05d91a7-6be0-c252-08e9-bf94cc0be36e","value":"5.6"}],"_id":"e986915c-22db-429f-9fe7-ae2e2ddfa779","refId":"de9ff045-21ce-4833-af34-30f50c129840","failId":"8b723736-a391-fd7e-8d23-7cc72e568f48"},{"outputs":[{"metadata":{"label":{"value":"Output Integer","capco":"U"},"desc":{"value":"Output
Integer.","capco":"U"}},"name":"f7018f5c-057c-6ab9-7300-875c712b87b7","_id":"daad7ae7-356b-57ca-037e-0c4bcb307201"}],"componentType":"model","metadata":{"signature":"ab7e00a928dc79af806b828e1831a95e","zOrder":1,"label":{"lang":"en","value":"BBBBBBBBBBB","capco":"U"},"geom":{"w":150,"x":203,"h":60,"y":324}
pulling everything from the a05d91a7 UUID to the componentType at the bottom, and not just from the last _id before it? As far as I'm aware, nothing in my pattern allows additional content between the _id (.+?) group and the componentType part.
What I'm trying to pull specifically is the following:
"_id":"daad7ae7-356b-57ca-037e-0c4bcb307201"}],"componentType":"model"
To be clear, the UUID is variable, hence the (.+?).

"_id":"([^"]*)"}],"componentType":"(.+?)"
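Use this. See the demo:
https://regex101.com/r/uF4oY4/38
The problem with your regex is that .+? is lazy but not bounded: it can still expand as far as it needs to until the rest of the pattern matches. Matching starts at the first "_id":" in the string, so the group keeps growing until it reaches the only place where "}],"componentType":" follows. [^"]* is a negation-based approach and cannot go beyond a " in any case, so the match can only succeed at the _id directly in front of componentType.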
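A minimal Python sketch of the difference, assuming a shortened stand-in for the string in the question (the UUIDs are truncated for readability). Python's re module is used only for illustration; the engine in your tool may differ in details, but lazy quantifiers and negated character classes behave the same way here.

```python
import re

# Shortened, hypothetical stand-in for the response in the question.
text = (
    '"name":"in","_id":"a05d91a7","value":"5.6"}],"_id":"e986915c"},'
    '{"outputs":[{"name":"f7018f5c","_id":"daad7ae7"}],'
    '"componentType":"model","metadata":{}'
)

# Pattern from the question: the lazy group may still cross quotes.
lazy = re.compile(r'"_id":"(.+?)"}\],"componentType":"(.+?)"')
# Pattern from the answer: the first group can never contain a quote.
negated = re.compile(r'"_id":"([^"]*)"}\],"componentType":"(.+?)"')

lazy_match = lazy.search(text)
negated_match = negated.search(text)

# The lazy version starts at the first _id and runs up to componentType.
print(lazy_match.group(1))
# The negated version only succeeds at the _id right before componentType.
print(negated_match.group(1), negated_match.group(2))
```

The first print shows a long span beginning with a05d91a7, while the second shows only daad7ae7 and model.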

There is a dedicated JMeter Test Element, the JSON Path Extractor, which adds JSON support to JMeter.
See Using the XPath Extractor in JMeter (scroll down to "Parsing JSON") for details on the plugin installation and some JSONPath language reference. It is much simpler than regular expressions, less fragile and more human-readable.

Related

Regular Expression extractor in Jmeter not capturing value

I am writing a regular expression extractor for a dynamic value id from the response data as below
I wrote the expression like this
But this is not capturing the value. This id value is to be used in the next request.
If I use the template $0$, it captures the value as %202197.
Please help me correct the mistake I have made.
I tried template $0$ with match numbers 0 and 1, but I am getting the same expression.
When I try $1$ as the template, the value is not identified at all.

Your response seems to be JSON.
JSON is not a regular language, hence using regular expressions for parsing it is not the best idea.
Consider switching to JSON Extractor, the relevant configuration would be something like:
More information: How to Use the JSON Extractor For Testing
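If you want to proceed with regular expressions for any reason, consider changing the [0-9]+ part of your regular expression to (\d+). The parentheses create a capturing group, and the $1$ template refers to the first capturing group; without any group in the expression, only $0$ (the whole match) has a value.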
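The effect of the parentheses can be sketched in Python. The response fragment and the %20 prefix in the patterns below are assumptions made up for illustration, since the original response data and expression are not shown in the question; group(0) and group(1) play the role of the $0$ and $1$ templates.

```python
import re

# Hypothetical response fragment (the real response is not shown above).
response = 'redirect?id=%202197&next=home'

# No capturing group: only the whole match is available.
without_group = re.search(r'%20[0-9]+', response)
# With a capturing group around the digits.
with_group = re.search(r'%20(\d+)', response)

whole_match = without_group.group(0)    # comparable to template $0$
group_count = len(without_group.groups())  # no groups to refer to as $1$
captured = with_group.group(1)          # comparable to template $1$

print(whole_match, group_count, captured)
```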

Regex for replacing unnecessary quotation marks within a JSON object containing an array

I am currently trying to format a JSON object using LabVIEW and have run into an issue where it adds additional quotation marks, invalidating my JSON formatting. I have not found a way around this, so I thought just formatting the string manually would be enough.
Here is the JSON object that I have:
{
"contentType":"application/json",
"content":{
"msgType":2,
"objects":"["cat","dog","bird"]",
"count":3
}
}
Here is the JSON object I want with the quotation marks removed.
{
"contentType":"application/json",
"content":{
"msgType":2,
"objects":["cat","dog","bird"],
"count":3
}
}
I am still not an expert with regex. Using a regex tester, I was only able to grab the "objects" and "count" fields, and I feel I would still have to use substrings to remove the quotation marks.
Here is the example I am using (I would use "count" to find the start of the next field and work backwards from there):
"([objects]*)"
Additionally, all the other Regex I have been looking at removes all instances of quotation marks whereas I only need a specific area trimmed. Thus, I feel that a specific regex replace would be a much more elegant solution.
If there is a better way to go about this I am happy to hear any suggestions!

Your question suggests that the built-in LabVIEW JSON tools are insufficient for your use case.
The built-in library converts LabVIEW clusters to JSON in a one-shot approach. Bundle all your data into a cluster and then convert it to JSON.
When it comes to parsing JSON, you use the path input terminal and the default type terminals to control what data is parsed from a JSON string.
If you need to handle JSON in a manner similar to say JavaScript, I would recommend something like the JSONText Toolkit which is free to use (and distribute) under the BSD licence. This allows more complex and iterative building of JSON strings from LabVIEW types and has text-path style element access along with many more features.
The Output controls from both my examples are identical, although JSONText provides a handy Pretty Print VI.

After using a regex from one of the comments, I ended up with this regex, which allowed me to match the array itself.
(\[(?:"[^"]*"|[^"])+\])
I was able to split the JSON string into 'before match', 'match' and 'after match', remove the quotation mark from the end of 'before match' and from the start of 'after match', and concatenate the strings again to form the new output.
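The split-and-trim steps described above can be sketched in Python rather than LabVIEW. The function name unquote_array is hypothetical, and the sketch assumes the string contains a single quoted array as in the question.

```python
import json
import re

# Regex from the answer above: matches the array including its brackets.
ARRAY_PATTERN = re.compile(r'(\[(?:"[^"]*"|[^"])+\])')

def unquote_array(text: str) -> str:
    """Remove the stray quotes surrounding the first array in text."""
    match = ARRAY_PATTERN.search(text)
    if match is None:
        return text
    before = text[:match.start()]
    after = text[match.end():]
    # Trim one quote from the end of 'before' and the start of 'after'.
    if before.endswith('"') and after.startswith('"'):
        before = before[:-1]
        after = after[1:]
    return before + match.group(1) + after

broken = ('{"contentType":"application/json","content":{"msgType":2,'
          '"objects":"["cat","dog","bird"]","count":3}}')
fixed = unquote_array(broken)
parsed = json.loads(fixed)  # succeeds only if the result is valid JSON
print(parsed["content"]["objects"])
```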

Is there a JOLT documentation? What's the meaning of the &, # etc. operators? (NiFi, JoltTransformJSON)

Yeah there is! I made this question to share my knowledge, Q&A style since I had a hard time finding it myself :)
Thanks to https://stackoverflow.com/a/67821482/1561441 (Barbaros Özhan, see comments) for pointing me into the correct direction

The answer is: look here and here
Correct me if I'm wrong, but: Wow, currently to my knowledge a single .java file on GitHub, last commit in 2017, holds relevant parts of the official documentation of the JOLT syntax. I had to use its syntax since I'm working with NiFi and applied its JoltTransformJSON processor (hence the SEO abuses in my question, so more people find the answer)
Here are some of the most relevant parts copied from https://github.com/bazaarvoice/jolt/blob/master/jolt-core/src/main/java/com/bazaarvoice/jolt/Shiftr.java and slightly edited. The documentation itself is more extensive and also shows examples.
'*' Wildcard
Valid only on the LHS ( input JSON keys ) side of a Shiftr Spec
The '*' wildcard can be used by itself or to match part of a key.
'&' Wildcard
Valid on the LHS (left hand side - input JSON keys) and RHS (output data path)
Means, dereference against a "path" to get a value and use that value as if it were a literal key.
The canonical form of the wildcard is "&(0,0)".
The first parameter is where in the input path to look for a value, and the second parameter is which part of the key to use (used with * key).
There are syntactic sugar versions of the wildcard; all of the following mean the same thing. Sugar: '&' = '&0' = '&(0)' = '&(0,0)'
The syntactic sugar versions are nice, as there are a set of data transforms that do not need to use the canonical form, eg if your input data does not have any "prefixed" keys.
'$' Wildcard
Valid only on the LHS of the spec.
The existence of this wildcard is a reflection of the fact that the "data" of the input JSON, can be both in the "values" and the "keys" of the input JSON
The base case operation of Shiftr is to copy input JSON "values", thus we need a way to specify that we want to copy the input JSON "key" instead.
Thus '$' specifies that we want to use an input key, or input key derived value, as the data to be placed in the output JSON.
'$' has the same syntax as the '&' wildcard, and can be read as, dereference to get a value, and then use that value as the data to be output.
There are two cases where this is useful:
1. when a "key" in the input JSON needs to be an "id" value in the output JSON, see the ' "$": "SecondaryRatings.&1.Id" ' example in the linked documentation.
2. when you want to make a list of all the input keys.
'#' Wildcard
Valid both on the LHS and RHS, but has different behavior / format on either side.
The way to think of it is that it allows you to specify a "synthetic" value, aka a value not found in the input data.
On the RHS of the spec, # is only valid in the context of an array, like "[#2]".
What "[#2]" means is, go up three levels and ask that node how many matches it has had, and then use that as an index in the arrays.
This means that, while Shiftr is doing its parallel tree walk of the input data and the spec, it tracks how many matches it has processed at each level of the spec tree.
This is useful if you want to take a JSON map and turn it into a JSON array, and you do not care about the order of the array.
On the LHS of the spec, # allows you to specify a hard-coded String to be placed as a value in the output.
The initial use-case for this feature was to be able to process a Boolean input value, and if the value is boolean true write out the string "enabled". Note, this was possible before, but it required two Shiftr steps.
'@' Wildcard
Valid on both sides of the spec.
The basic '@' on the LHS.
This wildcard is necessary if you want to put both the input value and the input key somewhere in the output JSON.
Thus the '@' wildcard means "copy the value of the data at this level in the tree, to the output".
Advanced '@' sign wildcard
The format looks like "@(3,title)", where "3" means go up the tree 3 levels and then look up the key "title" and use the value at that key.

I would love to know if there is an alternative to JoltTransformJSON simply because I'm struggling a lot with understanding it (not coming from a programming background myself). When it works (thanks to all the help here) it does simplify things a lot!
Here are a few other sites that help:
https://intercom.help/godigibee/en/articles/4044359-transformer-getting-to-know-jolt
https://erbalvindersingh.medium.com/applying-jolttransform-on-json-object-array-and-fetching-specific-fields-48946870b4fc
https://cool-cheng.blogspot.com/2019/12/json-jolt-tutorial.html

Extract Json Data with screaming frog

I'm using Screaming Frog to extract data from JSON generated from a URL.
The generated JSON has this form:
{"ville":[{"codePostal":"13009","ville":"VAUFREGE","popin":"ouverturePopin","zoneLivraison":"1300913982","url":""},{"codePostal":"13009","ville":"LES BAUMETTES","popin":"ouverturePopin","zoneLivraison":"1300913989","url":""},{"codePostal":"13009","ville":"MARSEILLE 9EME ARRON","popin":"ouverturePopin","zoneLivraison":"1300913209","url":""}]}
I'm using this regex in Custom > Extraction in Screaming Frog as a way to extract the values of "codePostal".
"codePostal":".*?"
The problem is that it doesn't extract anything.
When I test my regex in regex101, it seems correct.
Do you have any clue about what is wrong?
Thanks.
Regards.

Have you tried saving the output to understand what Screaming Frog sees? At the beginning, it doesn't matter whether your regex works.
That said, don't forget that SF is a Java-based tool, so your regular expressions run on the Java regex engine. Make sure you test them with the correct dialect.

You need to specify the part to extract as a group enclosed in parentheses. For instance, in your example you need to use ("codePostal":".*?") as the extractor.
In addition, if you simply want to extract the value, you could use the following instead.
"codePostal":"(.*?)"

It's not a problem with your regular expression. The problem seems to be the content type: Screaming Frog isn't properly reading application/json content types for scraping. Hopefully they will fix this bug.
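Independently of the content-type issue, the capture-group form suggested earlier in this thread can be sanity-checked offline. This is a small Python sketch against a shortened copy of the JSON from the question; Screaming Frog itself uses the Java engine, so treat it as an illustration of the grouping only.

```python
import re

# Shortened copy of the JSON from the question.
payload = (
    '{"ville":[{"codePostal":"13009","ville":"VAUFREGE"},'
    '{"codePostal":"13009","ville":"LES BAUMETTES"},'
    '{"codePostal":"13009","ville":"MARSEILLE 9EME ARRON"}]}'
)

# With one capturing group, findall returns only the captured values.
postal_codes = re.findall(r'"codePostal":"(.*?)"', payload)

# With the group around the whole expression, the key is included too.
full_pairs = re.findall(r'("codePostal":".*?")', payload)

print(postal_codes)
print(full_pairs[0])
```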

Jmeter - What is the best extractor to use on a json message?

I am currently testing a system where the output is in the form of formatted JSON.
As part of my tests I need to extract and validate two values from the json record.
The values both have individual identifiers on them but don't appear in the same part of the record, so I can't just grab a single long string.
Loose format of the information in both cases:
"identifier1": [{"identifier2":"idname","values":["bit_I_want!]}]
In the case of the bit I want, this can either be a single quoted value (e.g. "12345") or multiple quoted values (e.g. "12345","23456","98765").
In both cases I'm only interested in validating the whole string of values, not individual values from the set.
Can anyone recommend which of the various extractors in Jmeter would be best to achieve this?
Many Thanks!

The most obvious choice seems to be the JSON Path Assertion (available via JMeter Plugins). It allows not only executing arbitrary JSON queries but also conditionally failing the sampler based on whether the actual and expected results match.
The recommended way of installing JMeter Plugins and keeping them up-to-date is using the JMeter Plugins Manager.

JMeter 3.1 comes with the JSON Extractor to parse JSON responses. You could use the expression $.identifier1[0].values as the JSON Path to extract the values.
If your JSON response is always going to be as simple as shown in your question, you could use the Regular Expression Extractor as well. The advantage is that it is faster than the JSON Extractor. The regular expression would be "values":\[(.*?)\]
Reference: http://www.testautomationguru.com/jmeter-response-data-extractors-comparison/
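To illustrate what the two approaches return, here is a Python sketch using a made-up response in the loose format from the question. The plain dictionary lookup stands in for the JSON Path $.identifier1[0].values; it is not JMeter's JSON Extractor itself.

```python
import json
import re

# Hypothetical response following the loose format in the question.
response = ('{"identifier1":[{"identifier2":"idname",'
            '"values":["12345","23456","98765"]}]}')

# Structured access, equivalent in spirit to $.identifier1[0].values.
values_via_json = json.loads(response)["identifier1"][0]["values"]

# Regular expression from the answer: captures the raw text in brackets.
match = re.search(r'"values":\[(.*?)\]', response)
values_via_regex = match.group(1)

print(values_via_json)
print(values_via_regex)
```

The JSON route yields a parsed list, while the regex yields the whole quoted string of values in one piece, which is what the question asks to validate.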
