Error trying to parse OData v4 from a REST API using NiFi - json

I'm using a Microsoft REST API to query an Azure application; OAuth and the request work without problems.
The response from InvokeHTTP has this format:
{"#odata.context":"https://****.dynamics.com/api/data/v9.1/$metadata#endpoint","value":[ here comes the actual JSON result in format {
"#odata_etag" : "W/\"555598\"", "field":"value...},...]
,"#odata.nextLink":"https://****.dynamics.com/api/data/v9.1/endpoint?$skiptoken.....}
I need to extract the nextLink for pagination and the value array to continue the flow and store the result.
When I try to parse it with InferAvroSchema so I can start working with it, it throws this error: "Illegal initial character: @odata.etag".
My idea was to use InferAvroSchema, then EvaluateJsonPath to extract the odata tags, and then extract the values.
I tried using EvaluateJsonPath on the result, asking it to create an attribute for $.@odata.context, but it doesn't find the item either; I'm sure it's something about the @.
I could also replace every @ in the incoming flow with another character, but I don't know if that makes sense.
I feel that I'm not using the correct approach, but NiFi + OData doesn't give me results on Google or here.
I'm open to any suggestions!
thank you!

Schema fields cannot contain @. You could replace the @; however, you must be sure not to replace it in actual content such as email addresses. Another solution is to transform the API response using the JoltTransformJSON processor, so that your flow can work with it:
GenerateFlowFile: (processor configuration shown in a screenshot)
For the JoltTransformJSON processor, provide the following Jolt specification:
[
  {
    "operation": "shift",
    "spec": {
      "\\@odata.nextLink": "next"
    }
  }
]
Leave the default values for the other properties. You can play around with Jolt here: http://jolt-demo.appspot.com/
EvaluateJsonPath: (processor configuration shown in a screenshot)
Result: notice that the URL is now part of the flowfile attributes.
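If it helps to see the same extraction outside NiFi, here is a rough Python sketch of what the flow is doing conceptually: pull @odata.nextLink out for pagination and keep the value array for further processing. The host name and field values are placeholders based on the sample above; adjust them to your actual payload.
import json

# Sample OData response body as returned by InvokeHTTP (truncated, hypothetical host).
response_text = '''
{
  "@odata.context": "https://example.dynamics.com/api/data/v9.1/$metadata#endpoint",
  "value": [
    {"@odata.etag": "W/\\"555598\\"", "field": "value"}
  ],
  "@odata.nextLink": "https://example.dynamics.com/api/data/v9.1/endpoint?$skiptoken=..."
}
'''

doc = json.loads(response_text)

# The annotation keys are ordinary dictionary keys; the @ only bothers
# schema-based processors such as InferAvroSchema.
next_link = doc.get("@odata.nextLink")   # used for pagination; may be absent on the last page
records = doc.get("value", [])           # the actual entities to continue the flow with

print(next_link)
print(len(records), "records")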

Your hunch is correct: you can only use valid characters in the field names for the schema type you are using, Avro or JSON.
You could get NiFi to remove the illegal characters with the ReplaceText processor; have a read here on what is valid: http://avro.apache.org/docs/current/spec.html#names
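If you do go the character-replacement route, the safer variant is to rename only the keys rather than doing a blanket ReplaceText over the whole body, so values such as email addresses are left untouched. A small hedged Python sketch of that idea (the sanitizing rule is illustrative, based on the Avro name rules linked above):
import json
import re

def avro_safe_key(key: str) -> str:
    # Avro names must start with [A-Za-z_] and contain only [A-Za-z0-9_],
    # so replace anything else (e.g. the leading @ and the dots).
    key = re.sub(r"[^A-Za-z0-9_]", "_", key)
    if not re.match(r"[A-Za-z_]", key):
        key = "_" + key
    return key

def sanitize(obj):
    # Recursively rename dictionary keys; values are never modified.
    if isinstance(obj, dict):
        return {avro_safe_key(k): sanitize(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [sanitize(v) for v in obj]
    return obj

raw = '{"@odata.etag": "W/\\"555598\\"", "contact": "user@example.com"}'
print(json.dumps(sanitize(json.loads(raw))))
# -> {"_odata_etag": "W/\"555598\"", "contact": "user@example.com"}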

Related

Is there any way to load deformed JSON into a Python object?

I am getting JSON data after hitting an API.
When I try to load that JSON into Python using json.loads(response.text), I am getting a delimiter error.
When I checked, a few fields in the JSON do not have "," separating them.
{
"id":"142379",
"label":"1182_Mailer_PROD",
"location":"Bangalore, India",
"targetType":"HTTPS performance",
"frequency":"15",
"fails":"2764",
"totalUptime":"85.32"
"tests":[
{"date":"09-24-2019 09:31","status":"Could not resolve: mailer.accenture.com (DNS server returned answer with no data)","responseTime":"0.000","dnsTime":"0.000","connectTime":"0.000","redirectTime":"0.000","firstbyteTime":"0.000","lastbyteTime":"0.000","pingLoss":"0.00","pingMin":"0.000","pingAvg":"0.000","pingMax":"0.000","size":"0","md5hash":"(null)"}
]
}
,
{
"id":"158651",
"label":"11883-GR dd-WSP",
"location":"Chicago, IL",
"targetType":"Performance transaction",
"frequency":"15",
"fails":"5919",
"totalUptime":"35.14"
,"tests":[
{"date":"09-24-2019 09:26","status":"Keywords not found - Working","responseTime":"0.669","stepresults":[
{"stepid":"1","date":"09-24-2019 09:26","status":"OK","responseTime":"0.453","dnsTime":"0.000","connectTime":"0.025","redirectTime":"0.264","firstbyteTime":"0.141","lastbyteTime":"0.024","size":"22351","md5hash":"ca002cf662980511a9faa88286f2ee96"},
{"stepid":"2","date":"09-24-2019 09:26","status":"Keywords not found - Working","responseTime":"0.216","dnsTime":"0.000","connectTime":"0.023","redirectTime":"0.000","firstbyteTime":"0.171","lastbyteTime":"0.022","size":"22457","md5hash":"38327404e4f2392979aa7dfa27118f4e"}
]}]
}
This is a small chunk of data from the response; as you can see, "totalUptime":"85.32" doesn't have a comma after it.
Could you please let me know how we can load the data into a Python object even though the JSON is deformed?
Deformed JSON is not JSON, so obviously you can't load it with a standard procedure. There are only two possibilities to load it:
Create your own parser
Modify the input to conform to the JSON standard
Both possibilities require you to define what format you want to import. If it is OK for your format not to have commas, then you have to define what your delimiters are.
From the example you posted it is difficult to make any definitive assessment of how the input format is defined. So you will probably have to write a rudimentary parser and adapt it by trial and error to the input you are trying to parse.
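For the second option (fixing the input before parsing), here is a rough Python sketch under the assumption that the only defect is a missing comma between a value and the key that follows it on the next line, which is what your sample shows; anything more exotic really does need a proper parser.
import json
import re

broken = '''{
  "fails": "2764",
  "totalUptime": "85.32"
  "tests": []
}'''

# Insert a comma wherever a closing quote is followed only by whitespace/newline
# and then an opening quote - i.e. two adjacent members with the separator missing.
repaired = re.sub(r'"(\s*\n\s*)"', r'",\1"', broken)

data = json.loads(repaired)
print(data["totalUptime"])   # -> 85.32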

How do I parse a JSON from Azure Blob Storage file in Logic App?

I have a JSON file in Azure Blob storage that I need to parse and insert rows into SQL using the Logic App.
I am using the "Get Blob Content" and my first attempt was to then pass to "Parse JSON". It returns and error": InvalidTemplate. Unable to process template language expressions in action 'Parse_JSON' inputs at line '1' and column '2856'"
I found some discussion that indicated that the content needs to be converted to a string so I used "Compose" and edited the code as suggested to
"inputs": "#base64ToString(body('Get_blob_content').$content)"
This works, but then the InvalidTemplate issue gets pushed to the Parse function and I get the InvalidTemplate error there. I have tried wrapping the output in a JSON expression and a few other things, but I just can't get it to parse.
If I take a sample or even the entire JSON and put it into the INPUT of the Parse function, it works without issue, but it will not accept the blob content as JSON.
The only thing I have been able to do successfully from blob content is to take it as a string and update a row in SQL, to later use OPENJSON in SQL... but I run into an issue there that is for another post.
I am at a loss of what to do.
You don't post much information about your Logic App actions, so maybe you could refer to my flow design. I tested with JSON data containing an array.
My flow is as follows: I'm not using a Compose action, and I use decodeBase64(body('Get_blob_content')['$content']) as the Parse JSON content.
And if you select a property from the JSON, you need to set the array index. I set a variable to get a value: body('Parse_JSON')[1]['name'].
You could have a try with this; if it still fails, please provide more information or some sample data so we can test.
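For reference, the reason the decodeBase64(...) expression is needed is that "Get blob content" can hand back a wrapper object whose $content property is the base64-encoded file rather than the JSON itself. A hedged Python sketch of the same unwrapping (the wrapper shape shown here is an assumption based on the expression above):
import base64
import json

# What body('Get_blob_content') roughly looks like when the connector cannot
# infer a JSON content type (shape assumed from the expression above).
blob_body = {
    "$content-type": "application/octet-stream",
    "$content": base64.b64encode(b'{"name": "example"}').decode("ascii"),
}

# Equivalent of: decodeBase64(body('Get_blob_content')['$content'])
decoded = base64.b64decode(blob_body["$content"]).decode("utf-8")

# Equivalent of feeding that string to the Parse JSON action.
parsed = json.loads(decoded)
print(parsed["name"])   # -> example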

JMeter JSON Extractor fails when processing a leading [

I am having a problem extracting the JSON value when the data has a leading [. Ex: [{"userID":"12"}]
I used the "jp@gc - Dummy Sampler" to mock test JSON data, and when I removed the leading [ and trailing ], the JSON Extractor seemed to be able to read the JSON. Ex: {"userID":"12"}
A leading [ is valid JSON, so I am not sure if my assumption is correct. Does my finding sound right? If yes, what is the best way for me to remove the leading and trailing []?
thanks
You can use .. - the deep scan operator - in order to get the values from the JSON no matter how many nested levels deep they are.
You may also find JMeter's "JSON Path Extractor Plugin - Advanced Usage Scenarios" article useful, as it contains several of the most frequently required examples of working with the JSON Path Extractor.
"[]" means it's an array.
So to extract 12 you would use:
[0].userID
An alternative is to use:
$..userID
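To double-check that the leading [ is not the problem, you can reproduce the extraction outside JMeter: the bracket simply means the document's root is an array, so the first element has to be indexed before the field is read. A small Python sketch (plain json module, no JSONPath library):
import json

payload = '[{"userID": "12"}]'

data = json.loads(payload)      # a leading [ is valid JSON: the root is a list
print(data[0]["userID"])        # -> 12, same idea as the JSONPath $[0].userID or $..userID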

Is it valid for a JSON data structure to vary between a list and a boolean?

The JSON data structure for jsTree is defined in https://github.com/vakata/jstree; here is an example:
[ { "text" : "Root node", "children" : [ "Child node 1", "Child node 2" ] } ]
Notably it says:
"The children key can be used to add children to the branch, it should be an array"
However, later on, in the section "Populating the tree using AJAX and lazy loading nodes", it shows setting children to false to indicate when a node's children have not yet been processed:
[{
"id":1,"text":"Root node","children":[
{"id":2,"text":"Child node 1","children":true},
{"id":3,"text":"Child node 2"}
]
}]
So here we see children used both as an array and as a boolean.
I am using jsTree as an example because this is where I encountered the issue, but my question is really a general JSON question. My question is this: is it valid JSON for the same element to be two different types (an array and a boolean)?
Structure-wise, both are valid JSON packets. This is okay, as JSON is somewhat less strict than XML (with an XSD or a DTD). As per https://www.w3schools.com/js/js_json_objects.asp:
JSON objects are surrounded by curly braces {}.
JSON objects are written in key/value pairs.
Keys must be strings, and values must be a valid JSON data type (string, number, object, array, boolean or null).
Keys and values are separated by a colon.
Each key/value pair is separated by a comma.
Having said that, if the sender is allowed to send such JSONs, the only caveat is that the server side will have to handle this discrepancy upon receiving such different packets. This is a bad-looking contract, and hence the server might need to do extra work to manage it. Server-side handling of such incoming JSON packets can become tricky.
See: How do I create JSON data structure when element can be different types in for use by
You could validate whether a JSON is okay or not at https://jsonlint.com/
See more about JSON in this answer: https://stackoverflow.com/a/4862511/945214
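To illustrate the extra work mentioned above, here is a small hedged sketch of what the consuming side might do to normalize a children member that can be either a boolean or an array (field names follow the jsTree example above):
import json

node_json = '{"id": 2, "text": "Child node 1", "children": true}'
node = json.loads(node_json)

children = node.get("children", [])
if isinstance(children, bool):
    # true means "has children, not loaded yet"; false/absent means a leaf node.
    has_unloaded_children = children
    children = []
else:
    has_unloaded_children = False

print(has_unloaded_children, children)   # -> True []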
It is valid JSON. RFC 8259 defines a general syntax, but it contains nothing that would allow a tool to identify that two equally named entries are meant to describe the same conceptual thing.
The need for a criterion to check two JSON structures for instance equality has been one motivation to create something like JSON Schema.
I also think it is not too unusual for JavaScript to provide this kind of mixed data. Sometimes it might help to explicitly convert the JavaScript object to JSON, as in JSON.stringify(testObject).
Some resources for JSON validation:
https://www.npmjs.com/package/json-validation
https://davidwalsh.name/json-validation.

is "HelloWorld" a valid json response

This could be the most basic JSON question ever. I'm creating a WCF REST service and have a HelloWorld test function that just returns a string. I'm testing the service in Fiddler and the response body I'm getting back is:
"HelloWorld"
I also created a function that would just return a number (double) and the response body is:
1.0
Are those valid JSON responses? Are simple return types just returned as plain text like that (no markup with the brackets)?
Valid JSON responses start with either a { for an object or [ for a list of objects.
Native types are not valid JSON if not encapsulated. Try JSONlint to check validity.
RFC 4627 says no. Which doesn't mean it can't work, but it isn't strictly standards-compliant. (Of course, neither are all the JSON readers...)
To quote from Section 2, "JSON Grammar":
A JSON text is a sequence of tokens. The set of tokens includes six
structural characters, strings, numbers, and three literal names.
A JSON text is a serialized object or array.
JSON-text = object / array
Only objects / maps, and arrays, at the top level.
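As a side note to "doesn't mean it can't work": parsers that follow the newer RFC 8259 / ECMA-404 grammar do accept scalars at the top level, so whether "HelloWorld" or 1.0 is rejected depends on the reader. A quick check with Python's json module, for example:
import json

print(json.loads('"HelloWorld"'))   # -> HelloWorld (a bare string parses fine here)
print(json.loads('1.0'))            # -> 1.0
# Stricter RFC 4627-era parsers would reject both, as only objects and arrays
# were allowed at the top level.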
According to the official website, you need to declare what you want between {}, like this:
{
"test": "HelloWorld"
}
No. The following, for example, is valid:
{
"Foo": "HelloWorld"
}
You can try JSONLint to see what validates and what does not.