How do I write Grok patterns to parse json data? - json

I was tasked to filter through data using the elasticsearch-kibana stack. My data comes in JSON format like so,
{
"instagram_account": "christywood",
"full_name": "Christy Wood",
"bio": "Dog mom from Houston",
"followers_count": 1000,
"post_count": 1000,
"email": christy#gmail.com,
"updated_at": "2022-07-18 02:06:29.998639"
}
However, when I try to import the data into Kibana, I get an error that states my data does not match the default GROK pattern.
I tried writing my own GROK, using the list of acceptable syntaxes in this repo, but the debugger always parses the key rather than the actual desired value. For instance, the GROK pattern
%{USERNAME:instagram_account}
returns this undesired data structure
{
"instagram_account": "instagram_account"
}
I've tried a couple other syntax options, but it seems that my debugger always grabs the key and not the actual value. No wonder elastic search cannot make sense of my data!
I've searched for examples, but I am unable to find any that use JSON data. To be fair, I'm very unfamiliar with GROK and would like to understand what % , /n , and other delimiters mean within this context.
Please tell me what I'm doing wrong, and point me in the right direction. Thank you!

Related

How can I select all possible JSON Data in arrow syntax/JSON Extract in SQL

I need to be able to access all the available JSON data, the problem is that a lot of it is nested.
I currently have this query.
SELECT * FROM `system_log` WHERE entry->"$[0]" LIKE "%search_term%";
I need instead of entry->"$[0]", something like entry->"$*"
I think the arrow syntax is short for JSON_EXTRACT which I think would mean that a solution for extract would work for the arrow syntax.
{
" Name": {
"after": "Shop",
"before": "Supermarket"
}
}
This is an example of my JSON data and as you can see there are multiple levels to it meaning that entry->"$[0]" won't catch it.
version 8.0.19 of SQL
What I've tried so far is entry->"$[0]" and then prepending [0] after, but this solution does not seem very dynamic as the JSON data could get deeper and deeper.
JSON_SEARCH() won't work for the search you describe, because JSON_SEARCH() only searches for full string matches, not wildcards.
If you truly cannot predict the structure of your JSON, and you just want to find if the pattern '%search_term%' appears anywhere, then just treat the whole JSON document as a string, and use LIKE:
SELECT * FROM `system_log` WHERE entry LIKE "%search_term%";
If you have more specific search requirements, then you'll have to come up with a way to predict the path to the value you're searching for in your JSON document. That's something I cannot help you with, because I don't know your usage of JSON.

In Power Automate Parse JSON step I get the errors "missing required properties"

I am working on a Power Automate flow to get a JSON file from SharePoint and Parse it. On one of my previous questions I received a solution that worked with a testing JSON file. However when I ran a couple of tests with some JSON files that I need to use, the Parse JSON step gives out errors regarding "missing" required properties.
Basically, the JSON file's arrays do not always have exactly the same elements (properties). For example (below) the element "minimun_version" does not always appear under the element "link", like the image below
and because of this syntax I get the errors below
How can I Parse such a JSON file successfully?
Any idea or suggestion will help me get unstuck.
Thank you
You can allow null values in your Parse Json schema by simply adding that to the schema. April Dunnam has a nice article about this approach:
https://www.sharepointsiren.com/2018/10/flow-parse-json-null-error-fix/
I assume you have something like below in your schema?
"minimum_version": {
"type": "number"
}
You can change that to this to allow null values
"minimum_version": {
"type": [
"number",
"null"
]
}

manipulating (nested) JSON keys and there values, using nifi

I am currently facing an issue where I have to read a JSON file that has mostly the same structure, has about 10k+ lines, and is nested.
I thought about creating my own custom processor which reads the JSON and replaces several matching key/values to the ones needed. As I am trying to use NiFi I assume that there should be a more comfortable way as the JSON-structure itself is mostly consistent.
I already tried using the ReplaceText processor as well as the JoltTransformJson processor, but I could not figure out. How can I transform both keys and values, if needed? For example: if there is something like this:
{
"id": "test"
},
{
"id": "14"
}
It might be necessary to turn the "id" into "Number" and map "test" to "3", as I am using different keys/values in my jsonfiles/database, so they need to fit those. Is there a way of doing so without having to create my own processor?
Regards,
Steve

If JSON represents the 'object', what represents the 'class'?

JSON appears to be a nice way to represent a complex data structure in plain text. If we think of this complex data structure as analogous to an OOP object - an instance of a class - then is there a commonly used JSON-like format that represents the class itself (just the data part - forget methods)? Can JSON itself be used for this?
To put it another way, if JSON encodes name-value pairs, what should I use if I want to encode only the names?
The reason I want this is that I am designing a protocol to use with jQuery (to which I am a complete novice by the way). The client will communicate to the server the structure of the JSON object it wants back, and the server will return a JSON object of that structure with the values added.
The key point is that it is the client that is in full control of what data fields (name-value pairs) the server returns. It's a bit different from all the examples of jQuery that I've found so far on the web where the client makes a request (which usually includes a very limited set of parameters, if any) and the server makes the decision as to what fields to return in the JSON reply.
(Obviously, what the client asks for must be congruent with the server's data model; if the server has an array of widgets each with its own price, the client can't ask for an array of prices each with its own widget.)
This must be a common problem, and I don't want to reinvent the wheel. I want to adopt a solution that is already in common use across the web.
Edit
I just found JSON Schema. This is not what I am looking for. It contains way more than I need.
Edit
I'm looking more for a 'this is how it is usually done' answer, rather than a 'you could try…' answer. (I can invent dozens of possible answers myself.)
To encode only names within JSON, you could use a key/value pair where the key is either the class name or just a key named 'values' - with the value being an array of strings that are the names to be returned by the server. For example:
{ 'class_name' : [ "name1", "name2", "name3" ] }
The server can then either detect the class name from the key used and return the supplied values for the names in the array if the class supports it or ignore if it does not.
I'm looking more for a 'this is how it is usually done' answer
There is no single "correct" way to do what you want. Many people have their implementation. It depends on various factors -- what you want to do, where you want to do, how efficiently you want it to do?
For simple structures I would prefer and suggest the answer given by #dbr9979.
For nested structures, you can have nested arrays. Something like:
{
"nestedfield1": {
"nestedfield11":["nestedfield111", "nestedfield112"],
"nestedfield12":["nestedfield121", "nestedfield122"],
"__SIMPLE_FIELDS__": ["simplefield13", "simplefield14"]
}
}
The point is, if the key is __SIMPLE_FIELDS__, the value is an array of simple fields (string, numbers etc..), else the key stands for the key in the object.
For something more complex, what I would suggest is you have predefined structures, that both the server and the client know of. This is particularly useful when you have to make multiple identical requests. Assign some unique number for each of them. Something like:
1 => <the structure above>
2 => ["simplefield1", "simplefield2" ..]
3 => etc .. etc
The server stores the above structure and the relevant number in the database or something. And now, as it may be obvious by now, client sends across the id of the required structure, and the server responds in the appropriate fashion.
I think what you meant by this:
the client that is in full control of what data fields (name-value pairs) the server returns.
is like the difference between SELECT * FROM Bags and SELECT color, price FROM Bag in SQL. Am I interpreting you correctly?
You could query with:
{
'resource': 'Bag',
'field_names': ['color', 'price']
}
which will return the response:
{
'status': 'success',
'result': [
{'color': 'red', 'price': 50},
{'color': 'blue', 'price': 45},
]
}
most likely though, you may not actually need your request to be a JSON object; I've seen implementations where the field names is taken from the query string, like http://foo.com/bag?fields=color,price
I was looking for Partial Response.
RESTful API Design: can your API give developers just the information they need? explains it all and gives examples from LinkedIn, Facebook, and Google. Google and Facebook both have similar approaches. Here's how Lie Ryan's example would look using Google's approach:
url?fields=status,result(color,price)
Since Google and Facebook are behind this, I would not be surprised to see this become a de facto standard.
In my case I am likely to run into a length limitation on the URL and so have to use POST instead, but this is an excellent starting point for me.

Explaining JSON (structure) .. to a business user

Suppose you have some data you would want business users to contribute to, which will end up being represented as JSON. Data represents a piece of business logic your program knows how to handle.
As expected, JSON has nested sections, data has categorizations, some custom rules may optionally be introduced etc.
It so happens that you already a vision of what "a perfect" JSON should look like. That JSON is your starting point.
Question:
Is there a way one can take a (reasonably complex) JSON and present it in a (non-JSON) format, that would be easy for a non-technical person to understand?
If possible, could you provide an example?
What do you think of this?
http://www.codeproject.com/script/Articles/ArticleVersion.aspx?aid=90357&av=126401
Or, make your own using Ext JS for the visualization part. After all, JSON is a lingua franca on the web these days.
Apart from that, you could use XML instead of JSON, given that there are more "wizard" type tools for XML.
And finally, if when you say "business users" you mean "people who are going to laugh at you when you show them code," you should stop thinking about this as "How do I make people in suits edit JSON" and start thinking about it as "How do I make a GUI that makes sense to people, and I'll make it spit out JSON later."
Show them as key, value pairs. If your value has sub sections then show them as drill downs/tree structure. An HTML mockup which parses a JSON object in your system would help in the understanding.
Picked this example from JSON site
{
"name": "Jack (\"Bee\") Nimble",
"format": {
"type": "rect",
"width": 1920,
"height": 1080,
"interlace": false,
"frame rate": 24
}
}
Name,format would be the tree nodes.