Pipe not renaming fields - json

I have a JSON feed in which I want to rename some of the fields and output a new JSON feed.
Here is one of the nodes on the source JSON feed:
{
    "SiteId": "A2390506",
    "SiteName": "Blackford Drain",
    "SubRegion": "Blackford Drain",
    "SubRegionID": 38,
    "ResultDate": "28/08/14 15:00",
    "10.00": 0.0,
    "100.00": 1.293,
    "151.00": 4.8747,
    "821.00": 13501.0
},
It's those latter fields, the ones labelled with floats (10.00, 100.00, etc.), that I want to change to more meaningful names.
I thought this could be quite easily achieved with a Pipe and a Rename Operator.
I can rename any of the fields with ordinary string names, but not those numeric-looking ones.
Have I done something wrong, or is this a limitation of the rename operator? Any ideas on a workaround?
The source of the pipe is here.
EDIT: I should also point out that I have tried 'item.100.00', which also fails to rename.
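
If the feed can be post-processed outside of Pipes, the rename itself is simple to do in a small script. Here is a minimal Python sketch, assuming the feed has already been fetched as a JSON object; the target field names (Rainfall, FlowRate, etc.) are invented purely for illustration.

import json

# Hypothetical mapping from the numeric labels to meaningful names
# (the names on the right are made up for illustration).
FIELD_NAMES = {
    "10.00": "Rainfall",
    "100.00": "FlowRate",
    "151.00": "Level",
    "821.00": "Volume",
}

def rename_fields(node):
    # Build a new dict, swapping any key found in FIELD_NAMES and keeping the rest.
    return {FIELD_NAMES.get(key, key): value for key, value in node.items()}

node = {
    "SiteId": "A2390506",
    "SiteName": "Blackford Drain",
    "10.00": 0.0,
    "100.00": 1.293,
}
print(json.dumps(rename_fields(node), indent=2))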

Related

How do I write Grok patterns to parse json data?

I was tasked with filtering through data using the Elasticsearch-Kibana stack. My data comes in JSON format, like so:
{
    "instagram_account": "christywood",
    "full_name": "Christy Wood",
    "bio": "Dog mom from Houston",
    "followers_count": 1000,
    "post_count": 1000,
    "email": "christy#gmail.com",
    "updated_at": "2022-07-18 02:06:29.998639"
}
However, when I try to import the data into Kibana, I get an error that states my data does not match the default GROK pattern.
I tried writing my own GROK, using the list of acceptable syntaxes in this repo, but the debugger always parses the key rather than the actual desired value. For instance, the GROK pattern
%{USERNAME:instagram_account}
returns this undesired data structure
{
    "instagram_account": "instagram_account"
}
I've tried a couple of other syntax options, but it seems that my debugger always grabs the key and not the actual value. No wonder Elasticsearch cannot make sense of my data!
I've searched for examples, but I am unable to find any that use JSON data. To be fair, I'm very unfamiliar with GROK and would like to understand what %, \n, and other delimiters mean in this context.
Please tell me what I'm doing wrong, and point me in the right direction. Thank you!
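
A grok pattern is essentially a named regular expression matched against the raw text of each line, so %{USERNAME:instagram_account} on its own captures the first token that fits USERNAME, which in this line is the key itself. A minimal Python sketch, using re to stand in for the grok engine and assuming USERNAME expands to roughly [a-zA-Z0-9._-]+, shows the difference that including the literal key and quotes makes:

import re

line = '"instagram_account": "christywood",'

# %{USERNAME:instagram_account} expands to roughly this named group.
bare = r'(?P<instagram_account>[a-zA-Z0-9._-]+)'
# Unanchored, the first thing it matches is the key, not the value.
print(re.search(bare, line).group('instagram_account'))    # instagram_account

# Surrounding the capture with the literal key and quotes moves it onto the value.
keyed = r'"instagram_account": "(?P<instagram_account>[a-zA-Z0-9._-]+)"'
print(re.search(keyed, line).group('instagram_account'))   # christywood

For input that is already JSON, it may also be worth looking at a JSON filter or codec on the ingest side rather than grok, since grok is aimed at unstructured text.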

How to use regular expression in JSON as key

Is it even possible to do so? I didn't find a single solution on the internet. I'm working on an Angular 6 project. I have to translate a few strings, and for that I'm maintaining an en-US.json file in the i18n folder. The JSON looks like this:
"words": {
"Car": "Wagen",
"Ship": "Schiff",
"Flight": "Flug",
"*": "", // <------------ HELP NEEDED
}
I don't want to translate any of the other strings. Can I use a regular expression to ignore them? Leaving them unhandled gives me strings like:
words.train
words.helicopter
Please correct me where I am wrong.

Manipulating (nested) JSON keys and their values, using NiFi

I am currently facing an issue where I have to read a JSON file that has mostly the same structure, has about 10k+ lines, and is nested.
I thought about creating my own custom processor which reads the JSON and replaces several matching keys/values with the ones needed. Since I am trying to use NiFi, I assume there should be a more convenient way, as the JSON structure itself is mostly consistent.
I already tried using the ReplaceText processor as well as the JoltTransformJson processor, but I could not figure it out. How can I transform both keys and values where needed? For example, if there is something like this:
{
    "id": "test"
},
{
    "id": "14"
}
It might be necessary to turn the key "id" into "Number" and map the value "test" to "3", as I am using different keys/values in my JSON files/database, so they need to match. Is there a way of doing so without having to create my own processor?
Regards,
Steve
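
As a point of reference for the kind of transformation described above, here is a minimal standalone Python sketch that walks a nested structure and applies a key rename plus a value mapping; the specific mappings are hypothetical, and in NiFi similar logic might be expressed in a JoltTransformJson spec or run inside an ExecuteScript processor or as a pre-processing step.

import json

KEY_MAP = {"id": "Number"}   # hypothetical key rename
VALUE_MAP = {"test": "3"}    # hypothetical value mapping

def remap(node):
    # Recursively rename keys and translate string values in nested JSON.
    if isinstance(node, dict):
        return {KEY_MAP.get(k, k): remap(v) for k, v in node.items()}
    if isinstance(node, list):
        return [remap(item) for item in node]
    if isinstance(node, str):
        return VALUE_MAP.get(node, node)
    return node

records = json.loads('[{"id": "test"}, {"id": "14"}]')
print(json.dumps([remap(r) for r in records]))
# [{"Number": "3"}, {"Number": "14"}]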

Using PySpark, how can we output files with different schemas defined by some column value?

I use pySpark (Spark 2.1.0) and have event log files in S3.
The event logs are in JSON format (each line is a valid JSON object) and each line has an event_id property. The remaining properties vary considerably depending on the event_id. For example:
{"event_id": "1001", "account_id": 1, "name": "John"}
{"event_id": "1004", "account_id": 2, "purchase_id": 5, "purchase_num": 1}
If I load this JSON file into a DataFrame all at once, all columns are included as schema properties for all rows. What I want to achieve is to split the input JSON data by event_id, give each event_id a proper schema, and save the files individually in Parquet format so I can analyze each event type from the dashboard.
I came up with a solution that specifies each schema and loads the whole JSON file as many times as there are event_ids, but that seems inefficient. I could also keep all rows with the union schema, but I'm not sure that is the common way.
Is there any idiomatic and efficient way to achieve this?
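
One possible single-read approach, as a hedged sketch (the S3 paths are made up): load the feed once, then for each distinct event_id filter the rows and drop the columns that are entirely null for that subset before writing Parquet, so each output carries only that event's own properties.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("split-by-event-id").getOrCreate()

# Read everything once; Spark infers the union of all properties, so columns
# that do not apply to a given event_id simply come back as null.
df = spark.read.json("s3://my-bucket/event-logs/")   # hypothetical path

for (event_id,) in df.select("event_id").distinct().collect():
    subset = df.filter(F.col("event_id") == event_id)
    # Keep only the columns that are non-null somewhere in this subset,
    # so the written Parquet schema carries just this event's properties.
    non_null = subset.agg(*[F.count(c).alias(c) for c in subset.columns]).first()
    keep = [c for c in subset.columns if non_null[c] > 0]
    subset.select(*keep).write.mode("overwrite").parquet(
        "s3://my-bucket/events-parquet/event_id={}".format(event_id)   # hypothetical path
    )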

JSON interface with UNIX

I am very new to JSON interaction and have a few doubts regarding it. Below are the basic ones:
1) How can we call/invoke/open a JSON file through Unix? Suppose I have a metadata file in JSON; how should I fetch and update values back and forth in that JSON file?
2) I need an example of how to interact with it.
3) How compatible is the Unix shell with JSON? Is there any other technology/language/tool that is better than a shell script for this?
Thanks,
Nikhil
JSON is just text following a specific format.
Use a text editor and follow the rules. Some editors with "JSON modes" will help with highlighting invalid syntax, indenting, and brace matching.
A "Unix Shell" has nothing directly to do with JSON - how does a shell relate to text or XML files?
There are some utilities for dealing with JSON which might be of use such as jq - but it really depends on what needs to be done with the JSON (which is ultimately just text).
Json is a format to store strings, bools, numbers, lists, and dicts and combinations thereof (dicts of numbers pointing to lists containing strings etc.). Probably the data you want to store has some kind of structure which fits into these types. Consider that and think of a valid representation using the types given above.
For example, if your text configuration looks something like this:
Section
    Name=Marsoon
    Size=34
    Contents
        foo
        bar
        bloh
    EndContents
EndSection
Section
    Name=Billition
    Size=103
    Contents
        one
        two
        three
    EndContents
EndSection
… then this looks like a list of dicts which contain some strings and numbers and one list of strings. A valid representation of that in Json would be:
[
    {
        "Name": "Marsoon",
        "Size": 34,
        "Contents": [
            "foo", "bar", "bloh"
        ]
    },
    {
        "Name": "Billition",
        "Size": 103,
        "Contents": [
            "one", "two", "three"
        ]
    }
]
But in case you know that each such dictionary has a different Name and always the same fields, you don't have to store the field names and can use the Name as a key of a dictionary; so you can also represent it as a dict of strings pointing to lists containing numbers and lists of strings:
{
    "Marsoon": [
        34, [ "foo", "bar", "bloh" ]
    ],
    "Billition": [
        103, [ "one", "two", "three" ]
    ]
}
Both are valid representations of your original text configuration. Which one you choose depends mainly on whether you want to stay open to later changes in the data structure (the first solution is better for that) or whether you want to avoid the bureaucratic overhead.
Such a Json can be stored as a simple text file. Use any text editor you like for this. Notice that all whitespace is optional. The last example could also be written in one line:
{"Marsoon":[34,["foo","bar","bloh"]],"Billition":[103,["one","two","three"]]}
So sometimes a computer-generated Json might be hard to read and would need an editor at least capable of handling very long lines.
Handling such a Json file in a shell script will not be easy, simply because the shell has no notion of such complex types. The most complex structure it can handle properly is a dict of strings pointing to strings (bash associative arrays). So I propose having a look at a more suitable language, e.g. Python. In Python you can handle all these structures quite efficiently and with very readable code:
import json

with open('myfile.json') as jsonFile:
    data = json.load(jsonFile)

print(data[0]['Contents'][2])   # will print "bloh" in the first example
# or:
print(data['Marsoon'][1][2])    # will print "bloh" in the second example
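
Updating works the same way in reverse: change the structure in Python and serialize it back out with json.dump. A small sketch, continuing from the second example above:

import json

with open('myfile.json') as jsonFile:
    data = json.load(jsonFile)

# Update a value in the second example's structure: change Marsoon's size.
data['Marsoon'][0] = 35

# Write the modified structure back to disk.
with open('myfile.json', 'w') as jsonFile:
    json.dump(data, jsonFile, indent=4)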