Which JSONPath expression should I use to split my JSON string? - json

I want to apply SplitJson in order to split the following JSON file into 2 FlowFiles (according to hits):
{"took":0,"timed_out":false,"_shards":
{"total":5,"successful":5,"failed":0},
"hits":{"total":2,"max_score":0.0,
"hits":
[
{"_index":"my_index","_type":"my_entry","_id":"111","_score":0.0,"_source":{"ZoneId":"1","OriginId":"1"},
"fields":{"ttime":[11000]}},
{"_index":"my_index","_type":"my_entry","_id":"222","_score":0.0,"_source":{"ZoneId":"1","OriginId":"2"},
"fields":{"ttime":[5000]}}
]
}
}
Which JsonPath Expression should I use? I tried $.hits[*], but it splits the content according to the first level hits. In my case I have hits[hits[...]], but how should I specify it in the expression?
UPDATE:
I want to get two FlowFiles:
FlowFile #1: {"_index":"my_index","_type":"my_entry","_id":"111","_score":0.0,"_source":{"ZoneId":"1","OriginId":"1"},"fields":{"ttime":[11000]}}
FlowFile #2:
{"_index":"my_index","_type":"my_entry","_id":"222","_score":0.0,"_source":{"ZoneId":"1","OriginId":"2"},"fields":{"ttime":[5000]}}

var arr = $.hits.hits;
Will give you the array with 2 objects you desire.
var o1 = arr[0];
var o2 = arr[1];
Will give you 2 objects you desire.
var json1 = JSON.stringify(arr[0]);
var json2 = JSON.stringify(arr[1]);
Will give you 2 JSON files as requested.
Is this what you needed?

You can use this website for testing JSONPath Index for your case.
The right answer is $.hits.hits[*].
As DanteTheSmith mentioned, you can also simply use $.hits.hits in your case. Which one to pick depends on the post-processing. Both methods work fine.
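To see what $.hits.hits[*] selects without running NiFi, here is a minimal sketch in plain Python. It does not use a JSONPath library; it simply walks the same path by hand (outer "hits" object, inner "hits" array, every element) on the sample from the question. The variable names are illustrative only.

```python
import json

# Sample response from the question (an Elasticsearch-style search result).
raw = '''{"took":0,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},
"hits":{"total":2,"max_score":0.0,"hits":[
{"_index":"my_index","_type":"my_entry","_id":"111","_score":0.0,"_source":{"ZoneId":"1","OriginId":"1"},"fields":{"ttime":[11000]}},
{"_index":"my_index","_type":"my_entry","_id":"222","_score":0.0,"_source":{"ZoneId":"1","OriginId":"2"},"fields":{"ttime":[5000]}}
]}}'''

doc = json.loads(raw)

# Equivalent of $.hits.hits[*]: outer "hits" object -> inner "hits" array -> each element.
fragments = [json.dumps(hit, separators=(",", ":")) for hit in doc["hits"]["hits"]]

for number, fragment in enumerate(fragments, start=1):
    print(f"FlowFile #{number}: {fragment}")
```

Each entry of fragments corresponds to one of the two FlowFiles the question asks for.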

Related

Azure tables unable to store flattened JSON

I am using the npm flat package, and arrays/objects are flattened, but object/array keys are surrounded by '' , like in 'task_status.0.data' using the object below.
These specific fields do not get stored into AzureTables - other fields go through, but these are silently ignored. How would I fix this?
var obj1 = {
"studentId": "abc",
"task_status": [
{
"status":"Current",
"date":516760078
},
{
"status":"Late",
"date":1516414446
}
],
"student_plan": "n"
}
Here is how I am using it - simplified code example: Again, it successfully gets written to the table, but does not write the properties that were flattened (see further below):
var flatten = require('flat');
var newObj1 = flatten(obj1);
var entGen = azure.TableUtilities.entityGenerator;
newObj1.PartitionKey = entGen.String(uniqueIDFromMyDB);
newObj1.RowKey = entGen.String(uniqueStudentId);
tableService.insertEntity(myTableName, newObj1, myCallbackFunc);
In the above example, the flattened object would look like:
var obj1 = {
studentId: "abc",
'task_status.0.status': 'Current',
'task_status.0.date': 516760078,
'task_status.1.status': 'Late',
'task_status.1.date': 1516414446,
student_plan: "n"
}
Then I would add PartitionKey and RowKey.
All the task_status fields silently fail to be inserted.
EDIT: This does not have anything to do with the actual flattening process - I just checked a perfectly good JSON object, with keys that had 'x.y.z' in it, i.e. AzureTables doesn't seem to accept these column names....which almost completely destroys the value proposition of storing schema-less data, without significant rework.
A '.' in a column name is not supported. You can use a custom delimiter to flatten your objects instead.
For example:
newObj1 = flatten(obj1, {delimiter: '__'});
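To make the effect of a custom delimiter concrete, here is a small Python sketch of the same idea. It is not the flat package; flatten below is a hypothetical helper written only for illustration, and it joins nested keys and array indexes with '__' so that no key contains a '.'.

```python
def flatten(value, delimiter="__", prefix=""):
    """Flatten nested dicts/lists into one dict with delimiter-joined keys."""
    if isinstance(value, dict):
        pairs = value.items()
    elif isinstance(value, list):
        pairs = enumerate(value)
    else:
        # Leaf value: store it under the key built so far.
        return {prefix: value}
    items = {}
    for key, child in pairs:
        child_prefix = f"{prefix}{delimiter}{key}" if prefix else str(key)
        items.update(flatten(child, delimiter, child_prefix))
    return items


obj1 = {
    "studentId": "abc",
    "task_status": [
        {"status": "Current", "date": 516760078},
        {"status": "Late", "date": 1516414446},
    ],
    "student_plan": "n",
}

flat = flatten(obj1)
for key, value in flat.items():
    print(key, "=", value)
```

The resulting keys look like task_status__0__status instead of task_status.0.status. Note that this sketch drops empty lists and objects rather than storing them.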

NodeJS: Adding new child nodes to JSON Object

Let's say there is a customer object, and I need to add a new element, address, to this JSON object. How can I achieve this?
Neither of these alters the customer JSON object:
customer['address'] = addressObj
customer.address = addressObj
and I can not use push() as this is not adding a new item in list of objects.
Thanks,
Naren
Maybe your addressObj is not properly formed.
This works for me:
var customer = {"name": "Naren"};
customer.address1 = "stackoverflow";
customer.address2 = {"fulladdress":"stackoverflow"};
JSON.stringify(customer)
Output:
"{"name":"Naren","address1":"stackoverflow","address2":{"fulladdress":"stackoverflow"}}"
Maybe I am not clear on what exactly you want to do, but it sounds to me as if you have a JSON object and want to merge it with another one, creating a single JSON object.
let Json1 = {'Superman': 'Favorite' };
let Json2 = {'Supergirl': 'Greatest'};
let Json3 = {'IronFist': 'Top 10' };
You now want to add Supergirl (the new element) to Superman (the old element), I assume. Take a look at merge-json, a simple package which does its job well. You would code as follows:
'use strict';
var mergeJSON = require("merge-json");
let Json1 = {'Superman': 'Favorite' };
let Json2 = {'Supergirl': 'Greatest'};
let Json3 = {'IronFist': 'Top 10' };
let Json6 = mergeJSON(Json1,Json2);
Json6=mergeJSON(Json6,Json3);
You would end up with as follows:
Json6 = {'Superman': 'Favorite', 'Supergirl': 'Greatest', 'IronFist': 'Top 10'}
This is how I make use of combining JSON information or text information into a JSON file. You can get much more sophisticated with the module mentioned above. (Just do not confuse merge-json with json-merge and other modules.)
If this is not what you are looking for my apologies, then I did not understand the question correctly.

Regular expression to extract a JSON array

I'm trying to use a PCRE regular expression to extract some JSON. I'm using a version of MariaDB which does not have JSON functions but does have REGEX functions.
My string is:
{"device_types":["smartphone"],"isps":["a","B"],"network_types":[],"countries":[],"category":["Jebb","Bush"],"carriers":[],"exclude_carriers":[]}
I want to grab the contents of category. I'd like a matching group that contains 2 items, Jebb and Bush (or however many items are in the array).
I've tried this pattern but it only matches the first occurrence: /(?<=category":\[).([^"]*).*?(?=\])/g
Does this match your needs? It should match the category array regardless of its size.
"category":(\[.*?\])
regex101 example
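As a rough illustration of how that pattern behaves, here is a Python sketch using the standard re module (the question itself targets MariaDB's PCRE functions, so this only demonstrates the pattern, not the SQL). The lazy .*? stops at the first closing bracket, and the captured text is itself a small JSON array, so it can be handed to a JSON parser in application code. This assumes no array item contains a ']' character.

```python
import json
import re

body = '{"device_types":["smartphone"],"isps":["a","B"],"network_types":[],"countries":[],"category":["Jebb","Bush"],"carriers":[],"exclude_carriers":[]}'

# Group 1 captures the bracketed array that follows the "category" key.
match = re.search(r'"category":(\[.*?\])', body)

# The captured text is valid JSON on its own, so parse it into a list.
categories = json.loads(match.group(1)) if match else []
print(categories)
```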
JSON is not a regular language. Since it allows arbitrary embedding of balanced delimiters, it must be at least context-free.
For example, consider an array of arrays of arrays:
[ [ [ 1, 2], [2, 3] ] , [ [ 3, 4], [ 4, 5] ] ]
Clearly you couldn't parse that with true regular expressions.
See this topic:
Regex for parsing single key: values out of JSON in Javascript
It may be helpful for you.
Using a set of non-capturing groups you can extract a predefined JSON array.
Regex answer: (?:\"category\":)(?:\[)(.*)(?:\"\])
The whole match is "category":["Jebb","Bush"], so access the first group
to get the array items. Sample Java code:
Pattern pattern = Pattern.compile("(?:\"category\":)(?:\\[)(.*)(?:\"\\])");
String body = "{\"device_types\":[\"smartphone\"],\"isps\":[\"a\",\"B\"],\"network_types\":[],\"countries\":[],\"category\":[\"Jebb\",\"Bush\"],\"carriers\":[],\"exclude_carriers\":[]}";
Matcher matcher = pattern.matcher(body);
assertThat(matcher.find(), is(true));
String[] categories = matcher.group(1).replaceAll("\"","").split(",");
assertThat(categories.length, is(2));
assertThat(categories[0], is("Jebb"));
assertThat(categories[1], is("Bush"));
There are many ways. One sloppy way to do it is /([A-Z])\w+/g
Please try it on your console like
var data = '{"device_types":["smartphone"],"isps":["a","B"],"network_types":[],"countries":[],"category":["Jebb","Bush"],"carriers":[],"exclude_carriers":[]}',
res = [];
data.match(/([A-Z])\w+/g); // ["Jebb", "Bush"]
OK, the above was pretty sloppy. However, a solid single-regex solution that extracts every element one by one, regardless of how many there are, and places them in an array (res) is the following:
var rex = /[",]+(\w*)(?=[",\w]*"],"carriers)/g,
str = '{"device_types":["smartphone"],"isps":["a","B"],"network_types":[],"countries":[],"category":["Jebb","Bush","Donald","Trump"],"carriers":[],"exclude_carriers":[]}',
arr = [],
res = [];
while ((arr = rex.exec(str)) !== null) {
res.push(arr[1]); // <- ["Jebb", "Bush", "Donald", "Trump"]
}
Check it out at http://regexr.com/3d4ee
OK, let's do it. I have come up with a devilish idea. If JS had lookbehinds, this could have been done simply by reversing the logic of the previous example, where I used a lookahead. Alas, there aren't any... So I decided to turn the world the other way around. Check this out.
String.prototype.reverse = function(){
return this.split("").reverse().join("");
};
var rex = /[",]+(\w*)(?=[",\w]*"\[:"yrogetac)/g,
str = '{"device_types":["smartphone"],"isps":["a","B"],"network_types":[],"countries":[],"category":["Jebb","Bush","Donald","Trump"],"carriers":[],"exclude_carriers":[]}',
rev = str.reverse(),
arr = [],
res = [];
while ((arr = rex.exec(rev)) !== null) {
res.push(arr[1].reverse()); // <- ["Trump", "Donald", "Bush", "Jebb"]
}
res.reverse(); // <- ["Jebb", "Bush", "Donald", "Trump"]
Just use your console to confirm.
In C++ you can do it like this:
bool foundmatch = false;
try {
std::regex re("\"([a-zA-Z]+)\"*.:*.\\[[^\\]\r\n]+\\]");
foundmatch = std::regex_search(subject, re);
} catch (std::regex_error& e) {
// Syntax error in the regular expression
}
If the number of items in the array is limited (and manageable), you could define it with a finite number of optional items. Like this one with a maximum of 5 items:
"category":\["([^"]*)"(?:,"([^"]*)"(?:,"([^"]*)"(?:,"([^"]*)"(?:,"([^"]*)")?)?)?)?
regex101 example here.
Regards.

Compare two arrays in Ruby and create a third array of elements

I have two arrays, array1 and array2. I need to compare both of these arrays and I want to create a third array, array3, whereby it shows the elements that are in array2, that are not in array1.
This is what I have so far:
my_buckets = Model.select("DISTINCT bucket").where(["my_id = ?", params[:user]])
all_buckets = Model.select("DISTINCT bucket").collect { |x| x.bucket }.uniq.compact
buckets_not_in_my_buckets = Model.select("DISTINCT bucket").where(["bucket NOT IN (?)", my_buckets]).collect { |x| x.bucket }.uniq.compact
For some reason, the buckets_not_in_my_buckets is always returning an empty array ([]). Is there a better way to approach this? Any help would be appreciated.
buckets_not_in_my_buckets = all_buckets - my_buckets
I'm assuming that you have the eql? operator on your buckets object working how you'd like.
Please see the Array docs for more detail.

How to fetch a JSON file to get a row position from a given value or argument

I'm using wget to fetch several dozen JSON files on a daily basis that go like this:
{
  "results": [
    {
      "id": "ABC789",
      "title": "Apple"
    },
    {
      "id": "XYZ123",
      "title": "Orange"
    }
  ]
}
My goal is to find a row's position in each JSON file given a value or set of values (i.e. "In which row is XYZ123 located?"). In the previous example, ABC789 is in row 1, XYZ123 in row 2, and so on.
For now I use Google Refine to "quickly" visualize (using the Text Filter option) where XYZ123 stands (row 2).
But since it takes a while to do this manually for each file I was wondering if there is a quick and efficient way in one go.
What can I do and how can I fetch and do the request? Thanks in advance! FoF0
In python:
import json
# assume json_string holds your loaded data
data = json.loads(json_string)
mapped_vals = []
# the rows live under the top-level "results" key
for ent in data['results']:
    mapped_vals.append(ent['id'])
The order of items in the list will be indexed according to the json data, since the list is a sequenced collection.
In PHP:
$data = json_decode($json_string);
$output = array();
// the rows live under the top-level "results" key
foreach ($data->results as $values) {
    $output[] = $values->id;
}
Again, the ordered nature of PHP arrays ensures that the output will be ordered as-is with regard to indexes.
Either example could be modified to use a mapped dictionary (python) or an associative array (php) if needs demand.
You could adapt these to functions that take the id value as an argument, track how far they are into the array, and when found, break out and return the current index.
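A minimal sketch of such a function in Python, assuming the file layout shown in the question (a top-level "results" list whose entries each have an "id"). The name find_row is only illustrative, and rows are counted from 1 to match the question's wording.

```python
import json


def find_row(json_string, wanted_id):
    """Return the 1-based row of wanted_id in "results", or None if absent."""
    data = json.loads(json_string)
    for position, entry in enumerate(data["results"], start=1):
        if entry["id"] == wanted_id:
            # Stop at the first match and report its position.
            return position
    return None


sample = '{"results": [{"id": "ABC789", "title": "Apple"}, {"id": "XYZ123", "title": "Orange"}]}'
print(find_row(sample, "XYZ123"))
```

To process several dozen downloaded files, the same function could be called once per file with that file's contents.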
Wow. I posted the original question 10 months ago when I knew nothing about Python nor computer programming whatsoever!
Answer
But I learned basic Python last December and came up with a solution that not only gets the rank order but also prepares the results for insertion into a MySQL database:
import urllib.request
import json
# Make connection and get the content
response = urllib.request.urlopen("http://whatever.com/search?=ids=1212,125,54,454")
content = response.read()
# Decode JSON search results to type dict
json_search = json.loads(content.decode("utf8"))
# Get 'results' key-value pairs to a list
search_data_all = []
for i in json_search['results']:
    search_data_all.append(i)
# Prepare MySQL list with ranking order for each id item
ranks_list_to_mysql = []
for i in range(len(search_data_all)):
    d = {}
    d['id'] = search_data_all[i]['id']
    d['rank'] = i + 1
    ranks_list_to_mysql.append(d)