So, I'm writing a program that fetches data as hashes from PostgreSQL. It uses a couple of JSON files: one as a parameter file and one as the output data file. I'm having a problem with it fetching what it shouldn't fetch. Here's the parameter JSON:
{
"queries": [
{
"table_name" : "t1",
"subqueries": [
{
"query_id" : "t1_1",
"query": [
.....some sql query
],
"to_hash" : {
"target_by" : "type_id", // key to index by
"keys" : [
{
"source" : "name", // key in hash from db
"target" : "name" // key in new hash
},
{
"source" : "r",
"target" : "r"
}
]
}
},
{
"query_id" : "t1_2",
"query": [
.....some sql query
],
"to_hash" : {
"target_by" : "type_id",
"keys" : [
{
"source" : "m",
"target" : "m"
}
]
}
}
]
}
]
}
....and here's the Perl subroutine:
my $fname = "query_params.json";
my $q_data_raw;
{
local $/;
open(my $fh, "<:encoding(UTF-8)", $fname) or oops("$fname: $!");
$q_data_raw = <$fh>;
close($fh);
}
my $q_data = JSON->new->utf8->decode($q_data_raw);
my %result;
sub blabla {
my $data = shift;
my($tab, $i) = ($data->{table_name}, 0);
if ($data->{subqueries} ne "false"){
my %res_hash;
my @res_arr;
my $q_id;
foreach my $sq (@{$data->{subqueries}}){
my $query = "";
$q_id = $sq->{query_id};
print "\n";
print "$q_id\n";
for(@{$sq->{query}}){
$query .= "$_\n";
}
my $t_by = $sq->{to_hash}{target_by};
my $q_hash = $db_connection->prepare($query);
$q_hash->execute() or die( "Unable to get: " . $db_connection->errstr);
while(my $res = $q_hash->fetchrow_hashref()) {
# print Dumper $res; #print #1
for(@{$sq->{to_hash}->{keys}}){
# print "\nkey:\t" . $_->{target} . "\nvalue:\t".$res->{$_->{source}}; #print #2
$res_hash{$q_id}{$res->{$t_by}}{$_->{target}} = $res->{$_->{source}};
}
$res_hash{$q_id}{$res->{$t_by}}{__id} = $res->{$t_by};
# print Dumper %res_hash; #print #3
}
push @res_arr, $res_hash{$q_id};
# print Dumper $res_hash{$q_id}; #print #4
# print Dumper @res_arr; #print #5
$result{$tab}{$q_id} = \@res_arr;
$q_hash->finish();
}
}
}
for (@{$q_data->{queries}}){ # hash from parameter json
blabla($_);
}
my $json = JSON->new->pretty->encode(\%result);
# to json file
....and here's what I get:
{
"t1" : {
"t1_1" : [
{
//type_id_1_* - from first query
"type_id_1_1" : {
"r" : "4746",
"__id" : "type_id_1_1",
"name" : "blablabla"
},
"type_id_1_2" : {
"r" : "7338",
"__id" : "type_id_1_2",
"name" : "nbmnbcxv"
},
....
},
{
//type_id_2_* - from second query
"type_id_2_1" : {
"m" : "6",
"__id" : "type_id_2_1"
},
"type_id_2_2" : {
"m" : "3",
"__id" : "type_id_2_2"
},
............
}
],
"t1_2" : [
{
"type_id_1_1" : {
"r" : "4746",
"__id" : "type_id_1_1",
"name" : "blablabla"
},
"type_id_1_2" : {
"r" : "7338",
"__id" : "type_id_1_2",
"name" : "nbmnbcxv"
},
....
},
{
"type_id_2_1" : {
"m" : "6",
"__id" : "type_id_2_1"
},
"type_id_2_2" : {
"m" : "3",
"__id" : "type_id_2_2"
},
............
}
]
}
}
Somehow it picks up the results from the other subqueries, which I don't want. The loop itself seems to be OK, as far as I can tell. What am I doing wrong?
Well, it seems I declared @res_arr and %res_hash at the wrong level; they have to be inside the foreach. The output is now what I need; at least I don't have duplicates any more.
I need to get more sleep X(
....
my $q_id;
foreach my $sq (@{$data->{subqueries}}){
my %res_hash;
my @res_arr;
my $query = "";
$q_id = $sq->{query_id};
print "\n";
print "$q_id\n";
..........
}
I have a JSON file:
{
"JSONS" : [
{
"id" : "ToRemove",
"First" : [
{
"id" : "geo",
"Name" : "Person1",
"model" : [
],
"adjustments" : [
{
"uid" : "3",
"name" : "4s",
"value" : "1"
},
{
"uid" : "5",
"name" : "3s",
"value" : "6"
}
]
},
{
"id" : "Meters",
"Dictionary" : "4.2"
},
{
"id" : "Moon",
"Filter" : "0.5",
"Saturn" : {
"s" : "0",
"v" : "1"
}
}
]
}
]
}
I would like to delete the entire node if its "id", in this example, contains the string "ToRemove": everything between { and }, including those lines, so that the final JSON stays consistent.
I have only found how to delete properties, not entire nodes. I've tried to apply something like this:
$ToRemove = Get-Content $SourceFile | ConvertFrom-Json
$ToRemove.PSObject.Object.Remove('id:','ToRemove')
$ToRemove | ConvertTo-Json -Depth 100 | Out-File $DestFile
but of course it didn't work.
How do I delete the entire node? I would love to use an array holding all the strings I would like to delete.
Based on your comment, you can remove the object having the property id = ToRemove by filtering for objects where id is not equal to ToRemove and assigning that result back to the .JSONS property:
$json = Get-Content path\to\json.json -Raw | ConvertFrom-Json
$json.JSONS = @($json.JSONS.Where{ $_.id -ne 'ToRemove' })
$json | ConvertTo-Json
The end result in this case would be an empty array for the .JSONS property:
{
"JSONS": []
}
.PSObject.Properties.Remove(...) wouldn't be useful in this case, because what it does is remove properties from a single object, whereas what you want is to filter out an entire object based on a condition.
You should be able to use just plain PowerShell, like this:
{
"JSONS" : [
{
"id" : "ToRemove",
"First" : [
{
"id" : "geo",
"Name" : "Person1",
"model" : [
]
},
{
"id" : "Meters",
"Dictionary" : "4.2"
}
]
},
{
"id" : "DontRemove",
"First" : []
}
]
}
$json = Get-Content -Path $SourceFile | ConvertFrom-Json
$json.JSONS = $json.JSONS | Where-Object { $_.Id -ne "ToRemove" }
$json | ConvertTo-Json -Depth 100 | Out-File -Path $DestFile
{
"root1" : {
"sub1" : null,
"sub2" : {
"subsub1" : {
"key1" : {
},
"key2" : {
},
"key3" : {
},
"key4" : {
}
}
},
"sub3" : {
"subsub2" : {
"key5" : {
}
}
}
},
"root2" : {
"sub1" : null,
"sub2" : {
"subsub1" : {
"key1" : {
},
"key2" : {
},
"key3" : {
},
"key4" : {
}
}
},
"sub3" : {
"subsub2" : {
"key8" : {
}
}
}
}
}
Consider the above JSON.
How can I tell whether 'key8' exists in this JSON, and also find the path where it is found?
E.g., if I search for 'key8', I need output similar to:
root2->sub3->subsub2->key8
It's just a straightforward tree traversal. The following returns as soon as a match is found (rather than looking for all matches).
sub key_search {
my $target = $_[1];
my @todo = [ $_[0] ];
while (@todo) {
my ($val, @path) = @{ shift(@todo) };
my $reftype = ref($val);
if (!$reftype) {
# Nothing to do
}
elsif ($reftype eq 'HASH') {
for my $key (keys(%$val)) {
return @path, $target if $key eq $target;
push @todo, [ $val->{$key}, @path, $key ];
}
}
elsif ($reftype eq 'ARRAY') {
for my $i (0..$#$val) {
push @todo, [ $val->[$i], @path, $i ];
}
}
else {
die("Invalid data.\n");
}
}
return;
}
my @path = key_search($data, 'key8')
or die("Not found.\n");
Notes
The result is ambiguous if the data can contain arrays, and if any of the hashes can have integers for keys. Steps can be taken to disambiguate them.
The above doesn't check for cycles, but those can't exist in JSON.
Replace push with unshift to get a depth-first search instead of a breadth-first search.
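To get the arrow-separated path from the question, the list that key_search returns can simply be joined. A minimal self-contained usage sketch (repeating the sub from above, and trimming the sample document for brevity):

```perl
use strict;
use warnings;
use JSON::PP;

# Same breadth-first key_search as above, repeated so this sketch runs on its own.
sub key_search {
    my $target = $_[1];
    my @todo = [ $_[0] ];
    while (@todo) {
        my ($val, @path) = @{ shift(@todo) };
        my $reftype = ref($val);
        if ($reftype eq 'HASH') {
            for my $key (keys %$val) {
                return @path, $target if $key eq $target;
                push @todo, [ $val->{$key}, @path, $key ];
            }
        }
        elsif ($reftype eq 'ARRAY') {
            push @todo, [ $val->[$_], @path, $_ ] for 0 .. $#$val;
        }
    }
    return;
}

# A trimmed version of the sample document from the question.
my $data = JSON::PP->new->decode(
    '{"root1":{"sub1":null},"root2":{"sub3":{"subsub2":{"key8":{}}}}}'
);

my @path = key_search($data, 'key8')
    or die "Not found.\n";

# Joins the path into the requested arrow-separated form.
print join('->', @path), "\n";   # root2->sub3->subsub2->key8
```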
I am trying to write Perl code to parse multiple JSON messages. The code I have written only parses the values if the JSON file contains a single JSON message; it fails when there are multiple messages in the file, throwing the error "Undefined subroutine &Carp::shortmess_heavy". The JSON file is in the following format:
{
"/test/test1/test2/test3/supertest4" : [],
"/test/test1/test2/test3/supertest2" : [
{
"tag1" : "",
"tag2" : true,
"tag3" : [
{
"status" : "TRUE",
"name" : "DEF",
"age" : "28",
"sex" : "f"
},
{
"status" : "FALSE",
"name" : "PQR",
"age" : "39",
"sex" : "f"
}
],
"tag4" : "FAILED",
"tag5" : "/test/test1/test2/test3/supertest2/test02",
"tag6" : ""
}
],
"/test/test1/test2/test3/supertest1" : [
{
"tag1" : "",
"tag2" : false,
"tag3" : [
{
"status" : "TRUE",
"name" : "ABC",
"age" : "21",
"sex" : "m"
},
{
"status" : "FALSE",
"name" : "XYZ",
"age" : "34",
"sex" : "f"
}
],
"tag4" : "PASSED",
"tag5" : "/test/test1/test2/test3/supertest1/test01",
"tag6" : ""
}
],
"/test/test1/test2/test3/supertest6" : []
}
My Perl code to parse a single JSON message:
use strict;
use warnings;
use Data::Dumper;
use JSON;
use JSON qw( decode_json );
my $json_file = "tmp1.json";
my $json;
open (my $fh, '<', $json_file) or die "can not open file $json_file";
{ local $/; $json = <$fh>; }
close($fh);
my $decoded = decode_json($json);
print "TAG4 = " . $decoded->{'tag4'} . "\n";
print "TAG5 = " . $decoded->{'tag5'} . "\n";
my @tag3 = @{ $decoded->{'tag3'} };
foreach my $tg3 ( @tag3 ) {
print "Name = ". $tg3->{"name"} . "\n";
print "Status = ". $tg3->{"status"} . "\n";
print "Age = ". $tg3->{"age"} . "\n";
}
To parse multiple JSON objects, use JSON::XS/Cpanel::JSON::XS's incremental parsing.
my $json = '{"foo":"bar"} ["baz"] true 1.23';
my @results = JSON::XS->new->allow_nonref->incr_parse($json);
use Data::Dumper;
print Dumper \@results;
Output:
$VAR1 = [
{
'foo' => 'bar'
},
[
'baz'
],
bless( do{\(my $o = 1)}, 'JSON::PP::Boolean' ),
'1.23'
];
First of all: I cannot use the Perl MongoDB driver, so I'm interacting with MongoDB via IPC::Run. Now I'd like to get the output from MongoDB as a hash ref.
Here is the code:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::XS;
use Try::Tiny;
use IPC::Run 'run';
use Data::Dumper;
my @cmd = ('/opt/mongo/bin/mongo', '127.0.0.1:27117/service_discovery', '--quiet', '-u', 'test', '-p', 'test', '--eval', 'db.sit.find().forEach(function(x){printjson(x)})');
my $out;
run \@cmd, '>>', \$out;
my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
my $dec = try {my $output = $coder->decode($out)} catch {undef};
print Dumper (\%$dec);
It is not working; %$dec is empty.
Here is the output of MongoDB query (value of $out):
{
"_id" : ObjectId("5696787eb8e5e87534777c82"),
"hostname" : "lab7n1",
"services" : [
{
"port" : 9000,
"name" : "ss-rest"
},
{
"port" : 9001,
"name" : "ss-rest"
},
{
"port" : 8060,
"name" : "websockets"
},
{
"port" : 8061,
"name" : "websockets"
}
]
}
{
"_id" : ObjectId("56967ab2b8e5e87534777c83"),
"hostname" : "lab7n2",
"services" : [
{
"port" : 8030,
"name" : "cloud-rest for batch"
},
{
"port" : 8031,
"name" : "cloud-rest for batch"
},
{
"port" : 8010,
"name" : "cloud-rest for bespoke"
},
{
"port" : 8011,
"name" : "cloud-rest for bespoke"
}
]
}
What should I do to make the parser treat this output as legitimate JSON?
As suggested by @Matt, I used the incr_parse method and omitted the _id field in the output.
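For completeness, a minimal sketch of that approach, assuming the _id field was dropped on the mongo side (e.g. with a {_id: 0} projection in the find() call; that projection, and the abbreviated sample output below, are my assumptions, not from the question):

```perl
use strict;
use warnings;
use JSON::XS;

# What the mongo shell prints once _id is omitted: a stream of
# concatenated JSON documents, which is not a single valid JSON text.
my $out = <<'EOF';
{ "hostname" : "lab7n1", "services" : [ { "port" : 9000, "name" : "ss-rest" } ] }
{ "hostname" : "lab7n2", "services" : [ { "port" : 8030, "name" : "cloud-rest for batch" } ] }
EOF

# incr_parse in list context returns one decoded hashref per document.
my @docs = JSON::XS->new->incr_parse($out);

for my $doc (@docs) {
    print "$doc->{hostname}: $doc->{services}[0]{port}\n";
}
```

Each element of @docs is then an ordinary hash ref, which is what the question was after.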
I'm trying to find objects using the built-in queries and it just doesn't work.
My JSON file is something like this:
{ "Text1":
{
"id":"2"
},
"Text2":
{
"id":"2,3"
},
"Text3":
{
"id":"1"
}
}
And I write this db.myCollection.find({"id":2})
And it doesn't find anything.
When I write db.myCollection.find() it shows all the data as it should.
Does anyone know how to do this correctly?
It's hard to change the data structure, but since you want just the matching sub-document and you don't know where your target sub-document is (for example, whether the query should be on Text1 or Text2, ...), there is a good data structure for this:
{
"_id" : ObjectId("548dd9261a01c68fab8d67d7"),
"pair" : [
{
"id" : "2",
"key" : "Text1"
},
{
"id" : [
"2",
"3"
],
"key" : "Text2"
},
{
"id" : "1",
"key" : "Text3"
}
]
}
and your query is:
db.myCollection.findOne({'pair.id' : "2"} , {'pair.$':1, _id : -1}).pair // there are better ways (such as aggregation instead of the above query)
As a result you will have:
{
"0" : {
"id" : "2",
"key" : "Text1"
}
}
Update 1 (newbie way)
If you want all the documents, not just one, use this:
var result = [];
db.myCollection.find({'pair.id' : "2"} , {'pair.$':1, _id : -1}).forEach(function(item)
{
result.push(item.pair);
});
// the output will be in result
Update 2
Use this query to get all sub-documents
db.myCollection.aggregate
(
{ $unwind: '$pair' },
{ $match : {'pair.id' : "2"} }
).result
It produces output like:
{
"0" : {
"_id" : ObjectId("548deb511a01c68fab8d67db"),
"pair" : {
"id" : "2",
"key" : "Text1"
}
},
"1" : {
"_id" : ObjectId("548deb511a01c68fab8d67db"),
"pair" : {
"id" : [
"2",
"3"
],
"key" : "Text2"
}
}
}
Since your query specifies a field in a subdocument, this is what will work; see the .find() documentation.
db.myCollection.find({"Text1.id" : "2"}, {"Text1.id": true})
{ "_id" : ObjectId("548dd798e2fa652e675af11d"), "Text1" : { "id" : "2" } }
If the query is on "Text1" or "Text2", the best thing to do here, as mentioned in the accepted answer, is to change your document structure. This can easily be done using the "Bulk" API.
var bulk = db.mycollection.initializeOrderedBulkOp(),
count = 0;
db.mycollection.find().forEach(function(doc) {
var pair = [];
for(var key in doc) {
if(key !== "_id") {
var id = doc[key]["id"].split(/[, ]/);
pair.push({"key": key, "id": id});
}
}
bulk.find({"_id": doc._id}).replaceOne({ "pair": pair });
count++; if (count % 300 == 0){
// Execute per 300 operations and re-Init
bulk.execute();
bulk = db.mycollection.initializeOrderedBulkOp();
}
})
// Clean up queues
if (count % 300 != 0 )
bulk.execute();
Your documents now look like this:
{
"_id" : ObjectId("55edddc6602d0b4fd53a48d8"),
"pair" : [
{
"key" : "Text1",
"id" : [
"2"
]
},
{
"key" : "Text2",
"id" : [
"2",
"3"
]
},
{
"key" : "Text3",
"id" : [
"1"
]
}
]
}
Running the following query:
db.mycollection.aggregate([
{ "$project": {
"pair": {
"$setDifference": [
{ "$map": {
"input": "$pair",
"as": "pr",
"in": {
"$cond": [
{ "$setIsSubset": [ ["2"], "$$pr.id" ]},
"$$pr",
false
]
}
}},
[false]
]
}
}}
])
returns:
{
"_id" : ObjectId("55edddc6602d0b4fd53a48d8"),
"pair" : [
{
"key" : "Text1",
"id" : [
"2"
]
},
{
"key" : "Text2",
"id" : [
"2",
"3"
]
}
]
}