Decoding and using JSON data in Perl - json

I am confused about accessing the contents of some JSON data that I have decoded. Here is an example
I don't understand why this solution works and my own does not. My questions are rephrased below
my $json_raw = getJSON();
my $content = decode_json($json_raw);
print Dumper($content);  # requires "use Data::Dumper;"
At this point my JSON data has been transformed into this
$VAR1 = { 'items' => [ 1, 2, 3, 4 ] };
My guess tells me that, once decoded, the object will be a hash with one element that has the key items and an array reference as the value.
$content{'items'}[0]
where $content{'items'} would obtain the array reference, and the outer $...[0] would access the first element in the array and interpret it as a scalar. However, this does not work: I get the message "Use of uninitialized value [...]".
However, the following does work:
$content->{items}[0]
where $content->{items} yields the array reference and [0] accesses the first element of that array.
Questions
Why does $content{'items'} not return an array reference? I even tried @{content{'items'}}, thinking that, once I got the value from content{'items'}, it would need to be interpreted as an array. But still, I receive the uninitialized array reference.
How can I access the array reference without using the arrow operator?

A beginner's answer to a beginner :) It is surely not as professional as it should be, but maybe it helps you.
use strict; #use this all times
use warnings; #this too - helps a lot!
use JSON;
my $json_str = ' { "items" : [ 1, 2, 3, 4 ] } ';
my $content = decode_json($json_str);
You wrote:
My guess tells me that, once decoded, the object will be a hash with
one element that has the key items and an array reference as the value.
Yes, it is a hash, but decode_json returns a reference, in this case a reference to a hash. From the docs:
expects an UTF-8 (binary) string and tries to parse that
as an UTF-8 encoded JSON text,
returning the resulting reference.
In the line
my $content = decode_json($json_str);
you are assigning to a scalar variable (not to a hash).
Because you know it is a reference, you can do the following:
printf "reftype:%s\n", ref($content);
#prints: reftype:HASH
#therefore $content is a scalar value containing a reference to a hash
It is a hashref - you can dump all keys
print "key: $_\n" for keys %{$content}; #or in short %$content
#prints: key: items
You can also assign the value of "items" (an arrayref) to a scalar variable
my $aref = $content->{items}; #$hashref->{key}
#or
#my $aref = ${$content}{items}; #$hash{key}
but NOT
#my $aref = $content{items}; #throws error if "use strict;"
#Global symbol "%content" requires explicit package name at script.pl line 20.
$content{items} requests a value from the hash %content, and you never defined or assigned such a variable. $content is a scalar variable, not the hash variable %content.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "key-postderef: $_\n" for keys $content->%*;
}
Now step deeper - to the arrayref - again you can print out the reference type
printf "reftype:%s\n", ref($aref);
#reftype:ARRAY
print all elements of array
print "arr-item: $_\n" for @{$aref};
but again NOT
#print "$_\n" for @aref;
#dies: Global symbol "@aref" requires explicit package name at script.pl line 37.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "aref-postderef: $_\n" for $aref->@*;
}
Here is a simple rule:
my @arr; #array variable
my $arr_ref = \@arr; #scalar - containing a reference to @arr
@{$arr_ref} is the same as @arr
^^^^^^^^^^^ - array reference in curly brackets
If you have an $arr_ref, use @{$arr_ref} everywhere you would use the array.
my %hash; #hash variable
my $hash_ref = \%hash; #scalar - containing a reference to %hash
%{$hash_ref} is the same as %hash
^^^^^^^^^^^ - hash reference in curly brackets
If you have a $hash_ref, use %{$hash_ref} everywhere you would use the hash.
For the whole structure, the following
say $content->{items}->[0];
say $content->{items}[0];
say ${$content}{items}->[0];
say ${$content}{items}[0];
say ${$content->{items}}[0];
say ${${$content}{items}}[0];
prints the same value 1.

$content is a hash reference, so you need to dereference it (usually with the arrow operator) to access its contents. $content{items} would refer to a hash named %content, which you don't have. That's where the "use of uninitialized value" message comes from.
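A minimal sketch of that distinction, assuming the core JSON::PP module in place of JSON (the hash %content and the variable names $via_ref and $via_hash are made up for illustration). It declares both a scalar $content and a hash %content to show that they are separate variables that merely share a name:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# $content is a scalar holding a reference to a hash.
my $content = decode_json('{ "items": [ 1, 2, 3, 4 ] }');

# %content is a different variable altogether; it only shares the name.
my %content = ( items => 'not the decoded data' );

my $via_ref  = $content->{items}[0];   # follows the reference stored in $content
my $via_hash = $content{items};        # looks up the key in the hash %content

print "via reference: $via_ref\n";
print "via hash:      $via_hash\n";
```

Without the declaration of %content, the $content{items} line would not compile under use strict, which is the error quoted in the other answers.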

I actually asked a similar question here
The answer:
In Perl, a function can only really return a scalar or a list.
Since hashes can be initialized or assigned from lists (e.g. %foo = (a => 1, b => 2)), I guess you're asking why decode_json returns something like { a => 1, b => 2 } (a reference to an anonymous hash) rather than (a => 1, b => 2) (a list that can be copied into a hash).
I can think of a few good reasons for this:
in Perl, an array or hash always contains scalars. So in something like { "a": { "b": 3 } }, the { "b": 3 } part has to be a scalar; and for consistency, it makes sense for the whole thing to be a scalar in the same way.
if the hash is quite large (many keys at top-level), it's pointless and expensive to iterate over all the elements to convert it into a list, and then build a new hash from that list.
in JSON, the top-level element can be either an object (= Perl hash) or an array (= Perl array). If decode_json returned a list in the former case, it's not clear what it would return in the latter case. After decoding the JSON string, how could you examine the result to know what to do with it? (And it wouldn't be safe to write %foo = decode_json(...) unless you already knew that you had a hash.) So decode_json's behavior works better for any general-purpose library code that has to use it without already knowing very much about the data it's working with.
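As a rough illustration of the last point, here is a sketch of code that does not know in advance whether the top level is an object or an array and uses ref to find out. It assumes the core JSON::PP module; describe_top_level is a hypothetical helper name, not part of any library:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: describe whatever decode_json returned.
sub describe_top_level {
    my ($json_text) = @_;
    my $data = decode_json($json_text);
    my $type = ref $data;    # 'HASH', 'ARRAY', or something else

    return 'object with keys: ' . join(', ', sort keys %$data) if $type eq 'HASH';
    return 'array with ' . scalar(@$data) . ' elements'        if $type eq 'ARRAY';
    return 'something else';
}

print describe_top_level('{ "a": 1, "b": 2 }'), "\n";
print describe_top_level('[ 1, 2, 3 ]'), "\n";
```

Because the result is always a single scalar, the caller can inspect it first and only then decide how to dereference it.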
I have to wonder exactly what you passed as an array to decode_json, because my results differ from yours.
#!/usr/bin/perl
use JSON qw (decode_json);
use Data::Dumper;
my $json = '["1", "2", "3", "4"]';
my $fromJSON = decode_json($json);
print Dumper($fromJSON);
The result is $VAR1 = [ '1', '2', '3', '4' ];
Which is an array ref, where your result is a hash ref
So did you pass in a hash with element items which was a reference to an array?
In my example you would get the array by doing
my @array = @{ $fromJSON };
In yours
my @array = @{ $content->{'items'} };

I don't understand why you dislike the arrow operator so much!
The decode_json function from the JSON module will always return a data reference.
Suppose you have a Perl program like this
use strict;
use warnings;
use JSON;
my $json_data = '{ "items": [ 1, 2, 3, 4 ] }';
my $content = decode_json($json_data);
use Data::Dump;
dd $content;
which outputs this text
{ items => [1 .. 4] }
showing that $content is a hash reference. Then you can access the array reference, as you found, with
dd $content->{items};
which shows
[1 .. 4]
and you can print the first element of the array by writing
print $content->{items}[0], "\n";
which, again as you have found, shows just
1
which is the first element of the array.
As @cjm mentions in a comment, it is imperative that you use strict and use warnings at the start of every Perl program. If you had those in place in the program where you tried to access $content{items}, your program would have failed to compile, and you would have seen the message
Global symbol "%content" requires explicit package name
which is a (poorly-phrased) way of telling you that there is no %content so there can be no items element.
The scalar variable $content is completely independent from the hash variable %content, which you are trying to access when you write $content{items}. %content has never been mentioned before and it is empty, so there is no items element. If you had tried @{$content->{items}} then it would have worked, as would @{${$content}{items}}.
If you really have a problem with the arrow operator, then you could write
print ${$content}{items}[0], "\n";
which produces the same output; but I don't understand what is wrong with the original version.

Related

How do I recursively search a JSON file for all nodes matching a given pattern and return the JSON 'path' to the node and its value?

Say I have this JSON in a text file:
{"widget": {
"debug": "on",
"window": {
"title": "Sample Konfabulator Widget",
"name": "main_window",
"width": 500,
"height": 500
},
"image": {
"src": "Images/Sun.png",
"name": "sun1",
"hOffset": 250,
"vOffset": 250,
"alignment": "center"
},
"text": {
"data": "Click Here",
"size": 36,
"style": "bold",
"name": "text1",
"hOffset": 250,
"vOffset": 100,
"alignment": "center",
"onMouseUp": "sun1.opacity = (sun1.opacity / 100) * 90;"
}
}}
Using Perl I have read the file into a JSON object called $json_obj using JSON::XS.
How do I search $json_obj for all nodes called name and return/print the following as the result/output:
widget->window->name: main_window
widget->image->name: sun1
widget->text->name: text1
Notes:
node names matching the search term could appear at any level of the tree
search terms could be plain text or a regular expression
I'd like to be able to supply my own branch separator to override a default of, say, ->
example / (for simplicity, I'll just put this in a perl $variable)
I would like to be able to specify multiple node levels in my search, so as to specify a path to match, for example: specifying id/colour would return all paths that contain a node called id that is also a parent with a child node called colour
displaying double quotes around the result values is optional
I want to be able to search for multiple patterns, e.g. /(name|alignment)/ for "find all nodes called name or alignment"
Example showing results of search in last note above:
widget->window->name: main_window
widget->image->name: sun1
widget->image->alignment: center
widget->text->name: text1
widget->text->alignment: center
Since JSON is mostly just text, I'm not yet sure of the benefit of even using JSON::XS so any advice on why this is better or worse is most welcome.
It goes without saying that it needs to be recursive so it can search n arbitrary levels deep.
This is what I have so far, but I'm only part way there:
#!/usr/bin/perl
use 5.14.0;
use warnings;
use strict;
use IO::File;
use JSON::XS;
my $jsonfile = '/home/usr/filename.json';
my $jsonpath = 'image/src'; # example search path
my $pathsep = '/'; # for displaying results
my $fh = IO::File->new("$jsonfile", "r");
my $jsontext = join('',$fh->getlines());
$fh->close();
my $jsonobj = JSON::XS->new->utf8->pretty;
if (defined $jsonpath) {
my $perltext = $jsonobj->decode($jsontext); # is this correct?
recurse_tree($perltext);
} else {
# print file to STDOUT
say $jsontext;
}
sub recurse_tree {
my $hash = shift @_;
foreach my $key (sort keys %{$hash}) {
if ($key eq $jsonpath) {
say "$key = $hash->{$key}"; # example output
}
if (ref $hash->{$key} eq 'HASH' ||
ref $hash->{$key} eq 'ARRAY') {
recurse_tree($hash->{$key});
}
}
}
exit;
The expected result from the above script would be:
widget/image/src: Images/Sun.png
Once that JSON is decoded, there is a complex (nested) Perl data structure that you want to search through, and the code you show is honestly aiming for that.
However, there are libraries out there which can help; either to do the job fully or to provide complete, working, and tested code that you can fine tune to the exact needs.
The module Data::Leaf::Walker seems suitable. A simple example
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
use JSON;
use List::Util qw(any);
use Data::Leaf::Walker;
my $file = shift // 'data.json'; # provided data sample
my $json_data = do { local (@ARGV, $/) = $file; <> }; # read into a string
chomp $json_data;
my $ds = decode_json $json_data;
dd $ds; say ''; # show decoded data
my $walker = Data::Leaf::Walker->new($ds);
my $sep = '->';
while ( my ($key_path, $value) = $walker->each ) {
my @keys_in_path = @$key_path;
if (any { $_ eq 'name' } @keys_in_path) { # selection criteria
say join($sep, @keys_in_path), " => $value"
}
}
This 'walker' goes through the data structure, keeping the list of keys to each leaf. This is what makes this module particularly suitable for your quest, along with its simplicity of purpose in comparison to many others. See documentation.
The above prints, for the sample data provided in the question
widget->window->name => main_window
widget->text->name => text1
widget->image->name => sun1
The implementation of the criterion for which key-paths get selected in the code above is rather simple-minded, since it checks for 'name' anywhere in the path, once, and then prints the whole path. While the question doesn't specify what to do about matches earlier in the path, or with multiple ones, this can be adjusted since we always have the full path.
The rest of your wish list is fairly straightforward to implement as well. Peruse List::Util and List::MoreUtils for help with array analysis.
Another module, that is a great starting point for possible specific needs, is Data::Traverse. It is particularly simple, at 70-odd lines of code, so very easy to customize.
Depending on your task, you might consider using jq. This output is simple, but you can get as complex as you like:
$ jq -r '.. | .image? | .src | strings' test.json
Images/Sun.png
$ jq -r '.. | .name? | strings' test.json
main_window
sun1
text1
Walking the data structure isn't that bad, although it's a bit weird the first couple of times you do it. There are various modules on CPAN that will do this for you (as zdim shows), but this is something you should probably know how to do on your own. We have some big examples in Intermediate Perl.
One way to do it is to start with a queue of things to process. This is iteration, not recursion, and depending on how you add elements to the queue, you can do either depth-first or breadth-first searches.
For each item, I'll track the path of keys to get there, and the sub-hash. That's the problem with your recursive approach: you don't allow for a way to track the path.
At the start, the queue has one item because we are at the top. I'll also define a target key, since your problem has that:
my @queue = ( { key_path => [], node => $hash } );
my $target = 'name';
Next, I process every item in the queue (the while). I expect each value of node to be a hash, so I'll get all the keys of that hash (the foreach). This represents the next level of the hash.
Inside the foreach, I make a new key path with the one that exists along with the one I'm processing. I also get the next value by using that key.
After that, I can do task specific processing. If I've found my target key, I'll do whatever I need to do. In this case I output some stuff, but I could add to a different data structure and so on. I use next to stop processing that key (although I could keep going). If I didn't find the target key, I make another entry in the queue if the value is another hash reference.
Then, I go back to processing the queue.
use v5.24; # use postfix dereferencing
while( my $h = shift @queue ) {
foreach my $next_key ( keys $h->{node}->%* ) {
my $key_path = [ $h->{key_path}->@*, $next_key ];
my $value = $h->{node}{$next_key};
if( $next_key eq $target ) {
say join( "->", $key_path->@* ), " = $value";
next;
}
elsif( ref $value eq ref {} ) {
push @queue, { key_path => $key_path, node => $value };
}
}
}
I end up with output like:
widget->text->name = text1
widget->image->name = sun1
widget->window->name = main_window
From there, you can customize this to get the other features you need. If you want to find a complex key, you just do a little more work to compare the key path to what you want.
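For comparison, here is a sketch of a recursive variant that also tracks the key path, takes a regular expression instead of a fixed key, and accepts the separator as a parameter. It assumes the core JSON::PP module and a cut-down copy of the sample data; find_keys is a hypothetical helper name, and it only reports keys whose value is a plain scalar:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: collect "path: value" strings for every plain value
# whose own key matches $pattern. Array elements are addressed by index.
sub find_keys {
    my ($node, $pattern, $sep, @path) = @_;
    my @found;
    if (ref($node) eq 'HASH') {
        for my $key (sort keys %$node) {
            my $child = $node->{$key};
            if (!ref($child) && $key =~ $pattern) {
                push @found, join($sep, @path, $key) . ": $child";
            }
            # Recurse; this returns nothing for plain values.
            push @found, find_keys($child, $pattern, $sep, @path, $key);
        }
    }
    elsif (ref($node) eq 'ARRAY') {
        for my $i (0 .. $#$node) {
            push @found, find_keys($node->[$i], $pattern, $sep, @path, $i);
        }
    }
    return @found;
}

my $data = decode_json(
    '{"widget":{"window":{"name":"main_window","width":500},'
  . '"image":{"name":"sun1","alignment":"center"}}}'
);
print "$_\n" for find_keys($data, qr/^(?:name|alignment)$/, '/');
```

Passing the path as extra arguments means each recursive call gets its own copy, so no cleanup is needed when the recursion unwinds. Matching a multi-level path such as id/colour could be done by testing the joined path against the pattern instead of the last key alone.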

Pass data from JSON to variable for comparison

I make a GET request to an API using LWP::UserAgent; the data is returned as JSON, with at most two results, as follows:
{
"status":1,
"time":1507891855,
"response":{
"prices":{
"nome1\u2122":{
"preco1":1111,
"preco2":1585,
"preco3":1099
},
"nome2":{
"preco1":519,
"preco2":731,
"preco3":491
}
}
}
}
Dump:
$VAR1 = {
'status' => 1,
'time' => 1507891855,
'response' => {
'prices' => {
'nome1' => {
'preco1' => 1111,
'preco3' => 1099,
'preco2' => 1585
},
'nome2' => {
'preco3' => 491,
'preco1' => 519,
'preco2' => 731
}
}
}
};
What I would like to do is:
Take this data and save it in a variable, then use if to compare it with another variable that already holds the name. The comparison would be against nome1 / nome2; if one of them matches the other variable, I would take preco2 and preco3 and print everything.
My biggest problem is that some of these names in the JSON contain characters like the trademark sign, which arrives as \u2122 (in some cases it is other characters), so I cannot compare against the name in the other variable, which already holds the correct name:
nome1™
If I could just save the JSON with the characters already "converted", that would help me with the rest.
Basically, after making the API request, I want to save the contents in a variable with every \u2122 already converted to its respective character (this is the part I do not know how to do in Perl), and then use another variable to check whether the names are equal and show the price.
Thanks for the help; if anything is unclear, tell me and I will try to explain it another way.
If I understand correctly, you need to get the JSON that you receive in UTF8 format to an internal variable that you can process. For that, you may use JSON::XS:
use utf8;
use JSON::XS;
my $name = "nome1™";
my $var1 = decode_json $utf8_encoded_json_text;
# Compare with name in $name
if( defined $var1->{'response'}->{'prices'}->{$name} ) {
# Do something with the name that matches
my $match = $var1->{'response'}->{'prices'}->{$name};
print $match->{'preco1'}, "\n";
}
Make sure you tell the Perl interpreter that your source is in UTF8 by specifying use utf8; at the beginning of the script. Then make sure you are editing the script with an editor that supports that format.
The function decode_json will return a ref to the converted value. In this case a hash ref. From there you work your way into the JSON.
If you know $name is going to be in the JSON you may omit the defined part. Otherwise, the defined clause will tell you whether the hash value is there. Once you know, you may do something with it. If the hash keys are a single word with no special characters, you may use $var1->{response}->{prices}->{$name}, but it is always safer to use $var1->{'response'}->{'prices'}->{$name}. Perl gets a bit ugly handling hash refs...
By the way, in JSON::XS you will also find the encode_json function to do the opposite and also an object oriented interface.
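To see that the \u2122 escape needs no separate conversion step, here is a small sketch. It assumes the core JSON::PP module (its decode_json behaves like the JSON::XS one here) and a cut-down copy of the response; the name is written as "nome1\x{2122}" so the example does not depend on how the script file itself is encoded:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Single quotes keep the backslash, so the JSON text really contains \u2122.
my $json_text = '{"prices":{"nome1\u2122":{"preco2":1585,"preco3":1099}}}';
my $data      = decode_json($json_text);

my $name  = "nome1\x{2122}";              # "nome1" followed by the trademark sign
my ($key) = keys %{ $data->{prices} };    # the only key in this sample

print "keys match\n" if $key eq $name;
print "preco2 = $data->{prices}{$name}{preco2}\n"
    if exists $data->{prices}{$name};
```

The decoder turns the escape into the actual character while parsing, so the hash key can be compared directly with a name that already contains that character.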

Perl, JSON parsing values incorrectly

I'm parsing a JSON string that is stored in a database.
{"name":"simon", "age":"23", "height":"tall"}
I'm pulling the data, then decoding. When running the code below, I'm receiving weird 'HASH' values back.
use JSON;
$data = decode_json($row->{'address'});
for my $key (keys %$data){
if($data->{$key} ne ''){
$XML .= " <$key>$data->{$key}</$key>";
}
}
# Returns data like so
<company_type>HASH(0x27dbac0)</company_type>
<county>HASH(0x27db7c0)</county>
<address1>HASH(0x27dba90)</address1>
<company_name>HASH(0x27db808)</company_name>
The Error happens when I have a data set like so:
{"name":"", "age":{}, "height":{}}
I don't understand why JSON / Arrays / Hashes have to be so difficult to work with in Perl. What point am I missing?
You are processing a flat hash, while your data in fact has another, nested, hashref. In the line
{ "name":"", "age":{}, "height":{} }
the {} may be intended to mean "nothing" but is in fact a JSON "object", the next level of nested data (which is indeed empty). In Perl we get a hashref for it and that's what your code prints.
The other pillar of JSON is an "array" and in Perl we get an arrayref. And that's that -- decode_json gives us back the top-level hashref, which when dereferenced into a hash may contain further hash or array references as values. If you print the whole structure with Data::Dumper you'll see that.
To negotiate this we have to test each time for a reference. Since a dereferenced hash or array may contain yet further levels (more references), we need to use either a recursive routine (see this post for an example) or a module for complex data structures. But for the first level
for my $key (keys %$data)
{
next if $data->{$key} eq '';
my $ref_type = ref $data->{$key};
# if $data->{key} is not a reference ref() returns an empty string (false)
if (not $ref_type) {
$XML .= " <$key>$data->{$key}</$key>";
}
elsif ($ref_type eq 'HASH') {
# hashref, unpack and parse. it may contain references
say "$_ => $data->{$key}{$_}" for keys %{ $data->{$key} };
}
elsif ($ref_type eq 'ARRAY') {
# arrayref, unpack and parse. it may contain references
say "@{$data->{$key}}";
}
else { say "Reference is to type: $ref_type" }
}
If the argument of ref is not a reference (but a string or a number) ref returns an empty string, which evaluates as false, which is when you have plain data. Otherwise it returns the type the reference is to. Coming from JSON it can be either a HASH or an ARRAY. This is how nesting is accomplished.
In the example shown you are running into a hashref. Since the ones you show are empty, you can just discard them, and the code for the specific example can reduce greatly, to one statement. However, I'd leave the other tests in place. This should also work as it stands with the posted example.
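For deeper nesting, the recursive routine mentioned above might look roughly like this. It is a sketch only, assuming the core JSON::PP module and made-up sample data; to_xml is a hypothetical helper, and it does no XML escaping of names or values:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: turn one decoded value into simple XML-like text.
# Empty strings, nulls, and empty objects produce no output at all.
sub to_xml {
    my ($name, $value) = @_;
    my $type = ref $value;
    if ($type eq 'HASH') {
        my $inner = join '', map { to_xml($_, $value->{$_}) } sort keys %$value;
        return length($inner) ? "<$name>$inner</$name>" : '';
    }
    if ($type eq 'ARRAY') {
        # Repeat the element name once per array entry.
        return join '', map { to_xml($name, $_) } @$value;
    }
    return '' if !defined $value || $value eq '';
    return "<$name>$value</$name>";
}

my $data = decode_json(
    '{"name":"simon","age":{},"address":{"county":"Kent","zip":""}}'
);
my $xml = join '', map { to_xml($_, $data->{$_}) } sort keys %$data;
print "$xml\n";
```

Each call checks ref first, so a nested object never ends up stringified as HASH(0x...).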

Convert json to array using Perl

I have a chunk of json that has the following format:
{"page":{"size":7,"number":1,"totalPages":1,"totalElements":7,"resultSetId":null,"duration":0},"content":[{"id":"787edc99-e94f-4132-b596-d04fc56596f9","name":"Verification","attributes":{"ruleExecutionClass":"VerificationRule"},"userTags":[],"links":[{"rel":"self","href":"/endpoint/787edc99-e94f-4132-b596-d04fc56596f9","id":"787edc99-e94f-...
Basically the size attribute (in this case) tells me that there are 7 parts to the content section. How do I convert this chunk of json to an array in Perl, and can I do it using the size attribute? Or is there a simpler way like just using decode_json()?
Here is what I have so far:
my $resources = get_that_json_chunk(); # function returns exactly the json you see, except all 7 resources in the content section
my @decoded_json = @$resources;
foreach my $resource (@decoded_json) {
I've also tried something like this:
my $deserialize = from_json( $resources );
my @decoded_json = (@{$deserialize});
I want to iterate over the array and handle the data. I've tried a few different ways because I read a little about array refs, but I keep getting "Not an ARRAY reference" errors and "Can't use string ("{"page":{"size":7,"number":1,"to"...) as an ARRAY ref while "strict refs" in use"
Thank you to Matt Jacob:
my $deserialized = decode_json($resources);
print "$_->{id}\n" for @{$deserialized->{content}};

Encoding an array of hashes in Perl

I'm trying to do something that seems to be very simple, but I can't figure out how to do it in Perl : I want to output a JSON-formatted array of hashes.
The array of hashes in question is actually an array of DBIx::MyParse Items object instances. Here is my code :
use strict;
use DBIx::MyParse;
use JSON::PP;
my $json = JSON::PP->new->ascii->pretty->allow_nonref;
our $parser = DBIx::MyParse->new( database => "test", datadir => "/tmp/myparse" );
our $query = $parser->parse("UPDATE table1 SET field1 = 1;");
$json->convert_blessed(1);
print $json->encode(@{$query} );
And this is what this script outputs :
"SQLCOM_UPDATE"
Which is actually the first element of the array that I want to output as a whole. Here is the content of the array that I see when I step-by-step debug the script :
I would like to have the whole structure in my JSON output. How can I achieve this ?
JSON::encode just expects a single argument, not a list. Use $json->encode( $query ).
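A minimal sketch of the difference, using a plain array of hashes rather than a DBIx::MyParse object (the data and the variable names are made up for illustration):

```perl
use strict;
use warnings;
use JSON::PP;

my $json  = JSON::PP->new->canonical;   # canonical: sorted keys, stable output
my @items = ( { id => 1 }, { id => 2 } );

# Pass one reference to the array, not the flattened list of its elements.
my $text = $json->encode( \@items );
print "$text\n";
```

Writing $json->encode(@items) would flatten the array into separate arguments, and only the first one would be encoded, which matches the single "SQLCOM_UPDATE" seen in the question.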