Combining prisms when extracting JSON fields with lens-aeson

I have a JSON blob similar to the following:
[
{
"version": 1
},
{
"version": "3"
},
...
]
Note that some of the versions are numbers and some are strings.
I want to get a list of versions.
I can use the following lens combination to extract the numeric versions:
v1 :: [String]
v1 = obj ^.. AL.values . AL.key fieldName . AL._Number . to show
And the following to extract the strings
v2 :: [String]
v2 = obj ^.. AL.values . AL.key fieldName . AL._String . to T.unpack
But, how can I get a list of versions by a single pass over the list?
Is there any lens combinator that takes the optics AL._Number . to show and AL._String . to T.unpack and returns a combined getter, so that if the first one fails it tries the second one? Something like msum for lenses?

There is in fact a combinator that tries an optic and goes to a backup if the first one fails. It's called failing.
Note that the condition on it should be satisfied by the case you describe. Even if it wasn't, the combinator would still function, it would just behave irregularly when refactoring. (Which is the main problem with using filtered as a Traversal.)

Before Carl's answer, which is what you should use, I was going to suggest outside as a way to perform case analysis with those prisms:
tryNumberThenString :: AL.AsPrimitive t => t -> [String]
tryNumberThenString =
    outside AL._Number .~ (:[]) . show $
    outside AL._String .~ (:[]) . T.unpack $
    const []
v1 = obj ^.. AL.values . AL.key fieldName . folding tryNumberThenString
Note that, unless there is some other trick I am missing, this is not only more complicated than what Carl suggests but also less flexible -- I can only get a Fold from the plain function tryNumberThenString, while failing combines the prisms into a Traversal.


How do I recursively search a JSON file for all nodes matching a given pattern and return the JSON 'path' to the node and its value?

Say I have this JSON in a text file:
{"widget": {
"debug": "on",
"window": {
"title": "Sample Konfabulator Widget",
"name": "main_window",
"width": 500,
"height": 500
},
"image": {
"src": "Images/Sun.png",
"name": "sun1",
"hOffset": 250,
"vOffset": 250,
"alignment": "center"
},
"text": {
"data": "Click Here",
"size": 36,
"style": "bold",
"name": "text1",
"hOffset": 250,
"vOffset": 100,
"alignment": "center",
"onMouseUp": "sun1.opacity = (sun1.opacity / 100) * 90;"
}
}}
Using Perl I have read the file into a JSON object called $json_obj using JSON::XS.
How do I search $json_obj for all nodes called name and return/print the following as the result/output:
widget->window->name: main_window
widget->image->name: sun1
widget->text->name: text1
Notes:
node names matching the search term could appear at any level of the tree
search terms could be plain text or a regular expression
I'd like to be able to supply my own branch separator to override a default of, say, ->
example / (for simplicity, I'll just put this in a perl $variable)
I would like to be able to specify multiple node levels in my search, so as to specify a path to match, for example: specifying id/colour would return all paths that contain a node called id that is also a parent with a child node called colour
displaying double quotes around the result values is optional
I want to be able to search for multiple patterns, e.g. /(name|alignment)/ for "find all nodes called name or alignment"
Example showing results of search in last note above:
widget->window->name: main_window
widget->image->name: sun1
widget->image->alignment: center
widget->text->name: text1
widget->text->alignment: center
Since JSON is mostly just text, I'm not yet sure of the benefit of even using JSON::XS so any advice on why this is better or worse is most welcome.
It goes without saying that it needs to be recursive so it can search n arbitrary levels deep.
This is what I have so far, but I'm only part way there:
#!/usr/bin/perl
use 5.14.0;
use warnings;
use strict;
use IO::File;
use JSON::XS;
my $jsonfile = '/home/usr/filename.json';
my $jsonpath = 'image/src'; # example search path
my $pathsep = '/'; # for displaying results
my $fh = IO::File->new("$jsonfile", "r");
my $jsontext = join('',$fh->getlines());
$fh->close();
my $jsonobj = JSON::XS->new->utf8->pretty;
if (defined $jsonpath) {
my $perltext = $jsonobj->decode($jsontext); # is this correct?
recurse_tree($perltext);
} else {
# print file to STDOUT
say $jsontext;
}
sub recurse_tree {
my $hash = shift @_;
foreach my $key (sort keys %{$hash}) {
if ($key eq $jsonpath) {
say "$key = %{$hash}{$key} \n"; # example output
}
if (ref $hash->{$key} eq 'HASH' ||
ref $hash->{$key} eq 'ARRAY') {
recurse_tree($hash->{$key});
}
}
}
exit;
The expected result from the above script would be:
widget/image/src: Images/Sun.png
Once that JSON is decoded, there is a complex (nested) Perl data structure that you want to search through, and the code you show is honestly aiming for that.
However, there are libraries out there which can help; either to do the job fully or to provide complete, working, and tested code that you can fine tune to the exact needs.
The module Data::Leaf::Walker seems suitable. A simple example
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
use JSON;
use List::Util qw(any);
use Data::Leaf::Walker;
my $file = shift // 'data.json'; # provided data sample
my $json_data = do { local (@ARGV, $/) = $file; <> }; # read into a string
chomp $json_data;
my $ds = decode_json $json_data;
dd $ds; say ''; # show decoded data
my $walker = Data::Leaf::Walker->new($ds);
my $sep = '->';
while ( my ($key_path, $value) = $walker->each ) {
    my @keys_in_path = @$key_path;
    if (any { $_ eq 'name' } @keys_in_path) { # selection criteria
        say join($sep, @keys_in_path), " => $value"
    }
}
This 'walker' goes through the data structure, keeping the list of keys to each leaf. This is what makes this module particularly suitable for your quest, along with its simplicity of purpose in comparison to many others. See documentation.
The above prints, for the sample data provided in the question
widget->window->name => main_window
widget->text->name => text1
widget->image->name => sun1
The implementation of the criterion for which key-paths get selected in the code above is rather simple-minded, since it checks for 'name' anywhere in the path, once, and then prints the whole path. While the question doesn't specify what to do about matches earlier in the path, or with multiple ones, this can be adjusted since we always have the full path.
The rest of your wish list is fairly straightforward to implement as well. Peruse List::Util and List::MoreUtils for help with array analysis.
Another module, that is a great starting point for possible specific needs, is Data::Traverse. It is particularly simple, at 70-odd lines of code, so very easy to customize.
Depending on your task, you might consider using jq. This output is simple, but you can get as complex as you like:
$ jq -r '.. | .image? | .src | strings' test.json
Images/Sun.png
$ jq -r '.. | .name? | strings' test.json
main_window
sun1
text1
Walking the data structure isn't that bad, although it's a bit weird the first couple of times you do it. There are various modules on CPAN that will do it for you (as zdim shows), but this is something you should probably know how to do on your own. We have some big examples in Intermediate Perl.
One way to do it is to start with a queue of things to process. This is iteration, not recursion, and depending on how you add elements to the queue, you can do either depth-first or breadth-first searches.
For each item, I'll track the path of keys to get there, and the sub-hash. That's the problem with your recursive approach: you don't allow for a way to track the path.
At the start, the queue has one item because we are at the top. I'll also define a target key, since your problem has that:
my @queue = ( { key_path => [], node => $hash } );
my $target = 'name';
Next, I process every item in the queue (the while). I expect each value of node to be a hash, so I'll get all the keys of that hash (the foreach). This represents the next level of the hash.
Inside the foreach, I make a new key path with the one that exists along with the one I'm processing. I also get the next value by using that key.
After that, I can do task specific processing. If I've found my target key, I'll do whatever I need to do. In this case I output some stuff, but I could add to a different data structure and so on. I use next to stop processing that key (although I could keep going). If I didn't find the target key, I make another entry in the queue if the value is another hash reference.
Then, I go back to processing the queue.
use v5.24; # use postfix dereferencing
while( my $h = shift @queue ) {
    foreach my $next_key ( keys $h->{node}->%* ) {
        my $key_path = [ $h->{key_path}->@*, $next_key ];
        my $value = $h->{node}{$next_key};
        if( $next_key eq $target ) {
            say join( "->", $key_path->@* ), " = $value";
            next;
        }
        elsif( ref $value eq ref {} ) {
            push @queue, { key_path => $key_path, node => $value };
        }
    }
}
I end up with output like:
widget->text->name = text1
widget->image->name = sun1
widget->window->name = main_window
From there, you can customize this to get the other features you need. If you want to find a complex key, you just do a little more work to compare the key path to what you want.
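For comparison, the same queue-based walk can be sketched in Python, with two items from the question's wish list added: keys are matched against a regular expression, and the separator is configurable. This is only a sketch under assumptions: the name find_keys, its sep parameter, matching whole key names only, and descending into arrays by index are illustrative choices, not something taken from the answers above.

```python
import re
from collections import deque

def find_keys(data, pattern, sep="->"):
    """Return 'path: value' lines for every leaf whose key fully matches pattern."""
    key_re = re.compile(pattern)
    results = []
    queue = deque([([], data)])  # each entry: (key path so far, node)
    while queue:
        path, node = queue.popleft()  # popleft gives a breadth-first walk
        if isinstance(node, dict):
            children = node.items()
        elif isinstance(node, list):
            children = enumerate(node)  # array elements are addressed by index
        else:
            continue  # a leaf whose key did not match: nothing left to do
        for key, value in children:
            child_path = path + [str(key)]
            is_leaf = not isinstance(value, (dict, list))
            if isinstance(key, str) and key_re.fullmatch(key) and is_leaf:
                results.append(f"{sep.join(child_path)}: {value}")
            else:
                queue.append((child_path, value))
    return results

# A trimmed copy of the sample data from the question.
doc = {"widget": {
    "debug": "on",
    "window": {"title": "Sample Konfabulator Widget", "name": "main_window"},
    "image": {"src": "Images/Sun.png", "name": "sun1", "alignment": "center"},
    "text": {"data": "Click Here", "name": "text1", "alignment": "center"},
}}

for line in find_keys(doc, r"name|alignment"):
    print(line)
```

On this trimmed sample the loop prints the five name and alignment lines listed in the question, in document order, because Python dictionaries keep insertion order. Swapping popleft() for pop() would turn the walk into a depth-first one.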

TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U57') dtype('<U57') dtype('<U57')

I am using Great Expectations for pipeline testing.
I have one DataFrame batch of type:
great_expectations.dataset.pandas_dataset.PandasDataset
I want to build validation expressions dynamically,
i.e.
batch.validationtype("columnname","value"), in which
validationtype, columnname and value come from a JSON file.
JSON structure:
{
"column_name": "sex",
"validation_type": "expect_column_values_to_be_in_set",
"validation_value": ["MALE","FEMALE"]
},
When I build this expression, I get the error message shown below.
Code:
def add_validation(self, batch, validation_list):
    for d in validation_list:
        expression = "." + d["validation_type"] + "(" + d["column_name"] + "," + str(d["validation_value"]) + ")"
        print(expression)
        batch+expression
    batch.save_expectation_suite(discard_failed_expectations=False)
    return batch
Output of the print statement:
.expect_column_values_to_be_in_set(sex,['MALE','FEMALE'])
Error:
TypeError: ufunc 'add' did not contain a loop with signature matching types dtype('<U57') dtype('<U57') dtype('<U57')
In great_expectations, the expectation_suite object is designed to capture all of the information necessary to evaluate an expectation. So, in your case, the most natural thing to do would be to translate the source json file you have into the great_expectations expectation suite format.
The best way to do that will depend on where you're getting the original JSON structure from -- you'd ideally want to do the translation as early as possible (maybe even before creating that source JSON?) and keep the expectations in the GE format.
For example, if all of the expectations you have are of the type expect_column_values_to_be_in_set, you could do a direct translation:
expectations = []
for d in validation_list:
    expectation_config = {
        "expectation_type": d["validation_type"],
        "kwargs": {
            "column": d["column_name"],
            "value_set": d["validation_value"]
        }
    }
    # collect each config; without this the suite below would stay empty
    expectations.append(expectation_config)
expectation_suite = {
    "expectation_suite_name": "my_suite",
    "expectations": expectations
}
On the other hand, if you are working with a variety of different expectations, you would also need to make sure that the validation_value in your JSON gets mapped to the right kwargs for the expectation (for example, if you use expect_column_values_to_be_between then you actually need to provide min_value and/or max_value).
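One way to handle that mapping is a small lookup table from expectation type to a function that builds the kwargs. The sketch below is plain Python and does not import great_expectations. The names KWARG_BUILDERS and to_expectation_config, the "age" entry, and the convention that a "between" validation_value is a [min, max] pair are all assumptions made for illustration.

```python
# Hypothetical lookup table: how "validation_value" maps onto each expectation's kwargs.
KWARG_BUILDERS = {
    "expect_column_values_to_be_in_set": lambda v: {"value_set": v},
    # Assumed convention: validation_value is a [min, max] pair.
    "expect_column_values_to_be_between": lambda v: {"min_value": v[0], "max_value": v[1]},
}

def to_expectation_config(d):
    """Translate one entry of the source JSON into an expectation config dict."""
    build = KWARG_BUILDERS[d["validation_type"]]  # raises KeyError for unmapped types
    kwargs = {"column": d["column_name"]}
    kwargs.update(build(d["validation_value"]))
    return {"expectation_type": d["validation_type"], "kwargs": kwargs}

validation_list = [
    {"column_name": "sex",
     "validation_type": "expect_column_values_to_be_in_set",
     "validation_value": ["MALE", "FEMALE"]},
    # Made-up second entry to exercise the "between" mapping.
    {"column_name": "age",
     "validation_type": "expect_column_values_to_be_between",
     "validation_value": [0, 120]},
]

expectations = [to_expectation_config(d) for d in validation_list]
expectation_suite = {
    "expectation_suite_name": "my_suite",
    "expectations": expectations,
}
```

Failing loudly on an unmapped validation_type is deliberate here: a silently skipped expectation is harder to notice than a KeyError.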

How do I search for a string in this JSON with Python

My JSON file looks something like:
{
"generator": {
"name": "Xfer Records Serum",
....
},
"generator": {
"name": "Lennar Digital Sylenth1",
....
}
}
I ask the user for a search term, and the input is searched for in the name key only. All matching results are returned, so if I input only 's', both of the above would be returned. Also, please explain how to return all the object names which are generators. The simpler the method, the better for me. I use the json library, but if another library is required that is not a problem.
Before switching to JSON I tried XML but it did not work.
If your goal is just to search all name properties, this will do the trick:
import re
def search_names(term, lines):
    # raw strings keep the backslashes intact; re.escape treats the term as literal text
    name_search = re.compile(r'\s*"name"\s*:\s*"(.*' + re.escape(term) + r'.*)",?$', re.I)
    return [x.group(1) for x in [name_search.search(y) for y in lines] if x]
with open('path/to/your.json') as f:
    lines = f.readlines()
print(search_names('s', lines))
which would return both names you listed in your example.
The way the search_names() function works is it builds a regular expression that will match any line starting with "name": " (with varying amount of whitespace) followed by your search term with any other characters around it then terminated with " followed by an optional , and the end of string. Then applies that to each line from the file. Finally it filters out any non-matching lines and returns the value of the name property (the capture group contents) for each match.
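The line-by-line regex relies on each "name" property sitting on its own line. If that cannot be assumed, a sketch that parses the file with the json library may be safer. One wrinkle is that the sample repeats the "generator" key, and a plain json.loads keeps only the last duplicate; the object_pairs_hook parameter sees every key/value pair before that happens. The function name search_names_parsed and the inline sample are assumptions for illustration.

```python
import json

def search_names_parsed(term, text):
    """Return every "name" value that contains term, ignoring case."""
    matches = []

    def collect(pairs):
        # json calls this hook once per JSON object with all of its
        # (key, value) pairs, so duplicate keys are still visible here.
        for key, value in pairs:
            if key == "name" and isinstance(value, str) and term.lower() in value.lower():
                matches.append(value)
        return dict(pairs)

    json.loads(text, object_pairs_hook=collect)
    return matches

# Cut-down version of the sample from the question (elided fields left out).
sample = '''
{
  "generator": {"name": "Xfer Records Serum"},
  "generator": {"name": "Lennar Digital Sylenth1"}
}
'''

print(search_names_parsed("s", sample))
```

Calling it with an empty term returns every name, which covers the request to list all generators in this sample.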

How can I improve the ease of working with JSON in Haskell?

Haskell has become useful as a web language (thanks Servant!), and yet JSON is still so painful for me so I must be doing something wrong (?)
I hear JSON mentioned as a pain point enough, and the responses I've heard revolve around "use PureScript", "wait for Sub/Row Typing", "use esoterica, like Vinyl", "Aeson + just deal with the explosion of boiler plate data types".
As an (unfair) reference point, I really enjoy the ease of Clojure's JSON "story" (of course, it's a dynamic language, and has its tradeoffs for which I still prefer Haskell).
Here's an example I've been staring at for an hour.
{
"access_token": "xxx",
"batch": [
{"method":"GET", "name":"oldmsg", "relative_url": "<MESSAGE-ID>?fields=from,message,id"},
{"method":"GET", "name":"imp", "relative_url": "{result=oldmsg:$.from.id}?fields=impersonate_token"},
{"method":"POST", "name":"newmsg", "relative_url": "<GROUP-ID>/feed?access_token={result=imp:$.impersonate_token}", "body":"message={result=oldmsg:$.message}"},
{"method":"POST", "name":"oldcomment", "relative_url": "{result=oldmsg:$.id}/comments", "body":"message=Post moved to https://workplace.facebook.com/{result=newmsg:$.id}"},
{"method":"POST", "name":"newcomment", "relative_url": "{result=newmsg:$.id}/comments", "body":"message=Post moved from https://workplace.facebook.com/{result=oldmsg:$.id}"}
]
}
I need to POST this to FB workplace, which will copy a message to a new group, and comment a link on both, linking to each other.
My first attempt looked something like:
data BatchReq = BatchReq {
    method :: Text
  , name :: Text
  , relativeUrl :: Text
  , body :: Maybe Text
  }
data BatchReqs = BatchReqs {
    accessToken :: Text
  , batch :: [BatchReq]
  }
softMove tok msgId= BatchReqs tok [
    BatchReq "GET" "oldmsg" (msgId `append` "?fields=from,message,id") Nothing
    ...
  ]
That's painfully rigid, and dealing with Maybes all over is uncomfortable. Is Nothing a JSON null? Or should the field be absent? Then I worried about deriving the Aeson instances, and had to figure out how to convert eg relativeUrl to relative_url. Then I added an endpoint, and now I have name clashes. DuplicateRecordFields! But wait, that causes so many problems elsewhere. So update the data type to use eg batchReqRelativeUrl, and peel that off when deriving instances using Typeables and Proxys. Then I needed to add endpoints, and or massage the shape of that rigid data type for which I added more datapoints, trying to not let the "tyranny of small differences" bloat my data types too much.
At this point, I was largely consuming JSON, so decided a "dynamic" thing would be to use lenses. So, to drill into a JSON field holding a group id I did:
filteredBy :: (Choice p, Applicative f) => (a -> Bool) -> Getting (Data.Monoid.First a) s a -> Optic' p f s s
filteredBy cond lens = filtered (\x -> maybe False cond (x ^? lens))
-- the group to which to move the message
groupId :: AsValue s => s -> AppM Text
groupId json = maybe (error500 "couldn't find group id in json.")
                 pure (json ^? l)
  where l = changeValue . key "message_tags" . values . filteredBy (== "group") (key "type") . key "id" . _String
That's rather heavy to access fields. But I also need to generate payloads, and I'm not skilled enough to see how lenses will be nice for that. Circling around to the motivating batch request, I've come up with a "dynamic" way of writing these payloads. It could be simplified with helper fns, but, I'm not even sure how much nicer it'll get with that.
softMove :: Text -> Text -> Text -> Value
softMove accessToken msgId groupId = object [
    "access_token" .= accessToken
  , "batch" .= [
        object ["method" .= String "GET", "name" .= String "oldmsg", "relative_url" .= String (msgId `append` "?fields=from,message,id")]
      , object ["method" .= String "GET", "name" .= String "imp", "relative_url" .= String "{result=oldmsg:$.from.id}?fields=impersonate_token"]
      , object ["method" .= String "POST", "name" .= String "newmsg", "relative_url" .= String (groupId `append` "/feed?access_token={result=imp:$.impersonate_token}"), "body" .= String "message={result=oldmsg:$.message}"]
      , object ["method" .= String "POST", "name" .= String "oldcomment", "relative_url" .= String "{result=oldmsg:$.id}/comments", "body" .= String "message=Post moved to https://workplace.facebook.com/{result=newmsg:$.id}"]
      , object ["method" .= String "POST", "name" .= String "newcomment", "relative_url" .= String "{result=newmsg:$.id}/comments", "body" .= String "message=Post moved from https://workplace.facebook.com/{result=oldmsg:$.id}"]
      ]
  ]
I'm considering having JSON blobs in code or reading them in as files and using Text.Printf to splice in variables...
I mean, I can do it all like this, but would sure appreciate finding an alternative. FB's API is a bit unique in that it can't be represented as a rigid data structure like a lot of REST APIs; they call it their Graph API which is quite a bit more dynamic in use, and treating it like a rigid API has been painful thus far.
(Also, thanks to all the community help getting me this far with Haskell!)
Update: Added some comments on the "dynamic strategy" at the bottom.
In similar situations, I've used single-character helpers to good effect:
json1 :: Value
json1 = o[ "batch" .=
           [ o[ "method" .= s"GET", "name" .= s"oldmsg",
                "url" .= s"..." ]
           , o[ "method" .= s"POST", "name" .= s"newmsg",
                "url" .= s"...", "body" .= s"..." ]
           ]
         ]
  where o = object
        s = String
Note that the non-standard syntax (no space between one-character helper and argument) is intentional. It's a signal to me and others reading my code that these are technical "annotations" to satisfy the type checker rather than a more usual kind of function call that's actually doing something.
While this adds a little clutter, the annotations are easy to ignore while reading the code. They're also easy to forget while writing code, but the type checker catches those, so they're easy to fix.
In your particular case, I think some more structured helpers do make sense. Something like:
softMove :: Text -> Text -> Text -> Value
softMove accessToken msgId groupId = object [
    "access_token" .= accessToken
  , "batch" .= [
        get "oldmsg" (msgId <> "?fields=from,message,id")
      , get "imp" "{result=oldmsg:$.from.id}?fields=impersonate_token"
      , post "newmsg" (groupId <> "...") "..."
      , post "oldcomment" "{result=oldmsg:$.id}/comments" "..."
      , post "newcomment" "{result=newmsg:$.id}/comments" "..."
      ]
  ]
  where get name url = object $ req "GET" name url
        post name url body = object $ req "POST" name url
                               <> ["body" .= s body]
        req method name url = [ "method" .= s method, "name" .= s name,
                                "relative_url" .= s url ]
        s = String
Note that you can tailor these helpers to the specific JSON you're generating in a particular case and define them locally in a where clause. You don't need to commit to some big chunk of ADT and function infrastructure that covers all JSON use-cases in your code, as you might do if the JSON was more unified in structure across the application.
Comments on the "Dynamic Strategy"
With respect to whether or not using a "dynamic strategy" is the right approach, it probably depends on more context than can realistically be shared in a Stack Overflow question. But, taking a step back, the Haskell type system is useful to the extent that it helps clearly model the problem domain. At its best, the types feel natural and assist you with writing correct code. When they stop doing this, you need to rethink your types.
The pain you encountered with a more traditional ADT-driven approach to this problem (rigidity of the types, proliferation of Maybes, and the "tyranny of small differences") suggests that these types were a bad model at least for what you were trying to do in this case. In particular, given that your problem was one of generating fairly straightforward JSON directives/commands for an external API, rather than doing lots of data manipulation on structures that also happened to allow JSON serialization/deserialization, modeling the data as Haskell ADTs was probably overkill.
My best guess is that, if you really wanted to properly model the FB Workplace API, you wouldn't want to do it at the JSON level. Instead, you'd do it at a higher level of abstraction with Message, Comment, and Group types, and you'd end up wanting to generate the JSON dynamically anyway, because your types wouldn't directly map to the JSON structures expected by the API.
It might be insightful to compare your problem to generating HTML. Consider first the lucid (blaze-based) or shakespeare templating packages. If you look at how these work, they don't try to build up HTML by generating a DOM with ADTs like data Element = ImgElement ... | BlockquoteElement ... and then serializing them to HTML. Presumably the authors decided that this abstraction wasn't really necessary, because the HTML just needs to be generated, not analyzed. Instead they use functions (lucid) or a quasiquoter (shakespeare) to build up a dynamic data structure representing an HTML document. The chosen structure is rigid enough to ensure certain sorts of validity (e.g., proper matching of opening and closing element tags) but not others (e.g., no one stops you from sticking a <p> child in the middle of your <span> element).
When you use these packages in a larger web app, you model the problem domain at a higher level of abstraction than HTML elements, and you generate the HTML in a largely dynamic fashion because there's not a clear one-to-one mapping between the types in your problem domain model and HTML elements.
On the other hand, there's a type-of-html package that does model individual elements, so it's a type error to try to nest a <tr> inside <td> and so on. Developing these types probably took a lot of work, and there's a lot of inflexibility "baked in", but the trade-off is a whole other level of type safety. Then again, this seems easier to do for HTML than for a particular finicky JSON API.

Decoding and using JSON data in Perl

I am confused about accessing the contents of some JSON data that I have decoded. Here is an example
I don't understand why this solution works and my own does not. My questions are rephrased below
my $json_raw = getJSON();
my $content = decode_json($json_raw);
print Dumper($content);
At this point my JSON data has been transformed into this
$VAR1 = { 'items' => [ 1, 2, 3, 4 ] };
My guess tells me that, once decoded, the object will be a hash with one element that has the key items and an array reference as the value.
$content{'items'}[0]
where $content{'items'} would obtain the array reference, and the outer $...[0] would access the first element in the array and interpret it as a scalar. However this does not work. I get an error message use of uninitialized value [...]
However, the following does work:
$content->{items}[0]
where $content->{items} yields the array reference and [0] accesses the first element of that array.
Questions
Why does $content{'items'} not return an array reference? I even tried @{content{'items'}}, thinking that, once I got the value from content{'items'}, it would need to be interpreted as an array. But still, I receive the uninitialized array reference.
How can I access the array reference without using the arrow operator?
A beginner's answer to a beginner :) It is surely not as professional as it should be, but maybe it helps you.
use strict; #use this all times
use warnings; #this too - helps a lot!
use JSON;
my $json_str = ' { "items" : [ 1, 2, 3, 4 ] } ';
my $content = decode_json($json_str);
You wrote:
My guess tells me that, once decoded, the object will be a hash with
one element that has the key items and an array reference as the value.
Yes, it is a hash, but decode_json returns a reference, in this case a reference to a hash. (from the docs)
expects an UTF-8 (binary) string and tries to parse that
as an UTF-8 encoded JSON text,
returning the resulting reference.
In the line
my $content = decode_json($json_str);
you are assigning to a SCALAR variable (not to a hash).
Because you know it is a reference, you can do the following:
printf "reftype:%s\n", ref($content);
#prints: reftype:HASH      ^
#therefore the             +------- is a SCALAR value containing a reference to hash
It is a hashref - you can dump all keys
print "key: $_\n" for keys %{$content}; #or in short %$content
#prints: key: items
You can also assign the value of "items" (an arrayref) to a scalar variable
my $aref = $content->{items}; #$hashref->{key}
#or
#my $aref = ${$content}{items}; #$hash{key}
but NOT
#my $aref = $content{items}; #throws error if "use strict;"
#Global symbol "%content" requires explicit package name at script.pl line 20.
The expression $content{items} requests a value from the hash %content, and you never defined or assigned such a variable. $content is a scalar variable, not the hash variable %content.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "key-postderef: $_\n" for keys $content->%*;
}
Now step deeper - to the arrayref - again you can print out the reference type
printf "reftype:%s\n", ref($aref);
#reftype:ARRAY
print all elements of array
print "arr-item: $_\n" for @{$aref};
but again NOT
#print "$_\n" for @aref;
#dies: Global symbol "@aref" requires explicit package name at script.pl line 37.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "aref-postderef: $_\n" for $aref->@*;
}
Here is a simple rule:
my @arr; #array variable
my $arr_ref = \@arr; #scalar - containing a reference to @arr
@{$arr_ref} is the same as @arr
 ^^^^^^^^^^ - array reference in curly brackets
If you have an $arr_ref - use the @{$arr_ref} everywhere you want to use the array.
my %hash; #hash variable
my $hash_ref = \%hash; #scalar - containing a reference to %hash
%{$hash_ref} is the same as %hash
 ^^^^^^^^^^^ - hash reference in curly brackets
If you have a $hash_ref - use the %{$hash_ref} everywhere you want to use the hash.
For the whole structure, the following
say $content->{items}->[0];
say $content->{items}[0];
say ${$content}{items}->[0];
say ${$content}{items}[0];
say ${$content->{items}}[0];
say ${${$content}{items}}[0];
prints the same value 1.
$content is a hash reference, so you need to dereference it to reach its contents, most simply with the arrow operator. $content{items} would refer to a %content hash, which you don't have. That's where you're getting that "use of uninitialized value" error from.
I actually asked a similar question here
The answer:
In Perl, a function can only really return a scalar or a list.
Since hashes can be initialized or assigned from lists (e.g. %foo = (a => 1, b => 2)), I guess you're asking why json_decode returns something like { a => 1, b => 2 } (a reference to an anonymous hash) rather than (a => 1, b => 2) (a list that can be copied into a hash).
I can think of a few good reasons for this:
in Perl, an array or hash always contains scalars. So in something like { "a": { "b": 3 } }, the { "b": 3 } part has to be a scalar; and for consistency, it makes sense for the whole thing to be a scalar in the same way.
if the hash is quite large (many keys at top-level), it's pointless and expensive to iterate over all the elements to convert it into a list, and then build a new hash from that list.
in JSON, the top-level element can be either an object (= Perl hash) or an array (= Perl array). If json_decode returned a list in the former case, it's not clear what it would return in the latter case. After decoding the JSON string, how could you examine the result to know what to do with it? (And it wouldn't be safe to write %foo = json_decode(...) unless you already knew that you had a hash.) So json_decode's behavior works better for any general-purpose library code that has to use it without already knowing very much about the data it's working with.
I have to wonder exactly what you passed as an array to decode_json, because my results differ from yours.
#!/usr/bin/perl
use JSON qw (decode_json);
use Data::Dumper;
my $json = '["1", "2", "3", "4"]';
my $fromJSON = decode_json($json);
print Dumper($fromJSON);
The result is $VAR1 = [ '1', '2', '3', '4' ];
Which is an array ref, whereas your result is a hash ref.
So did you pass in a hash with an element items which was a reference to an array?
In my example you would get the array by doing
my @array = @{ $fromJSON };
In yours
my @array = @{ $content->{'items'} };
I don't understand why you dislike the arrow operator so much!
The decode_json function from the JSON module will always return a data reference.
Suppose you have a Perl program like this
use strict;
use warnings;
use JSON;
my $json_data = '{ "items": [ 1, 2, 3, 4 ] }';
my $content = decode_json($json_data);
use Data::Dump;
dd $content;
which outputs this text
{ items => [1 .. 4] }
showing that $content is a hash reference. Then you can access the array reference, as you found, with
dd $content->{items};
which shows
[1 .. 4]
and you can print the first element of the array by writing
print $content->{items}[0], "\n";
which, again as you have found, shows just
1
which is the first element of the array.
As @cjm mentions in a comment, it is imperative that you use strict and use warnings at the start of every Perl program. If you had those in place in the program where you tried to access $content{items}, your program would have failed to compile, and you would have seen the message
Global symbol "%content" requires explicit package name
which is a (poorly-phrased) way of telling you that there is no %content so there can be no items element.
The scalar variable $content is completely independent from the hash variable %content, which you are trying to access when you write $content{items}. %content has never been mentioned before and it is empty, so there is no items element. If you had tried @{$content->{items}} then it would have worked, as would @{${$content}{items}}.
If you really have a problem with the arrow operator, then you could write
print ${$content}{items}[0], "\n";
which produces the same output; but I don't understand what is wrong with the original version.