Read Perl JSON structure - json

I get the JSON from a request:
use HTTP::Tiny;
my $response = HTTP::Tiny->new->get('https://jsonplaceholder.typicode.com/todos/1');
print "-------------------**------------------- \n";
my $content = $response->{content};
print $content->[0]->{name};
Response:
[
{
"id": 1,
"name": "Leanne Graham", "username": "Bret", "email": "Sincere@april.biz",
"address": {
"street": "Kulas Light", "suite": "Apt. 556",
"city": "Gwenborough", "zipcode": "92998-3874", "geo": { "lat": "-37.3159",
"lng": "81.1496" } }, "phone": "1-770-736-8031 x56442", "website": "hildegard.org",
"company": {
"name": "Romaguera-Crona",
"catchPhrase": "Multi-layered client-server neural-net",
"bs": "harness real-time e-markets"
}
},
{
"id": 2,
"name": "Ervin Howell",
"username": "Antonette",
"email": "Shanna@melissa.tv",
"address": {
"street": "Victor Plains",
"suite": "Suite 879",
"city": "Wisokyburgh",
"zipcode": "90566-7771",
"geo": {
"lat": "-43.9509",
"lng": "-34.4618"
}
}
]
How do I read each item in the returned JSON? I've tried this:
print $content->[0]->{name};
but it prints nothing.
How do I walk through a JSON structure in Perl?

Your variable contains a string that represents a data structure in JSON format. You need to convert it to a Perl data structure in order to traverse it in Perl. At this point, it's just a bunch of text, and HTTP::Tiny does not care what kind of data it returns.
JSON::PP has been part of core Perl since version 5.13.9 (so it ships with the 5.14 release and later).
use strict;
use warnings;
use JSON::PP 'decode_json';
use Data::Printer;
my $json = qq({ "foo" : "bar" });
my $decoded = decode_json($json);
p $decoded;
print $decoded->{foo};
This will output:
\ {
foo "bar"
}
bar
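Applied to the data in the question, the same idea could look like the sketch below. The inline string is only a shortened stand-in for $response->{content}, and the variable names are my own; it assumes the response is a JSON array of objects like the one shown in the question.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Stand-in for $response->{content}: a JSON array of objects,
# shaped like the data shown in the question (shortened here).
my $content =
    '[{"id":1,"name":"Leanne Graham","address":{"city":"Gwenborough"}},'
  . '{"id":2,"name":"Ervin Howell","address":{"city":"Wisokyburgh"}}]';

# After decoding we have an array reference of hash references
my $users = decode_json($content);

# First element of the array, key "name"
print $users->[0]{name}, "\n";

# Walk every element, including a nested hash
for my $user (@$users) {
    print "$user->{id}: $user->{name} ($user->{address}{city})\n";
}
```

The key point is that the arrow syntax only works on the decoded structure, never on the raw string.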
If you've got a newer Perl and have some other stuff installed, you probably also have JSON::MaybeXS, which will pick the fastest JSON parser available automatically.
Now if you wanted your user agent to know how to do this in multiple places, you can easily create a subclass. I've done a rudimentary implementation here. Save it in a new file HTTP/Tiny/DecodeJSON.pm in the right folder. I would place it under lib in your script's directory.
.
├── lib
│ └── HTTP
│ └── Tiny
│ └── DecodeJSON.pm
└── script.pl
I would also suggest adding extensive error handling.
package HTTP::Tiny::DecodeJSON;
use strict;
use warnings;
use JSON::PP 'decode_json';
use parent 'HTTP::Tiny';
# we need this to not throw a warning in HTTP::Tiny::_agent()
use constant VERSION => '0.01';
sub get_json {
my $self = shift;
my $res = $self->get(@_);
# add error handling here ...
return decode_json $res->{content};
}
1;
You can then reuse it wherever you like. To use it in your script, you need to add the lib directory to the list of directories where Perl looks for its modules.
use strict;
use warnings;
use Data::Printer;
use lib 'lib';
use HTTP::Tiny::DecodeJSON;
my $decoded = HTTP::Tiny::DecodeJSON->new->get_json(
'https://jsonplaceholder.typicode.com/todos/1'
);
p $decoded;
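As for the error handling that get_json leaves open, here is one possible sketch. The helper name decode_response is hypothetical, and the hand-built hashes only imitate what HTTP::Tiny's get() returns (success, status, reason, content), so treat it as a starting point rather than a finished implementation.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Hypothetical helper: takes the hash reference returned by HTTP::Tiny's
# get() and returns the decoded data, or dies with an explanatory message.
sub decode_response {
    my ($res) = @_;

    # HTTP::Tiny sets "success" to true for 2xx status codes
    die "Request failed: $res->{status} $res->{reason}\n"
        unless $res->{success};

    # decode_json dies on malformed input; trap that and add context
    my $data;
    eval { $data = decode_json( $res->{content} ); 1 }
        or die "Response is not valid JSON: $@";

    return $data;
}

# Hand-built response hashes standing in for real HTTP::Tiny results
my $ok  = { success => 1, status => 200, reason => 'OK',        content => '{"foo":"bar"}' };
my $bad = { success => 0, status => 404, reason => 'Not Found', content => '' };

print decode_response($ok)->{foo}, "\n";

eval { decode_response($bad) };
print "Error: $@";
```

Inside the subclass, get_json could pass the result of $self->get(@_) through a helper like this instead of decoding it directly.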

simbabque has explained a lot, and it is useful to have an example of subclassing HTTP::Tiny. I would add the following:
I believe that Cpanel::JSON::XS, despite its convoluted name, is the superior JSON module on CPAN.
There is no $content->[0]->{name} element in the data returned from that URL, although I imagine that is because you are working on it. Thank you for posting a usable data source: it makes questions so much more pleasant to answer.
It's pretty much essential to check whether the HTTP request has succeeded, and die with an explanatory message if there was a problem. It's just an extra statement:
die $response->{reason} unless $response->{success};
Here's how I would write your code. Instead of selecting the field as you do, I have used Data::Dump to display the contents of the structure:
use strict;
use warnings 'all';
use HTTP::Tiny;
use Cpanel::JSON::XS 'decode_json';
my $response = HTTP::Tiny->new->get('https://jsonplaceholder.typicode.com/todos/1');
die $response->{reason} unless $response->{success};
my $data = decode_json $response->{content};
use Data::Dump;
dd $data;
output
{
completed => bless(do{\(my $o = 0)}, "JSON::PP::Boolean"),
id => 1,
title => "delectus aut autem",
userId => 1,
}
As you can see, $content->[0]->{name} would never work because the data is a hash rather than an array, and there is no hash key name anywhere. But the Latin is a strong indicator that the server has been updated since your question, so this is not a problem.
The value $data->{completed} is boolean, and should probably be tested with
if ( $data->{completed} ) { ... }
to decide what to do with the response.
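A small sketch of that boolean test follows. It uses the core JSON::PP module instead of Cpanel::JSON::XS so that it runs without extra installs, and the inline JSON is just a copy of the todo item from the output above.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Inline copy of the todo item shown in the output above
my $data = decode_json(
    '{"userId": 1, "id": 1, "title": "delectus aut autem", "completed": false}'
);

# JSON true/false decode to objects that behave as 1/0 in boolean context
if ( $data->{completed} ) {
    print "Todo $data->{id} is done\n";
}
else {
    print "Todo $data->{id} is still open: $data->{title}\n";
}

# JSON::PP::is_bool tells a real JSON boolean apart from a plain 0 or 1
print "completed is a JSON boolean\n"
    if JSON::PP::is_bool( $data->{completed} );
```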

Related

Terraform's external data source: STDOUT syntax unclear

I would like to use terraform's external data source to identify certain AWS EC2 instances:
data "external" "monitoring_instances" {
program = ["bash", "${path.module}/../bash/tf_datasource_monitoring.sh"]
query = {
env = var.env_stage
}
}
The bash script is using AWS CLI to return a list of instance IDs.
But I keep receiving this Error: command "bash" produced invalid JSON: json: cannot unmarshal array into Go value of type string
I don't understand what the expected syntax of my script's STDOUT would be for terraform to understand the result.
So let's assume the script is supposed to return 3 instance IDs i-1, i-2 and i-3.
What would be the correct JSON syntax to be returned to terraform?
Examples that do NOT work:
{
"instances": [
"i-1",
"i-2",
"i-3"
]
}
[
"i-1",
"i-2",
"i-3"
]
This is a known issue in Terraform's external provider: https://github.com/hashicorp/terraform-provider-external/issues/2. It was opened a while ago and is unfortunately still present in the latest version (Terraform v1.011).
You may want to avoid returning JSON objects that contain arrays.
I had the same issue while executing a Python script to generate a dynamic BigQuery schema, which is by definition an array of JSON objects.
I solved it by implementing a wrapper JSON object with the schema as a string value (see dummy code below).
# get_dynamic_bigquery_schema.py
import json
bigquery_schema = [
{"name": "int_field", "type": "INTEGER", "mode": "NULLABLE", "description": "int_field"},
{"name": "int_field_repeated", "type": "INTEGER", "mode": "REPEATED", "description": "int_field_repeated"}
]
wrapper_json = {'actual_output': str(json.dumps(bigquery_schema))}
print(json.dumps(wrapper_json))
which I can then access within terraform
# main.tf
data "external" "bigquery_schema" {
program = ["python", "${path.module}/get_dynamic_bigquery_schema.py"]
}
locals {
bigquery_schema = data.external.bigquery_schema.result["actual_output"]
}
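For the three instance IDs from the question, the same workaround means printing one flat JSON object whose values are all strings. The sketch below shows that shape; it is written in Perl only to stay runnable with core modules (the question's script is bash), and the hard-coded IDs and the key name instances are placeholders.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical instance IDs; a real script would collect them from the AWS CLI
my @instance_ids = ( 'i-1', 'i-2', 'i-3' );

# One flat object, every value a string: the list is joined into one string
my $result = { instances => join( ',', @instance_ids ) };

print JSON::PP->new->canonical->encode($result), "\n";
```

This prints {"instances":"i-1,i-2,i-3"}. On the Terraform side, something like split(",", data.external.monitoring_instances.result.instances) should turn the string back into a list.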

Perl LWP::UserAgent parse response JSON

I am using the LWP::UserAgent module to issue a GET request to one of our APIs.
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use Data::Dumper;
my $ua = LWP::UserAgent->new;
my $request = $ua->get("http://example.com/foo", Authorization => "Bearer abc123", Accept => "application/json" );
print Dumper $request->content;
The request is successful. Dumper returns the following JSON.
$VAR1 = '{
"apiVersion": "v1",
"data": {
"ca-bundle.crt": "-----BEGIN CERTIFICATE-----abc123-----END CERTIFICATE-----\\n"
},
"kind": "ConfigMap",
"metadata": {
"creationTimestamp": "2021-07-16T17:13:01Z",
"labels": {
"auth.openshift.io/managed-certificate-type": "ca-bundle"
},
"managedFields": [
{
"apiVersion": "v1",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:data": {
".": {},
"f:ca-bundle.crt": {}
},
"f:metadata": {
"f:labels": {
".": {},
"f:auth.openshift.io/managed-certificate-type": {}
}
}
},
"manager": "cluster-kube-apiserver-operator",
"operation": "Update",
"time": "2021-09-14T17:07:39Z"
}
],
"name": "kube-control-plane-signer-ca",
"namespace": "openshift-kube-apiserver-operator",
"resourceVersion": "65461225",
"selfLink": "/api/v1/namespaces/openshift-kube-apiserver-operator/configmaps/kube-control-plane-signer-ca",
"uid": "f9aea067-1234-5678-9101-9d4073f5ae53"
}
}';
Let's say I want to print the value of the apiVersion key, which should print v1.
print "API Version = $request->content->{'apiVersion'} \n";
The following is being printed. I am not sure how to print the value v1. Since HTTP::Response is included in the output, I suspect I might have to use the HTTP::Response module?
API Version = HTTP::Response=HASH(0x2dffe80)->content->{'apiVersion'}
Perl doesn't expand subroutine calls in a double-quoted string.
print "API Version = $request->content->{'apiVersion'} \n";
In this line of code, content() is a subroutine call. So Perl sees this as:
print "API Version = $request" . "->content->{'apiVersion'} \n";
And if you try to print most Perl objects, you'll get the hash reference along with the name of the class - hence HTTP::Response=HASH(0x2dffe80).
You might think that you just need to break up your print() statement like this:
print 'API Version = ', $request->content->{'apiVersion'}, "\n";
But that's not going to work either. $request->content doesn't return a Perl data structure, it returns a JSON-encoded string. You need to decode it into a data structure before you can access the individual elements.
use JSON;
print 'API Version = ', decode_json($request->content)->{'apiVersion'}, "\n";
But it might be cleaner to do the decoding outside of the print() statement.
use JSON;
my $data = decode_json($request->content);
In which case you can go back to something more like your original code:
print "API Version = $data->{'apiVersion'} \n";
The JSON content must be decoded first. There are several modules for that, like JSON:
use JSON;
# ...
my $href = decode_json $request->content;
And then use it like a normal hash reference: $href->{apiVersion}
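Once decoded, nested keys and arrays are reached the same way. The sketch below uses a shortened inline copy of the response as a stand-in for $request->content, so it runs without the API; the variable names are my own.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Shortened stand-in for $request->content from the question
my $json = <<'JSON';
{
  "apiVersion": "v1",
  "kind": "ConfigMap",
  "metadata": {
    "name": "kube-control-plane-signer-ca",
    "managedFields": [
      { "manager": "cluster-kube-apiserver-operator", "operation": "Update" }
    ]
  }
}
JSON

my $data = decode_json($json);

# Hash lookups (unlike method calls) do interpolate in double-quoted strings
print "API Version = $data->{apiVersion}\n";
print "Name = $data->{metadata}{name}\n";

# "managedFields" is a JSON array, so it decodes to an array reference
for my $field ( @{ $data->{metadata}{managedFields} } ) {
    print "$field->{manager}: $field->{operation}\n";
}
```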

How to extract certain data using Perl from a file?

I have data that needs to be extracted from a file; the fields I need for the moment are name, location and host. Below is my script so far, followed by an example extract. How would I go about getting these lines into a separate file? I have the original file and the new file I want to create set up as the input/output files. There are thousands of devices in the file, and they all have the same formatting as in my example.
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);
#names of files to be input output
my $inputfile = "/home/nmis/nmis_export.csv";
my $outputfile = "/home/nmis/nmis_data.csv";
open(INPUT,'<',$inputfile) or die $!;
open(OUTPUT, '>',$outputfile) or die $!;
my @data = <INPUT>;
close INPUT;
my $line="";
foreach $line (@data)
{
======Sample Extract=======
"group" : "NMIS8",
"host" : "1.2.3.4",
"location" : "WATERLOO",
"max_msg_size" : 1472,
"max_repetitions" : 0,
"model" : "automatic",
"netType" : "lan",
"ping" : 1,
"polling_policy" : "default",
"port" : 161,
"rancid" : 0,
"roleType" : "access",
"serviceStatus" : "Production",
"services" : null,
"threshold" : 1,
"timezone" : 0,
"version" : "snmpv2c",
"webserver" : 0
},
"lastupdate" : 1616690858,
"name" : "test",
"overrides" : {}
},
{
"activated" : {
"NMIS" : 1
},
"addresses" : [],
"aliases" : [],
"configuration" : {
"Stratum" : 3,
"active" : 1,
"businessService" : "",
"calls" : 0,
"cbqos" : "none",
"collect" : 0,
"community" : "public",
"depend" : [
"N/A"
],
"group" : "NMIS8",
"host" : "1.2.3.5",
"location" : "WATERLOO",
"max_msg_size" : 1472,
"max_repetitions" : 0,
"model" : "automatic",
"netType" : "lan",
"ping" : 1,
"polling_policy" : "default",
"port" : 161,
"rancid" : 0,
"roleType" : "access",
"serviceStatus" : "Production",
"services" : null,
"threshold" : 1,
"timezone" : 0,
"version" : "snmpv2c",
"webserver" : 0
},
"lastupdate" : 1616690858,
"name" : "test2",
"overrides" : {}
},
I would use jq for this, not Perl. You just need to query a JSON document, and that's what jq is for. You can see an example here.
The jq query I created is this one:
.[] | {name: .name, group: .configuration.group, location: .configuration.location}
This breaks down into
.[] # iterate over the array
| # pipe each element into the next filter
{ # which produces an object with the key/values below
name: .name,
group: .configuration.group,
location: .configuration.location
}
It provides an output like this,
{
"name": "test",
"group": "NMIS8",
"location": "WATERLOO"
}
{
"name": "test2",
"group": "NMIS8",
"location": "WATERLOO"
}
You can use this to generate a CSV:
jq -r '.[] | [.name, .configuration.group, .configuration.location] | @csv' ./file.json
Or this to generate a CSV with a header:
jq -r '["name","group","location"], (.[] | [.name, .configuration.group, .configuration.location]) | @csv' ./file.json
You can use the JSON distribution for this. Read the entire file in one fell swoop to put the entire JSON string into a scalar (as opposed to putting it into an array and iterating over it), then simply decode the string into a Perl data structure:
use warnings;
use strict;
use JSON;
my $file = 'file.json';
my $json_string;
{
local $/; # Locally reset line endings to nothing
open my $fh, '<', $file or die "Can't open file $file!: $!";
$json_string = <$fh>; # Slurp in the entire file
}
my $perl_data_structure = decode_json $json_string;
As what you have there is JSON, you should parse it with a JSON parser. JSON::PP is part of the standard Perl distribution. If you want something faster, you could install something else from CPAN.
Update: I included a link to JSON::PP in my answer. Did you follow that link? If you did, you would have seen the documentation for the module. That has more information about how to use the module than I could include in an answer on SO.
But it's possible that you need a little more high-level information. The documentation says this:
JSON::PP is a pure perl JSON decoder/encoder
But perhaps you don't know what that means. So here's a primer.
JSON is a text format for storing complex data structures. The format was initially used in Javascript (the acronym stands for "JavaScript Object Notation") but it is now a standard that is used across pretty much all programming languages.
You rarely want to actually deal with JSON in a program. A JSON document is just text and manipulating that would require some complex regular expressions. When dealing with JSON, the usual approach is to "decode" the JSON into a data structure inside your program. You can then manipulate the data structure however you want before (optionally) "encoding" the data structure back into JSON so you can write it to an output file (in your case, you don't need to do that as you want your output as CSV).
So there are pretty much only two things that a Perl JSON library needs to do:
Take some JSON text and decode it into a Perl data structure
Take a Perl data structure and encode it into JSON text
If you look at the JSON::PP documentation you'll see that it contains two functions, encode_json() and decode_json() which do what I describe above. There's also an OO interface, but let's not overcomplicate things too quickly.
So your program now needs to have the following steps:
Read the JSON from the input file
Decode the JSON into a Perl data structure
Walk the Perl data structure to extract the items that you need
Write the required items into your output file (for which Text::CSV will be useful)
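The steps above can be sketched roughly as follows. This is not a finished program: the sub names are hypothetical, the inline sample replaces reading the real input file, and the tiny csv_line helper only stands in for Text::CSV so that the example runs with core modules. It assumes the top level of the file is an array of device objects, each with a nested "configuration" object, as in the sample extract.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Steps 2 and 3: decode the JSON text and pull out name, location and host
sub extract_rows {
    my ($json_text) = @_;
    my $devices = decode_json($json_text);
    return map {
        [ $_->{name}, $_->{configuration}{location}, $_->{configuration}{host} ]
    } @$devices;
}

# Step 4: minimal CSV quoting, standing in for Text::CSV
sub csv_line {
    return join( ',', map { my $v = $_ // ''; $v =~ s/"/""/g; qq("$v") } @_ ) . "\n";
}

# Step 1 would normally slurp the input file; a short inline sample is used here
my $json_text =
    '[{"name":"test","configuration":{"host":"1.2.3.4","location":"WATERLOO"}},'
  . '{"name":"test2","configuration":{"host":"1.2.3.5","location":"WATERLOO"}}]';

print csv_line(qw(name location host));
print csv_line(@$_) for extract_rows($json_text);
```

In a real script the print statements would write to the output filehandle instead of STDOUT.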
Having said all that, it really does seem to me that the jq solution suggested by user157251 is a much better idea.

Bash, pipe parse and format json (slim it down)

I have a bash script that outputs a json like this:
{
"name": "some",
"desc": "this is a desc",
"env": "this is an env type",
"dd": {
"one": "rr",
"two": "aa"
},
"url": "http://someurl",
//etc......
}
I would like to pipe a new command in my script, to return the final json output as:
{
"name": "some",
"env": "this is an env type",
"dd": {
"one": "rr",
"two": "aa"
}
}
How can i achieve this without installing new tools/libs like jq etc..
Any clue?
I know you've stipulated that you do this without external tools but hopefully this will change your mind:
jq '{ name, env, dd }' file.json
That was easy!
jq is very easy to obtain.
A quick and dirty python script would also work:
import sys
import json

with open(sys.argv[1]) as file:
    obj = json.load(file)

# keep only the keys wanted in the slimmed-down output
print(json.dumps({key: obj[key] for key in ("name", "env", "dd")}))
It can be run like python script.py file.json. To improve the formatting, you can pass extra arguments to json.dumps (see the docs).
How can i achieve this without installing new tools/libs like jq etc..
Any clue?
Well, to make an analogy it's like you were asking "how can I become a world champion in 100m sprint run, but I have no legs and I want no fake legs". The simple answer is that you cannot, or if you do, it won't be flexible and generic enough to be really useful.
Doing shell scripting is like having a toolbox, with many tools, each being designed to do one task and do it well. So if you refuse the tool that might be the right thing for what you need, then you cannot do it.
So the right way to do it is to use a tool such as jq, or a small python/ruby/... script that will take keys out of your json data.
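If Perl happens to be on the machine already, its core JSON::PP module can play the role of that small script without installing anything new. The sketch below is one way to do it; the sub name slim is hypothetical and the inline string stands in for the script's output, which must be valid JSON (so without the //etc line).

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: keep only the listed top-level keys of a JSON object
sub slim {
    my ( $json_text, @keys ) = @_;
    my $obj  = decode_json($json_text);
    my %kept = map { ( $_ => $obj->{$_} ) } grep { exists $obj->{$_} } @keys;
    return JSON::PP->new->canonical->pretty->encode( \%kept );
}

# Inline stand-in for the script output; in a pipe you would read STDIN
# instead, e.g. my $input = do { local $/; <STDIN> };
my $input = '{"name":"some","desc":"this is a desc","env":"this is an env type",'
          . '"dd":{"one":"rr","two":"aa"},"url":"http://someurl"}';

print slim( $input, qw(name env dd) );
```

Note that canonical sorts the keys alphabetically, so the output order differs from the input order; that makes no difference to a JSON consumer.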

How to extract the value of "name" from this puppet metadata.json without jq in shell?

For some extreme reason, I can't use jq or any other CLI tool. I need to extract the value of "name" from any JSON file matching this Puppet metadata.json format.
The JSON might not be consistently formatted and indented, but it will be valid. That means whitespace, line breaks and carriage returns might be inserted wherever they are allowed.
Note that there could be "name" elements in the dependencies array.
So, how can I extract the value using only standard Unix commands and/or a shell script, without installing any application like jq or other tools?
Thank you!!
{
"name": "examplecorp-mymodule",
"version": "0.0.1",
"author": "Pat",
"license": "Apache-2.0",
"summary": "A module for a thing",
"source": "https://github.com/examplecorp/examplecorp-mymodule",
"project_page": "https://forge.puppetlabs.com/examplecorp/mymodule",
"issues_url": "https://github.com/examplecorp/examplecorp-mymodule/issues",
"tags": ["things", "stuff"],
"operatingsystem_support": [
{
"operatingsystem":"RedHat",
"operatingsystemrelease":[ "5.0", "6.0" ]
},
{
"operatingsystem": "Ubuntu",
"operatingsystemrelease": [ "12.04", "10.04" ]
}
],
"dependencies": [
{ "name": "puppetlabs/stdlib", "version_requirement": ">=3.2.0 <5.0.0" },
{ "name": "puppetlabs/firewall", "version_requirement": ">= 0.0.4" }
]
}
It's ugly, awful, horrible, not structure-aware and will give you incorrect results if you have extra contents in your input file that look similar to what you're trying to find -- but...
#!/bin/bash
# ^- NOT /bin/sh; shell-native regexes are a bash extension
contents=$(<in.json)
if [[ $contents =~ '"name":'[[:space:]]*'"'([^\"]*)'"' ]]; then
echo "Found name: ${BASH_REMATCH[1]}"
fi
Now, let's talk about some of the ways this answer is broken (and using jq would be better):
It finds the first name, even if it's not one at an outer nesting layer. That is to say, if "dependencies": [ { "name": "puppetlabs/stdlib", "version_requirement": ">=3.2.0 <5.0.0" } ] comes before "name": "examplecorp-mymodule", guess which result is being found? (The easy workarounds to this would involve making assumptions about whitespace/formatting, and are thus not proof against all possible JSON expressions of the same data).
It won't unescape contents inside your name that require, well, unescaping (think about names containing characters encoded as \uXXXX escapes).
It isn't multibyte-character aware, and thus isn't guaranteed to emit output that aligns on codepoint boundaries.
If you have a name with an escaped \" subsequence... well, guess what happens there?
Etc. It's not quite as awful as trying to parse XML with regular expressions (JSON is easier!), but it's still quite a mess.
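For contrast, this is what a structure-aware lookup looks like. It is only a sketch and it assumes Perl with its core JSON::PP module counts as already available on the box, which may or may not fit your restriction. The sample deliberately puts dependencies before the top-level name and adds odd whitespace, which is the case the regex above gets wrong.

```perl
use strict;
use warnings;
use JSON::PP 'decode_json';

# Valid but awkwardly formatted metadata, with "dependencies" listed first
my $text = <<'JSON';
{ "dependencies": [
    { "name": "puppetlabs/stdlib", "version_requirement": ">=3.2.0 <5.0.0" }
  ],
  "name"
     : "examplecorp-mymodule",
  "version": "0.0.1" }
JSON

my $meta = decode_json($text);

# Only the top-level key is consulted, wherever it appears in the file
print "Found name: $meta->{name}\n";

# The nested names stay reachable, but only through their own path
print "Depends on: $_->{name}\n" for @{ $meta->{dependencies} };
```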
This should work for you:
jq '.name' metadata.json