Perl HTTP Tiny response - json

How to put the JSON response of HTTP::Tiny in Perl?
use HTTP::Tiny;
my $response = HTTP::Tiny->new->get('http://example.com/');
die "Failed!\n" unless $response->{success};
print "$response->{status} $response->{reason}\n";
while (my ($k, $v) = each %{$response->{headers}}) {
for (ref $v eq 'ARRAY' ? #$v : $v) {
print "$k: $_\n";
}
}
print $response->{content} if length $response->{content};
How to put the $response->{content} into variable to be passed into html(tt2)?

You would pass your content to the TT processor like this:
use Template;
my $tt = Template->new;
$tt->process('some_template.tt', { content => $response->{content} })
or die $tt->error;
Perhaps you want to store it in an intermediate scalar first:
my $content = $response->{content};
$tt->process('some_template.tt', { content => $content })
or die $tt->error;
Perhaps you want to build up TT's variable hash before passing it to the processor.
my %vars;
$vars{content} = $response->{content};
$tt->process('some_template.tt', \%vars)
or die $tt->error;
Or you could use a hash reference instead of an actual hash:
my $vars;
$vars->{content} = $response->{content};
$tt->process('some_template.tt', $vars)
or die $tt->error;
Update: In a comment, simbabque suggests that you might actually be asking how to decode JSON content into an array or a hash. And I agree that's another way to interpret your question.
You'd need to use the JSON module. And it would look something like this:
use JSON;
my $json_parser = JSON->new; # Perhaps other options here, see docs.
my $decoded_json = $json_parser->decode($response->{content});

Related

Can I use Text::CSV_XS to parse a csv-format string without writing it to disk?

I am getting a "csv file" from a vendor (using their API), but what they do is just spew the whole thing into their response. It wouldn't be a significant problem except that, of course, some of those pesky humans entered the data and put in "features" like line breaks. What I am doing now is creating a file for the raw data and then reopening it to read the data:
open RAW, ">", "$rawfile" or die "ERROR: Could not open $rawfile for write: $! \n";
print RAW $response->content;
close RAW;
my $csv = Text::CSV_XS->new({ binary=>1,always_quote=>1,eol=>$/ });
open my $fh, "<", "$rawfile" or die "ERROR: Could not open $rawfile for read: $! \n";
while ( $line = $csv->getline ($fh) ) { ...
Somehow this seems ... inelegant. It seems that I ought to be able to just read the data from the $response->content (multiline string) as if it were a file. But I'm drawing a total blank on how do this.
A pointer would be greatly appreciated.
Thanks,
Paul
You could use a string filehandle:
my $data = $response->content;
open my $fh, "<", \$data or croak "unable to open string filehandle : $!";
my $csv = Text::CSV_XS->new({ binary=>1,always_quote=>1,eol=>$/ });
while ( $line = $csv->getline ($fh) ) { ... }
Yes, you can use Text::CSV_XS on a string, via its functional interface
use warnings;
use strict;
use feature 'say';
use Text::CSV_XS qw(csv); # must use _XS version
my $csv = qq(a,line\nand,another);
my $aoa = csv(in => \$csv)
or die Text::CSV->error_diag;
say "#$_" for #aoa;
Note that this indeed needs Text::CSV_XS (normally Text::CSV works but not with this).
I don't know why this isn't available in the OO interface (or perhaps is but is not documented).
While the above parses the string directly as asked, one can also lessen the "inelegant" aspect in your example by writing content directly to a file as it's acquired, what most libraries support like with :content_file option in LWP::UserAgent::get method.
Let me also note that most of the time you want the library to decode content, so for LWP::UA to use decoded_content (see HTTP::Response).
I cooked up this example with Mojo::UserAgent. For the CSV input I used various data sets from the NYC Open Data. This is also going to appear in the next update for Mojo Web Clients.
I build the request without making the request right away, and that gives me the transaction object, $tx. I can then replace the read event so I can immediately send the lines into Text::CSV_XS:
#!perl
use v5.10;
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
my $url = ...;
my $tx = $ua->build_tx( GET => $url );
$tx->res->content->unsubscribe('read')->on(read => sub {
state $csv = do {
require Text::CSV_XS;
Text::CSV_XS->new;
};
state $buffer;
state $reader = do {
open my $r, '<:encoding(UTF-8)', \$buffer;
$r;
};
my ($content, $bytes) = #_;
$buffer .= $bytes;
while (my $row = $csv->getline($reader) ) {
say join ':', $row->#[2,4];
}
});
$tx = $ua->start($tx);
That's not as nice as I'd like it to be because all the data still show up in the buffer. This is slightly more appealing, but it's fragile in the ways I note in the comments. I'm too lazy at the moment to make it any better because that gets hairy very quickly as you figure out when you have enough data to process a record. My particular code isn't as important as the idea that you can do whatever you like as the transactor reads data and passes it into the content handler:
use v5.10;
use strict;
use warnings;
use feature qw(signatures);
no warnings qw(experimental::signatures);
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new;
my $url = ...;
my $tx = $ua->build_tx( GET => $url );
$tx->res->content
->unsubscribe('read')
->on( read => process_bytes_factory() );
$tx = $ua->start($tx);
sub process_bytes_factory {
return sub ( $content, $bytes ) {
state $csv = do {
require Text::CSV_XS;
Text::CSV_XS->new( { decode_utf8 => 1 } );
};
state $buffer = '';
state $line_no = 0;
$buffer .= $bytes;
# fragile if the entire content does not end in a
# newline (or whatever the line ending is)
my $last_line_incomplete = $buffer !~ /\n\z/;
# will not work if the format allows embedded newlines
my #lines = split /\n/, $buffer;
$buffer = pop #lines if $last_line_incomplete;
foreach my $line ( #lines ) {
my $status = $csv->parse($line);
my #row = $csv->fields;
say join ':', $line_no++, #row[2,4];
}
};
}

Perl XML2JSON : How to preserve XML element order?

I have a configuration file which is in XML format. I need to parse the XML and convert to JSON. I'm able to convert it with XML2JSON module of perl. But the problem is, it is not maintaining the order of XML elements. I strictly need the elements in order otherwise I cannot configure
My XML file is something like this. I have to configure an IP address and set that IP as a gateway to certain route.
<Config>
<ip>
<address>1.1.1.1</address>
<netmask>255.255.255.0</netmask>
</ip>
<route>
<network>20.20.20.0</network>
<netmask>55.255.255.0</netmask>
<gateway>1.1.1.1</gateway>
</route>
</Config>
This is my perl code to convert to JSON
my $file = 'config.xml';
use Data::Dumper;
open my $fh, '<',$file or die;
$/ = undef;
my $data = <$fh>;
my $XML = $data;
my $XML2JSON = XML::XML2JSON->new();
my $Obj = $XML2JSON->xml2obj($XML);
print Dumper($Obj);
The output I'm getting is,
$VAR1 = {'Config' => {'route' => {'netmask' => {'$t' => '55.255.255.0'},'gateway' => {'$t' => '1.1.1.1'},'network' => {'$t' => '20.20.20.0'}},'ip' => {'netmask' => {'$t' => '255.255.255.0'},'address' => {'$t' => '1.1.1.1'}}},'#encoding' => 'UTF-8','#version' => '1.0'};
I have a script which reads the json object and configure..
But it fails as it first tries to set gateway ip address to a route where the ip address is not yet configured and add then add ip address.
I strictly want key ip to come first and then route for proper configuration without error. Like this I have many dependencies where order of keys is a must.
Is there any way I can tackle this problem? I tried almost all modules of XML parsing like XML::Simple,Twig::XML,XML::Parser. But nothing helped..
Here's a program that I hacked together that uses XML::Parser to parse some XML data and generate the equivalent JSON in the same order. It ignores any attributes, processing instructions etc. and requires that every XML element must contain either a list of child elements or a text node. Mixing text and elements won't work, and this isn't checked except that the program will die trying to dereference a string
It's intended to be a framework for you to enhance as you require, but works fine as it stands with the XML data you show in your question
use strict;
use warnings 'all';
use XML::Parser;
my $parser = XML::Parser->new(Handlers => {
Start => \&start_tag,
End => \&end_tag,
Char => \&text,
});
my $struct;
my #stack;
$parser->parsefile('config.xml');
print_json($struct->[1]);
sub start_tag {
my $expat = shift;
my ($tag, %attr) = #_;
my $elem = [ $tag => [] ];
if ( $struct ) {
my $content = $stack[-1][1];
push #{ $content }, $elem;
}
else {
$struct = $elem;
}
push #stack, $elem;
}
sub end_tag {
my $expat = shift;
my ($elem) = #_;
die "$elem <=> $stack[-1][0]" unless $stack[-1][0] eq $elem;
for my $content ( $stack[-1][1] ) {
$content = "#$content" unless grep ref, #$content;
}
pop #stack;
}
sub text {
my $expat = shift;
my ($string) = #_;
return unless $string =~ /\S/;
$string =~ s/\A\s+//;
$string =~ s/\s+\z//;
push #{ $stack[-1][1] }, $string;
}
sub print_json {
my ($data, $indent, $comma) = (#_, 0, '');
print "{\n";
for my $i ( 0 .. $#$data ) {
# Note that $data, $indent and $comma are overridden here
# to reflect the inner context
#
my $elem = $data->[$i];
my $comma = $i < $#$data ? ',' : '';
my ($tag, $data) = #$elem;
my $indent = $indent + 1;
printf qq{%s"%s" : }, ' ' x $indent, $tag;
if ( ref $data ) {
print_json($data, $indent, $comma);
}
else {
printf qq{"%s"%s\n}, $data, $comma;
}
}
# $indent and $comma (and $data) are restored here
#
printf "%s}%s\n", ' ' x $indent, $comma;
}
output
{
"ip" : {
"address" : "1.1.1.1",
"netmask" : "255.255.255.0"
},
"route" : {
"network" : "20.20.20.0",
"netmask" : "55.255.255.0",
"gateway" : "1.1.1.1"
}
}
The problem isn't so much to do with XML parsing, but because perl hashes are not ordered. So when you 'write' some JSON... it can be any order.
The way to avoid this is to apply a sort function to your JSON.
You can do this by using sort_by to explicitly sort:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
use JSON::PP;
use Data::Dumper;
sub order_nodes {
my %rank_of = ( ip => 0, route => 1, address => 2, network => 3, netmask => 4, gateway => 5 );
print "$JSON::PP::a <=> $JSON::PP::b\n";
return $rank_of{$JSON::PP::a} <=> $rank_of{$JSON::PP::b};
}
my $twig = XML::Twig -> parse (\*DATA);
my $json = JSON::PP -> new;
$json ->sort_by ( \&order_nodes );
print $json -> encode( $twig -> simplify );
__DATA__
<Config>
<ip>
<address>1.1.1.1</address>
<netmask>255.255.255.0</netmask>
</ip>
<route>
<network>20.20.20.0</network>
<netmask>55.255.255.0</netmask>
<gateway>1.1.1.1</gateway>
</route>
</Config>
In some scenarios, setting canonical can help, as that sets ordering to lexical order. (And means your JSON output would be consistently ordered). This doesn't apply to your case.
You could build the node ordering via XML::Twig, either by an xpath expression, or by using twig_handlers. I gave it a quick go, but got slightly unstuck in figuring out how you'd 'tell' how to figure out ordering based on getting address/netmask and then network/netmask/gateway.
As a simple example you could:
my $count = 0;
foreach my $node ( $twig -> get_xpath ( './*' ) ) {
$rank_of{$node->tag} = $count++ unless $rank_of{$node->tag};
}
print Dumper \%rank_of;
This will ensure ip and route are always the right way around. However it doesn't order the subkeys.
That actually gets a bit more complicated, as you'd need to recurse... and then decide how to handle 'collisions' (like netmask - address comes before, but how does it sort compared to network).
Or alternatively:
my $count = 0;
foreach my $node ( $twig->get_xpath('.//*') ) {
$rank_of{ $node->tag } = $count++ unless $rank_of{ $node->tag };
}
This walks all the nodes, and puts them in order. It doesn't quite work, because netmask appears in both stanzas though.
You get:
{"ip":{"address":"1.1.1.1","netmask":"255.255.255.0"},"route":{"netmask":"55.255.255.0","network":"20.20.20.0","gateway":"1.1.1.1"}}
I couldn't figure out a neat way of collapsing both lists.

How to check whether hash key contain JSON in perl?

I am sending a request to the endpoint url ,from there i am getting the response in case of success in form of JSON,but if it fails it return certain text .
Sending request:
$data->{response} = $self->{_http}->send($myData);
So before doing this:
$resp = from_json($data->{response});
i want to check whether the reponse is in json format or not .How we can handle this in Perl kindly help in this
You can catch exception thrown by from_json(),
my $resp;
my $ok = eval { $resp = from_json("{}"); 1 };
$ok or die "Not valid json";
or simpler,
my $resp = eval { from_json("rrr") };
$resp // die "Not valid json";
Use JSON or JSON::XS to decode the JSON into a Perl structure.
Simple example:
use strict;
use warnings;
use JSON::XS;
my $json = '[{"Year":"2012","Quarter":"Q3","DataType":"Other 3","Environment":"STEVE","Amount":125},{"Year":"2012","Quarter":"Q4","DataType":"Other 2","Environment":"MIKE","Amount":500}]';
my $arrayref = decode_json $json;
foreach my $item( #$arrayref ) {
# fields are in $item->{Year}, $item->{Quarter}, etc.
}
You could use a try/catch block using Try::Tiny
use Try::Tiny;
try {
$resp = from_json($data->{response});
} catch {
# Do something if it does not parse
warn 'Could not parse json'
};

perl Mojo and JSON for simultaneous requests

I'm usually no Perl coder. However I've got to complete this task.
The following code works for me:
#!/usr/bin/perl
use LWP::UserAgent;
use JSON;
use strict;
my $md5 = $ARGV[0];
$md5 =~ s/[^A-Fa-f0-9 ]*//g;
die "invalid MD5" unless ( length($md5) == 32 );
my $ua = LWP::UserAgent->new(ssl_opts => { verify_hostname => 1 }, timeout => 10);
my $key="12345...7890";
my $url='https://www.virustotal.com/vtapi/v2/file/report';
my $response = $ua->post( $url, ['apikey' => $key, 'resource' => $md5] );
die "$url error: ", $response->status_line unless $response->is_success;
my $results=$response->content;
my $json = JSON->new->allow_nonref;
my $decjson = $json->decode( $results);
print "md5: ",$md5,"\n";
print "positives: ", $decjson->{"positives"}, "\n";
print "total: ", $decjson->{"total"}, "\n";
print "date: ", $decjson->{"scan_date"}, "\n";
Now I would like to recode the above for using asynchronous http using Mojo. I'm trying this:
#!/usr/bin/perl
use warnings;
use strict;
use Mojo;
use Mojo::UserAgent;
my $md5 = $ARGV[0];
$md5 =~ s/[^A-Fa-f0-9 ]*//g;
die "invalid MD5" unless ( length($md5) == 32 );
my ($vt_positives, $vt_scandate, $response_vt);
my $url='https://www.virustotal.com/vtapi/v2/file/report';
my $key="12345...7890";
my $ua = Mojo::UserAgent->new;
my $delay = Mojo::IOLoop->delay;
$ua->max_redirects(0)->connect_timeout(3)->request_timeout(6);
$ua->max_redirects(5);
$delay->begin;
$response_vt = $ua->post( $url => ['apikey' => $key, 'resource' => $md5] => sub {
my ($ua, $tx) = #_;
$vt_positives=$tx->res->json->{"positives"};
print "Got response: $vt_positives\n";
});
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
The first code is OK, the second isn't working. I must be doing something wrong when sending the request since I seem to get a 403 response (incorrect API usage). I also tried -> json calls but it didn't work out.
And even if I had done the request correctly, I'm not sure if I'm correctly decoding the json results with Mojo.
Help will be appreciated!
EDIT
It seems that we missed the real question, how to post forms. Oops sorry about that.
Posting forms depends on which version of Mojolicious you are using. Until recently (v3.85 -- 2013-02-13) there was a post_form method. On reflection however, it was decided there should either be *_form methods for every request type, or we should do something smarter, and thus the form generator was born.
$response_vt = $ua->post(
$url,
form => {'apikey' => $key, 'resource' => $md5},
sub { ... }
);
It can be added to any request method, making it much more consistent than the old form. Also note that it should be a hashref, not an arrayref as LWP allows. BTW there is also a json generator that works like this too, or you can even add your own!
I'm leaving my original answer, showing non-blocking usage, which you may now amend given the above.
ORIGINAL
Building off the logic from creaktive, this is how I would start. The major difference is that there isn't a monitor watching to be sure that there are works going, rather when one finishes it checks to be sure that there are no idlers.
I have also made some changes in the parsing logic, but nothing major.
#!/usr/bin/env perl
use Mojo::Base -strict;
use utf8::all;
use Mojo::URL;
use Mojo::UserAgent;
# FIFO queue
my #urls = qw(
http://sysd.org/page/1/
http://sysd.org/page/2/
http://sysd.org/page/3/
);
# User agent following up to 5 redirects
my $ua = Mojo::UserAgent
->new(max_redirects => 5)
->detect_proxy;
start_urls($ua, \#urls, \&get_callback);
sub start_urls {
my ($ua, $queue, $cb) = #_;
# Limit parallel connections to 4
state $idle = 4;
state $delay = Mojo::IOLoop->delay(sub{say #$queue ? "Loop ended before queue depleated" : "Finished"});
while ( $idle and my $url = shift #$queue ) {
$idle--;
print "Starting $url, $idle idle\n\n";
$delay->begin;
$ua->get($url => sub{
$idle++;
print "Got $url, $idle idle\n\n";
$cb->(#_, $queue);
# refresh worker pool
start_urls($ua, $queue, $cb);
$delay->end;
});
}
# Start event loop if necessary
$delay->wait unless $delay->ioloop->is_running;
}
sub get_callback {
my ($ua, $tx, $queue) = #_;
# Parse only OK HTML responses
return unless
$tx->res->is_status_class(200)
and $tx->res->headers->content_type =~ m{^text/html\b}ix;
# Request URL
my $url = $tx->req->url;
say "Processing $url";
parse_html($url, $tx, $queue);
}
sub parse_html {
my ($url, $tx, $queue) = #_;
state %visited;
my $dom = $tx->res->dom;
say $dom->at('html title')->text;
# Extract and enqueue URLs
$dom->find('a[href]')->each(sub{
# Validate href attribute
my $link = Mojo::URL->new($_->{href});
return unless eval { $link->isa('Mojo::URL') };
# "normalize" link
$link = $link->to_abs($url)->fragment(undef);
return unless grep { $link->protocol eq $_ } qw(http https);
# Don't go deeper than /a/b/c
return if #{$link->path->parts} > 3;
# Access every link only once
return if $visited{$link->to_string}++;
# Don't visit other hosts
return if $link->host ne $url->host;
push #$queue, $link;
say " -> $link";
});
say '';
return;
}
Take a look at this concurrent-requesting Mojolicious-based web crawler I wrote to illustrate my article Web Scraping with Modern Perl:
#!/usr/bin/env perl
use 5.010;
use open qw(:locale);
use strict;
use utf8;
use warnings qw(all);
use Mojo::UserAgent;
# FIFO queue
my #urls = map { Mojo::URL->new($_) } qw(
http://sysd.org/page/1/
http://sysd.org/page/2/
http://sysd.org/page/3/
);
# Limit parallel connections to 4
my $max_conn = 4;
# User agent following up to 5 redirects
my $ua = Mojo::UserAgent
->new(max_redirects => 5)
->detect_proxy;
# Keep track of active connections
my $active = 0;
Mojo::IOLoop->recurring(
0 => sub {
for ($active + 1 .. $max_conn) {
# Dequeue or halt if there are no active crawlers anymore
return ($active or Mojo::IOLoop->stop)
unless my $url = shift #urls;
# Fetch non-blocking just by adding
# a callback and marking as active
++$active;
$ua->get($url => \&get_callback);
}
}
);
# Start event loop if necessary
Mojo::IOLoop->start unless Mojo::IOLoop->is_running;
sub get_callback {
my (undef, $tx) = #_;
# Deactivate
--$active;
# Parse only OK HTML responses
return
if not $tx->res->is_status_class(200)
or $tx->res->headers->content_type !~ m{^text/html\b}ix;
# Request URL
my $url = $tx->req->url;
say $url;
parse_html($url, $tx);
return;
}
sub parse_html {
my ($url, $tx) = #_;
say $tx->res->dom->at('html title')->text;
# Extract and enqueue URLs
for my $e ($tx->res->dom('a[href]')->each) {
# Validate href attribute
my $link = Mojo::URL->new($e->{href});
next if 'Mojo::URL' ne ref $link;
# "normalize" link
$link = $link->to_abs($tx->req->url)->fragment(undef);
next unless grep { $link->protocol eq $_ } qw(http https);
# Don't go deeper than /a/b/c
next if #{$link->path->parts} > 3;
# Access every link only once
state $uniq = {};
++$uniq->{$url->to_string};
next if ++$uniq->{$link->to_string} > 1;
# Don't visit other hosts
next if $link->host ne $url->host;
push #urls, $link;
say " -> $link";
}
say '';
return;
}
LWP::UserAgent takes arguments to post as either ref to array or ref to hash format.
http://search.cpan.org/~gaas/libwww-perl-6.04/lib/LWP/UserAgent.pm#REQUEST_METHODS
$ua->post( $url, \%form )
$ua->post( $url, \#form )
which you provide in the first script in the ref to array format "\#form"
my $response = $ua->post( $url, ['apikey' => $key, 'resource' => $md5] );
being as it is a hash this is probably better written in the hash format "\%form"
my $response = $ua->post( $url, {'apikey' => $key, 'resource' => $md5} );
In Mojo::UserAgent the arguments to post are a little more complex, but essentially appear to be a "string" of hash refs to hash keys, with which I am unfamiliar. However you may find using the hash ref format provides the expected arguments correctly.
http://search.cpan.org/~sri/Mojolicious-3.87/lib/Mojo/UserAgent.pm#post
POST
my $tx = $ua->post('kraih.com');
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => 'Hi!');
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => form => {a => 'b'});
my $tx = $ua->post('http://kraih.com' => {DNT => 1} => json => {a => 'b'});
try this ?:
$response_vt = $ua->post( $url => form => {'apikey' => $key, 'resource' => $md5} => sub {... });

Parsing JSON Data::Dumper output array in Perl

I'm trying to edit an old perl script and I'm a complete beginner. The request from the server returns as:
$VAR1 = [
{
'keywords' => [
'bare knuckle boxing',
'support group',
'dual identity',
'nihilism',
'support',
'rage and hate',
'insomnia',
'boxing',
'underground fighting'
],
}
];
How can I parse this JSON string to grab:
$keywords = "bare knuckle boxing,support group,dual identity,nihilism,support,rage and hate,insomnia,boxing,underground fighting"
Full perl code
#!/usr/bin/perl
use LWP::Simple; # From CPAN
use JSON qw( decode_json ); # From CPAN
use Data::Dumper; # Perl core module
use strict; # Good practice
use warnings; # Good practice
use WWW::TheMovieDB::Search;
use utf8::all;
use Encode;
use JSON::Parse 'json_to_perl';
use JSON::Any;
use JSON;
my $api = new WWW::TheMovieDB::Search('APIKEY');
my $img = $api->type('json');
$img = $api->Movie_imdbLookup('tt0137523');
my $decoded_json = decode_json( encode("utf8", $img) );
print Dumper $decoded_json;
Thanks.
Based on comments and on your recent edit, I would say that what you are asking is how to navigate a perl data structure, contained in the variable $decoded_json.
my $keywords = join ",", #{ $decoded_json->[0]{'keywords'} };
say qq{ #{ $arrayref->[0]->{'keywords'} } };
As TLP pointed out, all you've shown is a combination of perl arrays/hashes. But you should look at the JSON.pm documentation, if you have a JSON string.
The result you present is similar to json, but the Perl-variant of it. (ie => instead of : etc). I don't think you need to look into the json part of it, As you already got the data. You just need to use Perl to join the data into a text string.
Just to eleborate on the solution to vol7ron :
#get a reference to the list of keywords
my $keywords_list = $decoded_json->[0]{'keywords'};
#merge this list with commas
my $keywords = join(',', #$keywords_list );
print $keywords;