I'm trying to parse some JSON data from the Fandom (Wikia) API. When I browse to my marvel.fandom.com/api request I get the following JSON output: {"batchcomplete":"","query":{"pages":{"45910":{"pageid":45910,"ns":0,"title":"Uncanny X-Men Vol 1 171"}}}}
Nothing too fancy to begin with, and running it through an online JSON parser gives the following output:
{
"batchcomplete":"",
"query":{
"pages":{
"45910":{
"pageid":45910,
"ns":0,
"title":"Uncanny X-Men Vol 1 171"
}
}
}
}
which seems to be OK as far as I can see.
I want to get the pageid for use in several other requests, but I can't seem to get the same output through Perl.
The script:
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
use JSON;
use Data::Dumper;
my $url = "https://marvel.fandom.com/api.php?action=query&titles=Uncanny%20X-Men%20Vol%201%20171&format=json";
my $json = getprint( $url);
die "Could not get $url!" unless defined $json;
my $decoded_json = decode_json($json);
print Dumper($decoded_json);
but this gives the following error:
Could not get https://marvel.fandom.com/api.php?action=query&titles=Uncanny%20X-Men%20Vol%201%20171&format=json! at ./marvelScraper.pl line 11.
When I change the get to getprint for some extra info, I get this:
500 Can't connect to marvel.fandom.com:443
<URL:https://marvel.fandom.com/api.php?action=query&titles=Uncanny%20X-Men%20Vol%201%20171&format=json>
malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before "(end of string)") at ./script.pl line 13.
I tried this on another computer and still get the same errors.
The versions of LWP::Simple and LWP::Protocol::https
/usr/bin/perl -MLWP::Simple -E'say $LWP::Simple::VERSION'
6.15
/usr/bin/perl -MLWP::Protocol::https -E'say $LWP::Protocol::https::VERSION'
6.09
Apparently it has something to do with Bash on Ubuntu on Windows, since on Ubuntu 18.04 I get the following response with the same script:
JSON text must be an object or array (but found number, string, true, false or null, use allow_nonref to allow this) at ./test.pl line 13.
{"batchcomplete":"","query":{"pages":{"45910":{"pageid":45910,"ns":0,"title":"Uncanny X-Men Vol 1 171"}}}}
Actually, the very same script works from my Bash on Ubuntu on Windows when I use get() instead of the getprint() that appears in your edited question.
orabig@Windows:~/DEV$ ./so.pl
$VAR1 = {
'query' => {
'pages' => {
'45910' => {
'pageid' => 45910,
'ns' => 0,
'title' => 'Uncanny X-Men Vol 1 171'
}
}
},
'batchcomplete' => ''
};
So maybe you have another issue that has nothing to do with Perl or Ubuntu.
Can you try this, for example?
curl -v 'https://marvel.fandom.com/api.php?action=query&titles=Uncanny%20X-Men%20Vol%201%20171&format=json'
Maybe you just hit the site too often, and the 500 error is the result of some anti-leech protection?
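Once the request itself succeeds, extracting the pageid could look like the sketch below. It uses the core JSON::PP module and the sample response from the question rather than a live request, and page_ids is a hypothetical helper name. Note that LWP::Simple's getprint() prints the body and returns the HTTP status code, which would explain why decode_json complained about a bare number on Ubuntu 18.04; get() is the call that returns the body.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Sample response copied from the question; in a real script this text
# would come from get($url), not from getprint($url).
my $body = '{"batchcomplete":"","query":{"pages":{"45910":{"pageid":45910,"ns":0,"title":"Uncanny X-Men Vol 1 171"}}}}';

# Hypothetical helper: return every pageid found under query.pages.
sub page_ids {
    my ($json_text) = @_;
    my $data  = decode_json($json_text);
    my $pages = $data->{query}{pages} || {};
    return sort { $a <=> $b } map { $_->{pageid} } values %$pages;
}

my @ids = page_ids($body);
print "pageid: $_\n" for @ids;
```

The keys under "pages" are the page ids themselves, so iterating over the values avoids having to know the id in advance.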
Related
I am working with the Text::CSV library of Perl to import data from a CSV file, using the functional interface. The data is stored in an array of hashes, and the problem is that when the script tries to access those elements/keys, they are uninitialized (or undefined).
Using Data::Dumper, it is possible to see that the array and the hashes are not empty; in fact, they are correctly filled with the data from the CSV file.
With this small piece of code, I get the following output:
my $array = csv(
in => $csv_file,
headers => 'auto');
foreach my $mem (@{$array}) {
print Dumper $mem;
foreach (keys $mem) {
print $mem{$_};
}
}
Last part of the output:
$VAR1 = {
'Column' => '16',
'Width' => '13',
'Type' => 'RAM',
'Depth' => '4096'
};
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
This happens with all the elements of the array. Is this problem related to the encoding, or am I simply accessing the elements in an incorrect way?
$mem is a reference to a hash, but you keep trying to use it directly as a hash. Change your code to:
foreach (keys %$mem) {
print $mem->{$_};
}
There is a slight complication in that in some versions of perl, 'keys $mem' was allowed directly as an experimental feature, which later got removed. In any case, adding
use warnings;
use strict;
would likely have given you some helpful clues as to what was happening.
When I run your code on my version of Perl (5.24), I get this error:
Experimental keys on scalar is now forbidden at ... line ...
This points to the line:
foreach (keys $mem) {
You should dereference the hash ref:
use warnings;
use strict;
use Data::Dumper;
use Text::CSV qw( csv );
my $csv_file="data.csv";
my $array = csv(
in => $csv_file,
headers => 'auto');
foreach my $mem (@{$array}) {
print Dumper($mem);
foreach (keys %{ $mem }) {
print $mem->{$_}, "\n";
}
}
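The dereferencing rules in these answers can be tried without a CSV file. The sketch below assumes a hand-built array of hash references standing in for what csv() returns; the row values are taken from the Dumper output in the question.

```perl
use strict;
use warnings;

# Stand-in for the array of hash references that csv() returns;
# the row is taken from the Dumper output in the question.
my $rows = [
    { Column => '16', Width => '13', Type => 'RAM', Depth => '4096' },
];

my @lines;
for my $row (@$rows) {                  # @$rows dereferences the array ref
    for my $key (sort keys %$row) {     # %$row dereferences the hash ref
        # $row->{$key} reads one value through the reference
        push @lines, "$key=$row->{$key}";
    }
}
print "$_\n" for @lines;
```

Writing $mem{$_} instead of $mem->{$_} looks up a separate hash named %mem, which is why strict would have rejected the original code at compile time.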
I really need your help understanding the following Perl example code:
#!/usr/bin/perl
# Hashtest
use strict;
use DBI;
use DBIx::Log4perl;
use Data::Dumper;
use utf8;
if (my $dbh = DBIx::Log4perl->connect("DBI:mysql:myDB","myUser","myPassword",{
RaiseError => 1,
PrintError => 1,
AutoCommit => 0,
mysql_enable_utf8 => 1
}))
{
my $data = undef;
my $sql_query = <<EndOfSQL;
SELECT 1
EndOfSQL
my $out = $dbh->prepare($sql_query);
$out->execute() or exit(0);
my $row = $out->fetchrow_hashref();
$out->finish();
# Debugging
print Dumper($row);
$dbh->disconnect;
exit(0);
}
1;
If I run this code on two machines, I get different results.
Result on machine 1 (the result I need, with an integer value):
arties@p51s:~$ perl hashTest.pl
Log4perl: Seems like no initialization happened. Forgot to call init()?
$VAR1 = {
'1' => 1
};
Result on machine 2 (the result that causes trouble because of the string value):
arties@core3:~$ perl hashTest.pl
Log4perl: Seems like no initialization happened. Forgot to call init()?
$VAR1 = {
'1' => '1'
};
As you can see, on machine 1 the value from MySQL is interpreted as an integer, and on machine 2 as a string.
I need the integer value on both machines. It is not possible to modify the hash afterwards, because the original code has too many values that would have to be changed...
Both machines use DBI 1.642 and DBIx::Log4perl 0.26.
The only difference is the Perl version: machine 1 (v5.26.1) vs. machine 2 (v5.14.2).
So the big question is: how can I make sure I always get the integer in the hash as the result?
Update 10.10.2019:
To show the problem more clearly, I have extended the example above:
...
use Data::Dumper;
use JSON; # <-- Inserted
use utf8;
...
...
print Dumper($row);
# JSON Output
print JSON::to_json($row)."\n"; # <-- Inserted
$dbh->disconnect;
...
Now the output on machine 1, with the JSON output on the last line:
arties@p51s:~$ perl hashTest.pl
Log4perl: Seems like no initialization happened. Forgot to call init()?
$VAR1 = {
'1' => 1
};
{"1":1}
Now the output on machine 2, with the JSON output on the last line:
arties@core3:~$ perl hashTest.pl
$VAR1 = {
'1' => '1'
};
{"1":"1"}
You can see that both Data::Dumper and JSON show the same behavior. And as I wrote before, +0 is not an option because the original hash is much more complex.
Both machines use JSON 4.02.
@Nick P: The question you linked, "Why does DBI implicitly change integers to strings?", had the solution: the DBD::mysql version was different on the two systems! So I upgraded machine 2 from version 4.020 to version 4.050, and now both systems give the same result. And integers are integers ;-)
So the result on both machines is now:
$VAR1 = {
'1' => 1
};
{"1":1}
Thank you!
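For readers wondering why the driver version changes the JSON output: the encoder looks at whether a scalar was created as a number or as a string. The sketch below illustrates that with the core JSON::PP module instead of a database connection; the two hashes are hand-built stand-ins for what each DBD::mysql version returned, not actual driver output.

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts keys so the output is stable.
my $json = JSON::PP->new->canonical;

my $as_number = { value => 1 };     # stand-in for the newer driver's row
my $as_string = { value => '1' };   # stand-in for the older driver's row

my $out_number = $json->encode($as_number);
my $out_string = $json->encode($as_string);

# Adding 0 yields a fresh numeric value without touching the original.
my $out_fixed = $json->encode({ value => $as_string->{value} + 0 });

print "$out_number\n$out_string\n$out_fixed\n";
```

The + 0 line is only there to show the mechanism; with a large nested hash, fixing the value at the driver level, as described above, avoids having to touch every field.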
I am using Perl's LWP::UserAgent to get a response from an API. Everything works well except for one issue.
The API returns its response in JSON format, but I get it as a string when I fetch it through the LWP module, something like the following:
$VAR1 = '
{"status":"success","data":[{"empid":"345232","customername":"Lee gates","dynamicid":"2342342332sd32423"},{"empid":"36.VLXP.013727..CBCL..","customername":"Lee subdirectories","dynamicid":"223f3423dsf23423423"}],"message":""}'
I did "print Dumper $response" to get the output.
One more thing: the challenge is that my client does not want to use a Perl module for JSON (use JSON::Parse 'parse_json';).
Any help would be appreciated!
You need to decode the JSON string into a Perl data structure. If your version of perl is 5.14+, JSON::PP is included in core, so nothing to install.
use warnings;
use strict;
use Data::Dumper;
use JSON::PP qw(decode_json);
my $json = '{"status":"success","data":[{"empid":"345232","customername":"Lee gates","dynamicid":"2342342332sd32423"},{"empid":"36.VLXP.013727..CBCL..","customername":"Lee subdirectories","dynamicid":"223f3423dsf23423423"}],"message":""}';
my $perl = decode_json $json;
print Dumper $perl;
Output:
$VAR1 = {
'status' => 'success',
'message' => '',
'data' => [
{
'dynamicid' => '2342342332sd32423',
'customername' => 'Lee gates',
'empid' => '345232'
},
{
'empid' => '36.VLXP.013727..CBCL..',
'customername' => 'Lee subdirectories',
'dynamicid' => '223f3423dsf23423423'
}
]
};
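Once decoded, the result is an ordinary Perl structure of hash and array references. As a small sketch of how the fields might then be read (the @summary array is introduced here purely for illustration):

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Response body copied from the question.
my $json = '{"status":"success","data":[{"empid":"345232","customername":"Lee gates","dynamicid":"2342342332sd32423"},{"empid":"36.VLXP.013727..CBCL..","customername":"Lee subdirectories","dynamicid":"223f3423dsf23423423"}],"message":""}';

my $response = decode_json($json);

my @summary;
if ($response->{status} eq 'success') {
    # "data" is an array reference holding one hash reference per record.
    for my $record (@{ $response->{data} }) {
        push @summary, "$record->{empid}: $record->{customername}";
    }
}
print "$_\n" for @summary;
```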
My Perl script sends push notifications to an Apple APNS server. It works except when I try to send emojis (special characters).
My code
use DBI;
use JSON;
use Net::APNS::Persistent;
use Data::Dumper;
use Encode;
my $cfg;
my $apns;
...;
sub connect {
my ($sandbox, $cert, $key, $pass) = $cfg->getAPNSServer();
$apns = Net::APNS::Persistent->new({
sandbox => $sandbox,
cert => $cert,
key => $key,
}) or die("[-] Unable to connect to APNS server");
}
sub push {
my $msg = $_[1];
Logger::log(5, "[APNS Client] Got message ".Dumper($msg));
#Encode::_utf8_off($msg);
utf8::encode($msg);
my $pack = decode_json($msg);
my ($token, $payload) = @{$pack};
Logger::log(5, "Sending push with token: $token and Data: \n".Dumper($payload));
$apns->queue_notification(
$token,
$payload
);
$apns->send_queue;
}
So in the push subroutine I pass JSON data with the format given below. My problem is with the emoji character \x{2460}. You can see I added this line
utf8::encode($msg);
Before decoding the data. If I remove this line I get an error while decoding the JSON data
Wide character in subroutine entry at .....
With the above line added I can decode my JSON data. However, when I then try to write to the socket ($apns->send_queue), I get
Cannot decode string with wide characters at /usr/lib/perl/5.10/Encode.pm line 176
How do I solve this?
Message format (JSON)
["token",
{
"aps":{
"alert":"Alert: \x{2460}",
"content-available":1,
"badge":2,
"sound":"default.aiff"
},
"d":"Meta"
}
]
Dumper Output
[-] [ 2015-08-25T20:03:15 ] [APNS Client] Got message $VAR1 = "[\"19c360f37681035730a26cckjgkjgkj58b2d20326986f4265ee802c103f51\",{\"aps\":{\"alert\":\"Alert: \x{24bc}\",\"content-available\":1,\"badge\":2,\"sound\":\"default.aiff\"},\"d\":\"Meta\"}]";
[-] [ 2015-08-25T20:03:15 ] Sending push with token: 119c360f37681035730a26cckjgkjgkj58b2d20326986f4265ee802c103f51 and Data:
$VAR1 = {
'aps' => {
'alert' => "Alert: \x{24bc}",
'content-available' => 1,
'badge' => 2,
'sound' => 'default.aiff'
},
'd' => 'Meta'
};
[x] [ 2015-08-25T20:03:15 ] [APNS Client] Error writing to socket. Reconnecting : Cannot decode string with wide characters at /usr/lib/perl/5.10/Encode.pm line 176.
First of all, decode_json expects JSON encoded using UTF-8, so if you're starting with "decoded" JSON, it is proper to encode it as you did.
utf8::encode( my $json_utf8 = $json_uni );
my $data = decode_json($json_utf8);
However, it would have been simpler to use from_json.
my $data = from_json($json_uni);
Now on to your question. Whoever wrote Net::APNS::Persistent messed up big time. I looked at the source code, and they expect the alert message to be encoded using UTF-8. Adding the following will make your structure conform with the module's wonky expectation:
utf8::encode(
ref($payload->{aps}{alert}) eq 'HASH'
? $payload->{aps}{alert}{body}
: $payload->{aps}{alert}
);
It wouldn't surprise me if you ran into other issues. Notably, the module uses the bytes module, a sure sign that something is being done incorrectly.
You probably have to UTF-8 encode the alert in $payload before sending it. You can also use from_json instead of decode_json to avoid the first encoding step:
sub push {
my $msg = $_[1];
Logger::log(5, "[APNS Client] Got message ".Dumper($msg));
my $pack = from_json($msg);
my ($token, $payload) = @{$pack};
Logger::log(5, "Sending push with token: $token and Data: \n".Dumper($payload));
# UTF-8 encode before sending.
utf8::encode($payload->{aps}{alert});
$apns->queue_notification(
$token,
$payload
);
$apns->send_queue;
}
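The difference between the character and byte forms discussed in these answers can be checked without an APNS connection. The sketch below uses the core JSON::PP module; a JSON::PP object without the utf8 option is assumed here to play the role that from_json plays in the JSON module, and the shortened message is illustrative.

```perl
use strict;
use warnings;
use JSON::PP;

# A character string (not bytes) containing U+2460, as in the question.
my $json_chars = qq(["token",{"aps":{"alert":"Alert: \x{2460}"}}]);

# Without the utf8 option the decoder expects characters.
my $from_chars = JSON::PP->new->decode($json_chars);

# decode_json expects UTF-8 bytes, so encode a copy first.
my $json_bytes = $json_chars;
utf8::encode($json_bytes);
my $from_bytes = JSON::PP::decode_json($json_bytes);

# Both routes produce the same decoded alert text.
my $alert = $from_chars->[1]{aps}{alert};

# Encode a copy to UTF-8 bytes, the form the answers say the module expects.
my $alert_bytes = $alert;
utf8::encode($alert_bytes);

printf "characters: %d, bytes: %d\n", length($alert), length($alert_bytes);
```

The circled digit is one character but three UTF-8 bytes, so the byte form is two units longer than the character form.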
I have tried my script on Mac OS Mavericks (perl 5.16.2) and Yosemite and also with Windows 7 (strawberry-perl-5.20.1.1-64bit-portable).
It is supposed to read UTF-8 data (Russian text), put it into a data structure, and finally print the data structure as a JSON string (the output will be used to feed Core Data in an iOS word game).
The first part (extracting the words and printing them to verify) works well, but the final part does not: the resulting JSON string contains garbage.
Does anybody know how to fix my simple test script?
#!/usr/bin/perl -w
use strict;
use warnings;
use utf8;
use JSON;
binmode(STDOUT, ':utf8');
my $root = { words => [] };
while (<DATA>) {
chomp;
utf8::decode($_);
my @a = split /\s*[:,]\s*/;
my $words = [];
for my $word (@a[1 .. $#a]) {
print "WORD: $word\n";
#push @$words, utf8::encode($word);
push @$words, $word;
}
push @{$root->{words}}, $words;
}
print to_json($root, {utf8 => 1, pretty => 1});
__DATA__
Голова: небо, язык, мозг, глотка, надгортанник, пищевод, горло, гортань
Сумки: портмоне, кошелек, портфель, рюкзак, лямка, застежка
You're double encoding. You're encoding using to_json (utf8 => 1), then you're encoding again when outputting to STDOUT (binmode(STDOUT, ':utf8');).
The solution isn't clear, because it's not clear what you are trying to achieve. If you're really going to output non-JSON and JSON to STDOUT, don't ask to_json to encode.
The output looks "wrong", but that's OK: it's encoded. To see it correctly, just set
binmode STDOUT, ':raw';
before printing the JSON.
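The double encoding described above can be reproduced with string lengths alone. This is a minimal sketch using the core Encode and JSON::PP modules rather than the JSON module from the question; the word is one of those in the DATA section.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);
use JSON::PP;

# One word from the DATA section, written with escapes so the example
# does not depend on the encoding of this source file.
my $word = "\x{43d}\x{435}\x{431}\x{43e}";

my $once  = encode_utf8($word);   # 4 characters become 8 bytes
my $twice = encode_utf8($once);   # each of those bytes is encoded again

# encode_json returns UTF-8 bytes; an encoder without the utf8 option
# returns a character string.
my $json_bytes = JSON::PP::encode_json([$word]);
my $json_chars = JSON::PP->new->encode([$word]);

printf "once: %d, twice: %d\n", length($once), length($twice);
printf "json bytes: %d, json chars: %d\n",
    length($json_bytes), length($json_chars);
```

Printing the byte form through a handle with a :utf8 layer applies the second encoding step, which is what produces the garbage; either print bytes to a raw handle or characters to an encoding handle.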
You can simplify the script by using encode_json:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use JSON;
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");
my $root;
while (<DATA>) {
chomp;
my @words = split /\s*[:,]\s*/;
push @{ $root->{words} }, [];
for my $word (@words[1 .. $#words]) {
print "WORD: $word\n";
push @{ $root->{words}[-1] }, $word;
}
}
my $json = encode_json($root);
binmode STDOUT, ':raw';
print $json;