My code is as follows. It is super simplistic, but I just can't get it to work as intended.
use strict;
use warnings;
use CGI::Carp qw(fatalsToBrowser);
use CGI qw(-debug);
use diagnostics;
use JSON;
use Data::Dumper;
my $q = CGI->new;
my $data = $q->param('POSTDATA');
my $data_hash;
if (defined($data)) {
$data_hash = decode_json($data);
}
sub test {
my $return_hash = shift;
return \$return_hash;
}
my $return_to_print = test($data_hash);
print $q->header();
print "This is a test: \n";
print Dumper($return_to_print);
Basically I am sending JSON to the Perl script. I decode the JSON into a hashref, then I'd like to pass that data to the test sub, which does nothing more than return it so the CGI script can print it, all while keeping its structure. So far I have been unsuccessful, and I am hoping someone can shed some light on how to properly write something like this.
So in the end Dumper should print something like:
$VAR1 = { 'key' => 'value', 'key2' => 'value' };
Your code boils down to
my $data_hash = decode_json($data);
my $return_hash = $data_hash;
my $return_to_print = \$return_hash;
It should not be a surprise that $return_hash is different from $return_to_print. You assigned a reference to a scalar to $return_to_print rather than copying the scalar's value (a reference to a hash). You would need the following for them to be the same:
my $return_to_print = $return_hash;
Which is to say you'd need the following:
return $return_hash;
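To make the difference concrete, here is a minimal sketch that compares the two return styles. It assumes the core JSON::PP module in place of JSON, and the sub names wrapped and unwrapped are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;   # core module, used here as a stand-in for JSON

my $data_hash = decode_json('{"key":"value","key2":"value"}');

# Hypothetical helpers: one returns a reference to the scalar,
# the other returns the hash reference itself.
sub wrapped   { my $h = shift; return \$h; }
sub unwrapped { my $h = shift; return $h; }

my $w = wrapped($data_hash);
my $u = unwrapped($data_hash);

print ref($w), "\n";       # REF  (a reference to a reference)
print ref($u), "\n";       # HASH (the hash reference itself)
print $$w->{key}, "\n";    # value, but only after an extra dereference
print $u->{key}, "\n";     # value
```

Dumper shows the same thing: the wrapped version prints the hash inside an extra \ layer, while the unwrapped one prints the plain hash.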
I have an XML file from which I am extracting some values based on a regex.
The XML file looks like this:
<Instance>Fuse_Name</Instance>
<Id>8'hed</Id>
<SomeAddr>17'h00baf</SomeAddr>
<PSomeAddr>17'h00baf</PSomeAddr>
I want to retrieve the value 17'h00baf from the "SomeAddr" tag. I match the regex "SomeAddr" to reach that row in the file, then use the index and substr functions to retrieve the value with the code below:
my $i = index($row,">");
my $j = index($row,"<");
$Size_in_bits = substr $row,$i+1,$j-$i-3;
But after doing this I do not get 17'h00baf. Instead I get 17'h01191. With a similar approach I am able to extract other values that are decimal or string; only with the hexadecimal values do I face this problem. Can somebody please tell me what is wrong with the approach?
Please don't parse XML with regexes. Use a proper XML parser.
But, ignoring that advice temporarily, I don't get the behaviour you describe when testing your code.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
while (<DATA>) {
next unless /<SomeAddr>/;
my $i = index($_, ">");
my $j = index($_, "<");
my $Size_in_bits = substr $_, $i + 1, $j - $i - 3;
say $Size_in_bits;
}
__END__
<Instance>Fuse_Name</Instance>
<Id>8'hed</Id>
<SomeAddr>17'h00baf</SomeAddr>
<PSomeAddr>17'h00baf</PSomeAddr>
And running it:
$ perl parsexml
17'h00baf
Of course, I've had to guess at what a lot of your code looks like because you didn't give us a complete example to test. So it looks likely that your problems are in bits of the code that you haven't shown us.
(My guess would be that there's another <SomeAddr> tag in the file somewhere.)
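As a side note, here is a small sketch of why the index/substr arithmetic produces the right answer on the sample row. It assumes the row ends in a single newline, as it does when read from a file without chomp; the variable names mirror the question.

```perl
use strict;
use warnings;

# One row of the sample data, with its trailing newline.
my $row = "<SomeAddr>17'h00baf</SomeAddr>\n";

my $i   = index($row, ">");   # 9: the end of the opening tag
my $j   = index($row, "<");   # 0: the first "<" is the opening tag itself
my $len = $j - $i - 3;        # -12: a negative length trims from the end

# With a negative length, substr leaves that many characters off the end,
# which here is exactly the closing tag plus the newline.
my $value = substr $row, $i + 1, $len;

print "$i $j $len $value\n";  # 9 0 -12 17'h00baf
```

So the code does not measure the distance between the two tags at all. It relies on the closing tag being one character longer than the opening tag and on the line ending right after it, which is one more reason to prefer a real parser.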
Never, ever use a regex to parse HTML/XML/.... Always use a proper parser and then implement your algorithm in the DOM domain.
My solution shows how to parse the XML and then extract the text content of every <SomeAddr> node in the document (the XPath expression // matches at any depth):
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML;
my $doc = XML::LibXML->load_xml(IO => \*DATA);
my $xpc = XML::LibXML::XPathContext->new();
# register default NS
$xpc->registerNs('default', 'http://some.domain.com/some/path/to');
foreach my $node ($xpc->findnodes('//default:SomeAddr', $doc)) {
print $node->textContent, "\n";
}
exit 0;
__DATA__
<Root xmlns="http://some.domain.com/some/path/to">
<Instance>Fuse_Name</Instance>
<Id>8'hed</Id>
<SomeAddr>17'h00baf</SomeAddr>
<PSomeAddr>17'h00baf</PSomeAddr>
</Root>
Test run
$ perl dummy.pl
17'h00baf
I have been writing part of a website I'm making. Part of the stats page will display information from a website's JSON response.
The address of the website is: http://steamcommunity.com/market/listings/440/Name%20Tag/render/?count=1&start=1&query=.
Here is a link to a parser that makes the response easier to read: http://json.parser.online.fr/.
The code I have written so far works, but no matter what I try I can't get the information I need.
use JSON::XS;
use WWW::Mechanize;
use HTTP::Cookies;
use LWP::Simple;
use strict;
use warnings;
my $url = "http://steamcommunity.com/market/listings/440/Name%20Tag/render/?count=2&start=2";
my $json = get $url;
my $data = decode_json $json;
my $info = $data -> {listinginfo};
My problem is that I would like to access the price of the listing; however, when new listings become available, the key that identifies them changes. I have no idea how to deal with this, and Google is not helping. Any help would be greatly appreciated. Thanks in advance.
Seb Morris.
EDIT: Thanks for the replies, I have progressed my code and ended up with:
my $data = decode_json $json;
my @infoids = keys %{$data -> {listinginfo}};
foreach my $infoid (@infoids) {
my $price = $data -> {listinginfo}{$infoid}{converted_price};
print "$price" . "\n";
}
However, I am getting the error: Use of uninitialized value $price in string at line 30. I don't understand why I am getting this error, as I have declared the variable. Any help would be really appreciated.
If I understand correctly, your problem is that the listinginfo object contains one or more keys that change with each request, and you don't know how to find out what the key is for the request you just made.
You can find the keys of a Perl hash using the 'keys' function. So you can get all of the keys of the listinginfo hash like this:
my @infoids = keys %{$data -> {listinginfo}};
Note the need to use %{ } to dereference listinginfo, which is itself a hash reference.
There could be more than one info ID, although when I tested the web service you linked in your question it only ever returned one. If you are sure there will only ever be one, you can use:
my $price = $data -> {listinginfo}{$infoids[0]}{price};
Or, if there might be more than one, you can loop through them:
foreach my $infoid (@infoids) {
my $price = $data -> {listinginfo}{$infoid}{price};
# Now do something with price
}
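Here is a self-contained sketch of the loop that you can run without hitting the web service. The JSON below is a made-up stand-in for the real response (the listing IDs and prices are hypothetical, and the real field names may differ), and it uses the core JSON::PP module rather than JSON::XS.

```perl
use strict;
use warnings;
use JSON::PP;   # core module, used here as a stand-in for JSON::XS

# Hypothetical sample response: the IDs "111" and "222" and the prices are made up.
my $json = '{"listinginfo":{"111":{"price":95},"222":{"price":120}}}';
my $data = decode_json($json);

my @lines;
for my $infoid (sort keys %{ $data->{listinginfo} }) {
    my $price = $data->{listinginfo}{$infoid}{price};
    next unless defined $price;   # skip listings that lack the field
    push @lines, "$infoid: $price";
}
print "$_\n" for @lines;
```

The defined check matters because declaring a variable with my does not give it a value. If the key you look up is missing from a listing, the lookup returns undef and printing it produces the "uninitialized value" warning.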
I've been poking at this and can't get around this "unblessed reference" error. Here's my simplified code:
#!/usr/local/bin/perl
use strict;
use warnings;
use HTML::TokeParser;
my $p = HTML::TokeParser->new( $ARGV[0] );
while (my $t = $p->get_tag('img')) {
my $src = $t->get_attr('src');
print "$src\n";
}
And here's the error message when I try it:
Can't call method "get_attr" on unblessed reference at M:\list_images_in_html.pl line 9.
I gather that somehow it's not recognizing $t as a token object with a get_attr method, but I don't understand why.
According to the manual (HTML::TokeParser at MetaCPAN), get_tag() returns an array reference, not an object.
You cannot call get_attr() on a bog standard array ref.
get_attr is a convenience method in HTML::TokeParser::Simple (a wrapper for HTML::TokeParser) but does not exist in HTML::TokeParser.
Replace two lines in your code with this:
use HTML::TokeParser::Simple;
my $p = HTML::TokeParser::Simple->new( $ARGV[0] );
and your script will work.
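If you would rather stay with plain HTML::TokeParser, you can index into the array reference instead. This is a minimal sketch that parses a made-up HTML string (passed as a scalar reference) rather than a file, so the markup and file names are assumptions.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical sample markup, parsed from a string instead of $ARGV[0].
my $html = '<p><img src="a.png"><img src="b.png" alt="b"></p>';
my $p = HTML::TokeParser->new(\$html);

my @sources;
while (my $t = $p->get_tag('img')) {
    # For a start tag, $t is [$tag, \%attr, \@attrseq, $text],
    # so the attribute hash is the second element.
    my $src = $t->[1]{src};
    push @sources, $src if defined $src;
}
print "$_\n" for @sources;
```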
I want to parse a website into a Perl data structure.
First I load the page with
use LWP::Simple;
my $html = get("http://f.oo");
Now I know two ways to deal with it.
The first is regular expressions and the second is modules.
I started by reading about HTML::Parser and found some examples.
But I'm not that sure about my Perl knowledge.
My code example goes on
my @links;
my $p = HTML::Parser->new();
$p->handler(start => \&start_handler,"tagname,attr,self");
$p->parse($html);
foreach my $link(@links){
print "Linktext: ",$link->[1],"\tURL: ",$link->[0],"\n";
}
sub start_handler{
return if(shift ne 'a');
my ($class) = shift->{href};
my $self = shift;
my $text;
$self->handler(text => sub{$text = shift;},"dtext");
$self->handler(end => sub{push(@links,[$class,$text]) if(shift eq 'a')},"tagname");
}
I don't understand why there are two shifts. The second should be the self pointer. But the first makes me think that the self reference has already been shifted, used as a hash, and the value for href stored in $class. Could someone explain this line (my ($class) = shift->{href};)?
Besides this gap in my understanding, I do not want to parse all the URLs. I want to put all the code between <div class="foo"> and </div> into a string. There is a lot of code in between, especially other <div></div> tags, so either I or a module has to find the right closing tag.
After that I plan to scan the string again to find specific elements, such as <h1>, <h2>, <p class="foo2"></p>, etc.
I hope this information helps you give me some useful advice. Please keep in mind that, first of all, I want an easy-to-understand approach; it does not need great performance at this stage.
HTML::Parser is more of a tokenizer than a parser. It leaves a lot of hard work up to you. Have you considered using HTML::TreeBuilder (which uses HTML::Parser) or XML::LibXML (a great library which has support for HTML)?
Use HTML::TokeParser::Simple.
Untested code based on your description:
#!/usr/bin/env perl
use strict; use warnings;
use HTML::TokeParser::Simple;
my $p = HTML::TokeParser::Simple->new(url => 'http://example.com/example.html');
my $level;
while (my $tag = $p->get_tag('div')) {
my $class = $tag->get_attr('class');
next unless defined($class) and $class eq 'foo';
$level += 1;
while (my $token = $p->get_token) {
$level += 1 if $token->is_start_tag('div');
$level -= 1 if $token->is_end_tag('div');
print $token->as_is;
unless ($level) {
last;
}
}
}
No need to get so complicated. You can retrieve and find elements in the DOM using CSS selectors with Mojo::UserAgent:
say Mojo::UserAgent->new->get('http://f.oo')->res->dom->find('div.foo');
or, loop through the elements found:
say $_ for Mojo::UserAgent->new->get('http://f.oo')->res->dom
->find('div.foo')->each;
or, loop using a callback:
Mojo::UserAgent->new->get('http://f.oo')->res->dom->find('div.foo')->each(sub {
my ($el, $count) = @_;
say "$count: $el";
});
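According to the docs, the arguments a handler receives are determined by its argspec. This handler was registered with "tagname,attr,self", so it is called with ($tagname, \%attr, $self). There are three shifts, one for each argument.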
my ($class) = shift->{href};
is equivalent to:
my $class;
my %attr;
my $attr_ref;
$attr_ref = shift;
%attr = %$attr_ref;
$class = $attr{'href'};
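To watch the three shifts happen without HTML::Parser involved, here is a small sketch that calls a look-alike sub directly. The sub name demo_handler and the 'parser-stand-in' string are made up; a real handler would receive the parser object as its third argument.

```perl
use strict;
use warnings;

# Hypothetical stand-in for start_handler, called by hand with the same
# three arguments the "tagname,attr,self" argspec would supply.
sub demo_handler {
    my $tagname = shift;            # first shift: the tag name
    return if $tagname ne 'a';
    my ($href) = shift->{href};     # second shift: the attribute hash ref
    my $self = shift;               # third shift: the parser object
    return "$href via $self";
}

my $result  = demo_handler('a', { href => 'http://f.oo' }, 'parser-stand-in');
my $skipped = demo_handler('div', { class => 'foo' }, 'parser-stand-in');

print "$result\n";                                      # http://f.oo via parser-stand-in
print defined $skipped ? "handled\n" : "skipped\n";     # skipped
```

Each shift removes the next argument from @_, so shift->{href} dereferences whatever is at the front of @_ at that point, which is the attribute hash.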
I have a perl script using CGI.
The browser calls it with some parameters.
I want to take those parameters, modify some of them and then send back a redirect with a new querystring representing the modified parameters.
I know that I could do this, like this:
my $cgi = CGI->new();
my %vars = $cgi->Vars;
$vars{'modify_me'} .= ' more stuff';
my $serialized = join '&', map {$_.'='.$cgi->escapeHTML($vars{$_})} keys %vars;
However, this just feels like it might be missing something. In addition, it doesn't do anything to handle multivalued parameters. Who knows what else it fails to do.
So, is there a module out there that just deals with this problem? I'm not interested in reinventing a wheel that a more talented wright wrought. Right?
The URI module is your friend. It has a query_form method that takes a hash, hashref or arrayref of parameters and generates a query string from it.
It will URL Encode your data for you (and note that you do want it URL Encoded and not HTML Encoded).
So you might have something like:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;
use URI;
my $q = CGI->new;
my @data = map {
my $name = $_;
my @values = $q->param($name);
my $value;
if (scalar @values == 1) {
($value) = @values;
} else {
$value = \@values;
}
if ($name eq "foo") {
$value = "replaced";
}
($name, $value);
} $q->param;
my $uri = URI->new('http://example.com/myAlternative.cgi');
$uri->query_form(\@data);
print $q->redirect(
-uri=> $uri,
-status => 301
);
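To see what query_form does with a multivalued parameter on its own, here is a minimal sketch without the CGI part. The parameter names and values are made up for illustration.

```perl
use strict;
use warnings;
use URI;

# Hypothetical parameter names and values.
my $uri = URI->new('http://example.com/myAlternative.cgi');
$uri->query_form([
    foo => 'replaced',
    tag => [ 'a', 'b c' ],   # array ref: the key is repeated once per value
]);

# Stringifying the URI object gives the full URL with the encoded query.
print "$uri\n";
```

Passing an array reference rather than a hash keeps the parameters in the order you supplied them, which is why the example above builds a list of name/value pairs.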
Have you looked at Data::URIEncode or URI::QueryParam?
Turns out, there's a way to achieve my specific need using just the CGI module. However, the other answers cover a wider need, to serialize an arbitrary hash.
If you want to modify incoming parameters and then create a link to the same script with modified parameters you can do this:
my $params = $cgi->Vars;
# Modify the values in the hash that $params references
my $new_url = $cgi->self_url(); # URL with modified parameters