I have a simple Perl script and I want to remove everything up to the word "city". Or remove everything up to the nth occurrence (the 2nd in my particular case) of the comma's " , ". Here's what is looks like below.
#!/usr/bin/perl
use warnings;
use strict;
my $CMD = `curl http://ip-api.com/json/8.8.8.8`;
chomp($CMD);
my $find = "^[^city]*city";
$CMD =~ s/$find//;
print $CMD;
The output is this:
{"as":"AS15169 Google Inc.","city":"Mountain View","country":"United States","countryCode":"US","isp":"Google","lat" :37.386,"lon":-122.0838,"org":"Google","query":"8.8.8.8","region":"CA","regionName":"California","status":"success","timezone":"America/Los_Angeles","zip":"94035"}
So i want do drop
" {"as":"AS15169 Google Inc.","
or drop up to
{"as":"AS15169 Google Inc.","city":"Mountain View",
EDIT:
I see I was doing far too much when matching the string. I simplified the fix for my problem with removing all before "city". My $find has been changed to
my $find = ".*city";
While I also changed the replace function like so,
$CMD =~ s/$find/city/;
Still haven't figured out how to remove all before the nth occurrence of a comma or any character / string for that matter.
The content you get back is JSON, so you can easily turn it into a Perl data structure, play with it, and even turn it back into JSON if you like. That's the point! And, it's so easy:
use Mojo::UserAgent;
use Mojo::JSON qw(decode_json encode_json);
my $ua = Mojo::UserAgent->new;
my $tx = $ua->get( 'http://ip-api.com/json/8.8.8.8' );
my $json = $tx->res->body;
my $perl = decode_json( $json );
delete $perl->{'as'};
my $new_json = encode_json( $perl );
print $new_json;
Mojolicious is wonderful for this. It's my preferred way for dealing with JSON even without the user-agent stuff. If you play with the JSON string directly, you're likely to have problems when the order of elements change or it contains wide characters.
You don't have to manually decode_json() with Mojolicious. Simply do this:
my $tx = $ua->get('http://ip-api.com/json/8.8.8.8');
my $json = $tx->res->json;
my $as = $json->{as}
You can even go fancy with JSON pointers:
my $as = $tx->res->json("/as");
Something like
#!/usr/bin/perl -w
my $results = `curl http://ip-api.com/json/8.8.8.8`;
chomp $results;
$results =~ s/^.*city":"\w+\s?\w+",//g;
print $results . "\n";
should do the trick.. unless there's a misunderstanding of what you want to keep v.s. remove.
FYI, http://regexr.com/ is totally my go to for regex happiness.
Related
I have a Perl script that reads in data from a database and prints out the result in HTML forms/tables. The form of each book also contains a submit button.
I want Perl to create a text file (or read into one already created) and print the title of the book that was inside the form submitted. But I can't seem to get param() to catch the submit action!
#!/usr/bin/perl -w
use warnings; # Allow for warnings to be sent if error's occur
use CGI; # Include CGI.pm module
use DBI;
use DBD::mysql; # Database data will come from mysql
my $dbh = DBI->connect('DBI:mysql:name?book_store', 'name', 'password')
or die("Could not make connection to database: $DBI::errstr"); # connect to the database with address and pass or return error
my $q = new CGI; # CGI object for basic stuff
my $ip = $q->remote_host(); # Get the user's ip
my $term = $q->param('searchterm'); # Set the search char to $term
$term =~ tr/A-Z/a-z/; # set all characters to lowercase for convenience of search
my $sql = '
SELECT *
FROM Books
WHERE Title LIKE ?
OR Description LIKE ?
OR Author LIKE ?
'; # Set the query string to search the database
my $sth = $dbh->prepare($sql); # Prepare to connect to the database
$sth->execute("%$term%", "%$term%", "%$term%")
or die "SQL Error: $DBI::errstr\n"; # Connect to the database or return an error
print $q->header;
print "<html>";
print "<body>";
print " <form name='book' action='bookcart.php' method=post> "; # Open a form for submitting the result of book selection
print "<table width=\"100%\" border=\"0\"> ";
my $title = $data[0];
my $desc = $data[1];
my $author = $data[2];
my $pub = $data[3];
my $isbn = $data[4];
my $photo = $data[5];
print "<tr> <td width=50%>Title: $title</td> <td width=50% rowspan=5><img src=$photo height=300px></td></tr><tr><td>Discreption Tags: $desc</td></tr><tr><td>Publication Date: $pub</td></tr><tr><td>Author: $author</td></tr><tr><td>ISBN: $isbn</td> </tr></table> <br>";
print "Add this to shopping cart:<input type='submit' name='submit' value='Add'>";
if ($q->param('submit')) {
open(FILE, ">>'$ip'.txt");
print FILE "$title\n";
close(FILE);
}
print "</form>"; # Close the form for submitting to shopping cart
You haven't used use strict, to force you to declare all your variables. This is a bad idea
You have used remote_host, which is the name of the client host system. Your server may not be able to resolve this value, in which case it will remain unset. If you want the IP address, use remote_addr
You have prepared and executed your SQL statement but have fetched no data from the query. You appear to expect the results to be in the array #data, but you haven't declared this array. You would have been told about this had you had use strict in effect
You have used the string '$ip'.txt for your file names so, if you were correctly using the IP address in stead of the host name, your files would look like '92.17.182.165'.txt. Do you really want the single quotes in there?
You don't check the status of your open call, so you have no idea whether the open succeeded, or the reason why it may have failed
I doubt if you have really spent the last 48 hours coding this. I think it is much more likely that you are throwing something together in a rush at the last minute, and using Stack Overflow to help you out of the hole you have dug for yourself.
Before asking for the aid of others you should at least use minimal good-practice coding methods such as applying use strict. You should also try your best to debug your code: it would have taken very little to find that $ip has the wrong value and #data is empty.
Use strict and warnings. You want to use strict for many reasons. A decent article on this is over at perlmonks, you can begin with this. Using strict and warnings
You don't necessarily need the following line, you are using DBI and can access mysql strictly with DBI.
use DBD::mysql;
Many of options are available with CGI, I would recommend reading the perldoc on this also based on user preferences and desired wants and needs.
I would not use the following:
my $q = new CGI;
# I would use as so..
my $q = CGI->new;
Use remote_addr instead of remote_host to retrieve your ip address.
The following line you are converting all uppercase to lowercase, unless it's a need to specifically read from your database with all lowercase, I find this useless.
$term =~ tr/A-Z/a-z/;
Next your $sql line, again user preference, but I would look into sprintf or using it directly inside your calls. Also you are trying to read an array of data that does not exist, where is the call to get back your data? I recommend reading the documentation for DBI also, many methods of returning your data. So you want your data back using an array for example...
Here is an untested example and hint to help get you started.
use strict;
use warnings;
use CGI qw( :standard );
use CGI::Carp qw( fatalsToBrowser ); # Track your syntax errors
use DBI;
# Get IP Address
my $ip = $ENV{'REMOTE_ADDR'};
# Get your query from param,
# I would also parse your data here
my $term = param('searchterm') || undef;
my $dbh = DBI->connect('DBI:mysql:db:host', 'user', 'pass',
{RaiseError => 1}) or die $DBI::errstr;
my $sql = sprintf ('SELECT * FROM Books WHERE Title LIKE %s
OR Description LIKE %s', $term, $term);
my $sth = $dbh->selectall_arrayref( $sql );
# Retrieve your result data from array ref and turn into
# a hash that has title for the key and a array ref to the data.
my %rows = ();
for my $i ( 0..$#{$sth} ) {
my ($title, $desc, $author, $pub, $isbn, $pic) = #{$sth->[$i]};
$rows{$title} = [ $desc, $author, $pub, $isbn, $pic ];
}
# Storing your table/column names
# in an array for mapping later.
my #cols;
$cols[0] = Tr(th('Title'), th('Desc'), th('Author'),
th('Published'), th('ISBN'), th('Photo'));
foreach (keys %rows) {
push #cols, Tr( td($_),
td($rows{$_}->[0]),
td($rows{$_}->[1]),
td($rows{$_}->[2]),
td($rows{$_}->[3]),
td(img({-src => $rows{$_}->[4]}));
}
print header,
start_html(-title => 'Example'),
start_form(-method => 'POST', -action => 'bookcart.php'), "\n",
table( {-border => undef, -width => '100%'}, #cols ),
submit(-name => 'Submit', -value => 'Add Entry'),
end_form,
end_html;
# Do something with if submit is clicked..
if ( param('Submit') ) {
......
}
This assumes that you're using the OO approach to CGI.pm, and that $q is the relevant object. This should work, assuming that you have $q = new CGI somewhere in your script.
Can you post the rest of the script?
I've created a mockup to test this, and it works as expected:
#!/usr/bin/perl
use CGI;
my $q = new CGI;
print $q->header;
print "<form><input type=submit name=submit value='add'></form>\n";
if ($q->param('submit')) {
print "submit is \"" . $q->param('submit') . "\"\n";
}
After the submit button is clicked, the page displays that submit is "add" which means the evaluation is going as planned.
I guess what you need to do is make sure that $q is your CGI object, and move forward from there.
I'm trying to edit an old perl script and I'm a complete beginner. The request from the server returns as:
$VAR1 = [
{
'keywords' => [
'bare knuckle boxing',
'support group',
'dual identity',
'nihilism',
'support',
'rage and hate',
'insomnia',
'boxing',
'underground fighting'
],
}
];
How can I parse this JSON string to grab:
$keywords = "bare knuckle boxing,support group,dual identity,nihilism,support,rage and hate,insomnia,boxing,underground fighting"
Full perl code
#!/usr/bin/perl
use LWP::Simple; # From CPAN
use JSON qw( decode_json ); # From CPAN
use Data::Dumper; # Perl core module
use strict; # Good practice
use warnings; # Good practice
use WWW::TheMovieDB::Search;
use utf8::all;
use Encode;
use JSON::Parse 'json_to_perl';
use JSON::Any;
use JSON;
my $api = new WWW::TheMovieDB::Search('APIKEY');
my $img = $api->type('json');
$img = $api->Movie_imdbLookup('tt0137523');
my $decoded_json = decode_json( encode("utf8", $img) );
print Dumper $decoded_json;
Thanks.
Based on comments and on your recent edit, I would say that what you are asking is how to navigate a perl data structure, contained in the variable $decoded_json.
my $keywords = join ",", #{ $decoded_json->[0]{'keywords'} };
say qq{ #{ $arrayref->[0]->{'keywords'} } };
As TLP pointed out, all you've shown is a combination of perl arrays/hashes. But you should look at the JSON.pm documentation, if you have a JSON string.
The result you present is similar to json, but the Perl-variant of it. (ie => instead of : etc). I don't think you need to look into the json part of it, As you already got the data. You just need to use Perl to join the data into a text string.
Just to eleborate on the solution to vol7ron :
#get a reference to the list of keywords
my $keywords_list = $decoded_json->[0]{'keywords'};
#merge this list with commas
my $keywords = join(',', #$keywords_list );
print $keywords;
I'm trying to parse the html file using perl script. I'm trying to grep all the text with html tag p. If I view the source code the data is written in this format.
<p> Metrics are all virtualization specific and are prioritized and grouped as follows: </p>
Here is the following code.
use HTML::TagParser();
use URI::Fetch;
//my #list = $html->getElementsByTagName( "p" );
foreach my $elem ( #list ) {
my $tagname = $elem->tagName;
my $attr = $elem->attributes;
my $text = $elem->innerText;
push (#array,"$text");
foreach $_ (#array) {
# print "$_\n";
print $html_fh "$_\n";
chomp ($_);
push (#array1, "$_");
}
}
}
$end = $#array1+1;
print "Elements in the array: $end\n";
close $html_fh;
The problem that I'm facing is that the output which is generated is 4.60 Mb and lot of the array elements are just repetition sentences. How can I avoid such repetition? Is there any other efficient way to grep the lines which I'm interested. Can anybody help me out with this issue?
The reason you are seeing duplicated lines is that you are printing your entire array once for every element in it.
foreach my $elem ( #list ) {
my $tagname = $elem->tagName;
my $attr = $elem->attributes;
my $text = $elem->innerText;
push (#array,"$text"); # this array is printed below
foreach $_ (#array) { # This is inside the other loop
# print "$_\n";
print $html_fh "$_\n"; # here comes the print
chomp ($_);
push (#array1, "$_");
}
}
So for example, if you have an array "foo", "bar", "baz", it would print:
foo # first iteration
foo # second
bar
foo # third
bar
baz
So, to fix your duplication errors, move the second loop outside the first one.
Some other notes:
You should always use these two pragmas:
use strict;
use warnings;
They will provide more help than any other single thing that you can do. The short learning curve associated with fixing the errors that appear more than make up for the massively reduced time spent debugging.
//my #list = $html->getElementsByTagName( "p" );
Comments in perl start with #. Not sure if this is a typo, because you use this array below.
foreach my $elem ( #list ) {
You don't need to actually store the tags into an array unless you need an array. This is an intermediate variable only in this case. You can simply do the following (note that for and foreach are exactly the same):
for my $elem ($html->getElementsByTagName("p")) {
These variables are also intermediate, and two of them unused.
my $tagname = $elem->tagName;
my $attr = $elem->attributes;
my $text = $elem->innerText;
push (#array,"$text");
Also note that you never have to quote a variable this way. You can simply do this:
push #array, $elem->innerText;
foreach $_ (#array) {
The $_ variable is used by default, no need to specify it explicitly.
print $html_fh "$_\n";
chomp ($_);
push (#array1, "$_");
I'm not sure why you are chomping the variable after you print it, but before you store it in this other array, but it doesn't seem to make sense to me. Also, this other array will contain the exact same elements as the other array, only duplicated.
$end = $#array1+1;
This is another intermediate variable, and also it can be simplified. The $# sigil will give you the index of the last element, but the array itself in scalar context will give you the size of it:
$end = #array1; # size = last index + 1
But you can do this in one go:
print "Elements in the array: " . #array1 . "\n";
Note that using the concatenation operator . here enforces scalar context on the array. If you had used the comma operator , it would have list context, and the array would have been expanded into a list of its elements. This is a typical way to manipulate by context.
close $html_fh;
Explicitly closing a file handle is not required as it will automatically closed when the script ends.
If you use Web::Scraper instead, your code gets even simpler and clearer (as long as you are able to construct CSS selectors or XPath queries):
#!/usr/bin/env perl
use strict;
use warnings qw(all);
use URI;
use Web::Scraper;
my $result = scraper {
process 'p',
'paragraph[]' => 'text';
}->scrape(URI->new('http://www.perl.org/'));
for my $test (#{$result->{paragraph}}) {
print "$test\n";
}
print "Elements in the array: " . (scalar #{$result->{paragraph}});
Here is another way to get all the content from between <p> tags, this time using Mojo::DOM part of the Mojolicious project.
#!/usr/bin/env perl
use strict;
use warnings;
use v5.10; # say
use Mojo::DOM;
my $html = <<'END';
<p>Paragraph 1</p>
<p>Paragraph 2</p>
<div>Should not find this</div>
<p>Paragraph 3</p>
END
my $dom = Mojo::DOM->new($html);
my #paragraphs = $dom->find('p')->pluck('text')->each;
say for #paragraphs;
I need to extract the IMDB id(example:for the movie 300 it is tt0416449) for a movie specified by the variable URL. I have looked at the page source for this page and come up with the following regex
use LWP::Simple;
$url = "http://www.imdb.com/search/title?title=$FORM{'title'}";
if (is_success( $content = LWP::Simple::get($url) ) ) {
print "$url is alive!\n";
} else {
print "No movies found";
}
$code = "";
if ($content=~/<td class="number">1\.</td><td class="image"><a href="\/title\/tt[\d]{1,7}"/s) {
$code = $1;
}
I am getting an internal server error at this line
$content=~/<td class="number">1\.</td><td class="image"><a href="\/title\/tt[\d]{1,7}"/s
I am very new to perl, and would be grateful if anyone could point out my mistake(s).
Use an HTML parser. Regular expressions cannot parse HTML.
Anyway, the reason for the error is probably that you forgot to escape a forward slash in your regex. It should look like this:
/<td class="number">1\.<\/td><td class="image"><a href="\/title\/tt[\d]{1,7}"/s
A very nice interface for this type of work is provided by some tools of the Mojolicious distribution.
Long version
The combination of its UserAgent, DOM and URL classes can work in a very robust way:
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::UserAgent;
use Mojo::URL;
# preparations
my $ua = Mojo::UserAgent->new;
my $url = "http://www.imdb.com/search/title?title=Casino%20Royale";
# try to load the page
my $tx = $ua->get($url);
# error handling
die join ', ' => $tx->error unless $tx->success;
# extract the url
my $movie_link = $tx->res->dom('a[href^=/title]')->first;
my $movie_url = Mojo::URL->new($movie_link->attrs('href'));
say $movie_url->path->parts->[-1];
Output:
tt0381061
Short version
The funny one liner helper module ojo helps to build a very short version:
$ perl -Mojo -E 'say g("imdb.com/search/title?title=Casino%20Royale")->dom("a[href^=/title]")->first->attrs("href") =~ m|([^/]+)/?$|'
Output:
tt0381061
I agree XML is anti-line-editing thus anti-unix but, there is AWK.
If awk can do, perl can surely do. I can produce a list:
curl -s 'http://www.imdb.com/find?q=300&s=all' | awk -vRS='<a|</a>' -vFS='>|"' -vID=$1 '
$NF ~ ID && /title/ { printf "%s\t", $NF; match($2, "/tt[0-9]+/"); print substr($2, RSTART+1, RLENGTH-2)}
' | uniq
Pass search string to "ID".
Basically it's all about how you choose your tokenizer in awk, I use the <a> tag. Should be easier in perl.
I'm using the following code to encode a simple hash
use JSON;
my $name = "test";
my $type = "A";
my $data = "1.1.1.1";
my $ttl = 84600;
#rec_hash = ('name'=>$name, 'type'=>$type,'data'=>$data,'ttl'=>$ttl);
but I get the following error:
hash- or arrayref expected <not a simple scalar, use allow_nonref to allow this>
Your code seems to be missing some significant chunks, so let's add in the missing bits (I'll make some assumptions here) and fix things as we go.
Add missing boilerplate.
#!/usr/bin/perl
use strict;
use warnings;
use JSON;
my $name = "test";
my $type = "A";
my $data = "1.1.1.1";
my $ttl = 84600;
Make the hash a hash and not an array and don't forget to localise it: my %
my %rec_hash = ('name'=>$name, 'type'=>$type,'data'=>$data,'ttl'=>$ttl);
Actually use the encode_json method (passing it a hashref):
my $json = encode_json \%rec_hash;
Output the result:
print $json;
And that works as I would expect without errors.
Try %rec_hash = ... instead. # indicates a list/array, while % indicates a hash.