I am new to Perl and am using Perl for my back-end script, HTML for the front end, and CGI. Initially I read some details from a flat file and display them. I then try to print the details as checkboxes. I am using the Data::Dumper module to check the values. However, the value of the last checkbox has an extra space, as shown below:
print '<form action="process.pl " method="POST" id="sel">';
print '<input type="checkbox" onClick="checkedAll()">Select All<br />';
foreach my $i (@robos) {
print '<input type="checkbox" name="sel" value="';
print $i;
print '">';
print $i;
print '<br />';
}
print '<input type="submit" value="submit">';
print '</form>';
print "response " . Dumper \$data;
However, in Process.pl the selected values are retrieved using
@server = $q->param('sel');
but the dump of the selection,
print "response " . Dumper \@server;
shows [ 'ar', 'br', 'cr ' ], with an additional space after cr. I don't know what is wrong.
In my flat file I have
RMCList:ar:br:cr
I then read the details into an array and split them using
foreach my $ln (@lines)
{
if ($ln =~ /^RMCList/)
{
@robos = split (/:/,$ln);
}
} #### end of for statement
shift (@robos);
print "The robos are ", (join ',', map { '"' . $_ . '"' } @robos), "\n";
This shows the robos as:
"ar","br","cr
"
While reading the file into @lines, you forgot to remove the newline at the end of each line. In an HTML document this newline is treated as ordinary white space, which is why it shows up as a space.
Remove the newlines with the chomp function, like
chomp @lines;
before looping over the lines.
Actually, don't read the file into an array at all, unless you have to access each line more than once. Otherwise, read one line at a time:
open my $infile, "<", $filename or die "Cannot open $filename: $!";
my @robos;
while (my $ln = <$infile>) {
chomp $ln;
@robos = split /:/, $ln if $ln =~ /^RMCList/;
}
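A minimal sketch of the fix, assuming the flat file contains the RMCList line shown in the question. It reads from an in-memory string rather than a real file (the $data variable and the extra "Other" line are only there for illustration) so that it runs on its own:

```perl
use strict;
use warnings;

# Stand-in for the flat file; the trailing newline is what caused the stray space.
my $data = "Other:x:y\nRMCList:ar:br:cr\n";
open my $infile, '<', \$data or die "Cannot open in-memory file: $!";

my @robos;
while (my $ln = <$infile>) {
    chomp $ln;                       # strip the newline before splitting
    next unless $ln =~ /^RMCList/;
    @robos = split /:/, $ln;
    shift @robos;                    # drop the leading "RMCList" label
}
close $infile;

print join(',', map { qq("$_") } @robos), "\n";   # "ar","br","cr"
```

With the chomp in place, the last value is 'cr' with no trailing white space, so the checkbox value comes back clean.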
The code below reads the contents of a file and prints them in the body of an email.
use strict;
my $filename = '.../text.txt';
open (my $ifh, '<', $filename)
or die "Could not open file '$filename' $!";
local $/ = undef;
my @row = (<$ifh>)[0..9];
close ($ifh);
print "@row\n";
my ($body) = @_;
my ($html_body)= @_;
.
.
.
print(MAIL "Subject: Important Announcement \n");
.
.
.
push(@$html_body, "<h1><b><font color= red ><u>ATTENTION!</u></b></h1></font><br>");
push(@$html_body, "@row");
.
.
.
print(MAIL "$body", "@$html_body");
close(MAIL);
Unfortunately, the email body does not keep the formatting of text.txt. The email contains a single line instead of the three paragraphs in the file.
The problem you're facing is that plain text carries no formatting information when it is placed inside an HTML document. End-of-line characters are treated just like ordinary white space. You need to add HTML tags to the text to convey the formatting you want, or you could wrap it in a pre tag, which displays it "as is".
As mentioned by others in the comments above, your use of @_ doesn't make sense. It doesn't really make sense to treat $html_body as an array either, when all you're doing is appending HTML to it. So I've rewritten that chunk of code to use a scalar and append the HTML to it instead. I've also fixed some mistakes in the HTML, as you need to close tags in the reverse of the order in which you opened them.
print MAIL "Subject: Important Announcement \n";
print MAIL "\n"; # Need a blank line after the header to show it's finished
my $html_body = "<html><body>";
$html_body .= "<h1><b><font color=\"red\"><u>ATTENTION!</u></font></b></h1>";
$html_body .= "<pre>";
$html_body .= join("", @row);
$html_body .= "</pre>";
$html_body .= "</body></html>";
print MAIL $html_body;
close(MAIL);
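If you would rather add tags than use pre, one option is to escape the text and turn each line break into a br tag. This is only a sketch: text_to_html is a hypothetical helper name, and the sample text stands in for the contents of text.txt.

```perl
use strict;
use warnings;

# Hypothetical helper: escape HTML metacharacters, then turn line breaks into <br>.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;            # must come first so later entities survive
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/\r?\n/<br>\n/g;       # keep the newline so the source stays readable
    return $text;
}

my $text = "First paragraph\n\nSecond <important> paragraph\n";
print text_to_html($text);
```

The result can be appended to $html_body in place of the pre-wrapped text.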
First of all, @_ is the array of arguments passed to a subroutine, and it looks like you're not in one. So, doing:
my ($body) = @_;
my ($html_body) = @_;
is setting $body and $html_body to $_[0], which is undef.
How to fix?
There are two ways if you wrap it in a subroutine:
Use shift -> Which will make the above code look like:
my ($body) = shift;
my ($html_body)= shift;
Or,
my ($body, $html_body) = @_;
I would recommend the last one because it is less code and is more readable than the first one.
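For illustration, here is how the list-assignment form looks inside a subroutine. The name build_message and its arguments are hypothetical, not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical subroutine: @_ holds the caller's arguments only inside the sub.
sub build_message {
    my ($body, $html_body) = @_;     # list assignment unpacks both arguments
    return $body . $html_body;
}

print build_message("Plain text part\n", "<p>HTML part</p>\n");
```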
I have a list of around 600 drugs as input, and I have written a Perl script that builds the 600 URLs for these drugs and grabs the content of each URL. Each page also contains a link labelled Share/Embed Graph that can be clicked to view HTML source code. I need the script to follow that link on all 600 pages and print the 600 HTML source codes to STDOUT. Right now my script is:
<c>
#!/usr/bin/perl
use strict;
use warnings;
#use LWP::Simple qw(get);
#use HTML::Parse;
use YAML;
use WWW::Mechanize;
use Data::Dumper;
use Template::Extract;
my $infile = $ARGV[0];
my $outfile = $ARGV[1];
open (IFILE, "<", $infile) || die "Could not open $infile\n";
open (OFILE, ">", $outfile) || die "Could not open $outfile\n";
my @arrayofdrugterms;
while (<IFILE>) {
chomp;
push (@arrayofdrugterms, $_);
}
#print "@arrayofdrugterms\n";
my $url;
foreach my $arrayofdrugterms( @arrayofdrugterms) {
$url = "http://www.drugcite.com/?q=$arrayofdrugterms\n";
print OFILE "$url\n";
}
close OFILE;
#open outfile for reading
open (OFILE, "<", $outfile) || die "Cannot open $outfile\n";
my @arrayofurls;
my $mech;
my $ext;
my @result;
my @link;
my $template;
my $content;
while (<OFILE>) {
chomp;
@arrayofurls = split ( ' ', $_);
#print "@arrayofurls\n";
foreach my $arrayofurls ( @arrayofurls) {
#print "$arrayofurls\n";
$mech = WWW::Mechanize->new(autocheck => 0);
$mech->get( "$arrayofurls" );
#print $mech->get( "$arrayofurls" ). "\n";
$ext = Template::Extract->new;
#print "$ext\n";
</c>
<b>
$template = "<div id="[% DrugCite %]" style="[% padding:10px %]">
<img src="[% http://www.drugcite.com/img/? item=vg&q=$arrayofdrugterms&meddra=pt style=border;0px; alt=Top 10 $arrayofdrugterms ADVERSE EVENTS - DrugCite.com %]">
<br />
<a href="[% http://www.drugcite.com/?q=$arrayofdrugterms style=font-size:7pt;margin-left:20px;color:#c0c0c0;text-decoration:none %]">"[% Source DrugCite %]"
</a>
</div>";
</b>
<c>
@result = $ext->extract($template, $mech->content);
print "@result\n";
#print Dumper "\@result" . "\n";
foreach ($mech->links) {
if( $_->[0] =~ /^Share\/Embed Graph$/) {
$mech->get($_->[0]);
}
=cut
#else {
#print "Not found the required link\n";
#}
#else {
#push (@link, $_->[0]) . "\n";
#}
=cut
}#end foreach
#print STDOUT "$mech->content\n";
#print Dumper \@link . "\n";
foreach (@result) {
#print YAML::Dump $_;
}
}
}
</c>
Any help is appreciated. Thanks.
How can I color output in Perl according to a specific string? For example, log lines containing the word "ERROR" should be red, "Warning" yellow and "Info" green.
If you're generating HTML, it seems to me that it would be a good option to apply classes to your elements, such as info, warning, error, which you then style using CSS.
How do you set a class? That all depends on what you're using to generate HTML. If you're just printing raw text, <span class="info">...</span> will do the trick. If you're using a DOM builder, you can probably pass it in as a hash argument somewhere or other.
How do you apply a CSS stylesheet? I Googled "css text color" and picked this tutorial out of the 10 million results.
Do you want to color HTML output (eg output lines like <FONT COLOR="red">Error!</FONT>)...or display error messages in the terminal in color?
If the latter, you can do something like this, which will display the text 'error message' in bold, red.
use if $^O eq 'MSWin32', 'Win32::Console::ANSI';
use Term::ANSIColor qw(:constants);
print BOLD, RED, "error message\n", RESET;
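Applied to the three keywords from the question, a terminal version might look like the sketch below. It uses the colored function from Term::ANSIColor; the colorize helper and the sample lines are made up for illustration, and on Windows the Win32::Console::ANSI line above would still be needed.

```perl
use strict;
use warnings;
use Term::ANSIColor qw(colored);

# Map each keyword from the question to a terminal color.
my %color_for = (ERROR => 'red', Warning => 'yellow', Info => 'green');

# Hypothetical helper: color a line according to the first keyword it contains.
sub colorize {
    my ($line) = @_;
    for my $word (qw(ERROR Warning Info)) {
        return colored($line, $color_for{$word}) if $line =~ /\Q$word\E/;
    }
    return $line;                    # no keyword: leave the line unchanged
}

print colorize($_), "\n"
    for 'ERROR: disk full', 'Warning: low memory', 'Info: started', 'plain line';
```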
If you are going to output this to a web browser you could use:
my $logfile = "/var/log/log.txt";
open my $in, '<', $logfile or die "Cannot open $logfile: $!";
my @loglines = <$in>;
close $in;
foreach my $line (@loglines) {
if ($line =~ /ERROR/) {
$line = "<span style=\"color: red;\">" . $line . "</span>";
}
elsif ($line =~ /Warning/) {
$line = "<span style=\"color: yellow;\">" . $line . "</span>";
}
elsif ($line =~ /Info/) {
$line = "<span style=\"color: green;\">" . $line . "</span>";
}
print $line;
}
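The same loop can be combined with the CSS-class suggestion made earlier, so that the colors live in a stylesheet. This is a sketch under assumptions: mark_up is a hypothetical helper, the class names error, warning and info are the ones proposed above, and the HTML escaping is an addition that the original loop does not perform.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a log line in a <span> whose class names its level.
sub mark_up {
    my ($line) = @_;
    chomp $line;
    # Escape HTML metacharacters so log text cannot break the page.
    $line =~ s/&/&amp;/g;
    $line =~ s/</&lt;/g;
    $line =~ s/>/&gt;/g;
    my $class = $line =~ /ERROR/   ? 'error'
              : $line =~ /Warning/ ? 'warning'
              : $line =~ /Info/    ? 'info'
              :                      '';
    return $class ? qq(<span class="$class">$line</span><br>) : "$line<br>";
}

print mark_up($_), "\n" for "ERROR: disk full\n", "Info: a < b\n";
```

A stylesheet rule such as .error { color: red; } then controls the appearance without touching the Perl code.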
What I am trying to do is get the contents of a file from another server. Since I am not familiar with Perl or its modules and functions, I have gone about it this way:
my $fileContents;
if( $md5Con =~ m/\.php$/g ) {
my $ftp = Net::FTP->new($DB_ftpserver, Debug => 0) or die "Cannot connect to some.host.name: $@";
$ftp->login($DB_ftpuser, $DB_ftppass) or die "Cannot login ", $ftp->message;
$ftp->get("/" . $root . $webpage, "c:/perlscripts/" . md5_hex($md5Con) . "-code.php") or die $ftp->message;
open FILE, ">>c:/perlscripts/" . md5_hex($md5Con) . "-code.php" or die $!;
$fileContents = <FILE>;
close(FILE);
unlink("c:/perlscripts/" . md5_hex($md5Con) . "-code.php");
$ftp->quit;
}
What I thought I'd do is get the file from the server, put it on my local machine, edit the content, upload it to wherever it needs to go and then delete the temp file.
But I cannot figure out how to get the contents of the file:
open FILE, ">>c:/perlscripts/" . md5_hex($md5Con) . "-code.php" or die $!;
$fileContents = <FILE>;
close(FILE);
I keep getting the error:
Use of uninitialized value $fileContents
which I'm guessing means the read isn't returning a value.
Any help is much appreciated.
>>>>>>>>>> EDIT <<<<<<<<<<
my $fileContents;
if( $md5Con =~ m/\.php$/g ) {
my $ftp = Net::FTP->new($DB_ftpserver, Debug => 0) or die "Cannot connect to some.host.name: $@";
$ftp->login($DB_ftpuser, $DB_ftppass) or die "Cannot login ", $ftp->message;
$ftp->get("/" . $root . $webpage, "c:/perlscripts/" . md5_hex($md5Con) . "-code.php") or die $ftp->message;
my $file = "c:/perlscripts/" . md5_hex($md5Con) . "-code.php";
{
local( $/ ); # undefine the record separator
open FILE, "<", $file or die "Cannot open:$!\n";
my $fileContents = <FILE>;
#print $fileContents;
my $bodyContents;
my $headContents;
if( $fileContents =~ m/<\s*body[^>]*>.*$/gi ) {
print $0 . $1 . "\n";
$bodyContents = $dbh->quote($1);
}
if( $fileContents =~ m/^.*<\/head>/gi ) {
print $0 . $1 . "\n";
$headContents = $dbh->quote($1);
}
$bodyTable = $dbh->quote($bodyTable);
$headerTable = $dbh->quote($headerTable);
$dbh->do($createBodyTable) or die " error: Couldn't create body table: " . DBI->errstr;
$dbh->do($createHeadTable) or die " error: Couldn't create header table: " . DBI->errstr;
$dbh->do("INSERT INTO $headerTable ( headData, headDataOutput ) VALUES ( $headContents, $headContents )") or die " error: Couldn't connect to database: " . DBI->errstr;
$dbh->do("INSERT INTO $bodyTable ( bodyData, bodyDataOutput ) VALUES ( $bodyContents, $bodyContents )") or die " error: Couldn't connect to database: " . DBI->errstr;
$dbh->do("INSERT INTO page_names (linkFromRoot, linkTrue, page_name, table_name, navigation, location) VALUES ( $linkFromRoot, $linkTrue, $page_name, $table_name, $navigation, $location )") or die " error: Couldn't connect to database: " . DBI->errstr;
unlink("c:/perlscripts/" . md5_hex($md5Con) . "-code.php");
}
$ftp->quit;
}
With the code above, print WILL print the whole file. BUT, for some reason the two regular expressions are returning false. Any idea why?
if( $fileContents =~ m/<\s*body[^>]*>.*$/gi ) {
print $0 . $1 . "\n";
$bodyContents = $dbh->quote($1);
}
if( $fileContents =~ m/^.*<\/head>/gi ) {
print $0 . $1 . "\n";
$headContents = $dbh->quote($1);
}
This is covered in section 5 of the Perl FAQ included with the standard distribution.
How can I read in an entire file all at once?
You can use the Path::Class::File::slurp module to do it in one step.
use Path::Class;
$all_of_it = file($filename)->slurp; # entire file in scalar
@all_lines = file($filename)->slurp; # one line per element
The customary Perl approach for processing all the lines in a file is to do so one line at a time:
open (INPUT, $file) || die "can't open $file: $!";
while (<INPUT>) {
chomp;
# do something with $_
}
close(INPUT) || die "can't close $file: $!";
This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often—if not almost always—the wrong approach. Whenever you see someone do this:
@lines = <INPUT>;
you should think long and hard about why you need everything loaded at once. It's just not a scalable solution. You might also find it more fun to use the standard Tie::File module, or the DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file.
You can read the entire filehandle contents into a scalar.
{
local(*INPUT, $/);
open (INPUT, $file) || die "can't open $file: $!";
$var = <INPUT>;
}
That temporarily undefs your record separator, and will automatically close the file at block exit. If the file is already open, just use this:
$var = do { local $/; <INPUT> };
For ordinary files you can also use the read function.
read( INPUT, $var, -s INPUT );
The third argument tests the byte size of the data on the INPUT filehandle and reads that many bytes into the buffer $var.
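The same slurp idiom can be written with a lexical filehandle and three-argument open, which is the style more commonly seen in current Perl. The sketch below is not taken from the FAQ: the slurp helper name and the temporary demo file are assumptions made so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: return the whole file as one string.
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "can't open $file: $!";
    local $/;                        # undef record separator: read everything at once
    my $content = <$fh>;
    close $fh or die "can't close $file: $!";
    return $content;
}

# Demonstration on a temporary file so the sketch is self-contained.
my $file = "slurp_demo_$$.txt";
open my $out, '>', $file or die "can't write $file: $!";
print {$out} "line one\nline two\n";
close $out or die "can't close $file: $!";

my $content = slurp($file);
unlink $file;
print length($content), " characters read\n";
```

Because $/ is localized inside the subroutine, the caller's line-by-line reads are unaffected.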
Use Path::Class::File::slurp if you want to read all file contents in one go.
However, more importantly, use an HTML parser to parse HTML.
open FILE, "c:/perlscripts/" . md5_hex($md5Con) . "-code.php" or die $!;
while (<FILE>) {
# each line is in $_
}
close(FILE);
will open the file and allow you to process it line-by-line (if that's what you want - otherwise investigate binmode). I think the problem is in your prepending the filename to open with >>. See this tutorial for more info.
I note you're also using regular expressions to parse HTML. Generally I would recommend using a parser to do this (e.g. see HTML::Parser). Regular expressions aren't suited to HTML due to HTML's lack of regularity, and won't work reliably in general cases.
Also, if you need to edit the contents of the files, take a look at the CPAN module
Tie::File
This module relieves you of the need to create a temp file for editing the content
and writing it back to the same file.
EDIT:
What you are looking for is a way to slurp the file. You may have to undefine
the record separator variable $/
The below code works fine for me:
use strict;
my $file = "test.txt";
{
local( $/ ); # undefine the record separator
open FILE, "<", $file or die "Cannot open:$!\n";
my $lines =<FILE>;
print $lines;
}
Also see the section "Traditional Slurping" in this article.
BUT, for some reason the two regular expresions are returning false. Any idea why?
. in a regular expression by default matches any character except newline. Presumably you have newlines before the </head> tag and after the <body> tag. To make . match any character including newlines, use the //s flag.
I'm not sure what your print $0 . $1 ... code is about; you aren't capturing anything in your matches to be stored in $1, and $0 isn't a variable used for regular expression captures, it's something very different.
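To make both points concrete, here is a small sketch with the /s flag added and with parentheses so that $1 (or a list assignment) actually receives something. The sample HTML string is made up, and a real HTML parser remains the more robust choice, as other answers say.

```perl
use strict;
use warnings;

# Stand-in for the slurped file contents.
my $fileContents = "<html>\n<head>\n<title>t</title>\n</head>\n"
                 . "<body class=\"x\">\n<p>Hello</p>\n</body>\n</html>\n";

# /s lets . match newlines; the parentheses capture the matched text.
my ($headContents) = $fileContents =~ m/^(.*<\/head>)/si;
my ($bodyContents) = $fileContents =~ m/(<\s*body[^>]*>.*)$/si;

print "head part:\n$headContents\n";
print "body part:\n$bodyContents\n";

# Without /s the same pattern fails, because . stops at the first newline.
print "without /s: ",
    ($fileContents =~ m/^(.*<\/head>)/i ? "match" : "no match"), "\n";
```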
If you want to get the content of the file:
@lines = <FILE>;
Use File::Slurp::Tiny. As convenient as File::Slurp, but without the bugs.
I'm trying to use regular expressions in Perl to parse a table with the following structure. The first line is as follows:
<tr class="Highlight"><td>Time Played</a></td><td></td><td>Artist</td><td width="1%"></td><td>Title</td><td>Label</td></tr>
Here I wish to take out "Time Played", "Artist", "Title", and "Label", and print them to an output file.
I've tried many regular expressions such as:
$lines =~ / (<td>) /
OR
$lines =~ / <td>(.*)< /
OR
$lines =~ / >(.*)< /
My current program looks like so:
#!perl -w
open INPUT_FILE, "<", "FIRST_LINE_OF_OUTPUT.txt" or die $!;
open OUTPUT_FILE, ">>", "PLAYLIST_TABLE.txt" or die $!;
my $lines = join '', <INPUT_FILE>;
print "Hello 2\n";
if ($lines =~ / (\S.*\S) /) {
print "this is 1: \n";
print $1;
if ($lines =~ / <td>(.*)< / ) {
print "this is the 2nd 1: \n";
print $1;
print "the word was: $1.\n";
$Time = $1;
print $Time;
print OUTPUT_FILE $Time;
} else {
print "2ND IF FAILED\n";
}
} else {
print "THIS FAILED\n";
}
close(INPUT_FILE);
close(OUTPUT_FILE);
Do NOT use regexps to parse HTML. There are a very large number of CPAN modules which do this for you much more effectively.
Can you provide some examples of why it is hard to parse XML and HTML with a regex?
Can you provide an example of parsing HTML with your favorite parser?
HTML::Parser
HTML::TreeBuilder
HTML::TableExtract
Use HTML::TableExtract. Really.
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;
use LWP::Simple;
my $file = 'Table3.htm';
unless ( -e $file ) {
my $rc = getstore(
'http://www.ntsb.gov/aviation/Table3.htm',
$file);
die "Failed to download document\n" unless $rc == 200;
}
my @headers = qw( Year Fatalities );
my $te = HTML::TableExtract->new(
headers => \@headers,
attribs => { id => 'myTable' },
);
$te->parse_file($file);
my ($table) = $te->tables;
print join("\t", @headers), "\n";
for my $row ($te->rows ) {
print join("\t", @$row), "\n";
}
This is what I meant in another post by "task-specific" HTML parsers.
You could have saved a lot of time by directing your energy to reading some documentation rather than throwing regexes at the wall and seeing if any stuck.
That's an easy one:
my $html = '<tr class="Highlight"><td>Time Played</a></td><td></td><td>Artist</td><td width="1%"></td><td>Title</td><td>Label</td></tr>';
my @stuff = $html =~ />([^<]+)</g;
print join (", ", #stuff), "\n";
See http://codepad.org/qz9d5Bro if you want to try running it.