Displaying HTML in Perl Tag - html

I have a Perl Script which makes a DB connection and displays the output in HTML format. The data which it's trying to display has tags embedded (<>) in it, hence the HTML does not get displayed. If I open the actual HTML file which the script generates using Notepad, I see the data. However I am unable to display it due to the tags. Any idea how this can be fixed?
#!/usr/bin/perl
use DBI;
use HTML::Escape 'escape_html';
unlink("D:\\Perl32\\scripts\\UndeliveredRAW.html");
my $host = '${Node.Caption}';
my $user = '${USER}';
my $pwd = '${PASSWORD}';
my $driver = "SQL Server";
$dbhslam = DBI->connect("dbi:ODBC:Driver=$driver;Server=$host;UID=$user;PWD=$pwd") || die "connect failed:";
$sthslam = $dbhslam->prepare("SELECT
DBA_Reports.dbo.undelivered_raw_host_msgs.ID
DBA_Reports.dbo.undelivered_raw_host_msgs.MESSAGE
FROM
DBA_Reports.dbo.undelivered_raw_host_msgs");
$sthslam->execute;
$msg = "Up";
$Count = 0;
$Output = "";
$Temp = "";
$tbl = "<TABLE border=1 bordercolor=orange cellspacing=0 cellpadding=1>";
$tblhd = "<TR><TH>ID</TH><TH>MESSAGE</TH></TR>";
while (my $ref = $sthslam->fetchrow_hashref()) {
$Count++;
$Output .= '<TR><TD align=center rowspan=1 valign=top width=1000 height=1000>'
. $ref->{'ID'}.'</TD>'
. '<TD align=center rowspan=1 valign=top width=1000 height=1000>'
. escape_html($ref->{'MESSAGE'}).'</TD></TR>';
}
$dbhslam->disconnect;
$Output = "$tbl$tblhd$Output</TABLE>";
my $filename1 = 'D:\\Perl32\\Scripts\\UndeliveredRAW.html';
open(my $fh1, '>', $filename1) or die "Could not open file '$filename1' $!";
print $fh1 "$Output";
close $fh1;
if ($Count > 0) {
$msg = $Output;
}
print "\nMessage: $msg";
print "\nStatistic: $Count";
Desired Output
Contents of HTML generated

Following code snippet demonstrates slightly modified version of posted code.
Please see the loop section for escape_html(...) usage on database obtained data.
#!/usr/bin/env perl
#
# vim: ai ts=4 sw=4
use strict;
use warnings;
use feature 'say';
use DBI;
use HTML::Escape qw/escape_html/;
my $filename = 'D:\Perl32\scripts\UndeliveredRAW.html';
unlink($filename) if -e $filename;
my $host = ${Node.Caption};
my $user = ${USER};
my $pwd = ${PASSWORD};
my $driver = 'SQL Server';
my $dbh = DBI->connect("dbi:ODBC:Driver=$driver;Server=$host;UID=$user;PWD=$pwd")
or die 'DB connect failed:';
my $query = '
SELECT
DBA_Reports.dbo.undelivered_raw_host_msgs.MESSAGE
FROM
DBA_Reports.dbo.undelivered_raw_host_msgs
';
my $rv = $dbh->do($query) or die $dbh->errstr;
my $msg = 'Up';
my $Count = 0;
my $tbl = '
<TABLE border=1 bordercolor=orange cellspacing=0 cellpadding=1>
<TR><TH>MESSAGE</TH></TR>
';
while (my $ref = $sth->fetchrow_hashref()) {
$Count++;
$tbl .= "\n\t<TR><TD align=center rowspan=1 valign=top width=5000 height=5000>"
. escape_html($ref->{'MESSAGE'})
. '</TD></TR>';
}
$tbl .= '
</TABLE>
';
my $html =
'<!DOCTYPE html>
<html>
<head>
<title>Undelivered RAW</title>
</head>
<body>
<h1>DB table data</h1>
' . $tbl . '
</body>
</html>
';
open my $fh, '>', $filename
or die "Could not open file '$filename1' $!";
print $fh $html;
close $fh;
if ($Count > 0) {
say 'Message: ' . $msg;
say 'Statistic: ' . $Count";
}
Note: to avoid polluting code with HTML style attributes find some time to learn CSS, your generated HTML does not include required sections DOCTYPE, html, head, title, body
Reference:
DBI
HTML::Escape
DBI/DBD::ODBC Tutorial
CSS
HTML

HTML has a well-understood mechanism to include characters that would normally be interpreted as special characters. For example, if you want to include a < in your HTML, that would normally be seen as starting a new HTML element in your document.
The solution is to replace those problematic characters with HTML entities that represent those characters. For example, < should be replaced with <. Note that this means the ampersand (&) needs to be added to the set of characters that should be replaced (in this case by &) if you want to include it in your HTML.
Perl has a long history of being used on the web, so it's no surprise that there are many tools available to carry out this replacement. HTML::Escape is probably the best known. It supplies a single function (escape_html()) which takes a text string and returns that same string with all of the problematic characters replaced by the appropriate entities.
use HTML::Escape 'escape_html';
my $html = '<some text> & <some other text>'
my $escaped_html = escape_html($html);
After running this code, $escaped_html now contains "$lt;some text$gt; $amp; $lt;some other text$gt;". And if you send that text to a browser, you will get the correct output displayed.
So the easiest solution is to load HTML::Escape at the top of your program and then call escape_html() whenever you're adding potentially problematic strings to your output. That means your while loop would look something like this:
while (my $ref = $sthslam->fetchrow_hashref()) {
$Count++;
$Output .= '<TR><TD align=center rowspan=1 valign=top width=5000 height=5000>'
. escape_html($ref->{'MESSAGE'})
. '</TD></TR>';
}
Note that I've removed the $Temp variable (which didn't seem to be doing anything useful) and switched to using .= to build up your output string. =. is the "assignment concatenation" operator - it adds the new string on its right to the end of whatever currently exists in the variable on its left.
You seem to be learning Perl on the job (which is great) but it's a real shame that you're learning it in an environment that seems to use techniques that have been outdated for about twenty years. Your question is a good example of why trying to build up raw HTML strings inside your Perl code is a bad idea. It's a far better idea to use a templating engine of some kind (the defacto standard in the Perl world seems to be the Template Toolkit).
I also recommend looking at Cascading Style Sheets as a more modern approach to styling your HTML output.

Related

Assign variables to all lines in a text file written in perl and use them to display in a HTML table

For example, I have a cgi where I have appended the shell script to a text file lstatus.txt as follows:
use IO::All;
use CGI;
system `bash /opt/apache/cgi-bin/lisa/lisapage3.sh > lstatus.txt`
io('lstatus.txt') > $data
Print << EOF;
<HTML>
<BODY>
<P>$data</P>
</BODY>
</HTML>
The output is as follows: EnterpriseDasboardService is Running. RegistryService is Running.PortalService is Running.VirtualServiceEnvironment is Running.
Now, I tried editing the text file lstatus.txt and the content in it is as follows:
EnterpriseDasboardService is Running > $abs
RegistryService is Running > $pqr
PortalService is Running > $qwe
VirtualServiceEnvironment is Running > $dfg
I have assigned the variables to each line in a text file as mentioned above.
I need each variable to be used in the table tag as follows:
<table>
<th>STATUS</th>
<tr>
<td>$abs</td>
.....
<td>$dfg</td>
</tr>
</table>
I want the output to be displayed in a table which I could not do by using the above changes.
Perhaps you want something like following code bellow
Note: read documentation for system
use strict;
use warnings;
use feature 'say';
use CGI;
my $filename = 'lstatus.txt';
my #arguments = ('/opt/apache/cgi-bin/lisa/lisapage3.sh','>',$filename);
system 'bash', #arguments;
open my $fh, '<', $filename
or die "Couldn't open $filename";
my #data = <$fh>;
close $fh;
chomp #data;
my $table = data2table(\#data);
$html =
"<html>
<head>
<title>Status webpage</title>
</head>
<body>
$table
</body>
</html>
";
say $html;
sub data2table {
my $data = shift;
my $table;
my $space = "\n\t\t";
$table = $space . '<table>';
$table .= $space . "\t<tr><th>STATUS</th></tr>";
$table .= $space . "\t<tr><td>$_</td></tr>" for #{$data};
$table .= $space . '</table>';
return $table;
}

Set HTML table cell background colour according to text content

I have written a Perl program to create a web page with an HTML table derived from text file textfile.txt.
I would like to change it so that cells of the table are coloured according to the text content. For instance, if the text is Reject then background of the cell should be red.
Here are two methods that I tried. Neither of them worked
Method 1
if ( $_ eq "REJECT" ) {
print map { "<td style=width:705 bgcolor=#FF0000 >REJECT</td>" } #$d;
}
Method 2
foreach my $d ( #data ) {
$d //= ''; # Convert undefined values to empty strings
my $class;
if ( $d eq 'REJECT' ) {
$class = 'hilite';
}
$html .= '<td';
$html .= " class='$class'" if $class;
$html .= ">$d</td>";
}
Perl program
#!/usr/bin/perl
print "Content-type: text/html\n\n";
use strict;
use warnings;
my $output = `cat textfile.txt`;
my #lines = split /\n/, $output;
my #data;
foreach my $line ( #lines ) {
chomp $line;
my #d = split /\s+/, $line;
push #data, \#d;
}
my $color1 = "black";
my $color2 = "darkgreen";
my $color3 = "black";
my $color4 = "red";
my $color5 = "lime";
my $num = 6;
my $title = "This is the heading";
my $fstyle = "Helvetica";
print "<body bgcolor = $color3>";
print "<font color = $color5 face = $fstyle size = $num>$title</font><br />";
foreach my $d ( #data ) {
print "<html>";
print "<body>";
print "<table style=table-layout= fixed width= 705 height=110 text = $color4 border = 2 bordercolor = $color1 bgcolor = $color2>";
print "<tr>";
print map {"<th style=width:705 >Column1</th>"}
print map {"<th style=width:705 >Column2</th>"}
print "</tr>";
print "<tr>";
print map {"<td style=width:705 >$_</td>"} #$d;
if ( $d eq 'REJECT' ) {
print map {"<td style=width:705 bgcolor=#FF0000 >Reject</td>"} #$d;
}
print "</tr>";
print "</table>";
print "</body>";
print "</html>";
}
Input text file:
Column1 Column2
Accept Reject
Accept Reject
Accept Reject
This line
print map { "<td style=width:705 bgcolor=#FF0000 >Reject</td>"
is adding the background color RED to the cell but it is not matching the condition Reject.
Output
Here are some of the errors in your Perl code
As I said, you are misusing map
You are creating a new HTML document for every element of #data. What do you expect the browser to do with multiple <html> elements? It can't display them all
You are expecting the string REJECT to match Reject
You are using a mixture of CSS style strings and element attributes. For instance
print "<table style=table-layout= fixed width= 705 height=110 text = $color4 border = 2 bordercolor = $color1 bgcolor = $color2>"
should be
print qq{<table
style="table-layout:fixed; width=705; height=110; text=$color4"
border=2
bordercolor="$color1"
bgcolor="$color2">\n}
because table-layout, width, height, and text are CSS properties, while border, bordercolor, and bgcolor are HTML element attributes
I think you should be writing CSS to solve this problem, but that is another matter
It would also help you a lot if you printed a newline "\n" after each HTML element. That way the output will be much more readable and you will be able to see better what you have created
Please don't persist with this "try things until it works" approach. You always end up coming here for help to get you out of the mess you've created, and you're not asking intelligent questions. To be using map like that after so long means you're not learning at all, and you owe it to yourself as well as your employer to learn the language properly
Here is a solution that performs correctly, but it is no more than a correction of your own code. The problems that I have outlined have been fixed, and that is all
#!/usr/bin/perl
use strict;
use warnings 'all';
my $color1 = 'black';
my $color2 = 'darkgreen';
my $color3 = 'black';
my $color4 = 'red';
my $color5 = 'lime';
my $size = 6;
my $title = 'This is the heading';
my $fstyle = 'Helvetica';
print "Content-type: text/html\n\n";
print "<body bgcolor = $color3>\n";
print "<font color = $color5 face=$fstyle size=$size>$title</font><br />\n";
{
print "<html>\n";
print "<body>\n";
print qq{<table
style="table-layout:fixed; width=705; height=110; text=$color4"
border=2
bordercolor="$color1"
bgcolor="$color2">\n};
print "<tr>\n";
print qq{<th style="width:705" >Column1</th>};
print qq{<th style="width:705" >Column2</th>};
print "</tr>\n";
open my $fh, '<', 'textfile.txt' or die $!;
while ( <$fh> ) {
my #line = split;
print "<tr>\n";
for ( #line ) {
if ( /reject/i ) {
print qq{<td style=width:705 bgcolor=red>$_</td>};
}
else {
print "<td style=width:705>$_</td>"
}
}
print "</tr>\n";
}
print "</table>\n";
print "</body>\n";
print "</html>\n";
}
output
Content-type: text/html
<body bgcolor = black>
<font color = lime face=Helvetica size=6>This is the heading</font><br />
<html>
<body>
<table
style="table-layout:fixed; width=705; height=110; text=red"
border=2
bordercolor="black"
bgcolor="darkgreen">
<tr>
<th style="width:705" >Column1</th><th style="width:705" >Column2</th></tr>
<tr>
<td style=width:705>Column1</td><td style=width:705>Column2</td></tr>
<tr>
<td style=width:705>Accept</td><td style=width:705 bgcolor=red>Reject</td></tr>
<tr>
<td style=width:705>Accept</td><td style=width:705 bgcolor=red>Reject</td></tr>
<tr>
<td style=width:705>Accept</td><td style=width:705 bgcolor=red>Reject</td></tr>
</table>
</body>
</html>
Appearance
I still have misgivings about your approach. Hacking together a program from bits and pieces of others' work that you don't understand is a recipe for failure. If you have no inclination to explore and learn the details enough to survive on your own then you have chosen the wrong job
I think you should be using a template system such as
Template::Toolkit instead of printing HTML from your Perl program
The colours should be changed using CSS and an appropriate class, rather than printing HTML attributes in line
You seem to think that a casual and approximate approach is fine, or at least that you are unwilling to offer any more, but while that may be true of other professions, software engineering requires much more care and precision

Perl to read each lines and print the output in email (html)

The below code able to read the content of file and print the content of body with the file's content.
use strict;
my $filename = '.../text.txt';
open (my $ifh, '<', $filename)
or die "Could not open file '$filename' $!";
local $/ = undef;
my #row = (<$ifh>)[0..9];
close ($ifh);
print "#row\n";
my ($body) = #_;
my ($html_body)= #_;
.
.
.
print(MAIL "Subject: Important Announcement \n");
.
.
.
push(#$html_body, "<h1><b><font color= red ><u>ATTENTION!</u></b></h1></font><br>");
push(#$html_body, "#row");
.
.
.
print(MAIL "$body", "#$html_body");
close(MAIL);
But unfortunately, i am having problem to produce the email body with same format of the text.txt file. The output email produced only having single line instead of paragraphs of 3.
The problem you're facing is that plain text contains no formatting information when placed inside a HTML document. End of line characters are ignored and treated just like ordinary white space. You need to add HTML tags to the text to convey the formatting you want or you could wrap it up in a pre tag as that will display it "as is".
As mentioned by others in the comments above, your use of #_ doesn't make sense. And it doesn't really make sense for $html_body to be treated like an array either when all you're doing is appending HTML to it. So I've rewritten that chunk of code to use it as a scalar and append the HTML to it instead. And also fixed some mistakes in the HTML as you need to close tags in the same order as you open them.
print MAIL "Subject: Important Announcement \n";
print MAIL "\n"; # Need a blank line after the header to show it's finished
my $html_body = "<html><body>";
$html_body .= "<h1><b><font color="red"><u>ATTENTION!</u></font></b></h1>";
$html_body .= "<pre>";
$html_body .= join("", #row);
$html_body .= "</pre>";
$html_body .= "</body></html>";
print MAIL $html_body;
close(MAIL);
First of all #_ is an arrayof arguments passed to subroutines, and it looks like you're not in one. So, doing:
my ($body) = #_;
my ($html_body) = #_;
is setting $body & $html_body to $_[0], which is undef.
How to fix?
There are two ways if you wrap it in a subroutine:
Use shift -> Which will make the above code look like:
my ($body) = shift;
my ($html_body)= shift;
Or,
my ($body, $html_body) = #_;
I would recommend the last one because it is less code and is more readable than the first one.

How can I do paging and sorting in a Perl CGI program?

Here is the table in which I am retrieving the data from an SQLite database.
Its having lots of records, so near that ADD button I need something like
|< < > >| which would do the paging function whenever I click.
Also, besides the table each header (e.g. UserName UserId) I need a sorting
button. Something like a ^ button. Please do help me find the solution..Thank You.
#!C:\perl\bin\perl.exe
use CGI;
use CGI qw/:standard/;
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
my $q = new CGI;
use DBI;
use CGI qw(:all);
use warnings;
print $q->header ( );
my $dbh = DBI->connect(
"dbi:SQLite:DEVICE.db",
"", "",
{
RaiseError => 1,
AutoCommit => 1
}
);
my #rows = ();
my $sql = "SELECT UserId,UserName,CardNo,GroupId,Role,VerifyType FROM UsersList";
my $sth = $dbh->prepare($sql) or die("\n\nPREPARE ERROR:\n\n$DBI::errstr");
$sth->execute or die("\n\nQUERY ERROR:\n\n$DBI::errstr");
print '<table>';
print "<tr>";
print "<th>$sth->{NAME}->[0]</th>";
print "<th>$sth->{NAME}->[1]</th>";
print "<th>$sth->{NAME}->[2]</th>";
print "<th>$sth->{NAME}->[3]</th>";
print "<th>$sth->{NAME}->[4]</th>";
print "<th>$sth->{NAME}->[5]</th>";
print "<th> EDIT </th>";
print "<th> DELETE </th>";
while (my #row = $sth->fetchrow_array) {
print "
<tr>
<td>$row[0]</td>
<td>$row[1]</td>
<td>$row[2]</td>
<td>$row[3]</td>
<td>$row[4]</td>
<td>$row[5]</td>
<td>EDIT</td>
<td>DELETE</td>
</tr>";
}
print "<tr style='background-color:#CDC9C9;'><td><A HREF=\"http://localhost/cgi-
bin/AddUser.cgi\">ADD</A></td><td></td><td></td><td></td><td></td></tr>";
print"</table>";
$sth->finish();
$dbh->commit();
$dbh->disconnect;
print <<END_HTML;
<html>
<head><title></title></head>
<body>
<form action="UsersList.cgi" method="get">
<TABLE align="center">
<TR>
<TD align="left">
<input type="hidden" name="submit" value="Submit">
</TD>
</TR>
</TABLE>
</form>
</body></html>
END_HTML
----------------------------------------
Ok, first thing, get and read Learning Perl. It is, hands down, the best book to learn Perl with.
Next, take a look at Ovid's CGI Course.
Third, your code has some major problems, and you'll need to walk before you run.
I've tidied and commented the heck out of your code.
#!C:\perl\bin\perl.exe
# Windows perl ignores the shebang, except to check for flags and
# arguments to start the Perl interpreter with.
# Your webserver might use it though
# You forgot to enable strict. You enabled warnings further down in
# your code. These two pragmas will help you write bug free code by
# catching many errors.
#
# Keep your module and pragma usage at the top of your
# scripts. It aids readability.
use strict;
use warnings;
# Using CGI is a good idea, but you only need to use CGI one time.
use CGI qw/:all/;
# These are good while learning and debugging.
# Do not use them in production code.
use CGI::Carp qw(warningsToBrowser fatalsToBrowser);
use DBI;
my $dbh = DBI->connect(
"dbi:SQLite:DEVICE.db",
"", "",
{
RaiseError => 1,
AutoCommit => 1
}
);
# Don't use indirect object notation. It can lead to subtle bugs.
# Use the arrow notation for method invocation instead.
my $q = CGI->new();
print $q->header ( );
# The #rows array was doing nothing.
# No need to commit when autocommit is on.
$dbh->commit();
$dbh->disconnect;
# Here we get the html table in a string.
my $table = generate_data_table( $dbi );
# And here we print your whole HTML block with the table interpolated
# into the the main text. As it was, the HTML page was printing AFTER
# the table you generated.
#
# I put a crappy improper stylesheet in the header of your html page.
# Unless you are only doing the most rudimentary HTML work, learn to
# use CSS properly. Your time will be repayed hundreds of times over.
# For only rudimentary work, there's still a good chance you'll break
# even on any time you invest in learning CSS.
print <<END_HTML;
<html>
<head>
<title>Add Users</title>
<style>
.adduser {
background-color:#CDC9C9;
}
</style>
</head>
<body>
<form action="UsersList.cgi" method="get">
$table
<input type="hidden" name="submit" value="Submit">
</form>
</body>
</html>
END_HTML
# Use subroutines to group related actions.
sub generate_data_table {
my $dbi = shift;
my $sql = "SELECT UserId,UserName,CardNo,GroupId,Role,VerifyType FROM UsersList";
my $sth = $dbh->prepare($sql)
or die("\n\nPREPARE ERROR:\n\n$DBI::errstr");
$sth->execute
or die("\n\nQUERY ERROR:\n\n$DBI::errstr");
# Actually generate the table HTML
my $table = '<table><tr>';
# Header
$table .= join '', map "<th>$sth->{NAME}[$_]</th>\n", 0..5;
$table .= "</tr>\n";
# Normal Rows
while (my #row = $sth->fetchrow_array) {
$table .= '<tr>',
$table .= join '', map "<td>$row[$_]</td>\n", 0..5;
$table .= join "\n",
'<td>EDIT</td>'
'<td>DELETE</td>'
"</tr>\n";
}
# Special Row
#
# Don't use inline CSS, use classes and either group all your css at
# the top of your html code, or better yet, load an external stylesheet.
# There is no reason to have to escape quotes when working with Perl CGI.
# First, in html ' and " are interchangeable, so you can pick a quote
# that doesn't need esacaping.
#
# Finally, if you MUST use both ' and " in a single string, you can use
# Perl's quoting operators (q and qq) to select a safe delimiter that will allow you
# to avoid escaping.
$table .=
"<tr class='adduser' >"
. '<td>ADD</td>'
. '<td></td><td></td><td></td><td></td></tr>'
. "</table>";
$sth->finish();
return $table;
}
Finally, to handle sorting and paging, you can use a library as others have suggested, or you can modify your SQL query. The keywords you want for grabbing only a range of results are LIMIT and OFFSET, use an ORDER BY clause to sort your result set. Add some parameters to your forms to indicate what sorting methods or range you want.
One of the (many) advantages that you'd get from using DBIx::Class for your database access is that all searches have built-in support for paging.
Alternatively, you might find something like Data::Page to be useful.
As for sorting, that's probably best done in your SQL query with a 'sort' clause.

How do I read a file's contents into a Perl scalar?

what i am trying to do is get the contents of a file from another server. Since im not in tune with perl, nor know its mods and functions iv'e gone about it this way:
my $fileContents;
if( $md5Con =~ m/\.php$/g ) {
my $ftp = Net::FTP->new($DB_ftpserver, Debug => 0) or die "Cannot connect to some.host.name: $#";
$ftp->login($DB_ftpuser, $DB_ftppass) or die "Cannot login ", $ftp->message;
$ftp->get("/" . $root . $webpage, "c:/perlscripts/" . md5_hex($md5Con) . "-code.php") or die $ftp->message;
open FILE, ">>c:/perlscripts/" . md5_hex($md5Con) . "-code.php" or die $!;
$fileContents = <FILE>;
close(FILE);
unlink("c:/perlscripts/" . md5_hex($md5Con) . "-code.php");
$ftp->quit;
}
What i thought id do is get the file from the server, put on my local machine, edit the content, upload to where ever an then delete the temp file.
But I cannot seem to figure out how to get the contents of the file;
open FILE, ">>c:/perlscripts/" . md5_hex($md5Con) . "-code.php" or die $!;
$fileContents = <FILE>;
close(FILE);
keep getting error;
Use of uninitialized value $fileContents
Which im guessing means it isn't returning a value.
Any help much appreciated.
>>>>>>>>>> EDIT <<<<<<<<<<
my $fileContents;
if( $md5Con =~ m/\.php$/g ) {
my $ftp = Net::FTP->new($DB_ftpserver, Debug => 0) or die "Cannot connect to some.host.name: $#";
$ftp->login($DB_ftpuser, $DB_ftppass) or die "Cannot login ", $ftp->message;
$ftp->get("/" . $root . $webpage, "c:/perlscripts/" . md5_hex($md5Con) . "-code.php") or die $ftp->message;
my $file = "c:/perlscripts/" . md5_hex($md5Con) . "-code.php";
{
local( $/ ); # undefine the record seperator
open FILE, "<", $file or die "Cannot open:$!\n";
my $fileContents = <FILE>;
#print $fileContents;
my $bodyContents;
my $headContents;
if( $fileContents =~ m/<\s*body[^>]*>.*$/gi ) {
print $0 . $1 . "\n";
$bodyContents = $dbh->quote($1);
}
if( $fileContents =~ m/^.*<\/head>/gi ) {
print $0 . $1 . "\n";
$headContents = $dbh->quote($1);
}
$bodyTable = $dbh->quote($bodyTable);
$headerTable = $dbh->quote($headerTable);
$dbh->do($createBodyTable) or die " error: Couldn't create body table: " . DBI->errstr;
$dbh->do($createHeadTable) or die " error: Couldn't create header table: " . DBI->errstr;
$dbh->do("INSERT INTO $headerTable ( headData, headDataOutput ) VALUES ( $headContents, $headContents )") or die " error: Couldn't connect to database: " . DBI->errstr;
$dbh->do("INSERT INTO $bodyTable ( bodyData, bodyDataOutput ) VALUES ( $bodyContents, $bodyContents )") or die " error: Couldn't connect to database: " . DBI->errstr;
$dbh->do("INSERT INTO page_names (linkFromRoot, linkTrue, page_name, table_name, navigation, location) VALUES ( $linkFromRoot, $linkTrue, $page_name, $table_name, $navigation, $location )") or die " error: Couldn't connect to database: " . DBI->errstr;
unlink("c:/perlscripts/" . md5_hex($md5Con) . "-code.php");
}
$ftp->quit;
}
the above using print WILL print the whole file. BUT, for some reason the two regular expresions are returning false. Any idea why?
if( $fileContents =~ m/<\s*body[^>]*>.*$/gi ) {
print $0 . $1 . "\n";
$bodyContents = $dbh->quote($1);
}
if( $fileContents =~ m/^.*<\/head>/gi ) {
print $0 . $1 . "\n";
$headContents = $dbh->quote($1);
}
This is covered in section 5 of the Perl FAQ included with the standard distribution.
How can I read in an entire file all at once?
You can use the Path::Class::File::slurp module to do it in one step.
use Path::Class;
$all_of_it = file($filename)->slurp; # entire file in scalar
#all_lines = file($filename)->slurp; # one line per element
The customary Perl approach for processing all the lines in a file is to do so one line at a time:
open (INPUT, $file) || die "can't open $file: $!";
while (<INPUT>) {
chomp;
# do something with $_
}
close(INPUT) || die "can't close $file: $!";
This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often—if not almost always—the wrong approach. Whenever you see someone do this:
#lines = <INPUT>;
you should think long and hard about why you need everything loaded at once. It's just not a scalable solution. You might also find it more fun to use the standard Tie::File module, or the DB_File module's $DB_RECNO bindings, which allow you to tie an array to a file so that accessing an element the array actually accesses the corresponding line in the file.
You can read the entire filehandle contents into a scalar.
{
local(*INPUT, $/);
open (INPUT, $file) || die "can't open $file: $!";
$var = <INPUT>;
}
That temporarily undefs your record separator, and will automatically close the file at block exit. If the file is already open, just use this:
$var = do { local $/; <INPUT> };
For ordinary files you can also use the read function.
read( INPUT, $var, -s INPUT );
The third argument tests the byte size of the data on the INPUT filehandle and reads that many bytes into the buffer $var.
Use Path::Class::File::slurp if you want to read all file contents in one go.
However, more importantly, use an HTML parser to parse HTML.
open FILE, "c:/perlscripts" . md5_hex($md5Con) . "-code.php" or die $!;
while (<FILE>) {
# each line is in $_
}
close(FILE);
will open the file and allow you to process it line-by-line (if that's what you want - otherwise investigate binmode). I think the problem is in your prepending the filename to open with >>. See this tutorial for more info.
I note you're also using regular expressions to parse HTML. Generally I would recommend using a parser to do this (e.g. see HTML::Parser). Regular expressions aren't suited to HTML due to HTML's lack of regularity, and won't work reliably in general cases.
Also, if you are in need of editing the contents of the files take a look at the CPAN module
Tie::File
This module relieves you from the need to creation of a temp file for editing the content
and writing it back to the same file.
EDIT:
What you are looking at is a way to slurp the file. May be you have to undefine
the record separator variable $/
The below code works fine for me:
use strict;
my $file = "test.txt";
{
local( $/ ); # undefine the record seperator
open FILE, "<", $file or die "Cannot open:$!\n";
my $lines =<FILE>;
print $lines;
}
Also see the section "Traditional Slurping" in this article.
BUT, for some reason the two regular expresions are returning false. Any idea why?
. in a regular expression by default matches any character except newline. Presumably you have newlines before the </head> tag and after the <body> tag. To make . match any character including newlines, use the //s flag.
I'm not sure what your print $0 . $1 ... code is about; you aren't capturing anything in your matches to be stored in $1, and $0 isn't a variable used for regular expression captures, it's something very different.
if you want to get the content of the file,
#lines = <FILE>;
Use File::Slurp::Tiny. As convenient as File::Slurp, but without the bugs.