Processing form data using CGI - html

I'm programming in Perl and need to get data from the following HTML form:
<FORM action="./cgi-bin/Perl.pl" method="GET">
<br>
Full name: <br><input type="text" name="full_name" maxlength="20"><br>
Username: <br><input type="text" name="user_name" maxlength="8"><br>
Password: <br><input type="password" name="password" maxlength="15"><br>
Confirm password: <br><input type="password" name="new_password" maxlength="15"><br>
<input type="submit" value ="Submit"><br><br>
</FORM>
EDIT: If i cannot use CGI.pm, will the following work?
local ($buffer, #pairs, $pair, $name, $value, %FORM);
# Read in text
$ENV{'REQUEST_METHOD'} =~ tr/a-z/A-Z/;
if ($ENV{'REQUEST_METHOD'} eq "GET") {
read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
}
else {
$buffer = $ENV{'QUERY_STRING'};
}
# Split information into name/value pairs
#pairs = split(/&/, $buffer);
foreach $pair (#pairs)
{
($name, $value) = split(/=/, $pair);
$value =~ tr/+/ /;
$value =~ s/%(..)/pack("C", hex($1))/eg;
$FORM{$name} = $value;
}
but every time I attempt to use these values I get the error:
Use of unitilialized value
How can I properly use CGI to handle my form data?
EDIT: It's possible that my error lies elsewhere. This is my code. Could it be the way in which I'm using grep? Should I not be using the GET method?
#!/usr/bin/perl
use CGI qw(:standard);
use strict;
use warnings;
print "Content-type: text/html\n\n";
#getting these from HTML form
my $full_name = param('full_name');
my $user_name= param('user_name');
my $password = param('password');
my $new_password = param('new_password');
#checking that inputs are alphanumeric or an underscore
my $mismatch = grep /^[a-zA-Z0-9_]*$/i, $full_name, $user_name, $password, $new_password;
if($mismatch) {
#error message if invalid input
print qq(<html>\n);
print qq(<head>\n);
print qq(<title> Error: alphanumeric inputs only. </title>\n);
print qq{<meta http-equiv="refresh" content="5;URL="http://www.cs.mcgill.ca/~amosqu/registration.html">\n};
print qq(</head>\n);
print qq(<body>\n);
print qq(<b><center> Inputs with alphanumeric characters only please. </b></center>\n\n);
print qq(</body>\n);
print qq(</html>\n);
}

You have altered the regex that I suggested in my answer to your previous question, which was
grep /[^A-Z0-9]/i, $full_name, $user_name, $password, $new_password
You have changed it so that $mismatch is now set to the number of parameters that are valid, and the condition for an invalid set of arguments is now the awkward $mismatch < 4.
If your requirement has altered from alphanumeric to alphanumeric plus underscore, then you can restore the sense of the grep by writing
my $mismatch = grep /\W/, $full_name, $user_name, $password, $new_password
which will set $mismatch to a positive true value if any of the values contains a "non-word" character, which is alphanumeric plus underscore as you wanted.
However, the problem you are seeing
Use of uninitialized value $_ in pattern match (m//)
is because at least one of the parameters $full_name, $user_name, $password, or $new_password is undefined. You need to find out which one and why it is happening. Are you sure that all four query parameters full_name, user_name, password, and new_password are present in the query string you're getting back? Take a look at what the query_string method returns to see.

Well, "use of initialized value" isn't an error, it's just a warning. More recent versions of Perl will tell you which variable is causing the problem.
Are you sure that it's the grep line that is generating the errors? Are you sure that you are filling in all of the form inputs when you're testing this?
The following suggestions, don't address your warning. But they are problems with your code.
The regex you are using for your grep seems broken. Your code says "set $missing to the number of variables that include nothing but alphanumeric characters". That will be set to four if you get four alphanumeric-only inputs. You then trigger the error page - which seems to be the inverse of what you want. Your regex also checks for zero or more alphanumeric characters. So it accepts the empty string. Is that what you want?
Also, your regex is too complicated. There's no need to include both A-Z and a-z if you're using /i to make the match case-insensitive. In fact your regex can be collapsed to /^\w*$/i (as \w means "alphanumeric characters plus an underscore").
Checking that inputs only contain alphanumeric characters is probably a bad idea as well. Most people's full names will include at least one space. And limiting passwords to just containing alphanumeric characters is a terrible idea.
When people point out that CGI is no longer recommended, that doesn't mean that you should go back to using Matt Wright's broken CGI parameter parser from twenty years ago. That code is just as broken as it always was. No-one should be using it. You should be looking at one of the modern Perl web development frameworks that are based on PSGI - something like Web::Simple, Dancer or Mojolicious. See CGI::Alternatives for details.

Related

How to convert \u0421 into letter "C"?

I made a post query to server and got json. It contains wrong symbol: instead "Correct" I got "\u0421orrect". How can I encode this text?
A parse_json function performs it like "РЎorrect";
I found out that
$a = "\x{0421}orrect";
$a= encode("utf-8", $a);
returns "РЎorrect", and
$a = "\x{0421}orrect";
$a= encode("cp1251", $a);
returns "Correct"
So I've decided to change \u to \x and then to use cp1251.
\u to \x
I wrote:
Encode::Escape::enmode 'unicode-escape', 'perl';
Encode::Escape::demode 'unicode-escape', 'python';
$content= encode 'unicode-escape', decode 'unicode-escape', $content;
and got \x{0421}orrect.
And then I tried:
$content = encode( 'cp1251', $content );
And... nothing changed! I still have \x{0421}orrect...
I notice something interesting:
$a = "\x{0421}orrect";
$a= encode("cp1251", $a);
returns "Correct"
BUT
$a = '\x{0421}orrect';
$a= encode("cp1251", $a);
still returns "\x{0421}orrect".
Maybe this is a key, but I don't know what I can do with this.
I've already tried to encode and decode, Encode:: from_to,JSON::XS and utf8.
You mention escaping multiple times, but you want to do the opposite (unescape).
decode_json/from_json will correctly return "Сorrect" (Where the "C" is CYRILLIC CAPITAL LETTER ES).
use JSON::XS qw( decode_json );
my $json_utf8 = '{"value":"\u0421orrect"}';
my $data = decode_json($json_utf8);
You do need to encode your outputs, though. For example, if you have Cyrillic-based Windows system, and you wanted to create a native file, you could use
open(my $fh, '>:encoding(cp1251)', $qfn)
or die("Can't create \"$qfn\": $!\n");
say $fh $data->{value};
If you want to hardcode the encoding, or if you're interested in the encoding output to STDOUT and STDERR as well, check out this.
Apologies if you realise this already - I just think it's worth pointing out so we're all on the same page.
Character number \x{0421} has the description "CYRILLIC CAPITAL LETTER ES" and looks like this: С
Character number \x{0043} has the description "LATIN CAPITAL LETTER C" and looks like this: C
So depending on the font you're using, it's entirely likely that the two characters appear identical.
You asked "How can I encode this text?" but you didn't explain what you mean by that or why you want to "encode" it. There is no encoding that will convert 'С' (\x{0421}) into 'C' (\x{0043}) - they are two different characters from two different alphabets.
So the question is, what are you trying to achieve? Are you trying to check if the string returned from the server matched "Correct"? If so, that simply won't work, because the server is returning the string "Сorrect". They might look the same, but they are two different strings.
It's possible that whole situation is an error in the server code and it should be returning "Correct". If that is the case and you can't rely on the server reliably returning the "Correct", one workaround would be to use a character replacement, to "normalise" the string before you inspect its contents. For example:
use JSON::XS qw( decode_json );
my $response = <<EOF;
{
"status": "\u0421orrect"
}
EOF
my $data = decode_json($response);
my $status = $data->{status};
$status =~ tr/\x{0421}/C/;
if($status eq "Correct") {
say "The status is correct";
}
else {
say "The status is not correct";
}
This code will work now, and in the future if the server code is fixed to return "Correct".

Use regular expression to extract img tag from HTML in Perl

I need to extract captcha from url and recognised it with Tesseract.
My code is:
#!/usr/bin/perl -X
###
$user = 'user'; #Enter your username here
$pass = 'pass'; #Enter your password here
###
#Server settings
$home = "http://perltest.adavice.com";
$url = "$home/c/test.cgi?u=$user&p=$pass";
###Add code here!
#Grab img from HTML code
#if ($html =~ /<img. *?src. *?>/)
#{
# $img1 = $1;
#}
#else
#{
# $img1 = "";
#}
$img2 = grep(/<img. *src=.*>/,$html);
if ($html =~ /\img[^>]* src=\"([^\"]*)\"[^>]*/)
{
my $takeImg = $1;
my #dirs = split('/', $takeImg);
my $img = $dirs[2];
}
else
{
print "Image not found\n";
}
###
die "<img> not found\n" if (!$img);
#Download image to server (save as: ocr_me.img)
print "GET '$img' > ocr_me.img\n";
system "GET '$img' > ocr_me.img";
###Add code here!
#Run OCR (using shell command tesseract) on img and save text as ocr_result.txt
system("tesseract ocr_me.img ocr_result");
print "GET '$txt' > ocr_result.txt\n";
system "GET '$txt' > ocr_result.txt";
###
die "ocr_result.txt not found\n" if (!-e "ocr_result.txt");
# check OCR results:
$txt = 'cat ocr_result.txt';
$txt =~ s/[^A-Za-z0-9\-_\.]+//sg;
$img =~ s/^.*\///;
print `echo -n "file=$img&text=$txt" | POST "$url"`;
As you see I`m trying extract img src tag. This solution did not work for me ($img1) use shell command tesseract in perl script to print a text output. Also I used adopted version of that solution($img2) How can I extract URL and link text from HTML in Perl?.
If you need HTMLcode from that page, here is:
<html>
<head>
<title>Perl test</title>
</head>
<body style="font: 18px Arial;">
<nobr>somenumbersimg src="/JJ822RCXHFC23OXONNHR.png"
somenumbers<img src="/captcha/1533030599.png"/>
somenumbersimg src="/JJ822RCXHFC23OXONNHR.png" </nobr><br/><br/><form method="post" action="?u=user&p=pass">User: <input name="u"/><br/>PW: <input name="p"/><br/><input type="hidden" name="file" value="1533030599.png"/>Text: <input name="text"></br><input type="submit"></form><br/>
</body>
</html>
I got error that image not found. My problem is wrong regular expression, as I think.I can not install any modules such as HTTP::Parser or similar
Aside from the fact that using regular expressions on HTML isn't very reliable, your regular expression in the following code isn't going to work because it's missing a capture group, so $1 won't be assigned a value.
if ($html =~ /<img. *?src. *?>/)
{
$img = $1;
}
If you want to extract parts of text using a regular expression you need to put that part inside brackets. Like for example:
$example = "hello world";
$example =~ /(hello) world/;
this will set $1 to "hello".
The regular expression itself doesn't make that much sense - where you have ". *?", that'll match any character followed by 0 or more spaces. Is that a typo for ".*?" which would match any number of characters but isn't greedy like ".*", so will stop when it finds a match for the next part of the regex.
This regular expression is possibly closer to what you're looking for. It'll match the first img tag that has a src attribute that starts with "/captcha/" and store the image URL in $1
$html =~ m%<img[^>]*src="(/captcha/[^"]*)"%s;
To break it down how it works. The "m%....%" is just a different way of saying "/.../" that allows you to put slashes in the regex without needing to escape them. "[^>]*" will match zero or more of any character except ">" - so it won't match the end of the tag. And "(/captcha/[^"]*)" is using a capture group to grab anything inside the double quotes that will be the URL. It's also using the "/s" modifier on the end which will treat $html as if it is just one long line of text and ignoring any \n in it which probably isn't needed, but on the off chance the img tag is split over multiple lines it'll still work.

Checking for alphanumeric characters and getting input from HTML form

I'm fairly new to programming in Perl and I have a couple of compilation issues I can't seem to resolve. My program gets input from this HTML form.
Question: Should my form use the post or get method?
<FORM action="./cgi-bin/Perl.pl" method="GET">
<br>
Full name: <br><input type="text" name="full_name" maxlength="20"><br>
Username: <br><input type="text" name="user_name" maxlength="8"><br>
Password: <br><input type="password" name="password" maxlength="15"><br>
Confirm password: <br><input type="password" name="new_password" maxlength="15"><br>
I open a CSV file, write the value of user_name into an array and do a number of checks on the user's input.
Problem #1: I need to check that full_name, user_name, password, and new_password are all alphanumeric or a space but I keep getting multiple errors that look like:
Use of uninitialized value $full_name in string eq at Perl.pl line 33
I don't think I've used CGI correctly to get these values from the form. I also believe I'm not correctly checking for alphanumeric characters. How can I resolve this?
Problem #2: I need to redirect the user to a specific webpage if their passwords don't match and if the username is already taken. I used a meta redirect but it's not doing it successfully. How can I display a proper error page?
This is my code:
#!/usr/bin/perl
use CGI qw(:standard);
use strict;
use warnings;
print "Content-type: text/html\n\n";
#opening Members.csv for reading
my $file = '/home/2014/amosqu/public_html/cgi-bin/Members.csv';
open(my $csv, '<', $file) || die "Could not open your file";
#getting these from HTML form
my $full_name = param('full_name');
my $user_name= param('user_name');
my $password = param('password');
my $new_password = param('new_password');
my #users = ();
#splitting each line of csv file
foreach (<$csv>) {
chomp;
my #fields = split (/\,/);
push #users, $fields[1]; #put all usernames inside of array
}
close $csv;
#opening Members.csv for appending
open(my $fh, '>>', $file) || die "Could not open your file";
#SOURCE OF PROBLEM 1
#checking that all values are alphanumeric
if(($full_name && $user_name && $password && $new_password) eq /\A[[:alnum:]]+\z/) {
#if passwords don't match, redirect to error page
if($password ne $new_password){
print qq(<html>\n);
print qq(<head>\n);
print qq(<title> Passwords don't match. </title> \n);
print qq{<meta http-equiv="refresh"content="5;URL="http://www.cs.mcgill.ca/~amosqu/registration.html">\n};
print qq(</head>\n);
print qq(<body>\n);
print qq(<b><center> Passwords don't match </b></center>\n\n);
print qq(</body>\n);
print qq(</html>\n);
}
#if they do match, check that user name isn't in Members.csv
else {
if(grep (/$user_name/, #users)) {
print qq(<html>\n);
print qq(<head>\n);
print qq(<title> Sorry username already taken. </title>\n);
print qq{<meta http-equiv="refresh"content="5;URL="http://www.cs.mcgill.ca/~amosqu/registration.html">\n};
print qq(</head>\n);
print qq(<body>\n);
print qq(<b><center> Username already taken. </b></center>\n\n);
print qq(</body>\n);
print qq(</html>\n);
}
#if it isn't already in Members.csv append values to the file
else {
print $fh "$full_name, $user_name, $password \n";
}
}
}
close $fh;
This should get you going. There is a number of issues with your code that don't stop it from working, but current wisdom is not to use CGI at all so I will roll with you.
Use GET unless you have a good reason to use POST
The problem is here
if(($full_name && $user_name && $password && $new_password) eq /\A[[:alnum:]]+\z/) {
You are using a Boolean && operation that combines the truth of the three variables, and checking whether that, as a string, is equal to the result of matching the contents of $_ against that regular expression.
You must check each of the variables individually, and use the binding operator =~ to test them against a regex. It is also bad form to use the POSIX character classes. I suggest you use grep, like this
my $mismatch = grep /[^A-Z0-9]/i, $full_name, $user_name, $password, $new_password;
Now, $mismatch is true if any of the variables contain a non-alphanumeric character. (Strictly, it is set to the number of variables that have a a non-alphanumeric character, which is zero (false) if none of them do.)
Then you can say
if (not $mismatch) { ... }
It looks like you just need an else that builds a separate page.

Why does the JSON module quote some numbers but not others?

We recently switched to the new JSON2 perl module.
I thought all and everything gets returned quoted now.
But i encountered some cases in which a number (250) got returned as unquoted number in the json string created by perl.
Out of curiosity:
Does anyone know why such cases exist and how the json module decides if to quote a value?
It will be unquoted if it's a number. Without getting too deeply into Perl internals, something is a number if it's a literal number or the result of an arithmetic operation, and it hasn't been stringified since its numeric value was produced.
use JSON::XS;
my $json = JSON::XS->new->allow_nonref;
say $json->encode(42); # 42
say $json->encode("42"); # "42"
my $x = 4;
say $json->encode($x); # 4
my $y = "There are $x lights!";
say $json->encode($x); # "4"
$x++; # modifies the numeric value of $x
say $json->encode($x); # 5
Note that printing a number isn't "stringifying it" even though it produces a string representation of the number to output; print $x doesn't cause a number to be a string, but print "$x" does.
Anyway, all of this is a bit weird, but if you want a value to be reliably unquoted in JSON then put 0 + $value into your structure immediately before encoding it, and if you want it to be reliably quoted then use "" . $value or "$value".
You can force it into a string by doing something like this:
$number_str = '' . $number;
For example:
perl -MJSON -le 'print encode_json({foo=>123, bar=>"".123})'
{"bar":"123","foo":123}
It looks like older versions of JSON has autoconvert functionality that can be set. Did you not have $JSON::AUTOCONVERT set to a true value?

Perl param() receiving from its own print HTML

I have a Perl script that reads in data from a database and prints out the result in HTML forms/tables. The form of each book also contains a submit button.
I want Perl to create a text file (or read into one already created) and print the title of the book that was inside the form submitted. But I can't seem to get param() to catch the submit action!
#!/usr/bin/perl -w
use warnings; # Allow for warnings to be sent if error's occur
use CGI; # Include CGI.pm module
use DBI;
use DBD::mysql; # Database data will come from mysql
my $dbh = DBI->connect('DBI:mysql:name?book_store', 'name', 'password')
or die("Could not make connection to database: $DBI::errstr"); # connect to the database with address and pass or return error
my $q = new CGI; # CGI object for basic stuff
my $ip = $q->remote_host(); # Get the user's ip
my $term = $q->param('searchterm'); # Set the search char to $term
$term =~ tr/A-Z/a-z/; # set all characters to lowercase for convenience of search
my $sql = '
SELECT *
FROM Books
WHERE Title LIKE ?
OR Description LIKE ?
OR Author LIKE ?
'; # Set the query string to search the database
my $sth = $dbh->prepare($sql); # Prepare to connect to the database
$sth->execute("%$term%", "%$term%", "%$term%")
or die "SQL Error: $DBI::errstr\n"; # Connect to the database or return an error
print $q->header;
print "<html>";
print "<body>";
print " <form name='book' action='bookcart.php' method=post> "; # Open a form for submitting the result of book selection
print "<table width=\"100%\" border=\"0\"> ";
my $title = $data[0];
my $desc = $data[1];
my $author = $data[2];
my $pub = $data[3];
my $isbn = $data[4];
my $photo = $data[5];
print "<tr> <td width=50%>Title: $title</td> <td width=50% rowspan=5><img src=$photo height=300px></td></tr><tr><td>Discreption Tags: $desc</td></tr><tr><td>Publication Date: $pub</td></tr><tr><td>Author: $author</td></tr><tr><td>ISBN: $isbn</td> </tr></table> <br>";
print "Add this to shopping cart:<input type='submit' name='submit' value='Add'>";
if ($q->param('submit')) {
open(FILE, ">>'$ip'.txt");
print FILE "$title\n";
close(FILE);
}
print "</form>"; # Close the form for submitting to shopping cart
You haven't used use strict, to force you to declare all your variables. This is a bad idea
You have used remote_host, which is the name of the client host system. Your server may not be able to resolve this value, in which case it will remain unset. If you want the IP address, use remote_addr
You have prepared and executed your SQL statement but have fetched no data from the query. You appear to expect the results to be in the array #data, but you haven't declared this array. You would have been told about this had you had use strict in effect
You have used the string '$ip'.txt for your file names so, if you were correctly using the IP address in stead of the host name, your files would look like '92.17.182.165'.txt. Do you really want the single quotes in there?
You don't check the status of your open call, so you have no idea whether the open succeeded, or the reason why it may have failed
I doubt if you have really spent the last 48 hours coding this. I think it is much more likely that you are throwing something together in a rush at the last minute, and using Stack Overflow to help you out of the hole you have dug for yourself.
Before asking for the aid of others you should at least use minimal good-practice coding methods such as applying use strict. You should also try your best to debug your code: it would have taken very little to find that $ip has the wrong value and #data is empty.
Use strict and warnings. You want to use strict for many reasons. A decent article on this is over at perlmonks, you can begin with this. Using strict and warnings
You don't necessarily need the following line, you are using DBI and can access mysql strictly with DBI.
use DBD::mysql;
Many of options are available with CGI, I would recommend reading the perldoc on this also based on user preferences and desired wants and needs.
I would not use the following:
my $q = new CGI;
# I would use as so..
my $q = CGI->new;
Use remote_addr instead of remote_host to retrieve your ip address.
The following line you are converting all uppercase to lowercase, unless it's a need to specifically read from your database with all lowercase, I find this useless.
$term =~ tr/A-Z/a-z/;
Next your $sql line, again user preference, but I would look into sprintf or using it directly inside your calls. Also you are trying to read an array of data that does not exist, where is the call to get back your data? I recommend reading the documentation for DBI also, many methods of returning your data. So you want your data back using an array for example...
Here is an untested example and hint to help get you started.
use strict;
use warnings;
use CGI qw( :standard );
use CGI::Carp qw( fatalsToBrowser ); # Track your syntax errors
use DBI;
# Get IP Address
my $ip = $ENV{'REMOTE_ADDR'};
# Get your query from param,
# I would also parse your data here
my $term = param('searchterm') || undef;
my $dbh = DBI->connect('DBI:mysql:db:host', 'user', 'pass',
{RaiseError => 1}) or die $DBI::errstr;
my $sql = sprintf ('SELECT * FROM Books WHERE Title LIKE %s
OR Description LIKE %s', $term, $term);
my $sth = $dbh->selectall_arrayref( $sql );
# Retrieve your result data from array ref and turn into
# a hash that has title for the key and a array ref to the data.
my %rows = ();
for my $i ( 0..$#{$sth} ) {
my ($title, $desc, $author, $pub, $isbn, $pic) = #{$sth->[$i]};
$rows{$title} = [ $desc, $author, $pub, $isbn, $pic ];
}
# Storing your table/column names
# in an array for mapping later.
my #cols;
$cols[0] = Tr(th('Title'), th('Desc'), th('Author'),
th('Published'), th('ISBN'), th('Photo'));
foreach (keys %rows) {
push #cols, Tr( td($_),
td($rows{$_}->[0]),
td($rows{$_}->[1]),
td($rows{$_}->[2]),
td($rows{$_}->[3]),
td(img({-src => $rows{$_}->[4]}));
}
print header,
start_html(-title => 'Example'),
start_form(-method => 'POST', -action => 'bookcart.php'), "\n",
table( {-border => undef, -width => '100%'}, #cols ),
submit(-name => 'Submit', -value => 'Add Entry'),
end_form,
end_html;
# Do something with if submit is clicked..
if ( param('Submit') ) {
......
}
This assumes that you're using the OO approach to CGI.pm, and that $q is the relevant object. This should work, assuming that you have $q = new CGI somewhere in your script.
Can you post the rest of the script?
I've created a mockup to test this, and it works as expected:
#!/usr/bin/perl
use CGI;
my $q = new CGI;
print $q->header;
print "<form><input type=submit name=submit value='add'></form>\n";
if ($q->param('submit')) {
print "submit is \"" . $q->param('submit') . "\"\n";
}
After the submit button is clicked, the page displays that submit is "add" which means the evaluation is going as planned.
I guess what you need to do is make sure that $q is your CGI object, and move forward from there.