How to start reading a CSV from the beginning again?

use Text::CSV_XS;
my $csv = Text::CSV_XS->new;
open my $fh, "test.csv" or die "test.csv: $!";
while (my $row = $csv->getline($fh)) {
my @fields = @$row;
if ($fields[0] eq "A1") {
print "Found A1", "\n";
last;
}
}
# now start searching the CSV again
If I have gone through some of a CSV using Text::CSV_XS, how can I then start again from the beginning? Is there some way to return the pointer/window to the beginning of the file?

use Fcntl qw( SEEK_SET );
seek($fh, 0, SEEK_SET);
You could also just re-open the file.
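
A minimal sketch of the rewind, assuming an in-memory filehandle with made-up data in place of test.csv, and plain readline in place of Text::CSV_XS so that it runs with core modules only:

```perl
use strict;
use warnings;
use Fcntl qw( SEEK_SET );

# Hypothetical sample data standing in for test.csv.
my $data = "A1,B1\nA2,B2\n";
open my $fh, '<', \$data or die "in-memory open: $!";

# First pass: read to the end of the data.
my $first_pass = 0;
while ( my $line = <$fh> ) {
    $first_pass++;
}

# Rewind to byte 0; the next read starts from the first line again.
seek( $fh, 0, SEEK_SET ) or die "seek: $!";

# Second pass over the same handle.
my $second_pass = 0;
while ( my $line = <$fh> ) {
    $second_pass++;
}

print "first pass: $first_pass lines, second pass: $second_pass lines\n";
```

The same seek call should apply to a handle that is being read through getline, since the parser reads from whatever position the handle is at.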

Related

How to read a csv using Perl?

I want to read a CSV using Perl, excluding the first row. Further, the col2 and col3 values need to be stored in another file, and the row that was read must be deleted.
Edit : Following code has worked. I just want the deletion part.
use strict;
use warnings;
my ($field1, $field2, $field3, $line);
my $file = 'D:\Patching_test\ptch_file.csv';
open( my $data, '<', $file ) or die;
while ( $line = <$data> ) {
next if $. == 1;
( $field1, $field2, $field3 ) = split ',', $line;
print "$field1 : $field2 : $field3 ";
my $filename = 'D:\DB_Patch.properties';
unlink $filename;
open( my $sh, '>', $filename )
or die "Could not open file '$filename' $!";
print $sh "Patch_id=$field2\n";
print $sh "Patch_Name=$field3";
close($sh);
close($data);
exit 0;
}
The OP's problem is poorly presented for processing:
no input data sample is provided
no desired output data is shown
the modified input file after processing is not shown
Based on the problem description, the following code is offered:
use strict;
use warnings;
use feature 'say';
my $input = 'D:\Patching_test\ptch_file.csv';
my $output = 'D:\DB_Patch.properties';
my $temp = 'D:\script_temp.dat';
open my $in, '<', $input
or die "Couldn't open $input";
open my $out, '>', $output
or die "Couldn't open $output";
open my $tmp, '>', $temp
or die "Couldn't open $temp";
while ( <$in> ) {
chomp; # strip the newline so that say does not write a second one
if( $. == 1 ) {
say $tmp $_;
} else {
my($patch_id, $patch_name) = (split ',')[1,2];
say $out "Patch_id=$patch_id";
say $out "Patch_Name=$patch_name";
}
}
close $in;
close $out;
close $tmp;
rename $temp, $input
or die "Couldn't rename $temp to $input: $!";
exit 0;

write into a csv file in multiple cells

I am coding in Perl. How can I write multiple variables into a CSV file and put each one in a separate cell on the same line?
This is part of my code:
#!/usr/bin/perl
use feature qw(say);
use strict;
use warnings;
use constant BUFSIZE => 6;
my $year += 1900;
my $input_file = 'path\ZONE0.txt';
my $outputfile = 'path\outputfile.csv';
open (my $BIN, "<:raw", $input_file) or die "can't open the file $input_file: $!";
my $buffer;
open(FH, '>>', $outputfile) or die $!;
while (1) {
my $bytes_read = sysread $BIN, $buffer, BUFSIZE;
die "Could not read file $input_file: $!" if !defined $bytes_read;
last if $bytes_read <= 0;
my @decimal= map { unpack "C", $_ } split //, $buffer;
my $start= $decimal[0];
my $DevType = $decimal[1];
my @hexDevType = sprintf("0x%x", $DevType);
my @DevUID =($decimal[5], $decimal[4], $decimal[3], $decimal[2]);
my @hexDevUID = map { sprintf("0x%x",$_) } @DevUID;
print FH $start, ' ' , print FH $DevType,' ', @hexDevUID , "\n";
}
close $BIN;
This results in putting all the variables next to each other in one cell, which is not what I want. Can you help me separate the variables?
CSV files don't have cells. I suspect you're opening the file in a spreadsheet program.
The secret of a CSV file is that the values are separated by commas. So you need to put commas between any values that you want to appear in separate cells in your spreadsheet.
It looks like your data is in @hexDevUID. The simplest way is to turn that into a comma-separated string using join():
join(',', @hexDevUID)
But the more robust approach will be to use Text::CSV_XS.
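
The join() approach above can be sketched as follows. The sample values are made up and stand in for one record read from the binary file:

```perl
use strict;
use warnings;

# Hypothetical values standing in for one 6-byte record.
my ( $start, $dev_type ) = ( 1, 2 );
my @hexDevUID = map { sprintf "0x%x", $_ } ( 10, 11, 12, 13 );

# One comma between every value gives one spreadsheet cell per value.
my $line = join ',', $start, $dev_type, @hexDevUID;
print "$line\n";
```

A plain join is only safe while no value contains a comma, a quote, or a newline, which is where a CSV module earns its keep.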
Below is the OP's code, modified so that it does not use any CSV modules for output.
Error handling was added for read errors and for reads that return too few bytes for further processing.
use strict;
use warnings;
use feature 'say';
use constant BUFSIZE => 6;
my($buffer,$bytes_read);
my $infile = shift || 'path\ZONE0.txt';
my $outfile = 'path\outputfile.csv';
open my $in, '<:raw', $infile
or die "Can't open $infile: $!";
open my $out, '+>>', $outfile
or die "Can't open $outfile: $!";
do {
$bytes_read = sysread $in, $buffer, BUFSIZE;
die "Error: read from $infile: $!" unless defined $bytes_read;
error_handler($bytes_read) unless $bytes_read == 6;
my @decimal = map { ord } split //, $buffer;
my($start,$DevType) = @decimal[0,1];
my @hexDevUID = map { sprintf("0x%02x",$_) } @decimal[5,4,3,2];
say $out join(',',($start,$DevType,@hexDevUID));
} while ( $bytes_read );
sub error_handler {
my $bytes = shift;
close $out;
close $in;
say "
Error: called error_handler(\$read_bytes)
Action: Emergency file closure to preserve data
Cause: Read insufficient $bytes bytes
" unless $bytes == 0;
exit $bytes ? 1 : 0;
}
The loop can be rewritten using unpack as follows:
do {
$bytes_read = sysread $in, $buffer, BUFSIZE;
die "Error: read from $infile: $!" unless defined $bytes_read;
error_handler($bytes_read) unless $bytes_read == 6;
my($start,$DevType,@devUID) = unpack('CCC4',$buffer);
my @hexDevUID = reverse map { sprintf "0x%02x", $_ } @devUID;
say $out join(',',($start,$DevType,@hexDevUID));
} while ( $bytes_read );
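
To see what the unpack template does without a real input file, here is a small sketch that builds one record with pack. The byte values are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical 6-byte record: start byte, device type, then a 4-byte UID
# stored low byte first.
my $buffer = pack 'C6', 1, 2, 0xDD, 0xCC, 0xBB, 0xAA;

# 'C' reads one unsigned byte; 'C4' reads four of them into the list.
my ( $start, $dev_type, @uid ) = unpack 'CCC4', $buffer;

# Reverse so that the most significant byte is printed first.
my @hex_uid = reverse map { sprintf '0x%02x', $_ } @uid;

my $line = join ',', $start, $dev_type, @hex_uid;
print "$line\n";
```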

Corrupted JSON encoding in Perl (missing comma)

My custom code (in Perl) produces the following invalid JSON, with a comma missing between blocks:
{
"data": [{
"{#LOGFILEPATH}": "/tmp/QRZ2007.tcserverlogs",
"{#LOGFILE}": "QRZ2007"
} **missing comma** {
"{#LOGFILE}": "ARZ2007",
"{#LOGFILEPATH}": "/tmp/ARZ2007.tcserverlogs"
}]
}
My terrible code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use utf8;
use JSON;
binmode STDOUT, ":utf8";
my $dir = $ARGV[0];
my $json = JSON->new->utf8->space_after;
opendir(DIR, $dir) or die $!;
print '{"data": [';
while (my $file = readdir(DIR)) {
next unless (-f "$dir/$file");
next unless ($file =~ m/\.tcserverlogs$/);
my $fullPath = "$dir/$file";
my $filenameshort = basename($file, ".tcserverlogs");
my $data_to_json = {"{#LOGFILEPATH}"=>$fullPath,"{#LOGFILE}"=>$filenameshort};
print $json->encode($data_to_json);
}
print ']}'."\n";
closedir(DIR);
exit 0;
Dear team, I am not a programmer. Does anyone have an idea how to fix it? Thank you!
If you do not print a comma, you will not get a comma.
You are trying to build your own JSON string from pre-encoded pieces of smaller data structures. That will not work unless you tell Perl when to put commas. You could do that, but it's easier to just collect all the data into a Perl data structure that is equivalent to the JSON string you want to produce, and encode the whole thing in one go when you're done.
my $dir = $ARGV[0];
my $json = JSON->new->utf8->space_after;
my @data;
opendir( DIR, $dir ) or die $!;
while ( my $file = readdir(DIR) ) {
next unless ( -f "$dir/$file" );
next unless ( $file =~ m/\.tcserverlogs$/ );
my $fullPath = "$dir/$file";
my $filenameshort = basename( $file, ".tcserverlogs" );
my $data_to_json = { "{#LOGFILEPATH}" => $fullPath, "{#LOGFILE}" => $filenameshort };
push @data, $data_to_json;
}
closedir(DIR);
print $json->encode( { data => \@data } );
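
A self-contained sketch of the same idea. It swaps in the core JSON::PP module for JSON so that it runs without extra installs, and a hard-coded list of made-up file names stands in for the readdir() results:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical file names standing in for what readdir() would return.
my @files = ( 'QRZ2007.tcserverlogs', 'ARZ2007.tcserverlogs' );

my @data;
for my $file (@files) {
    ( my $short = $file ) =~ s/\.tcserverlogs$//;
    push @data, { '{#LOGFILEPATH}' => "/tmp/$file", '{#LOGFILE}' => $short };
}

# Encoding the whole structure at once lets the encoder place the commas.
# canonical() sorts hash keys so the output is stable between runs.
my $out = JSON::PP->new->canonical->encode( { data => \@data } );
print "$out\n";
```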

HTML tag parsing script

I've written an HTML tag parsing script that I think should work but I'm getting a file not found error. Maybe I'm having a senior moment but I'm stuck. I have all of the *.html files that I want to parse in a directory called Test and I am executing the perl script from a folder called temp that has the directory Test in it. The exact error is: Error opening Test/1.html: No such file or directory.
Here's the code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use HTTP::Headers;
use HTML::HeadParser;
use Text::CSV;
my $csv1 = Text::CSV->new ( { binary => 1 } ) or die Text::CSV->error_diag();
$csv1->eol ("\n");
my $dfile = 'all_tags.csv';
open my $fh1, ">:encoding(utf8)", "$dfile" or die "Error opening $dfile: $!";
my $dir = 'Test';
find (\&HTML_Files, $dir);
print "directory is";
print $dir;
close $fh1 or die "Error closing $dfile: $!";
exit;
sub HTML_Files {
Parse_HTML_Header($File::Find::name) if /\.html?$/;
}
sub Parse_HTML_Header {
my $ifile = shift;
open(my $fh0, '<', $ifile) or die "Error opening $ifile: $!\n";
my $text = '';
{
$/ = undef;
$text = <$fh0>;
}
close $fh0;
my $h = HTTP::Headers->new;
my $p = HTML::HeadParser->new($h);
$p->parse($text);
for ($h->header_field_names) {
my @values = split ',', $h->header($_);
if (/keywords/i) {
$csv1->print ($fh1, \@values);
} elsif (/description/i) {
$csv1->print ($fh1, \@values);
} elsif (/title/i) {
$csv1->print ($fh1, \@values);
}
}
}
It's because File::Find is doing a chdir as it runs. You should pass $_ instead of $File::Find::name. Or set no_chdir:
no_chdir
Does not chdir() to each directory as it recurses. The wanted() function will need to be aware of this, of course. In this case, $_ will be the same as $File::Find::name .
Because you are specifying a relative path, $File::Find::name is also a relative path. You can avoid this by specifying a full path to find as well. (e.g. /full/path/to/dir)
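
A rough sketch of the no_chdir variant, using a throwaway temporary directory with one made-up file in place of the Test directory:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw( tempdir );

# Hypothetical throwaway directory standing in for Test/.
my $dir = tempdir( CLEANUP => 1 );
open my $fh, '>', "$dir/1.html" or die "create: $!";
print {$fh} "<html></html>\n";
close $fh;

my @opened;
find(
    {
        no_chdir => 1,    # find() stays in the starting directory
        wanted   => sub {
            return unless /\.html?$/;

            # With no_chdir set, $_ is the same as $File::Find::name.
            open my $in, '<', $File::Find::name
                or die "Error opening $File::Find::name: $!";
            push @opened, $File::Find::name;
            close $in;
        },
    },
    $dir
);

print "opened: @opened\n";
```

Without no_chdir, the equivalent fix is to open $_ inside the wanted function, because the current directory is then the one that contains the file.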

Remove trailing commas at the end of the string using Perl

I'm parsing a CSV file in which each line looks something like the one below.
10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403,GO:0005524,GO:0016874,GO:0047747,GO:0004467,GO:0015245,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
There seem to be trailing commas at the end of each line.
I want to get the first term, in this case "10998" and get the number of GO terms related to it.
So my output in this case should be,
Output:
10998,7
But instead it shows 299. I realized overall there are 303 commas in each line. And I'm not able to figure out an easy way to remove trailing commas. Can anyone help me solve this issue?
Thanks!
My Code:
use strict;
use warnings;
open my $IN, '<', 'test.csv' or die "can't find file: $!";
open(CSV, ">GO_MF_counts_Genes.csv") or die "Error!! Cannot create the file: $!\n";
my @genes = ();
my $mf;
foreach my $line (<$IN>) {
chomp $line;
my @array = split(/,/, $line);
my @GO = splice(@array, 4);
my $GO = join(',', @GO);
$mf = count($GO);
print CSV "$array[0],$mf\n";
}
sub count {
my $go = shift @_;
my $count = my @go = split(/,/, $go);
return $count;
}
I'd use juanrpozo's solution for counting but if you still want to go your way, then remove the commas with regex substitution.
$line =~ s/,+$//;
I suggest this more concise way of coding your program.
Note that the line my @data = split /,/, $line discards trailing empty fields (@data has only 11 fields with your sample data) so will produce the same result whether or not trailing commas are removed beforehand.
use strict;
use warnings;
open my $in, '<', 'test.csv' or die "Cannot open file for input: $!";
open my $out, '>', 'GO_MF_counts_Genes.csv' or die "Cannot open file for output: $!";
foreach my $line (<$in>) {
chomp $line;
my @data = split /,/, $line;
printf $out "%s,%d\n", $data[0], scalar grep /^GO:/, @data;
}
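
The split behavior mentioned above can be checked with a shortened, made-up line. This sketch compares the default call with a negative limit, which keeps trailing empty fields:

```perl
use strict;
use warnings;

# Shortened sample line: six real fields followed by five trailing commas.
my $line = '10998,4499,SLC27A5,Q9Y2P5,GO:0000166,GO:0032403' . ( ',' x 5 );

my @default = split /,/, $line;        # trailing empty fields are dropped
my @keep    = split /,/, $line, -1;    # a negative limit keeps them

# grep in scalar context returns the number of matching elements.
my $go_count = grep { /^GO:/ } @default;

printf "default: %d fields, limit -1: %d fields, GO terms: %d\n",
    scalar @default, scalar @keep, $go_count;
```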
You can apply grep to @array
my $mf = grep { /^GO:/ } @array;
assuming $array[0] never matches /^GO:/
For each line:
foreach my $line (<$IN>) {
my ($first_term) = ($line =~ /(\d+),/);
my @tmp = split('GO', " $line ");
my $nr_of_GOs = @tmp - 1;
print CSV "$first_term,$nr_of_GOs\n";
}