Using Text::CSV on a String Containing Quotes - csv

I have pored over this site (and others) trying to glean the answer for this but have been unsuccessful.
use Text::CSV;
my $csv = Text::CSV->new ( { binary => 1, auto_diag => 1 } );
$line = q(data="a=1,b=2",c=3);
my $csvParse = $csv->parse($line);
my @fields = $csv->fields();
for my $field (@fields) {
print "FIELD ==> $field\n";
}
Here's the output:
# CSV_XS ERROR: 2034 - EIF - Loose unescaped quote @ rec 0 pos 6 field 1
FIELD ==>
I am expecting 2 array elements:
data="a=1,b=2"
c=3
What am I missing?

You may get away with using Text::ParseWords. Since you are not using real csv, it may be fine. Example:
use strict;
use warnings;
use Data::Dumper;
use Text::ParseWords;
my $line = q(data="a=1,b=2",c=3);
my @fields = quotewords(',', 1, $line);
print Dumper \@fields;
This will print
$VAR1 = [
'data="a=1,b=2"',
'c=3'
];
As you requested. You may want to test further on your data.

Your input data isn't "standard" CSV, at least not the kind that Text::CSV expects and not the kind that things like Excel produce. An entire field has to be quoted or not at all. The "standard" encoding of that would be "data=""a=1,b=2""",c=3 (which you can see by asking Text::CSV to print your expected data using say).
If you pass the allow_loose_quotes option to the Text::CSV constructor, it won't error on your input, but it won't consider the quotes to be "protecting" the comma, so you will get three fields: data="a=1, then b=2", then c=3.
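As a rough sketch of the point about the "standard" encoding (assuming Text::CSV is installed; the variable names are only illustrative), combine() and string() show how the module itself would write the two expected fields, and parse() reads that form back:

```perl
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

# The two fields the question expects to get back.
my @expected = ('data="a=1,b=2"', 'c=3');

# combine() builds one CSV record from the fields; string() returns it.
$csv->combine(@expected) or die "combine() failed";
my $encoded = $csv->string;
print "$encoded\n";    # "data=""a=1,b=2""",c=3

# Parsing the standard form gives the two fields back unchanged.
$csv->parse($encoded) or die "parse() failed";
my @fields = $csv->fields;
print "FIELD ==> $_\n" for @fields;
```

The whole first field is wrapped in quotes and each embedded quote is doubled, which is the form the parser expects.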

Related

Problem accessing hash imported from CSV in Perl

I am working with the Text::CSV library of Perl to import data from a CSV file, using the functional interface. The data is stored in an array of hashes, and the problem is that when the script tries to access those elements/keys, they are uninitialized (or undefined).
Using Data::Dumper, it is possible to see that the array and the hashes are not empty; in fact, they are correctly filled with the data from the CSV file.
With this small piece of code, I get the following output:
my $array = csv(
in => $csv_file,
headers => 'auto');
foreach my $mem (@{$array}) {
print Dumper $mem;
foreach (keys $mem) {
print $mem{$_};
}
}
Last part of the output:
$VAR1 = {
'Column' => '16',
'Width' => '13',
'Type' => 'RAM',
'Depth' => '4096'
};
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
Use of uninitialized value in print at ** line 81.
This happens with all the elements of the array. Is this problem related to the encoding, or am I simply accessing the elements in an incorrect way?
$mem is a reference to a hash, but you keep trying to use it directly as a hash. Change your code to:
foreach (keys %$mem) {
print $mem->{$_};
}
There is a slight complication: some versions of Perl (5.14 through 5.22) allowed 'keys $mem' directly as an experimental feature, which was removed in 5.24. In any case, adding
use warnings;
use strict;
would likely have given you some helpful clues as to what was happening.
When I run your code on my version of Perl (5.24), I get this error:
Experimental keys on scalar is now forbidden at ... line ...
This points to the line:
foreach (keys $mem) {
You should dereference the hash ref:
use warnings;
use strict;
use Data::Dumper;
use Text::CSV qw( csv );
my $csv_file="data.csv";
my $array = csv(
in => $csv_file,
headers => 'auto');
foreach my $mem (@{$array}) {
print Dumper($mem);
foreach (keys %{ $mem }) {
print $mem->{$_}, "\n";
}
}
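Here is a self-contained sketch of the same fix without the CSV file. The row below is a hypothetical stand-in for one element of what csv() returns, copied from the Dumper output above. It shows that $mem->{...} reads from the referenced hash, while $mem{...} would refer to an unrelated hash variable %mem:

```perl
use strict;
use warnings;

# Hypothetical stand-in for one row returned by csv(): a reference to a hash.
my $mem = { Column => '16', Width => '13', Type => 'RAM', Depth => '4096' };

# Dereference with %$mem (or %{ $mem }) to get the keys,
# and use -> to read a value through the reference.
my @lines;
foreach my $key (sort keys %$mem) {
    push @lines, "$key => $mem->{$key}";
}
print "$_\n" for @lines;

# Under "use strict", writing $mem{$key} instead would not compile,
# because no hash named %mem has been declared.
```

Without strict, $mem{$key} silently looks in an empty global %mem, which is where the "uninitialized value" warnings come from.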

Putting comma-separated values into a new file line by line

From the command line, we pass multiple comma-separated values such as sydney,delhi,NY,Russia as an option. These values are stored in $runTest in the Perl script. Now I want the script to create a new file containing the contents of $runTest, one value per line. For example:
INPUT (passed values from command line):
sydney,delhi,NY,Russia
OUTPUT (under new file: myfile):
sydney
delhi
NY
Russia
For this simple example, split on the delimiter is a better fit than tr. A few minor points: use snake_case for names instead of CamelCase, and use autodie to make open, close, etc., fatal, without the need to clutter the code with or die "...":
use autodie;
my $run_test = 'sydney,delhi,NY,Russia';
open my $out, '>', 'myFile';
print {$out} map { "$_\n" } split /,/, $run_test;
close $out;
For more robust parsing in general, beyond this simple example, prefer specialized modules, such as Text::CSV or Text::CSV_XS for csv parsing. Compared to the overly simplistic split, Text::CSV_XS enables correct input/output of quoted fields, fields containing the delimiter (comma), binary characters, provides error messages and more. Example:
use Text::CSV_XS;
use autodie;
open my $out, q{>}, q{myFile};
# All of these input strings are handled correctly, unlike when using "split"
# (the last one has an unterminated quote, so parse() fails and reports it):
# my $run_test = q{sydney,delhi,NY,Russia};
# my $run_test = q{sydney,delhi,NY,Russia,"field,with,commas"};
my $run_test = q{sydney,delhi,NY,Russia,"field,with,commas","field,with,missing,quote};
# binary => 1 : enable parsing binary characters in quoted fields.
# auto_diag => 1 : print the internal error code and the associated error message to STDERR.
my $csv = Text::CSV_XS->new( { binary => 1, auto_diag => 1 } );
if ( $csv->parse( $run_test ) ) {
print {$out} map { "$_\n" } $csv->fields;
}
else {
print STDERR q{parse() failed on: }, $csv->error_input, qq{\n};
}
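To make the limitation of split concrete, here is a small core-only sketch. It uses parse_line from Text::ParseWords (part of the standard distribution) purely as a lightweight stand-in for a real CSV parser. It is not a replacement for Text::CSV_XS, and the input string is just an illustration:

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);

my $run_test = q{sydney,delhi,"field,with,commas"};

# split knows nothing about quotes, so the quoted field is cut at every comma.
my @naive = split /,/, $run_test;

# parse_line(delimiter, keep, line): with keep = 0 the quotes are removed
# and the commas inside them are preserved.
my @quoted = parse_line(',', 0, $run_test);

print scalar(@naive),  " pieces from split\n";
print scalar(@quoted), " fields from parse_line\n";
print "$_\n" for @quoted;
```

split returns five pieces here, while the quote-aware parser returns the three intended fields.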

How to force a variable to be treated as a String in perl?

I have a JSON object with key-value pairs, and the value of one such pair is 0E10.
The problem is that this value should be a string but this is being treated as a float because of the presence of letter E after a number, hence it is showing 0 whenever I print this value (0*e+10).
Can somebody please help me solve this problem?
I am using perl to pass the JSON and reading it through Javascript. (Solution in any language would be acceptable)
This is what I get when I print the JSON.
KEY1 : 0E10
KEY2 : "XYZ"
You can clearly see that if the value is a string, it is put in quotes ("), but 0E10 is not quoted.
The problem in my case is that I am reading the JSON from an API whose control is beyond my reach. I have a backend service which is written in perl which passes the JSON returned by the API. So whenever I hit a URL, the backend service written in perl is called. This service gets the JSON from the API and return back the JSON to the service which is hitting the URL.
See the difference:
Option A
use strict;
use warnings;
use JSON;
my $value = 12345;
my $hr = { KEY1=> $value, KEY2=> "XYZ" };
my $json = encode_json $hr;
print $json, "\n";
#<-- prints: {"KEY2":"XYZ","KEY1":12345}
Option B: double-quote the $value assigned to KEY1
use strict;
use warnings;
use JSON;
my $value = 12345;
my $hr = { KEY1=> "$value", KEY2=> "XYZ" };
my $json = encode_json $hr;
print $json, "\n";
#<-- prints: {"KEY2":"XYZ","KEY1":"12345"}
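Applied to the value from the question, a minimal sketch (using the core JSON::PP module so it runs without extra installs; the key name comes from the question, everything else is illustrative) suggests that keeping the value as a string preserves the text 0E10, whereas using it as a number collapses it to 0:

```perl
use strict;
use warnings;
use JSON::PP;

# canonical keeps key order stable; with a single key it is only a habit.
my $json = JSON::PP->new->canonical;

my $raw = "0E10";    # the text as it should reach the client

# Stringified value: encoded with quotes, text preserved.
my $as_string = $json->encode({ KEY1 => "$raw" });

# Numeric value: 0E10 means 0 * 10**10, which is just 0.
my $as_number = $json->encode({ KEY1 => 0 + $raw });

print "$as_string\n";    # {"KEY1":"0E10"}
print "$as_number\n";    # {"KEY1":0}
```

This only helps where the Perl side builds the structure itself. It cannot recover the original text once an upstream API has already emitted the value unquoted.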
If you want to generate key: 0E10 (as opposed to key: 0 and key: '0E10'), then you'll have to generate your own JSON. Perl doesn't have a way of storing 0E10 differently than 0E9. (Neither do JavaScript, Java, C, C++, ...)
If you're willing to accept any exponent, you'll probably still have to generate your own JSON. Perl doesn't have a type system, and JSON encoders tend to use integer notation for integers (in the mathematical sense).
I specifically tested that JSON::XS and JSON::PP will use 0 for a zero internally stored as a floating point number.
$ perl -MJSON::XS -MDevel::Peek -E'($_=1.1)-=$_; Dump $_; say encode_json([$_]);'
SV = PVNV(0x8002b7d8) at 0x800720f0
REFCNT = 1
FLAGS = (NOK,pNOK)
IV = 1
NV = 0
PV = 0
[0]
$ perl -MJSON::PP -MDevel::Peek -E'($_=1.1)-=$_; Dump $_; say encode_json([$_]);'
SV = PVNV(0x801602b0) at 0x8008e520
REFCNT = 1
FLAGS = (NOK,pNOK)
IV = 1
NV = 0
PV = 0
[0]
(NOK indicates the scalar contains a value stored as a floating point number.)

Why does the JSON module quote some numbers but not others?

We recently switched to the new JSON2 perl module.
I thought absolutely everything gets returned quoted now.
But I encountered some cases in which a number (250) was returned as an unquoted number in the JSON string created by Perl.
Out of curiosity:
Does anyone know why such cases exist and how the JSON module decides whether to quote a value?
It will be unquoted if it's a number. Without getting too deeply into Perl internals, something is a number if it's a literal number or the result of an arithmetic operation, and it hasn't been stringified since its numeric value was produced.
use feature 'say'; # needed for say
use JSON::XS;
my $json = JSON::XS->new->allow_nonref;
say $json->encode(42); # 42
say $json->encode("42"); # "42"
my $x = 4;
say $json->encode($x); # 4
my $y = "There are $x lights!";
say $json->encode($x); # "4"
$x++; # modifies the numeric value of $x
say $json->encode($x); # 5
Note that printing a number isn't "stringifying it" even though it produces a string representation of the number to output; print $x doesn't cause a number to be a string, but print "$x" does.
Anyway, all of this is a bit weird, but if you want a value to be reliably unquoted in JSON then put 0 + $value into your structure immediately before encoding it, and if you want it to be reliably quoted then use "" . $value or "$value".
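A minimal sketch of that advice, using the core JSON::PP module so it runs without extra installs (the hash keys are made up for illustration):

```perl
use strict;
use warnings;
use JSON::PP;

# canonical sorts the keys so the output order is stable.
my $json = JSON::PP->new->canonical;

my $value = "250";    # arrives as a string, e.g. read from a file

my $out = $json->encode({
    as_number => 0 + $value,     # numeric context: encoded unquoted
    as_string => "" . $value,    # string context: encoded quoted
});
print "$out\n";    # {"as_number":250,"as_string":"250"}
```

Doing the conversion right where the structure is built keeps the result independent of whatever happened to the variable earlier in the program.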
You can force it into a string by doing something like this:
$number_str = '' . $number;
For example:
perl -MJSON -le 'print encode_json({foo=>123, bar=>"".123})'
{"bar":"123","foo":123}
It looks like older versions of JSON have autoconvert functionality that can be set. Did you not have $JSON::AUTOCONVERT set to a true value?

Parsing JSON Data::Dumper output array in Perl

I'm trying to edit an old perl script and I'm a complete beginner. The request from the server returns as:
$VAR1 = [
{
'keywords' => [
'bare knuckle boxing',
'support group',
'dual identity',
'nihilism',
'support',
'rage and hate',
'insomnia',
'boxing',
'underground fighting'
],
}
];
How can I parse this JSON string to grab:
$keywords = "bare knuckle boxing,support group,dual identity,nihilism,support,rage and hate,insomnia,boxing,underground fighting"
Full perl code
#!/usr/bin/perl
use LWP::Simple; # From CPAN
use JSON qw( decode_json ); # From CPAN
use Data::Dumper; # Perl core module
use strict; # Good practice
use warnings; # Good practice
use WWW::TheMovieDB::Search;
use utf8::all;
use Encode;
use JSON::Parse 'json_to_perl';
use JSON::Any;
use JSON;
my $api = new WWW::TheMovieDB::Search('APIKEY');
my $img = $api->type('json');
$img = $api->Movie_imdbLookup('tt0137523');
my $decoded_json = decode_json( encode("utf8", $img) );
print Dumper $decoded_json;
Thanks.
Based on comments and on your recent edit, I would say that what you are asking is how to navigate a perl data structure, contained in the variable $decoded_json.
my $keywords = join ",", @{ $decoded_json->[0]{'keywords'} };
say qq{ @{ $arrayref->[0]->{'keywords'} } };
As TLP pointed out, all you've shown is a combination of perl arrays/hashes. But you should look at the JSON.pm documentation, if you have a JSON string.
The result you present is similar to JSON, but it is the Perl variant of it (i.e. => instead of :, etc.). I don't think you need to look into the JSON part of it, as you already have the data. You just need to use Perl to join the data into a text string.
Just to elaborate on vol7ron's solution:
#get a reference to the list of keywords
my $keywords_list = $decoded_json->[0]{'keywords'};
#merge this list with commas
my $keywords = join(',', @$keywords_list );
print $keywords;
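For completeness, here is a self-contained sketch of the whole path from JSON text to the joined string. The response body is a hypothetical, shortened stand-in for what the API returns, and the core JSON::PP module is used instead of JSON only so the example runs without extra installs:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical, shortened response body; the real one comes from the API.
my $response = '[{"keywords":["boxing","insomnia","support group"]}]';

my $decoded_json = decode_json($response);

# The top level is an array ref, its first element is a hash ref,
# and 'keywords' holds an array ref of strings.
my $keywords = join ',', @{ $decoded_json->[0]{keywords} };
print "$keywords\n";    # boxing,insomnia,support group
```

Once decode_json has run there is no JSON left to parse, only ordinary Perl references to walk.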