I need to convert a JSON file into a readable CSV

I got this Perl code that is supposed to read my categories and put them into a CSV file. After many tries I finally got it working, but it only reads 50 of my more than 500 categories. Is there any way to modify this routine to read all my categories?
Here is the Perl file I got from the Bigcommerce forum.
use strict;
use JSON::PP;
open (my $fh, "<", 'categories.json');
my $json_text = <$fh>;
my $perl_scalar = decode_json($json_text);
# Make a list of ids to names, so that I can build a content path for Neto category CSV
my $id;
foreach my $element (@$perl_scalar)
{
$id->{$element->{id}}=$element->{name};
}
# Actually print out the CSV content, in Neto's required format.
print "content type,content path,name,description 1,description 2,sort order,seo meta description,seo page title,seo meta keywords\n";
foreach my $element (@$perl_scalar)
{
print "Product Category,";
my $parent_category = $element->{parent_category_list}[0];
if ($parent_category == $element->{id})
{
print ",";
}
else
{
print $id->{$parent_category}, ",";
}
print $element->{name}, ",", $element->{description}, ",,", $element->{sort_order}, ",", $element->{meta_description}, ",,\n";
}
Thanks in advance

There is a pretty fundamental problem with mapping JSON to CSV: JSON is a nested data structure, whereas CSV is not. You will therefore always have to do some converting. How would you put this into columns:
{
"data2" : {
"fish" : "paste"
},
"data" : [
{
"somesub" : "somethingelse"
},
{
"somesub" : "anotherthing"
}
]
}
This won't turn into a flat data structure like CSV easily.
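One way to force such a structure flat, at the cost of one column per path, is to turn every nested value into a dotted key. The following is only a sketch under that assumption; the helper name flatten and the dotted-path convention are made up for illustration and are not part of JSON::PP:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: recursively flatten nested hashes and arrays into a
# single-level hash whose keys are dotted paths such as "data.0.somesub".
sub flatten {
    my ( $node, $prefix, $out ) = @_;
    $out //= {};
    if ( ref $node eq 'HASH' ) {
        for my $key ( sort keys %$node ) {
            flatten( $node->{$key}, defined $prefix ? "$prefix.$key" : $key, $out );
        }
    }
    elsif ( ref $node eq 'ARRAY' ) {
        for my $i ( 0 .. $#$node ) {
            flatten( $node->[$i], defined $prefix ? "$prefix.$i" : $i, $out );
        }
    }
    else {
        # Leaf value: store it under the path collected so far
        $out->{ $prefix // '' } = $node;
    }
    return $out;
}

my $nested = decode_json(
    '{"data2":{"fish":"paste"},"data":[{"somesub":"somethingelse"},{"somesub":"anotherthing"}]}'
);
my $flat = flatten($nested);
print "$_ = $flat->{$_}\n" for sort keys %$flat;
```

Whether a column called data.1.somesub is useful in a spreadsheet is another matter, which is why the mapping usually has to be designed per file.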
If you have some trivial JSON to convert, it's not too hard, but it depends entirely on the structure of your JSON file and how you want to map things.
For a trivial example:
use strict;
use warnings;
use JSON;
use Data::Dumper;
local $/;
my $data = from_json(<DATA>);
print Dumper $data;
my @columns = qw ( col1 col2 col3 );
print join( ",", "key", @columns ), "\n";
foreach my $key ( sort keys %$data ) {
print join( ",", $key, @{ $data->{$key} }{@columns} ), "\n";
}
__DATA__
{
"1" :
{
"col1" : "value1",
"col2" : "value2",
"col3" : "value3"
},
"2" : {
"col1" : "value4",
"col2" : "value5",
"col3" : "value6"
}
}
For a more complex example it may be appropriate to use Text::CSV, depending on what is in your JSON content: the simplistic join approach above doesn't cope with line feeds, embedded quotes or commas within the text. So it might be better to use Text::CSV:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON;
use Text::CSV;
use Data::Dumper;
local $/;
my $data = from_json ( <DATA> );
print Dumper $data;
my $csv = Text::CSV -> new ( { 'binary' => 1 } );
my @columns = qw ( col1 col2 col3 );
$csv -> column_names ( @columns );
foreach my $key ( sort keys %$data ) {
$csv -> print_hr ( \*STDOUT, $data->{$key} );
print "\n";
}
foreach my $key ( sort keys %$data ) {
my $row = [ $key, @{$data->{$key}}{@columns} ];
$csv -> print ( \*STDOUT, $row );
print "\n";
}
This uses the same __DATA__ block as above and prints the data twice: once using column names with print_hr, which works provided you don't want to preserve the "key" field, and once by assembling an array reference to print.
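To illustrate what the plain join approach misses, here is a rough core-Perl sketch of the usual CSV quoting rule: a field containing a comma, a double quote or a line break is wrapped in double quotes, and embedded double quotes are doubled. The helper names csv_field and csv_line are invented for this example; for real data Text::CSV remains the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: quote a single field if it contains a comma,
# a double quote, or a line break; embedded quotes are doubled.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    if ( $value =~ /[",\r\n]/ ) {
        $value =~ s/"/""/g;
        return qq("$value");
    }
    return $value;
}

# Hypothetical helper: build one CSV line (without newline) from a list of fields.
sub csv_line {
    return join( ',', map { csv_field($_) } @_ );
}

print csv_line( 'Shoes', 'Red, blue and "green"', "two\nlines" ), "\n";
```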

Related

Want to read JSON file strings (multiple values) in perl

Following is my json file (demo.json)
[
{
"Directory" : "/opt/gehc/abc/root/mr_csd/xxx/diag/",
"Files" : [ "abc.html","xyz.html",
"mnp.html"],
"Permission" : 555
}
]
I want to read each of the files listed in "Files" one by one, which lie in "Directory", and change their permissions to the value in "Permission".
Here is the code I have started with. Please help:
#!/usr/bin/perl
use JSON;
my $filename = 'demo.json';
my $data;
{
local $/ = undef;
open my $fh, '<', $filename;
$data = <$fh>;
close $fh;
}
my $result = decode_json( $data );
for my $report ( @{$result} ) {
Using your own code, you can easily de-reference the json-structure to simpler structures:
my @files = @{ $result->[0]{Files} };
my $perm = $result->[0]{Permission};
print Dumper \@files, $perm;
Which will print:
$VAR1 = [
'abc.html',
'xyz.html',
'mnp.html'
];
$VAR2 = 555;
Then you can loop over the files with a simple for loop:
for my $file (@files) { ...
Then chmod the files as necessary, though you may have to convert the number 555 to octal first (for example with oct), as described in the chmod documentation.
And if you have several levels of this array that you call "reports", you can loop over them like so:
for my $report (@{ $result }) {
my @files = @{ $report->{Files} };
my $perm = $report->{Permission};
print Dumper \@files, $perm;
}
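Putting the loop and the chmod together might look roughly like this. It is a sketch, not a drop-in solution: the sub name apply_permissions is made up, and the demo works in a temporary directory instead of the real "Directory" from demo.json. The important detail is oct, which reinterprets the decimal-looking 555 from the JSON as the octal mode 0555.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempdir);

# Hypothetical helper: apply each record's "Permission" to every file it lists.
# Returns the number of files that chmod reported as changed.
sub apply_permissions {
    my ($records) = @_;
    my $changed = 0;
    for my $report (@$records) {
        # JSON gives us the number 555; oct() reads those digits as octal (0555)
        my $mode = oct( $report->{Permission} );
        for my $file ( @{ $report->{Files} } ) {
            my $path = "$report->{Directory}/$file";
            $changed += chmod $mode, $path;
        }
    }
    return $changed;
}

# Demo in a throwaway directory so nothing real is touched
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(abc.html xyz.html)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}
my $records = decode_json('[{"Directory":"","Files":["abc.html","xyz.html"],"Permission":555}]');
$records->[0]{Directory} = $dir;
print apply_permissions($records), " file(s) changed\n";
```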

How to convert a JSON file to Delimited File using perl.

#/usr/lib/perl
use lib qw(..);
use JSON qw( );
open json_fh, "<$ARGV[0]" or die "Couldn't open file $ARGV[0]!\n";
open csv_fh, ">$ARGV[1]" or die "Couldn't open file $ARGV[1]!\n";
@json_text = <json_fh>;
close json_fh;
foreach $json_text ( @json_text )
{
chomp $json_text;
$json = JSON->new;
$data = $json->decode($json_text);
$id=$data->{_id};
@lines=@{$data->{accounts}};
foreach $line ( @lines )
{
$accountNumber = $line->{accountNumber};
$accountType = $line->{accountType};
$cardType = $line->{cardType};
$cardSubType = $line->{cardSubType};
$protectionMethod = $line->{protectionMethod};
$protectionSource = $line->{protectionSource};
$expirationDate = $line->{expirationDate};
$nameOnAccount = $line->{nameOnAccount};
$cardStatus = $line->{cardStatus};
$cardHolderType = $line->{cardHolderType};
$createdBy = $line->{createdBy};
$addressId = $line->{addressId};
$productType = $line->{productType};
$isDefaultAccount = $line->{isDefaultAccount};
#Write to the file in delimited file format
print csv_fh "$id|$accountNumber|$accountType|$cardType|$cardSubType|$protectionMethod|$protectionSource|$expirationDate|$nameOnAccount|$cardStatus|$cardHolderType|$createdBy|$addressId|$productType|$isDefaultAccount\n";
}
}
close csv_fh;
This is a Perl script that I created to convert the JSON file to a delimited file when the element names are known.
Could anyone please help me modify the code so that the conversion also works when the element names are unknown?
Assuming that every account has the same fields (it makes no sense otherwise), you can use the following:
# Note: say needs "use feature 'say';" (or "use v5.10;") at the top of the script
my $json_parser = JSON->new;
my @headers;
for my $json_doc (@json_docs) {
my $data = $json_parser->decode($json_doc);
my $id = $data->{_id};
for my $account (@{ $data->{accounts} }) {
if (!@headers) {
@headers = sort keys %$account;
say join "|", 'id', @headers;
}
say join "|", $id, @$account{@headers};
}
}
You didn't provide an example input file, so I'm guessing it is something like this:
{ "accounts": [ { "_id": "1", "accountNumber": "99999", "accountType": "acctTypeA", "cardType": "cardTypeA", "cardSubType": "cardSubTypeA", "protectionMethod": "protectionMethodA", "protectionSource": "protectionSourceA", "expirationDate": "2020-09", "nameOnAccount": "First Last", "cardStatus": "OK", "cardHolderType": "CHTypeA", "createdBy": "userX", "addressId": "444", "productType": "prodTypeA", "isDefaultAccount": "1", "optional": "OptA" } ] }
You're pretty close, but usually the entire file is a single JSON record, so you don't loop line by line. Instead you build one data structure (a hashref) that represents the entire file, i.e. you only need to call $json->decode once per file.
Additionally, I'd suggest some checks to validate the input, such as for missing fields; the script below dies with an error message if any required field is missing.
#!/usr/bin/env perl
use strict;
use lib qw(..);
use JSON qw( );
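@ARGV == 2 or die("Infile, Outfile required\n");
open json_fh, "<$ARGV[0]" or die "Couldn't open file $ARGV[0]!\n";
open csv_fh, ">$ARGV[1]" or die "Couldn't open file $ARGV[1]!\n";
# Slurp the whole file; a plain <json_fh> in scalar context would read only the first line
my $json_text = do { local $/; <json_fh> };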
close json_fh;
my $json = JSON->new->allow_nonref;
my $data = $json->decode($json_text);
my $accounts = $data->{accounts};
my @required = qw(_id accountNumber accountType cardType cardSubType protectionMethod protectionSource expirationDate nameOnAccount cardStatus cardHolderType createdBy addressId productType isDefaultAccount);
my @opt = (); # learn these
my %col; # key => column index
my $lastIndex;
for (my $i=0; $i<=$#required; ++$i) { $lastIndex = $col{$required[$i]} = $i }
print "There are ", $lastIndex+1, " required cols\n";
foreach my $rec ( @$accounts )
{
my @row;
foreach my $key ( keys %$rec )
{
if ( ! exists($col{$key}) ) {
# new (optional) key
push @opt, $key;
$col{$key} = ++$lastIndex;
print "New col: $key (col ", $lastIndex+1, ")\n";
}
$row[$col{$key}] = $rec->{$key};
}
# check for all required
for (my $i=0; $i<=$#required; ++$i) {
defined($row[$i]) or die("Missing: $required[$i]\n");
}
#Write to the file in delimited file format
print csv_fh join("|", @row), "\n";
}
close csv_fh;
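If the accounts may carry different optional fields, learning columns on the fly means earlier rows can end up with fewer columns than later ones. One possible alternative, sketched here under the assumption that the whole file fits in memory, is to make two passes: first collect the union of all keys, then print every row against that fixed column list. The sub name accounts_to_lines and the tiny sample data are invented for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: return delimited lines (header first) for a list of
# account hashes, using the union of all keys as the column set.
sub accounts_to_lines {
    my ( $accounts, $sep ) = @_;
    $sep //= '|';

    # Pass 1: collect every key that appears in any account
    my %seen;
    $seen{$_}++ for map { keys %$_ } @$accounts;
    my @headers = sort keys %seen;

    # Pass 2: emit one line per account, with empty strings for missing fields
    my @lines = ( join $sep, @headers );
    for my $account (@$accounts) {
        push @lines, join $sep, map { $account->{$_} // '' } @headers;
    }
    return @lines;
}

my $data = decode_json(
    '{"accounts":[{"accountNumber":"99999","cardType":"A"},{"accountNumber":"11111","optional":"OptA"}]}'
);
print "$_\n" for accounts_to_lines( $data->{accounts} );
```

Sorting the keys is just one way to get a stable column order; a fixed list of required columns followed by the sorted optional ones would work as well.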

Corrupted JSON encoding in Perl (missing comma)

My custom Perl code produces the following invalid JSON, with a missing comma between blocks:
{
"data": [{
"{#LOGFILEPATH}": "/tmp/QRZ2007.tcserverlogs",
"{#LOGFILE}": "QRZ2007"
} **missing comma** {
"{#LOGFILE}": "ARZ2007",
"{#LOGFILEPATH}": "/tmp/ARZ2007.tcserverlogs"
}]
}
My terrible code:
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;
use utf8;
use JSON;
binmode STDOUT, ":utf8";
my $dir = $ARGV[0];
my $json = JSON->new->utf8->space_after;
opendir(DIR, $dir) or die $!;
print '{"data": [';
while (my $file = readdir(DIR)) {
next unless (-f "$dir/$file");
next unless ($file =~ m/\.tcserverlogs$/);
my $fullPath = "$dir/$file";
my $filenameshort = basename($file, ".tcserverlogs");
my $data_to_json = {"{#LOGFILEPATH}"=>$fullPath,"{#LOGFILE}"=>$filenameshort};
print $json->encode($data_to_json);
}
print ']}'."\n";
closedir(DIR);
exit 0;
Dear team, I am not a programmer. Does anyone have an idea how to fix this? Thank you!
If you do not print a comma, you will not get a comma.
You are trying to build your own JSON string from pre-encoded pieces of smaller data structures. That will not work unless you tell Perl when to put commas. You could do that, but it's easier to just collect all the data into a Perl data structure that is equivalent to the JSON string you want to produce, and encode the whole thing in one go when you're done.
my $dir = $ARGV[0];
my $json = JSON->new->utf8->space_after;
my @data;
opendir( DIR, $dir ) or die $!;
while ( my $file = readdir(DIR) ) {
next unless ( -f "$dir/$file" );
next unless ( $file =~ m/\.tcserverlogs$/ );
my $fullPath = "$dir/$file";
my $filenameshort = basename( $file, ".tcserverlogs" );
my $data_to_json = { "{#LOGFILEPATH}" => $fullPath, "{#LOGFILE}" => $filenameshort };
push @data, $data_to_json;
}
closedir(DIR);
print $json->encode( { data => \@data } );
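For comparison, here is a small sketch of both routes using the core JSON::PP module: gluing pre-encoded pieces together with explicit commas, and encoding the whole structure at once. The sub names encode_in_pieces and encode_whole are made up for this example, and canonical is switched on only so that the key order is predictable.

```perl
use strict;
use warnings;
use JSON::PP;

my $json = JSON::PP->new->canonical;

# Route 1: encode each item separately and supply the commas ourselves
sub encode_in_pieces {
    my (@items) = @_;
    return '{"data":[' . join( ',', map { $json->encode($_) } @items ) . ']}';
}

# Route 2: build the full Perl structure and encode it in one go
sub encode_whole {
    my (@items) = @_;
    return $json->encode( { data => \@items } );
}

my @items = (
    { '{#LOGFILE}' => 'QRZ2007', '{#LOGFILEPATH}' => '/tmp/QRZ2007.tcserverlogs' },
    { '{#LOGFILE}' => 'ARZ2007', '{#LOGFILEPATH}' => '/tmp/ARZ2007.tcserverlogs' },
);
print encode_in_pieces(@items), "\n";
print encode_whole(@items),     "\n";
```

Both calls should produce the same text here, but the second route leaves all the punctuation to the encoder, so there is no comma to forget.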

Accessing nested JSON elements in Perl

I get an error when attempting to access the contents of my JSON array.
Here are the contents of my JSON array, assets.json:
[{"id":1002,"interfaces":[{"ip_addresses":[{"value":"172.16.77.239"}]}]},{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]}]
Here is my code
#!/usr/bin/perl
use strict;
use warnings;
use JSON::XS;
use File::Slurp;
my $json_source = "assets.json";
my $json = read_file( $json_source ) ;
my $json_array = decode_json $json;
foreach my $item( @$json_array ) {
print $item->{id};
print "\n";
print $item->{interfaces}->{ip_addresses}->{value};
print "\n\n";
}
I get the expected output for $item->{id}, but when accessing the nested element I get the error "Not a HASH reference".
Data::Dumper is your friend here:
Trying this:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON::XS;
use Data::Dumper;
$Data::Dumper::Indent = 1;
$Data::Dumper::Terse = 1;
my $json_array = decode_json ( do { local $/; <DATA> } );
print Dumper $json_array;
__DATA__
[{"id":1002,"interfaces":[{"ip_addresses":[{"value":"172.16.77.239"}]}]},{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]}]
Gives:
[
{
'interfaces' => [
{
'ip_addresses' => [
{
'value' => '172.16.77.239'
}
]
}
],
'id' => 1002
},
{
'interfaces' => [
{
'ip_addresses' => [
{
'value' => '192.168.0.2'
}
]
}
],
'id' => 1003
}
]
An important point to note: you have nested arrays ([] denotes an array, {} a hash).
So you can extract your thing with:
print $item->{interfaces}->[0]->{ip_addresses}->[0]->{value};
Or as friedo notes:
Note that you may omit the -> operator after the first one, so $item->{interfaces}[0]{ip_addresses}[0]{value} will also work.
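The [0] indexes above only reach the first interface and its first address. If an asset can have several of either, a nested loop over both arrays is one option. The following sketch uses the core JSON::PP module in place of JSON::XS, and the helper name ip_values is made up for illustration:

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: collect every ip_addresses value for one asset,
# across all of its interfaces. Missing arrays are treated as empty.
sub ip_values {
    my ($item) = @_;
    my @values;
    for my $interface ( @{ $item->{interfaces} // [] } ) {
        for my $address ( @{ $interface->{ip_addresses} // [] } ) {
            push @values, $address->{value};
        }
    }
    return @values;
}

my $assets = decode_json(
    '[{"id":1002,"interfaces":[{"ip_addresses":[{"value":"172.16.77.239"}]}]},'
  . '{"id":1003,"interfaces":[{"ip_addresses":[{"value":"192.168.0.2"}]}]}]'
);
for my $item (@$assets) {
    print $item->{id}, ": ", join( ", ", ip_values($item) ), "\n";
}
```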

Getting links from an HTML table using HTML::TableExtract and HTML::LinkExtor in Perl

My goal is to extract the links from the tables titled "Agonists," "Antagonists," and "Allosteric Regulators" in the following site:
http://www.iuphar-db.org/DATABASE/ObjectDisplayForward?objectId=1&familyId=1
I've been using HTML::TableExtract to extract the tables but have been unable to get HTML::LinkExtor to retrieve the links in question. Here is the code I have so far:
use warnings;
use strict;
use HTML::TableExtract;
use HTML::LinkExtor;
my @names = `ls /home/wallakin/LINDA/ligands/iuphar/data/html2/`;
foreach (@names)
{
chomp ($_);
my $te = HTML::TableExtract->new( headers => [ "Ligand",
"Sp.",
"Action",
"Affinity",
"Units",
"Reference" ] );
my $le = HTML::LinkExtor->new();
$te->parse_file("/home/wallakin/LINDA/ligands/iuphar/data/html2/$_");
my $output = $_;
$output =~ s/\.html/\.txt/g;
open (RESET, ">/home/wallakin/LINDA/ligands/iuphar/data/links/$output") or die "Can't reset";
close RESET;
#open (DATA, ">>/home/wallakin/LINDA/ligands/iuphar/data/links/$output") or die "Can't append to file";
foreach my $ts ($te->tables)
{
foreach my $row ($ts->rows)
{
$le->parse($row->[0]);
for my $link_tag ( $le->links )
{
my %links = @$link_tag;
print @$link_tag, "\n";
}
}
}
#print "Links extracted from $_\n";
}
I've tried using some sample code from another thread on this site (Perl parse links from HTML Table) to no avail. I'm not sure whether it's a problem of parsing or table recognition. Any help provided would be greatly appreciated. Thanks!
Try this as a base script (you only need to adapt it to fetch links) :
use warnings; use strict;
use HTML::TableExtract;
use HTML::LinkExtor;
use WWW::Mechanize;
use utf8;
binmode(STDIN, ":utf8");
binmode(STDOUT, ":utf8");
binmode(STDERR, ":utf8");
my $m = WWW::Mechanize->new( autocheck => 1, quiet => 0 );
$m->agent_alias("Linux Mozilla");
$m->cookie_jar({});
my $te = HTML::TableExtract->new(
headers => [
"Ligand",
"Sp.",
"Action",
"Affinity",
"Units",
"Reference"
]
);
$te->parse(
$m->get("http://tinyurl.com/jvwov9m")->content
);
foreach my $ts ($te->tables) {
print "Table (", join(',', $ts->coords), "):\n";
foreach my $row ($ts->rows) {
print join(',', @$row), "\n";
}
}
You don't describe what the problem is. What exactly doesn't work? What does $row->[0] contain? Part of the problem might be that HTML::TableExtract returns just the 'visible' text, not the raw HTML, by default. You probably want to use the keep_html option in HTML::TableExtract.