perl decode and encode json preserving order

I have a JSON-encoded chart configuration stored in a text database field, in the form of:
{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]]}
The first number in each of these pairs is a primary key of another table. I'd like to remove those entries with a trigger when the referenced row is deleted. A plperl function would be good, except that it does not preserve the order of the hash keys, and the order is important in this project. What can I do (without changing the format of the JSON-encoded config)? Note: the chart name can contain any characters, so it's hard to do this with a regex.

You need to use a streaming JSON decoder, such as JSON::Streaming::Reader. You could then store your JSON as an array of key/value pairs, instead of a hash.
The actual implementation of how you might do this is highly dependent on the structure of your data, but given the simple example provided, here's a basic implementation.
use strict;
use warnings;
use JSON::Streaming::Reader;
use JSON 'to_json';
my $s = '{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]]}';
my $jsonr = JSON::Streaming::Reader->for_string($s);
my @data;
while (my $token = $jsonr->get_token) {
    my ($key, $value) = @$token;
    if ($key eq 'start_property') {
        push @data, { $value => $jsonr->slurp };
    }
}
print to_json(\@data);
The output of this script is always:
[{"Name":[[1,1],[1,2],[2,1]]},{"Name2":[[3,2]]}]

Well, I managed to solve my problem, but it's not a general solution, so it will probably not help the casual reader. Anyway, I got the order of the keys with the help of the database, by calling my function like this:
SELECT remove_from_chart(
    chart_config,
    array(select * from json_object_keys(chart_config::json)),
    id);
Then I walked through the keys in the order given by the second parameter, put the results into a new tied (Tie::IxHash) hash, and JSON-encoded that.
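For reference, a minimal sketch of that approach (subroutine and variable names are illustrative, and it assumes the key list selected above is passed in as an array reference):
use strict;
use warnings;
use JSON qw(decode_json to_json);
use Tie::IxHash;

sub reorder_config {
    my ($json, $keys) = @_;            # $keys: key order fetched from Postgres
    my $decoded = decode_json($json);  # key order is lost here

    tie my %ordered, 'Tie::IxHash';    # hash that remembers insertion order
    $ordered{$_} = $decoded->{$_} for @$keys;

    return to_json(\%ordered);         # encodes keys in insertion order
}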
It's pretty sad that there is no Perl JSON decoder that can preserve the key order, when everything else I work with, at least on this project, does (PHP, Postgres, Firefox, Chrome).

JSON objects are unordered. You will have to encode the desired order into your data somehow, for example with an explicit order key:
{"Name":[[1,1],[1,2],[2,1]],"Name2":[[3,2]], "__order__":["Name","Name2"]}
[{"Name":[[1,1],[1,2],[2,1]]},{"Name2":[[3,2]]}]

Maybe you want a streaming decoder of JSON data, like a SAX parser. If so, see JSON::Streaming::Reader or JSON::SL.

Related

How to parse a text file to a CSV file using Perl

I am learning Perl and would like to parse a text file into a CSV file. I have a loop that generates the following text file:
# This part is what outputs the text file
for my $row (@$data) {
    while (my ($key, $value) = each %$row) {
        print "${key}=${value}, ";
    }
    print "\n";
}
Text File Output:
name=Mary, id=231, age=38, weight=130, height=5.05, speed=26.233, time=30,
time=25, name=Jose, age=30, id=638, weight=150, height=6.05, speed=20.233,
age=40, weight=130, name=Mark, id=369, speed=40.555, height=5.07, time=30
CSV File Desired Output:
name,age,weight,height,speed,time
Mary,38,130,5.05,26.233,30,
Jose,30,150,6.05,20.233,25,
Mark,40,130,5.04,40.555,30
Any good feedback is welcome!
The key part here is how to manipulate your data so as to extract what needs to be printed for each line. Then you are best off using a module to produce valid CSV, and Text::CSV is very good.
A program using an array of small hashrefs, mimicking the data in the question:
use strict;
use warnings;
use feature 'say';
use Text::CSV;
my @data = (
    { name => 'A', age => 1, weight => 10 },
    { name => 'B', age => 2, weight => 20 },
);

my $csv = Text::CSV->new({ binary => 1, auto_diag => 2 });

my $outfile = 'test.csv';
open my $ofh, '>', $outfile or die "Can't open $outfile: $!";

# Header, also used below for order of values for fields
my @hdr = qw(name age weight);
$csv->say($ofh, \@hdr);

foreach my $href (@data) {
    $csv->say($ofh, [ @{$href}{@hdr} ]);
}
The values from the hashrefs, in a desired order, are extracted using a hash slice @{$href}{@hdr}, which in general is
@{ expression returning hash reference }{ list of keys }
This returns a list of values for the given list of keys, from the hashref that the expression in the block {} must return. That is then used to build an arrayref (an anonymous array here, using []), which is what the module's say method needs in order to make and print a string of comma-separated values† from that list of values.
Note the block that evaluates to a hash reference, used in place of the hash name that would normally be used for a hash slice. This follows the general rule that
Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type.
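A tiny illustration of that rule, with hypothetical data (not from the program above):
use strict;
use warnings;

my %h    = ( name => 'A', age => 1, weight => 10 );
my $href = \%h;

my @by_name = @h{ qw(name age) };           # slice of a named hash
my @by_ref  = @{ $href }{ qw(name age) };   # same slice, via a block returning the hashref

print "@by_name\n";   # A 1
print "@by_ref\n";    # A 1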
Some further comments
Look over the supported constructor's attributes; there are many goodies
For very simple data you can simply join fields with a comma and print
say $ofh join ',', @{$href}{@hdr};
But it is far safer to use a module to construct a valid CSV record. With the right choice of attributes in the constructor it can handle whatever is legal to embed in fields (some of which can take quite a bit of work to do correctly by hand), and it catches things which aren't.
I list column names explicitly. Instead, you can fetch the keys and then sort in a desired order, but this will again need a hard-coded list for sorting
The program creates the file test.csv and prints to it the expected header and data lines.
† But separating those "values" with commas may involve a whole lot more than merely what the acronym "CSV" stands for. A variety of things may come between those commas, including commas, newlines, and whatnot. This is why one is best advised to always use a library. Looking over the constructor's options is informative.
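For instance, a field containing an embedded comma and double quote is something the module quotes and escapes for you. A small sketch, using the same Text::CSV setup as above:
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1, auto_diag => 2 });

# a field with an embedded comma and double quote gets quoted, and the quote doubled
$csv->combine('Mary', 'likes "pie", cake', 38);
print $csv->string, "\n";    # Mary,"likes ""pie"", cake",38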
The following commentary referred to the initial question. In the meantime the problems it addresses were corrected in the OP's code and the question was updated. I'm leaving this text in for some general comments that may be useful.
As for the code in the question and its output, there is almost certainly an issue with how the data is processed to produce @data, judging by the presence of keys HASH(address) in the output.
That string HASH(0x...) is output when one prints a variable which is a hash reference (which cannot show any of the hash's content). Perl handles such a print by stringifying the reference in that way (producing a printable string out of something which is more complex).
There is no good reason to have a hash reference for a hash key. So I'd suggest that you review your data and its processing and see how that comes about. (Or briefly show this, or post another question with it if it isn't feasible to add that to this one.)
One measure you can use to bypass that is to only use a list of keys that you know are valid, like I show above; however, then you may be leaving some outright error unhandled. So I'd rather suggest finding out what is wrong.
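For illustration, a small sketch of how a hash reference ends up stringified into a HASH(0x...) key (hypothetical data, not the OP's):
use strict;
use warnings;

my $href = { a => 1 };
print "$href\n";               # prints something like HASH(0x...)

my %h = ( $href => 'oops' );   # the reference is stringified into a plain string key
print "$_\n" for keys %h;      # prints the same HASH(0x...) string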

How to parse a Lua table object into JSON?

I was wondering if there was a way to parse a Lua table into a JavaScript object without using any libraries, i.e. require("json"). I haven't seen one yet, but if someone knows how, please answer.
If you want to know how to parse Lua tables to JSON strings, take a look at the source code of any of the many JSON libraries available for Lua.
http://lua-users.org/wiki/JsonModules
For example:
https://github.com/rxi/json.lua/blob/master/json.lua
or
https://github.com/LuaDist/dkjson/blob/master/dkjson.lua
If you do not want to use any library and want to do it with pure Lua code, the most convenient way for me is to use the table.concat function:
local result = {}
for key, value in pairs(tableWithData) do
    -- prepare json key-value pairs and save them in a separate table
    table.insert(result, string.format("\"%s\":%s", key, value))
end
-- get simple json string
result = "{" .. table.concat(result, ",") .. "}"
If your table has nested tables you can do this recursively.
There are a lot of pure-Lua JSON libraries. I even have one myself.
How to include a pure-Lua module in your script without using require():
Download the Lua JSON module (for example, go to my json.lua, right-click on Raw and select Save Link As in the context menu)
Delete the last line return json from this file
Insert the whole file at the beginning of your script
Now you can use local json_as_string = json.encode(your_Lua_table) in your script.

MARKLOGIC: Is it possible to use more than 1 column from a CSV file when generating the URI ID during data ingestion in MarkLogic?

I am quite new to MarkLogic and I am not sure how to best deal with the challenge I have right now.
I have a CSV file exported from a table that will be ingested into a MarkLogic database. The source table uses 4 columns as its unique primary key combination.
In MarkLogic, by default, only one column from the CSV file can be used as the URI ID.
My question is: is it possible to use more than one column from a CSV file as the URI ID during data ingestion in MarkLogic?
If yes, is this feature or setting available in data hub?
If it is not possible, what is usually the best practice for this in MarkLogic?
I know that one possible workaround is to create a new column combining the data from the 4 primary key columns and use it as the URI ID.
You can use MLCP Transforms to transform both the content value and the URI. The transform gets a map object $content containing both. Update its values as desired, and return the updated map object. Something like:
declare function example:transform(
    $content as map:map,
    $context as map:map
) as map:map*
{
    let $record := map:get($content, "value")
    let $uri := $record/prop1 || $record/prop2 || $record/prop3
    let $_ := map:put($content, "uri", $uri)
    return $content
};
You can use such MLCP transforms in marklogic-data-hub as well.
HTH!

Pass variable with double quotes to JSON for REST client in PERL

The situation is that I am dealing with REST and JSON, making a JSON request for a REST client.
I have a simple for loop that creates IDs for me:
for (my $i = 1; $i <= 2504; $i++)
{
    push(@elements, $i);
}
my $ids = join ',', map { "\"$_\"" } @elements;
However, when I pass this to JSON, I see backslashes being printed:
$hashref=({"varreq"=>{"search"=>{"Ids"=>[$ids],"genome"=>[{"hugo"=>["$hugo"]}]},"output_format"=>{"groupby"=>"gene"}}});
The above is encoded in JSON and then a POST request is made.
I am getting this:
"\"1\",\"2\",\"3\",\"4\",......
and I want:
"1","2","3","4",.....
If you're doing JSON, why not just:
use JSON;
Rather than hacking it with regular expressions:
#!/usr/bin/env perl
use strict;
use warnings;
use JSON;
my $json_str = to_json ( [1..2504] );
print $json_str;
With to_json you can encode pretty much any Perl data structure into a JSON structure (and from_json turns it back again). You can also use the functional encode_json/decode_json, or the OO interface via JSON->new.
You seem to be doing this already, but ... this here:
{"Ids"=>[$ids],
can simply be changed to:
{"Ids"=>[@elements],
which should do what you want.
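Put together, the request structure might look like this (a sketch only; the key names are taken from the question and $hugo is a hypothetical placeholder):
use strict;
use warnings;
use JSON 'to_json';

my @elements = (1 .. 2504);
my $hugo     = 'SOME_GENE';    # hypothetical placeholder; set elsewhere in the real script

my $hashref = {
    varreq => {
        search        => { Ids => [ @elements ], genome => [ { hugo => [ $hugo ] } ] },
        output_format => { groupby => 'gene' },
    },
};

my $json = to_json($hashref);  # the Ids come out as numbers: "Ids":[1,2,3,...]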
From the comments - I don't think anything receiving JSON should be getting confused by an array of numbers vs. an array of numeric strings.
But if they do:
my $json_str = to_json [ map {"$_"} 1..2504 ];
Well, after encoding but before making the POST request, I ended up doing the following and it worked:
$postRequestJSON=~s/\\//g;

CGI::Application::Plugin::JSON - json_body returns backwards

I was wondering if anyone knew why this return is backwards with CGI::Application::Plugin::JSON
sub {
    my ($self) = @_;
    my $q = $self->query;
    return $self->json_body({ result => '1', message => 'I should be AFTER result' });
}
The output is as follows:
{"message":"I should be AFTER result","result":"1"}
I would assume it would format the JSON left to right from the key/value pairs, and remembering it will be backwards is okay, but I have a lot of returns to handle and the validation on the client side is done with the 'result' value, so if I am just missing something I would like to have it output just like it is input.
EDIT:
Also, I just noticed it is not returning a JSON Boolean type, as "result":"1" will deserialize as a string and not a JSON Boolean. Is there a way to have it output "result":1?
Thanks for any help I can get with this one.
I would assume it would format the JSON left to right from the key/value pairs
You're confusing the list you assigned to the hash with the hash itself. Hashes don't have a left and a right; they have an array of linked lists.
You're getting the order in which the elements are found in the hash. You can't control that order as long as you use a hash.
If you really do need to have the fields in a specific order (which would be really weird), you could try using something that looks like a hash but remembers insertion order (like Tie::IxHash).
remembering it will be backwards is okay
Not only are they not "backwards", the order isn't even predictable.
$ perl -MJSON -E'say encode_json {a=>1,b=>2,c=>3} for 1..3'
{"b":2,"c":3,"a":1}
{"c":3,"a":1,"b":2}
{"a":1,"c":3,"b":2}
Is there a way to have it output "result":1
result => 1
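That is, pass the number 1 rather than the string '1'. A quick sketch of the difference (JSON::true and JSON::false are the Boolean constants the JSON module provides, should an actual JSON Boolean be wanted):
use strict;
use warnings;
use JSON;

print encode_json({ result => '1' }), "\n";          # {"result":"1"}  -- string
print encode_json({ result => 1 }), "\n";            # {"result":1}    -- number
print encode_json({ result => JSON::true }), "\n";   # {"result":true} -- Boolean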