How to construct json text using string? - json

I'm trying to construct json text as show below. But the variables such as $token, $state, $failedServers are not been replaced with its value. Note- I don't want to use any module specifically for this to work, I just want some plain string to work. Can anyone help me ?
my $json = '{"serverToken":"$token", "state":"$state","parameters" :"$failedServers"}';
current output was:
{"serverToken":"$token", "state":"$state","parameters" :"$failedServers"}
needed output format:
{"serverToken":"1213", "state":"failed","parameters" :"oracleapps.veeralab.com,suntrust.com"}

Your variables are not being replaced, because they are inside of a single-quoted string--that is, they are inside a string quoted by ' characters. This prevents variable substitution.
You will also be much better off creating JSON using a JSON library, such as this one. Simply using a quoted string is very dangerous. Suppose your one of your variables ends up containing a special character; you will end up with invalid JSON.
{"serverToken":"123"ABC", "state":"offline", "paramameters":"bugs"}
If your variables come from user input, really bad things could happen. Imagine that $token is set to equal foo", "state":"online", "foo":"bar. Your resulting JSON structure would be:
{"serverToken":"foo", "state":"online", "foo":"bar", "state":"offline" ...
Certainly not what you want.
Possible solutions:
The most blatantly obvious solution is simply not to the ' quote character. This has the drawback of requiring you to escape your double quote (") characters, though, but it's easy:
my $json = "{\"serverToken\":\"$token\", \"state\":\"$state\",\"parameters\" :\"$failedServers\"}";
Another option is to use sprintf:
my $json = sprintf('{"serverToken":"%s", "state":"%s", "parameters":"%s"}', $token, $state, $failedServers);
But by far, the best solution, because it won't break with wonky input, is to use a library:
use JSON;
my $json = encode_json( {
serverToken => $token,
state => $state,
paramaters => $failedServers
} );

Related

How do I match a CSV-style quoted string in nom?

A CSV style quoted string, for the purposes of this question, is a string in which:
The string starts and ends with exactly one ".
Two double quotes inside the string are collapsed to one double quote. "Alo""ha"→Alo"ha.
"" on its own is an empty string.
Error inputs, such as "A""" e", cannot be parsed. It's an A", followed by junk e".
I've tried several things, none of which have worked fully.
The closest I've gotten, thanks to some help from user pinkieval in #nom on the Mozilla IRC:
use std::error as stderror; /* Avoids needing nightly to compile */
named!(csv_style_string<&str, String>, map_res!(
terminated!(tag!("\""), not!(peek!(char!('"')))),
csv_string_to_string
));
fn csv_string_to_string(s: &str) -> Result<String, Box<stderror::Error>> {
Ok(s.to_string().replace("\"\"", "\""))
}
This does not catch the end of the string correctly.
I've also attempted to use the re_match! macro with r#""([^"]|"")*""#, but that always results in an Err::Incomplete(1).
I've determined that the given CSV example for Nom 1.0 doesn't work for a quoted CSV string as I'm describing it, but I do know implementations differ.
Here is one way of doing it:
use nom::types::CompleteStr;
use nom::*;
named!(csv_style_string<CompleteStr, String>,
delimited!(
char!('"'),
map!(
many0!(
alt!(
// Eat a " delimiter and the " that follows it
tag!("\"\"") => { |_| '"' }
| // Normal character
none_of!("\"")
)
),
// Make a string from a vector of chars
|v| v.iter().collect::<String>()
),
char!('"')
)
);
fn main() {
println!(r#""Alo\"ha" = {:?}"#, csv_style_string(CompleteStr(r#""Alo""ha""#)));
println!(r#""" = {:?}"#, csv_style_string(CompleteStr(r#""""#)));
println!(r#"bad format: {:?}"#, csv_style_string(CompleteStr(r#""A""" e""#)));
}
(I wrote it in full nom, but a solution like yours, based on an external function instead of map!() each character, would work too, and may be more efficient.)
The magic here, that would also solve your regexp issue, is to use CompleteStr. This basically tells nom that nothing will come after that input (otherwise, nom assumes you're doing a streaming parser, so more input may follow).
This is needed because we need to know what to do with a " if it is the last character fed to nom. Depending on the character that comes after it (another ", a normal character, or EOF), we have to take a different decision -- hence the Incomplete result, meaning nom does not have enough input to make the decision. Telling nom that EOF comes next solves this indecision.
Further reading on Incomplete on nom's author's blog: http://unhandledexpression.com/general/2018/05/14/nom-4-0-faster-safer-simpler-parsers.html#dealing-with-incomplete-usage
You may note that this parser does not actually rejects the invalid input, but parses the beginning and returns the rest. If you use this parser as a subparser in another parser, the latter would then feed the remainder to the next subparser, which would crash as well (because it would expect a comma), causing the overall parser to fail.
If you don't want that, you could make csv_style_string match peek!(alt!(char!(',')|char!('\n")|eof!())).

Pass data from JSON to variable for comparison

I have a request that I make in an API using GET
LWP::UserAgent,
the data is returned as JSON, with up to two results at most as follows:
{
"status":1,
"time":1507891855,
"response":{
"prices":{
"nome1\u2122":{
"preco1":1111,
"preco2":1585,
"preco3":1099
},
"nome2":{
"preco1":519,
"preco2":731,
"preco3":491
}
}
}
}
Dump:
$VAR1 = {
'status' => 1,
'time' => 1507891855,
'response' => {
'prices' => {
'nome1' => {
'preco1' => 1111,
'preco3' => 1099,
'preco2' => 1585
},
'nome2' => {
'preco3' => 491,
'preco1' => 519,
'preco2' => 731
}
}
}
};
What I would like to do is:
Take this data and save it in a variable to make a comparison using if with another variable that already has the name stored. The comparison would be with name1 / name2 and if it is true with the other variable it would get preco2 and preco3 to print everything
My biggest problem in the case is that some of these names in JSON contain characters like (TradeMark) that comes as \u2122 (some cases are other characters), so I can not make the comparison with the name of the other variable that is already with the correct name
nome1™
If I could only save the JSON already "converted" the characters would help me with the rest.
Basically after doing the request for the API I want to save the contents in a variable already converting all \u2122 to their respective character (this is the part that I do not know how to do in Perl) and then using another variable to compare them names are equal to show the price
Thanks for the help and any questions please tell me that I try to explain again in another way.
If I understand correctly, you need to get the JSON that you receive in UTF8 format to an internal variable that you can process. For that, you may use JSON::XS:
use utf8;
use JSON::XS;
my $name = "nome1™";
my $var1 = decode_json $utf8_encoded_json_text;
# Compare with name in $name
if( defined $var1->{'response'}->{'prices'}->{$name} ) {
# Do something with the name that matches
my $match = $var1->{'response'}->{'prices'}->{$name};
print $match->{'preco1'}, "\n";
}
Make sure you tell the Perl interpreter that your source is in UTF8 by specifying use utf8; at the beginning of the script. Then make sure you are editing the script with an editor that supports that format.
The function decode_json will return a ref to the converted value. In this case a hash ref. From there you work your way into the JSON.
If you know $name is going to be in the JSON you may omit the defined part. Otherwise, the defined clause will tell you whether the hash value is there. One you know, you may do something with it. If the hash values are a single word with no special characters, you may use $var1->{response}->{prices}->{$name}, but it is always safer to use $var1->{'response'}->{'prices'}->{$name}. Perl gets a bit ugly handling hash refs...
By the way, in JSON::XS you will also find the encode_json function to do the opposite and also an object oriented interface.

Using gsub to replace "=>" with ":" in an array of hashes

I think I've written myself in a corner. Basically, I have an array of hashes, like so.
my_hashes = [{"colorName"=>"first", "hexValue"=>"#f00"}, {"colorName"=>"green", "hexValue"=>"#0f0"},
{"colorName"=>"blue", "hexValue"=>"#00f"}, {"colorName"=>"cyan", "hexValue"=>"#0ff"},
{"colorName"=>"magenta", "hexValue"=>"#f0f"}, {"colorName"=>"yellow", "hexValue"=>"#ff0"},
{"colorName"=>"black", "hexValue"=>"#000"}]
I need to use JSON.parse to eventually be able to transform these hashes into CSV format. The only problem is I can't get JSON.parse to work as long as the "=>" symbol is present. I've tried just doing a regular gsub('=>', ':') but it appears that I cannot use it as this is an array of hashes. I've tried variations of the following method:
my_hashes.each do |hash|
hash.each do |key, value|
key.gsub!('=>', ':')
value.gsub!('=>', ':')
end
end
I need these hash values to stay intact, so even if I transform them intro strings, if I transform them back they'll still have the '=>' symbol available. Any advice?
Changing => to : wouldn't make a Ruby hash to a JSON object. And in fact you cannot just change a hash like that at all. Because the written representation of a hash is not the same as the interpreted version in memory.
But that doesn't solve your problem: You need a JSON representation of a Ruby hash, just use to_json:
my_hashes = [
{"colorName"=>"first", "hexValue"=>"#f00"},
{"colorName"=>"green", "hexValue"=>"#0f0"},
{"colorName"=>"blue", "hexValue"=>"#00f"},
{"colorName"=>"cyan", "hexValue"=>"#0ff"},
{"colorName"=>"magenta", "hexValue"=>"#f0f"},
{"colorName"=>"yellow", "hexValue"=>"#ff0"},
{"colorName"=>"black", "hexValue"=>"#000"}
]
require 'json'
my_hashes.to_json
#=> "[{"colorName":"first","hexValue":"#f00"},{"colorName":"green","hexValue":"#0f0"},{"colorName":"blue","hexValue":"#00f"},{"colorName":"cyan","hexValue":"#0ff"},{"colorName":"magenta","hexValue":"#f0f"},{"colorName":"yellow","hexValue":"#ff0"},{"colorName":"black","hexValue":"#000"}]"
my_hashes=[{"colorName"=>"first", "hexValue"=>"#f00"}]
new_data = my_hashes.to_json.gsub(/\=\>/, ':')
data = Json.parse new_data

Decoding and using JSON data in Perl

I am confused about accessing the contents of some JSON data that I have decoded. Here is an example
I don't understand why this solution works and my own does not. My questions are rephrased below
my $json_raw = getJSON();
my $content = decode_json($json_raw);
print Data::Dumper($content);
At this point my JSON data has been transformed into this
$VAR1 = { 'items' => [ 1, 2, 3, 4 ] };
My guess tells me that, once decoded, the object will be a hash with one element that has the key items and an array reference as the value.
$content{'items'}[0]
where $content{'items'} would obtain the array reference, and the outer $...[0] would access the first element in the array and interpret it as a scalar. However this does not work. I get an error message use of uninitialized value [...]
However, the following does work:
$content->{items}[0]
where $content->{items} yields the array reference and [0] accesses the first element of that array.
Questions
Why does $content{'items'} not return an array reference? I even tried #{content{'items'}}, thinking that, once I got the value from content{'items'}, it would need to be interpreted as an array. But still, I receive the uninitialized array reference.
How can I access the array reference without using the arrow operator?
Beginner's answer to beginner :) Sure not as profesional as should be, but maybe helps you.
use strict; #use this all times
use warnings; #this too - helps a lot!
use JSON;
my $json_str = ' { "items" : [ 1, 2, 3, 4 ] } ';
my $content = decode_json($json_str);
You wrote:
My guess tells me that, once decoded, the object will be a hash with
one element that has the key items and an array reference as the value.
Yes, it is a hash, but the the decode_json returns a reference, in this case, the reference to hash. (from the docs)
expects an UTF-8 (binary) string and tries to parse that
as an UTF-8 encoded JSON text,
returning the resulting reference.
In the line
my $content = decode_json($json_str);
you assigning to an SCALAR variable (not to hash).
Because you know: it is a reference, you can do the next:
printf "reftype:%s\n", ref($content);
#print: reftype:HASH ^
#therefore the +------- is a SCALAR value containing a reference to hash
It is a hashref - you can dump all keys
print "key: $_\n" for keys %{$content}; #or in short %$content
#prints: key: items
also you can assing the value of the "items" (arrayref) to an scalar variable
my $aref = $content->{items}; #$hashref->{key}
#or
#my $aref = ${$content}{items}; #$hash{key}
but NOT
#my $aref = $content{items}; #throws error if "use strict;"
#Global symbol "%content" requires explicit package name at script.pl line 20.
The $content{item} is requesting a value from the hash %content and you never defined/assigned such variable. the $content is an scalar variable not hash variable %content.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "key-postderef: $_\n" for keys $content->%*;
}
Now step deeper - to the arrayref - again you can print out the reference type
printf "reftype:%s\n", ref($aref);
#reftype:ARRAY
print all elements of array
print "arr-item: $_\n" for #{$aref};
but again NOT
#print "$_\n" for #aref;
#dies: Global symbol "#aref" requires explicit package name at script.pl line 37.
{
#in perl 5.20 you can also
use 5.020;
use experimental 'postderef';
print "aref-postderef: $_\n" for $aref->#*;
}
Here is an simple rule:
my #arr; #array variable
my $arr_ref = \#arr; #scalar - containing a reference to #arr
#{$arr_ref} is the same as #arr
^^^^^^^^^^ - array reference in curly brackets
If you have an $arrayref - use the #{$array_ref} everywhere you want use the array.
my %hash; #hash variable
my $hash_ref = \%hash; #scalar - containing a reference to %hash
%{$hash_ref} is the same as %hash
^^^^^^^^^^^ - hash reference in curly brackets
If you have an $hash_ref - use the %{$hash_ref} everywhere you want use the hash.
For the whole structure, the following
say $content->{items}->[0];
say $content->{items}[0];
say ${$content}{items}->[0];
say ${$content}{items}[0];
say ${$content->{items}}[0];
say ${${$content}{items}}[0];
prints the same value 1.
$content is a hash reference, so you always need to use an arrow to access its contents. $content{items} would refer to a %content hash, which you don't have. That's where you're getting that "use of uninitialized value" error from.
I actually asked a similar question here
The answer:
In Perl, a function can only really return a scalar or a list.
Since hashes can be initialized or assigned from lists (e.g. %foo = (a => 1, b => 2)), I guess you're asking why json_decode returns something like { a => 1, b => 2 } (a reference to an anonymous hash) rather than (a => 1, b => 2) (a list that can be copied into a hash).
I can think of a few good reasons for this:
in Perl, an array or hash always contains scalars. So in something like { "a": { "b": 3 } }, the { "b": 3 } part has to be a scalar; and for consistency, it makes sense for the whole thing to be a scalar in the same way.
if the hash is quite large (many keys at top-level), it's pointless and expensive to iterate over all the elements to convert it into a list, and then build a new hash from that list.
in JSON, the top-level element can be either an object (= Perl hash) or an array (= Perl array). If json_decode returned a list in the former case, it's not clear what it would return in the latter case. After decoding the JSON string, how could you examine the result to know what to do with it? (And it wouldn't be safe to write %foo = json_decode(...) unless you already knew that you had a hash.) So json_decode's behavior works better for any general-purpose library code that has to use it without already knowing very much about the data it's working with.
I have to wonder exactly what you passed as an array to json_decode, because my results differ from yours.
#!/usr/bin/perl
use JSON qw (decode_json);
use Data::Dumper;
my $json = '["1", "2", "3", "4"]';
my $fromJSON = decode_json($json);
print Dumper($fromJSON);
The result is $VAR1 = [ '1', '2', '3', '4' ];
Which is an array ref, where your result is a hash ref
So did you pass in a hash with element items which was a reference to an array?
In my example you would get the array by doing
my #array = #{ $fromJSON };
In yours
my #array = #{ $content->{'items'} }
I don't understand why you dislike the arrow operator so much!
The decode_json function from the JSON module will always return a data reference.
Suppose you have a Perl program like this
use strict;
use warnings;
use JSON;
my $json_data = '{ "items": [ 1, 2, 3, 4 ] }';
my $content = decode_json($json_data);
use Data::Dump;
dd $content;
which outputs this text
{ items => [1 .. 4] }
showing that $content is a hash reference. Then you can access the array reference, as you found, with
dd $content->{items};
which shows
[1 .. 4]
and you can print the first element of the array by writing
print $content->{items}[0], "\n";
which, again as you have found, shows just
1
which is the first element of the array.
As #cjm mentions in a comment, it is imperative that you use strict and use warnings at the start of every Perl program. If you had those in place in the program where you tried to access $content{items}, your program would have failed to compile, and you would have seen the message
Global symbol "%content" requires explicit package name
which is a (poorly-phrased) way of telling you that there is no %content so there can be no items element.
The scalar variable $content is completely independent from the hash variable %content, which you are trying to access when you write $content{items}. %content has never been mentioned before and it is empty, so there is no items element. If you had tried #{$content->{items}} then it would have worked, as would #{${$content}{items}}
If you really have a problem with the arrow operator, then you could write
print ${$content}{items}[0], "\n";
which produces the same output; but I don't understand what is wrong with the original version.

How to Enable HTML::TableExtract to Recognize Special Characters

I was trying to parse a page that contain scientific notation (Greek, etc).
This is the page. Note that there are other pages with more notations to be parsed.
For example it contain the following HTML
<td> human Interleukin 1β </td>
where &beta encode the Greek alphabet.
However after parsing with HTML::TableExtract it became:
human Interleukin 1\x{3b2}
Is there a way to make the code below capture the original HTML as it is,
i.e. maintaning 1&beta.
use HTML::TableExtract;
use Data::Dumper;
# Local file for http://www.violinet.org/vaxjo/vaxjo_detail.php?c_vaxjo_id=55
my $file = "vaxjo_detail.php\?c_vaxjo_id\=50.html";
my $te = HTML::TableExtract->new();
$te->parse_file($file);
my ($table) = $te->tables;
print Dumper $table ;
It did not return
human Interleukin 1\x{3b2}
It returned
human Interleukin 1β
Dumper simply prints that out as Perl string literal
"human Interleukin 1\x{3b2}"
Anyway, if you want the raw HTML instead of the text it represents, I believe passing keep_html => 1 to the constructor will do the trick.