Destructuring/list assignment with the `has` declarator

[I ran into the issues that prompted this question and my previous question at the same time, but decided the two questions deserve to be separate.]
The docs describe using destructuring assignment with my and our variables, but don't mention whether it can be used with has variables. But Raku is consistent enough that I decided to try, and it appears to work:
class C { has $.a; has $.b }
class D { has ($.a, $.b) }
C.new: :a<foo>; # OUTPUT: «C.new(a => "foo", b => Any)»
D.new: :a<foo>; # OUTPUT: «D.new(a => "foo", b => Any)»
However, this form seems to break attribute defaults:
class C { has $.a; has $.b = 42 }
class D { has ($.a, $.b = 42) }
C.new: :a<foo>; # OUTPUT: «C.new(a => "foo", b => 42)»
D.new: :a<foo>; # OUTPUT: «D.new(a => "foo", b => Any)»
Additionally, flipping the position of the default produces an error message that might shed some light on what is going on (though not enough for me to understand whether the above behavior is correct).
class D { has ($.a = 42, $.b) }
# OUTPUT:
===SORRY!=== Error while compiling:
Cannot put required parameter $.b after optional parameters
So, a few questions: is destructuring assignment even supposed to work with has? If so, is the behavior with default values correct, and is there a way to assign default values when using destructuring assignment?
(I really hope that destructuring assignment is supported with has and can be made to work with default values; even though it might seem like a niche feature for someone using classes for true OO, it's very handy for someone writing more functional code who wants to use a class as a slightly-more-typesafe Hash with fixed keys. Being able to write things like class User { has (Str $.first-name, Str $.last-name, Int $.age) } is very helpful for that sort of code)

This is currently a known bug in Rakudo. The intended behavior is for has to support list assignment, which would make syntax very much like that shown in the question work.
I am not sure if the supported syntax will be:
class D { has ($.a, $.b = 42) }
D.new: :a<foo>; # OUTPUT: «D.new(a => "foo", b => 42)»
or
class D { has ($.a, $.b) = (Any, 42) }
D.new: :a<foo>; # OUTPUT: «D.new(a => "foo", b => 42)»
but, either way, there will be a way to use a single has to declare multiple attributes while also providing default values for those attributes.
The current expectation is that this bug will get resolved sometime after the RakuAST branch is merged.
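In the meantime, a possible workaround (my own sketch, not something promised by the bug report) is to keep the grouped declaration and apply the default after construction in a TWEAK submethod:
class D {
    has ($.a, $.b);
    # Assumed workaround: set the default once construction is done.
    submethod TWEAK { $!b = 42 without $!b }
}
D.new: :a<foo>; # OUTPUT: «D.new(a => "foo", b => 42)»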

Related

How can I define a Raku grammar to parse TSV text?

I have some TSV data
ID Name Email
1 test test@email.com
321 stan stan@nowhere.net
I would like to parse this into a list of hashes
@entities[0]<Name> eq "test";
@entities[1]<Email> eq "stan@nowhere.net";
I'm having trouble with using the newline metacharacter to delimit the header row from the value rows. My grammar definition:
use v6;
grammar Parser {
    token TOP { <headerRow><valueRow>+ }
    token headerRow { [\s*<header>]+\n }
    token header { \S+ }
    token valueRow { [\s*<value>]+\n? }
    token value { \S+ }
}
my $dat = q:to/EOF/;
ID Name Email
1 test test@email.com
321 stan stan@nowhere.net
EOF
say Parser.parse($dat);
But this is returning Nil. I think I'm misunderstanding something fundamental about regexes in raku.
Probably the main thing that's throwing it off is that \s matches horizontal and vertical space. To match just horizontal space, use \h, and to match just vertical space, \v.
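A quick way to see the difference (my own example, not from the question):
say so 'a b'  ~~ / a \h b /; # True  -- a space is horizontal whitespace
say so "a\nb" ~~ / a \h b /; # False -- a newline is vertical
say so "a\nb" ~~ / a \v b /; # True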
One small recommendation I'd make is to avoid including the newlines in the token. You might also want to use the % or %% separator operators, as they're designed for handling this type of work:
grammar Parser {
    token TOP {
        <headerRow> \n
        <valueRow>+ %% \n
    }
    token headerRow { <.ws>* %% <header> }
    token valueRow { <.ws>* %% <value> }
    token header { \S+ }
    token value { \S+ }
    token ws { \h* }
}
The result of Parser.parse($dat) for this is the following:
「ID Name Email
1 test test@email.com
321 stan stan@nowhere.net
」
 headerRow => 「ID Name Email」
  header => 「ID」
  header => 「Name」
  header => 「Email」
 valueRow => 「 1 test test@email.com」
  value => 「1」
  value => 「test」
  value => 「test@email.com」
 valueRow => 「 321 stan stan@nowhere.net」
  value => 「321」
  value => 「stan」
  value => 「stan@nowhere.net」
 valueRow => 「」
which shows us that the grammar has successfully parsed everything. However, let's focus on the second part of your question: you want the data to be available in a variable. To do that, you'll need to supply an actions class, which for this project is very simple. You just make a class whose methods match the tokens of your grammar (tokens like value/header that don't require special processing beyond stringification can be ignored). There are more creative/compact ways to handle the processing, but I'll go with a fairly rudimentary approach for illustration. Here's our class:
class ParserActions {
    method headerRow ($/) { ... }
    method valueRow ($/) { ... }
    method TOP ($/) { ... }
}
Each method has the signature ($/), which is the regex match variable. So now, let's ask what information we want from each token. In the header row, we want each of the header values, in order. So:
method headerRow ($/) {
    my @headers = $<header>.map: *.Str;
    make @headers;
}
Any token with a quantifier on it will be treated as a Positional, so we could also access each individual header match with $<header>[0], $<header>[1], etc. But those are match objects, so we just quickly stringify them. The make command allows other tokens to access this special data that we've created.
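For instance (my own illustration), inside headerRow you could poke at the individual matches like this:
# Inside method headerRow ($/): $<header> acts like an Array of Matches.
say $<header>[0];     # 「ID」 (a Match object)
say $<header>[0].Str; # ID    (a plain Str)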
Our valueRow method will look almost identical, because the $<value> tokens are what we care about.
method valueRow ($/) {
    my @values = $<value>.map: *.Str;
    make @values;
}
When we get to the last method, we will want to create the array of hashes.
method TOP ($/) {
    my @entries;
    my @headers = $<headerRow>.made;
    my @rows = $<valueRow>.map: *.made;
    for @rows -> @values {
        my %entry = flat @headers Z @values;
        @entries.push: %entry;
    }
    make @entries;
}
Here you can see how we access the stuff we processed in headerRow() and valueRow(): you use the .made method. Because there are multiple valueRows, to get each of their made values we need to do a map. (This is a situation where I tend to write my grammar with simply <header><data>, defining data as multiple rows, but this is simple enough that it's not too bad.)
Now that we have the headers and rows in two arrays, it's simply a matter of making them an array of hashes, which we do in the for loop. The flat @x Z @y just interleaves the elements, and the hash assignment Does What We Mean, but there are other ways to get the array of hashes you want.
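To make that zip concrete (my own illustration):
my @h = <ID Name Email>;
my @v = <1 test test@email.com>;
say flat @h Z @v;         # (ID 1 Name test Email test@email.com)
my %entry = flat @h Z @v; # {Email => test@email.com, ID => 1, Name => test}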
Once you're done, you just make it, and then it will be available in the made of the parse:
say Parser.parse($dat, :actions(ParserActions)).made
-> [{Email => test@email.com, ID => 1, Name => test} {Email => stan@nowhere.net, ID => 321, Name => stan} {}]
It's fairly common to wrap these into a method, like
sub parse-tsv($tsv) {
    return Parser.parse($tsv, :actions(ParserActions)).made
}
That way you can just say
my @entries = parse-tsv($dat);
say @entries[0]<Name>;  # test
say @entries[1]<Email>; # stan@nowhere.net
TL;DR: you don't. Just use Text::CSV, which is able to deal with every format.
I'll show how the good old Text::CSV will probably be useful:
use Text::CSV;
my $text = q:to/EOF/;
ID Name Email
1 test test@email.com
321 stan stan@nowhere.net
EOF
my @data = $text.lines.map: *.split(/\t/).list;
say @data.perl;
my $csv = csv( in => @data, key => "ID");
print $csv.perl;
The key part here is the data munging that converts the initial file into an array of arrays (in @data). It's only needed, however, because the csv command is not able to deal with strings; if the data is in a file, you're good to go.
The last line will print:
${" 1" => ${:Email("test\#email.com"), :ID(" 1"), :Name("test")}, " 321" => ${:Email("stan\#nowhere.net"), :ID(" 321"), :Name("stan")}}%
The ID field becomes the key to the hash, and the whole thing a hash of hashes.
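And if the data already sits in a tab-separated file, the munging disappears entirely; something along these lines should do (my sketch, hedging slightly on the option names; data.tsv is a hypothetical file name, and sep selects the tab separator):
use Text::CSV;
my $csv = csv( in => "data.tsv", sep => "\t", key => "ID" );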
TL;DR: regexes backtrack, tokens don't. That's why your pattern isn't matching. This answer focuses on explaining that, and how to trivially fix your grammar. However, you should probably rewrite it, or use an existing parser, which is what you should definitely do if you just want to parse TSV rather than learn about raku regexes.
A fundamental misunderstanding?
I think I'm misunderstanding something fundamental about regexes in raku.
(If you already know the term "regexes" is a highly ambiguous one, consider skipping this section.)
One fundamental thing you may be misunderstanding is the meaning of the word "regexes". Here are some popular meanings folk assume:
Formal regular expressions.
Perl regexes.
Perl Compatible Regular Expressions (PCRE).
Text pattern matching expressions called "regexes" that look like any of the above and do something similar.
None of these meanings are compatible with each other.
While Perl regexes are semantically a superset of formal regular expressions, they are far more useful in many ways but also more vulnerable to pathological backtracking.
While Perl Compatible Regular Expressions are compatible with Perl in the sense they were originally the same as standard Perl regexes in the late 1990s, and in the sense that Perl supports pluggable regex engines including the PCRE engine, PCRE regex syntax is not identical to the standard Perl regex used by default by Perl in 2020.
And while text pattern matching expressions called "regexes" generally do look somewhat like each other, and do all match text, there are dozens, perhaps hundreds, of variations in syntax, and even in semantics for the same syntax.
Raku text pattern matching expressions are typically called either "rules" or "regexes". The use of the term "regexes" conveys the fact that they look somewhat like other regexes (although the syntax has been cleaned up). The term "rules" conveys the fact they are part of a much broader set of features and tools that scale up to parsing (and beyond).
The quick fix
With the above fundamental aspect of the word "regexes" out of the way, I can now turn to the fundamental aspect of your "regex"'s behavior.
If we switch three of the patterns in your grammar from the token declarator to the regex declarator, your grammar works as you intended:
grammar Parser {
    regex TOP { <headerRow><valueRow>+ }
    regex headerRow { [\s*<header>]+\n }
    token header { \S+ }
    regex valueRow { [\s*<value>]+\n? }
    token value { \S+ }
}
The sole difference between a token and a regex is that a regex backtracks whereas a token doesn't. Thus:
say 'ab' ~~ regex { [ \s* a  ]+ b }; # 「ab」
say 'ab' ~~ token { [ \s* a  ]+ b }; # 「ab」
say 'ab' ~~ regex { [ \s* \S ]+ b }; # 「ab」
say 'ab' ~~ token { [ \s* \S ]+ b }; # Nil
During processing of the last pattern (that could be and often is called a "regex", but whose actual declarator is token, not regex), the \S will swallow the 'b', just as it temporarily will have done during processing of the regex in the prior line. But, because the pattern is declared as a token, the rules engine (aka "regex engine") does not backtrack, so the overall match fails.
That's what's going on in your OP.
The right fix
A better solution in general is to wean yourself from assuming backtracking behavior, because it can be slow and even catastrophically slow (indistinguishable from the program hanging) when used in matching against a maliciously constructed string or one with an accidentally unfortunate combination of characters.
Sometimes regexes are appropriate. For example, if you're writing a one-off and a regex does the job, then you're done. That's fine. That's part of the reason that / ... / syntax in raku declares a backtracking pattern, just like regex. (Then again you can write / :r ... / if you want to switch on ratcheting -- "ratchet" means the opposite of "backtrack", so :r switches a regex to token semantics.)
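To illustrate with the failing pattern from the earlier examples (my own addition):
say 'ab' ~~ / [ \s* \S ]+ b /;    # 「ab」 -- plain / ... / backtracks
say 'ab' ~~ / :r [ \s* \S ]+ b /; # Nil   -- :r ratchets, giving token semantics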
Occasionally backtracking still has a role in a parsing context. For example, while the grammar for raku generally eschews backtracking, and instead has hundreds of rules and tokens, it nevertheless still has 3 regexes.
I've upvoted @user0721090601++'s answer because it's useful. It also addresses several things that immediately seemed to me to be idiomatically off in your code, and, importantly, sticks to tokens. It may well be the answer you prefer, which will be cool.

Pass data from JSON to variable for comparison

I make a GET request to an API using LWP::UserAgent; the data is returned as JSON, with at most two results, as follows:
{
    "status":1,
    "time":1507891855,
    "response":{
        "prices":{
            "nome1\u2122":{
                "preco1":1111,
                "preco2":1585,
                "preco3":1099
            },
            "nome2":{
                "preco1":519,
                "preco2":731,
                "preco3":491
            }
        }
    }
}
Dump:
$VAR1 = {
    'status' => 1,
    'time' => 1507891855,
    'response' => {
        'prices' => {
            'nome1' => {
                'preco1' => 1111,
                'preco3' => 1099,
                'preco2' => 1585
            },
            'nome2' => {
                'preco3' => 491,
                'preco1' => 519,
                'preco2' => 731
            }
        }
    }
};
What I would like to do is:
Take this data and save it in a variable so I can compare it (using if) with another variable that already has the name stored. The comparison would be against nome1/nome2, and if it matches I would get preco2 and preco3 and print everything.
My biggest problem is that some of these names in the JSON contain characters like ™ (trademark), which arrives as \u2122 (in some cases it's other characters), so I can't compare them with the name in the other variable, which is already stored correctly as:
nome1™
If I could just save the JSON with the characters already "converted", that would help me with the rest.
Basically, after making the request to the API, I want to save the contents in a variable, converting all \u2122 (and the like) to their respective characters as I go (this is the part I don't know how to do in Perl), and then use another variable to compare whether the names are equal, so I can show the price.
Thanks for the help, and if anything is unclear please tell me and I'll try to explain it another way.
If I understand correctly, you need to get the JSON that you receive in UTF8 format into an internal variable that you can process. For that, you may use JSON::XS:
use utf8;
use JSON::XS;
my $name = "nome1™";
my $var1 = decode_json $utf8_encoded_json_text;
# Compare with name in $name
if( defined $var1->{'response'}->{'prices'}->{$name} ) {
    # Do something with the name that matches
    my $match = $var1->{'response'}->{'prices'}->{$name};
    print $match->{'preco1'}, "\n";
}
Make sure you tell the Perl interpreter that your source is in UTF8 by specifying use utf8; at the beginning of the script. Then make sure you are editing the script with an editor that supports that format.
The function decode_json will return a ref to the converted value. In this case a hash ref. From there you work your way into the JSON.
If you know $name is going to be in the JSON you may omit the defined part. Otherwise, the defined clause will tell you whether the hash value is there. Once you know, you may do something with it. If the hash keys are a single word with no special characters, you may use $var1->{response}->{prices}->{$name}, but it is always safer to use $var1->{'response'}->{'prices'}->{$name}. Perl gets a bit ugly handling hash refs...
By the way, in JSON::XS you will also find the encode_json function to do the opposite and also an object oriented interface.
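For completeness, here's what the opposite direction looks like (a sketch using the standard JSON::XS calls, nothing specific to your data):
use JSON::XS;

# Functional interface: Perl structure back to a UTF-8 encoded JSON string.
my $json_text = encode_json $var1;

# Object-oriented interface: the same, with pretty-printing enabled.
my $pretty = JSON::XS->new->utf8->pretty->encode($var1);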

How can I get ruby's JSON to follow object references like Pry/PP?

I've stared at this so long I'm going in circles...
I'm using the rbvmomi gem, and in Pry, when I display an object, it recurses down through the structure showing me the nested objects - but to_json seems to "dig down" into some objects, yet just dumps the reference for others. Here's an example:
[24] pry(main)> g
=> [GuestNicInfo(
     connected: true,
     deviceConfigId: 4000,
     dynamicProperty: [],
     ipAddress: ["10.102.155.146"],
     ipConfig: NetIpConfigInfo(
       dynamicProperty: [],
       ipAddress: [NetIpConfigInfoIpAddress(
         dynamicProperty: [],
         ipAddress: "10.102.155.146",
         prefixLength: 20,
         state: "preferred"
       )]
     ),
     macAddress: "00:50:56:a0:56:9d",
     network: "F5_Real_VM_IPs"
   )]
[25] pry(main)> g.to_json
=> "[\"#<RbVmomi::VIM::GuestNicInfo:0x000000085ecc68>\"]"
Pry apparently just uses a souped-up pp, and while "pp g" gives me close to what I want, I'm kinda steering as hard as I can toward json so that I don't need a custom parser to load up and manipulate the results.
The question is - how can I get the json module to dig down like pp does? And if the answer is "you can't" - any other suggestions for achieving the goal? I'm not married to json - if I can get the data serialized and read it back later (without writing something to parse pp output... which may already exist and I should look for it), then it's all win.
My "real" goal here is to slurp up a bunch of info from our vsphere stuff via rbvmomi so that I can do some network/vm analysis on it, which is why I'd like to get it in a nice machine-parsed format. If I'm doing something stupid here and there's an easier way to go about this - lay it on me, I'm not proud. Thank you all for your time and attention.
Update: Based on Arnie's response, I added this monkeypatch to my script:
class RbVmomi::BasicTypes::DataObject
  def to_json(*args)
    h = self.props
    m = h.merge({ JSON.create_id => self.class.name })
    m.to_json(*args)
  end
end
and now my to_json recurses down nicely. I'll see about submitting this (or the def, really) to the project.
The .to_json method works recursively; the default behavior is defined as:
Converts this object to a string (calling to_s), converts it to a JSON string, and returns the result. This is a fallback, if no special method to_json was defined for some object.
The json library adds implementations for some common classes (check the left-hand side of that documentation), such as Array, Range, and DateTime.
For an array, to_json first converts all the elements to JSON, concatenates them together, and then wraps them in the array markers [ and ].
For your case, you need to define your customized to_json method for GuestNicInfo, NetIpConfigInfo and NetIpConfigInfoIpAddress. I don't know the implementation of these three classes, so I wrote an example to demonstrate how to achieve this:
require 'json'

class MyClass
  attr_accessor :a, :b

  def initialize(a, b)
    @a = a
    @b = b
  end
end

data = [MyClass.new(1, "foobar")]
puts data.to_json
#=> ["#<MyClass:0x007fb6626c7260>"]
class MyClass
  def to_json(*args)
    {
      JSON.create_id => self.class.name,
      :a => a,
      :b => b
    }.to_json(*args)
  end
end
puts data.to_json
#=> [{"json_class":"MyClass","a":1,"b":"foobar"}]

With Boost.spirit, is there any way to pass additional argument to attribute constructor?

Maybe a noob question, I've a piece of code like this:
struct S {
    S() {...}
    S(int v) {
        // ...
    }
};
qi::rule<const char*, S(), boost::spirit::ascii::space_type> ip=qi::int_parser<S()>();
qi::rule<const char*, std::vector<S>(), boost::spirit::ascii::space_type> parser %= ip % ',';
...
Rules above can work, but the code breaks if S constructors require additional parameters, such as:
struct S {
    S(T t) {...}
    S(T t, int v) {
        // ...
    }
};
I've spent days looking for a solution, but no luck so far.
Can anyone help?
There is no direct way, but you can probably explicitly initialize things:
qi::rule<It, optional<S>(), Skipper> myrule;
myrule %=
    qi::eps [ _val = phoenix::construct<S>(42) ] >>
    int_parser<S()>;
However, since you are returning it from the int_parser, my intuition says that default-initialization should be appropriate (or perhaps the type S doesn't have a single, clear, responsibility?).
Edit
In response to the comment, it looks like you want this:
T someTvalue;
myrule = qi::int_
[ qi::_val = phx::construct<S>(someTvalue, qi::_1) ];
Or, if someTvalue is a variable outside the grammar constructor, and may change value during execution of the parser (and it lives long enough!), you could do
myrule = qi::int_
[ qi::_val = phx::construct<S>(phx::ref(someTvalue), qi::_1) ];
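Putting the pieces together, here is a minimal self-contained sketch along those lines (my own illustration, not tested against your real S and T; it assumes Boost.Spirit Qi with Phoenix):
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>
#include <string>

namespace qi    = boost::spirit::qi;
namespace phx   = boost::phoenix;
namespace ascii = boost::spirit::ascii;

struct T { };

struct S {
    S() : value(0) { }
    S(T, int v) : value(v) { }
    int value;
};

int main() {
    std::string input("42");
    T someTvalue;

    // The semantic action builds S from the live T and the parsed int.
    qi::rule<std::string::const_iterator, S(), ascii::space_type> myrule =
        qi::int_[ qi::_val = phx::construct<S>(phx::ref(someTvalue), qi::_1) ];

    S result;
    auto first = input.cbegin();
    bool ok = qi::phrase_parse(first, input.cend(), myrule, ascii::space, result);
    std::cout << (ok ? result.value : -1) << "\n"; // expect 42
}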
Hope that helps

apply different functions to each element of a Perl data structure

Given an arbitrarily nested data structure, how can I create a new data structure so that all the elements in it have been standardized by applying a function to each element depending on its type? For example, I might have
$data = {
    name => 'some one',
    date => '2010-10-10 12:23:45',
    sale => [34, 22, 65],
    cust => {
        name => 'Jimmy',
        addr => '1 Foobar Way',
        amnt => 452.024,
        item => ['books', 'pens', 'post-it notes']
    }
}
and I want to convert all text values to upper case, all dates to UTC date times, find the square of all integers, round down all real numbers and add 1, and so on. So, in effect, I want to apply a different function to each element depending on the type of element.
In reality the data might arrive via a database query, in which case they are already a Perl data structure, or they might start life as a JSON object, in which case I can use JSON::from_json to convert it to a Perl data structure. The idea is to standardize all the values in the data structure based on the value type, and then spit out the Perl data structure back again as a JSON object.
I read the answers to executing a function on every element of a data structure and feel that Data::Rmap might do the trick, but can't figure out how. Seems like Rmap works on all the keys as well, not just the values.
It's crazy straightforward with the Data::Rmap you mentioned.
use Data::Rmap qw( rmap );
rmap { $_ = transform($_); } $data;
Regarding the question in the comments:
use Data::Rmap qw( rmap );
use Scalar::Util qw( looks_like_number );
# Transforms $_ in place.
sub transform {
    if (looks_like_number($_)) {
        if (...) {
            $_ *= 2;
        }
        $_ = 0+$_; # Makes it look like a number to JSON::XS
    } else {
        ...
    }
}
&rmap(\&transform, $data);