TJsonSerializer with arrays - json

I have a record that contains, among others, the following field:
colorMatrix: Array [1 .. 2] of Array [1 .. 5] of TColor;
Unfortunately, calling TJsonSerializer's Serialize function does not serialize this field. How do I make it do so?

In Julia is it worth type-narrowing a dictionary returned by `JSON.parsefile`

I’m writing Julia code that reads JSON files as input, performs analysis (in the field of mathematical finance) and writes its results as JSON. The code is a port from R, made in the hope of improving performance.
I parse the input files using JSON.parsefile. This returns a Dict in which I observe that all vectors are of type Array{Any,1}. As it happens, I know that the input file will never contain vectors of mixed type, such as some Strings and some Numbers.
So I wrote the following code, which seems to work well and is “safe” in the sense that if the calls to convert fail then a vector continues to have type Array{Any,1}.
function typenarrow!(d::Dict)
    for k in keys(d)
        if d[k] isa Array{Any,1}
            d[k] = typenarrow(d[k])
        elseif d[k] isa Dict
            typenarrow!(d[k])
        end
    end
end
function typenarrow(v::Array{Any,1})
    for T in [String,Int64,Float64,Bool,Vector{Float64}]
        try
            return(convert(Vector{T},v))
        catch; end
    end
    return(v)
end
My question is: Is this worth doing? Can I expect code that processes the contents of the Dict to execute faster if I do this type narrowing? I think the answer is yes in that the Julia performance tips recommend to “Annotate values taken from untyped locations” and this approach ensures there are no “untyped locations”.
There are two levels to the answer to this question:
Level 1
Yes, it will help the performance of the code. See for instance the following benchmark:
julia> using BenchmarkTools
julia> x = Any[1 for i in 1:10^6];
julia> y = [1 for i in 1:10^6];
julia> @btime sum($x)
26.507 ms (477759 allocations: 7.29 MiB)
1000000
julia> @btime sum($y)
226.184 μs (0 allocations: 0 bytes)
1000000
You can write your typenarrow function in a slightly simpler way, like this:
typenarrow(x) = [v for v in x]
since the comprehension produces a vector with a concrete element type (assuming your source vector is homogeneous).
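As a small illustration of what the comprehension does, here is a sketch on hand-made vectors that mimic what JSON.parse returns. The name narrow and the sample data are assumptions made for this example only.

```julia
# Hypothetical inputs mimicking what JSON.parse returns for homogeneous arrays.
ints = Any[1, 2, 3]
strs = Any["a", "b"]

# Same idea as the one-line typenarrow above, under a different (hypothetical) name.
narrow(x) = [v for v in x]

println(typeof(narrow(ints)))   # element type narrowed to Int
println(typeof(narrow(strs)))   # element type narrowed to String

# A genuinely mixed vector cannot be narrowed to a concrete type and stays Vector{Any}.
println(typeof(narrow(Any[1, "a"])))
```

Unlike the try/convert loop, this needs no list of candidate types, but it also never converts values: it only picks the narrowest element type that fits what is already there.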
Level 2
This is not fully optimal. The problem that is still left is that you have a Dict that is a container with abstract type parameter (see https://docs.julialang.org/en/latest/manual/performance-tips/#Avoid-containers-with-abstract-type-parameters-1). Therefore in order for the computations to be fast you have to use a barrier function (see https://docs.julialang.org/en/latest/manual/performance-tips/#kernel-functions-1) or use type annotation for variables you introduce (see https://docs.julialang.org/en/v1/manual/types/index.html#Type-Declarations-1).
In an ideal world your Dict would have keys and values of homogeneous types, and everything would then be maximally fast, but if I understand your code correctly the values in your case are not homogeneous.
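A minimal sketch of the barrier-function idea, assuming a Dict shaped like the parsed JSON in the question. The names sum_of_squares and analyse, and the key "v3", are hypothetical.

```julia
# Kernel ("barrier") function: Julia compiles a specialised version for the
# concrete type of the vector it actually receives.
function sum_of_squares(v::AbstractVector{<:Real})
    s = zero(eltype(v))
    for x in v
        s += x * x
    end
    return s
end

# Outer function: d["v3"] has static type Any, so a single dynamic dispatch
# happens at the call below; the loop itself runs inside the typed kernel.
function analyse(d::Dict{String,Any})
    return sum_of_squares(d["v3"])
end

d = Dict{String,Any}("v3" => [1.0, 2.0, 3.0])
println(analyse(d))
```

The type-annotation alternative would be to write something like v = d["v3"]::Vector{Float64} inside analyse, which only works when the concrete type is known in advance.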
EDIT
In order to solve the Level 2 issue you can convert the Dict into a NamedTuple like this (this is a minimal example assuming that Dicts only nest directly in Dicts, but it should be easy enough to extend if you want more flexibility).
First, the function performing the conversion looks like:
function typenarrow!(d::Dict)
    for k in keys(d)
        if d[k] isa Array{Any,1}
            d[k] = [v for v in d[k]]
        elseif d[k] isa Dict
            d[k] = typenarrow!(d[k])
        end
    end
    NamedTuple{Tuple(Symbol.(keys(d)))}(values(d))
end
Now a MWE of its use:
julia> using JSON
julia> x = """
{
"name": "John",
"age": 27,
"values": {
"v1": [1,2,3],
"v2": [1.5,2.5,3.5]
},
"v3": [1,2,3]
}
""";
julia> j1 = JSON.parse(x)
Dict{String,Any} with 4 entries:
"name" => "John"
"values" => Dict{String,Any}("v2"=>Any[1.5, 2.5, 3.5],"v1"=>Any[1, 2, 3])
"age" => 27
"v3" => Any[1, 2, 3]
julia> j2 = typenarrow!(j1)
(name = "John", values = (v2 = [1.5, 2.5, 3.5], v1 = [1, 2, 3]), age = 27, v3 = [1, 2, 3])
julia> dump(j2)
NamedTuple{(:name, :values, :age, :v3),Tuple{String,NamedTuple{(:v2, :v1),Tuple{Array{Float64,1},Array{Int64,1}}},Int64,Array{Int64,1}}}
name: String "John"
values: NamedTuple{(:v2, :v1),Tuple{Array{Float64,1},Array{Int64,1}}}
v2: Array{Float64}((3,)) [1.5, 2.5, 3.5]
v1: Array{Int64}((3,)) [1, 2, 3]
age: Int64 27
v3: Array{Int64}((3,)) [1, 2, 3]
The beauty of this approach is that Julia will know all types in j2, so if you pass j2 to any function as a parameter all calculations inside this function will be fast.
The downside of this approach is that a function taking j2 has to be compiled for the type of j2, which might be problematic if the structure of j2 is huge (as the resulting NamedTuple type is then complex) and the amount of work your function does is relatively small. But for small JSON documents (small in the sense of structure; the vectors held in them can be large, as their size does not add to the complexity) this approach has proven to be efficient in several applications I have developed.

Julia: creating a method for Any vector with missing values

I would like to create a function that deals with missing values. However, when I tried to specify the missing type Array{Missing, 1}, it errors.
function f(x::Array{<:Number, 1})
    # do something complicated
    println("no missings.")
    println(sum(x))
end
function f(x::Array{Missing, 1})
    x = collect(skipmissing(x))
    # do something complicated
    println("removed missings.")
    f(x)
end
f([2, 3, 5])
f([2, 3, 5, missing])
I understand that my type is not Missing but Array{Union{Missing, Int64},1}
When I specify this type, it works in the case above. However, I would like to work with all types (strings, floats etc., not only Int64).
I tried
function f(x::Array{Missing, 1})
...
end
But it errors again... Saying that
f (generic function with 1 method)
ERROR: LoadError: MethodError: no method matching f(::Array{Union{Missing, Int64},1})
Closest candidates are:
f(::Array{Any,1}) at ...
How can I say that I want the type to be a union of Missing with whatever other type?
EDIT (reformulation)
Let's have these 4 vectors and two functions dealing with strings and numbers.
x1 = [1, 2, 3]
x2 = [1, 2, 3, missing]
x3 = ["1", "2", "3"]
x4 = ["1", "2", "3", missing]
function f(x::Array{<:Number,1})
    println(sum(x))
end
function f(x::Array{String,1})
    println(join(x))
end
f(x) doesn't work for x2 and x4, because they are of type Array{Union{Missing, Int64},1} and Array{Union{Missing, String},1}, respectively.
Is it possible to have only one function that detects whether the vector contains missings, removes them and then deals with it appropriately?
For instance:
function f(x::Array{Any, 1})
    x = collect(skipmissing(x))
    print("removed missings")
    f(x)
end
But this doesn't work because Any indicates a mixed type (e.g., strings and nums) and does not mean string OR numbers or whatever.
EDIT 2 Partial fix
This works:
function f(x::Array)
    x = collect(skipmissing(x))
    print("removed missings")
    f(x)
end
[But how, then, to specify the shape (number of dimensions) of the array...? (this might be an unrelated topic though)]
You can do it in the following way:
function f(x::Vector{<:Number})
    # do something complicated
    println("no missings.")
    println(sum(x))
end
function f(x::Vector{Union{Missing,T}}) where {T<:Number}
    x = collect(skipmissing(x))
    # do something complicated
    println("removed missings.")
    f(x)
end
and now it works:
julia> f([2, 3, 5])
no missings.
10
julia> f([2, 3, 5, missing])
removed missings.
no missings.
10
EDIT:
I will try to answer the questions raised (if I miss something please add a comment).
First, Vector{Union{Missing, <:Number}} is the same as Vector{Union{Missing, Number}} because of the scoping rules, as tibL indicated: Vector{Union{Missing, <:Number}} translates to Array{Union{Missing, T} where T<:Number,1}, so the where clause sits inside Array.
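The effect of where the where clause sits can be checked directly with subtype queries. This is a small sketch; the alias A is introduced only for this example.

```julia
# The type of a vector such as [2, 3, 5, missing].
A = Vector{Union{Missing,Int}}

# Here the where clause is scoped inside Array, so the element type would have to
# equal the whole Union{Missing,T} where T<:Number type. Union{Missing,Int} does
# not, and type parameters are invariant, so the match fails.
println(A <: Vector{Union{Missing,<:Number}})               # false

# With the where clause outside Array, T can be chosen as Int and the match succeeds.
println(A <: (Vector{Union{Missing,T}} where {T<:Number}))  # true
```

This is why the method signature f(x::Vector{Union{Missing,T}}) where {T<:Number} accepts [2, 3, 5, missing].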
Second (here I am not sure if this is what you want). I understand you want the following behavior:
julia> g(x::Array{>:Missing,1}) = "$(eltype(x)) allows missing"
g (generic function with 2 methods)
julia> g(x::Array{T,1}) where T = "$(eltype(x)) does not allow missing"
g (generic function with 2 methods)
julia> g([1,2,3])
"Int64 does not allow missing"
julia> g([1,2,missing])
"Union{Missing, Int64} allows missing"
julia> g(["a",'a'])
"Any allows missing"
julia> g(Union{String,Char}["a",'a'])
"Union{Char, String} does not allow missing"
Note the last two cases: although ["a", 'a'] does not contain missing, the array has the element type Any, so it might contain missing. The last case, with an explicit Union{String,Char} element type, rules that out.
Also you can see that you could change the second parameter of Array{T,N} to something else to get a different dimensionality.
Also this example works because the first method, as more specific, catches all cases that allow Missing and a second method, as more general, catches what is left (i.e. essentially what does not allow Missing).
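To illustrate the remark about dimensionality, here is a sketch of the same pair of methods for two-dimensional arrays. The function name h and the sample matrices are assumptions made for this example.

```julia
# Same dispatch pattern as g above, with the second parameter of Array set to 2.
h(x::Array{>:Missing,2}) = "$(eltype(x)) matrix allows missing"
h(x::Array{T,2}) where {T} = "$(eltype(x)) matrix does not allow missing"

m1 = [1 2; 3 4]                              # Matrix{Int}
m2 = Union{Missing,Int}[1 missing; 3 4]      # element type stated explicitly

println(h(m1))
println(h(m2))
```

Leaving the second parameter free, as in Array{>:Missing}, would be one way to accept any number of dimensions.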

How to create a TCL variable of type bytearray

I am using TCL 8.4.20.
So I have the following code:
set a [binary format H2 1]
set b [binary format H2 2]
set c [binary format H2 3]
set bytes $a
append bytes $a
append bytes $b
append bytes $c
puts $bytes
I set a breakpoint at Tcl_PutsObjCmd() function in TCL's C source code and I see its argument, $bytes, is of type string while I expect it to be bytearray.
Question 1: Why is that? From the first assignment to the final append, "bytes" receives nothing but binary data.
The reason I did this experiment is that we have a TCL extension command written in C which expects its argument to be of byte array type: it checks that the value's typePtr is tclByteArrayType. My TCL code currently fails on this command because the data passed to the command is of type string, as demonstrated above.
I googled around, seems the "right" way to make a byte array object is to have every byte ready first and finally use one "binary format" command to put all into one. But it is a fairly big change to my current TCL code.
Question 2: Given that I already have a TCL variable whose data are all binaries (created using "binary format" for each byte and put together using "append") while its type is string, How can I change its internal type to "bytearray" through some TCL maneuvering?
Technically, the internal type is not a guaranteed property. Everything is a string. The code may shimmer a type away whenever it feels like. And code that depends on the internal type is usually very brittle or outright broken.
So your C code should call Tcl_GetByteArrayFromObj() instead of peeking at the argument's internals. That does the proper conversion if the object does not yet have a byteArray representation.
About your questions:
Why doesn't append of two byte arrays keep the byte array type?
It does, at least for 8.6, if you do it right and never trigger the creation of a string rep.
Running this in tkcon, the append turns the value into a string:
() 98 % set a [binary format H2 1]

() 99 % set b [binary format H2 1]

() 100 % ::tcl::unsupported::representation $a
value is a bytearray with a refcount of 2, object pointer at 0000000005665420, internal representation 000000000587B280:0000000005665240, string representation ""
() 101 % ::tcl::unsupported::representation $b
value is a bytearray with a refcount of 2, object pointer at 000000000564EEB0, internal representation 000000000587B4A0:00000000056590E0, string representation ""
() 102 % set x $a

() 103 % ::tcl::unsupported::representation $x
value is a bytearray with a refcount of 4, object pointer at 0000000005665420, internal representation 000000000587B280:0000000005665240, string representation ""
() 104 % append x $b

() 105 % ::tcl::unsupported::representation $x
value is a string with a refcount of 3, object pointer at 0000000005663F50, internal representation 0000000005896BA0:000000000564F030, string representation ""
This happens because the bytearray has had a string rep created (due to Tkcon echoing the value). The append optimization only works for 'pure' bytearrays, i.e. bytearrays that do not have a string rep. This is similar to some optimizations for 'pure' lists.
So it works like this, where the trailing puts prevents Tkcon from echoing the result and thereby shimmering it:
() 106 % set b [binary format H2 1]; puts "pure"
pure
() 107 % set a [binary format H2 1]; puts "pure"
pure
() 108 % set x $a; puts "pure"
pure
() 109 % ::tcl::unsupported::representation $a
value is a bytearray with a refcount of 3, object pointer at 0000000005658780, internal representation 000000000587B320:0000000005658CF0, no string representation
() 110 % ::tcl::unsupported::representation $b
value is a bytearray with a refcount of 2, object pointer at 000000000564ED60, internal representation 000000000587B500:0000000005658750, no string representation
() 111 % ::tcl::unsupported::representation $x
value is a bytearray with a refcount of 3, object pointer at 0000000005658780, internal representation 000000000587B320:0000000005658CF0, no string representation
() 112 % append x $b; puts "pure"
pure
() 113 % ::tcl::unsupported::representation $x
value is a bytearray with a refcount of 2, object pointer at 0000000005658690, internal representation 00000000058A5C60:0000000005658960, no string representation
Note the no string representation part.
How to turn a string into a bytearray
Just do a binary format:
set x [binary format a* $x]

Wrong arity of simple function in clojure

I've started to learn Clojure. In my book there is the following exercise:
Write a function, mapset, that works like map except the return value is a set:
(mapset inc [1 1 2 2])
; => #{2 3}
I've started with something like this:
(defn mapset
  [vect]
  (set vect))
The result is an error:
"Wrong number of args (2) passed to: core/mapset"
I tried [& args] as well.
So, the question is: how can I solve this problem?
Take a closer look at your call to mapset:
(mapset inc [1 1 2 2])
Since code is data, this "call" is just a list of three elements:
The symbol mapset
The symbol inc
The vector [1 1 2 2]
When you evaluate this code, Clojure will see that it is a list and proceed to evaluate each of the items in that list (once it determines that it isn't a special form or macro), so it will then have a new list of three elements:
The function to which the symbol core/mapset was bound
The function to which the symbol clojure.core/inc was bound
The vector [1 1 2 2]
Finally, Clojure will call the first element of the list with the rest of the elements as arguments. In this case, there are two arguments in the rest of the list, but in your function definition, you only accounted for one:
(defn mapset
  [vect]
  (set vect))
To remedy this, you could implement mapset as follows:
(defn mapset
  [f vect]
  (set (map f vect)))
Now, when you call (mapset inc [1 1 2 2]), the argument f will be bound to the function clojure.core/inc, and the argument vect will be bound to the vector [1 1 2 2].
Your definition of mapset takes a single argument, vect.
At a minimum you need to take 2 arguments: a function and a sequence.
(defn mapset [f xs] (set (map f xs)))
But it is interesting to think about this as the composition of 2 functions also:
(def mapset (comp set map))

Loop in clojure with or condition

I'm trying to write a function in Clojure that evaluates a condition for each value in a vector; the function should return the OR of the results of the condition applied to each value. E.g. if I have a vector [1 2 3 4] and a condition (>= x 3), where x is an element of the vector, then the function should return true (similarly, [0 1 0 1] should return false).
I wrote the method
(defn contains-value-greater-or-equal-to-three [values]
  (or (for [x values] (>= x 3))))
but for the vector [1 2 3 4] this just yields (false false true true); what I want instead is the 'or' function applied to these values.
I'm quite new to functional programming and clojure, let me know if I'm thinking about this the wrong way.
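Have a look at the function some, which takes a predicate and a collection as arguments and returns the first logically true value the predicate yields for an item in the collection, or nil if there is none. With a predicate that returns a boolean, that means true or nil; wrap the call in boolean if you need a strict false.
(some #(>= % 3) [1 2 3 4]) ; => true
(some #(>= % 3) [0 1 0 1]) ; => nil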
The problem with your method is that for returns a sequence of true/false values, while or expects a number of individual arguments. Given only one argument, or will return that. What you could have done is to reduce the sequence returned by for, like this:
(reduce #(or %1 %2) (for [x values] (>= x 3)))
(Since or is a macro, not a function, (reduce or ...) will not work.)