Create list of dicts within a dict in TCL - json

I am forced to use TCL for something and I need to create a json string like this:
{ "mainKey": "mainValue", "subKey": [{"key1":"value1"},{"key2":"value2"}]}
So I am trying to do this:
set subDict1 [dict create key1 value1]
set subDict2 [dict create key2 value2]
set subDictList [list $subDict1 $subDict2]
set finalDict [dict create mainKey mainValue subKey $subDictList]
When I convert this dict to json, I get:
{"mainKey":"mainValue", "subKey":{"key1 value1":{"key2":"value2"}}}
instead of the required:
{ "mainKey": "mainValue", "subKey": [{"key1":"value1"},{"key2":"value2"}]}
What am I doing wrong?

First you have to understand that Tcl is a very typeless language. What exactly are lists and dicts in Tcl?
In Tcl a list is a properly formatted string in which the members are separated by whitespace (space, tab or newline). If the data in an item itself contains whitespace, it can be protected by:
using backslash escaping:
{this is a list\ of\ four\ items}
using "" grouping:
{this is a "list of four items"}
using {} grouping:
{this is a {list of four items}}
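As a quick check of the three quoting forms above, the following sketch asks Tcl how many elements each string has when treated as a list. The variable names are only illustrative. The backslash form is written inside braces here because double quotes would consume the backslashes before the list parser ever saw them.

```tcl
# Three spellings of the same four-element list.
set viaBackslash {this is a list\ of\ four\ items}
set viaQuotes    {this is a "list of four items"}
set viaBraces    {this is a {list of four items}}

foreach v [list $viaBackslash $viaQuotes $viaBraces] {
    # Each line printed is: 4 elements, last one: list of four items
    puts "[llength $v] elements, last one: [lindex $v end]"
}
```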
Note that internally, once a string has been parsed as a list, Tcl uses a different internal data structure to store the list for speed. But semantically it is still a string. Just as HTML is a specially formatted string and JSON is a specially formatted string, Tcl takes the attitude that lists are nothing but specially formatted strings.
So, what are dicts? In Tcl dicts are lists with an even number of elements. That's it. Nothing special. A dict is therefore also semantically a string (though as mentioned above, once Tcl sees you using that string as a dict it will compile it to a different data structure to optimize speed).
Note again the core philosophy in Tcl: almost all data structures (with the exception of arrays) are merely strings that happen to be formatted in a way that has special meaning.
This is the reason you can't auto-convert Tcl data structures to JSON: if you ask Tcl to guess what the data structure is, you end up with whatever the programmer who wrote the guessing function wants it to be. In your case it looks like it defaults to always detecting lists with an even number of elements as dicts.
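The ambiguity can be seen directly with the value from the question. This is a small sketch: the two-element list of sub-dicts is equally valid as a dict with a single key, which matches the shape of the unwanted JSON output.

```tcl
set subDictList [list [dict create key1 value1] [dict create key2 value2]]

puts [llength $subDictList]                 ;# 2 (read as a list of two dicts)
puts [dict size $subDictList]               ;# 1 (read as a dict with one key)
puts [dict get $subDictList {key1 value1}]  ;# key2 value2
```

Nothing in the value itself says which reading was intended, so a converter has to be told.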
So how can you generate JSON correctly?
There are several ways to do this. You can of course use custom dedicated for loops or functions to convert your data structure (which again, is just a specially formatted string) to JSON.
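A hand-written converter for exactly the structure in the question might look like the sketch below. The proc names json_string and question_to_json are made up for illustration, and the escaping only covers backslashes and double quotes, so this is not a general JSON encoder.

```tcl
# Quote a Tcl string as a JSON string (escapes only \ and ").
proc json_string {s} {
    set escaped [string map [list \\ \\\\ \" \\\"] $s]
    return "\"$escaped\""
}

# Assumes finalDict has a string under mainKey and a list of flat dicts under subKey.
proc question_to_json {finalDict} {
    set items {}
    foreach sub [dict get $finalDict subKey] {
        set members {}
        dict for {k v} $sub {
            lappend members "[json_string $k]:[json_string $v]"
        }
        lappend items "{[join $members ,]}"
    }
    set main [json_string [dict get $finalDict mainKey]]
    return "{\"mainKey\":$main,\"subKey\":\[[join $items ,]\]}"
}

set subDictList [list [dict create key1 value1] [dict create key2 value2]]
set finalDict [dict create mainKey mainValue subKey $subDictList]
puts [question_to_json $finalDict]
# => {"mainKey":"mainValue","subKey":[{"key1":"value1"},{"key2":"value2"}]}
```

The knowledge of which value is a list and which is a dict lives in the code, which is the price of this approach.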
Several years ago I wrote this JSON compiler:
# data is plain old tcl values
# spec is defined as follows:
# {string} - data is simply a string, "quote" it if it's not a number
# {list} - data is a tcl list of strings, convert to JSON arrays
# {list list} - data is a tcl list of lists
# {list dict} - data is a tcl list of dicts
# {dict} - data is a tcl dict of strings
# {dict xx list} - data is a tcl dict where the value of key xx is a tcl list
# {dict * list} - data is a tcl dict of lists
# etc..
proc compile_json {spec data} {
    while {[llength $spec]} {
        set type [lindex $spec 0]
        set spec [lrange $spec 1 end]
        switch -- $type {
            dict {
                # Fall back to treating unmatched keys as strings
                lappend spec * string
                set json {}
                foreach {key val} $data {
                    foreach {keymatch valtype} $spec {
                        if {[string match $keymatch $key]} {
                            lappend json [subst {"$key":[compile_json $valtype $val]}]
                            break
                        }
                    }
                }
                return "{[join $json ,]}"
            }
            list {
                # A bare {list} means a list of strings
                if {![llength $spec]} {
                    set spec string
                } else {
                    set spec [lindex $spec 0]
                }
                set json {}
                foreach {val} $data {
                    lappend json [compile_json $spec $val]
                }
                return "\[[join $json ,]\]"
            }
            string {
                if {[string is double -strict $data]} {
                    return $data
                } else {
                    return "\"$data\""
                }
            }
            default {error "Invalid type"}
        }
    }
}
(See http://wiki.tcl.tk/JSON for the original implementation and discussion of JSON parsing)
Because tcl can never correctly guess what your "string" is I've opted to supply a format string to the function in order to correctly interpret tcl data structures. For example, using the function above to compile your dict you'd call it like this:
compile_json {dict subKey {list dict}} $finalDict
I've begged the tcllib maintainers to steal my code because I still believe it's the correct way to handle JSON in tcl but so far it's still not in tcllib.
BTW: I license the code above as public domain and you or anyone may claim full authorship of it if you wish.

It's not completely wrong to say that Tcl is a typeless language, because the types of the data objects in a Tcl program aren't expressed fully in the code, and not always even in the Tcl_Obj structures that represent data objects internally. Still, types are certainly not absent from a Tcl program; it's just that the type system is a lot less intrusive in Tcl than in most other programming languages.
The complete type definition in a Tcl program emerges from a dynamic combination of code and data objects as the program executes. The interpreter trusts you to tell it how you want your data objects to behave.
As an example, consider the following string:
set s {title: Mr. name: Peter surname: Lewerin}
Is this a string, an array, or a dictionary? All of the above, actually. (At least it's not an integer, a double or a boolean, other possible Tcl types.)
Using this string, I can answer a number of questions:
Tell me about your name
puts $s
# => title: Mr. name: Peter surname: Lewerin
What do polite people call you?
puts [dict values $s]
# => Mr. Peter Lewerin
What was your last name again?
puts [lindex $s end]
# => Lewerin
Here, I used the same string as a string, as a dictionary, and as an array. The same string representation was used for all three types of object, and it was the operations I used on it that determined the type of the object in that precise moment.
Similarly, the literal 1 can mean the integer 1, the single-character string 1, or boolean truth. There is no way to specify which kind of 1 you mean, but there is no need either, since the interpreter won't complain about the ambiguity.
Because Tcl doesn't store complete type information, it's quite hard to serialize arbitrary collections of data objects. That doesn't mean Tcl can't play well with serialization, though: you just need to add annotations to your data.
This string:
di [dm [st mainKey] [st mainValue]] [dm [st subKey] [ar [di [dm [st key1] [st value1]]] [di [dm [st key2] [st value2]]]]]
can be fed into the Tcl interpreter, and given the proper definitions of di, dm, st, and ar (which I intend to signify "dictionary", "dictionary member", "string", and "array", respectively), I can have the string construct a dictionary structure equivalent to the one in the question, or the string representation of such an object, just a bare list of keys and values, or XML, or JSON, etc. By using namespaces and/or slave interpreters, I can even dynamically switch between various forms. I won't provide examples for all forms, just JSON:
proc di args {return "{[join $args {, }]}"}
proc st val {return "\"$val\""}
proc ar args {return "\[[join $args {, }]]"}
proc dm {k v} {return "$k: $v"}
The output becomes:
{"mainKey": "mainValue", "subKey": [{"key1": "value1"}, {"key2": "value2"}]}
This example used the command nesting of the Tcl interpreter to define the structure of the data. Tcl doesn't need even that: a list of token classes and tokens such as a scanner would emit will suffice:
< : ' mainKey ' mainValue : ' subKey ( < : ' key1 ' value1 > < : ' key2 ' value2 > ) >
Using these simple commands:
proc jsonparseseq {endtok args} {
    set seq [list]
    while {[lsearch $args $endtok] > 0} {
        lassign [jsonparseexpr {*}$args] args expr
        lappend seq $expr
    }
    list [lassign $args -] $seq
}
proc jsonparseexpr args {
    set args [lassign $args token]
    switch -- $token {
        ' {
            set args [lassign $args str]
            set json \"$str\"
        }
        : {
            lassign [jsonparseexpr {*}$args] args key
            lassign [jsonparseexpr {*}$args] args val
            set json "$key: $val"
        }
        < {
            lassign [jsonparseseq > {*}$args] args dict
            set json "{[join $dict {, }]}"
        }
        ( {
            lassign [jsonparseseq ) {*}$args] args arr
            set json "\[[join $arr {, }]]"
        }
    }
    list $args $json
}
proc jsonparse args {
    lindex [jsonparseexpr {*}$args] end
}
I can parse that stream of token classes (<, (, ', :, ), >) and tokens into the same JSON string as above:
jsonparse < : ' mainKey ' mainValue : ' subKey ( < : ' key1 ' value1 > < : ' key2 ' value2 > ) >
# -> {"mainKey": "mainValue", "subKey": [{"key1": "value1"}, {"key2": "value2"}]}
Tcl offers quite a lot of flexibility; few languages will be as responsive to the programmer's whim as Tcl.
For completeness I will also demonstrate using the Tcllib huddle package mentioned by slebetman to create the kind of structure mentioned in the question, and serialize that into JSON:
package require huddle
# -> 0.1.5
set subDict1 [huddle create key1 value1]
# -> HUDDLE {D {key1 {s value1}}}
set subDict2 [huddle create key2 value2]
# -> HUDDLE {D {key2 {s value2}}}
set subDictList [huddle list $subDict1 $subDict2]
# -> HUDDLE {L {{D {key1 {s value1}}} {D {key2 {s value2}}}}}
set finalDict [huddle create mainKey mainValue subKey $subDictList]
# -> HUDDLE {D {mainKey {s mainValue} subKey {L {{D {key1 {s value1}}} {D {key2 {s value2}}}}}}}
huddle jsondump $finalDict {} {}
# -> {"mainKey":"mainValue","subKey":[{"key1":"value1"},{"key2":"value2"}]}
Another approach is to create regular Tcl structures and convert ("compile") them to huddle data according to a type specification:
set subDict1 [dict create key1 value1]
set subDict2 [dict create key2 value2]
set subDictList [list $subDict1 $subDict2]
set finalDict [dict create mainKey mainValue subKey $subDictList]
huddle compile {dict mainKey string subKey {list {dict * string}}} $finalDict
The result of the last command is the same as of the last huddle create command in the previous example.
Documentation: dict, join, lappend, lassign, lindex, list, lsearch, proc, puts, return, set, switch, while

Related

print dictionary keys and values in one column in tcl

I am new to the Tcl scripting language and am using Tcl version 8.5. I read a text file with a Tcl script and count how often each word occurs. I used a foreach loop and a dictionary to count the words, but the program prints its output like this: alpha 4 beta 2 gamma 1 delta 1
But I want each key/value pair of the dictionary printed on its own line. My script and its output follow.
set f [open input.txt]
set text [read $f]
foreach word [split $text] {
    dict incr words $word
}
puts $words
Output of the above script:
alpha 4 beta 2 gamma 1 delta 1
You would do:
dict for {key value} $words {
    puts "$key $value"
}
When reading the dict documentation, take care about which subcommands require a dictionaryVariable (like dict incr) and which require a dictionaryValue (like dict for).
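The difference shows up in whether the argument is written with or without a dollar sign. A minimal sketch, with made-up sample words:

```tcl
set words {}
# dict incr takes the NAME of the variable holding the dict and updates it in place
dict incr words alpha
dict incr words alpha
dict incr words beta

# dict for takes the dict VALUE, so the variable is dereferenced with $
set lines {}
dict for {key value} $words {
    lappend lines "$key $value"
}
puts [join $lines \n]
# => alpha 2
# => beta 1
```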
For nice formatting, as suggested by Donal, here's a very terse method (lmap needs Tcl 8.6 or later):
set maxWid [tcl::mathfunc::max {*}[lmap w [dict keys $words] {string length $w}]]
dict for {word count} $words {puts [format "%-*s = %s" $maxWid $word $count]}
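For Tcl 8.5, which the question mentions, a plain foreach loop can stand in for lmap when computing the column width. This is only a sketch that uses the counts from the question as sample data:

```tcl
set words [dict create alpha 4 beta 2 gamma 1 delta 1]

# Find the longest key without lmap
set maxWid 0
foreach w [dict keys $words] {
    if {[string length $w] > $maxWid} {
        set maxWid [string length $w]
    }
}

set formatted {}
dict for {word count} $words {
    lappend formatted [format "%-*s = %s" $maxWid $word $count]
}
puts [join $formatted \n]
# => alpha = 4
# => beta  = 2
# => gamma = 1
# => delta = 1
```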
Or, look at the source code for the parray command for further inspiration:
parray tcl_platform ;# to load the proc
info body parray

Extracting query string value

How to extract the username value from this query string (HTTP url-encoded): username=james&password=pwd in Tcl?
I can get it through Java's request.getParameter("username"); but how to get using Tcl?
The first stage is to split the query string up, and form a dictionary of it (which isn't strictly correct, but I'm guessing you don't care about the case where someone puts multiple username fields in the query string!). However, you also need to decode the encoding of the contents, and that's pretty awful:
proc QueryStringToDict {qs} {
    set mapping {}
    foreach item [split $qs "&"] {
        if {[regexp {^([^=]+)=(.*)$} $item -> key value]} {
            dict set mapping [DecodeURL $key] [DecodeURL $value]
        }
    }
    return $mapping
}
proc DecodeURL {string} {
    # This *is* tricky! The URL encoding of fields is way nastier than you thought!
    # Turn + into a space and escape Tcl metacharacters so that subst is safe
    set mapped [string map {+ { } \[ "\\\[" \] "\\\]" $ "\\$" \\ "\\\\"} $string]
    # Decode the %XX sequences in the escaped copy, not in the raw input
    encoding convertfrom utf-8 \
        [subst [regsub -all {%([[:xdigit:]]{2})} $mapped {[format %c 0x\1]}]]
}
set qs "username=james&password=pwd"
set info [QueryStringToDict $qs]
puts "user name is [dict get $info username]"
In 8.7 (currently in alpha) it'll be much simpler to do that inner encoding; there won't need to be that subst call in there for example. But you haven't got that version of Tcl; nobody has (except for people who insist on being right on the bleeding edge and get themselves into trouble over it).
Assuming this is a CGI environment, where the environment will contain
REQUEST_METHOD=GET
QUERY_STRING='username=james&password=pwd'
or
REQUEST_METHOD=POST
CONTENT_LENGTH=27
# and stdin contains "username=james&password=pwd"
then use tcllib's ncgi module
$ cat > cgi.tcl
#!/usr/bin/env tclsh
package require ncgi
::ncgi::parse
array set params [::ncgi::nvlist]
parray params
$ printf "username=james&password=pwd" | env REQUEST_METHOD=POST CONTENT_LENGTH=27 ./cgi.tcl
params(password) = pwd
params(username) = james
$ env REQUEST_METHOD=GET QUERY_STRING='username=james&password=pwd' ./cgi.tcl
params(password) = pwd
params(username) = james
An alternative to Donal's suggestion, sharing the spirit, but building on battery pieces: tcllib rest package:
(1) To process the query (as part of a valid URL)
% package req rest
1.3.1
% set query [rest::parameters ?username=jo%3Dhn]; # http:// is default scheme, ? is minimum URL boilerplate
username jo%3Dhn
(2) Run a URL decoder (e.g., the one by Donal or the one from Rosetta code):
% proc urlDecode {str} {
    set specialMap {"[" "%5B" "]" "%5D"}
    set seqRE {%([0-9a-fA-F]{2})}
    set replacement {[format "%c" [scan "\1" "%2x"]]}
    set modStr [regsub -all $seqRE [string map $specialMap $str] $replacement]
    return [encoding convertfrom utf-8 [subst -nobackslash -novariable $modStr]]
}
then:
% set info [lmap v $query {urlDecode $v}]
username jo=hn
% dict get $info username
jo=hn

Tcl - Differentiate between list/dict and anonymous proc

I wrote the following proc, which simulates the filter function in Lodash (javascript library) (https://lodash.com/docs/4.17.4#filter). You can call it in 3.5 basic formats, seen in the examples section. For the latter three calling options I would like to get rid of the requirement to send in -s (shorthand). In order to do that I need to differentiate between an anonymous proc and a list/dict/string.
I tried looking at string is, but there isn't a string is proc. In researching here: http://wiki.tcl.tk/10166 I found they recommend info complete, however in most cases the parameters would pass that test regardless of the type of parameter.
Does anyone know of a way to reliably test this? I know I could leave it or change the proc definition, but I'm trying to stay as true as possible to Lodash.
Examples:
set users [list \
    [dict create user barney age 36 active true] \
    [dict create user fred age 40 active false] \
]
1. set result [_filter [list 1 2 3 4] {x {return true}}]
2. set result [_filter $users -s [dict create age 36 active true]]
3. set result [_filter $users -s [list age 36]]
4. set result [_filter $users -s "active"]
Proc Code:
proc _filter {collection predicate args} {
    # They want to use shorthand syntax
    if {$predicate=="-s"} {
        # They passed a list/dict
        if {[_dictIs {*}$args]} {
            set predicate {x {
                upvar args args
                set truthy 1
                dict for {k v} {*}$args {
                    if {[dict get $x $k]!=$v} {
                        set truthy false
                        break
                    }
                }
                return $truthy
            }}
        # They passed just an individual string
        } else {
            set predicate {x {
                upvar args args;
                if {[dict get $x $args]} {
                    return true;
                }
                return false;
            }}
        }
    }
    # Start the result list and the index (which may not be used)
    set result {}
    set i -1
    # For each item in collection apply the iteratee.
    # Dynamically pass the correct parameters.
    set paramLen [llength [lindex $predicate 0]]
    foreach item $collection {
        set param [list $item]
        if {$paramLen>=2} {lappend param [incr i];}
        if {$paramLen>=3} {lappend param $collection;}
        if {[apply $predicate {*}$param]} {
            lappend result $item
        }
    }
    return $result
}
Is x {return true} a string, a list, a dictionary or a lambda term (the correct name for an anonymous proc)?
The truth is that it may be all of them; it would be correct to say it was a value that was a member of any of the mentioned types. You need to describe your intent more precisely and explicitly rather than hiding it inside some sort of type magic. That greater precision may be achieved by using an option like -s or by different main command names, but it is still necessary either way. You cannot correctly and safely do what you seek to do.
In a little more depth…
All Tcl values are valid as strings.
Lists have a defined syntax and are properly subtypes of strings. (They're implemented differently internally, but you are supposed to ignore such details.)
Dictionaries have a syntax that is equivalent to lists with even numbers of elements where the elements at the even indices are all unique from each other.
Lambda terms are lists with two or three elements (the third element is the name of the context namespace, and defaults to the global namespace if it is absent). The first element of the list needs to be a valid list as well.
A two-element list matches the requirements for all the above. In Tcl's actual type logic, it is simultaneously all of the above. A particular instantiation of the value might have a particular implementation representation under the covers, but that is a transient thing that does not reflect the true type of the value.
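The point can be checked with the value from the question. In this sketch the same two-element value is accepted by string, list, dict and lambda operations alike, so no test on the value alone can tell them apart:

```tcl
set v {x {return true}}

puts [string length $v]    ;# 15          (used as a string)
puts [llength $v]          ;# 2           (used as a list)
puts [dict get $v x]       ;# return true (used as a dict with key x)
puts [apply $v anything]   ;# true        (used as a lambda term)
```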
Tcl's type system is different to that of many other languages.

getting nested keys from tcl dict

If I have a nested dict in Tcl like so
dict set mydict1 A key1 value1
dict set mydict1 B key1 value1
dict set mydict1 B key2 value2
dict set mydict1 C key3 value3
I'd like to identify the list of all of the second-level keys in this dictionary. In this case, I would like to know that the second-level keys are key1, key2 and key3.
Is there a way to get this list of values from this dictionary directly?
There's no built-in command for doing this, nor even a way for the code to know on your behalf that that's what the structure is (which is a consequence of the type system in Tcl). However, if you know that there are definitely always two levels, it's not too hard to code it yourself.
proc two-level-enumerate {dict} {
    set keypairs {}
    dict for {key1 subdict} $dict {
        foreach key2 [dict keys $subdict] {
            lappend keypairs [list $key1 $key2]
            ### Depending on what you're doing, you might prefer:
            # lappend keypairs $key1 $key2
        }
    }
    return $keypairs
}
The tricky bit for the generic Tcl layer is knowing that there's two levels, as it can't safely use the internal types on values (the types of literals are quite tricky, and on the flip side, determining the intended structure vs what you've happened to put in beneath it is also awkward). Being explicit — the code above — works much better.
Thanks to Donal I was able to further refine the solution to the following (which requires tcl >= 8.6 for the lmap):
lsort -unique [concat {*}[lmap k1 [dict keys $mydict1] {dict keys [dict get $mydict1 $k1]}]]
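On Tcl 8.5, where lmap is not available, the same set of second-level keys can be collected with nested loops. A sketch, assuming as above that the dictionary always has exactly two levels; the variable name seen is illustrative, and a throwaway dict is used to drop duplicates:

```tcl
dict set mydict1 A key1 value1
dict set mydict1 B key1 value1
dict set mydict1 B key2 value2
dict set mydict1 C key3 value3

# Use the keys of a scratch dict as a set of second-level keys
set seen [dict create]
dict for {k1 subdict} $mydict1 {
    foreach k2 [dict keys $subdict] {
        dict set seen $k2 1
    }
}
puts [dict keys $seen]
# => key1 key2 key3
```

Unlike lsort -unique, this keeps the keys in first-seen order rather than sorted order.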

"can't read:variable is array" error in ns2

In ns2, I declared a simple array using
array set ktree {}
then I tried to use it as a GOD variable as
create-god $ktree
but this gives the error
can't read "ktree": variable is array
while executing
"create-god $ktree {}"
Any help is greatly appreciated.
In Tcl, $varName means “read from the variable called varName” and is not a general reference to the variable (unlike some other languages, notably Perl and PHP, which do rather different things). Reading from a whole array, instead of an element of that array, is always an error in Tcl.
To pass an array to a command, you pass the name of that array in. It's then up to that command to access it as it sees fit. For procedures and methods written in Tcl, it'll typically involve upvar to link the array into a local view. (Things written directly in C or C++ have far fewer restrictions as they don't automatically push a Tcl stack frame.)
Note however that the command must be expecting the name of an array when you pass that name in. (Good programmers will document this fact, of course.) Whether create-god does, I really have no idea; it's not a general Tcl command but rather something that's more specific. (Part of ns2? Or maybe your own code.)
Example of passing in an array
An example of passing in an array by name is the parray command that should be part of every Tcl distribution. It's a procedure that prints an array out. Here's the source code without a few boiler-plate comments:
proc parray {a {pattern *}} {
    upvar 1 $a array
    if {![array exists array]} {
        error "\"$a\" isn't an array"
    }
    set maxl 0
    set names [lsort [array names array $pattern]]
    foreach name $names {
        if {[string length $name] > $maxl} {
            set maxl [string length $name]
        }
    }
    set maxl [expr {$maxl + [string length $a] + 2}]
    foreach name $names {
        set nameString [format %s(%s) $a $name]
        puts stdout [format "%-*s = %s" $maxl $nameString $array($name)]
    }
}
The key thing here is that we first see upvar 1 to bind the named variable in the caller to a local variable, and a test with array exists to see if the user really passed in an array (so as to give a good error message rather than a rubbishy one). Everything else then is just the implementation of how to actually pretty-print an associative array (finding out the max key length and doing some formatted output); it's just plain Tcl code.
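A much smaller sketch of the same pattern follows. The proc count_elements is hypothetical and has nothing to do with ns2. It shows both the working call, which passes the array name, and the failing call, which tries to read the whole array with a dollar sign:

```tcl
# Hypothetical helper: receives the NAME of an array in the caller's scope
proc count_elements {arrayName} {
    upvar 1 $arrayName arr
    if {![array exists arr]} {
        error "\"$arrayName\" isn't an array"
    }
    return [array size arr]
}

array set ktree {a 1 b 2}

puts [count_elements ktree]   ;# 2

# Writing $ktree fails before the command is even called
catch {count_elements $ktree} msg
puts $msg                     ;# can't read "ktree": variable is array
```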