Tcl binary scan is not setting bytes - tcl

I am new to tcl and I need to read a binary file and compare first and last byte of data but I discovered that binary scan is not working for me (even the examples)
Here is the snippet
set fp [open $temp_file_name]
defer { close $fp }
fconfigure $fp -translation binary
# ...
seek $fp 0
set m [binary scan [read $fp 1] h* v]
puts $v
# can't read "v": no such variable
# m is 0
With the example
> puts [binary scan abcdefg s3s first second]
> puts $first
## can't read "first": no such variable
> puts $second
## can't read "second": no such variable
What am I doing wrong?

The result of binary scan is the number of format groups that have been satisfied. If we consider the case:
binary scan abcdefg s3s first second
The result of this is 1, because there are enough bytes to only satisfy the first format group, s3. That consumes six bytes (three little-endian 16-bit quantities, using up abcdef). There's insufficient data to satisfy the following s; there'd need to be another byte for that.
% info patchlevel
8.6.10
% puts [binary scan abcdefg s3s first second]
1
% puts $first
25185 25699 26213
% puts $second
can't read "second": no such variable
With the other code:
set m [binary scan [read $fp 1] h* v]
This will read a byte from the stream, convert into little-endian hex digits (I don't like the h conversion very much; H seems more sensible to me, especially for single bytes) and stores all those hex characters in the variable v. It should return 1 if anything is read from the stream at all.
I've no idea at all why binary scan is not working for you. It does not appear to be functioning in accordance with documented behaviour at all. To double-check, you haven't accidentally replaced the ::binary or ::tcl::binary::scan (which is the default implementation of binary scan) commands, have you?
Since things are behaving massively out of spec, it's hard to suggest how to fix that.

Related

How to Convert Hex String to Unicode character byte string in Tcl

Suppose I have a hex string like 905082.
Now I want convert it it corresponding unicode character in Tcl. I have used following code:
set charstring ""
set hexstring 905082
set len [string length $hexstring]
for {set i 0} {$i < [expr $len / 2]} {incr i} {
set j [expr 2 * $i]
set char [string range $hexstring $j [expr $j + 1]]
set charstring $charstring[format %c [format %i 0x$char]]
}
puts $charstring
but it is not working... Probably it contains such hex value which represents non-printable unicode character. So how can I do this so that I can convert the hex string to unicode byte string so that I can pass it to C function using SWIG?
binary format formatString ?arg arg ...? (link) is your friend.
The binary format command generates a binary string whose layout is specified by the formatString >and whose contents come from the additional arguments. The resulting binary value is returned.
In your case you would be feeding hex characters to the binary format function, so your formatString would be h* or H*, depending on whether your MSB is the rightmost character (H*) or your leftmost character (h*)
binary format H* 905082 will return three bytes containing the raw data 0x905082 even if the string representation may not always be printable.
Store the command output in a variable and you're good to go.
PS: an alternative would be to retrieve the integer value of the string you want to parse with scan (link).
scan $hexstring %x intvalue and the integer value would be stored in $intvalue
U+905082 isn't a Unicode character. The Unicode spec explicitly states that the limit is U+10FFFF. It's also not a UTF-8 encoding of a single character (since the byte 0x50 always encodes a single character in UTF-8 by itself, a P). So whatever you're after, it's not a single character.
So what might we have open to us? Well, we can convert to a sequence of bytes:
set bytes [binary format H* "905082"]
Well, that's it! We're now in Unicode! The bytes are now converted the same sequence of characters as you'd get with "\u0090\u0050\u0082". But maybe you wanted them in a specific encoding? Well, that's where you might use encoding convertto to change to a different sequence of bytes;
set bytesTurnedToUtf8 [encoding convertto utf-8 $bytes]
If we knew that they were in some other encoding, we could use encoding convertfrom to do the reverse operation. Mind you, most of the time you don't need to spend a lot of time thinking about encodings in Tcl; the Tcl runtime manages string encodings for you and you hardly ever have to think about things other than as sequences of characters.

TCL -- How to store and print table of values?

I know that you can "hack" nesting associative arrays in tcl, and I also know that with dictionaries (which I have no experience with) you can nest them pretty easily. I'm trying to find a way to store the values of a function that has two variables, and then at the end I just want to print out a table of the two variables (column and row headers) with the function values in the cells. I can make this work, but it is neither succinct nor efficient.
Here's what should be printed. The rows are values of b and columns are values of a (1,2,3,4,5 for simplicity):
b
1 2 3 4 5
1 y(1,1) y(1,2) y(1,3) y(1,4) y(1,5)
2 y(2,1) y(2,2) y(2,3) y(2,4) y(2,5)
a 3 y(3,1) y(3,2) y(3,3) y(3,4) y(3,5)
4 y(4,1) y(4,2) y(4,3) y(4,4) y(4,5)
5 y(5,1) y(5,2) y(5,3) y(5,4) y(5,5)
To store this, I imagine I would simply do two nested for loops over a and b and somehow store the results in nested dictionaries. Like have one dictionary with 5 entries, 1 for each value of b, and each entry in this is another dictionary for each value of b.
To print it, the only way I can think of is to just explicitly print out each table line and call each dictionary entry. I'm not too versed in output formatting with tcl, but I can probably manage there.
Can anyone think of a more elegant way to do this?
Here are a couple of examples on how you might use the struct::matrix package.
Example 1 - Simple Create/Display
package require struct::matrix
package require Tclx
# Create a 3x4 matrix
set m [::struct::matrix]
$m add rows 3
$m add columns 4
# Populate data
$m set rect 0 0 {
{1 2 3 4}
{5 6 7 8}
{9 10 11 12}
}
# Display it
puts "Print matrix, cell by cell:"
loop y 0 [$m rows] {
loop x 0 [$m columns] {
puts -nonewline [format "%4d" [$m get cell $x $y]]
}
puts ""
}
Output
Print matrix, cell by cell:
1 2 3 4
5 6 7 8
9 10 11 12
Discussion
In the first part of the script, I created a matrix, add 3 rows and 4 columns--a straight forward process.
Next, I called set rect to populate the matrix with data. Depend on your need, you might want to look into set cell, set column, or set row. For more information, please consult the reference for struct::matrix.
When it comes to displaying the matrix, instead of using the Tcl's for command, I prefer the loop command from the Tclx package, which is simpler to read and use.
Example 2 - Read from a CSV file
package require csv
package require struct::matrix
package require Tclx
# Read a matrix from a CSV file
set m [::struct::matrix]
set fileHandle [open data.csv]
::csv::read2matrix $fileHandle $m "," auto
close $fileHandle
# Displays the matrix
loop y 0 [$m rows] {
loop x 0 [$m columns] {
puts -nonewline [format "%4d" [$m get cell $x $y]]
}
puts ""
}
The data file, data.csv:
1,2,3,4
5,6,7,8
9,10,11,12
Output
1 2 3 4
5 6 7 8
9 10 11 12
Discussion
The csv package provides a simple way to read from a CSV file to a matrix.
The heart of the operation is in the ::csv::read2matrix command, but before that, I have to create an empty matrix and open the file for reading.
The code to display the matrix is the same as previous example.
Conclusion
While the struct::matrix package seems complicated at first; I only need to learn a couple of commands to get started.
Elegance is in the eye of the beholder :)
With basic core Tcl, I think you understand your options reasonably well. Either arrays or nested dictionaries have clunky edges when it comes to tabular oriented data.
If you are willing to explore extensions (and Tcl is all about the extension) then you might consider the matrix package from the standard Tcl library. It deals with rows and columns as key concepts. If you need to do transformations on tabular data then I would suggest TclRAL, a relational algebra library that defines a Relation data type and will handle all types of tabular data and provide a large number of operations on it. Alternatively, you could try something like SQLite which will also handle tabular data, provide for manipulating it and has robust persistent storage. The Tcl wiki will direct you to details of all of these extensions.
However, if these seem too heavyweight for your taste or if you don't want to suffer the learning curve, rolling up your sleeves and banging out an array or nested dictionary solution, while certainly being rather ad hoc, is probably not that difficult. Elegant? Well, that's for you to judge.
Nested lists work reasonably well for tabular data from 8.4 onwards (with multi-index lindex and lset) provided you've got compact numeric indices. 8.5's lrepeat is good for constructing an initial matrix too.
set mat [lrepeat 5 [lrepeat 5 0.0]]
lset mat 2 3 1.3
proc printMatrix {mat} {
set height [llength $mat]
set width [llength [lindex $mat 0]]
for {set j 0} {$j < $width} {incr j} {
puts -nonewline \t$j
}
puts ""
for {set i 0} {$i < $height} {incr i} {
puts -nonewline $i
for {set j 0} {$j < $width} {incr j} {
puts -nonewline \t[lindex $mat $i $j]
}
puts ""
}
}
printMatrix $mat
You should definitely consider using the struct::matrix and report packages from tcllib.
package require csv
package require struct::matrix
array set OPTS [subst {
csv_input_filename {input.csv}
}]
::struct::matrix indata
set chan [open $OPTS(csv_input_filename)]
csv::read2matrix $chan indata , auto
close $chan
# prints matrix as list format
puts [join [indata get rect 0 0 end end] \n];
# prints matrix as csv format
csv::joinmatrix indata
# cleanup
indata destroy
This is a one-liner way to print out a matrix as list or csv format respectively.

How to convert 64 bit binary string in to a decimal number in tcl?

I'm trying to convert a 64 bit binary string value () in to an integer but I'm apparently hitting the tcl limit and getting:
integer value too large to represent
Binary String:
000000000000101000000000000010100000000000000100
The proc I'm using to convert from binary to integer is:
proc binary2decimal {bin} {
binary scan [binary format B64 [format %064d $bin]] I dec
return $dec
}
I tried using 64 for formatting and also tried LI to no avail. I read about "wide" but did not understand if it could be applied here.
P.S: I'm using tcl 8.4
First off, the error is coming from format, which isn't very happy with 64-bit numbers by default (for backward compatibility reasons) but we're really working with a string of characters there, so %064s (s, not d) is a suitable fix.
You really ought to switch to using Tcl 8.5 where this is much easier:
binary scan [binary format B64 [format "%064s" $bin]] W dec
Or even just:
set dec [expr "0b$bin"]
In 8.4, you've got to do more work as the 64-bit support was a lot more primitive.
binary scan [binary format B64 [format %064s $bin]] II d1 d2
set dec [expr {wide($d1)<<32 | wide($d2)}]
That's slightly non-obvious, but the I format is and always has been documented to work with 32-bit values only. So we stitch the two (big-endian) half-values back together after scanning.

How to handle long integers in TCL with binary format command?

% binary scan [ binary format i* 146366987889541120] B* g
integer value too large to represent
Can anyone help me in computing a long integer value using binary format commnand .
But we are getting error and there is no way of representing 'l' in the syntax (like in the format command format %lx 146366987889541120).
% format %lx 146366987889541120
208000000000000
%
%
% format %x 146366987889541120
0
%
Can anyone sugggest me a way to solve this ?
Edit: Just use
format %llb 146366987889541120
If you want a binary string representing this number.
If you don't deal with wide integer (bigger than 64-bit, added with Tcl 8.5), you can use
binary scan [ binary format w 146366987889541120] B* g
But if you have longer integers, the best workarround that I found is:
# convert a binary string to a large integer
binary scan $bytes H* temp
set num [format %lli 0x$temp]
# convert a number to a binary string.
set bytes [binary format H* [format %llx $num]]
This has some bugs (leading zero), so you want to add a 0 for binary format.
set hex [format %llx $num]
if {[string length $hex]%2} {set hex 0$hex}
set bytes [binary format H* $hex]
But I doubt that this approach yields the best performance.

Convert Tektronix's RIBinary data in TCL

I am pulling data from a Tektronix oscilloscope in Tektronix' RIBinary format using a TCL script, and then within the script I need to convert that to a decimal value.
I have done very little with binary conversions in the first place, but to add to my frustration the documentation on this binary format is also very vague in my opinion. Anyway, here's my current code:
proc ::Scope::CaptureWaveform {VisaAlias Channel} {
# Apply scope settings
::VISA::Write $VisaAlias "*WAI"
::VISA::Write $VisaAlias "DATa:STARt 1"
::VISA::Write $VisaAlias "DATa:STOP 4000"
::VISA::Write $VisaAlias "DATa:ENCdg RIBinary"
::VISA::Write $VisaAlias "DATa:SOUrce $Channel"
# Download waveform
set RIBinaryWaveform [::VISA::Query $VisaAlias "CURVe?"]
# Parse out leading label from scope output
set RIBinaryWaveform [string range $RIBinaryWaveform 11 end]
# Convert binary data to a binary string usable by TCL
binary scan $RIBinaryWaveform "I*" TCLBinaryWaveform
set TCLBinaryWaveform
# Convert binary data to list
}
Now, this code pulls the following data from the machine:
-1064723993 -486674282 50109321 -6337556 70678 8459972 143470359 1046714383 1082560884 1042711231 1074910212 1057300801 1061457453 1079313832 1066305613 1059935120 1068139252 1066053580 1065228329 1062213553
And this is what the machine pulls when I just take regular ASCII data (i.e. what the above data should look like after the conversion):
-1064723968 -486674272 50109320 -6337556 70678 8459972 143470352 1046714368 1082560896 1042711232 1074910208 1057300800 1061457472 1079313792 1066305600 1059935104 1068139264 1066053568 1065228352 1062213568
Finally, here is a reference to the RIBinary specification from Tektronix since I don't think it is a standard data type:
http://www.tek.com/support/faqs/how-binary-data-represented-tektronix-oscilloscopes
I've been looking for a while now on the Tektronix website for more information on converting the data and the above URL is all I've been able to find, but I'll comment or edit this post if I find any more information that might be useful.
Updates
Answers don't necessarily have to be in TCL. If anyone can help me logically work through this on a high level I can hash out the TCL details (this I think would be more helpful to others as well)
The reason I need to transfer the data in binary and then convert it afterwards is for the purpose of optimization. Due to this I can't have the device perform the conversion before the transfer as it will slow down the process.
I updated my code some and now my results are maddeningly close to the actual results. I assume it may have something to do with the commas that are in the data originally.
Below are now examples of the raw data sent from the device without any of my parsing.
On suggestion from #kostix, I made a second script with code he gave me that I modified to fit my data set. It can be seen below, however the result are exactly the same as my above code.
ASCIi:
:CURVE -1064723968,-486674272,50109320,-6337556,70678,8459972,143470352,1046714368,1082560896,1042711232,1074910208,1057300800,1061457472,1079313792,1066305600,1059935104,1068139264,1066053568,1065228352,1062213568
RIBinary:
:CURVE #280ÀçâýðüÿKì
Note on RIBinary - ":CURVE #280" is all part of the header that I need to parse out, but the #280 part of it can vary depending on the data I'm collecting. Here's some more info from Tektronix on what the #280 means:
block is the waveform data in binary format. The waveform is formatted
as: # where is the number of y bytes. For
example, if = 500, then = 3. is the number of bytes to
transfer including checksum.
So, for my current data set x = 2 and yyy = 80. I am just really unfamiliar with converting binary data, so I'm not sure what to do programmatically to deal with the block format.
On suggestion from #kostix I made a second script with code he gave me that I modified to fit my data set:
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
binary scan $RIBinaryWaveform a8a curv nbytes
encoding convertfrom ascii ${curv}
scan $nbytes %u n
set n
set headerlen [expr {$n + 9}]
binary scan $RIBinaryWaveform #9a$n nbytes
scan $nbytes %u n
set n
set numints [expr {$n / 4}]
binary scan $RIBinaryWaveform #${headerlen}I${numints} data
set data
The output of this code is the same as the code I provided above.
According to the documentation you link to, RIBinary is signed big-endian. Thus, you convert the binary data to integers with binary scan $data "I*" someVar (I* means “as many big-endian 4-byte integers as you can”). You use the same conversion with RPBinary (if you've got that) but you then need to chop each value to the positive 32-bit integer range by doing & 0xFFFFFFFF (assuming at least Tcl 8.5). For FPBinary, use R* (requires 8.5). SRIBinary, SRPBinary and SFPBinary are the little-endian versions, for which you use lower-case format characters.
Getting conversions correct can take some experimentation.
I have no experience with this stuff but like googleing. Here are my findings.
This document, in the section titled "Formatted I/O Operations" tells that the viQueryf() standard C API function combines viPrintf() (writing to a device) with viScanf() (reading from a device), and examples include calls like viQueryf (io, ":CURV?\n", "%#b", &totalPoints, rdBuffer); (see the section «IEEE-488.2 Binary Data—"%b"»), where the third argument to the function specifies the desired format.
The VISA::Query procedure from your Tcl library pretty much resembles that viQueryf() in my eyes, so I'd expect it to accept the third (optional) argument which specifies the format you want the data to be in.
If there's nothing like it, let's look at your ASCII data. Your FAQ entry and the document I found both specify that the opaque data might come in the form of a series of integers of different size and endianness. The "RIBinary" format states it should be big-endian signed integers.
The binary scan Tcl command is able to scan 16-bit and 32-bit big-endian integers from a byte stream — use the S* and I* formats, correspondingly.
Your ASCII data clearly looks like 32-bit integers, so I'd try scanning using I*.
Also see this doc — it appears to have much in common with the PDF guide I linked above, but might be handy anyway.
TL;DR
Try studying your API to find a way to explicitly tell the device the data format you want. This might produce a more robust solution in the case the device might be somehow reconfigured externally to change its default data format effectively pulling the rug under the feet of your code which relies on certain (guessed) default.
Try interpreting the data as outlined above and see if the interpretation looks sensible.
P.S.
This might mean nothing at all, but I failed to find any example which has "e" between the "CURV" and the "?" in the calls to viQueryf().
Update (2013-01-17, in light of the new discoveries about the data format): to binary scan the data of varying types, you might employ two techniques:
binary scan accepts as many specifiers in a row, you like; they're are processed from left to right as binary scan reads the supplied data.
You can do multiple runs of binary scanning over a chunk of your binary data either by cutting pieces of this chunk (string manipulation Tcl commands understand they're operating on a byte array and behave accordingly) or use the #offset term in the binary scan format string to make it start scanning from the specified offset.
Another technique worth employing here is that you'd better first train yourself on a toy example. This is best done in an interactive Tcl shell — tkcon is a best bet but plain tclsh is also OK, especially if called via rlwrap (POSIX systems only).
For instance, you could create a fake data for yourself like this:
% set b [encoding convertto ascii ":CURVE #224"]
:CURVE #224
% append b [binary format S* [list 0 1 2 -3 4 -5 6 7 -8 9 10 -11]]
:CURVE #224............
Here we first created a byte array containing the header and then created another byte array containing twelve 16-bit integers packed MSB first, and then appended it to the first array essentially creating a data block our device is supposed to return (well, there's less integers than the device returns). encoding convertto takes the name of a character encoding and a string and produces a binary array of that string converted to the specified encoding. binary format is told to consume a list of arbitrary size (* in the format list) and interpret it as a list of 16-bit integers to be packed in the big-endian format — the S format character.
Now we can scan it back like this:
% binary scan $b a8a curv nbytes
2
% encoding convertfrom ascii $curv
:CURVE #
% scan $nbytes %u n
1
% set n
2
% set headerlen [expr {$n + 9}]
11
% binary scan $b #9a$n nbytes
1
% scan $nbytes %u n
1
% set n
24
% set numints [expr {$n / 2}]
12
% binary scan $b #${headerlen}S${numints} data
1
% set data
0 1 2 -3 4 -5 6 7 -8 9 10 -11
Here we proceeded like this:
Interpret the header:
Read the first eight bytes of the data as ASCII characters (a8) — this should read our :CURVE # prefix. We convert the header prefix from the packed ASCII form to the Tcl's internal string encoding using encoding convertfrom.
Read the next byte (a) which is then interpreted as the length, in bytes, of the next field, using the scan command.
We then calculate the length of the header read so far to use it later. This values is saved to the "headerlen" variable. The length of the header amounts to the 9 fixed bytes plus variable-number of bytes (2 in our case) specifying the length of the following data.
Read the next field which will be interpreted as the "number of data bytes" value.
To do this, we offset the scanner by 9 (the length of ":CURVE #2") and read so many ASCII bytes as obtained on the previous step, so we use #9a$n for the format: $n is just obtaining the value of a variable named "n", and it will be 2 in our case. Then we scan the obtained value and finally get the number of the following raw data.
Since we will read 16-bit integers, not bytes, we divide this number by 2 and store the result to the "numints" variable.
Read the data. To do this, we have to offset the scanner by the length of the header. We use #${headerlen}S${numints} for the format string. Tcl expands those ${varname} before passing the string to the binary scan so the actual string in our case will be #11S12 which means "offset by 11 bytes then scan 12 16-bit big-endian integers".
binary scan puts a list of integers to the variable which name is passed, so no additional decoding of those integers is needed.
Note that in the real program you should probably do certain sanity checks:
* After the first step check that the static part of the header is really ":CURVE #".
* Check the return value of binary scan and scan after each invocation and check it equals to the number of variables passed to the command (which means the command was able to parse the data).
One more insight. The manual you cited says:
is the number of bytes to transfer including checksum.
so it's quite possible that not all of those data bytes represent measures, but some of them represent the checksum. I don't know what format (and hence length) and algorithm and position of this checksum is. But if the data does indeed include a checksum, you can't interpret it all using S*. Instead, you will probably take another approach:
Extract the measurement data using string range and save it to a variable.
binary scan the checksum field.
Calculate the checksum on the data obtained on the first step, verify it.
Use binary scan on the extracted data to get back your measurements.
Checksumming procedures are available in tcllib.
# Download waveform
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
# Extract block format data
set ResultCount [expr [string range ${RIBinaryWaveform} 2 [expr [string index${RIBinaryWaveform} 1] + 1]] / 4]
# Parse out leading label from Tektronics block format
set RIBinaryWaveform [string range ${RIBinaryWaveform} [expr [string index ${RIBinaryWaveform} 1] + 2] end]
# Convert binary data to integer values
binary scan ${RIBinaryWaveform} "I${ResultCount}" Waveform
set Waveform
Okay, the code above does the magic trick. This is very similar to all the things discussed on this page, but I figured I needed to clear up the confusion about the numbers from the binary conversion being different from the numbers received in ASCII.
After troubleshooting with a Tektronix application specialist we discovered that the data I had been receiving after the binary conversion (the numbers that were off by a few digits) were actually the true values captured by the scope.
The reason the ASCII values are wrong is a result of the binary-to-ASCII conversion done by the instrument and then the incorrect values are then passed by the scope to TCL.
So, we had it right a few days ago. The instrument was just throwing me for a loop.