What does binary scan "xyz" H* do? - tcl

I see the following code:
binary scan "xyz" H* var
It puzzles me: binary scan is supposed to scan a binary stream and construct string-type variables, but here the input is "xyz"...?
I did the following experiment inside tclsh:
% puts $var
78797a <== what is this?
% binary scan $var @1H y <== I mean to get "y"
1
% puts $y <== but I get "3"?
3
I am lost.
Could you explain what is going on?

Does it help to know that the hexadecimal value of the character 'x' is 0x78? Or that binary scan \x78\x79\x7a H* var2 is identical to your example? The examples in the 'binary scan' manual page under the 'H' conversion code explain it pretty well, I think.

In your code:
binary scan "xyz" H* var
The binary string is xyz, which is three bytes that are the ASCII values for x, y and z. We then ask for the var variable to be given the sequence of hex digits of the scanned bytes in big-endian order (very much the right thing for dealing with strings, BTW!), with twice as many hex digits as there are bytes in the binary string (because of the *). Let's double-check against what the documentation says:
The data is turned into a string of count hexadecimal digits in high-to-low order represented as a sequence of characters in the set “0123456789abcdef”. The data bytes are scanned in first to last order with the hex digits being taken in high-to-low order within each byte. Any extra bits in the last byte are ignored. If count is *, then all of the remaining hex digits in string will be scanned. If count is omitted, then one hex digit will be scanned. For example,
binary scan \x07\xC6\x05\x1f\x34 H3H* var1 var2
will return 2 with 07c stored in var1 and 051f34 stored in var2.
Now, there are three bytes in xyz, so there are six digits in 78797a. The first two hex digits, 78, are the hex for the ASCII code of x (check for yourself), and similarly 79 and 7a for y and z.
When you then do:
binary scan $var @1H y
you move the internal cursor in the string to the byte holding the character 8 (because of zero-based indexing), \x38, and because there's no count given to the H, it gets the first hex digit of 38 (i.e., 3) and puts that in the y variable.
To actually retrieve the y, you can just use string index or string range on the original binary string (all Tcl's string commands work just fine on binary data). Or you can use string range to get the hex digits out of var and binary format to convert back:
binary format H* [string range $var 2 3]
It's probably not a good idea to binary scan the results of binary scan. It's totally legal to do so, but the results are unlikely to illuminate.
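To see the whole round trip in one tclsh session (a sketch; the outputs follow directly from the rules above):
% binary scan "xyz" H* var
1
% set var
78797a
% binary format H* [string range $var 2 3]
y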

Related

How to handle long integers in TCL with binary format command?

% binary scan [ binary format i* 146366987889541120] B* g
integer value too large to represent
Can anyone help me in computing a long integer value using the binary format command?
We get an error, and there is no way of expressing 'l' in the binary format syntax (unlike in the format command, e.g. format %lx 146366987889541120).
% format %lx 146366987889541120
208000000000000
% format %x 146366987889541120
0
%
Can anyone suggest a way to solve this?
Edit: Just use
format %llb 146366987889541120
If you want a binary string representing this number.
If you don't deal with integers wider than 64 bits (Tcl 8.5 added arbitrary-precision integers beyond that), you can use
binary scan [binary format W 146366987889541120] B* g
(note the capital W: it packs the value big-endian, so the bits come out in natural order; lower-case w is little-endian).
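For example, this should produce (a sketch of the expected session; 0x0208000000000000 is the hex form of that number, so the leading bytes are 0x02 and 0x08):
% binary scan [binary format W 146366987889541120] B* g
1
% set g
0000001000001000000000000000000000000000000000000000000000000000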
But if you have longer integers, the best workaround that I found is:
# convert a binary string to a large integer
binary scan $bytes H* temp
set num [format %lli 0x$temp]
# convert a number to a binary string.
set bytes [binary format H* [format %llx $num]]
This has a bug: with an odd number of hex digits, binary format pads the final nibble on the right and corrupts the value, so you want to prepend a 0 before calling binary format.
set hex [format %llx $num]
if {[string length $hex]%2} {set hex 0$hex}
set bytes [binary format H* $hex]
But I doubt that this approach yields the best performance.
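For convenience, the workaround can be wrapped into two helper procs (a sketch; the names num2bytes and bytes2num are mine):
# Convert an arbitrarily large non-negative integer to a binary string.
proc num2bytes {num} {
    set hex [format %llx $num]
    # binary format pads an odd final nibble on the right, so force even length
    if {[string length $hex] % 2} {set hex 0$hex}
    return [binary format H* $hex]
}
# Convert a binary string back to a (possibly large) integer.
proc bytes2num {bytes} {
    binary scan $bytes H* hex
    return [format %lli 0x$hex]
}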

Convert Tektronix's RIBinary data in TCL

I am pulling data from a Tektronix oscilloscope in Tektronix's RIBinary format using a Tcl script, and then within the script I need to convert that to a decimal value.
I have done very little with binary conversions in the first place, but to add to my frustration the documentation on this binary format is also very vague in my opinion. Anyway, here's my current code:
proc ::Scope::CaptureWaveform {VisaAlias Channel} {
    # Apply scope settings
    ::VISA::Write $VisaAlias "*WAI"
    ::VISA::Write $VisaAlias "DATa:STARt 1"
    ::VISA::Write $VisaAlias "DATa:STOP 4000"
    ::VISA::Write $VisaAlias "DATa:ENCdg RIBinary"
    ::VISA::Write $VisaAlias "DATa:SOUrce $Channel"
    # Download waveform
    set RIBinaryWaveform [::VISA::Query $VisaAlias "CURVe?"]
    # Parse out leading label from scope output
    set RIBinaryWaveform [string range $RIBinaryWaveform 11 end]
    # Convert binary data to a binary string usable by Tcl
    binary scan $RIBinaryWaveform "I*" TCLBinaryWaveform
    set TCLBinaryWaveform
    # Convert binary data to list
}
Now, this code pulls the following data from the machine:
-1064723993 -486674282 50109321 -6337556 70678 8459972 143470359 1046714383 1082560884 1042711231 1074910212 1057300801 1061457453 1079313832 1066305613 1059935120 1068139252 1066053580 1065228329 1062213553
And this is what the machine pulls when I just take regular ASCII data (i.e. what the above data should look like after the conversion):
-1064723968 -486674272 50109320 -6337556 70678 8459972 143470352 1046714368 1082560896 1042711232 1074910208 1057300800 1061457472 1079313792 1066305600 1059935104 1068139264 1066053568 1065228352 1062213568
Finally, here is a reference to the RIBinary specification from Tektronix since I don't think it is a standard data type:
http://www.tek.com/support/faqs/how-binary-data-represented-tektronix-oscilloscopes
I've been looking for a while now on the Tektronix website for more information on converting the data and the above URL is all I've been able to find, but I'll comment or edit this post if I find any more information that might be useful.
Updates
Answers don't necessarily have to be in Tcl. If anyone can help me logically work through this on a high level, I can hash out the Tcl details (this, I think, would be more helpful to others as well).
The reason I need to transfer the data in binary and then convert it afterwards is for the purpose of optimization. Due to this I can't have the device perform the conversion before the transfer as it will slow down the process.
I updated my code some and now my results are maddeningly close to the actual results. I assume it may have something to do with the commas that are in the data originally.
Below are now examples of the raw data sent from the device without any of my parsing.
On suggestion from @kostix, I made a second script with code he gave me that I modified to fit my data set. It can be seen below; however, the results are exactly the same as with my code above.
ASCIi:
:CURVE -1064723968,-486674272,50109320,-6337556,70678,8459972,143470352,1046714368,1082560896,1042711232,1074910208,1057300800,1061457472,1079313792,1066305600,1059935104,1068139264,1066053568,1065228352,1062213568
RIBinary:
:CURVE #280ÀçâýðüÿKì
Note on RIBinary - ":CURVE #280" is all part of the header that I need to parse out, but the #280 part of it can vary depending on the data I'm collecting. Here's some more info from Tektronix on what the #280 means:
block is the waveform data in binary format. The waveform is formatted as: #<x><yyy><data>, where <x> is the number of digits in <yyy>. For example, if <yyy> = 500, then <x> = 3. <yyy> is the number of bytes to transfer, including checksum.
So, for my current data set <x> = 2 and <yyy> = 80. I am just really unfamiliar with converting binary data, so I'm not sure what to do programmatically to deal with the block format.
On suggestion from @kostix, I made a second script with the code he gave me, modified to fit my data set:
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
binary scan $RIBinaryWaveform a8a curv nbytes
encoding convertfrom ascii ${curv}
scan $nbytes %u n
set n
set headerlen [expr {$n + 9}]
binary scan $RIBinaryWaveform @9a$n nbytes
scan $nbytes %u n
set n
set numints [expr {$n / 4}]
binary scan $RIBinaryWaveform @${headerlen}I${numints} data
set data
The output of this code is the same as the code I provided above.
According to the documentation you link to, RIBinary is signed big-endian. Thus, you convert the binary data to integers with binary scan $data "I*" someVar (I* means “as many big-endian 4-byte integers as you can”). You use the same conversion with RPBinary (if you've got that) but you then need to chop each value to the positive 32-bit integer range by doing & 0xFFFFFFFF (assuming at least Tcl 8.5). For FPBinary, use R* (requires 8.5). SRIBinary, SRPBinary and SFPBinary are the little-endian versions, for which you use lower-case format characters.
Getting conversions correct can take some experimentation.
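Restating the above as code (a sketch; $data stands for the binary payload after any leading header has been stripped):
# RIBinary: as many signed big-endian 32-bit integers as will fit
binary scan $data I* values
# RPBinary: the same scan, then mask each value into the unsigned 32-bit range
binary scan $data I* raw
set values {}
foreach v $raw {lappend values [expr {$v & 0xFFFFFFFF}]}
# FPBinary: big-endian single-precision floats (needs Tcl 8.5)
binary scan $data R* values
# SRIBinary, SRPBinary and SFPBinary are the little-endian versions:
# use the lower-case format characters, e.g.
binary scan $data i* values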
I have no experience with this stuff, but I like googling. Here are my findings.
This document, in the section titled "Formatted I/O Operations", tells us that the viQueryf() standard C API function combines viPrintf() (writing to a device) with viScanf() (reading from a device), and examples include calls like viQueryf (io, ":CURV?\n", "%#b", &totalPoints, rdBuffer); (see the section «IEEE-488.2 Binary Data—"%b"»), where the third argument to the function specifies the desired format.
The VISA::Query procedure from your Tcl library pretty much resembles that viQueryf() in my eyes, so I'd expect it to accept a third (optional) argument specifying the format you want the data to be in.
If there's nothing like it, let's look at your ASCII data. Your FAQ entry and the document I found both specify that the opaque data might come in the form of a series of integers of different sizes and endianness. The "RIBinary" format states it should be big-endian signed integers.
The binary scan Tcl command is able to scan 16-bit and 32-bit big-endian integers from a byte stream — use the S* and I* formats, respectively.
Your ASCII data clearly looks like 32-bit integers, so I'd try scanning using I*.
Also see this doc — it appears to have much in common with the PDF guide I linked above, but might be handy anyway.
TL;DR
Try studying your API to find a way to explicitly tell the device the data format you want. This might produce a more robust solution in case the device is somehow reconfigured externally to change its default data format, pulling the rug from under the feet of code which relies on a certain (guessed) default.
Try interpreting the data as outlined above and see if the interpretation looks sensible.
P.S.
This might mean nothing at all, but I failed to find any example which has "e" between the "CURV" and the "?" in the calls to viQueryf().
Update (2013-01-17, in light of the new discoveries about the data format): to binary scan data of varying types, you might employ two techniques:
binary scan accepts as many specifiers in a row as you like; they are processed from left to right as binary scan reads the supplied data.
You can do multiple runs of binary scanning over a chunk of your binary data, either by cutting pieces off that chunk (string-manipulation Tcl commands understand when they're operating on a byte array and behave accordingly) or by using the @offset term in the binary scan format string to make it start scanning from the specified offset.
Another technique worth employing here is to first train yourself on a toy example. This is best done in an interactive Tcl shell — tkcon is the best bet, but plain tclsh is also OK, especially if called via rlwrap (POSIX systems only).
For instance, you could create fake data for yourself like this:
% set b [encoding convertto ascii ":CURVE #224"]
:CURVE #224
% append b [binary format S* [list 0 1 2 -3 4 -5 6 7 -8 9 10 -11]]
:CURVE #224............
Here we first created a byte array containing the header, then created another byte array containing twelve 16-bit integers packed MSB first, and appended it to the first array, essentially creating a data block like the one our device is supposed to return (well, with fewer integers than the device returns). encoding convertto takes the name of a character encoding and a string, and produces a byte array of that string converted to the specified encoding. binary format is told to consume a list of arbitrary size (* in the format string) and interpret it as a list of 16-bit integers to be packed in big-endian format — the S format character.
Now we can scan it back like this:
% binary scan $b a8a curv nbytes
2
% encoding convertfrom ascii $curv
:CURVE #
% scan $nbytes %u n
1
% set n
2
% set headerlen [expr {$n + 9}]
11
% binary scan $b @9a$n nbytes
1
% scan $nbytes %u n
1
% set n
24
% set numints [expr {$n / 2}]
12
% binary scan $b @${headerlen}S${numints} data
1
% set data
0 1 2 -3 4 -5 6 7 -8 9 10 -11
Here we proceeded like this:
Interpret the header:
Read the first eight bytes of the data as ASCII characters (a8) — this should read our :CURVE # prefix. We convert the header prefix from the packed ASCII form to Tcl's internal string encoding using encoding convertfrom.
Read the next byte (a) which is then interpreted as the length, in bytes, of the next field, using the scan command.
We then calculate the length of the header read so far, to use later. This value is saved to the "headerlen" variable. The length of the header amounts to 9 fixed bytes plus a variable number of bytes (2 in our case) specifying the length of the following data.
Read the next field, which will be interpreted as the "number of data bytes" value.
To do this, we offset the scanner by 9 (the length of ":CURVE #2") and read as many ASCII bytes as obtained in the previous step, so we use @9a$n for the format: $n just obtains the value of the variable named "n", which is 2 in our case. Then we scan the obtained value and finally get the number of bytes of raw data that follow.
Since we will read 16-bit integers, not bytes, we divide this number by 2 and store the result in the "numints" variable.
Read the data. To do this, we have to offset the scanner by the length of the header. We use @${headerlen}S${numints} for the format string. Tcl expands those ${varname} before passing the string to binary scan, so the actual string in our case will be @11S12, which means "offset by 11 bytes, then scan 12 16-bit big-endian integers".
binary scan puts a list of integers into the variable whose name is passed, so no additional decoding of those integers is needed.
Note that in the real program you should probably do certain sanity checks:
* After the first step check that the static part of the header is really ":CURVE #".
* Check the return value of binary scan and scan after each invocation, and check that it equals the number of variables passed to the command (which means the command was able to parse the data).
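A sketch of those checks on the toy example above (same variable names):
if {[binary scan $b a8a curv nbytes] != 2} {
    error "could not parse the header"
}
if {[encoding convertfrom ascii $curv] ne ":CURVE #"} {
    error "unexpected header prefix: $curv"
}
if {[scan $nbytes %u n] != 1} {
    error "length-of-length field is not a number: $nbytes"
}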
One more insight. The manual you cited says:
<yyy> is the number of bytes to transfer, including checksum.
so it's quite possible that not all of those data bytes represent measurements; some of them may represent the checksum. I don't know the format (and hence the length), the algorithm, or the position of this checksum. But if the data does indeed include a checksum, you can't interpret it all using S*. Instead, you will probably take another approach (a hypothetical sketch follows below):
Extract the measurement data using string range and save it to a variable.
binary scan the checksum field.
Calculate the checksum on the data obtained in the first step, and verify it.
Use binary scan on the extracted data to get back your measurements.
Checksumming procedures are available in tcllib.
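Purely as an illustration of that approach, here is a hypothetical sketch: the checksum convention below (a single trailing byte chosen so that all bytes sum to zero modulo 256) is common for instruments but is NOT confirmed by the Tektronix documentation, so treat every detail as an assumption:
# HYPOTHETICAL layout: payload bytes followed by one modulo-256 checksum byte
set payload [string range $block 0 end-1]
binary scan [string index $block end] c checksum
# Sum all payload bytes (c* yields signed bytes; modulo arithmetic copes with that)
binary scan $payload c* bytes
set sum 0
foreach byte $bytes {incr sum $byte}
if {($sum + $checksum) % 256 != 0} {
    error "checksum mismatch"
}
# Only then interpret the verified payload as measurements
binary scan $payload S* measurements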
# Download waveform
set RIBinaryWaveform [::VISA::Query ${VisaAlias} "CURVe?"]
# Extract block format data
set ResultCount [expr {[string range ${RIBinaryWaveform} 2 [expr {[string index ${RIBinaryWaveform} 1] + 1}]] / 4}]
# Parse out leading label from Tektronics block format
set RIBinaryWaveform [string range ${RIBinaryWaveform} [expr {[string index ${RIBinaryWaveform} 1] + 2}] end]
# Convert binary data to integer values
binary scan ${RIBinaryWaveform} "I${ResultCount}" Waveform
set Waveform
Okay, the code above does the magic trick. It is very similar to all the things discussed on this page, but I figured I needed to clear up the confusion about the numbers from the binary conversion being different from the numbers received in ASCII.
After troubleshooting with a Tektronix application specialist we discovered that the data I had been receiving after the binary conversion (the numbers that were off by a few digits) were actually the true values captured by the scope.
The ASCII values are wrong because of the binary-to-ASCII conversion done by the instrument; those incorrect values are then passed by the scope to Tcl.
So, we had it right a few days ago. The instrument was just throwing me for a loop.

Mysql: xor a string with a key

I want to Bitwise-XOR a string (actually its binary representation) with a KEY.
The result of the operation should be represented as HEX.
What I have:
'a' - the UTF-8 String to be changed.
'ACF123456' - the key in HEX.
Result seen as BIGINT:
select CONV(HEX('a'), 16, 10) ^ CONV('ACF123456', 16, 10);
Result seen as HEX:
select CONV( CONV(HEX('a'), 16, 10) ^ CONV('ACF123456', 16, 10), 10, 16);
Questions:
Is the conversion above done correctly?
What happens if the string is too long (i.e. instead of 'a' we have 'a veeeeeery long string')? It seems that the CONV() function has a limitation (is it the 64-bit precision mentioned in the documentation?), and besides, the XOR operator ^ also has a limitation related to the number of bits of the returned result. Any solutions that work for any string (a stored procedure is allowed)?
Thanks.
Your conversions look fine to me.
And as you point out, both CONV() and ^ indeed have 64-bit precision.
2^64 = 16^16; therefore, strings of more than 16 hexadecimal digits would convert to integers larger than 2^64. However, such strings will be brutally (silently) truncated from the left when attempting to convert them to integers.
The point of my solution here is to slice such strings. Obviously, the result may not be displayed as an integer, but only as a string representation.
Let @input be your "string to be changed" and @key your "key".
Assign HEX(@input) to @hex_input. No problem here, since HEX() works with strings.
Slice @hex_input into 16-hexadecimal-digit-long strings, starting from the right.
Likewise, slice @key into 16-digit-long strings.
Compute the XOR of each 64-bit slice of @hex_input with the corresponding 64-bit slice of @key, starting from the right. Use CONV(@slice, 16, 10). If either @hex_input or @key has fewer slices than the other string, then XOR the remaining slices of the other string with 0.
Convert each 64-bit number resulting from the XOR in point 4 back into a hexadecimal string with CONV(..., 10, 16), left-padding it back to 16 digits with zeros (except for the leftmost slice).
Reassemble the resulting slices. This is your result.
A three-column TEMPORARY table could be used as an array to store the slices of @hex_input, @key and the resulting slices.
Put this all together into a stored procedure, and voilà!
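Since this page is otherwise Tcl-flavoured, here is a rough Tcl sketch of the same slicing algorithm, just to make the logic concrete (hexxor is my own name; the MySQL stored procedure would follow the same steps):
# XOR two hex strings of arbitrary length, one 16-digit (64-bit) slice at a time
proc hexxor {hexA hexB} {
    set result ""
    while {$hexA ne "" || $hexB ne ""} {
        # Take up to 16 digits from the right of each string; missing slices are 0
        set a [string range $hexA end-15 end]
        set b [string range $hexB end-15 end]
        if {$a eq ""} {set a 0}
        if {$b eq ""} {set b 0}
        set hexA [string range $hexA 0 end-16]
        set hexB [string range $hexB 0 end-16]
        # XOR the slices and left-pad the result back to 16 digits
        set result [format %016llx [expr {"0x$a" ^ "0x$b"}]]$result
    }
    return $result
}
For the question's example, hexxor 61 ACF123456 (HEX('a') is 61) gives 0000000acf123437, matching the single-slice CONV computation above.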
You sound like you have some skills in MySQL; you should be able to translate the above into real code. But I'll be happy to help if you need further guidance.

Inserting hex value in MySQL

I have created an SQL database using Java. I have a table with two columns: the first is a big integer which auto-increments; the second I have tried defining as char, varchar, and binary.
But I'm still not getting the desired functionality. Say I try to store the hex number 0a in the char column: I get an error. I appended 0x to the beginning and it seems to store, but when I print out the contents it is blank, or in some cases I get characters such as '/' or '?'. I also tried using SQL Explorer and it gives me the same result when viewing the table.
My problem is that I need to store an eight-character hex string such as eb8d4ee6.
Could someone please advise me of how this can be done?
See http://dev.mysql.com/doc/refman/5.5/en/hexadecimal-literals.html
MySQL supports hexadecimal values, written using X'val', x'val', or 0xval format, where val contains hexadecimal digits (0..9, A..F). Lettercase of the digits does not matter. For values written using X'val' or x'val' format, val must contain an even number of digits. For values written using 0xval syntax, values that contain an odd number of digits are treated as having an extra leading 0. For example, 0x0a and 0xaaa are interpreted as 0x0a and 0x0aaa.
In numeric contexts, hexadecimal values act like integers (64-bit precision). In string contexts, they act like binary strings, where each pair of hex digits is converted to a character.
You should probably store the hex number in an integer column. You can then convert back to hex when selecting, using the HEX() function.
E.g.,
INSERT INTO MyTable (`MyIntegerColumn`) VALUES (0xeb8d4ee6);
You can use a JSON column, use JSON.stringify(hex) to insert, and you can always get the result back via a select and compare it, too.

Converting binary to hexadecimal?

Just wondering how I would go about converting binary to hexadecimal?
Would I first have to convert the binary to decimal and then to hexadecimal?
For example: 101101001.101110101010011
How would I go about converting a complex binary number such as the above to hexadecimal?
Thanks in advance
Each 4 bits of a binary number represents a hexadecimal digit. So the best way to convert from binary to hexadecimal is to pad the binary number with leading zeroes so that the number of bits is divisible by four.
Then you process four bits at a time and convert them to a single hexadecimal digit:
0000 -> 0
0001 -> 1
0010 -> 2
....
1110 -> E
1111 -> F
No, you don't convert to decimal and then to hexadecimal; you convert to a numeric value and then to hexadecimal.
(Decimal is also a textual representation of a number, just like binary and hexadecimal. Although decimal representation is used by default, a number doesn't have a textual representation in itself.)
As a hexadecimal digit corresponds to four binary digits you don't have to convert the entire string to a number, you can do it four binary digits at a time.
First fill up the binary number so that it has full groups of four digits:
000101101001.1011101010100110
Then you can convert each group to a number, and then to hexadecimal:
0001 0110 1001.1011 1010 1010 0110
169.BAA6
Alternatively, you can split the number into the two parts before and after the period and convert each from binary. The part before the period can be converted straight off, but the part after has to be padded to be correct.
Example in C#:
string binary = "101101001.101110101010011";
string[] parts = binary.Split('.');
while (parts[1].Length % 4 != 0) {
    parts[1] += '0';
}
string result =
    Convert.ToInt32(parts[0], 2).ToString("X") +
    "." +
    Convert.ToInt32(parts[1], 2).ToString("X");
You could simply have a small hash table or other mapping, converting each quadruplet of binary digits (as a string, assuming that's your input) into the corresponding hex digit (0 to 9, A to F) for the output string. You'll have to bunch the input bits up by 4, left-padding before the '.' and right-padding after it, with 0 in both cases, as needed.
So...:
locate the '.'
left of the '.', bunch by 4 going leftwards: in your example, 1001 first (nearest the point), then 0110, finally 0001 (left-padded), that's it;
ditto to the right -- in your example 1011, then 1010, then 1010, finally 0110 (right-padded);
each bunch of 4 binary digits then turns, via the hash table or other mapping, into the hex digit to put in that place in the output string.
Want some pseudo-code for it, e.g., in Python?
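Since the rest of this page is Tcl-flavoured, here is that recipe as a rough Tcl sketch rather than Python pseudo-code (bin2hex is my own name):
# Convert a binary string with an optional '.' to hex via a lookup table
proc bin2hex {binary} {
    array set map {
        0000 0 0001 1 0010 2 0011 3 0100 4 0101 5 0110 6 0111 7
        1000 8 1001 9 1010 A 1011 B 1100 C 1101 D 1110 E 1111 F
    }
    lassign [split $binary .] int frac
    # Left-pad the integer part and right-pad the fraction to multiples of 4
    while {[string length $int] % 4} {set int 0$int}
    while {[string length $frac] % 4} {append frac 0}
    set hex ""
    foreach {a b c d} [split $int ""] {append hex $map($a$b$c$d)}
    if {$frac ne ""} {
        append hex .
        foreach {a b c d} [split $frac ""] {append hex $map($a$b$c$d)}
    }
    return $hex
}
% bin2hex 101101001.101110101010011
169.BAA6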
The simplest approach, especially if you can already convert from binary digits to the internal numeric representation and from the internal numeric representation to hexadecimal digits, is to go binary -> internal -> hex. I say internal and not decimal because, even though it may print as decimal, the number is actually stored internally in binary format. That said, it is possible to go straight from one to the other. This does not apply to your specific example, but in many cases when converting from binary to hex you can go four digits at a time and simply look up the corresponding hex values in a table. There are all sorts of ways to convert.
BIN to HEX
Binary and hex are natively compatible. Just group 4 binary digits (bits) and substitute the corresponding hex digit.
More reference here:
http://en.wikipedia.org/wiki/Hexadecimal#Binary_conversion