Where is command if compiled? - tcl

I am looking through the source of Tcl 8.6.3 and would like to see how the bytecode works, but I cannot find where the if command is compiled.
I looked at tclCompCmds.c; it has a bunch of commands, such as break, continue, etc., but I do not see the if command.
Could you point me to the compile routine for the if command?

It is in generic/tclCompCmdsGR.c (function TclCompileIfCmd()).
BTW, while looking for the requested function I noticed that the division of command implementations among the source files doesn't strictly follow the scheme declared in the header comments of two of them.
/*
* tclCompCmds.c --
*
* This file contains compilation procedures that compile various Tcl
* commands into a sequence of instructions ("bytecodes").
*
*/
/*
* tclCompCmdsGR.c --
*
* This file contains compilation procedures that compile various Tcl
* commands (beginning with the letters 'g' through 'r') into a sequence
* of instructions ("bytecodes").
*
*/
/*
* tclCompCmdsSZ.c --
*
* This file contains compilation procedures that compile various Tcl
* commands (beginning with the letters 's' through 'z', except for
* [upvar] and [variable]) into a sequence of instructions ("bytecodes").
* Also includes the operator command compilers.
*
*/
tclCompCmds.c doesn't state that it contains only part of the command implementations, but the other two files suggest that it should contain the commands whose names start with 'a' through 'f'. However, as an exception, it also includes the implementation of the lmap command.
tclCompCmdsGR.c states that it contains implementations of commands starting with 'g' through 'r'. However, upvar and variable also live in tclCompCmdsGR.c, and that is stated explicitly only in the last file, tclCompCmdsSZ.c, which, apart from that exception, hosts the commands starting with 's' through 'z', as well as the ::tcl::mathop::* commands.

Apache drill cannot parse CSV files with windows EOL correctly?

Ok, let's save someone 8 hours of clueless debugging.
TL;DR: Apache Drill cannot correctly parse CSV files generated on Windows machines. That's because their EOL is \r\n by default, unlike on Unix systems, where it is \n. This leads to horribly undebuggable errors because the trailing \r probably stays glued to the last field's value. And what's funny, you won't notice this because it's invisible.
Let's have two files, one created in linux and the second in windows: hello.linux.csv and hello.win.csv. The content is the same (at least it looks like it is ...)
field_a,field_b
Hello,0.5
Let's have a query.
SELECT * from (...)/hello.linux.csv;
---
field_a, field_b
Hello, "0.5"
SELECT * from (...)/hello.win.csv;
---
field_a, field_b
Hello, "0.5"
Fine! Let's do something with the data. Casting "0.5" to a number should be fine (and necessary).
SELECT
field_a, CAST (field_b as DECIMAL(10, 2)) as test
from (...)/hello.linux.csv;
---
field_a, test
Hello, 0.5
-- ... aaand, here we go!
SELECT
field_a, CAST (field_b as DECIMAL(10, 2)) as test
from (...)/hello.win.csv;
[30038]Query execution error. Details:[
SYSTEM ERROR: NumberFormatException
Fragment 0:0
Please, refer to logs for more information. -- In the logs, there is only useless java stacktrace, of course.
[Error Id: 3551c939-3f5b-42c1-9b58-d600da5f12a0 on drill-develop-7bdb45c597-52rnz:31010]
]
...
(And now, imagine how much time it would take to uncover this on a complex production setup where the queries, data and other factors are more complicated.)
The question: Is there a way to force Apache Drill (v 1.15) to process CSV files created with Windows EOLs?
You can update the csv format's line delimiter to \r\n, but this would apply to all CSV files in the scope of your text plugin. To change the delimiter per table, use a table function.
https://drill.apache.org/docs/plugin-configuration-basics/
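Why the line delimiter matters can be shown outside Drill. The following is only a minimal Python sketch, not Drill's parser: parse_csv is a hypothetical helper that ignores quoting. Splitting Windows-style text on \n alone leaves an invisible \r glued to the last field of every row, while splitting on \r\n does not.

```python
def parse_csv(text, line_delimiter="\n", field_delimiter=","):
    """Naive CSV split (no quoting support), only to illustrate delimiters."""
    rows = [line for line in text.split(line_delimiter) if line]
    return [row.split(field_delimiter) for row in rows]


# The same content as hello.win.csv, with Windows line endings.
windows_text = "field_a,field_b\r\nHello,0.5\r\n"

split_on_lf = parse_csv(windows_text)                           # Unix-style delimiter
split_on_crlf = parse_csv(windows_text, line_delimiter="\r\n")  # Windows-style delimiter

# repr() makes the otherwise invisible carriage return visible.
print(repr(split_on_lf[1][1]))    # '0.5\r'
print(repr(split_on_crlf[1][1]))  # '0.5'
```

A strict numeric parser that does not trim whitespace will reject the first value, which is consistent with the NumberFormatException in the question.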

Converting Tcl to C++

I am trying to convert some tcl script into a C++ program. I don't have much experience with tcl and am hoping someone could explain what some of the following things are actually doing in the tcl script:
1) set rtn [true_test_sfm $run_dir]
2) cd [glob $run_dir]
3) set pwd [pwd]
Is the first one just checking if true_test_sfm directory exists in run_dir?
Also, I am programming on a windows machine. Would the system function be the equivalent to exec statements in tcl? And if so how would I print the result of the system function call to stdout?
In Tcl, square brackets indicate "evaluate the code between the square brackets". The result of that evaluation is substituted for the entire square-bracketed expression. So, the first line invokes the function true_test_sfm with a single argument $run_dir; the result of that function call is then assigned to the variable rtn. Unfortunately, true_test_sfm is not a built-in Tcl function, which means it's user-defined, which means there's no way we can tell you what the effect of that function call will be based on the information you've provided here.
glob is a built-in Tcl function which takes a file pattern as an argument and then lists the files that match that pattern. For example, if a directory contains files "foo", "bar" and "baz", glob b* would return a list of two files, "bar" and "baz". Therefore the second line is looking for any files that match the pattern given by $run_dir, then using the cd command (another Tcl built-in) to change to the directory found by glob. Probably $run_dir is not actually a file pattern but an explicit file name (i.e., no globbing characters like * or ? in the string); otherwise this code may break unexpectedly. On Windows, some combination of FindFirstFile/FindNextFile in C++ could be used as a substitute for glob in Tcl, and SetCurrentDirectory could substitute for cd.
pwd is another built-in Tcl function, which returns the process's current working directory as an absolute path. So the last line is querying the current working directory and saving the result in a variable named pwd. Here you could use GetCurrentDirectory as a substitute for pwd.
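To pin down the behaviour of lines 2 and 3 before porting them, here is a rough sketch of the same steps in Python (used here only as a compact, language-neutral illustration, not as the C++ port). cd_glob is a hypothetical helper name, and the "exactly one match" check is an assumption about what the script intends.

```python
import glob
import os


def cd_glob(pattern):
    """Mimic `cd [glob $run_dir]` followed by `set pwd [pwd]`."""
    matches = glob.glob(pattern)
    # In Tcl, glob raises an error when nothing matches, and several matches
    # would reach cd as one list-valued string that is unlikely to name a
    # directory. Both cases are reported here as an error (an assumption).
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, got {len(matches)}")
    os.chdir(matches[0])
    return os.getcwd()  # absolute path, like Tcl's pwd


# "." contains no globbing characters, so it matches only itself.
print(cd_glob("."))
```

The C++ version would need the same three decisions: how to expand the pattern, what to do when the expansion is not a single directory, and how to read back the absolute path.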

Break on namespace function in gdb (llvm)

I'm trying to step through llvm's opt program (for an assignment) and the instructor suggested setting a breakpoint at runOnFunction. I see this in one of the files:
bool InstCombiner::runOnFunction(Function &F) { /* (Code removed for SO) */ }
but gdb does not seem to find the runOnFunction breakpoint. It occurred to me that the problem might be namespaces? I tried this but gdb never breaks, it just creates the fooOpt.s file:
(gdb) b runOnFunction
Function "runOnFunction" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (runOnFunction) pending.
(gdb) r -S -instcombine -debug -o ~/Desktop/fooOpt.s ~/Desktop/foo.s
I'm on a Mac so I don't have objdump, but otool produces 5.6 million lines; wading through that for a starting point does not seem reasonable, as runOnFunction appears more than once there.
gdb has several built-in commands for finding the names of such functions. The first is info functions, which takes an optional regexp argument and greps through all available functions: https://sourceware.org/gdb/current/onlinedocs/gdb/Symbols.html
info functions regexp
Print the names and data types of all defined functions whose names contain a match for regular expression regexp. Thus, ‘info fun step’ finds all functions whose names include step; ‘info fun ^step’ finds those whose names start with step. If a function name contains characters that conflict with the regular expression language (e.g. ‘operator*()’), they may be quoted with a backslash.
So you can try info functions runOnFunction to get the full name. Sometimes it helps to put quotes around the name when passing it to the break command.
The other way is to use the rbreak command instead of break (b). rbreak does a regexp search over function names and may set several breakpoints: https://sourceware.org/gdb/current/onlinedocs/gdb/Set-Breaks.html#Set-Breaks
rbreak regex
Set breakpoints on all functions matching the regular expression regex. This command sets an unconditional breakpoint on all matches, printing a list of all breakpoints it set. ...
The syntax of the regular expression is the standard one used with tools like grep. Note that this is different from the syntax used by shells, so for instance foo* matches all functions that include an fo followed by zero or more os. There is an implicit .* leading and trailing the regular expression you supply, so to match only functions that begin with foo, use ^foo.
(or even rbreak file:regex to limit the search to a single source file)
PS: if you want, you can turn C++ function name demangling on or off with set print demangle on or off (https://sourceware.org/gdb/current/onlinedocs/gdb/Debugging-C-Plus-Plus.html#Debugging-C-Plus-Plus). With demangling turned off it is easier to copy a function name into the break command.
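The regexp semantics quoted above (an unanchored search, so foo* really means "fo followed by zero or more o") can be tried outside gdb. This is only an illustration using Python's re module, which is not gdb's regex engine but should agree with it on patterns this simple. The function names in the list are made up for the example.

```python
import re


def matching_functions(pattern, names):
    """Return the names that contain a match for pattern (unanchored search)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


# Hypothetical symbol names, standing in for what gdb would know about.
names = [
    "InstCombiner::runOnFunction(Function&)",
    "DCE::runOnFunction(Function&)",
    "main",
    "fopen",
    "foo_bar",
]

print(matching_functions("runOnFunction", names))  # both runOnFunction methods
print(matching_functions("foo*", names))           # ['fopen', 'foo_bar']
print(matching_functions("^foo", names))           # ['foo_bar']
```

The first call shows why a plain rbreak runOnFunction may set more than one breakpoint: every class that defines the method matches.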

mumps syntax declaration

Q ZR $ZTLP I Q=-1 S Q,A=F G T
I Q< S A=F G R
How do I identify labels, keywords and variables in MUMPS?
What is Q in the above code: a label, a variable or a keyword?
What are the rules for defining variables, keywords and subroutines?
Without them it is difficult to tell these apart. Could you explain? I can't work out what my existing code does.
Q means QUIT in the first instance, but then I Q=-1 is IF Q EQUALS -1, so Q is a variable here too (not very good practice).
S Q,A=F is again SET Q and A = F.
I Q< S A=F G R means: if Q is less than null (???), SET A=F, then GOTO line R.
The secret is: whitespaces.
General MUMPS program line syntax is:
...
Label and arguments are optional: when a line has no label, it begins with a tab; when a command has no arguments (this happens in rare cases, e.g. Quit), the command is followed by two spaces.
When a line begins with a command (no label and no tab), it is not part of a program but a command that is executed immediately.
You may find this confusing, but remember, MUMPS was designed when machines were slow; commands are easy to parse if they are strictly delimited. That's why commands can be abbreviated to a single letter, and it is also why MUMPS has no operator precedence (newer MUMPS systems can be configured to use operator precedence instead of the traditional left-to-right evaluation order).
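To see what "no operator precedence" means in practice, here is a small sketch in Python (not MUMPS) of strict left-to-right evaluation. eval_left_to_right is a hypothetical helper that handles only non-negative integers with +, - and *.

```python
import operator
import re

# Only three binary operators are supported in this sketch.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_left_to_right(expr):
    """Evaluate an integer expression strictly left to right, ignoring precedence."""
    tokens = re.findall(r"\d+|[+\-*]", expr)
    result = int(tokens[0])
    # Apply each operator to the running result and the next operand, in order.
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        result = OPS[op](result, int(operand))
    return result


print(eval_left_to_right("2+3*4"))  # 20: (2+3)*4
print(2 + 3 * 4)                    # 14: Python applies precedence
```

Under the traditional rule, 2+3*4 is therefore 20, and parentheses are needed to get 14.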

sed: modify function arguments

I'm trying to write a sed command that will allow me to modify a function's arguments. The number of arguments can be variable.
If this is my function:
int myFunction(int arg1, int arg2, Dog arg3) {
// function implementation
}
I would like to be able to perform addition operations on int arg1, int arg2, ...
Here's what I have that does not work:
sed -e '/^[a-zA-Z0-9_]\+\s\+[a-zA-Z0-9_]\+(/ , /)[\n\s]*{/ {
# arguments should be listed here
}'
Any help is appreciated. Go easy on me, it's my first attempt at sed / shell scripting.
Thanks.
Ultimately, sed is not the correct tool for this job - because of your comment about 'the number of arguments can be variable'. If you are dealing with a fixed number of arguments of a fixed type, you can kludge your way around it, but any more general processing requires a more general processor (than sed).
I suggest trying a different task as your introduction to shell scripting and sed.
If you must do it, then maybe:
sed '/^[A-Za-z_][A-Za-z0-9_]* *[A-Za-z_][A-Za-z0-9_]* *([A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\) *, *[A-Za-z_][A-Za-z0-9_]* *\([A-Za-z_][A-Za-z0-9_]*\)[ ,)].*{/{p;s//    return \1 + \2;/;}' "$file"
That horror of a match contains the sequence [A-Za-z_][A-Za-z0-9_]* 6 times; it matches an identifier each time. The segment from '[ ,)].*{' matches the rest of the argument list, including any third or subsequent arguments. The spaces in the pattern should, perhaps, be '[<blank><tab>]' character classes, but they're a pain to enter on StackOverflow. The regex then matches a function definition, and captures (in the '\(<identifier>\)' parts) the names of the first two arguments (arg1 and arg2 in your example). The actions when this is recognized are:
p - print the line that was recognized.
s - with an empty pattern, reuse the regex from the address and replace the matched text with a return statement that is the sum of the two remembered argument names; the modified line is then printed as the next line of output. The braces group the operations together.
Some versions of sed support more powerful regular expressions than others; I'm not sure, though, that even GNU sed supports PCRE (Perl-Compatible Regular Expressions), and it would take something like PCRE to significantly reduce the regex.
Note that this script leaves the comment line '// function implementation' untouched. It's your call what you do with that.
Finally, remember that if you write more than one function to add two integers together, you are wasting code. Therefore, this is not a plausible transformation. Each function should do something different, somehow. Granted, if the types are different each time, then maybe it has its uses, but even so, it would be easier to write a generator than to parse the skeleton and fill in the bits. And that might be a good scripting exercise.
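As a sketch of the generator idea, here is one way it might look in Python rather than sed. generate_sum_function is a hypothetical helper, and the emitted C layout is only an assumption about what you want the output to look like.

```python
def generate_sum_function(name, int_args, other_args=()):
    """Emit C source for a function that returns the sum of its int arguments.

    other_args is a sequence of (type, name) pairs that are declared
    in the signature but not used in the sum.
    """
    params = [f"int {arg}" for arg in int_args]
    params += [f"{ctype} {arg}" for ctype, arg in other_args]
    body = " + ".join(int_args) if int_args else "0"
    return f"int {name}({', '.join(params)}) {{\n    return {body};\n}}\n"


print(generate_sum_function("myFunction", ["arg1", "arg2"], [("Dog", "arg3")]))
```

Because the generator starts from a list of names instead of parsing C, a variable number of arguments stops being a problem.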