I am learning assembly language and had a question on syntax of procedure.
(I am using VS 2012).
For main procedure, if the line 'main PROC' was inserted before '.data', the error would occur.
.data
.code
main Proc
;some code
main ENDP
local1 Proc
.data
.code
ret
.local1 ENDP
END main
But for other local procedure under the main, it works fine with '.data' after the declaration of the procedure.
Could someone explain to me why?
p.s. Also is assembly lanaugage unpopular? I have learned little java and c++ and compare to them, there are much less discussion and searches on google.
Imagine assembler as a machine which reads input source and writes (emits) to two output channels: one for executable code and another one for data.
Segment-switch directives .data and .code tell assembler where it should write the emitted information.
Using directive .data you actually command your assembler: stop emitting to whichever output segment was current at the moment, switch to the segment .data and continue writing to this segment at the next free space (origin), where you left it the last time it was active.
Alternation between .code and .data segment in source text is good for readability of the program, it allows to keep code procedures close to the global data it operates with. On the other hand, when the compiled program is loaded in memory, all procedures have to be linked together in one code segment and all data are kept together in data segment. Operating system usually does not allow to execute any instruction from segments marked as "data", and to write any data to segment marked as "code". That is why it is programmer's responsibility to switch to .code before any executable statement is emitted.
Related
I want to know when a function body end in assemby, for example in c you have this brakets {} that tell you when the function body start and when it ends but how do i know this in assembly?
Is there a parser that can extract me all the functions from assembly and start line and endline of their body?
There's no foolproof way, and there might not even be a well-defined correct answer in hand-written asm.
Usually (e.g. in compiler-generated code) you know a function ends when you see the next global symbol, like objdump does to decide when to print a new "banner". But without all function-start symbols being visible, there's no unambigious way. That's why some object file formats have room for size metadata associated with a symbol. Like .size foo, . - foo in GAS syntax.
It's not as easy as looking for a ret; some functions end with a jmp tail-call to another function. And some call a noreturn function like abort or __stack_chk_fail (not tailcall because they want to push a return address for a backtrace.) Or just fall off into whatever's next because that path had undefined behaviour in the source so the compiler assumed it wasn't reachable and stopped generating instructions for it, e.g. a C++ non-void function where execution can/does fall off the end without a return.
In general, assembly can blur the lines of what a function is.
Asm has features you can use to implement the high-level concept of a function, but you're not restricted to that.
e.g. multiple asm functions could all return by jumping to a common block of code that pops some registers before a ret. Is that shared tail a separate function that's called with a tail-called with a special calling convention?
Compilers don't usually do that, but humans could.
As for function entry points, usually some other code somewhere in the program will contain a call to it. But not necessarily; it might only be reachable via a table of function pointers, and you don't know that a block of .rodata holds function pointers until you find some code loading from it and calling or jumping.
But that doesn't work if the lowest-address instruction of the function isn't its entry point. See Does a function with instructions before the entry-point label cause problems for anything (linking)? for an example
Compilers don't generate code like that, but humans can. (It's a handy trick sometimes for https://codegolf.stackexchange.com/ questions.)
Or in the general case, a function might have multiple entry points. Or you could describe that as multiple functions with overlapping implementations. Sometimes it's as simple as one tailcalling another by falling into it without needing a jmp, i.e. it starts a few instructions before another.
I wan't to know when a function body ends in assembly, [...]
There are mainly four ways that the execution of a stream of (userspace) instructions can "end":
An unconditional jump like jmp or a conditional one like Jcc (je,jnz,jg ...)
A ret instruction (meaning the end of a subroutine) which probably comes closest to the intent of your question (including the ExitProcess "ret" command)
The call of another "function"
An exception. Not a C style exception, but rather an exception like "Invalid instruction" or "Division by 0" which terminates the user space program
[...] for example in c you have this brakets {} that tell you when the function body start and when it ends but how do i know this in assembly?
Simple answer: you don't. On the machine level every address can (theoretically) be an entry point to a "function". So there is no unique entry point to a "function" other than defined - and you can define anything.
On a tangent, this relates to self-modifying code and viruses, but it must not. The exit/end is as described in the first part above.
Is there a parser that can extract me all the functions from assembly and
start line and endline of their body?
Disassemblers create some kind of "functions" with entry and exit points. But they are merely assumed. No way to know if that assumption is correct. This may cause problems.
The usual approach is using a disassembler and the work to recombinate the stream of instructions to different "functions" remains to the person that mandated this task (vulgo: you). Some tools exist that claim to simplify this, but I cannot judge their efficacy.
From the perspective of a high level language, there are decompilers that try to reverse the transformation from (for example) C to assembly/machine code that try to automatize that task and will work more or less or in some cases.
I made a new segment in the ELF (ARM) game library I am working on (Page Aligned, .text segment (Pure Code)). The entirety of the segment includes only NOPs (0000A0E1). However, IDA recognizes the NOPs as:
DCD 0xE1A00000
If I press C to make it into code, it shows one NOP per line I press it on. When I try to create a function, it tells me that the function has undefined instruction/data at the specified address, and won't let me do it. In addition, each line of the DCD 0xE1A00000s is referenced in the .plt segment because they're referencing an offset out of the segment. This happens even after I turn it into code.
Is there a way to declare the entire segment as code, and/or put a segment property that does this, possibly fixing the function problem?
I have a curious case of Tcl that perhaps I just don't understand.
The following code is done at the top level (not inside of any procedure):
if {![info exists g_log_file_name]} {
set g_log_file_name "default.txt"
}
What I hope it would do is to declare a global variable with some value if it wasn't declared yet (which can be done at some other script or C application). However, the if statement always false. I ran on Tcl 7.4.
What may be the problem?
Thank you so much.
% info level
0
% info exists g_log_file_name
0
% set g_log_file_name whatever
whatever
% info exists g_log_file_name
1
Hence the reason you observe is probably because the variable is really always unset at the time your if command is executed.
Possible reasons for this I can imagine are:
It's just not set: literally, no code attempt to do this;
The external code sets some other variable: name mismatch;
The external code sets a variable in some other interpreter: in a C code embedding Tcl, there can be any number of Tcl interpreters active at any moment (and those are free to create child interpreters as well);
I'm not sure abot the long forgotten version of Tcl you have at hand, but 8.x has the trace command which can be used to log accesses to a particular variable—you could try to use it to see what happens.
I have a question about global in TCL.
In one tcl file tclone.tcl, I have a global variable: global SIGNAL
in another tcl file called tcltwo.tcl, I set the variable SIGNAL as: set SIGNAL 10
In tclone.tcl, I improted the tcltwo.tcl as following" package require tcltwo.tcl
will the variable SIGNAL in tclone.tcl will be set as 10 when I execute it? and what is the usage of gloable variable?
As stated in its manual page, the global command only has meaning inside proc bodies:
This command has no effect unless executed in the context of a proc body.
So the whole question is unclear. If you meant that you have a proc in the first file setting a global variable and another proc (in the second file) reading it, then the question makes sense and the answer is yes, the code from the second file will see the change made by the code from the first file provided the "setting" procedure runs before the "getting" one. To possibly make it more clear, a global variable is global with regard to an interpreter the code operating that variable runs. Hence no matter which way do you use to fetch the code into an interpreter (package require vs source vs eval etc), all that code will see the same set of globals.
But in any case you should probably abstrain from using globals and use namespaced variables: they are also global but you greatly reduce the risk of introducing some other code later which will inadvertently mess with that global variable it should not touch. Of course, as usually this depends on how complicated your application is expected to be.
As of now I use 3 Notebook :
Functions
Where I have all the functions I created and call in the other Notebooks.
Transformation
Based on the original data, I compute transformations and add columns/List
When data is my raw data, I then call :
t1data : the result of the first transformation
t2data : the result of the second transformation
and so on,
I am yet at t20.
Display & Analysis
Using both the above I create Manipulate object that enable me to analyze the data.
Questions
Is there away to save the results of the Transformation Notebook such that t13data for example can be used in the Display & Analysis Notebooks without running all the previous computations (t1,t2,t3...t12) it is based on ?
Is there a way to use my Functions or transformed data without opening the corresponding Notebook ?
Does my separation strategy make sense at all ?
As of now I systematically open the 3 and have to run them all before being able to do anything, and it takes a while given my poor computing power and yet inefficient codes.
Saving variable states: can be done using DumpSave, Save or Put. Read back using Get or <<
You could make a package from your functions and read those back using Needs or <<
It's not something I usually do. I opt for a monolithic notebook containing everything (nicely layered with sections and subsections so that you can fold open or close) or for a package + slightly leaner analysis notebook depending on the weather and some other hidden variables.
Saving intermediate results
The native file format for Mathematica expressions is the .m file. This is human readable text format, and you can view the file in a text editor if you ever doubt what is, or is not being saved. You can load these files using Get. The shorthand form for Get is:
<< "filename.m"
Using Get will replace or refresh any existing assignments that are explicitly made in the .m file.
Saving intermediate results that are simple assignments (dat = ...) may be done with Put. The shorthand form for Put is:
dat >> "dat.m"
This saves only the assigned expression itself; to restore the definition you must use:
dat = << "dat.m"
See also PutAppend for appending data to a .m file as new results are created.
Saving results and function definitions that are complex assignments is done with Save. Examples of such assignments include:
f[x_] := subfunc[x, 2]
g[1] = "cat"
g[2] = "dog"
nCr = #!/(#2! (# - #2)!) &;
nPr = nCr[##] #2! &;
For the last example, the complexity is that nPr depends on nCr. Using Save it is sufficient to save only nPr to get a fully working definition of nPr: the definition of nCr will automatically be saved as well. The syntax is:
Save["nPr.m", nPr]
Using Save the assignments themselves are saved; to restore the definitions use:
<< "nPr.m" ;
Moving functions to a Package
In addition to Put and Save, or manual creation in a text editor, .m files may be generated automatically. This is done by creating a Notebook and setting Cell > Cell Properties > Initialization Cell on the cells that contain your function definitions. When you save the Notebook for the first time, Mathematica will ask if you want to create an Auto Save Package. Do so, and Mathematica will generate a .m file in parallel to the .nb file, containing the contents of all Initialization Cells in the Notebook. Further, it will update this .m file every time you save the Notebook, so you never need to manually update it.
Sine all Initialization Cells will be saved to the parallel .m file, I recommend using the Notebook only for the generation of this Package, and not also for the rest of your computations.
When managing functions, one must consider context. Not all functions should be global at all times. A series of related functions should often be kept in its own context which can then be easily exposed to or removed from $ContextPath. Further, a series of functions often rely on subfunctions that do not need to be called outside of the primary functions, therefore these subfunctions should not be global. All of this relates to Package creation. Incidentally, it also relates to the formatting of code, because knowing that not all subfunctions must be exposed as global gives one the freedom to move many subfunctions to the "top level" of the code, that is, outside of Module or other scoping constructs, without conflicting with global symbols.
Package creation is a complex topic. You should familiarize yourself with Begin, BeginPackage, End and EndPackage to better understand it, but here is a simple framework to get you started. You can follow it as a template for the time being.
This is an old definition I used before DeleteDuplicates existed:
BeginPackage["UU`"]
UnsortedUnion::usage = "UnsortedUnion works like Union, but doesn't \
return a sorted list. \nThis function is considerably slower than \
Union though."
Begin["`Private`"]
UnsortedUnion =
Module[{f}, f[y_] := (f[y] = Sequence[]; y); f /# Join###] &
End[]
EndPackage[]
Everything above goes in Initialization Cells. You can insert Text cells, Sections, or even other input cells without harming the generated Package: only the contents of the Initialization Cells will be exported.
BeginPackage defines the Context that your functions will belong to, and disables all non-System` definitions, preventing collisions. (There are ways to call other functions from your package, but that is better for another question).
By convention, a ::usage message is defined for each function that it to be accessible outside the package itself. This is not superfluous! While there are other methods, without this, you will not expose your function in the visible Context.
Next, you Begin a context that is for the package alone, conventionally "`Private`". After this point any symbols you define (that are not used outside of this Begin/End block) will not be exposed globally after the Package is loaded, and will therefore not collide with Global` symbols.
After your function definition(s), you close the block with End[]. You may use as many Begin/End blocks as you like, and I typically use a separate one for each function, though it is not required.
Finally, close with EndPackage[] to restore the environment to what it was before using BeginPackage.
After you save the Notebook and generate the .m package (let's say "mypackage.m"), you can load it with Get:
<< "mypackage.m"
Now, there will be a function UnsortedUnion in the Context UU` and it will be accessible globally.
You should also look into the functionality of Needs, but that is a little more advanced in my opinion, so I shall stop here.