Why are there different sections for initialized and uninitialized globals? - language-agnostic

I'm reading this course on the structure of an executable and it says there are three data sections in an executable:
code section — where the instructions are stored
data section
    .data — stores initialized global data
    .bss — stores uninitialized global data
    .rodata — stores read-only data, such as literals
My question is, why is the distinction made between initialized and uninitialized global data?
(We use C in class, but I guess this is a language-agnostic subject.)

Image size. The program image has to contain the initialization data for .data, but it does not have to contain .bss: because .bss holds uninitialized (in practice, zero-filled) data, the file records only the section's size, and the loader allocates and zeroes that memory at load time.
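To make the size difference concrete, here is a minimal C sketch (assuming a typical ELF toolchain; exact section placement can vary with compiler and flags):

int big_table[1000000] = {1};  /* initialized: placed in .data, so the file
                                  must store all ~4 MB of initializer bytes */
int scratch[1000000];          /* uninitialized: placed in .bss, so the file
                                  records only the size; the loader allocates
                                  and zero-fills the memory at load time */
const char greeting[] = "hi";  /* read-only literal data: placed in .rodata */

int main(void)
{
    return big_table[0] + scratch[0] + greeting[0];
}

Inspecting the compiled binary (for instance with the binutils size or objdump -h tools) shows a multi-megabyte .data section for big_table; changing its initializer to all zeros lets the compiler move it to .bss, shrinking the on-disk image by roughly the size of the array.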

Related

How to make Ghidra use a function's complete/original stack frame for decompiled code

I have a case where a function allocates and uses a 404-byte temporary structure on the stack for its internal calculations (the function is self-contained and shuffles data around within that structure). Conceptually, the structure seems to consist of some 32-bit counters, followed by an int[15] and a byte[80] array, and then an area that may or may not actually be used. Some of the generated data in the tables seems to represent offsets that the function then uses to navigate within the temporary structure.
Unfortunately, Ghidra's decompiler makes a total mess while trying to make sense of the function: in particular, it creates separate "local_.." int variables (and then uses pointers to those vars) for what should correctly be a pointer into the function's original data structure (e.g. pointing into one of the arrays).
undefined4 local_17f;
...
dest = &local_17f;
for (i = 0xf; i != 0; i = i + -1) {
    *dest = 0;
    dest = dest + 1;
}
Ghidra does not seem to understand that an array-based data access is actually being used at that point. Ghidra's decompiler also generates a local auStack316[316] variable which unfortunately seems to cover only a part of the local data structure used by the original ASM code (at least Ghidra did notice that a temporary memory buffer is used). As a result, the decompiled code basically uses two overlapping (and broken) shadow data structures where there should correctly be just one block of memory.
Is there some way to make Ghidra's decompiler use the complete 404-byte block allocated by the function as one auStack404, thus bypassing Ghidra's flawed interpretation logic and actually preserving the original functionality of the ASM code?
I think I found something. In the "Listing" view, the local-variable layout is shown as a comment under the function's header. By right-clicking a local-variable line in that comment, "set data type" can be applied to the respective local variable. Ah, and then there is what I've been looking for under "Function" / "Edit Stack Frame" :-)
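For reference, once the stack-frame editor is open, a structure type spanning the whole 404-byte block can be applied to it. Below is a hedged C sketch of what such a type might look like, based purely on the layout guessed in the question; the field names and the count of six counters are hypothetical:

/* Hypothetical layout of the 404-byte stack block; every field name and
   the number of counters are guesses from the question above, not
   recovered ground truth. */
struct work_area {
    unsigned int  counters[6];                      /* "some 32-bit counters" (count is a guess) */
    int           table[15];                        /* the int[15] array                         */
    unsigned char bytes[80];                        /* the byte[80] array                        */
    unsigned char maybe_unused[404 - 24 - 60 - 80]; /* trailing area that may or may not be used */
};
/* sizeof(struct work_area) == 404 with the guessed field sizes above */

Applying one such type to the whole block, rather than letting the decompiler invent overlapping locals, makes accesses render as offsets into a single structure instead of two broken shadow copies.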

Unusual function declaration in Verilog

To try to understand, I looked for some code on the internet and found the following declarations of what I suppose to be functions, which I don't understand at all.
sext #(.inwidth(1), .outwidth(32)) scc_sext_i0(
    .i0(paw_0_i0_outport0[32]),
    .o0(scc_sext_i0_o0));
combine2_wn #(.inwidth0(32), .inwidth1(32)) scc_combine2_wn_i0(
    .i0(paw_0_i0_outport0[31 : 0]),
    .i1(scc_sext_i0_o0),
    .o0(scc_combine2_wn_i0_o0));
combine2_wn #(.inwidth0(32), .inwidth1(32)) scc_combine2_wn_i1(
    .i0(scc_combine2_wn_i2_o0[31 : 0]),
    .i1(scc_combine2_wn_i2_o0[63 : 32]),
    .o0(scc_combine2_wn_i1_o0));
My questions are the following:
Are these really function mappings?
If so, they are not defined in any lower-level .v file (and no library is included in the top-level file either). So what is their use?
What does the # symbol mean?
What does .inwidth(32) mean? An input of 32 bits? (Impossible to find on the internet...)
If so, the combine2_wn blocks should have only 2 inputs; why is there an output mapped each time?
More generally, are these some kind of concatenation functions?
These are most likely module instantiations, not function calls.
You should have a module named sext and another named combine2_wn declared in files somewhere in your Verilog search path.
#() means you are assigning values to parameters inside the named modules.
There is a parameter named inwidth in the sext module. You are assigning it a value of 1.
There are plenty of references on the web. Look at the Verilog wiki site.

The usage of the global keyword in Tcl

I have a question about global in Tcl.
In one Tcl file, tclone.tcl, I have a global variable: global SIGNAL
In another Tcl file, tcltwo.tcl, I set the variable SIGNAL: set SIGNAL 10
In tclone.tcl, I imported tcltwo.tcl as follows: package require tcltwo.tcl
Will the variable SIGNAL in tclone.tcl be set to 10 when I execute it? And what is the usage of a global variable?
As stated in its manual page, the global command only has meaning inside proc bodies:
This command has no effect unless executed in the context of a proc body.
So the whole question is unclear. If you meant that you have a proc in the first file setting a global variable and another proc (in the second file) reading it, then the question makes sense and the answer is yes: the code from the second file will see the change made by the code from the first file, provided the "setting" procedure runs before the "getting" one. To make it clearer, a global variable is global with regard to the interpreter in which the code operating on that variable runs. Hence, no matter which way you fetch the code into an interpreter (package require vs source vs eval etc.), all that code will see the same set of globals.
But in any case you should probably abstain from using globals and use namespaced variables instead: they are also global, but you greatly reduce the risk that some other code introduced later will inadvertently mess with a global variable it should not touch. Of course, as usual, this depends on how complicated your application is expected to be.

Organizing Notebooks & Saving Results in Mathematica

As of now I use 3 Notebooks:
Functions
Where I have all the functions I created and call in the other Notebooks.
Transformation
Based on the original data, I compute transformations and add columns/Lists.
Where data is my raw data, I then create:
t1data : the result of the first transformation
t2data : the result of the second transformation
and so on;
I am currently at t20.
Display & Analysis
Using both of the above, I create Manipulate objects that enable me to analyze the data.
Questions
Is there a way to save the results of the Transformation Notebook so that t13data, for example, can be used in the Display & Analysis Notebook without running all the previous computations (t1, t2, t3, ..., t12) it is based on?
Is there a way to use my Functions or transformed data without opening the corresponding Notebook ?
Does my separation strategy make sense at all ?
As of now I systematically open all 3 and have to run them completely before being able to do anything, and it takes a while given my limited computing power and still-inefficient code.
Saving variable states: can be done using DumpSave, Save or Put. Read back using Get or <<
You could make a package from your functions and read those back using Needs or <<
It's not something I usually do. I opt for a monolithic notebook containing everything (nicely layered with sections and subsections so that you can fold open or close) or for a package + slightly leaner analysis notebook depending on the weather and some other hidden variables.
Saving intermediate results
The native file format for Mathematica expressions is the .m file. This is a human-readable text format, and you can view the file in a text editor if you ever doubt what is, or is not, being saved. You can load these files using Get. The shorthand form for Get is:
<< "filename.m"
Using Get will replace or refresh any existing assignments that are explicitly made in the .m file.
Saving intermediate results that are simple assignments (dat = ...) may be done with Put. The shorthand form for Put is:
dat >> "dat.m"
This saves only the assigned expression itself; to restore the definition you must use:
dat = << "dat.m"
See also PutAppend for appending data to a .m file as new results are created.
Saving results and function definitions that are complex assignments is done with Save. Examples of such assignments include:
f[x_] := subfunc[x, 2]
g[1] = "cat"
g[2] = "dog"
nCr = #!/(#2! (# - #2)!) &;
nPr = nCr[##] #2! &;
For the last example, the complexity is that nPr depends on nCr. Using Save it is sufficient to save only nPr to get a fully working definition of nPr: the definition of nCr will automatically be saved as well. The syntax is:
Save["nPr.m", nPr]
Using Save the assignments themselves are saved; to restore the definitions use:
<< "nPr.m" ;
Moving functions to a Package
In addition to Put and Save, or manual creation in a text editor, .m files may be generated automatically. This is done by creating a Notebook and setting Cell > Cell Properties > Initialization Cell on the cells that contain your function definitions. When you save the Notebook for the first time, Mathematica will ask if you want to create an Auto Save Package. Do so, and Mathematica will generate a .m file in parallel to the .nb file, containing the contents of all Initialization Cells in the Notebook. Further, it will update this .m file every time you save the Notebook, so you never need to manually update it.
Since all Initialization Cells will be saved to the parallel .m file, I recommend using this Notebook only for the generation of the Package, and not also for the rest of your computations.
When managing functions, one must consider context. Not all functions should be global at all times. A series of related functions should often be kept in its own context which can then be easily exposed to or removed from $ContextPath. Further, a series of functions often rely on subfunctions that do not need to be called outside of the primary functions, therefore these subfunctions should not be global. All of this relates to Package creation. Incidentally, it also relates to the formatting of code, because knowing that not all subfunctions must be exposed as global gives one the freedom to move many subfunctions to the "top level" of the code, that is, outside of Module or other scoping constructs, without conflicting with global symbols.
Package creation is a complex topic. You should familiarize yourself with Begin, BeginPackage, End and EndPackage to better understand it, but here is a simple framework to get you started. You can follow it as a template for the time being.
This is an old definition I used before DeleteDuplicates existed:
BeginPackage["UU`"]
UnsortedUnion::usage = "UnsortedUnion works like Union, but doesn't \
return a sorted list. \nThis function is considerably slower than \
Union though."
Begin["`Private`"]
UnsortedUnion =
  Module[{f}, f[y_] := (f[y] = Sequence[]; y); f /@ Join@##] &
End[]
EndPackage[]
Everything above goes in Initialization Cells. You can insert Text cells, Sections, or even other input cells without harming the generated Package: only the contents of the Initialization Cells will be exported.
BeginPackage defines the Context that your functions will belong to, and disables all non-System` definitions, preventing collisions. (There are ways to call other functions from your package, but that is better for another question).
By convention, a ::usage message is defined for each function that is to be accessible outside the package itself. This is not superfluous! While there are other methods, without this you will not expose your function in the visible Context.
Next, you Begin a context that is for the package alone, conventionally "`Private`". After this point any symbols you define (that are not used outside of this Begin/End block) will not be exposed globally after the Package is loaded, and will therefore not collide with Global` symbols.
After your function definition(s), you close the block with End[]. You may use as many Begin/End blocks as you like, and I typically use a separate one for each function, though it is not required.
Finally, close with EndPackage[] to restore the environment to what it was before using BeginPackage.
After you save the Notebook and generate the .m package (let's say "mypackage.m"), you can load it with Get:
<< "mypackage.m"
Now, there will be a function UnsortedUnion in the Context UU` and it will be accessible globally.
You should also look into the functionality of Needs, but that is a little more advanced in my opinion, so I shall stop here.

How to resolve naming conflicts when multiply instantiating a program in VxWorks

I need to run multiple instances of a C program in VxWorks (VxWorks has a global namespace). The problem is that the C program defines global variables (which are intended for use by a specific instance of that program) which conflict in the global namespace. I would like to make minimal changes to the program in order to make this work. All ideas welcomed!
By the way ... This isn't a good time to mention that global variables are not best practice!
The easiest thing to do would be to use task variables (see the taskVarLib documentation); a minimal sketch follows the caveats below.
When using task variables, the variable is specific to the task currently in context. On a context switch, the current value is stored and the value for the new task is loaded.
The caveat is that a task variable can only be a 32-bit number.
Each global variable must also be added independently (via its own call to taskVarAdd), and each one adds time to the context switch.
Also, you would NOT be able to share the global variable with other tasks.
And you can't use task variables with ISRs.
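Here is a minimal sketch of the task-variable approach, assuming classic VxWorks kernel tasks; the names instanceData and startInstance are hypothetical, not from the original program:

#include <vxWorks.h>
#include <taskVarLib.h>
#include <stdlib.h>

/* The conflicting global: after taskVarAdd(), every task that registered it
   sees a private 32-bit copy, saved and restored on each context switch. */
int instanceData;

STATUS startInstance (void)
{
    /* tid 0 means "the calling task"; each instance's task makes this call
       once per global it wants privatized. */
    if (taskVarAdd (0, &instanceData) == ERROR)
        return (ERROR);

    /* A task variable holds only 32 bits, so larger per-instance state is
       typically kept behind a pointer stored in it (assuming a 32-bit
       target where a pointer fits in an int). */
    instanceData = (int) calloc (1, 1024);  /* hypothetical context block */
    return (instanceData != 0 ? OK : ERROR);
}

Each spawned instance of the program would call a routine like this during its startup, after which the formerly conflicting global is effectively per-task.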
Another possibility:
If you are using VxWorks 6.x, you can make a Real Time Process (RTP) application.
This follows a process model (similar to Unix/Windows) where each instance of your program has its own global memory space, independent of any other instance.
I had to solve this when integrating two third-party libraries from the same vendor. Both libraries used some of the same symbol names, but they were not compatible with each other. Because these came from a vendor, we couldn't simply search & replace. And task variables were not applicable either, since (a) the two libs might be called from the same task and (b) some of the duplicated symbols were functions.
Assume we have app1 and app2, linked, respectively, to lib1 and lib2. Both libs define the same symbols so must be hidden from each other.
Fortunately (if you're using the GNU tools) objcopy allows you to change a symbol's binding after linking: its -L (--localize-symbol) flag turns a global symbol into a local one, hiding it from subsequent links.
Here's a sketch of the solution; you'll have to modify it for your needs.
First, perform a partial link for app1 to bind it to lib1. Here, I'm assuming that you've already partially linked *.o in app1 into app1_tmp1.o.
$(LD_PARTIAL) $(LDFLAGS) -Wl,-i -o app1_tmp2.o app1_tmp1.o $(APP1_LIBS)
Then, hide all of the symbols from lib1 in the tmp2 object you just created, producing the "real" object for app1 (objcopymips and nmmips are the MIPS-targeted GNU tools; the nm | grep | sed pipeline emits a -L flag for every data, read-only data, or text symbol defined in the libs):
objcopymips `nmmips $(APP1_LIBS) | grep ' [DRT] ' | sed -e's/^[0-9A-Fa-f]* [DRT] /-L /'` app1_tmp2.o app1.o
Repeat this for app2. Now you have app1.o and app2.o ready to link into your final application without any conflicts.
The drawback of this solution is that you don't have access to any of these symbols from the host shell. To get around this, you can temporarily turn off the symbol hiding for one or the other of the libraries for debugging.
Another possible solution would be to put your application's global variables in a static structure. For example:
From:
int global1;
int global2;

int someApp()
{
    global2 = global1 + 3;
    ...
}
To:
struct appGlobStruct {
    int global1;
    int global2;
} appGlob;    /* appGlob is a variable of this struct type */

int someApp()
{
    appGlob.global2 = appGlob.global1 + 3;
    ...
}
This simply turns into a search & replace in your application code. No change to the structure of the code.