What does Backpatching mean? - language-agnostic

What does backpatching mean? Please illustrate with a simple example.

Back patching usually refers to the process of resolving forward branches that have been planted in the code, e.g. at 'if' statements, when the value of the target becomes known, e.g. when the closing brace or matching 'else' is encountered.

In the intermediate code generation stage of a compiler we often need to emit "jump" instructions to places in the code that don't exist yet. To deal with such cases, a placeholder target label is inserted for that instruction.
A marker nonterminal in the production rule triggers a semantic action that records the index of the next instruction to be generated, so that it can later be used to patch the jump.

Statements such as conditionals, while loops, etc. are represented as a bunch of "if" and "goto" instructions when the intermediate code is generated.
The problem is that these "goto" instructions do not have a valid target at the moment they are emitted (when the compiler is reading the source code line by line, i.e. the first pass). Only after reading the whole source code are the labels these "goto"s point to determined.
So the question is: can the compiler fill in the X in the "goto X" statements in one single pass?
The answer is yes.
Without backpatching, this can be achieved with a two-pass analysis of the source code. Backpatching instead lets us create and maintain a separate list exclusively for "goto" statements. Since everything happens in a single pass, the compiler cannot fill in the X in "goto X" at first sight, because it does not yet know where X is. Instead it records the instruction on that list, and once the whole code has been processed and X is known, the placeholder is replaced by the actual address or label.
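To illustrate, here is a minimal sketch in C (the instruction array and the helper names emit and backpatch are invented for this example, not taken from any particular compiler): jumps are emitted with a placeholder target, their indices are remembered, and the targets are filled in once the labels are known.

    #include <stdio.h>

    #define UNSET -1                      /* placeholder for a not-yet-known target */

    typedef struct {
        const char *op;                   /* e.g. "iffalse x<y goto" or "goto"      */
        int target;                       /* index of the target instruction        */
    } Instr;

    static Instr code[100];
    static int next_instr = 0;

    /* Emit one instruction, possibly with an unknown (UNSET) target,
       and return its index so it can be remembered for backpatching. */
    static int emit(const char *op, int target) {
        code[next_instr].op = op;
        code[next_instr].target = target;
        return next_instr++;
    }

    /* Backpatching: fill in the target of a previously emitted jump. */
    static void backpatch(int instr_index, int target) {
        code[instr_index].target = target;
    }

    int main(void) {
        /* Translating:  if (x < y) { a = 1; } else { a = 2; }  print a;  */
        int jfalse = emit("iffalse x<y goto", UNSET);  /* 'else' not known yet */
        emit("a = 1", UNSET);
        int jend = emit("goto", UNSET);                /* 'end' not known yet  */
        int else_label = next_instr;
        emit("a = 2", UNSET);
        int end_label = next_instr;
        emit("print a", UNSET);

        backpatch(jfalse, else_label);   /* now we know where 'else' starts */
        backpatch(jend, end_label);      /* and where the statement ends    */

        for (int i = 0; i < next_instr; i++) {
            if (code[i].target == UNSET)
                printf("%d: %s\n", i, code[i].op);
            else
                printf("%d: %s %d\n", i, code[i].op, code[i].target);
        }
        return 0;
    }

Running this prints the five three-address instructions with the jump at index 0 patched to point at the else-part (instruction 3) and the goto at index 2 patched to point past the if/else (instruction 4).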

Backpatching is the process of leaving blank entries for the goto instructions whose target address is unknown on a forward transfer in the first pass, and filling in these unknowns in the second pass.

Backpatching:
The syntax-directed definition can be implemented in two or more passes (we have both synthesized attributes and inherited attributes):
Build the tree first.
Walk the tree in depth-first order.
The main difficulty with code generation in one pass is that we may not know the target of a branch when we generate code for flow-of-control statements.
Backpatching is the technique to get around this problem:
Generate branch instructions with empty targets.
When the target is known, fill in the label of the branch instructions (backpatching).

Backpatching is a process in which the operand field of an instruction containing a forward reference is left blank initially. The address of the forward-referenced symbol is put into this field when its definition is encountered in the program.

Back patching is the activity of filling in the unspecified information of labels using the appropriate semantic actions during the code generation process.
It is done for:
boolean expressions.
flow-of-control statements.

Related

How do I know when a function body ends in assembly

I want to know when a function body ends in assembly. For example, in C you have these brackets {} that tell you where the function body starts and where it ends, but how do I know this in assembly?
Is there a parser that can extract all the functions from assembly, along with the start line and end line of their bodies?
There's no foolproof way, and there might not even be a well-defined correct answer in hand-written asm.
Usually (e.g. in compiler-generated code) you know a function ends when you see the next global symbol, like objdump does to decide when to print a new "banner". But without all function-start symbols being visible, there's no unambiguous way. That's why some object file formats have room for size metadata associated with a symbol. Like .size foo, . - foo in GAS syntax.
It's not as easy as looking for a ret; some functions end with a jmp tail-call to another function. And some call a noreturn function like abort or __stack_chk_fail (not a tail-call because they want to push a return address for a backtrace). Or they just fall off into whatever's next because that path had undefined behaviour in the source, so the compiler assumed it wasn't reachable and stopped generating instructions for it, e.g. a C++ non-void function where execution can/does fall off the end without a return.
In general, assembly can blur the lines of what a function is.
Asm has features you can use to implement the high-level concept of a function, but you're not restricted to that.
e.g. multiple asm functions could all return by jumping to a common block of code that pops some registers before a ret. Is that shared tail a separate function that's tail-called with a special calling convention?
Compilers don't usually do that, but humans could.
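For instance, a hand-written sketch in GAS syntax (the labels are invented here), where two "functions" share a common epilogue and the .size metadata mentioned above records the intended boundaries:

        .text
        .globl  func_a
    func_a:                         # first "function"
        push    %rbx
        mov     $1, %eax
        jmp     common_tail         # return via the shared epilogue
        .size   func_a, . - func_a

        .globl  func_b
    func_b:                         # second "function"
        push    %rbx
        mov     $2, %eax
        # falls through into the shared tail, no jmp needed
    common_tail:
        pop     %rbx
        ret
        .size   func_b, . - func_b  # deliberately includes the shared tail

A tool scanning only for ret would never see where func_a ends; only the .size directives (and the symbols) record where the author thinks the boundaries are.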
As for function entry points, usually some other code somewhere in the program will contain a call to it. But not necessarily; it might only be reachable via a table of function pointers, and you don't know that a block of .rodata holds function pointers until you find some code loading from it and calling or jumping.
But that doesn't work if the lowest-address instruction of the function isn't its entry point. See Does a function with instructions before the entry-point label cause problems for anything (linking)? for an example.
Compilers don't generate code like that, but humans can. (It's a handy trick sometimes for https://codegolf.stackexchange.com/ questions.)
Or in the general case, a function might have multiple entry points. Or you could describe that as multiple functions with overlapping implementations. Sometimes it's as simple as one tailcalling another by falling into it without needing a jmp, i.e. it starts a few instructions before another.
I want to know when a function body ends in assembly, [...]
There are mainly four ways that the execution of a stream of (userspace) instructions can "end":
An unconditional jump like jmp, or a conditional one like Jcc (je, jnz, jg, ...)
A ret instruction (meaning the end of a subroutine) which probably comes closest to the intent of your question (including the ExitProcess "ret" command)
The call of another "function"
An exception. Not a C style exception, but rather an exception like "Invalid instruction" or "Division by 0" which terminates the user space program
[...] for example in C you have these brackets {} that tell you where the function body starts and where it ends, but how do I know this in assembly?
Simple answer: you don't. On the machine level every address can (theoretically) be an entry point to a "function". So there is no unique entry point to a "function" other than defined - and you can define anything.
On a tangent, this relates to self-modifying code and viruses, but it need not. The exit/end is as described in the first part above.
Is there a parser that can extract all the functions from assembly, along with the start line and end line of their bodies?
Disassemblers create some kind of "functions" with entry and exit points. But they are merely assumed. No way to know if that assumption is correct. This may cause problems.
The usual approach is to use a disassembler; the work of recombining the stream of instructions into separate "functions" is then left to the person who set the task (i.e. you). Some tools exist that claim to simplify this, but I cannot judge their efficacy.
From the perspective of a high-level language, there are decompilers that try to reverse the transformation from (for example) C to assembly/machine code and thus automate that task; they work more or less well in some cases.

Minor bug in pararrayfun? Is there a workaround (threads vs processes)?

I am using Octave version 4.2.2, but I think the question applies to previous versions as well.
I want to know if the following behavior is well-known and caused by my ignorance or if it is a bug that should be addressed, and also if there is a workaround. Note that I am focusing on parallelization in situations where vectorization is not possible, and where copy by values are not an option.
Basically, my problem is that functions such as pararrayfun or parcellfun seem to violate the principle that the properties of handles are passed by reference.
As an example, suppose we define
classdef data_class < handle
  properties
    data
  end
end
and that we want to take each element from the .data property of one object input_arr from this class, apply some stochastic function fancy_func to them, and copy the result to the corresponding index of the .data property of another object arr of this class. The .data property is just a matrix of size 12*1, and we want to have 4 processes that each process 3 elements.
elem_per_process = 3;
num_processes = 4;
start_indexes = {1, 4, 7, 10};
function [] = fill_arr (start_idx, num_elems, in_arr_handle, arr_to_fill_handle)
  for i = 1:num_elems
    arr_to_fill_handle.data(start_idx+i-1,:) = fancy_func (in_arr_handle.data(start_idx+i-1,:));
  end
end
filler = @(start_idx) fill_arr (start_idx, elem_per_process, input_arr, arr);
cellfun (filler, start_indexes);                    % works fine
parcellfun (num_processes, filler, start_indexes);  % PROBLEM! Nothing is copied
So, this is actually worse than what I thought:
parcellfun copies object handle properties by value.
It breaks code that works with cellfun!
A quick look at
<octave_dir>/packages/parallel-3.1.1/parcellfun.m
seems to indicate that the parallelization relies on processes instead of threads (also note pararrayfun uses parcellfun at its core), so this is expected. What bothers me is that it does so without a single warning, while going against the fundamental properties that many Octave users expect to hold.
So, to summarize:
Is this a (minor) bug?
Is there any way to parallelize using threads instead of processes (for situations where vectorization is not possible, and where copy by value is unacceptable)?
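One possible workaround, sketched here only as an untested suggestion and assuming that copying the computed results back by value is acceptable: have each worker return its chunk instead of mutating the handle, and perform the assignment in the parent process, where the reference semantics of the handle still hold.

    function chunk = compute_chunk (start_idx, num_elems, in_arr_handle)
      % Each worker builds and returns its own chunk; no handle is mutated
      % inside the child process.
      chunk = zeros (num_elems, columns (in_arr_handle.data));
      for i = 1:num_elems
        chunk(i,:) = fancy_func (in_arr_handle.data(start_idx+i-1,:));
      end
    end

    worker = @(start_idx) compute_chunk (start_idx, elem_per_process, input_arr);
    chunks = parcellfun (num_processes, worker, start_indexes, "UniformOutput", false);
    % Copy the results into the handle object in the parent process.
    for k = 1:num_processes
      arr.data(start_indexes{k} + (0:elem_per_process-1), :) = chunks{k};
    end

This does not give you shared-memory threads, but it keeps the parallel work in separate processes while leaving all mutation of handle objects in the parent.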

Shows warning during compiling

It shows a warning.
I was supposed to arrange the numbers in order, whatever their values, but move any 0s to the end.
To learn what you are doing wrong you need to read a basic book on C.
I can point out some errors and good practices:
Comparison is done with ==, so you should use if (i == 0).
After the second for loop you will want to reset i to 0 (i = 0).
The two for loops should run while i < n and j < n.
That if (a[i] == 0) comparison is not needed.
You don't need the while loop here.
You can print all of the values after the nested for loops.
Global variables are used, but you should have a good reason to use them.
Index variables are better declared and defined locally.
What you are trying to do is known as bubble sort.
If you still get errors after understanding and following all of this, try running the code through a debugger with a small value of n.
If you still can't fix it, then ask here.
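If it helps, here is a small sketch in C of the approach described above (assuming the task is: sort the numbers in ascending order, but move every 0 to the end; the array contents and variable names are just illustrative):

    #include <stdio.h>

    int main(void) {
        int a[] = {4, 0, 7, 0, 2, 9, 1};
        int n = sizeof a / sizeof a[0];

        /* Plain bubble sort, but treat 0 as "greater than everything"
           so that the zeros sink to the end of the array. */
        for (int i = 0; i < n; i++) {
            for (int j = 0; j + 1 < n - i; j++) {
                int left_is_bigger =
                    (a[j] == 0 && a[j + 1] != 0) ||                 /* zeros go last */
                    (a[j] != 0 && a[j + 1] != 0 && a[j] > a[j + 1]);
                if (left_is_bigger) {
                    int tmp = a[j];
                    a[j] = a[j + 1];
                    a[j + 1] = tmp;
                }
            }
        }

        /* Print everything once, after the loops, as suggested above. */
        for (int i = 0; i < n; i++)
            printf("%d ", a[i]);
        printf("\n");
        return 0;
    }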

Questions about the Boundary Value Check

I'm doing my JUnit homework and need some explanations here.
Here's the quotation from my homework description:
One of the issues with boundary conditions is that the system needs to behave well even if the boundary is approached multiple times. This should be obvious, but it doesn't always happen in practice.
Remember that we can characterize an object as state and behavior. Typically, the state is not directly accessible, but instead, is accessed indirectly by means of the behavior. That is, the behavior reflects the state of the object.
Now, if we think about boundaries in math, it might not be too surprising to imagine that the value at some boundary will be different if we approach that boundary in different ways. So, if the value can be likened to the state, the state at the boundary may vary depending on how we got there. This would mean that the behavior could be different.
To make objects that behave consistently, we would have to ensure that the internal state at those boundaries is consistent. So, test cases should check this assumption. To receive challenge points for this homework assignment, enhance your test cases so that potential problems around the boundaries may be discovered.
Clearly mark the Challenge test cases with the string "### challenge ###" in the comments. Include in those comments what boundary is being tested, and how you're guessing that the state of the object may be different depending on how the boundary is being approached.
I don't understand this, especially the highlighted part. What does he mean by "objects that behave consistently" and the "potential problems"?
Also, how is this different from a general boundary check that just throws the exception I expect in JUnit?
Thank you!
Without knowing the details of the homework, an answer could only be somewhat generic, but I'll try.
Boundary checking is not just exception checking; it's about seeing which paths in your code are executed under which conditions. If you have control statements (loops, if-else, switch, etc.) you have to verify under what conditions (of your internal state) those statements are processed in what way.
To me, boundary testing means changing certain values of an instance field in a way that causes the behavior to run through different branches of your code.
For example, say you have this behavior:
if (someInstanceValue > 5) {
    return "great";
} else {
    return "poor";
}
Now you could test with data for someInstanceValue on either side of the boundary:
5 : "poor"
6 : "great"
If you have multiple fields in your class, all of them define the state but only some of them may affect a certain path in your code. As the test is a specification of your class under test, written in code, you should specify which fields are relevant to a function, and which are not (by leaving them out).
So you should set up your instance-under-test accordingly (calling all setters) or if you require more complex objects, you could use frameworks like Mockito to specify the state (in a when().thenReturn() syntax).
If you want to verify if you covered all your boundaries, you could run a mutation test against your suite using a mutation testing tool like PIT. It will flip the switches in your code (i.e. replacing a < with a >=) to check whether your test will fail. Often, it's a good source of inspiration for improving the way you test.
Nevertheless, some parts of the homework assignment sound a bit confusing to me. You may approach a boundary from two sides, OK, but there is no such thing as a state that represents THE boundary; you're either on one side of the boundary or the other. If how you approached one side of a boundary matters, and the object behaves differently depending on that "history" of how you reached that state, then the history becomes part of the state. In other words: different history = different state.
Keep in mind: every instance field is part of the state. Every possible combination of values of your instance fields defines a single state. Every transition from one combination to another is a state transition triggered by calling a behavior. Now think of your test as describing this state machine by listing the triples {currentState, input} -> nextState (with the input being a method invocation). Which is basically the Given-When-Then structure good tests should have.
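As a sketch of what such a "### challenge ###" case could look like (the Counter class below is invented purely for illustration and is not part of your assignment), approach the same boundary once from below and once from above, and assert that the observable behavior is identical:

    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class BoundaryTest {

        // Minimal object under test, invented for this sketch.
        static class Counter {
            private int value;
            Counter(int start)  { value = start; }
            void increment()    { value++; }
            void decrement()    { value--; }
            String describe()   { return value > 5 ? "great" : "poor"; }
        }

        // ### challenge ### boundary at value == 5, approached from below (via
        // increment) and from above (via decrement). The guess is that the
        // object's state at the boundary might differ depending on the path taken.
        @Test
        public void boundaryBehavesTheSameHoweverItIsApproached() {
            Counter fromBelow = new Counter(4);
            fromBelow.increment();                 // 4 -> 5

            Counter fromAbove = new Counter(6);
            fromAbove.decrement();                 // 6 -> 5

            assertEquals(fromBelow.describe(), fromAbove.describe());
            assertEquals("poor", fromBelow.describe());   // 5 is not > 5
        }
    }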

Amending a condition as the stack unwinds in Common Lisp

I'm working on some lisp code that munges a CVS repository in order to get it into shape for a git conversion. As such, when something goes wrong it might not be a bug in my code but might instead be an inaccuracy in the input data which needs fixing.
Deep down in the bowels of my code, I'll have (say) some function that merges two date ranges by taking an intersection. If that intersection turns out to be empty, I'll raise an error. This is all good, but my "merge dates" function is missing lots of the information that the user (me!) needs in order to figure out what went wrong. For example, which CVS master file ("foo,v") was I working on? What branch was I thinking about? Etc. etc.
My partial solution to this problem is to handle and re-raise the error. For example, this sort of code:
(handler-case
    (do-something)
  (unclear-graft-point (c)
    (setf (slot-value c 'master) master)
    (setf (slot-value c 'branch) branch)
    (error c)))
which sets some extra slots with useful info before re-raising the error.
The report function for the condition then checks whether those slots have been set and, if so, will use them to give a more helpful error message.
Unfortunately, the backtrace that I get stops at the top-level re-raise. That makes sense: it's the code I wrote. But it's not really what I want...
Is there a way in Common Lisp to "annotate" a condition as it hurtles past, without re-raising it and hence partially unwinding the stack and truncating the backtrace?
I gave a reasonable amount of background info about what I'm doing in the hope that, if this isn't possible, someone will say "Ah, what you should do in this situation is...": am I just going about this the wrong way?
Yes, there is a Common Lisp facility applicable to your case: it is called HANDLER-BIND. You can find a detailed explanation and use cases for it, alongside the rest of the condition-handling machinery, in Peter Seibel's Practical Common Lisp, Chapter 19: Conditions and Restarts.
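For your example, a minimal sketch (reusing the condition type and slot names from the question): a handler installed by HANDLER-BIND runs before the stack unwinds, so it can annotate the condition in place and then simply return, which declines to handle it and lets the search for handlers continue with the stack (and the backtrace) intact.

    (handler-bind
        ((unclear-graft-point
           (lambda (c)
             ;; Annotate the condition in place; returning normally means we
             ;; decline to handle it, so no unwinding happens here.
             (setf (slot-value c 'master) master
                   (slot-value c 'branch) branch))))
      (do-something))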