Correct C Syntax for Function moved to RAM

I have a function that resides in flash memory but must run from RAM.
Since it is not used often, I do not want the linker to relocate it.
Everything is relative addressing, so I can move it myself.
I'm using GCC for ARM and can't get the syntax right for assigning the RAM location to the RAM-based function pointer. Please help!
This is the function in flash:
byte Flash_Function(byte b)
{
... code
}
These are global RAM variables
// pointer to ram based function .. called from flash based routines
byte (*Ram_Based_Routine)(byte) = 0; // issue is assigning a value to this
// XXX holds enough space for the routine to be copied into it
byte Ram_PlaceHolder_For_Function[XXX];
There is a function that copies "Flash_Function" into the array "Ram_PlaceHolder_For_Function" and initializes the Ram_Based_Routine pointer.
I can't get the syntax right on this assignment line:
Ram_Based_Routine = (*Flash_Function(byte))(Ram_PlaceHolder_For_Function);
If I were assigning the flash function, this would be fine:
Ram_Based_Routine = &Flash_Function;
So: how do I cast Ram_PlaceHolder_For_Function to the type of Flash_Function?
Thanks for any comments
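A sketch of the cast being asked about. Converting an object pointer to a function pointer is not defined by ISO C (it is a common extension that GCC supports on ARM), on Thumb targets the address must additionally have bit 0 set, and the 128 below merely stands in for the question's XXX:

```c
#include <string.h>

typedef unsigned char byte;
typedef byte (*RamFn)(byte);      /* a name for the function-pointer type */

byte Flash_Function(byte b) { return (byte)(b + 1); }

byte (*Ram_Based_Routine)(byte) = 0;
byte Ram_PlaceHolder_For_Function[128];   /* 128 stands in for XXX */

void Install_Ram_Routine(void)
{
    /* copy the code; the size must be known some other way, since
       sizeof cannot be applied to a function in standard C */
    memcpy(Ram_PlaceHolder_For_Function, (void *)Flash_Function, 128);

    /* the cast in question: the array name decays to byte *,
       which is then converted to the function-pointer type */
    Ram_Based_Routine = (RamFn)Ram_PlaceHolder_For_Function;

    /* on Thumb (e.g. Cortex-M) you would set bit 0 instead:
       Ram_Based_Routine = (RamFn)((unsigned long)Ram_PlaceHolder_For_Function | 1u); */
}
```

Calling through Ram_Based_Routine afterwards also assumes the copied code really is position-independent and that caches are flushed where the target requires it.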

Related

How to define, allocate, and also initialise the values of an array of user defined length

I am quite new to MIPS32 and am working on an assignment that requires me to first ask the user for the length of the array they would like to define, and then ask them for the respective values. I have written rough C code that does the same, as follows:
int main()
{
int N;
scanf("%d\n", &N); // will be the first line, will tell us the number of inputs we will get
int i=0, A[N]; // (1)
// Loop to enter the values to be sorted into an array we have made A[]. the values are entered as a1, a2.... and so on.
while(i != N)
{
scanf("%d\n", &A[i]);
i++;
}
}
I am mainly having trouble translating the code above, particularly line (1), into MIPS32. I know that defining the size of the array in the data section itself is not an option, but I am unsure how to dynamically define an array of size N and then also store values into the array. Any help or advice on what I can do would be really helpful.
Arrays can be stored in global, stack or heap memory.
Global memory
Global memory is essentially fixed-size at program build time — you put a label in your .data and reserve some constant amount of space, using .space or another data directive.
One approach here is to have a maximum (say 100), so reserve space for that many, and program a limit test to make sure the code doesn't try to use more than the pre-defined maximum.
As an exception, the last global data item can be used to store an array of unknown size.  This happens to work in QtSpim and MARS, because a fair amount of space behind the global data is there for us to use.  This approach is not very professional, since the code can't really know at what size this will stop working, but it is a valid approach for sample toy programs and throwaway assignments.  Put a label at the end of your global data and reserve no space, or just one word of space.
Integer element arrays have alignment requirements, so placing global integer data after string data often requires an alignment fix-up (as a separate directive, or by reserving a word, e.g. .word, which injects alignment automatically).
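For instance (a sketch in MARS/QtSpim syntax, using the fixed-maximum approach with 100 elements):

```asm
        .data
prompt: .asciiz "Enter N: "    # string data leaves the next address unaligned
        .align 2               # realign to a word boundary
A:      .space 400             # room for at most 100 4-byte integers
N:      .word 0                # actual element count, filled in at run time
```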
Heap memory
Heap memory can be allocated using MARS/QtSpim syscall #9.  If the allocation fails, the size was too large, though if it succeeds you have all the space that was asked for.  The syscall #9 returns a pointer to the newly allocated memory in $v0, and you will generally want to store that value somewhere (register or global) for later use.
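A sketch of such an allocation in MARS/QtSpim syntax (the globals N and Aptr are assumptions for illustration):

```asm
        # allocate N*4 bytes from the heap (syscall #9, a.k.a. sbrk)
        lw   $t0, N          # assumed global holding the element count
        sll  $a0, $t0, 2     # $a0 = number of bytes = N * 4
        li   $v0, 9          # syscall 9: allocate heap memory
        syscall              # returns a pointer to the block in $v0
        sw   $v0, Aptr       # save the pointer (assumed global) for later use
```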
Stack memory
The stack grows in the downward direction: stack memory can be allocated by decrementing the stack pointer.  The stack pointer — after a decrement — refers to the newly allocated memory.  You can decrement the stack pointer by a fixed amount or by a variable amount.  In your case, you would use a variable amount.  It is generally required that the stack pointer maintain alignment, so in computing the amount to decrement, we would round up.  If you need multiple entities, you can decrement the stack pointer multiple times, or, sum the sizes together and decrement once, which would be the more common approach.
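As a sketch (MARS/QtSpim syntax; $t1 is assumed to already hold N), a variable-sized word array can be carved out of the stack like this:

```asm
        # allocate room for N words on the stack
        sll   $t0, $t1, 2      # bytes needed = N * 4
        addiu $t0, $t0, 7      # round up to a multiple of 8...
        li    $t2, -8
        and   $t0, $t0, $t2    # ...so $sp stays 8-byte aligned
        subu  $sp, $sp, $t0    # decrement: $sp now points at the new array
        move  $s0, $sp         # remember the array's base address

        # ... use the array ...

        addu  $sp, $sp, $t0    # release it before returning to the caller
```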
Before (or as) a function returns to its caller, the stack pointer must be returned to the value it had upon function entry.  This releases any allocated stack memory and returns to the caller the same stack environment that it had when it made the function call.  It should stand to reason that it would be a logic error to return released memory to a caller, so this approach cannot be used within a function that needs to return an array to its caller.
Any function that uses syscall #10 to terminate the program does not have to honor this requirement, since the program terminates immediately upon that syscall.  This approach is often used to exit the main — MARS requires it, since it doesn't "call" the main, whereas QtSpim, by default, inserts a small startup stub that does "call" main.

How to detect function pointers from assignment statements in LLVM IR?

I want to detect all occurrences of function pointers: calls as well as assignments. I am able to detect indirect calls, but how do I detect the assignments? Is there a char iterator in the Instruction.h header, or something that iterates character by character over each instruction in the .ll file?
Any help or links will be appreciated.
Start with unoptimized LLVM IR, then scan the use-list for all functions.
In the comments we established that this is for a school project, not a production system. The difference when building a production system is that you should rely only on the properties that are guaranteed, while for research or prototyping or schoolwork, you can build something that happens to work by relying on properties that are not guaranteed. That's what we're going to do here.
It so happens (but is not guaranteed!) that as clang converts C to LLVM IR, it will[1] emit a store for every function pointer used. As long as we don't run the LLVM optimizer, they will still be there.
The "forwards" direction would be to look at each instruction and see whether it performs the action you want (assigning or calling a function pointer). I've got a better idea: let's go backwards. Every llvm::Value has a list of all the places where that value is used, called the use-list. For example, given %X = add i32 %Y, %Z, the instruction %X appears in the use-list of %Y (and of %Z) because %X uses %Y.
Starting with your Module, iterate over every function (i.e. for (auto &F : M->functions()) {), then scan the use-list of that function (i.e. for (const User *U : F.users())) and look at those users. If a user is an Instruction, you can query which Function the Instruction belongs to — the Instruction has a parent BasicBlock and the BasicBlock has a parent Function. If it's a ConstantExpr, you'll need to recurse through that ConstantExpr's use-list until you find all the instructions using those constants. If the instruction is a direct call, you want to skip it (unless it's a direct call with an argument that is also a function pointer, like somefun(1, &somefun, 2);).
This will miss any code that declares function-pointer variables but never points them at any function, for instance C code like void (*fun_ptr)(int) = NULL;. If that case is important to you, then I suggest writing a Clang AST tool with RecursiveASTVisitor instead of using LLVM.
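Put together, the backwards walk might look like this (an untested sketch against LLVM's C++ API; names such as getCalledOperand assume a reasonably recent LLVM):

```cpp
#include "llvm/IR/Module.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Constants.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Walk one user of F; recurse through ConstantExprs until we hit Instructions.
static void visitUser(const User *U, const Function &F) {
  if (const auto *I = dyn_cast<Instruction>(U)) {
    if (const auto *CB = dyn_cast<CallBase>(I)) {
      // A direct call where F is only the callee is not a pointer use,
      // but F passed as an argument of the same call still counts.
      bool passedAsArg = false;
      for (const Use &A : CB->args())
        if (A.get() == &F)
          passedAsArg = true;
      if (CB->getCalledOperand() == &F && !passedAsArg)
        return;
    }
    // Instruction -> parent BasicBlock -> parent Function
    errs() << F.getName() << " used as a pointer in "
           << I->getFunction()->getName() << "\n";
  } else if (const auto *CE = dyn_cast<ConstantExpr>(U)) {
    for (const User *UU : CE->users())
      visitUser(UU, F);
  }
}

void findFunctionPointerUses(const Module &M) {
  for (const Function &F : M.functions())
    for (const User *U : F.users())
      visitUser(U, F);
}
```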
[1] clang is free not to, for instance given if (0) { /* ... */ } clang could skip emitting LLVM IR for the disabled code because it's faster to do that. If your function pointer use was in there, LLVM would never get the opportunity to see it.

When functions are assigned to variables, how are they stored?

Normally, if you create a variable, it's usually trivial to store it in memory: just get its size (or the size of all of its components, for example in structs) and allocate that many bytes to store it. However, a function is a bit different from other data types; it's not just a primitive with a set size. My question is: how exactly are functions stored in memory?
Some example code in JavaScript:
let factorial = function(x) {
if(x == 0) return 1;
return x*factorial(x-1);
}
Once defined, I can use this function like any other variable, putting it in objects, arrays, passing it into other functions, etc.
So how does it keep track of the function? I understand that this is eventually compiled to machine code (or not, in the case of JavaScript, but I used it since it was a convenient example), but how would memory look after such a function is defined? Does it store a pointer to the code plus a marker that it's a function, or does it store the literal machine code/bytecode for the function in memory, or something else?
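In a compiled language like C, the "variable" part is just pointer-sized: the machine code lives in the executable's text segment, and the variable stores only its address. A small sketch:

```c
#include <stdio.h>

int add(int a, int b) { return a + b; }

/* f stores only the address of add's compiled code, not the code itself */
int (*f)(int, int) = add;

void demo(void)
{
    printf("%d\n", f(2, 3));    /* calls through the stored address */
    printf("%zu\n", sizeof f);  /* pointer-sized, however large add's body is */
}
```

Dynamic languages like JavaScript store more per function value (typically a closure object holding captured variables plus a reference to the code or bytecode), but the code itself is still stored once and referenced, not copied into each variable.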

How are functions modified at run-time then propagated to multiple threads?

With Clojure (and other Lisp dialects) you can modify running code. So, when a function is modified during runtime is that change made available to multiple threads?
I'm trying to figure out how it works technically in a concurrent setting: if several threads are using a function foo, what happens when I redefine (say using defn) the function foo?
There has to be some synchronization going on: when and how does such synchronization happen and what does it cost?
Say on a JVM, is the function referenced using a volatile reference? If so, does it mean every single time there's a "function lookup" then one has to pay the volatile cost?
In Clojure, functions are instances of the IFn interface, and they are almost always stored in vars. Vars are Clojure's mechanism for shared, redefinable references that also support thread-local bindings.
When you define a function, that sets the "root binding" of the var to reference the function.
Other threads see whatever the current value of the root binding is, but cannot change it. This prevents any two threads from fighting over the value of the var, because only the root thread can set the root binding.
A thread can opt in to its own value of the var by calling binding, which gives it a thread-local value that it is free to change at will, because no other thread can read it.
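The root-vs-thread-local behaviour can be sketched at the REPL (illustrative; note that binding requires the var to be declared ^:dynamic):

```clojure
(def ^:dynamic foo (fn [x] (* 2 x)))

(foo 10)                        ;; => 20, via the root binding

;; rebind thread-locally; other threads still see the root binding
(binding [foo (fn [x] (+ x 1))]
  (foo 10))                     ;; => 11, in this thread only

(foo 10)                        ;; => 20 again, root binding untouched

;; (defn foo ...) or (alter-var-root #'foo ...) replaces the root
;; binding, which every thread without a local binding will see
```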
A good understanding of vars is well worth a little study; they are a very useful concurrency device once you get used to them.
p.s.: the root thread is usually the REPL.
p.p.s.: you are of course free to store your functions in something other than vars (if, for instance, you needed to atomically update a group of functions), though this is rare.

CUDA memory allocation - is it efficient

This is my code. I have a lot of threads, and each of them calls this function many times.
Inside this function I am creating an array. Is this an efficient implementation? If it is not, please suggest an efficient implementation.
__device__ float calculate_minimum(float *arr)
{
    float vals[9]; // for each call to this function I am creating this array
    // Is it efficient? Or how can I implement this efficiently?
    // Do I need to deallocate the memory after using this array?
    for (int i = 0; i < 9; i++)
        vals[i] = // call some function and assign the values
    float min = findMin(vals);
    return min;
}
There is no "array creation" in that code; there is a statically declared array. Further, the standard CUDA compilation model will inline-expand __device__ functions, meaning that vals will be compiled into local memory, or, where possible, even into registers.
All of this happens at compile time, not run time.
Perhaps I am missing something, but from the code you have posted, you don't need the temporary array at all. Your code will be (a little) faster if you do something like this:
#include "float.h" // for FLT_MAX
__device__ float calculate_minimum(float *arr)
{
    float minVal = FLT_MAX;
    float thisVal;
    for (int i = 0; i < 9; i++) {
        thisVal = // call some function and assign the value
        minVal = min(thisVal, minVal);
    }
    return minVal;
}
Where an array is actually required, there is nothing wrong with declaring it in this way (as many others have said).
Regarding float vals[9]: this will be efficient in CUDA. For small arrays, the compiler will almost surely allocate all the elements directly into registers, so vals[0] will be a register, vals[1] will be a register, and so on.
If the compiler starts to run out of registers, or the array size is larger than around 16 elements, then local memory is used. You don't have to worry about allocating or deallocating local memory; the compiler and driver do all of that for you.
Devices of compute capability 2.0 and greater do have a call stack, to allow things like recursion. For example, you can set the stack size to 6 KB per thread with:
cudaStatus = cudaDeviceSetLimit(cudaLimitStackSize, 1024*6);
(Older CUDA releases spelled this cudaThreadSetLimit, which is now deprecated.) Normally you won't need to touch the stack yourself. Even if you put big static arrays in your device functions, the compiler and driver will see what's there and make space for you.