Hopper disassembler ASM - reverse-engineering

I am using the Hopper Disassembler to reverse engineer an iOS library. In principle, in the beginning, everything is clear and logical. But I can't find information about asm. What does asm mean? Is this a function call? If this is a function, what does it do? Thanks!
[Image: disassembled code screenshot]

It's inline assembly, equivalent to the asm keyword in C. Wikipedia might serve as an introduction.
You almost certainly get this because Hopper fails to properly decompile the instruction. In your case it's arm64 assembly (formally the A64 instruction set) as outlined in the ARMv8 Reference Manual.
Also, xN or wN in assembly should correspond to rN in the decompiled code, so you should be able to make some sense of the output you get.

Compiling hybrid CUDA/MPI and CUDA/UPC

How can I compile MPI/CUDA and UPC/CUDA hybrid code? Do I have to separately compile them or can I use language constructs interchangeably and compile as a single source file? Could someone with previous experience in this area help? Thanks in advance
MPI/CUDA: As JackOLantern has pointed out, you can write the MPI and CUDA code in separate files, compile them separately, and link them together.
For UPC, if you are using Berkeley UPC, the same procedure works, but it needs a small change at the initial configuration step: when defining the compiler parameters, specify NVCC as both the C and the C++ compiler.

Slatec + CUDA Fortran

I have code written in old-style Fortran 95 for combustion modelling. One feature of this problem is that one has to solve a stiff ODE system to account for the influence of chemical reactions. For this purpose I use the Fortran SLATEC library, which is also quite old. The solving procedure is straightforward: one just needs to call the subroutine ddriv3 in every cell of the computational domain, so it looks something like this:
do i = 1,Number_of_cells ! Number of cells is about 2000
call ddriv3(...) ! All calls are independent on cell number i
end do
ddriv3 is quite complex and utilizes many other library functions.
Is there any way to gain an advantage from CUDA Fortran without searching for another library for this purpose? If I just run this as a "parallel loop", will that be efficient, or is there another way?
I'm sorry for asking the kind of question that immediately invites the obvious answer, "Why don't you try it and find out for yourself?", but I'm under severe time constraints. I have no experience with CUDA and just want to choose the most appropriate and easiest way to start.
Thanks in advance!
You won't be able to use or parallelize the ddriv3 call without some effort. Your usage of the phrase "parallel loop" suggests to me you may be thinking of using OpenACC directives with Fortran, as opposed to CUDA Fortran, but the general answer isn't any different in either case.
The ddriv3 call, being part of a Fortran library (which is presumably compiled for x86 usage) cannot be directly used in either CUDA Fortran (i.e. using CUDA GPU kernels within Fortran) or in OpenACC Fortran, for essentially the same reason: The library code is x86 code and cannot be used on the GPU.
Since presumably you may have access to the source implementation of ddriv3, you might be able to extract the source code, and work on creating a CUDA version of it (or a version that OpenACC won't choke on), but if it uses many other library routines, it may mean that you have to create CUDA (or direct Fortran source, for OpenACC) versions of each of those library calls as well. If you have no experience with CUDA, this might not be what you want to do (I don't know.) If you go down this path, it would certainly imply learning more about CUDA, or at least converting the library calls to direct Fortran source (for an OpenACC version).
For the above reasons, it might make sense to investigate whether a GPU library replacement (or something similar) exists for the ddriv3 call (although you specifically excluded that option in your question). There are certainly GPU libraries that can assist in solving ODEs.

Using STL containers in GNU Assembler

Is it possible to "link" the STL to an assembly program, e.g. similar to linking the glibc to use functions like strlen, etc.? Specifically, I want to write an assembly function which takes as an argument a std::vector and will be part of a lib. If this is possible, is there any documentation on this?
Any use of C++ templates will require the compiler to generate instantiations of those templates. So you don't really "link" something like the STL into a program; the compiler generates object code based upon your use of templates in the library.
However, if you can write some C++ code that forces the templates to be instantiated for whatever types and other arguments you need to use, then write some C-linkage functions to wrap the uses of those template instantiations, then you should be able to call those from your assembly code.
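A minimal sketch of such wrappers, assuming the element type is int. The names vec_size, vec_data and vec_push_back are hypothetical and chosen only for illustration. Because the wrappers use std::vector<int>, the compiler instantiates the template, and the extern "C" linkage gives the functions unmangled symbol names that assembly code can call.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical wrapper names; nothing here comes from an existing library.
// Each wrapper forces the std::vector<int> members it uses to be instantiated
// and exports them under a plain C symbol name.

extern "C" std::size_t vec_size(const std::vector<int>* v) {
    return v->size();
}

extern "C" int* vec_data(std::vector<int>* v) {
    return v->data();  // pointer to the contiguous element storage
}

extern "C" void vec_push_back(std::vector<int>* v, int value) {
    v->push_back(value);  // may reallocate and invalidate earlier vec_data results
}
```

From the assembly side the vector is then just an opaque pointer that is passed back into these functions. The object layout of std::vector itself is never touched directly.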
I strongly believe you're doing it wrong. Using assembler is not going to speed up your handling of the data. If you must use existing assembly code, simply pass raw buffers.
std::vector is by definition (in the standard) compatible with raw buffers (arrays); the standard mandates contiguous allocation. Only reallocation can invalidate the memory region that contains the element data. In short, if the C++ code can know the (max) capacity required and reserve()/resize() appropriately, you can pass &vector[0] as the buffer address and be perfectly happy.
If the assembly code needs to decide how (much) to reallocate, let it use malloc. Once done, you should be able to use that array with STL algorithms, since raw pointers are valid iterators:
std::accumulate(buf, buf+n, 0, &dosomething);
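Both directions can be sketched as follows. This is only an illustration: scale_ints is a hypothetical stand-in for a routine that would really be written in assembly, and scaled_sum and malloc_sum are made-up helper names.

```cpp
#include <cstddef>
#include <cstdlib>
#include <numeric>
#include <vector>

// Hypothetical stand-in for an assembly routine: it only ever sees a pointer
// and a length, never the std::vector object itself.
extern "C" void scale_ints(int* buf, std::size_t n, int factor) {
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] *= factor;
    }
}

// Direction 1: C++ owns the storage and hands out the contiguous buffer.
int scaled_sum(std::vector<int>& v, int factor) {
    scale_ints(v.data(), v.size(), factor);  // v.data() equals &v[0] when v is non-empty
    return std::accumulate(v.begin(), v.end(), 0);
}

// Direction 2: the low-level side allocates with malloc and the pointers
// are used directly as iterators for STL algorithms.
int malloc_sum(std::size_t n) {
    int* buf = static_cast<int*>(std::malloc(n * sizeof(int)));
    if (buf == nullptr) {
        return 0;
    }
    std::iota(buf, buf + n, 1);  // fill with 1, 2, ..., n
    int total = std::accumulate(buf, buf + n, 0);
    std::free(buf);
    return total;
}
```

In the first direction the vector must not be resized while the low-level code holds the pointer, because reallocation would invalidate it.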
Alternatively, you can use the fact that std::tr1::array<T, n> or boost::array<T, n> are POD, and use placement new right on the buffer allocated in the library (see here: placement new + array +alignment or How to make tr1::array allocate aligned memory?)
Side note
I have the suspicion that you are using assembly for the wrong reasons. Optimizing compilers will leverage the full potential of modern processors (including SIMD such as SSE1-4).
E.g. for gcc have a look at
__attribute__ (e.g. for pointer restrictions such as alignment and aliasing guarantees; this will enable the more powerful vectorization options for the compiler);
-ftree-vectorize and -ftree-vectorizer-verbose=2, -march=native
Note also that the compiler cannot see inside an external assembly procedure, so across the call it must assume that every call-clobbered register, and memory, may have been modified, which can degrade performance. Inline assembly has a similar problem unless you declare its operands and clobbers. See http://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html for ways to use inline assembly with proper hints to gcc.
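As a rough illustration of what such hints look like, here is a minimal sketch in GCC/Clang extended-asm syntax. The instruction template is deliberately empty so that the example does not depend on a particular CPU; a real routine would place instructions there. The function name pass_through is hypothetical.

```cpp
// Sketch only: requires a compiler that supports GNU extended asm (GCC, Clang).
int pass_through(int value) {
    __asm__ __volatile__(
        ""              // empty template: no instructions are emitted
        : "+r"(value)   // output operand: 'value' lives in a register, read and written
        :               // no separate input operands
        : "memory"      // clobber list: tell the compiler memory may have changed
    );
    return value;       // unchanged, because the template is empty
}
```

With the operands and clobbers stated explicitly, the compiler only has to give up the assumptions that are actually listed, instead of guessing conservatively.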
Probably completely off-topic: -fopenmp and __gnu_parallel
Bonus: the following references on (premature) optimization in assembly and C++ might come in handy:
Optimizing software in C++: An optimization guide for Windows, Linux and Mac platforms
Optimizing subroutines in assembly language: An optimization guide for x86 platforms
The microarchitecture of Intel, AMD and VIA CPUs: An optimization guide for assembly programmers and compiler makers
And some other relevant resources

Pascal to Mips Code Conversion

I'm currently working on a MIPS code generator for my Pascal parser (written in C using Lex/Yacc). Does anybody know of a tool I can use as a reference to ensure correct code generation?
Here is a MIPS simulator. I used it in school to check and run my MIPS projects. One thing I remember is that this simulator has a few commands (to make it easier on us students) that real MIPS compilers don't. I am pretty sure it is all documented, though.
You can build the GNU Pascal Compiler for a MIPS target, as a cross-compiler.

How exactly do executables work?

I know that executables contain instructions, but what exactly are these instructions? If I want to call the MessageBox API function for example, what does the instruction look like?
Thanks.
Executables are binary files that are understood by the operating system. The executable contains sections which have data in them. Windows uses the PE format. The PE format has a section which holds machine instructions. These instructions are just numbers, ordered in a sequence, that are understood by the CPU.
A function call to MessageBox() would be a sequence of instructions consisting of
1) the address of the function, which lives in a DLL. The compiler and linker record a reference to it, and the loader fills in the actual address when the program is loaded
2) instructions to "push" the parameters onto the stack
3) the actual function call
4) some sort of cleanup (depends on the calling convention).
It's important to remember that EXE files are just specially formatted files. I don't have a disassembly for you, but you can try compiling your code, then open your EXE in Visual Studio to see the disassembly.
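The idea of calling through an address that is only filled in at load time can be mimicked in plain C++ with a function pointer. This is only an analogy, not the real Windows mechanism: fake_message_box, import_slot, resolve_imports and call_site are hypothetical names, and fake_message_box draws nothing.

```cpp
#include <cstring>

// Signature of the hypothetical imported function.
using MessageFn = int (*)(const char* text, const char* caption);

// Stand-in for a function that lives in another module (not the real MessageBox).
// It just returns the combined length of its arguments so the call is observable.
int fake_message_box(const char* text, const char* caption) {
    return static_cast<int>(std::strlen(text) + std::strlen(caption));
}

// Stand-in for one slot of an import table: empty in the file on disk,
// filled with the real address when the program is loaded.
MessageFn import_slot = nullptr;

// Plays the role of the loader.
void resolve_imports() {
    import_slot = &fake_message_box;
}

// The call site: arguments are set up according to the calling convention,
// then control transfers indirectly through the slot.
int call_site() {
    return import_slot("Hello", "Title");
}
```

Compiling something like this and viewing the disassembly shows the argument setup followed by an indirect call through the pointer.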
That is a bloated question if I ever saw one.
BUT, I will try my best to give an overview.
In a binary executable there are these things called "byte codes" (more commonly called opcodes or machine code). Byte codes are just the hex representation of an instruction. Commonly you can "look up" byte codes and convert them to Assembly instructions. For example:
The instruction:
mov ax, 2h
Has the byte code representation:
B8 02 00
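That lookup can be sketched in a few lines of C++. The sketch assumes 16-bit x86 code and handles only the one-byte opcodes B8 through BF ("mov r16, imm16"), where the low three bits of the opcode select the register and the next two bytes hold the immediate in little-endian order. The function name decode_mov_r16_imm16 is made up for this example.

```cpp
#include <cstdint>
#include <string>

// Decodes a single "mov r16, imm16" instruction (opcodes 0xB8..0xBF).
// 'code' must point to at least three readable bytes.
std::string decode_mov_r16_imm16(const std::uint8_t* code) {
    static const char* const regs[8] = {"ax", "cx", "dx", "bx",
                                        "sp", "bp", "si", "di"};
    const std::uint8_t opcode = code[0];
    if (opcode < 0xB8 || opcode > 0xBF) {
        return "unknown";  // anything else is outside this sketch
    }
    // The immediate is stored low byte first (little-endian).
    const unsigned imm = code[1] | (code[2] << 8);
    return std::string("mov ") + regs[opcode - 0xB8] + ", " + std::to_string(imm);
}
```

For the bytes B8 02 00 this yields "mov ax, 2". A real disassembler does the same kind of table lookup for every opcode and addressing mode.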
The byte codes get loaded into RAM and executed by the processor, as that is its "language". No one sane that I know programs in byte code; it would just be way too complicated. Assembly is...fun enough as it is. Whenever you compile a program in a higher-level language, the compiler has to take your code and turn it into Assembly instructions, so you can just imagine how mangled your code would look after it compiles. Don't get me wrong, compilers are great, but disassemble a C++ program with IDA Pro Freeware and you will see what I am talking about.
That is executables in a nutshell, there are certainly books written on this subject.
I am not a Windows API expert, but someone else can show you what the instruction would look like for calling the Windows API "MessageBox". It should only be a few lines of Assembly.
Whatever code is written (be it in C or some other language) is compiled by a compiler to a special sort of language called assembly (well, machine code, but they're very close). Assembly is a very low-level language, which the CPU executes natively. Normally, you don't program in assembly because it is so low-level (for example, you don't want to deal with pulling bits back and forth from memory).
I can't say about the MessageBox function specifically, but I'd guess that it's a LOT of instructions. Think about it: it has to draw the box, style it however your computer styles it, hook up an event handler so that something happens when the user clicks the button, tell Windows (or whatever operating system) to add it to the taskbar (or dock, etc.), and so many other things.
It depends on the language that you are working in. But for many it is as simple as...
msgbox("Your message goes here")
or
alert("Your message goes here")