What's the difference between solc's --opcodes and --asm? - ethereum

I'm learning low-level Solidity inline assembly, but I'm confused by the different output formats.
The output options of the Solidity compiler say:
--asm EVM assembly of the contracts.
--opcodes Opcodes of the contracts.
I tried both options to compile the contract below:
// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.4.16 <0.9.0;
contract MyContract {
string bar = "Hello World";
function foo() public view returns(string memory) {
return bar;
}
}
The command solc -o output --asm contract.sol generated MyContract.evm:
/* "contract.sol":70:208 contract MyContract {... */
mstore(0x40, 0x80)
/* "contract.sol":96:122 string bar = "Hello World" */
mload(0x40)
dup1
0x40
add
0x40
mstore
dup1
0x0b
dup2
mstore
0x20
add
0x7657399374655890603765000000000000000000000000000000000000000000
dup2
mstore
pop
0x00
swap1
dup1
mload
swap1
0x20
add
swap1
tag_1
swap3
swap2
swap1
tag_2
jump // in
tag_1:
pop
/* "contract.sol":70:208 contract MyContract {... */
callvalue
dup1
... (and a lot more code)
The command solc -o output --opcodes contract.sol generated MyContract.opcode:
PUSH1 0x80 PUSH1 0x40 MSTORE PUSH1 0x40 MLOAD DUP1 PUSH1 0x40 ADD PUSH1 0x40 MSTORE DUP1 PUSH1 0xB DUP2 MSTORE PUSH1 0x20 ADD PUSH32 0x7657399374655890603765000000000000000000000000000000000000000000 DUP2 MSTORE POP PUSH1 0x0 SWAP1 DUP1 MLOAD SWAP1 PUSH1 0x20 ADD SWAP1 PUSH2 0x4F SWAP3 SWAP2 SWAP1 PUSH2 0x62 JUMP JUMPDEST POP CALLVALUE DUP1 ISZERO PUSH2 0x5C JUMPI PUSH1 0x0 DUP1 REVERT JUMPDEST POP PUSH2 0x166 JUMP JUMPDEST DUP3 DUP1 SLOAD PUSH2 0x6E SWAP1 PUSH2 0x105 JUMP JUMPDEST SWAP1 PUSH1 0x0 MSTORE PUSH1 0x20 PUSH1 0x0 KECCAK256 SWAP1 PUSH1 0x1F ADD PUSH1 0x20 SWAP1 DIV DUP2 ADD SWAP3 DUP3 PUSH2 0x90 JUMPI PUSH1 0x0 DUP6 SSTORE PUSH2 0xD7 JUMP JUMPDEST DUP3 PUSH1 0x1F LT PUSH2 0xA9 JUMPI DUP1 MLOAD PUSH1 ... ...
They look pretty much similar, although not 100% matching each other...
Questions
Does --opcodes just give a more compact form of assembly code compared to --asm output?
PUSH1 0x80 PUSH1 0x40 MSTORE in --opcodes format vs. mstore(0x40, 0x80) in --asm format. Are they doing the same thing? (I guess so but not 100% sure...)
Is there a way to print --opcodes in a pretty format rather than in a single line?
Are there any good resources to learn Solidity inline assembly? I googled around but found only a bunch of one-page blog posts, which explain the basics well, but none of them gives a complete and in-depth tutorial.

1. Does --opcodes just give a more compact form of assembly code compared to --asm output?
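That's roughly correct. You could translate the assembly to opcodes and get the same result.
The --asm output is a human-readable, "low level-ish" listing: it keeps symbolic jump labels (tag_1, tag_2) and source-location comments. The --opcodes output lists the mnemonics of the "real" binary instructions passed to the EVM, with jump targets already resolved to offsets (PUSH2 0x4F where the assembly says tag_1). See for example this table that translates the opcodes to binary.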
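As a rough illustration of the mnemonic-to-binary step, here is a minimal Python sketch. The `assemble` helper and the three-entry `OPCODES` table are made up for this example and cover only the instructions in the opening line of the listing; the byte values are the ones given in the EVM opcode table.

```python
# Byte values for the mnemonics used here (taken from the EVM opcode table).
OPCODES = {"PUSH1": 0x60, "MLOAD": 0x51, "MSTORE": 0x52}

def assemble(text):
    """Turn a whitespace-separated opcode listing into raw bytecode."""
    out = bytearray()
    tokens = text.split()
    i = 0
    while i < len(tokens):
        out.append(OPCODES[tokens[i]])
        if tokens[i] == "PUSH1":          # PUSH1 is followed by one immediate byte
            i += 1
            out.append(int(tokens[i], 16))
        i += 1
    return bytes(out)

print(assemble("PUSH1 0x80 PUSH1 0x40 MSTORE").hex())  # 6080604052
```

The resulting hex string is the kind of output the compiler's binary format contains; the opcode listing is just a readable spelling of those bytes.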
2. PUSH1 0x80 PUSH1 0x40 MSTORE in --opcodes format vs. mstore(0x40, 0x80) in --asm format. Are they doing the same thing? (I guess so but not 100% sure...)
Yes, they are doing the same thing - see answer to 1. When you run this snippet in Solidity
assembly {
mstore(0x40, 0x80)
}
you get the same opcodes.
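To see why the two spellings agree, a toy stack machine is enough. This is only a sketch: the `run` function is hypothetical, supports just PUSH1 and MSTORE, and stores a plain integer per offset instead of the 32-byte word the real EVM writes.

```python
def run(listing):
    """Interpret a tiny subset of EVM opcodes: PUSH1 and MSTORE only."""
    stack, memory = [], {}
    tokens = listing.split()
    i = 0
    while i < len(tokens):
        op = tokens[i]
        if op == "PUSH1":
            i += 1
            stack.append(int(tokens[i], 16))
        elif op == "MSTORE":
            offset = stack.pop()    # top of stack: memory offset
            value = stack.pop()     # next item: value to store
            memory[offset] = value  # simplified: real EVM writes a 32-byte word
        else:
            raise ValueError(f"unsupported opcode {op}")
        i += 1
    return memory

print(run("PUSH1 0x80 PUSH1 0x40 MSTORE"))  # {64: 128}
```

The arguments of mstore(0x40, 0x80) are pushed in reverse order, so the offset 0x40 ends up on top of the stack when MSTORE executes.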
3. Is there a way to print --opcodes in a pretty format rather than in a single line?
You can use tr or any other text-formatting Unix CLI command.
echo "PUSH1 0x80 PUSH1" | tr ' ' '\n'
PUSH1
0x80
PUSH1
See this forum post for more ways.
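If splitting on every space is too coarse (it separates each PUSH from its operand), a few lines of Python can keep them together. This is a minimal sketch; `pretty_opcodes` is a made-up helper, and it assumes every PUSHn except PUSH0 is followed by exactly one operand token, as in the output above.

```python
import re

def pretty_opcodes(listing):
    """Return one instruction per line, keeping each PUSHn with its operand."""
    lines = []
    tokens = listing.split()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        has_operand = re.fullmatch(r"PUSH\d+", tok) and tok != "PUSH0"
        if has_operand and i + 1 < len(tokens):
            lines.append(f"{tok} {tokens[i + 1]}")
            i += 2
        else:
            lines.append(tok)
            i += 1
    return "\n".join(lines)

print(pretty_opcodes("PUSH1 0x80 PUSH1 0x40 MSTORE CALLVALUE DUP1"))
```

To format a whole file, pass its contents to the function, for example `pretty_opcodes(open("output/MyContract.opcode").read())`.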
4. Are there any good resources to learn Solidity inline assembly?
Apart from the documentation, I don't know of any, so I'll let someone else give a better answer. But Solidity is still a fairly new technology, so my guess is that most people who specialize in Solidity assembly learned it by trial and error.

Related

Passing arguments into an Assembly function [duplicate]

This question already has answers here:
When do we create base pointer in a function - before or after local variables?
(2 answers)
What is exactly the base pointer and stack pointer? To what do they point?
(6 answers)
Closed last month.
I'm trying to pass some arguments into a function, but it doesn't get them correctly. I want to multiply some matrices, and I want to pass: the address of matrix 1, the address of matrix 2, the address of the matrix I want the result to be in, and the size of the matrix.
matrix_mult:
pushl %ebp
pushl %ebx
pushl %edi
pushl %esi
movl %esp, %ebp
movl 8(%ebp), %edi
movl 12(%ebp), %esi
movl 16(%ebp), %ebx
movl 20(%ebp), %ecx
*the rest of the algorithm*
After those lines, the values in edi, esi, ebx, ecx are all wrong.
The call looks like this: I'm trying to multiply matrix by itself and store the result in matrix2. matrix and msize are stored correctly.
main:
*reading msize and matrix*
pushl msize
pushl $matrix2
pushl $matrix
pushl $matrix
call matrix_mult
addl $16, %esp
In the debugger it shows me that:
edi = 2
esi = big negative value
ebx = big positive value
ecx = big positive value
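One way to check where the arguments actually live is to list the stack slots relative to %ebp. The sketch below assumes a 32-bit cdecl call with 4-byte slots; `frame_offsets` is a hypothetical helper, not part of any tool. Because matrix_mult pushes four registers before `movl %esp, %ebp`, the return address sits at 16(%ebp) and the first argument at 20(%ebp), not 8(%ebp).

```python
def frame_offsets(pushed_before_mov, args):
    """Map offset-from-%ebp to what is stored there (32-bit, 4-byte slots)."""
    # The stack grows down, so the most recent push sits at 0(%ebp).
    slots = list(reversed(pushed_before_mov))
    slots.append("return address")   # pushed by `call`
    slots.extend(args)               # caller pushed right-to-left: arg1 is lowest
    return {4 * i: name for i, name in enumerate(slots)}

layout = frame_offsets(
    ["saved ebp", "saved ebx", "saved edi", "saved esi"],
    ["arg1 (matrix)", "arg2 (matrix)", "arg3 (matrix2)", "arg4 (msize)"],
)
for off, name in layout.items():
    print(f"{off:2d}(%ebp): {name}")
```

Under these assumptions, 8(%ebp) through 20(%ebp) hold the saved ebx, the saved ebp, the return address, and the first argument, which would be consistent with the small value, the negative value, and the two large positive values seen in the debugger.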

Variable initialization in as8088

I'm currently writing a function that should basically just write characters from a string into variables.
When performing test prints my variables seem fine. But when I attempt to print the first variable assigned (inchar) outside of the function, it returns an empty string, while the second variable (outchar) seems to return fine. Am I somehow overwriting the first variable?
This is my code:
_EXIT = 1
_READ = 3
_WRITE = 4
_STDOUT = 1
_STDIN = 1
_GETCHAR = 117
MAXBUFF = 100
.SECT .TEXT
start:
0: PUSH endpro2-prompt2
PUSH prompt2
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
PUSH 4
PUSH buff
CALL getline
ADD SP,4
!!!!!!!!!
PUSH buff
CALL gettrans
ADD SP,4
ADD AX,1 !gives AX an initial value to start loop
1: CMP AX,0
JE 2f
PUSH endpro-prompt1
PUSH prompt1
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
PUSH MAXBUFF
PUSH buff
CALL getline
ADD SP,2
!PUSH buff
!CALL translate
!ADD SP,4
JMP 1b
2: PUSH 0 ! exit with normal exit status
PUSH _EXIT
SYS
getline:
PUSH BX
PUSH CX
PUSH BP
MOV BP,SP
MOV BX,8(BP)
MOV CX,8(BP)
ADD CX,10(BP)
SUB CX,1
1: CMP CX,BX
JE 2f
PUSH _GETCHAR
SYS
ADD SP,2
CMPB AL,-1
JE 2f
MOVB (BX),AL
INC BX
CMPB AL,'\n'
JNE 1b
2: MOVB (BX),0
MOV AX, BX
SUB AX,8(BP)
POP BP
POP CX
POP BX
RET
gettrans:
PUSH BX
PUSH BP
MOV BP,SP
MOV BX,6(BP) !Store argument in BX
MOVB (inchar),BL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB (outchar),BL !Move char separated by space to outchar
MOV AX,1 !On success
POP BP
POP BX
RET
.SECT .BSS
buff:
.SPACE MAXBUFF
.SECT .DATA
prompt1:
.ASCII "Enter a line of text: "
endpro:
prompt2:
.ASCII "Enter 2 characters for translation: "
endpro2:
outchar:
.BYTE 0
inchar:
.BYTE 0
charct:
.BYTE 0
wordct:
.BYTE 0
linect:
.BYTE 0
inword:
.BYTE 0
This is the code used to test print
PUSH 1 ! print that byte
PUSH inchar
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
CALL printnl !function that prints new line
PUSH 1 ! print that byte
PUSH outchar
PUSH _STDOUT
PUSH _WRITE
SYS
CALL printnl
ADD SP,8
There seem to be a number of as88 8088 simulator environments, and many of their code repositories mention this bug:
1. The assembler requires sections to be defined in the following order:
TEXT
DATA
BSS
After the first occurrences, remaining section directives may appear in any order.
In case your as88 environment has a similar problem, I'd recommend moving the BSS section after DATA in your code.
In your original code you had lines like this:
MOV (outchar),BX
[snip]
MOV (inchar),BX
You defined outchar and inchar as bytes. The 2 lines above move 2 bytes (16 bits) from the BX register into each of these one-byte variables. This causes the CPU to write the extra byte into the next variable in memory. You'd want to explicitly move a single byte. Something like this might have been more appropriate:
MOVB (outchar),BL
[snip]
MOVB (inchar),BL
As you will see this code still has a bug as I mention later in this answer. To clarify - the MOVB instruction will move a single byte from BL and place it into the variable.
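The overwrite can be reproduced with a three-byte model of the variables. This is a sketch only: the offsets follow the order of outchar, inchar and charct in the .DATA section, the register values are made up, and `store_word`/`store_byte` are hypothetical stand-ins for MOV and MOVB.

```python
import struct

# Adjacent one-byte variables, in the order the .DATA section declares them.
OUTCHAR, INCHAR, CHARCT = 0, 1, 2

def store_word(mem, addr, value):
    """16-bit little-endian store, like MOV (var),BX."""
    struct.pack_into("<H", mem, addr, value)

def store_byte(mem, addr, value):
    """8-bit store of the low byte, like MOVB (var),BL."""
    mem[addr] = value & 0xFF

mem = bytearray(3)
store_word(mem, INCHAR, 0x0041)    # 'A' into inchar, 0x00 spills into charct
store_word(mem, OUTCHAR, 0x0042)   # 'B' into outchar, 0x00 spills over inchar
print(mem)                         # bytearray(b'B\x00\x00')

mem = bytearray(3)
store_byte(mem, INCHAR, 0x0041)
store_byte(mem, OUTCHAR, 0x0042)
print(mem)                         # bytearray(b'BA\x00')
```

With word stores, the second write wipes out inchar while outchar looks fine, which matches the symptom in the question.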
When you do a SYS call for Write you need to pass the address of the buffer to print, not the data in the buffer. You had 2 lines like this:
PUSH (inchar)
[snip]
PUSH (outchar)
The parentheses say to take the value in the variables and push them on the stack. SYS WRITE requires the address of the characters to display. The code to push their addresses should look like:
PUSH inchar
[snip]
PUSH outchar
gettrans function has a serious flaw in handling the copy of a byte from one buffer to another. You have code that does this:
MOV BX,6(BP) !Store argument in BX
MOVB (inchar),BL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB (outchar),BL !Move char separated by space to outchar
MOV BX,6(BP) correctly loads the buffer address passed as an argument into BX. The problem is with lines like this one:
MOVB (inchar),BL ! move first char to inchar
This isn't doing what the comment suggests it should. The line above moves the lower byte (BL) of the buffer address in BX to the variable inchar . You want to move the byte at the memory location pointed to by BX and put it into inchar. Unfortunately on the x86 you can't move the data from one memory operand to another directly. To get around this you will have to move the data from the buffer pointed to by BX into a temporary register (I'll choose CL) and then move that to the variable. The code could look like this:
MOVB CL, (BX)
MOVB (inchar),CL ! move first char to inchar
You then have to do the same for outchar so the fix in both places could look similar to this:
MOV BX,6(BP) !Store argument in BX
MOVB CL, (BX)
MOVB (inchar),CL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB CL, (BX)
MOVB (outchar),CL ! move second char to outchar
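The difference between storing BL and storing the byte that BX points to can be shown with a small memory model. This is a sketch under made-up assumptions: the address 0x0150 for buff and the input "a b" are purely illustrative.

```python
mem = bytearray(0x200)
BUFF = 0x0150                  # hypothetical address of buff
mem[BUFF:BUFF + 3] = b"a b"    # hypothetical input: two chars separated by a space

bx = BUFF                      # MOV BX,6(BP): BX holds the buffer address
bl = bx & 0xFF                 # BL is only the low byte of that address

wrong = bl                     # MOVB (inchar),BL stores 0x50, part of the address
right = mem[bx]                # MOVB CL,(BX) loads the byte BX points at

print(chr(wrong), chr(right))  # P a
```

Only the second form yields the character the user typed; the first stores whatever the low byte of the address happens to be.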
The instruction MOV (inchar),BX stores register BX to the memory location labelled inchar.
However, inchar has been defined as a .BYTE, but BX is a 16-bit register (2 bytes), so you are writing not only inchar but also the byte that follows it in memory. In the same way, MOV (outchar),BX also overwrites inchar, which is declared right after outchar.
The only reason it appears to work at first is that the 8088 is a little-endian architecture, so the low-order byte of BX is stored first and the high-order byte follows.
So, try MOV (inchar),BL

LLVM use of carry and zero flags

I'm starting to read LLVM docs and IR documentation.
In common architectures, an asm cmp instruction "result" value is -at least- 3 bits long, let's say the first bit is the SIGN flag, the second bit is the CARRY flag and the third bit is the ZERO flag.
Question 1)
Why is the result of the IR icmp instruction only an i1? (You can choose only one flag.)
Why doesn't IR define, let's call it, an icmp2 instruction returning an i3 holding the SIGN, CARRY and ZERO flags?
This i3 value can be acted upon with a switch instruction, or maybe a specific br2 instruction, like:
%result = cmp2 i32 %a, i32 %b
br2 i3 %result onzero label %EQUAL, onsign label %A_LT_B
#here %a GT %b
Question 2)
Does this make sense? Could this br2 instruction enable new optimizations, e.g. removing all jmps? Is it necessary, or are the performance gains negligible?
The reason I'm asking this -besides not being an expert in LLVM- is because in my first tests I was expecting some kind of optimization to be made by LLVM in order to avoid making the comparison twice and also avoid all branches by using asm conditional-move instructions.
My Tests:
I've compiled with clang-LLVM this:
#include <stdlib.h>
#include <inttypes.h>
typedef int32_t i32;
i32 compare (i32 a, i32 b){
// return (a - b) & 1;
if (a>b) return 1;
if (a<b) return -1;
return 0;
}
int main(int argc, char** args){
i32 n,i;
i32 a,b,avg;
srand(0); //fixed seed
for (i=0;i<500;i++){
for (n=0;n<1e6;n++){
a=rand();
b=rand();
avg+=compare(a,b);
}
}
return avg;
}
Output asm is:
...
mov r15d, -1
...
.LBB1_2: # Parent Loop BB1_1 Depth=1
# => This Inner Loop Header: Depth=2
call rand
mov r12d, eax
call rand
mov ecx, 1
cmp r12d, eax
jg .LBB1_4
# BB#3: # in Loop: Header=BB1_2 Depth=2
mov ecx, 0
cmovl ecx, r15d
.LBB1_4: # %compare.exit
# in Loop: Header=BB1_2 Depth=2
add ebx, ecx
...
I expected (all jmps removed in the inner loop):
mov r15d, -1
mov r13d, 1 # HAND CODED
call rand
mov r12d, eax
call rand
xor ecx,ecx # HAND CODED
cmp r12d, eax
cmovl ecx, r15d # HAND CODED
cmovg ecx, r13d # HAND CODED
add ebx, ecx
Performance difference (1s) seems to be negligible (on a VM under VirtualBox):
LLVM generated asm: 12.53s
handcoded asm: 11.53s
diff: 1s, in 500 million iterations
Question 3)
Are my performance measurements correct? Here are the makefile and the full handcoded.compare.s
makefile:
CC=clang -mllvm --x86-asm-syntax=intel
all:
$(CC) -S -O3 compare.c
$(CC) compare.s -o compare.test
$(CC) handcoded.compare.s -o handcoded.compare.test
echo `time ./compare.test`
echo `time ./handcoded.compare.test`
echo `time ./compare.test`
echo `time ./handcoded.compare.test`
hand coded (fixed) asm:
.text
.file "handcoded.compare.c"
.globl compare
.align 16, 0x90
.type compare,#function
compare: # #compare
.cfi_startproc
# BB#0:
mov eax, 1
cmp edi, esi
jg .LBB0_2
# BB#1:
xor ecx, ecx
cmp edi, esi
mov eax, -1
cmovge eax, ecx
.LBB0_2:
ret
.Ltmp0:
.size compare, .Ltmp0-compare
.cfi_endproc
.globl main
.align 16, 0x90
.type main,#function
main: # #main
.cfi_startproc
# BB#0:
push rbp
.Ltmp1:
.cfi_def_cfa_offset 16
push r15
.Ltmp2:
.cfi_def_cfa_offset 24
push r14
.Ltmp3:
.cfi_def_cfa_offset 32
push r12
.Ltmp4:
.cfi_def_cfa_offset 40
push rbx
.Ltmp5:
.cfi_def_cfa_offset 48
.Ltmp6:
.cfi_offset rbx, -48
.Ltmp7:
.cfi_offset r12, -40
.Ltmp8:
.cfi_offset r14, -32
.Ltmp9:
.cfi_offset r15, -24
.Ltmp10:
.cfi_offset rbp, -16
xor r14d, r14d
xor edi, edi
call srand
mov r15d, -1
mov r13d, 1 # HAND CODED
# implicit-def: EBX
.align 16, 0x90
.LBB1_1: # %.preheader
# =>This Loop Header: Depth=1
# Child Loop BB1_2 Depth 2
mov ebp, 1000000
.align 16, 0x90
.LBB1_2: # Parent Loop BB1_1 Depth=1
# => This Inner Loop Header: Depth=2
call rand
mov r12d, eax
call rand
xor ecx,ecx #hand coded
cmp r12d, eax
cmovl ecx, r15d #hand coded
cmovg ecx, r13d #hand coded
add ebx, ecx
.LBB1_3:
dec ebp
jne .LBB1_2
# BB#5: # in Loop: Header=BB1_1 Depth=1
inc r14d
cmp r14d, 500
jne .LBB1_1
# BB#6:
mov eax, ebx
pop rbx
pop r12
pop r14
pop r15
pop rbp
ret
.Ltmp11:
.size main, .Ltmp11-main
.cfi_endproc
.ident "Debian clang version 3.5.0-1~exp1 (trunk) (based on LLVM 3.5.0)"
.section ".note.GNU-stack","",#progbits
Question 1: LLVM IR is machine independent. Some machines might not even have a carry flag, or even a zero flag or sign flag. The return value is i1 which suffices to indicate TRUE or FALSE. You can set the comparison condition like 'eq' and then check the result to see if the two operands are equal or not, etc.
Question 2: LLVM IR does not care about optimization initially. The main goal is to generate a Static Single Assignment (SSA) based representation of the instructions. Optimization happens in later passes, some of which are machine independent and some machine dependent. Your br2 idea assumes that the machine supports those 3 flags, which might be a wrong assumption.
Question 3: I am not sure what you are trying to do here. Can you explain more?
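As a rough illustration of the point about i1, two boolean comparison results are already enough to rebuild the three-way answer, with no flags register involved. The Python below is only a sketch of that idea; each comparison stands in for one icmp (signed greater-than and signed less-than), and the subtraction stands in for widening the two i1 values and subtracting them.

```python
def compare(a, b):
    """Three-way compare built from two boolean (i1-like) results."""
    gt = a > b                 # plays the role of: icmp sgt
    lt = a < b                 # plays the role of: icmp slt
    return int(gt) - int(lt)   # 1, -1 or 0, with no branch in the source

print(compare(3, 2), compare(2, 3), compare(2, 2))  # 1 -1 0
```

How a backend lowers such code (one cmp feeding two conditional moves, setcc pairs, or a branch) is a target-specific decision made after the IR level.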

x86 simple function not working

Hi guys I am trying to build the following function
function int Main(){
return 5;
}
this is my assembly code:
.globl Main
Main:
pushl %ebp
movl %esp, %ebp
subl $0, %esp
pushl $5
movl %ebp, %esp
popl %ebp
ret
However, this always returns 1, never 5. Why?
How about just:
Main:
push byte 5
pop eax
ret
Summarizing what everyone said: your primary error is that the return value should go into EAX and it does not. Prolog and epilog code are not necessary for simple functions like this, but they won't hurt either (as long as they don't unbalance the stack). So the assembly should go:
(prolog)
movl $5, %eax
(epilog)
ret
Where prolog and epilog are whatever your compiler generates by default.
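A toy model of the registers and stack shows why the pushed 5 never reaches the caller. This is a sketch with made-up starting values (eax holding a leftover 1, esp at 100); `run_main` is a hypothetical helper that follows the original instruction sequence step by step.

```python
def run_main(return_in_eax):
    """Simulate the function body; returns what the caller sees in eax."""
    regs = {"eax": 1, "esp": 100, "ebp": 0}  # hypothetical initial state
    stack = {}

    def push(value):
        regs["esp"] -= 4
        stack[regs["esp"]] = value

    def pop():
        value = stack[regs["esp"]]
        regs["esp"] += 4
        return value

    push(regs["ebp"])            # pushl %ebp
    regs["ebp"] = regs["esp"]    # movl %esp, %ebp
    if return_in_eax:
        regs["eax"] = 5          # movl $5, %eax
    else:
        push(5)                  # pushl $5: the value only lands on the stack
    regs["esp"] = regs["ebp"]    # movl %ebp, %esp: discards anything pushed
    regs["ebp"] = pop()          # popl %ebp
    return regs["eax"]

print(run_main(False), run_main(True))  # 1 5
```

With the push, eax keeps whatever it held before the call; with the move into eax, the caller sees 5.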

assembly function flow [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
assembly function flow
Hello,
I am reading "Programming from the Ground Up". Even if you don't know the book, you can still help me.
There are two things in chapter 4 that I don't understand.
Q. I don't understand what "movl %ebx, -4(%ebp) #store current result" is for, or what "current result" means, in the marked section of the code below.
A little above it there is "movl 8(%ebp), %ebx", which copies 8(%ebp) into %ebx.
What confuses me is this: if the programmer wants to copy 8(%ebp) to -4(%ebp), why does the value have to pass through %ebx?
Is "movl 8(%ebp), -4(%ebp)" awkward?
Or is there a typo in "movl 8(%ebp), %ebx #put first argument in %eax"?
(I think %ebx should be %eax or vice versa.)
#PURPOSE: Program to illustrate how functions work
# This program will compute the value of
# 2^3 + 5^2
#
#Everything in the main program is stored in registers,
#so the data section doesn’t have anything.
.section .data
.section .text
.globl _start
_start:
pushl $3 #push second argument
pushl $2 #push first argument
call power #call the function
addl $8, %esp #move the stack pointer back
pushl %eax #save the first answer before
#calling the next function
pushl $2 #push second argument
pushl $5 #push first argument
call power #call the function
addl $8, %esp #move the stack pointer back
popl %ebx #The second answer is already
#in %eax. We saved the
#first answer onto the stack,
#so now we can just pop it
#out into %ebx
addl %eax, %ebx #add them together
#the result is in %ebx
movl $1, %eax #exit (%ebx is returned)
int $0x80
#PURPOSE: This function is used to compute
# the value of a number raised to
# a power.
#
#INPUT: First argument - the base number
# Second argument - the power to
# raise it to
#
#OUTPUT: Will give the result as a return value
#
#NOTES: The power must be 1 or greater
#
#VARIABLES:
# %ebx - holds the base number
# %ecx - holds the power
#
# -4(%ebp) - holds the current result
#
# %eax is used for temporary storage
#
.type power, #function
power:
pushl %ebp #save old base pointer
movl %esp, %ebp #make stack pointer the base pointer
subl $4, %esp #get room for our local storage
##########################################
movl 8(%ebp), %ebx #put first argument in %eax
movl 12(%ebp), %ecx #put second argument in %ecx
movl %ebx, -4(%ebp) #store current result
##########################################
power_loop_start:
cmpl $1, %ecx #if the power is 1, we are done
je end_power
movl -4(%ebp), %eax #move the current result into %eax
imull %ebx, %eax #multiply the current result by
#the base number
movl %eax, -4(%ebp) #store the current result
decl %ecx #decrease the power
jmp power_loop_start #run for the next power
end_power:
movl -4(%ebp), %eax #return value goes in %eax
movl %ebp, %esp #restore the stack pointer
popl %ebp #restore the base pointer
ret
Is "movl 8(%ebp), -4(%ebp)" awkward?
Yes, it is. In fact, the processor cannot do it: an x86 mov takes at most one memory operand, so a single mov cannot read from one memory location and write to another. To move a value from one memory location to another, it must first be copied into a register.
On x86 it is possible to push the value from the source and then pop it to the destination. This effectively copies memory to memory by way of the stack, but it is a special case and is most likely slower than the version you listed.
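To see what "current result" refers to, the loop in power can be mirrored in a few lines of Python. This is a sketch: the variable `current_result` plays the role of the stack slot -4(%ebp), `base` plays %ebx, and `exponent` plays %ecx.

```python
def power(base, exponent):
    """Mirror of the assembly loop; exponent must be 1 or greater."""
    current_result = base           # movl %ebx, -4(%ebp)
    while exponent != 1:            # cmpl $1, %ecx / je end_power
        current_result *= base      # imull %ebx, %eax / movl %eax, -4(%ebp)
        exponent -= 1               # decl %ecx
    return current_result           # movl -4(%ebp), %eax

print(power(2, 3) + power(5, 2))    # 33
```

The initial store seeds the running product with the base itself, which is why the loop can stop as soon as the remaining power reaches 1.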