I have this piece of asm code that uses the __ctype_b array, and I'm trying to understand what it does. Does anybody know what it is doing?
PS: I know what __ctype_b_loc() is; this is different.
mov eax, [rbp+counter]
cdqe
add rax, [rbp+server_received_buffer]
movzx eax, byte ptr [rax]
movsx rax, al
add rax, rax
mov rdx, rax
mov rax, cs:__ctype_b
lea rax, [rdx+rax]
movzx eax, word ptr [rax]
movzx eax, ax
and eax, 20h
test eax, eax
I found the answer.
__ctype_b is an array that is used in the "__isctype_l" macro:
# define __isctype_l(c, type, locale) \
((locale)->__ctype_b[(int) (c)] & (unsigned short int) type)
and it is used here:
# define __isalnum_l(c,l) __isctype_l((c), _ISalnum, (l))
# define __isalpha_l(c,l) __isctype_l((c), _ISalpha, (l))
# define __iscntrl_l(c,l) __isctype_l((c), _IScntrl, (l))
# define __isdigit_l(c,l) __isctype_l((c), _ISdigit, (l))
# define __islower_l(c,l) __isctype_l((c), _ISlower, (l))
# define __isgraph_l(c,l) __isctype_l((c), _ISgraph, (l))
# define __isprint_l(c,l) __isctype_l((c), _ISprint, (l))
# define __ispunct_l(c,l) __isctype_l((c), _ISpunct, (l))
# define __isspace_l(c,l) __isctype_l((c), _ISspace, (l))
# define __isupper_l(c,l) __isctype_l((c), _ISupper, (l))
# define __isxdigit_l(c,l) __isctype_l((c), _ISxdigit, (l))
# define __isblank_l(c,l) __isctype_l((c), _ISblank, (l))
which leads to this enum:
enum
{
_ISupper = _ISbit (0), /* UPPERCASE. */
_ISlower = _ISbit (1), /* lowercase. */
_ISalpha = _ISbit (2), /* Alphabetic. */
_ISdigit = _ISbit (3), /* Numeric. */
_ISxdigit = _ISbit (4), /* Hexadecimal numeric. */
_ISspace = _ISbit (5), /* Whitespace. */
_ISprint = _ISbit (6), /* Printing. */
_ISgraph = _ISbit (7), /* Graphical. */
_ISblank = _ISbit (8), /* Blank (usually SPC and TAB). */
_IScntrl = _ISbit (9), /* Control character. */
_ISpunct = _ISbit (10), /* Punctuation. */
_ISalnum = _ISbit (11) /* Alphanumeric. */
};
Now we need to work out what 0x20 stands for, using the _ISbit definition:
# if __BYTE_ORDER == __BIG_ENDIAN
# define _ISbit(bit) (1 << (bit))
# else /* __BYTE_ORDER == __LITTLE_ENDIAN */
# define _ISbit(bit) ((bit) < 8 ? ((1 << (bit)) << 8) : ((1 << (bit)) >> 8))
# endif
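0x20 : 0010 0000
0x20 is 1 << 5, which matches the plain 1 << (bit) definition (the __BIG_ENDIAN branch above), so the bit number is 5 and the flag is _ISspace (whitespace). With the byte-swapped __LITTLE_ENDIAN branch, _ISspace would be 0x2000 instead, so this reading assumes the libc used by the binary has the unshifted layout.
Conclusion: the asm code tests whether the current byte of server_received_buffer is whitespace.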
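The arithmetic behind that reading can be checked with a minimal C sketch. The names isbit_plain, isbit_swapped and has_flag_0x20 are hypothetical; the first two stand in for the two _ISbit branches quoted above, and the third is only an assumed C reading of the disassembly, not the original source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the "1 << (bit)" branch of _ISbit. */
static uint16_t isbit_plain(int bit) {
    assert(bit >= 0 && bit < 16);
    return (uint16_t)(1u << bit);
}

/* Hypothetical helper: the byte-swapped branch of _ISbit. */
static uint16_t isbit_swapped(int bit) {
    assert(bit >= 0 && bit < 16);
    return (uint16_t)(bit < 8 ? (1u << bit) << 8 : (1u << bit) >> 8);
}

/* Assumed C equivalent of the asm: sign-extend the byte, index a table
   of 16-bit class masks, and test mask 0x20. The table pointer must
   refer to the entry for character 0 and allow negative indexes. */
static int has_flag_0x20(const uint16_t *table, signed char c) {
    return (table[c] & 0x20) != 0;
}
```

With the plain layout bit 5 gives 0x20; with the swapped layout bit 5 gives 0x2000, and 0x20 would correspond to bit 13, which the enum above does not define.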
According to OpenSBI's source code and documentation, the firmware expects a1 to hold the FDT address passed in by the previous boot stage, which I guess is QEMU when OpenSBI is booted with the following command:
qemu-system-riscv64 -M virt -m 256M -nographic -bios build/platform/generic/firmware/fw_payload.bin
and OpenSBI firmware is built with
make PLATFORM=generic CROSS_COMPILE=riscv64-linux-gnu-
fw_base.S in OpenSBI's source code will use the value of a1 to invoke fw_platform_init, which assumes a1 contains the FDT address.
My question is: when and how is a1 set before fw_base.S runs?
a1 is set by the function riscv_setup_rom_reset_vec(...) in qemu/hw/riscv/boot.c, which builds a small reset-vector program that loads the FDT address into a1:
/* reset vector */
uint32_t reset_vec[10] = {
0x00000297, /* 1: auipc t0, %pcrel_hi(fw_dyn) */
0x02828613, /* addi a2, t0, %pcrel_lo(1b) */
0xf1402573, /* csrr a0, mhartid */
0,
0,
0x00028067, /* jr t0 */
start_addr, /* start: .dword */
start_addr_hi32,
fdt_load_addr, /* fdt_laddr: .dword */
0x00000000,
/* fw_dyn: */
};
if (riscv_is_32bit(harts)) {
reset_vec[3] = 0x0202a583; /* lw a1, 32(t0) */
reset_vec[4] = 0x0182a283; /* lw t0, 24(t0) */
} else {
reset_vec[3] = 0x0202b583; /* ld a1, 32(t0) */
reset_vec[4] = 0x0182b283; /* ld t0, 24(t0) */
}
When QEMU runs, it first jumps to 0x1000 to execute these instructions, which load a1 with the FDT address, and then jumps to _start.
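To see why those two patched words load a1 and t0 from the data words at the end of the array, the instruction encodings can be decoded by hand. The following is a minimal C sketch, not QEMU code: struct itype and decode_itype are hypothetical names, and the field positions are those of the standard RISC-V I-type format.

```c
#include <stdint.h>

/* Fields of a RISC-V I-type instruction (the format used by lw/ld). */
struct itype {
    uint32_t opcode; /* bits 6:0   */
    uint32_t rd;     /* bits 11:7  */
    uint32_t funct3; /* bits 14:12 */
    uint32_t rs1;    /* bits 19:15 */
    int32_t  imm;    /* bits 31:20, sign-extended */
};

/* Hypothetical helper: split an instruction word into I-type fields. */
static struct itype decode_itype(uint32_t insn) {
    struct itype f;
    f.opcode = insn & 0x7f;
    f.rd     = (insn >> 7) & 0x1f;
    f.funct3 = (insn >> 12) & 0x7;
    f.rs1    = (insn >> 15) & 0x1f;
    f.imm    = (int32_t)(insn >> 20);
    if (f.imm & 0x800) {
        f.imm -= 0x1000; /* sign-extend the 12-bit immediate */
    }
    return f;
}
```

Decoding 0x0202b583 gives rd = 11 (a1), rs1 = 5 (t0) and an immediate of 32, which is the byte offset of reset_vec[8] (fdt_load_addr). Decoding 0x0182b283 gives rd = 5 (t0) and an immediate of 24, the byte offset of reset_vec[6] (start_addr).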
I'm writing some functions in Delphi using assembly, and I want to put them in a .pas file called Strings.pas so that I can list it in the uses clause of a new Delphi program. What do I need to write to make it a valid library?
My function is like this:
function Strlen(texto : string) : integer;
begin
asm
mov esi, texto
xor ecx,ecx
cld
@here:
inc ecx
lodsb
cmp al,0
jne @here
dec ecx
mov Result,ecx
end;
end;
That counts the number of chars in the string. How can I put it in a library Strings.pas that I can call with uses Strings; in my form?
A .pas file is a unit, not a library. A .pas file needs to have unit, interface, and implementation statements, eg:
Strings.pas:
unit Strings;
interface
function Strlen(texto : string) : integer;
implementation
function Strlen(texto : string) : integer;
asm
// your assembly code...
// See Note below...
end;
end.
Then you can add the .pas file to your other projects and use the Strings unit as needed. It will be compiled directly into each executable. You don't need to make a separate library out of it. But if you want to, you can. Create a separate Library (DLL) or Package (BPL) project, add your .pas file to it, and compile it into an executable file that you can then reference in your other projects.
In the case of a DLL library, you will not be able to use the Strings unit directly. You will have to export your function(s) from the library (and string is not a safe data type to pass over a DLL boundary between modules), eg:
Mylib.dpr:
library Mylib;
uses
Strings;
exports
Strings.Strlen;
begin
end.
And then you can have your other projects declare the function(s) using external clause(s) that reference the DLL file, eg:
function Strlen(texto : PChar) : integer; external 'Mylib.dll';
In this case, you can make a wrapper .pas file that declares the functions to import, add that unit to your other projects and use it as needed, eg:
StringsLib.pas:
unit StringsLib;
interface
function Strlen(texto : PChar) : integer;
implementation
function Strlen; external 'Mylib.dll';
end.
In the case of a Package, you can use the Strings units directly. Simply add a reference to the package's .bpi in your other project's Requires list in the Project Manager, and then use the unit as needed. In this case, string is safe to pass around.
Note: in the assembly code you showed, for the function to not cause an access violation, you need to save and restore the ESI register. See the section on Register saving conventions in the Delphi documentation.
The correct asm version may be:
unit MyStrings; // do not overlap Strings.pas unit
interface
function StringLen(const texto : string) : integer;
implementation
function StringLen(const texto : string) : integer;
asm
test eax,eax
jz @done
mov eax,dword ptr [eax-4]
@done:
end;
end.
Note that:
I used MyStrings as unit name, since it is a very bad idea to overlap the official RTL unit names, like Strings.pas;
I wrote (const texto: string) instead of (texto: string), to avoid a reference count change at calling;
Delphi string type already has its length stored as integer just before the character memory buffer;
In Delphi asm calling conventions, the input parameters are set in eax edx ecx registers, and the integer result of a function is the eax register - see this reference article - for Win32 only;
I tested for texto to be nil (eax=0), which stands for an empty '' string;
This would work only under Win32 - asm code under Win64 would be different;
Built-in length() function would be faster than an asm sub-function, since it is inlined in new versions of Delphi;
Be aware of potential name collisions: there is already a well known StrLen() function, which expects a PChar as input parameter - so I renamed your function as StringLen().
Since you want to learn asm, here are some reference implementations of this function.
A fast PChar oriented version may be :
function StrLen(S: PAnsiChar): integer;
asm
test eax,eax
mov edx,eax
jz @0
xor eax,eax
@s: cmp byte ptr [eax+edx+0],0; je @0
cmp byte ptr [eax+edx+1],0; je @1
cmp byte ptr [eax+edx+2],0; je @2
cmp byte ptr [eax+edx+3],0; je @3
add eax,4
jmp @s
@1: inc eax
@0: ret
@2: add eax,2; ret
@3: add eax,3
end;
A more optimized version:
function StrLen(S: PAnsiChar): integer;
// pure x86 function (if SSE2 not available) - faster than SysUtils' version
asm
test eax,eax
jz @@z
cmp byte ptr [eax+0],0; je @@0
cmp byte ptr [eax+1],0; je @@1
cmp byte ptr [eax+2],0; je @@2
cmp byte ptr [eax+3],0; je @@3
push eax
and eax,-4 { DWORD Align Reads }
@@Loop:
add eax,4
mov edx,[eax] { 4 Chars per Loop }
lea ecx,[edx-$01010101]
not edx
and edx,ecx
and edx,$80808080 { Set Byte to $80 at each #0 Position }
jz @@Loop { Loop until any #0 Found }
@@SetResult:
pop ecx
bsf edx,edx { Find First #0 Position }
shr edx,3 { Byte Offset of First #0 }
add eax,edx { Address of First #0 }
sub eax,ecx { Returns Length }
@@z: ret
@@0: xor eax,eax; ret
@@1: mov eax,1; ret
@@2: mov eax,2; ret
@@3: mov eax,3
end;
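The core of the loop above is the lea/not/and/and sequence, which flags zero bytes in a 32-bit word without testing each byte. A minimal C sketch of just that step follows; zero_byte_mask and first_zero_byte are hypothetical names, and the byte index assumes the little-endian view the asm uses (byte 0 is the lowest byte of the word).

```c
#include <stdint.h>

/* Nonzero if and only if at least one of the four bytes of v is 0x00.
   Same expression as the lea/not/and/and sequence in the asm loop. */
static uint32_t zero_byte_mask(uint32_t v) {
    return (v - 0x01010101u) & ~v & 0x80808080u;
}

/* Index (0..3, lowest byte first) of the first zero byte, or -1 if none.
   Plays the role of the bsf/shr pair in the asm. */
static int first_zero_byte(uint32_t v) {
    uint32_t m = zero_byte_mask(v);
    int idx = 0;
    if (m == 0) {
        return -1;
    }
    while ((m & 0x80u) == 0) { /* walk up to the lowest flagged byte */
        m >>= 8;
        idx++;
    }
    return idx;
}
```

The mask can also flag a 0x01 byte that sits above a zero byte (the borrow propagates), but the lowest flagged byte is always a genuine zero, which is why taking the first set bit is enough.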
An SSE2 optimized version:
function StrLen(S: PAnsiChar): integer;
asm // from GPL strlen32.asm by Agner Fog - www.agner.org/optimize
or eax,eax
mov ecx,eax // copy pointer
jz @null // returns 0 if S=nil
push eax // save start address
pxor xmm0,xmm0 // set to zero
and ecx,0FH // lower 4 bits indicate misalignment
and eax,-10H // align pointer by 16
movdqa xmm1,[eax] // read from nearest preceding boundary
pcmpeqb xmm1,xmm0 // compare 16 bytes with zero
pmovmskb edx,xmm1 // get one bit for each byte result
shr edx,cl // shift out false bits
shl edx,cl // shift back again
bsf edx,edx // find first 1-bit
jnz @A200 // found
// Main loop, search 16 bytes at a time
@A100: add eax,10H // increment pointer by 16
movdqa xmm1,[eax] // read 16 bytes aligned
pcmpeqb xmm1,xmm0 // compare 16 bytes with zero
pmovmskb edx,xmm1 // get one bit for each byte result
bsf edx,edx // find first 1-bit
// (moving the bsf out of the loop and using test here would be faster
// for long strings on old processors, but we are assuming that most
// strings are short, and newer processors have higher priority)
jz @A100 // loop if not found
@A200: // Zero-byte found. Compute string length
pop ecx // restore start address
sub eax,ecx // subtract start address
add eax,edx // add byte index
@null:
end;
Or even a SSE4.2 optimized version:
function StrLen(S: PAnsiChar): integer;
asm // warning: may read up to 15 bytes beyond the string itself
or eax,eax
mov edx,eax // copy pointer
jz @null // returns 0 if S=nil
xor eax,eax
pxor xmm0,xmm0
{$ifdef HASAESNI}
pcmpistri xmm0,dqword [edx],EQUAL_EACH // comparison result in ecx
{$else}
db $66,$0F,$3A,$63,$02,EQUAL_EACH
{$endif}
jnz @loop
mov eax,ecx
@null: ret
@loop: add eax,16
{$ifdef HASAESNI}
pcmpistri xmm0,dqword [edx+eax],EQUAL_EACH // comparison result in ecx
{$else}
db $66,$0F,$3A,$63,$04,$10,EQUAL_EACH
{$endif}
jnz @loop
@ok: add eax,ecx
end;
You will find all those functions, including Win64 versions, in our very optimized SynCommons.pas unit, which is shared by almost all our Open Source projects.
As Peter Cordes says, my two solutions for getting the length of the two string types are not both useful.
Only PAnsiCharLen() could be an alternative solution, but it is not as fast as the optimized StrLen() by Arnaud Bouchez, which is about 3 times faster than mine.
10/14/2017 (mm/dd/yyyy): Added one new function (Clean_Str).
For now, however, I propose three small corrections to both of them, two of which were suggested by Peter Cordes: 1) use MovZX instead of Mov followed by And; 2) use SetZ/SetE instead of LAHF/ShL, and XOr EAX,EAX instead of XOr AL,AL.
In the future I could define the functions in assembly (now they are defined in Pascal):
unit MyStr;
{ Some strings' function }
interface
Function PAnsiCharLen(S:PAnsiChar):Integer;
{ Get the length of the PAnsiChar^ string. }
Function ShortStrLen(S:ShortString):Integer;
{ Get the length of the ShortString^ string. }
Procedure Clean_Str(Str:ShortString;Max_Len:Integer);
{ This procedure can be used to clear the unused space of a short string
without modifying its useful content (for example, if you save a
short-string field in a file, two files with the same content may
differ because the unused space is not initialized).
Clears a string Str_Ptr^ : String[], which has
Max_Len = SizeOf(String[]) - 1 characters, by setting to #0
all characters beyond the position of Str_Ptr^[Str_Ptr^[0]] }
implementation
Function PAnsiCharLen(S:PAnsiChar):Integer;
{ EAX EDX ECX are 1°, 2° AND 3° PARAMETERs.
Can freely modify the EAX, ECX, AND EDX REGISTERs. }
Asm
ClD {Clear string direction flag}
Push EDI {Save EDI's reg. into the STACK}
Mov EDI,S {Load S into EDI's reg.}
XOr EAX,EAX {Set AL's reg. with null terminator}
Mov ECX,-1 {Set ECX's reg. with maximum length of the string}
RepNE ScaSB {Search null and decrease ECX's reg.}
SetE AL {AL is set with FZero}
Add EAX,ECX {EAX= maximum_length_of_the_string - real_length_of_the_string}
Not EAX {EAX= real_length_of_the_string}
Pop EDI {Restore EDI's reg. from the STACK}
End;
Function ShortStrLen(S:ShortString):Integer; Assembler;
{ EAX EDX ECX are 1°, 2° AND 3° PARAMETERs.
Can freely modify the EAX, ECX, AND EDX REGISTERs. }
Asm
MovZX EAX,Byte Ptr [EAX] {Load the length of S^ into EAX's reg. (function's result)}
End;
Procedure Clean_Str(Str:ShortString;Max_Len:Integer); Assembler;
(* EAX EDX ECX are 1°, 2° AND 3° PARAMETERs.
Can freely modify the EAX, ECX, AND EDX REGISTERs. *)
Asm
ClD {Clear string direction flag}
Push EDI {Save EDI's reg. into the STACK}
Mov EDI,Str {Load input string pointer into EDI's reg.}
Mov ECX,Max_Len {Load allocated string length into ECX's reg.}
MovZX EDX,Byte Ptr [EDI] {Load real string length into EDX's reg.}
StC {Process the address of unused space of Str; ...}
AdC EDI,EDX {... skip first byte and useful Str space}
Cmp EDX,ECX {If EDX>ECX ...}
CMovGE EDX,ECX {... set EDX with ECX}
Sub ECX,EDX {ECX contains the size of unused space of Str}
XOr EAX,EAX {Clear accumulator}
Rep StoSB {Fill with 0 the unused space of Str}
Pop EDI {Restore EDI's reg. from the STACK}
End;
end.
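The arithmetic in PAnsiCharLen (ECX starts at -1, then SetE, Add and Not) is easier to follow when modelled step by step. The following is a minimal C sketch of that register arithmetic only; scasb_len is a hypothetical name and the code emulates the instructions rather than calling them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the RepNE ScaSB length computation. */
static int32_t scasb_len(const char *s) {
    uint32_t ecx = 0xFFFFFFFFu; /* Mov ECX,-1 */
    uint32_t zf;
    uint32_t eax;

    assert(s != NULL);          /* like the asm, nil is not handled */
    do {                        /* RepNE ScaSB: compare, advance, count down */
        zf = (*s++ == 0);
        ecx--;
    } while (!zf && ecx != 0);

    eax = zf;                   /* XOr EAX,EAX then SetE AL */
    eax += ecx;                 /* Add EAX,ECX */
    return (int32_t)~eax;       /* Not EAX */
}
```

For a string of length n the scan touches n + 1 bytes, so ECX ends at -(n + 2); adding 1 gives -(n + 1), and the bitwise Not turns that into n.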
Old (incomplete) answer:
"Some new string functions, not present in the Delphi library, could be these:"
Type Whole=Set Of Char;
Procedure AsmKeepField (PStrIn,PStrOut:Pointer;FieldPos:Byte;
All:Boolean);
{ Given "field" as a sequence of characters that does not contain spaces
or tabs (# 32, # 9), it takes FieldPos (1..N) field
to PStrIn ^ (STRING) and copies it to PStrOut ^ (STRING).
If All = TRUE, it also takes all subsequent fields }
Function AsmUpCComp (PStr1,PStr2:Pointer):Boolean;
{ Compare a string PStr1 ^ (STRING) with a string PStr2 ^ (STRING),
considering the PStr1 alphabetic characters ^ always SHIFT }
Function UpCaseStrComp (Str1,Str2:String;Mode:Boolean):ShortInt;
{ Returns: -1 if Str1 < Str2.
0 if Str1 = Str2.
1 if Str1 > Str2.
MODE = FALSE means "case sensitive comparison" (the letters are
considered as they are).
MODE = TRUE means that the comparison is done by considering
both strings as if they were all uppercase }
Function KeepLS (Str:String;CntX:Byte):String;
{ RETURN THE PART OF STR THAT INCLUDES THE FIRST CHARACTER
OF STR AND ALL THE FOLLOW UP TO THE POSITION CntX (0 to N-1) INCLUDED }
Function KeepRS (Str:String;CntX,CsMode:Byte):String;
{ RETURN THE PART OF STR STARTING TO POSITION CntX + 1 (0 to N-1)
UP TO END OF STR.
IF CsMode = 0 (INSERT MODE), IF CsMode = 1 (OVERWRITE-MODE):
IN THIS CASE, THE CHARACTER TO CntX + 1 POSITION IS NOT INCLUDED }
Function GetSubStr (Str:String;
Pos,Qnt:Byte;CH:Char):String;
{ RETURN Qnt STR CHARACTERS FROM POSITION Pos (1 to N) OF STR;
IF EFFECTIVE LENGTH IS LESS THAN Qnt, WILL ADDED CHARACTER = CH }
Function Keep_Right_Path_Str_W(PathName:String;FieldWidth:Byte;
FormatWithSpaces:Boolean):String;
{ RESIZE A STRING OF A FILE PATH, FROM PathName;
THE NEW STRING WILL HAVE A MAXIMUM LENGTH OF FieldWidth CHARACTERS.
REPLACE EXCESS CHARACTERS WITH 3 POINTS,
INSERTED AFTER DRIVE AND ROOT.
REPLACE SOME DIRECTORY WITH 3 POINTS,
ONLY WHEN IT IS NECESSARY, POSSIBLE FROM SECOND.
FORMAT RETURN WITH SPACE ONLY IF FormatWithSpaces = TRUE }
Function KeepBarStr (Percentage,Qnt:Byte;
Ch1,Ch2,Ch3:Char):String;
{ THIS IS A FUNCTION WHICH MAKES A STRING WHICH CONTAINS A REPRESENTATION OF STATE
OF ADVANCEMENT OF A PROCESS; IT RETURNS A CHARACTERS' SEQUENCE, CONSTITUTED BY "<Ch1>"
(LENGTH = Percentage / 100 * Qnt), WITH AN APPROXIMATION OF THE LAST CHARACTER TO
"<Ch2>" (IF "Percentage / 100 * Qnt" HAS HIS FRACTIONAL'S PART GREATER THAN 0.5),
FOLLOWED BY AN OTHER CHARACTERS' SEQUENCE, CONSTITUTED BY "<Ch3>" (LENGTH = (100 -
Percentage) / 100 * Qnt). }
Function Str2ChWhole (Str:String;Var StrIndex:Byte;
Var ChSet:Whole;
Mode:Boolean):Boolean;
{ CONVERT A PART OF Str, POINTED BY StrIndex, IN A ChSet CHARACTER SET;
IF Mode = TRUE, "StrIn" SHOULD CONTAIN ASCII CODES
OF CORRESPONDING CHARACTERS EXPRESSED IN DECIMAL SIZE;
OTHERWISE IT SHOULD CONTAIN CORRESPONDING CHARACTER SYMBOLS }
Function ChWhole2Str (ChSet:Whole;Mode:Boolean):String;
{ CONVERT A SET OF CHARACTERS IN A CORRESPONDING STRING;
IF Mode = TRUE ELEMENTS OF ChSet WILL BE CONVERTED IN ASCII CODES
EXPRESSED IN DECIMAL SIZE; OTHERWISE THE CORRESPONDING SYMBOLS
WILL BE RETURNED }
Function ConverteFSize (FSize:LongInt;
Var SizeStr:TSizeStr):Integer;
{ MAKES THE CONVERSION OF THE DIMENSION OF A FILE IN A TEXT,
LARGE TO MAXIMUM 5 CHARACTERS, AND RETURN THE COLOR OF THIS STRING }
Function UpCasePos (SubStr,Str:String):Byte;
{ Like the Pos () system function, but not "case sensitive" }
I'm starting to read LLVM docs and IR documentation.
In common architectures, an asm cmp instruction "result" value is -at least- 3 bits long, let's say the first bit is the SIGN flag, the second bit is the CARRY flag and the third bit is the ZERO flag.
Question 1)
Why is the result of the IR icmp instruction only an i1? (You can choose only one flag.)
Why doesn't IR define, let's call it, an icmp2 instruction returning an i3 holding the SIGN, CARRY and ZERO flags?
This i3 value can be acted upon with a switch instruction, or maybe a specific br2 instruction, like:
%result = cmp2 i32 %a, i32 %b
br2 i3 %result onzero label %EQUAL, onsign label %A_LT_B
#here %a GT %b
Question 2)
Does this make sense? Could this br2 instruction help create new optimizations, e.g. remove all jmps? Is it necessary, or are the performance gains negligible?
The reason I'm asking this -besides not being an expert in LLVM- is because in my first tests I was expecting some kind of optimization to be made by LLVM in order to avoid making the comparison twice and also avoid all branches by using asm conditional-move instructions.
My Tests:
I've compiled with clang-LLVM this:
#include <stdlib.h>
#include <inttypes.h>
typedef int32_t i32;
i32 compare (i32 a, i32 b){
// return (a - b) & 1;
if (a>b) return 1;
if (a<b) return -1;
return 0;
}
int main(int argc, char** args){
i32 n,i;
i32 a,b,avg;
srand(0); //fixed seed
for (i=0;i<500;i++){
for (n=0;n<1e6;n++){
a=rand();
b=rand();
avg+=compare(a,b);
}
}
return avg;
}
Output asm is:
...
mov r15d, -1
...
.LBB1_2: # Parent Loop BB1_1 Depth=1
# => This Inner Loop Header: Depth=2
call rand
mov r12d, eax
call rand
mov ecx, 1
cmp r12d, eax
jg .LBB1_4
# BB#3: # in Loop: Header=BB1_2 Depth=2
mov ecx, 0
cmovl ecx, r15d
.LBB1_4: # %compare.exit
# in Loop: Header=BB1_2 Depth=2
add ebx, ecx
...
I expected (all jmps removed in the inner loop):
mov r15d, -1
mov r13d, 1 # HAND CODED
call rand
mov r12d, eax
call rand
xor ecx,ecx # HAND CODED
cmp r12d, eax
cmovl ecx, r15d # HAND CODED
cmovg ecx, r13d # HAND CODED
add ebx, ecx
Performance difference (1s) seems to be negligible (on a VM under VirtualBox):
LLVM generated asm: 12.53s
hand-coded asm: 11.53s
diff: 1s, in 500 million iterations
Question 3)
Are my performance measures correct? Here's the makefile and the full handcoded.compare.s
makefile:
CC=clang -mllvm --x86-asm-syntax=intel
all:
$(CC) -S -O3 compare.c
$(CC) compare.s -o compare.test
$(CC) handcoded.compare.s -o handcoded.compare.test
echo `time ./compare.test`
echo `time ./handcoded.compare.test`
echo `time ./compare.test`
echo `time ./handcoded.compare.test`
hand coded (fixed) asm:
.text
.file "handcoded.compare.c"
.globl compare
.align 16, 0x90
.type compare,@function
compare: # @compare
.cfi_startproc
# BB#0:
mov eax, 1
cmp edi, esi
jg .LBB0_2
# BB#1:
xor ecx, ecx
cmp edi, esi
mov eax, -1
cmovge eax, ecx
.LBB0_2:
ret
.Ltmp0:
.size compare, .Ltmp0-compare
.cfi_endproc
.globl main
.align 16, 0x90
.type main,@function
main: # @main
.cfi_startproc
# BB#0:
push rbp
.Ltmp1:
.cfi_def_cfa_offset 16
push r15
.Ltmp2:
.cfi_def_cfa_offset 24
push r14
.Ltmp3:
.cfi_def_cfa_offset 32
push r12
.Ltmp4:
.cfi_def_cfa_offset 40
push rbx
.Ltmp5:
.cfi_def_cfa_offset 48
.Ltmp6:
.cfi_offset rbx, -48
.Ltmp7:
.cfi_offset r12, -40
.Ltmp8:
.cfi_offset r14, -32
.Ltmp9:
.cfi_offset r15, -24
.Ltmp10:
.cfi_offset rbp, -16
xor r14d, r14d
xor edi, edi
call srand
mov r15d, -1
mov r13d, 1 # HAND CODED
# implicit-def: EBX
.align 16, 0x90
.LBB1_1: # %.preheader
# =>This Loop Header: Depth=1
# Child Loop BB1_2 Depth 2
mov ebp, 1000000
.align 16, 0x90
.LBB1_2: # Parent Loop BB1_1 Depth=1
# => This Inner Loop Header: Depth=2
call rand
mov r12d, eax
call rand
xor ecx,ecx #hand coded
cmp r12d, eax
cmovl ecx, r15d #hand coded
cmovg ecx, r13d #hand coded
add ebx, ecx
.LBB1_3:
dec ebp
jne .LBB1_2
# BB#5: # in Loop: Header=BB1_1 Depth=1
inc r14d
cmp r14d, 500
jne .LBB1_1
# BB#6:
mov eax, ebx
pop rbx
pop r12
pop r14
pop r15
pop rbp
ret
.Ltmp11:
.size main, .Ltmp11-main
.cfi_endproc
.ident "Debian clang version 3.5.0-1~exp1 (trunk) (based on LLVM 3.5.0)"
.section ".note.GNU-stack","",@progbits
Question 1: LLVM IR is machine independent. Some machines might not even have a carry flag, or even a zero flag or sign flag. The return value is i1 which suffices to indicate TRUE or FALSE. You can set the comparison condition like 'eq' and then check the result to see if the two operands are equal or not, etc.
Question 2: LLVM IR does not care about optimization initially. The main goal is to generate a Static Single Assignment (SSA) based representation of instructions. Optimization happens in later passes, some of which are machine independent and some machine dependent. Your br2 idea assumes that the machine supports those 3 flags, which might be a wrong assumption.
Question 3: I am not sure what you are trying to do here. Can you explain more?
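To illustrate the point made for Question 1, a three-way result can be built from two boolean comparisons, each of which is the kind of true/false value an i1 carries, with no flags value involved. A minimal C sketch follows; compare3 is a hypothetical name, and whether a compiler lowers it to branch-free code depends on the target and optimization level.

```c
#include <stdint.h>

/* Each relational test yields 0 or 1, much like an i1 from icmp.
   Returns 1 if a > b, -1 if a < b, 0 if they are equal. */
static int32_t compare3(int32_t a, int32_t b) {
    return (a > b) - (a < b);
}
```

Unlike a subtraction-based comparison, this form cannot overflow, because only the two 0/1 results are subtracted.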
I'm trying to write a function that receives a number (which I pushed earlier), and prints it. How can I do it?
What I have so far:
org 100h
push 10
call print_num
print_num:
push bp
mov bp, sp
mov ax, [bp+2*2]
mov bx, cs
mov es, bx
mov dx, string
mov di, dx
stosw
mov ah, 09h
int 21h
pop bp
ret
string:
What you're placing at the address of string is a numerical value, not the string representation of that value.
The value 12 and the string "12" are two separate things. Seen as a 16-bit hexadecimal value, 12 would be 0x000C while "12" would be 0x3231 (0x32 == '2', 0x31 == '1').
You need to convert the numerical value into its string representation and then print the resulting string. Rather than just pasting a finished solution, I'll show a simple way this could be done in C, which should be enough for you to base an 8086 implementation on:
char string[8], *stringptr;
short num = 123;
string[7] = '$'; // DOS string terminator
// The string will be filled up backwards
stringptr = string + 6;
while (stringptr >= string) {
*stringptr = '0' + (num % 10); // '3' on the first iteration, '2' on the second, etc
num /= 10; // 123 => 12 => 1 => 0
if (num == 0) break;
stringptr--;
}
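The same backwards-filling idea can be wrapped in a function so it is easy to try out. This is a minimal sketch under a few assumptions: to_dos_string is a hypothetical name, the number is taken as unsigned to keep the remainder non-negative, and the caller supplies the 8-byte buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the decimal digits of num at the end of buf[8], followed by
   the DOS '$' terminator, and returns a pointer to the first digit. */
static char *to_dos_string(unsigned short num, char buf[8]) {
    char *p;

    assert(buf != NULL);
    p = buf + 7;
    *p = '$';                          /* DOS string terminator */
    do {
        *--p = (char)('0' + num % 10); /* lowest digit first, moving left */
        num /= 10;
    } while (num != 0);
    return p;                          /* what DX should point at for int 21h/09h */
}
```

A 16-bit value has at most five decimal digits, so five digits plus the terminator always fit in the 8-byte buffer.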
I wanted to print the first 20 numbers using a loop.
Printing the first nine numbers is fine, as the hexadecimal and decimal codes are the same, but from the 10th number on I had to convert each number into its appropriate code, store it in a string, and finally display it.
That is,
If (NUMBER > 9)
ADD 6D
;10d = 0ah --(+6)--> 16d = 10h
IF NUMBER IS > 19
ADD 12D
;20d = 14h --(+12)--> 32d = 20h
Then rotating and shifting each number to get the desired output number, that is,
DAA # let al = 74h = 0111.0100
XOR AH,AH # ah = 0 (Just in case it wasn't)
# ax = 0000.0000.0111.0100
ROR AX,4 # ax = 0100.0000.0000.0111 = 4007h
SHR AH,4 # ax = 0000.0100.0000.0111 = 0407h
ADD AX,3030h # ax = 0011.0100.0011.0111 = 3437h = ASCII "74" (Reversed due to little endian)
And then storing the result in to the string and displaying it, that is,
MOV BX,OFFSET Result ;Let Result is an empty string
MOV byte ptr[BX],5 ;Size of the string
MOV byte ptr[BX+4],'$' ;String terminator
MOV byte ptr[BX+3],AH ;storing number
MOV byte ptr[BX+2],AL
MOV DX,BX
ADD DX,02 ;Displaying the result
MOV AH,09H ;Interrupt 21 service to display string
INT 21H
And here is the complete code with proper commenting,
MOV CX,20 ;Number of iterations
MOV DX,0 ;First value of the sequence
L1:
PUSH DX
ADD DX,30H ; 30H is equal to 0 in hexadecimal , 31H = 1 and so on
MOV AH,02H ; INTERRUPT Service to print the DX content
INT 21H
POP DX
ADD DX,1
CMP DX,09 ; if number is > 9 i.e 0A then go to L2
JA L2
LOOP L1
L2:
PUSH DX
MOV AX,DX
CMP AX,14H ;If number is equal to 14H(20) then Jump to L3
JE L3
ADD AX,6D ;If less than 20 then add 6D
XOR AH,AH ;Clear the content of AH
ROR AX,4 ;Rotating and Shifting for to properly store
SHR AH,4
ADC AX,3030h
MOV BX,OFFSET Result
MOV byte ptr[BX],5
MOV byte ptr[BX+4],'$'
MOV byte ptr[BX+3],AH
MOV byte ptr[BX+2],AL
MOV DX,BX
ADD DX,02
MOV AH,09H
INT 21H
POP DX
ADD DX,1
LOOP L2
;If the number is equal to 20 come here, ->
; Every step is repeated here just to change 6D to 12D
L3:
ADD AX,12D
XOR AH,AH
ROR AX,1
ROR AX,1
ROR AX,1
ROR AX,1
SHR AH,1
SHR AH,1
SHR AH,1
SHR AH,1
ADC AX,3030h
MOV BX,OFFSET Result
MOV byte ptr[BX],5
MOV byte ptr[BX+4],'$'
MOV byte ptr[BX+3],AH
MOV byte ptr[BX+2],AL
MOV DX,BX
ADD DX,02
MOV AH,09H
INT 21H
Is there a proper way to do this, creating a function and using if/else (jumps) to get the desired output rather than repeating the code again and again?
PSEUDO CODE:
VAR = 6
IF Number is > 9
ADD AX,VAR
Else IF Number is > 19
ADD AX,(VAR*2)
ELSE IF NUMBER is > 29
ADD AX,(VAR*3)
So you just want to print 0 ... 20 as ASCII characters? It looks like you understand that the numerals are identified as 0x30 ... 0x39 for '0' to '9', so you could use integer division to generate the character for the tens digit:
I usually work with C but conversion to assembler shouldn't be too complicated since these are all fundamental operations and there are no function calls.
int i_value = 29;
int i_tens = i_value/10; //Integer division! 29/10 = 2, save for later use
char c_tens = '0' + i_tens;
char c_ones = '0' + i_value-(10*i_tens); // Subtract N*10 from value
The output will be c_tens = 0x32, c_ones = 0x39. You should be able to wrap this inside of a loop pretty easily using a pair of registers.
Pseudocode
regA <- num_iterations //For example, 20
regB <- 0 //Initialize counter register
LOOP:
//Do conversion for the current iteration.
//Manipulate bytes for output as necessary.
regB <- regB +1
branch not equal regA, regB LOOP
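Putting the division step and the loop together, a minimal C sketch could look like the following. The names two_digits and fill_table are hypothetical, and the sketch assumes values in the range 0 to 99, printed with a leading zero.

```c
#include <assert.h>

/* Converts value (0..99) into two ASCII digits plus a NUL terminator. */
static void two_digits(int value, char out[3]) {
    int tens;

    assert(value >= 0 && value <= 99);
    tens = value / 10;                          /* integer division */
    out[0] = (char)('0' + tens);
    out[1] = (char)('0' + (value - 10 * tens)); /* remainder after removing tens */
    out[2] = '\0';
}

/* Fills table[0..count-1] with the strings "00", "01", ... */
static void fill_table(int count, char table[][3]) {
    int i;

    for (i = 0; i < count; i++) {               /* regB counts up to regA */
        two_digits(i, table[i]);
    }
}
```

The single conversion routine covers every tens range, so no per-range correction constant is needed.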
The following code counts from 0 up to 99 (ax contains the ASCII number):
count proc
mov cx, 100 ; loop runs the number of times specified in the cx register
xor bx, bx ; set counter to zero
print:
mov ax, bx
aam ; Converts binary to unpacked BCD
xor ax, 3030h ; Converts unpacked BCD to ASCII
; Print here (ax now contains the number in ASCII representation)
inc bx ; Increase counter
loop print
ret
count endp
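What aam followed by xor ax, 3030h leaves in ax can be modelled in a few lines of C. This is a sketch of the register arithmetic only; aam_to_ascii is a hypothetical name, and the input is assumed to be in the range 0 to 99.

```c
#include <assert.h>
#include <stdint.h>

/* Models "aam" then "xor ax, 3030h" for a value held in AL. */
static uint16_t aam_to_ascii(uint8_t al) {
    uint8_t ah;
    uint16_t ax;

    assert(al <= 99);
    ah = (uint8_t)(al / 10);          /* aam: AH = AL / 10 */
    al = (uint8_t)(al % 10);          /*      AL = AL % 10 */
    ax = (uint16_t)((ah << 8) | al);  /* unpacked BCD in AH:AL */
    return (uint16_t)(ax ^ 0x3030);   /* both digits become ASCII */
}
```

For 74 the result is 3734h: AH holds '7' (the tens digit) and AL holds '4' (the ones digit), so the character in AH has to be printed first.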