How to make an accent-insensitive palindrome checker in MIPS?

I am writing a palindrome checker in MIPS, and I was trying to make it accent insensitive so that something like "ahà" would be considered a palindrome too. However, it doesn't look as simple as the case-insensitive scenario, where there is a fixed offset between a lowercase letter and its uppercase counterpart.
I asked my teacher about it and she said that I could check the entire string and replace any "è" with "e", then check it again to replace any "é" with "e" and so on, but she told me there is a better solution and asked me to think about it.
The only thing I have noticed so far is that the accents are in the extended ASCII code, so > 127, but I can't seem to understand what to do. Can someone help me? Even just a hint would be appreciated, thank you in advance.

You're going to have to hardcode this one with a lookup table, as Alain Merigot suggested. How you do this depends on your string encoding scheme (extended ASCII vs. UTF-8, etc.).
For extended ASCII (Latin-1, one byte per character), I whipped this up and it should work:
.data
ascii_strip_accent_table:
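# index: character code (U+0080..U+00FF) minus 128
.space 0x40 # U+0080..U+00BF, zero-filled: the table doesn't really start until U+00C0
.ascii "AAAAAA" # U+00C0..U+00C5 are six accented capital As
.byte 0xC6 # AE ligature, left unchanged
.ascii "C"
.ascii "EEEE"
.ascii "IIII"
.ascii "D"
.ascii "N"
.ascii "OOOOO" # these are capital Os, not zeroes
.byte 0xD7 # multiplication sign, left unchanged
.ascii "O" # this is a capital O, not a zero
.ascii "UUUU"
.ascii "Y"
.byte 0xDE,0xDF
.ascii "aaaaaa" # U+00E0..U+00E5 are six accented lowercase as
.byte 0xE6
.ascii "c"
.ascii "eeee"
.ascii "iiii"
.ascii "d"
.ascii "n"
.ascii "ooooo"
.byte 0xF7 # division sign, left unchanged
.ascii "o"
.ascii "uuuu"
.ascii "y"
.byte 0xFE
.ascii "y"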
MyString:
.asciiz "Pokémon" # must be stored one byte per character (Latin-1), not UTF-8
.text
la $a0,ascii_strip_accent_table
la $a1,MyString
li $t2,128
loop:
lbu $t0,($a1) # read from string
beqz $t0,done
bltu $t0,$t2,continue # if char < 128, skip
subu $t0,$t0,$t2 # subtract 128 to get array index
addu $a2,$a0,$t0 # table base + array index
lbu $t0,($a2) # load from table
beqz $t0,continue # zero entry (0x80..0xBF): leave the byte alone
sb $t0,($a1) # store in string
continue:
addiu $a1,$a1,1 # advance the string pointer, not the table base
j loop
done:
li $v0,10
syscall
EDIT: Now if you're like me and you can't stand unnecessary padding, you can actually remove that .space 0x40 at the beginning if you use la $a0,ascii_strip_accent_table-64 instead. The catch is that any byte from 0x80 to 0xBF then indexes whatever memory sits in front of the table. Whether you're willing to take that risk is up to you.
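The table-lookup idea may be easier to follow in a high-level sketch. The Python below is only an illustration, assuming the text is Latin-1 encoded (one byte per character). The names FOLD, fill, strip_accents and is_palindrome are hypothetical, and the lowercase step is an extra that is not in the MIPS code above.

```python
# Sketch only: assumes the text is Latin-1, i.e. one byte per character.
FOLD = bytearray(range(256))   # start as identity: every byte maps to itself

def fill(first, letters):
    """Map consecutive codes starting at `first` to the given base letters."""
    for offset, letter in enumerate(letters):
        FOLD[first + offset] = ord(letter)

fill(0xC0, "AAAAAA")      # U+00C0..U+00C5
fill(0xC7, "CEEEEIIII")   # U+00C7..U+00CF
fill(0xD1, "NOOOOO")      # U+00D1..U+00D6
fill(0xD8, "OUUUUY")      # U+00D8..U+00DD
fill(0xE0, "aaaaaa")      # U+00E0..U+00E5
fill(0xE7, "ceeeeiiii")   # U+00E7..U+00EF
fill(0xF1, "nooooo")      # U+00F1..U+00F6
fill(0xF8, "ouuuuy")      # U+00F8..U+00FD
fill(0xFF, "y")           # U+00FF

def strip_accents(data: bytes) -> bytes:
    """One table lookup per byte, like the lbu/sb pair in the MIPS loop."""
    return bytes(FOLD[b] for b in data)

def is_palindrome(text: str) -> bool:
    """Fold accents, then case, then compare with the reversed string."""
    folded = strip_accents(text.encode("latin-1")).lower()
    return folded == folded[::-1]

print(strip_accents("Pokémon".encode("latin-1")))  # b'Pokemon'
print(is_palindrome("ahà"))                        # True
```

Because the table defaults to the identity mapping, bytes without an entry pass through unchanged, so no separate "is it above 127?" test is needed in this version.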

Related

How to load an integer value to a double register in MIPS?

I have this code :
void test(int x)
{
cout<<x;
double y=x+4.0;
cout<<y;
}
void main ()
{
test(7); // call the test in main
}
In MIPS :
after I put the value of parameter x in 0($fp) in stack and jump to test :
lw $a0,0($fp) // load value of x and print it
li $v0,1
syscall
lw $t1,0($fp)
sw $t1,0($sp) // put the value of x in stack and point it by $sp
li.d $f0,4.0
s.d $f0,4($sp) // put the value 4.0 in stack and point it by $sp
l.d $f0,0($sp)
l.d $f2,4($sp)
add.d $f4,$f0,$f2
s.d $f4,8($sp) // put the result of add
l.d $f12,8($sp) // print the value of y
li $v0,3
syscall
My problem is that the result for y in QtSpim is 4. I think the problem is that I load an integer value into a double register. How can I solve this?
You need to load the integer value into an FP register and then convert it to floating-point format. Your l.d is an 8-byte load (it loads the 4 bytes of your value plus the following 4 bytes, whatever they happen to be), so change it to a 4-byte load and then do a cvt:
lwc1 $f0,0($fp) # 4-byte load into an FP register (l.s is the equivalent pseudo-instruction)
cvt.d.w $f0,$f0 # convert the 32-bit integer in $f0 to a double
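To see why the original code prints 4, it can help to model the memory contents. The following Python sketch assumes a little-endian layout (as when QtSpim runs on an x86 host) and uses the struct module to reinterpret bytes. The variable names are illustrative only.

```python
import struct

x = 7

# What the original code does: the 8-byte load at 0($sp) picks up the
# 4 bytes of x plus the first 4 bytes of the 4.0 stored at 4($sp).
# Assumption: little-endian byte order.
raw = struct.pack("<i", x) + struct.pack("<d", 4.0)[:4]
misread = struct.unpack("<d", raw)[0]

# What cvt.d.w does: convert the integer value to a double.
converted = float(x)

print(misread == 7.0)     # False: the bit pattern is a tiny denormal
print(misread + 4.0)      # 4.0, the result seen in the question
print(converted + 4.0)    # 11.0
```

The integer's bits, read as a double, form a value so close to zero that adding 4.0 gives exactly 4.0.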

Variable initialization in as8088

I'm currently writing a function that should basically just write characters from a string into variables.
When performing test prints my variables seem fine. But when I attempt to print the first variable assigned (inchar) outside of the function, it returns an empty string, while the second variable (outchar) seems to return fine. Am I somehow overwriting the first variable?
This is my code:
_EXIT = 1
_READ = 3
_WRITE = 4
_STDOUT = 1
_STDIN = 1
_GETCHAR = 117
MAXBUFF = 100
.SECT .TEXT
start:
0: PUSH endpro2-prompt2
PUSH prompt2
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
PUSH 4
PUSH buff
CALL getline
ADD SP,4
!!!!!!!!!
PUSH buff
CALL gettrans
ADD SP,4
ADD AX,1 !gives AX an intial value to start loop
1: CMP AX,0
JE 2f
PUSH endpro-prompt1
PUSH prompt1
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
PUSH MAXBUFF
PUSH buff
CALL getline
ADD SP,2
!PUSH buff
!CALL translate
!ADD SP,4
JMP 1b
2: PUSH 0 ! exit with normal exit status
PUSH _EXIT
SYS
getline:
PUSH BX
PUSH CX
PUSH BP
MOV BP,SP
MOV BX,8(BP)
MOV CX,8(BP)
ADD CX,10(BP)
SUB CX,1
1: CMP CX,BX
JE 2f
PUSH _GETCHAR
SYS
ADD SP,2
CMPB AL,-1
JE 2f
MOVB (BX),AL
INC BX
CMPB AL,'\n'
JNE 1b
2: MOVB (BX),0
MOV AX, BX
SUB AX,8(BP)
POP BP
POP CX
POP BX
RET
gettrans:
PUSH BX
PUSH BP
MOV BP,SP
MOV BX,6(BP) !Store argument in BX
MOVB (inchar),BL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB (outchar),BL !Move char seperated by Space to outchar
MOV AX,1 !On success
POP BP
POP BX
RET
.SECT .BSS
buff:
.SPACE MAXBUFF
.SECT .DATA
prompt1:
.ASCII "Enter a line of text: "
endpro:
prompt2:
.ASCII "Enter 2 characters for translation: "
endpro2:
outchar:
.BYTE 0
inchar:
.BYTE 0
charct:
.BYTE 0
wordct:
.BYTE 0
linect:
.BYTE 0
inword:
.BYTE 0
This is the code used to test print
PUSH 1 ! print that byte
PUSH inchar
PUSH _STDOUT
PUSH _WRITE
SYS
ADD SP,8
CALL printnl !function that prints new line
PUSH 1 ! print that byte
PUSH outchar
PUSH _STDOUT
PUSH _WRITE
SYS
CALL printnl
ADD SP,8
There seem to be a number of as88 8088 simulator environments, and many of the code repositories for them mention this bug:
1. The assembler requires sections to be defined in the following order:
TEXT
DATA
BSS
After the first occurrences, remaining section directives may appear in any order.
I'd recommend moving the BSS section after DATA in your code, in case your as88 environment has the same problem.
In your original code you had lines like this:
MOV (outchar),BX
[snip]
MOV (inchar),BX
You defined outchar and inchar as bytes. Each of the 2 lines above moves 2 bytes (16 bits) from the BX register into a one-byte variable, so the CPU writes the extra byte into the next variable in memory. You'd want to explicitly move a single byte. Something like this might have been more appropriate:
MOVB (outchar),BL
[snip]
MOVB (inchar),BL
As you will see, this code still has a bug, which I describe later in this answer. To clarify: the MOVB instruction moves a single byte from BL into the variable.
When you do a SYS call for Write you need to pass the address of the buffer to print, not the data in the buffer. You had 2 lines like this:
PUSH (inchar)
[snip]
PUSH (outchar)
The parentheses say to take the value in the variables and push them on the stack. SYS WRITE requires the address of the characters to display. The code to push their addresses should look like:
PUSH inchar
[snip]
PUSH outchar
The gettrans function has a serious flaw in how it copies a byte from the buffer into a variable. You have code that does this:
MOV BX,6(BP) !Store argument in BX
MOVB (inchar),BL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB (outchar),BL !Move char seperated by Space to outchar
MOV BX,6(BP) correctly loads the buffer address that was passed as an argument into BX. The problem is with lines like this one:
MOVB (inchar),BL ! move first char to inchar
This isn't doing what the comment suggests it should. The line above moves the lower byte (BL) of the buffer address in BX to the variable inchar . You want to move the byte at the memory location pointed to by BX and put it into inchar. Unfortunately on the x86 you can't move the data from one memory operand to another directly. To get around this you will have to move the data from the buffer pointed to by BX into a temporary register (I'll choose CL) and then move that to the variable. The code could look like this:
MOVB CL, (BX)
MOVB (inchar),CL ! move first char to inchar
You then have to do the same for outchar so the fix in both places could look similar to this:
MOV BX,6(BP) !Store argument in BX (8(BP) if you also PUSH CX in the prologue)
MOVB CL, (BX)
MOVB (inchar),CL ! move first char to inchar
1: INC BX
CMPB (BX),' '
JE 1b
MOVB CL, (BX)
MOVB (outchar),CL ! move second char to outchar
The instruction MOV (inchar),BX stores register BX to the memory location labelled inchar.
However, inchar has been defined as a .BYTE, while BX is a 16-bit register (2 bytes), so each such store writes not only its target but also the byte that follows it in memory. Since inchar is laid out directly after outchar, the later MOV (outchar),BX overwrites inchar with the high-order byte of BX.
The only reason it appears to work at first is that the 8088 is a little-endian architecture, so the low-order byte of BX is stored first and the high-order byte follows.
So, try MOVB (inchar),BL
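The clobbering can be modelled in a few lines. This Python sketch treats the three adjacent .BYTE variables as a small byte array, in the declaration order from the question. The register values 0x0061 and 0x0062 are made up for illustration, and store_word, store_byte and run are hypothetical helper names.

```python
import struct

# Offsets of the variables, in the order they are declared in .DATA.
OUTCHAR, INCHAR, CHARCT = 0, 1, 2

def store_word(mem, addr, value):
    """Model of MOV (addr),BX: writes 2 bytes, low byte first."""
    struct.pack_into("<H", mem, addr, value)

def store_byte(mem, addr, value):
    """Model of MOVB (addr),BL: writes only the low byte."""
    mem[addr] = value & 0xFF

def run(store):
    """Store to inchar first, then to outchar, as gettrans does."""
    mem = bytearray(3)
    store(mem, INCHAR, 0x0061)    # 'a' in the low byte, 0 in the high byte
    store(mem, OUTCHAR, 0x0062)   # 'b' in the low byte, 0 in the high byte
    return mem

print(run(store_word))   # bytearray(b'b\x00\x00'): inchar was wiped out
print(run(store_byte))   # bytearray(b'ba\x00'): both bytes survive
```

With word stores, the second store's high byte lands on inchar, which matches the symptom of inchar printing as empty while outchar looks fine.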

MIPS assembler memory alignment issue (detailed code included)

I am trying to store an integer that I read from the user into an array; however, when I try to store it into my array, the address turns out to be unaligned.
This first block of code is where I initialize all the data. (Under the 1. is where I am trying to store the integer.)
#Constants
P_INT = 1 #Syscall to print integer(value)
P_STRING = 4 #Syscall to print a string(addr)
P_CHAR = 11 #Syscall to print a char(char)
R_INT = 5 #Syscall to read an integer(none)
EXIT = 10 #Exit program(none)
#Data
.data
newline:
.asciiz "\n"
#Space for the board.
1.
board_rep:
.space 578
#The current node to be read in
cur_node:
.word 0
#Size of the board
size:
.space 4
#Plus sign
plus:
.asciiz "+"
#dash
dash:
.asciiz "-"
Right here is where it becomes unaligned (the sw right after 2.). The strange thing is that I am doing the exact same thing later (in the third code block), except that there I am storing it in the size variable.
#Grabs user input for the board and stores it
get_board_rep:
li $v0,R_INT #Read next node and store it
syscall
2.
sw $v0,0($s1)
addi $s1,$s1,4 #Increment next node addr
lw $a0,0($s1)
j prnt_node
At the store word (under the 3.) it stores the integer it read just fine.
la $s0, size #Store the variable addr's
la $s1, board_rep
li $v0,R_INT #Get user input(size of board)
syscall
3.
sw $v0,0($s0) #Store the size of board
jal get_board_rep
I thought maybe the array was too large, so I changed it to 4 (the same size as the other variable that worked). But it was still unaligned.
Thanks in advance. This is a project and I know some people don't like helping with stuff like this. But I have done my homework and I cannot find an answer anywhere.
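That doesn't look aligned to me: newline is a 2-byte .asciiz string ("\n" plus the terminating NUL), so board_rep, which follows it, starts 2 bytes past a word boundary, and .space does not align anything by itself. sw needs an address that is a multiple of 4. size only works because the .word in front of it forces alignment. Put .align 2 directly before board_rep to align it explicitly, and if I'm wrong about the cause, try explicitly aligning it anyway.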
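The addresses can be worked out by hand, or with a small model of how the assembler lays out the data segment. The Python sketch below assumes the usual SPIM/MARS behaviour (.word is aligned to 4 bytes automatically, .space and .asciiz are not) and a data segment starting at 0x10010000. The layout function and its tuple format are hypothetical.

```python
def layout(directives, base=0x10010000):
    """Assign an address to each label.

    Assumption: .word data is aligned to a multiple of 4 automatically,
    while .space and .asciiz are placed at the next free byte.
    """
    addr, addresses = base, {}
    for name, kind, size in directives:
        if kind == "word":
            addr = (addr + 3) & ~3     # round up to a word boundary
        addresses[name] = addr
        addr += size
    return addresses

data = [
    ("newline",   "asciiz", 2),    # "\n" plus the terminating NUL
    ("board_rep", "space",  578),
    ("cur_node",  "word",   4),
    ("size",      "space",  4),
    ("plus",      "asciiz", 2),
    ("dash",      "asciiz", 2),
]

for name, addr in layout(data).items():
    print(f"{name:10s} 0x{addr:08x}  addr % 4 = {addr % 4}")
```

In this model board_rep lands at an address with remainder 2, while size comes out word-aligned only because cur_node, a .word, sits in front of it.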

Whats the difference between .asciiz vs .ascii

I read that .asciiz null terminates the string (appending \n?) ... but when looking at the User Data Segment of QtSPIM,
User data segment [10000000]..[10040000]
[10000000]..[1000ffff] 00000000
[10010000] 6c6c6548 6f57206f 00646c72 6c6c6548 H e l l o W o r l d . H e l l
[10010010] 6f57206f 00646c72 00000000 00000000 o W o r l d . . . . . . . . .
[10010020]..[1003ffff] 00000000
I don't see a difference?
.data
str1: .asciiz "Hello World" # string str1 = "Hello World"
str2: .ascii "Hello World" # string str2 = "Hello World"
.text
.globl main
main:
li $v0, 4 # print_string
# print(str1)
la $a0, str1 # load address of str1 into $a0
syscall
# print(str2)
la $a0, str2 # load address of str2 into $a0
syscall
j $ra
Outputs "Hello WorldHello World"
UPDATE
What are the implications or when do I use each? asciiz sounds like the "proper" method?
As written by @osgx, ASCIIZ means that the string is terminated by the \0 (ASCII code 0) NUL character, not by \n. Such strings are also called C strings. To quote:
In computing, a C string is a character sequence terminated with a
null character ('\0', called NUL in ASCII). It is usually stored as
one-dimensional character array. The name refers to
the C programming language which uses this string representation.
Alternative names are ASCIIZ (note that C strings do not imply the use
of ASCII) and null-terminated string.
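As for why the memory dump in the question shows no difference: the 00 after the first "Hello World" is the terminator added by .asciiz, while the 00 after the second one is just the unused, zero-filled remainder of the data segment. A small Python model makes this visible. The c_string helper is hypothetical and mimics how the print_string syscall reads up to the first zero byte.

```python
def c_string(memory: bytes, start: int) -> bytes:
    """Read bytes from `start` up to (not including) the first zero byte."""
    end = memory.index(0, start)
    return memory[start:end]

str1 = b"Hello World\x00"   # .asciiz: the assembler appends a NUL
str2 = b"Hello World"       # .ascii: no terminator

# Case 1: str2 is the last item, followed by zero-filled memory.
segment = str1 + str2 + bytes(8)
print(c_string(segment, 0))            # b'Hello World'
print(c_string(segment, len(str1)))    # b'Hello World' (ended by luck)

# Case 2: something else is stored right after str2.
segment = str1 + str2 + b"oops\x00"
print(c_string(segment, len(str1)))    # b'Hello Worldoops'
```

So .asciiz is the one to use for anything passed to print_string, and .ascii fits when you need exact bytes with no terminator, such as pieces of a longer string or a fixed-size table.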

In MIPS, how do I divide register contents by two?

Let's say I have $t0, and I'd like to divide its integer contents by two, and store it in $t1.
My gut says: srl $t1, $t0, 2
... but wouldn't that be a problem if... say... the right-most bit was 1? Or does it all come out in the wash because the right-most bit (if positive) makes $t0 an odd number, which becomes even when divided?
Teach me, O wise ones...
Use the sra instruction (shift right arithmetic):
sra $t1, $t0, 1
This divides the contents of $t0 by 2 (the first power of 2).
Description: Shifts a register value
right by the shift amount (shamt) and
places the value in the destination
register. The sign bit is shifted in.
Operation: $d = $t >> h;
advance_pc (4);
Syntax: sra $d, $t, h
Encoding: 0000 00-- ---t tttt dddd dhhh hh00 0011
Why is this important? Check this simple program that divides an integer number (program's input) by 2.
#include <stdio.h>
/*
* div divides by 2 using sra
* udiv divides by 2 using srl
*/
int div(int n);//implemented in mips assembly.
int udiv(int n);
int main(int argc,char** argv){
if (argc==1) return 0;
int a = atoi(argv[1]);
printf("div:%d udiv:%d\n",div(a),udiv(a));
return 1;
}
//file div.S
#include <mips/regdef.h>
//int div(int n)
.globl div
.text
.align 2
.ent div
div:
sra v0,a0,1
jr ra //Returns value in v0 register.
.end div
//int udiv(int n)
.globl udiv
.text
.align 2
.ent udiv
udiv:
srl v0,a0,1
jr ra //Returns value in v0 register.
.end udiv
Compile
root#:/tmp#gcc -c div.S
root#:/tmp#gcc -c main.c
root#:/tmp#gcc div.o main.o -o test
Test drives:
root#:~# ./test 2
div:1 udiv:1
root#:~# ./test 4
div:2 udiv:2
root#:~# ./test 8
div:4 udiv:4
root#:~# ./test 16
div:8 udiv:8
root#:~# ./test -2
div:-1 udiv:2147483647
root#:~# ./test -4
div:-2 udiv:2147483646
root#:~# ./test -8
div:-4 udiv:2147483644
root#:~# ./test -16
div:-8 udiv:2147483640
root#:~#
See what happens? The srl instruction shifts a zero into the sign bit instead of preserving it:
-2 = 0xfffffffe
if we shift one bit to the right, we get 0x7fffffff
0x7fffffff = 2147483647
Of course this is not a problem when the number is a positive integer, because the sign bit is 0.
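The two shifts can also be modelled outside of MIPS. The Python sketch below emulates 32-bit registers with masking. The function names srl and sra simply mirror the instructions, and to_signed is a hypothetical helper.

```python
MASK = 0xFFFFFFFF   # 32-bit register width

def to_signed(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value

def srl(value, shamt):
    """Logical shift right: zeros are shifted in from the left."""
    return (value & MASK) >> shamt

def sra(value, shamt):
    """Arithmetic shift right: copies of the sign bit are shifted in."""
    return to_signed(value) >> shamt   # Python's >> on ints is arithmetic

print(sra(-2, 1), srl(-2, 1))    # -1 2147483647
print(sra(-16, 1), srl(-16, 1))  # -8 2147483640
print(sra(16, 1), srl(16, 1))    # 8 8
print(sra(-3, 1))                # -2
```

The last line shows one more detail: for negative odd numbers sra rounds toward negative infinity, whereas C's integer division -3 / 2 truncates toward zero and gives -1.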
For unsigned integer division, that's right. It only works for unsigned integers, and only if you don't care about the fractional part.
You will want to use a shift amount of 1, not 2:
srl $t1, $t0, 1
If you use 2, you will end up dividing by 4. In general, shifting right by x divides by 2^x.
If you are concerned about "rounding" and you want to round up, you can just increment by 1 before doing the logical (unsigned) shift.
As others have stated previously, you only shift by 1 to divide by 2. A right shift by N bits divides by 2^N.
To use rounding (rounding up at 0.5 or greater) with shift values of N other than 1, just add 1<<(N-1) prior to the shift.
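A quick way to check the rounding trick is to try it on a few values. This Python sketch handles unsigned values only, and div_pow2_round is a hypothetical name. It adds half of the divisor before shifting.

```python
def div_pow2_round(x, n):
    """Divide unsigned x by 2**n (n >= 1), rounding halves up."""
    return (x + (1 << (n - 1))) >> n

print(div_pow2_round(5, 1))   # 3  (2.5 rounds up)
print(div_pow2_round(4, 1))   # 2
print(div_pow2_round(5, 2))   # 1  (1.25 rounds down)
print(div_pow2_round(6, 2))   # 2  (1.5 rounds up)
```

Python integers do not overflow, but in a 32-bit register the addition can wrap around for values close to 2^32, so a real MIPS version would need to take that case into account.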