Impossible to encode 32-bit binary opcode in machine instruction - binary

I have been trying to work out how to encode binary opcodes for the Motorola 68000, but I keep finding that it's not possible to fit the destination memory address, the instruction designation with its addressing mode/size, and the data value to be written out over the bus for memory-mapped I/O all into one instruction.
For the Sega Genesis' Video Display Processor I am attempting to write to the control port, which is memory-mapped at $C00004 in the Genesis' memory map.
$C00004 is 1100 0000 0000 0000 0000 0100 in binary, or three bytes. The value I'm writing is $87, which the VDP recognizes as $8787, a write to VDP register #7. The issue I'm having is figuring out how to encode all of the required information, i.e. the instruction designation move.b, the value 87 as the immediate #$87, and the destination memory address $C00004 that the memory-mapped I/O decoding routes to the correct VDP port on the way to the VDP.
Altogether it looks like this:
move.b #$87, $00C00004,
which loosely translates into not four bytes, but four bytes and a nibble (36 bits, to be exact!):
0001 1000 0111 1100 0000 0000 0000 0000 0100
Since the Motorola 68000 will only parse 32 bits when working down to microcode, how is it possible to encode the required information within the same instruction if there isn't enough space?
Perhaps I'm understanding this incorrectly?
I know this is beyond the level most programmers would anticipate, but I'm hoping someone around can break this down for me and explain how this encoding scheme would work.

Your instruction, move.b #$87, $00C00004, ought to encode to
0001001111111100 0000000010000111 0000000011000000 0000000000000100
that is, the four 16-bit words $13FC $0087 $00C0 $0004.
The first 16-bit word can be broken down thusly:
The first four bits say that this is a move.b instruction.
The next six bits say that the destination addressing mode is an absolute 32-bit address.
The last six bits say that the source operand is immediate.
After that follow the instruction extension words for the operands: first a 16-bit word carrying the immediate data in its low-order byte, then two 16-bit words carrying the 32-bit destination address.
For more information, see http://www.freescale.com/files/archives/doc/ref_manual/M68000PRM.pdf
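To make that layout concrete, here is a minimal Python sketch (not an assembler, just the bit packing for this one instruction form, following the field layout described above) that builds the four 16-bit words for move.b #imm, (abs).l:

def encode_move_b_imm_to_abs_long(imm8, addr32):
    # Operation word: 00 (MOVE) + 01 (byte size), destination register/mode
    # fields = 001 111 (absolute long), source mode/register fields = 111 100 (immediate).
    opword = 0b0001_001_111_111_100
    return [opword,
            imm8 & 0xFF,               # immediate byte in the low half of one extension word
            (addr32 >> 16) & 0xFFFF,   # high word of the absolute address
            addr32 & 0xFFFF]           # low word of the absolute address

print(" ".join(f"{w:04X}" for w in encode_move_b_imm_to_abs_long(0x87, 0x00C00004)))
# prints: 13FC 0087 00C0 0004

The destination address and the immediate value live entirely in the extension words, which is why the full instruction is longer than 32 bits; the 68000 simply fetches it from the instruction stream 16 bits at a time.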

Related

load byte instruction in MIPS

I am learning about Computer architecture through the MIPS instructions. I have a question which is:
Memory at 0x10000000 contains 0x80
Register $5 contains 0x10000000
What is put in register $8 after lb $8,0($5) is executed?
I was thinking that when the load byte is executed, it would take the 8 bits of 0x80 [1000 0000] from address 0x10000000, load them into the low 8 bits of register $8, and fill the remaining bits with zeros, making the answer 00000080. But the correct answer listed is FFFFFF80. I am not sure I understand it. Can anybody help explain it?
The instruction you mention here is lb, which loads a single byte into a register by sign-extending the byte to the word size. This means that if the most significant bit is set to 1, it fills the remaining 24 bits with 1 as well. This is done to preserve the two's-complement value of the byte in a 32-bit representation.
If your byte were 0100 1010, sign extension would pad it with 0s:
0000 000... 0100 1010.
If your byte were 1011 0101, sign extension would pad it with 1s:
1111 111... 1011 0101.
To avoid this and always pad the byte with 0s, you can use the alternative lbu instruction, which does not perform sign extension but pads the byte with 0s instead.
This preserves the unsigned value of the byte, since two's complement is not involved there.
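As a quick illustration (plain Python rather than MIPS, just to show the arithmetic), sign-extending an 8-bit value to 32 bits looks like this:

def sign_extend_byte(b):
    # Interpret an 8-bit value as signed two's complement, which is what
    # lb effectively does before widening the byte to the 32-bit register.
    return b - 0x100 if b & 0x80 else b

print(f"{sign_extend_byte(0x80) & 0xFFFFFFFF:08X}")   # FFFFFF80  (lb)
print(f"{0x80:08X}")                                  # 00000080  (lbu, zero-padded)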

MIPS PC and label tracking

Assume that the current PC is 0x00400010 (after increment) and that the target label has the value 0x00400040. What is the binary value of the constant in the instruction?
beq $s0, $s0, target
I'm not really sure how to approach this question. I would appreciate a hint, or explanation of how to find a solution to this.
I am not sure if I have understood your question. I am assuming that you are asking for the offset which will be coded into the instruction.
Since the target is at 0x00400040 and the current PC is at 0x00400010, the offset will probably be 0x00000030 (because 0x00400040 - 0x00400010 = 0x00000030). This can easily be converted into the binary format you asked for:
0000 0000 0000 0000 0000 0000 0011 0000
But please note that I don't know MIPS. In some processor architectures, the offset coded into the instruction is
(target PC) - ((current PC) + (size of current instruction))
Since I don't know MIPS, I don't know what the byte size of the beq instruction is, so I can't compute the offset for this case. If you tell me the size of the beq instruction, I'll edit this answer and add it.
Furthermore, in most processor architectures, relative offsets are restricted for most instructions. Once again, I don't know MIPS, but chances are that the offset is limited to 16, 12 or even 8 bits. In that case, to get the actual binary offset representation, remove zeros from the left of the binary number I gave above until only the bits used to store the offset are left.
EDIT (taking into account Busy Beaver's comment)
On MIPS, instructions are aligned to 32 bits / 4 bytes. This allows the instruction to store the needed offset divided by 4 (the CPU then reads the offset and multiplies it by 4 to compute the actual target). The advantage is that you can store bigger offsets with the bits available; in other words, you save 2 offset bits that way.
In your example, the PC should jump by 0x00000030 bytes to get to the target. The offset stored in the instruction then would be 0x00000030 / 4, which is the same as 0x00000030 >> 2, which is 0x0000000C. You asked for the binary representation:
0000 0000 0000 0000 0000 0000 0000 1100
When decoding / executing the instruction, the CPU automatically multiplies that offset by four and that way gets back the real offset desired.
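The whole computation fits in a couple of lines; here is a small Python sketch of it (assuming, as in the question, that 0x00400010 is the already-incremented PC the offset is relative to):

pc_after_increment = 0x00400010
target = 0x00400040
offset_in_words = (target - pc_after_increment) >> 2   # byte distance divided by 4
print(hex(offset_in_words))                            # 0xc
print(format(offset_in_words & 0xFFFF, '016b'))        # 0000000000001100, the 16-bit beq field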

Reading / Computing Hex received over RS232

I am using Docklight Scripting to put together a VBScript that communicates with a device via RS232. All the commands are sent in Hex.
When I want to read from the device, I send a 32-bit address, a 16-bit read length, and an 8-bit checksum.
When I want to write to the device, I send a 16-bit data length, the data, followed by an 8-bit checksum.
In Hex, the data that is sent to the device is the following:
AA0001110200060013F81800104D
AA 00 01 11 02 0006 0013F818 0010 4D
(spaced for ease of reading)
AA000111020006 is the protocol header, where:
AA is the Protocol Byte
00 is the Source ID
01 is the Dest ID
11 is the Message Type
02 is the Command Byte
0006 is the Length Byte(s)
The remainder of the string is broken down as follows:
0013F818 is the 32-bit address
0010 is the 16 bit read length
4D is the 8-bit checksum
If the string is not correct, or the checksum is invalid, the device replies with an error string. However, I am not getting an error. The device replies with the following hex string:
AA0100120200100001000000000100000000000001000029
AA 01 00 12 02 0010 00010000000001000000000000010000 29
(spaced for ease of reading)
Again, the first part of the string (AA01001202) is part of the protocol header, where:
AA is the Protocol Byte
01 is the Source ID
00 is the Dest ID
12 is the Message Type
02 is the Command Byte
The difference between what is sent to the device and what the device replies with is that the length bytes are not a "static" part of the protocol header; they change based on the request. The remainder of the string is broken down as follows:
0010 is the Length Byte(s)
00010000000001000000000000010000 is the data
29 is the 8-bit Check Sum
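To keep my own interpretation straight, here is a rough Python sketch of how I am currently reading that reply frame (the field widths are my assumptions from the breakdown above; I have left the checksum unverified because I don't know which algorithm the device uses):

reply = "AA0100120200100001000000000100000000000001000029"
frame = bytes.fromhex(reply)

protocol, src, dst, msg_type, cmd = frame[0:5]     # AA 01 00 12 02
length = int.from_bytes(frame[5:7], 'big')         # 0x0010 = 16 data bytes
data = frame[7:7 + length]                         # the returned data
checksum = frame[7 + length]                       # trailing 8-bit checksum

print(hex(length), data.hex(), hex(checksum))
# 0x10 00010000000001000000000000010000 0x29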
The goal is to read a timer that is stored in the NVM. The timer is stored in the upper halves of 60 4-byte NVM words.
The instructions specify that I need to read the first two bytes of each word, and then sum the results.
Verbatim, the instructions say:
Read the NVM elapsed timer. The timer is stored in the upper halves of 60 4-byte words.
Read the first two bytes of each word of the timer. Read the 16 bit values of these locations:
13F800H, 13F804H, 13808H, and continue to 13F8ECH.
Sum the results. Multiply the sum by 409.6 seconds, then divide by 3600 to get the results in hours.
My knowledge of bits, bytes, and all other things is a bit cloudy. The first thing I need to confirm is that I am understanding the read protocol correctly.
I am assuming that when I specify 0010 as the 16-bit read length, that corresponds to the 16-bit values that the instructions want me to read.
The second thing I need to understand a little better is: when it tells me to read the first two bytes of each word, what exactly constitutes the first two bytes of each word?
I think what confuses me a little more is that the instructions say the timer is stored in the upper half of the 4-byte word (which to me seems like the first half).
I've sat with another colleague of mine for a day trying to figure out how to make this all work, and we haven't had any consistent results with our trials.
I have looked on the internet to find something that would explain this better in the context being used.
Another worry is that the technical documentation I am using for this project isn't 100% accurate; it has conflicting or missing information throughout the publication (which is probably close to 1000 pages long).
What I would really appreciate is someone with a much better understanding of hex / binary reviewing the instructions I've posted and providing some feedback on my interpretation of them.
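For what it's worth, assuming my reading of the procedure is right (read the 16-bit value at 13F800H, 13F804H, ..., 13F8ECH, sum the 60 results, then scale), the final arithmetic would look something like the Python sketch below; read_u16 is a hypothetical placeholder for whatever routine actually performs one read over RS232:

def read_u16(addr):
    # Hypothetical helper: send a read request for 2 bytes at `addr`
    # and return the 16-bit value from the reply. Not implemented here.
    raise NotImplementedError("replace with the real RS232 read")

def elapsed_hours():
    addresses = range(0x13F800, 0x13F8EC + 1, 4)    # 60 words: 13F800H .. 13F8ECH, step 4
    total = sum(read_u16(a) for a in addresses)     # sum of the 60 16-bit upper halves
    return total * 409.6 / 3600                     # multiply by 409.6 s, divide by 3600 -> hours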

How to get the remainder of XOR division in calc.exe?

I would like to understand how I can calculate a CRC manually.
I have a message to be sent, 1110 1101 1011 0111, and the code generator 110001. In order to encode the message I append five zeros to it (1110 1101 1011 0111 00000) and divide the result by the generator 110001.
I should receive 1011000000100100 with remainder 0000100; in that case I can replace the five appended zeros with the right part of the remainder (00100). This is what I see in an example I found somewhere.
But I cannot calculate it with the Windows calculator (calc.exe). I launch programmer's mode in calc.exe, type 1110 1101 1011 0111 00000 XOR 110001, and receive 111011011011011010001 instead of 1011000000100100. (Ordinary division gives 1001101100111110, which is not the correct value either.)
How can I perform XOR division (or rather obtain the remainder of this division) on two binary numbers?
XOR by itself isn't a division. The calculator tool is just doing a single bit-wise exclusive-or of the two binary numbers (with the shorter number assumed to be padded with leading zeros as necessary) and returning the result.
111011011011011100000 XOR
000000000000000110001 =
111011011011011010001
What you are looking for is an iterative shift-and-XOR process, which calc.exe doesn't do. If you want to do it manually, you're going to need a pencil and paper.
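If pencil and paper gets tedious, the shift-and-XOR long division is only a few lines of Python. This is a generic sketch of the process, shown with the message and generator from the question:

def crc_remainder(dividend_bits, generator_bits):
    # Mod-2 (XOR) long division. `dividend_bits` must already have the
    # zeros for the check bits appended; the remainder has
    # len(generator_bits) - 1 bits.
    rem = [int(b) for b in dividend_bits]
    gen = [int(b) for b in generator_bits]
    for i in range(len(rem) - len(gen) + 1):
        if rem[i]:                         # only "subtract" where the leading bit is 1
            for j, g in enumerate(gen):
                rem[i + j] ^= g            # subtraction mod 2 is XOR
    return ''.join(str(b) for b in rem[len(rem) - len(gen) + 1:])

print(crc_remainder('1110110110110111' + '00000', '110001'))   # 00100

The 00100 it prints matches the check bits from the worked example you found.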

Little Endian - Memory content/address

Consider a system that has a byte-addressable memory organized in 32-bit words according to the big-endian scheme. A program reads ASCII characters entered at a keyboard and stores them in successive byte locations starting at location 1000.
Show the contents of the two memory words at locations 1000 and 1004 after the name johnson has been entered. Write this in the little-endian scheme.
What I got was:
[NULL, n], [o, s], [n,h], [o,j]
00, 6E 6F, 73 6E, 68 6F, 6A
I just want to know if this is correct and if not, what I did wrong.
Thank you all!
There is no such thing as endianness for storing a single byte (such as an ASCII character). Endianness only comes into play when a value is represented as multiple bytes. Storing a sequence of bytes is the same in little- and big-endian; only the representation of multi-byte values differs. For example, take the number 3 735 928 559 (or 0xdeadbeef in hex notation) and store it as a 32-bit word (e.g., an int) at memory location 1000; that will give:
ADR: 1000 1001 1002 1003
BE: de ad be ef
LE: ef be ad de
So, if you were to actually represent each of your ASCII characters as a 32-bit word, you would get:
[0, 0, 0, 6a], [0, 0, 0, 6f], ... or,
[6a, 0, 0, 0], [6f, 0, 0, 0], ...
for BE and LE respectively.
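A quick way to see both layouts from the 0xdeadbeef example above is Python's struct module:

import struct

value = 0xDEADBEEF
print(struct.pack('>I', value).hex(' '))   # big-endian:    de ad be ef
print(struct.pack('<I', value).hex(' '))   # little-endian: ef be ad de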
I find this question quite confusing.
A byte is normally defined as the smallest addressable unit, so saying that a machine has byte-addressable memory tells you nothing: every machine has byte-addressable memory, because that's the definition of what a byte is; what can change is how many bits a byte has.
If the question is talking about a machine with 32-bit bytes (I know they exist, but I have personally only used machines with 8-bit and 16-bit bytes), then it's not clear what role endianness plays, given that no multi-byte processing is needed for storing ASCII.
What is often done on large-byte machines, however, is storing multiple characters per byte to save space (a machine with 16-bit bytes isn't necessarily "big": the one I know is a DSP with a very limited amount of memory), but this seems unrelated to the question, and there is no "standard" way to do it anyway.
If instead the question assumes that a byte is always 8 bits by definition and is talking about storing ASCII chars, then once again endianness plays no role; the chars are just stored in memory one after another in consecutive locations. For example, if the string "johnson" has been stored (assuming the C string convention), the contents of memory would be:
0x6A 0x6F 0x68 0x6E 0x73 0x6F 0x6E 0x00
Reading this memory content as two 32-bit words would of course be affected by endianness, but saying that the machine uses big-endian and then asking to display the result in the little-endian scheme makes no sense.
In a big-endian scheme (e.g. the 68k) the two 32-bit words would be 0x6A6F686E and 0x736F6E00; in a little-endian scheme (e.g. x86) they would be 0x6E686F6A and 0x006E6F73.
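To illustrate that last point with the byte sequence given above (plain Python again):

import struct

mem = bytes([0x6A, 0x6F, 0x68, 0x6E, 0x73, 0x6F, 0x6E, 0x00])    # "johnson" plus the terminating 0
print([f"0x{w:08X}" for w in struct.unpack('>II', mem)])   # ['0x6A6F686E', '0x736F6E00']  big-endian
print([f"0x{w:08X}" for w in struct.unpack('<II', mem)])   # ['0x6E686F6A', '0x006E6F73']  little-endian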