Do MIPS pseudoinstructions get converted to machine code one-to-one?

As far as I know, each regular MIPS instruction is converted to machine code (binary) one-to-one. However, I'm curious whether pseudoinstructions like la and li work the same way. Would it be fair to say that pseudoinstructions are also converted one-to-one?

Can offset be a register?

I'm curious as to why we are not allowed to use registers as offsets in MIPS. I know that you can't write something like lw $t3, $t1($t4); I'm just wondering why that is the case.
Is it a hardware restriction? Or simply just part of the ISA?
PS: if you're looking for what to do instead, see Load Word in MIPS, using register instead of immediate offset from another register, or look at compiler output for a C function like int foo(int *arr, int idx){ return arr[idx]; } - https://godbolt.org/z/PhxG57ox1
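For reference, the C function mentioned in the PS looks like this; the comments sketch, as an assumption about typical compiler output rather than a quote of the godbolt link, how a MIPS compiler has to build the address before the load.
    /* A MIPS compiler typically turns the indexed access below into
       roughly: scale idx by 4 (sll), add it to arr (addu), then load
       with a plain immediate offset (lw). The "register offset" is
       done with explicit arithmetic before the load instruction. */
    int foo(int *arr, int idx) {
        return arr[idx];
    }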
I'm curious as to why we are not allowed to use registers as offsets in MIPS.
I'm not sure if you mean "why does MIPS assembly not permit you to write it in this form" or "why does the underlying ISA not offer this form".
If it's the former, then the answer is that the base ISA doesn't have any machine instruction that offers that functionality, and apparently the designers decided not to offer any pseudo-instruction that would implement it behind the scenes.2
If you're asking why the ISA doesn't offer it in the first place, it's just a design choice. By offering fewer or simpler addressing modes, you get the following advantages:
Less room is needed to encode a more limited set of possibilities, so you save encoding space for more opcodes, shorter instructions, etc.
The hardware can be simpler, or faster. For example, allowing two registers in address calculation may result in:
The need for an additional read port in the register file.1
Additional connections between the register file and the AGU to get both registers values there.
The need to do a full-width (32- or 64-bit) addition, rather than the simpler addition of a base address and a 16-bit offset.
The need for a three-input ALU if you still want to support immediate offsets together with 2-register addresses (and 2-register addressing is less useful if you don't).
Additional complexity in instruction decoding and address-generation since you may need to support two quite different paths for address generation.
Of course, all of those trade-offs may very well pay off in some contexts that could make good use of 2-reg addressing for smaller or faster code, but the original design, which was heavily inspired by the RISC philosophy, didn't include it. As Peter points out in the comments, new addressing modes have been added since for some cases, although apparently not a general 2-reg addressing mode for load or store.
Is it a hardware restriction? Or simply just part of the ISA?
There's a bit of a false dichotomy there. It's not a hardware restriction in the sense that hardware could certainly have supported this, even when MIPS was designed. The phrasing seems to imply that some existing hardware had that restriction and the MIPS ISA somehow inherited it. I suspect it was much the other way around: the ISA was defined this way, based on analysis of how the hardware would likely be implemented, and it then became a hardware simplification, since MIPS hardware doesn't need to support anything outside of what's in the MIPS ISA.
1 E.g., to support store instructions which would need to read from 3 registers.
2 It's certainly worth asking whether such a pseudo-instruction would be a good idea: it would probably expand to an add of the two registers into a temporary register, followed by a lw using the result. There is always a danger that this hides "too much" work. Since it partly glosses over the difference between a true load that maps 1:1 to a hardware load and a version that does extra arithmetic under the covers, it's easy to imagine it leading to sub-optimal decisions.
Take the classic example of linearly accessing two arrays of equal element size in a loop. With 2-reg addressing, it is natural to write this loop as two 2-reg accesses (each with a different base register and a common offset register). The only "overhead" for the offset maintenance is the single offset increment. This hides the fact that internally there are two hidden adds required to support the addressing mode: it would have simply been better to increment each base directly and not use the offset. Furthermore, once the overhead is clear, you can see that unrolling the loop and using immediate offsets can further reduce the overhead.
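To make that loop example concrete, here is a rough C sketch (the function names are just for illustration, and a modern compiler may well transform one form into the other): the indexed version relies on base-plus-scaled-index address arithmetic for every access, while the pointer-bumping version replaces it with one increment per array.
    /* Indexed form: each a[i] and b[i] needs an address formed from
       base + i*4, which on an ISA without 2-reg addressing costs an
       extra shift/add per access inside the loop. */
    int sum_indexed(const int *a, const int *b, int n) {
        int sum = 0;
        for (int i = 0; i < n; i++)
            sum += a[i] + b[i];
        return sum;
    }
    /* Pointer-bumping form: each base pointer is incremented directly,
       so every load can use a plain base+0 address (or, after
       unrolling, small immediate offsets). */
    int sum_bumped(const int *a, const int *b, int n) {
        int sum = 0;
        for (const int *end = a + n; a != end; a++, b++)
            sum += *a + *b;
        return sum;
    }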

Is it a number or letter (Binary Conversion)?

I just read a lot about how processors work and how everything comes down to 0s and 1s, but I have a small question.
Suppose the processor got the following input "01100001"; how could it know that it's the letter 'a' and not the number 97? I don't understand this point and couldn't find an answer however much I searched.
Suppose the processor got the following input "01100001"; how could it know that it's the letter 'a' and not the number 97?
Well, generally speaking, the processor doesn't need to know that information, and it is impossible to know how it is going to interpret that input without knowing the architecture and the associated assembly instruction.
I don't understand this point and couldn't find an answer however much I searched.
I think the thing you are missing is that the processor sits at the lowest layer of abstraction, the hardware level. The processor interacts with memory, which is where your example value would reside. What is done with that memory is up to the software, and it is also up to the software to decide how to interpret the value when that memory location is read. If you are wondering how such a value would be printed by a processor, the answer is that it wouldn't be: some peripheral that the processor interfaces with would be responsible for that.
I encourage you to read more about CPUs.
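A minimal C sketch of the point above: the same bit pattern 01100001 sits in memory either way, and only the operation the software applies to it (here, the printf conversion) decides whether it comes out as the letter 'a' or the number 97.
    #include <stdio.h>
    int main(void) {
        unsigned char byte = 0x61;   /* the bit pattern 01100001 */
        printf("%c\n", byte);        /* treated as a character: prints a  */
        printf("%d\n", byte);        /* treated as a number:    prints 97 */
        return 0;
    }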

Entropy of AS3 source code

I'm trying to decide whether I should generate AS3 code for a certain complicated, configurable set of business-logic rules, or express them as data and write a state machine in AS3 to interpret them.
My goal is the smallest compiled SWF size. Speed is not an issue. Implementation complexity is not an issue either. (Both within rational limits, of course.)
I can't disclose sufficient details, and I understand that I should probably just experiment instead of asking, but my question is:
What is the average compression ratio for AS3 source when compiled to SWF? How many bytes of SWF per kilobyte of source code?
(I perfectly understand that the answer would be a very rough figure at best.)
Some facts about the compiled SWF:
class names are preserved
member names are preserved
local variable names are not preserved
comments and whitespace are stripped
So the ratio would depend on the length of identifiers, on comments, and even on tabs-or-spaces preferences. If you run the resulting SWF through an obfuscator that replaces class names with something like A0, A1, etc., you should save some bytes.
Your state machine idea seems the most promising: the code is written once, and the rules can be expressed compactly. If you can pack small numbers into one int, so much the better (AS3 has no integer type smaller than the 4-byte int and uint).
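The pack-small-numbers-into-one-int idea, sketched here in C purely for illustration (AS3's bitwise operators work the same way): four values that each fit in 8 bits can share a single 32-bit integer.
    #include <stdint.h>
    #include <stdio.h>
    /* Pack four 8-bit fields into one 32-bit word. */
    static uint32_t pack4(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
        return (uint32_t)a | ((uint32_t)b << 8) |
               ((uint32_t)c << 16) | ((uint32_t)d << 24);
    }
    /* Extract field 0..3 from a packed word. */
    static uint8_t unpack(uint32_t word, int index) {
        return (uint8_t)(word >> (8 * index));
    }
    int main(void) {
        uint32_t w = pack4(3, 250, 17, 99);
        printf("%d %d %d %d\n",
               unpack(w, 0), unpack(w, 1), unpack(w, 2), unpack(w, 3));
        return 0;
    }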
Remember that SWF files are already zlib-compressed, which for the most part may erase any relative gains from going either the data way or the code way.
If initial load speed is what you are after, you can always factor out chunks of your code and load them on demand from another SWF file.

What kind of learning algorithm would you use to build a model of how long it takes a human to solve a given Sudoku situation?

I don't have much experience in machine learning, pattern recognition, data mining, etc. and in their underlying theory and systems.
I would like to develop an artificial model of the time it takes a human to make a move in a given Sudoku puzzle.
So what I'm looking for as the output of the machine learning process is a model that can predict how long it takes a target human to make a move in a given Sudoku situation.
The same input doesn't always map to the same outcome. It takes the human different amounts of time to make a move in the same situation, but my hypothesis is that there's a tendency in the resulting probability distribution. (My educated guess is that it's roughly normal.)
I have ideas about the factors that influence the distribution (like the number of empty slots) but would prefer to leave it to the system to figure these patterns out. Please note that I'm not interested in the patterns themselves, just the model.
I can generate sample and test data easily by running sudoku puzzles and measuring the times it takes to make the moves.
What kind of learning algorithm would you suggest to use for this?
I was thinking NNs, but I'm not sure if they can have the desired property of giving weighted random outcomes for the same input.
If I understand this correctly, you have an input vector of length 81 whose entries are 1 if the corresponding square is filled in and 0 otherwise. You want to learn a function which returns a probability distribution modelling the response time of a human to that board position.
My first response would be that this is a regression problem and you should try straightforward linear regression. This will not provide you with a distribution of response times, but a single 'best-guess' response time.
I'm not clear on why you want to model a distribution of response times. However, if you really do want to output a distribution, then it sounds like you want to look at Bayesian methods. I'm not really an expert on Bayesian inference, so I can't help you much further here.
However, I don't really think your approach is going to work, because I agree with your intuition that features such as the number of empty slots are important. There are also other obvious features, such as the number of empty slots per row/column, that are likely to be important. Explicitly putting these features in your representation will probably be much more successful than expecting the learning algorithm to infer something similar on its own.
The Monte Carlo method seems like it would work well here, but it would require a stack of solutions the size of the moon to really do it. And it wouldn't give you the time per person, just the time on average.
My understanding of it, tenuous as it is, is that you have a database with board positions and the time it took a human to make the next move. At the very least, you have a starting point for most moves. Even if a position isn't in the database, you could start to calculate how long a move would take based on some algorithm. Though I know you specified you wanted machine learning to do this, it might be worth segmenting the problem into something a little smaller and then building on it.
If you have some guesstimate as to what influences the function (number of empty cells, etc.), try to train a classifier on a vector of such features, and not on the 81-cell vector (0/1 or 0..9, it doesn't really matter for my argument).
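To illustrate, here is a rough C sketch (the particular features and names are just an assumption for illustration): rather than handing the learner the raw 81-cell board, compute a short vector of summary features such as the total number of empty cells and the empties per row and per column.
    #define CELLS 81
    #define NFEAT (1 + 9 + 9)  /* total empties, per-row empties, per-column empties */
    /* board[i] is 0 for an empty cell and 1..9 for a filled one.
       feat receives NFEAT values to train the model on instead of
       the raw 81-cell vector. */
    void extract_features(const int board[CELLS], double feat[NFEAT]) {
        for (int i = 0; i < NFEAT; i++)
            feat[i] = 0.0;
        for (int i = 0; i < CELLS; i++) {
            if (board[i] == 0) {
                int row = i / 9, col = i % 9;
                feat[0]        += 1.0;  /* total empty cells      */
                feat[1 + row]  += 1.0;  /* empties in this row    */
                feat[10 + col] += 1.0;  /* empties in this column */
            }
        }
    }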
I think that your claim:
we wouldn't necessarily have to know the underlying patterns; the "trained patterns" in a learning system automatically encode these sometimes quite delicate and subtle patterns -- that's one of their great powers
is wrong. You do have to give the network the right domain. For example, when trying to detect objects in an image, working in the pixel domain is pointless. You'll only get results if you first run some feature detection to find edges, corners, etc.
Theoretically, with enough non-linearity (in an NN, enough layers in the network) it can detect such things, but in practice I have never seen that work without giving the classifier the right features to work with.
I was thinking NNs, but I'm not sure if they can have the desired property of giving weighted random outcomes for the same input.
You're just trying to learn a function from a space of size 2^81 or 10^81 (or a much smaller feature space, as I suggest) to R (a response time between 0 and Inf), or some discretization of that. So NNs and other classifiers can do that.

How do you write code that is both 32 bit and 64 bit compatible?

What considerations do I need to make if I want my code to run correctly on both 32-bit and 64-bit platforms?
EDIT: What kinds of areas do I need to take care in, e.g. printing strings/characters or using structures?
Options:
Code it in some language with a Virtual Machine (such as Java)
Code it in .NET and don't target any specific architecture. The .NET JIT compiler will compile it for you to the right architecture before running it.
One solution would be to target a virtual environment that runs on both platforms (I'm thinking Java, or .Net here).
Or pick an interpreted language.
Do you have other requirements, such as calling existing code or libraries?
The same things you should have been doing all along to ensure you write portable code :)
The Mozilla guidelines and the C FAQ are good starting points.
I assume you are still talking about compiling separately for each individual platform, since running on both is completely doable by just creating a 32-bit binary.
The biggest one is making sure you don't put pointers into 32-bit storage locations.
But there's no proper 'language-agnostic' answer to this question, really. You couldn't even get a particularly firm answer if you restricted yourself to something like standard C or C++: the size of data storage, pointers, etc., is all terribly implementation-dependent.
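If the code happens to be C or C++, here is a small hedged sketch of that pointer pitfall: on most 64-bit ABIs an int (and on 64-bit Windows even a long) is too narrow to hold a pointer, so round-tripping through one truncates the address, while uintptr_t from <stdint.h> is defined to be wide enough.
    #include <stdint.h>
    #include <stdio.h>
    int main(void) {
        int  value = 42;
        int *p     = &value;
        /* int as_int = (int)p;  would truncate the address on most 64-bit ABIs */
        uintptr_t bits = (uintptr_t)p;   /* integer wide enough for a pointer */
        int *back = (int *)bits;
        printf("%d\n", *back);           /* 42 on both 32-bit and 64-bit builds */
        return 0;
    }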
It honestly depends on the language: managed languages like C# and Java, and scripting languages like JavaScript, Python, or PHP, are locked into their runtime's way of doing things, so unless you are doing something fairly advanced there is not much to worry about.
But my guess is that you are asking about languages like C++, C, and other lower-level languages.
The biggest thing you have to worry about is the size of things, because in the 32-bit world you are limited to 2^32, whereas in the 64-bit world things get bigger: 2^64.
With 64-bit you have a larger address space and can compute with larger numbers. However, if you know you are compiling for both 32-bit and 64-bit, you need to limit your expectations of the system to the 32-bit world and its limits on buffers and numbers.
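To see those size differences for yourself, a tiny C program like the following (the exact figures are implementation-defined) typically prints 4, 4, 4, 4 when built as a 32-bit binary and something like 4, 8, 8, 8 on 64-bit Linux/macOS or 4, 4, 8, 8 on 64-bit Windows.
    #include <stdio.h>
    int main(void) {
        /* These are the sizes that change between 32-bit and 64-bit targets. */
        printf("int    : %zu\n", sizeof(int));
        printf("long   : %zu\n", sizeof(long));
        printf("void * : %zu\n", sizeof(void *));
        printf("size_t : %zu\n", sizeof(size_t));
        return 0;
    }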
In C (and maybe C++), always remember to use the sizeof operator when calculating buffer sizes for malloc. That way you will write more portable code anyway, and 64-bit data types are automatically taken into account.
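For example, a trivial sketch: writing the allocation in terms of sizeof *p means the byte count follows the element type automatically, including when that type grows on a 64-bit platform.
    #include <stdlib.h>
    int main(void) {
        size_t n = 100;
        /* sizeof *buf picks up the element size automatically, so the
           allocation stays correct if the element type, or its size on
           a 64-bit platform, ever changes. */
        long *buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return 1;
        free(buf);
        return 0;
    }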
In most cases the only thing you have to do is just compile your code for both platforms. (And that's assuming that you're using a compiled language; if it's not, then you probably don't need to worry about anything.)
The only thing I can think of that might cause problems is assuming the size of data types, which is something you probably shouldn't be doing anyway. And of course anything written in assembly is going to cause problems.
Keep in mind that many compilers choose the size of an integer based on the underlying architecture, given that "int" should be the fastest number manipulator in the system (according to some theories).
This is why so many programmers use typedefs for their most portable programs: if you want your code to work on everything from 8-bit processors up to 64-bit processors, you need to recognize that, in C anyway, int is not rigidly defined.
Pointers are another area to be careful with: don't use a long, or long long, or any specific type if you are fiddling with the numeric value of a pointer. Use the proper construct, which, unfortunately, varies from compiler to compiler (which is why you have a separate typedef.h file for each compiler you use).
-Adam Davis
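As a side note to the answer above: since C99 you mostly get those typedefs for free, because <stdint.h> provides exact-width integers and a pointer-sized integer type, so the per-compiler typedef.h can often shrink to a thin wrapper. A minimal sketch:
    #include <stdint.h>
    #include <stdio.h>
    int main(void) {
        int32_t  counter   = -1;                  /* exactly 32 bits everywhere */
        uint64_t file_size = 1u << 20;            /* exactly 64 bits everywhere */
        intptr_t addr_bits = (intptr_t)&counter;  /* integer wide enough for a pointer */
        printf("%zu %zu %zu\n",
               sizeof counter, sizeof file_size, sizeof addr_bits);
        return 0;
    }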