My simple Turing machine - language-agnostic

I'm trying to understand and implement the simplest Turing machine and would like feedback on whether I'm making sense.
We have an infinite tape (let's say an array called T with a pointer at 0 at the start) and an instruction table:
( S , R , W , D , N )
S->STEP (Start at step 1)
R->READ (0 or 1)
W->WRITE (0 or 1)
D->DIRECTION (0=LEFT 1=RIGHT)
N->NEXTSTEP (Non existing step is HALT)
My understanding is that a 3-state 2-symbol machine is the simplest. I don't understand the 3-state part. 2-symbol makes sense because we use 0 and 1 for READ/WRITE.
For example:
(1,0,1,1,2)
(1,1,0,1,2)
Starting at step 1, if Read is 0 then { Write 1, Move Right } else { Write 0, Move Right }, and then go to step 2 - which does not exist, which halts the program.
What does 3-state mean? Does this machine pass as turing machine? Can we simplify more?

I think the confusion might come from your use of "steps" instead of "states". You can think of a machine's state as the value it has in its memory (although as a previous poster noted, some people also take a machine's state to include the contents of the tape -- however, I don't think that definition is relevant to your question). It's possible that this change in terminology might be at the heart of your confusion. Let me explain what I think it is you're thinking. :)
You gave lists of five numbers -- for example, (1,0,1,1,2). As you correctly state, this should be interpreted (reading from left to right) as "If the machine is in state 1 AND the current square contains a 0, print a 1, move right, and change to state 2." However, your use of the word "step" seems to suggest that "step 2" must be followed by "step 3", etc., when in reality a Turing machine can go back and forth between states (and of course, there can only be finitely many possible states).
So to answer your questions:
Turing machines keep track of "states" not "steps";
What you've described is a legitimate Turing machine;
A simpler (albeit otherwise uninteresting) Turing machine would be one that starts in the HALT state.
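To make the state/symbol machinery concrete, here is a minimal sketch of the machine from the question in Python. The dict encoding of the quintuples is my own, purely illustrative:

```python
# rules: (state, symbol read) -> (symbol to write, direction, next state).
# Direction: 0 = left, 1 = right; a missing key means HALT, matching
# "non-existing step is HALT" from the question.
rules = {
    (1, 0): (1, 1, 2),   # the question's (1,0,1,1,2)
    (1, 1): (0, 1, 2),   # the question's (1,1,0,1,2)
}

def run(rules, max_steps=1000):
    tape = {}                        # sparse tape; blank cells read as 0
    pos, state = 0, 1
    for _ in range(max_steps):
        key = (state, tape.get(pos, 0))
        if key not in rules:         # no rule for (state, symbol): halt
            break
        write, direction, state = rules[key]
        tape[pos] = write
        pos += 1 if direction == 1 else -1
    return tape, state

tape, state = run(rules)             # tape == {0: 1}, halted in state 2
```

Note that `state` is just a value held in the machine's memory; nothing forces state 2 to be followed by state 3, and the rules dict could just as well send the machine back to state 1.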
Edits: Grammar, Formatting, and removed a needless description of Turing machines.
Response to comment:
Correct me if I'm misinterpreting your comment, but I did not mean to suggest a Turing machine could be in more than one state at a time, only that the number of possible states can be any finite number. For example, for a 3-state machine, you might label the possible states A, B, and C. (In the example you provided, you labeled the two possible states as '1' and '2') At any given time, exactly one of those values (states) would be in the machine's memory. We would say, "the machine is in state A" or "the machine is in state B", etc. (Your machine starts in state '1' and terminates after it enters state '2').
Also, it's no longer clear to me what you mean by a "simpler/est" machine. The smallest known Universal Turing machine (i.e., a Turing machine that can simulate another Turing machine, given an appropriate tape) requires 2 states and 5 symbols (see the relevant Wikipedia article).
On the other hand, if you're looking for something simpler than a Turing machine with the same computation power, Post-Turing machines might be of interest.

I believe that the concept of state is basically the same as in Finite State Machines. If I recall correctly, you need a separate termination state, to which the Turing machine can transition after it has finished running the program. As for why 3 states, I'd guess that the other two states are for initialisation and execution respectively.
Unfortunately none of that is guaranteed to be correct, but I thought I'd post my thoughts anyway since the question was unanswered for 5 hours. I suspect that if you were to re-ask this question on cstheory.stackexchange.com you might get a better/more definitive answer.

"State" in the context of Turing machines should be clarified as to which is being described: (i) the current instruction, or (ii) the list of symbols on the tape together with the current instruction, or (iii) the list of symbols on the tape together with the current instruction placed to the left of the scanned symbol or to the right of the scanned symbol. Reference


Nand to Tetris how to compile "pop this 2" into asm

I know how to pop a value from the stack to put it in D
@SP
M=M-1
A=M
D=M
and I know how to select the memory location "this 2"
@2
D=A
@THIS
A=M+D
The problem is that I am using D in both steps so obviously just using
M=D will not have the desired outcome. I would need a second register to hold some value for later on, I guess - or am I missing something here?
In these situations, you will have to use memory locations as temporary registers. Note that just as @SP is predefined for you, so are some other temporary memory locations like R0, THIS, THAT, etc.
So usually it is best to write your programs as a series of isolated code nuggets that do things like "POP into THIS", "ADD THAT to THIS", "MOVE THAT into R15", etc. Include a comment that explains what the nugget does. It will make debugging a lot easier.
One way to think of it is that the actual HACK instructions are actually microcode, and the larger nuggets are the real machine instructions.
Later on, should you so desire, you can see if you can merge pairs of these instructions (for example, if the first one ends by storing a value in location X, and the next one immediately loads it again, you can usually omit the load, and sometimes the store as well). However, such cleverness can bite you if you are not careful, so it is best to get something working that is easier to understand, and then try optimizing it.
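As a sketch of the nugget idea, here is how a simplified, hypothetical VM translator might emit the Hack assembly for "pop this 2", using R13 as the scratch register. The register names follow the standard Hack memory map, but the helper itself is just an illustration:

```python
def emit_pop(base, index):
    """Emit Hack assembly for 'pop <segment> <index>'.
    base is a pointer register (THIS, THAT, ...); R13 temporarily holds
    the target address so that D is free to carry the popped value."""
    return "\n".join([
        f"@{index}", "D=A",        # D = index
        f"@{base}",  "D=M+D",      # D = target address (*base + index)
        "@R13",      "M=D",        # R13 = target address
        "@SP",       "M=M-1",      # SP--
        "A=M",       "D=M",        # D = popped value
        "@R13",      "A=M",        # A = target address
        "M=D",                     # *target = popped value
    ])

asm = emit_pop("THIS", 2)          # first line "@2", last line "M=D"
```

The key move is computing the destination address first and parking it in R13, so the pop itself can use D freely.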
Have fun!

Is HTML Turing Complete?

After reading the question Is CSS Turing complete? -- which received a few thoughtful, succinct answers -- I began to wonder: Is HTML Turing Complete?
Although the short answer is a definitive Yes or No, please also provide a short description or counter-example to prove whether HTML is or is not Turing Complete (obviously it cannot be both). Information on other versions of HTML may be interesting, but the correct answer should answer this for HTML5.
By itself (without CSS or JS), HTML (5 or otherwise) cannot possibly be Turing-complete because it is not a machine. Asking whether it is or not is essentially equivalent to asking whether an apple or an orange is Turing complete, or to take a more relevant example, a book.
HTML is not something that "runs". It is a representation. It is a format. It is an information encoding. Not being a machine, it cannot compute anything on its own, at the level of Turing completeness or any other level.
It seems clear to me that states and transitions can be represented in HTML with pages and hyperlinks, respectively. With this, one can implement deterministic finite automata where clicking links transitions between states. For example, I implemented a few simple DFA which are accessible here.
DFA are much simpler than the Turing Machine though. To implement something closer to a TM, an additional mechanism for reading and writing memory would be necessary, besides the basic states/transitions functionality. However, HTML does not seem to have this kind of feature. So I would say HTML is not Turing-complete, but is able to simulate DFA.
Edit1: I was reminded of the video On The Turing Completeness of PowerPoint when writing this answer.
Edit2: complementing this answer with the DFA definition and clarification.
Edit3: it might be worth mentioning that any machine in the real world is a finite-state machine due to reality's constraint of finite memory. So in a way, DFA can actually do anything that any real machine can do, as far as I know. See: https://en.wikipedia.org/wiki/Turing_machine#Comparison_with_real_machines
Definition
From https://en.wikipedia.org/wiki/Deterministic_finite_automaton#Formal_definition
In the theory of computation, a branch of theoretical computer science, a deterministic finite automaton (DFA)—also known as deterministic finite acceptor (DFA), deterministic finite-state machine (DFSM), or deterministic finite-state automaton (DFSA)—is a finite-state machine that accepts or rejects a given string of symbols, by running through a state sequence uniquely determined by the string.
A deterministic finite automaton M is a 5-tuple, (Q, Σ, δ, q0, F), consisting of
a finite set of states Q
a finite set of input symbols called the alphabet Σ
a transition function δ : Q × Σ → Q
an initial or start state q0
a set of accept states F
The following example is of a DFA M, with a binary alphabet, which requires that the input contains an even number of 0s.
M = (Q, Σ, δ, q0, F) where
Q = {S1, S2}
Σ = {0, 1}
q0 = S1
F = {S1} and
δ is defined by the following state transition table:
       0    1
S1     S2   S1
S2     S1   S2
State diagram for M: (diagram not reproduced here)
The state S1 represents that there has been an even number of 0s in the input so far, while S2 signifies an odd number. A 1 in the input does not change the state of the automaton. When the input ends, the state will show whether the input contained an even number of 0s or not. If the input did contain an even number of 0s, M will finish in state S1, an accepting state, so the input string will be accepted.
HTML implementation
The DFA M exemplified above plus a few of the most basic DFA were implemented in Markdown and converted/hosted as HTML pages by GitHub, accessible here.
Following the definition of M, its HTML implementation is detailed as follows.
The set of states Q contains the pages s1.html and s2.html, and also the acceptance page acc.html and the rejection page rej.html. These two additional states are a "user-friendly" way to communicate the acceptance of a word and don't affect the semantics of the DFA.
The set of symbols Σ is defined as the symbols 0 and 1. The empty string symbol ε was also included to denote the end of the input, leading to either acc.html or rej.html state.
The initial state q0 is s1.html.
The set of accept states is {acc.html}.
The set of transitions is defined by hyperlinks such that page s1.html contains a link with text "0" leading to s2.html, a link with text "1" leading to s1.html, and a link with text "ε" leading to acc.html. Each page is analogous according to the following transition table. Note: acc.html and rej.html don't contain links.

            0          1          ε
s1.html     s2.html    s1.html    acc.html
s2.html     s1.html    s2.html    rej.html
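For comparison, the same DFA M is only a few lines in a conventional language. Here is a Python transcription; the dict encoding of δ is my own, purely illustrative:

```python
# delta: (state, symbol) -> state, straight from M's transition table.
delta = {
    ("S1", "0"): "S2", ("S1", "1"): "S1",
    ("S2", "0"): "S1", ("S2", "1"): "S2",
}

def accepts(word, q0="S1", accept_states=frozenset({"S1"})):
    """Run M on a string of '0'/'1' symbols; accept iff it ends in F."""
    state = q0
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accept_states

# M accepts exactly the strings with an even number of 0s:
# accepts("1001") is True, accepts("10") is False
```

The HTML pages play the role of `state` and the links play the role of the `delta` lookup; the loop is supplied by whoever clicks.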
Questions
In what ways are those HTML pages "machines"? Don't these machines include the browser and the person who clicks the links? In what way does a link perform computation?
DFA is an abstract machine, i.e. a mathematical object. By the definition shown above, it is a tuple that defines transition rules between states according to a set of symbols. A real-world implementation of these rules (i.e. who keeps track of the current state, looks up the transition table and updates the current state accordingly) is then outside the scope of the definition. And for that matter, a Turing machine is a similar tuple with a few more elements to it.
As described above, the HTML implementation represents the DFA M in full: every state and every transition is represented by a page and a link respectively. Browsers, clicks and CPUs are then irrelevant in the context of the DFA.
In other words, as written by @Not_Here in the comments:
Rules don't innately implement themselves, they're just rules an implementation should follow. Consider it this way: Turing machines aren't actual machines, Turing didn't build machines. They're purely mathematical objects, they're tuples of sets (state, symbols) and a transition function between states. Turing machines are purely mathematical objects, they're sets of instructions for how to implement a computation, and so is this example in HTML.
The Wikipedia article on abstract machines:
An abstract machine, also called an abstract computer, is a theoretical computer used for defining a model of computation. Abstraction of computing processes is used in both the computer science and computer engineering disciplines and usually assumes a discrete time paradigm.
In the theory of computation, abstract machines are often used in thought experiments regarding computability or to analyze the complexity of algorithms (see computational complexity theory). A typical abstract machine consists of a definition in terms of input, output, and the set of allowable operations used to turn the former into the latter. The best-known example is the Turing machine.
Some have claimed to implement Rule 110, a cellular automaton, using pure HTML and CSS (no JavaScript). You can see a video here, or browse the source of one implementation.
Why is this relevant? It has been proven that Rule 110 is itself Turing complete, meaning that it can simulate any Turing machine. If we then implement Rule 110 using pure HTML, it follows that HTML can simulate any Turing machine via its simulation of that particular cellular automaton.
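For reference, the update rule of Rule 110 is tiny when written in an ordinary language. Here is a Python sketch of one synchronous step on a circular row of cells -- the same lookup table that the checkbox-based HTML/CSS constructions have to encode:

```python
RULE = 110  # 0b01101110: maps each 3-cell neighbourhood to a new cell value

def step(cells):
    """One synchronous Rule 110 update of a circular row of 0/1 cells."""
    n = len(cells)
    return [
        (RULE >> (4 * cells[(i - 1) % n] + 2 * cells[i] + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

# A lone 1 spreads to its left, characteristic Rule 110 behaviour:
# step([0, 1, 0]) == [1, 1, 0]
```

Each neighbourhood (left, centre, right) is read as a 3-bit number, and the corresponding bit of 110 gives the cell's next value.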
The critiques of this HTML "proof" focus on the fact that human input is required to drive the operation of the HTML machine. As seen in the video above, the human's input is constrained to a repeating pattern of Tab + Space (because the HTML machine consists of a series of checkboxes). Much as a Turing machine would require a clock signal and motive force to move its read/write head if it were to be implemented as a physical machine, the HTML machine needs energy input from the human -- but no information input, and crucially, no decision making.
In summary: HTML is probably Turing-complete, as proven by construction.

PIC Assembly: Calling functions with variables

So say I have a variable, which holds a song number. -> song_no
Depending upon the value of this variable, I wish to call a function.
Say I have many different functions:
Fcn1
....
Fcn2
....
Fcn3
So for example,
If song_no = 1, call Fcn1
If song_no = 2, call Fcn2
and so forth...
How would I do this?
You should have a compare instruction in the instruction set (the post suggests you are looking for an assembly solution); the result is usually to set a flag bit or a value in a register. You will need to check the instruction set for the details.
The code should look something like:
load(song_no, R1)
cmpeq(1, R1)   // result goes in a flag or register
jmpe Fcn1      // jump if equal
cmpeq(2, R1)
jmpe Fcn2
....
Hope this helps
I'm not well acquainted with the pic, but these sort of things are usually implemented as a jump table. In short, put pointers to the target routines in an array and call/jump to the entry indexed by your song_no. You just need to calculate the address into the array somehow, so it is very efficient. No compares necessary.
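The same idea expressed in a high-level language (Python here, with invented names, purely for illustration): the "table" is just an array of function references indexed by song_no, so no compares are needed:

```python
def fcn1(): return "playing song 1"
def fcn2(): return "playing song 2"
def fcn3(): return "playing song 3"

# The jump table: index it with song_no instead of comparing and branching.
songs = [fcn1, fcn2, fcn3]

def play(song_no):
    return songs[song_no - 1]()   # song numbers start at 1

# play(2) returns "playing song 2"
```

On a PIC the array of function references becomes a table of GOTO instructions and the indexing becomes an add to the program counter, but the structure is the same.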
To elaborate on Jens' reply, the traditional way of doing this on 12/14-bit PICs is the same way you would look up constant data from ROM, except that instead of returning a number with RETLW you jump forward to the desired routine with GOTO. The actual jump into the jump table is performed by adding the offset to the program counter.
Something along these lines:
movlw high(table)
movwf PCLATH
movf song_no,w
addlw low(table)
btfsc STATUS,C
incf PCLATH,f
addwf PCL,f
table:
goto fcn1
goto fcn2
goto fcn3
.
.
.
Unfortunately there are some subtleties here.
The PIC16 only has an eight-bit accumulator while the address space to jump into is 11 bits. Therefore both a directly writable low byte (PCL) and a latched high-byte register (PCLATH) are available. The value in the latch is applied as the MSB once the jump is taken.
The jump table may cross a page, hence the manual carry into PCLATH. Omit the BTFSC/INCF if you know the table will always stay within a 256-instruction page.
By the time the ADDWF instruction executes, the program counter will already have advanced and be pointing at table. Therefore a 0 offset jumps to the first table entry.
Unlike the PIC18 each GOTO instruction fits in a single 14-bit instruction word and PCL addresses instructions not bytes, so the offset should not be multiplied by two.
All things considered you're probably better off searching for general PIC16 tutorials. Any of these will clearly explain how data/jump tables work, not to mention begin with the basics of how to handle the chip. Frankly it is a particularly convoluted architecture and I would advise staying with the "free" hi-tech C compiler unless you particularly enjoy logic puzzles or desperately need the performance.

Upper limit of bugs in a given program

Is there an upper limit to the number of bugs contained in a given program? If the number of instructions is known, could one say the program cannot contain more than 'n' bugs? For example, how many bugs could the following function contain?
double calcInterest(double amount) {
return -O.07 / amount;
}
A parser would count four terms in the function, and I could count these errors:
wrong number syntax
wrong interest rate (business requirements error)
wrong calculation (should be multiply)
Potential divide by zero
Clearly the number of bugs is not infinite given a finite number of instructions. Alternatively, one could say the function accepts 2^64 inputs and ask how many of them produce the correct output. However, is there any way to formally prove an upper limit?
If bug is "a requirement not met by the program", then there is no limit on the number of bugs (per line or otherwise), since there is no limit on the number of requirements.
print "hello world"
Might contain a million bugs. It doesn't create a pink elephant. I leave it to the reader to come up with 999999 other requirements not satisfied by this program.
The number of instructions has nothing to do with whether the program does what the user wants it to do. I mean, look at how poorly GCC does balancing my check book. Buggy as all get out, downright useless!
This would all depend on how you define a 'bug'.
If you define a program as a function from some input to some output, and a specification as a definition of that function, and a bug as any difference in output from the specification on a given input, then yes, you can conceivably have countably infinite bugs - however this is a somewhat useless definition of a bug.
The upper limit is the number of states your program can be in. Since this number is finite on real machines you could number the states from 1 to n. For each state you could label if this state is a bug or not. So yes, but even a small program having 16 bytes of memory has 2^128 states and the problem of analyzing all the different states is intractable.
There is a theoretical upper limit for bugs, but for all but the most trivial programs it is very nearly impossible to calculate, although engines such as Pex do give it the old college try.
Law of programming:
"If You will find all compile-time bugs, then n logical ones are still hidden, waiting to surprise You at run-time."
Depends on how you count bugs, which leads me to say "nope, no limit." I don't know about you, but I can easily write several bugs in the same line of code. For instance, how many bugs are in this Java code? :-P
public int addTwoNumbers(int x, String y)
{{
z == x + y;
return y;
}
As few as one, if the bug is significant enough.

Creating a logic gate simulator

I need to make an application for creating logic circuits and seeing the results. This is primarily for use in A-Level (UK, 16-18 year olds generally) computing courses.
I've never made any applications like this, so I'm not sure of the best design for storing the circuit and evaluating the results (at a reasonable speed, say 100Hz on a 1.6GHz single-core computer).
Rather than have the circuit built from the basic gates (and, or, nand, etc) I want to allow these gates to be used to make "chips" which can then be used within other circuits (eg you might want to make a 8bit register chip, or a 16bit adder).
The problem is that the number of gates increases massively with such circuits, so that if the simulation worked on each individual gate it would have 1000s of gates to simulate. I therefore need to simplify these components so they can be simulated quickly.
I thought about generating a truth table for each component, so that the simulation could use a lookup table to find the outputs for a given input. The problem occurred to me though that the size of such tables increases massively with the number of inputs. If a chip had 32 inputs, then the truth table would need 2^32 rows. This uses a massive amount of memory - in many cases more than is available - so it isn't practical for non-trivial components. It also won't work with chips that can store their state (e.g. registers), since they can't be represented as a simple table of inputs and outputs.
I know I could just hardcode things like register chips, however since this is for educational purposes I want people to be able to make their own components as well as view and edit the implementations of standard ones. I considered allowing such components to be created and edited using code (e.g. dlls or a scripting language), so that an adder for example could be represented as "output = inputA + inputB". However, that assumes the students have done enough programming in the given language to be able to understand and write such plugins to mimic the results of their circuit, which is likely not to be the case...
Is there some other way to take a boolean logic circuit and simplify it automatically so that the simulation can determine the outputs of a component quickly?
As for storing the components I was thinking of storing some kind of tree structure, such that each component is evaluated once all components that link to its inputs are evaluated.
eg consider: A.B + C
The simulator would first evaluate the AND gate, and then evaluate the OR gate using the output of the AND gate and C.
However it just occurred to me that cases where the outputs link back round to the inputs will cause a deadlock, because their inputs will never all be evaluated... How can I overcome this, since the program can only evaluate one gate at a time?
Have you looked at Richard Bowles's simulator?
You're not the first person to want to build their own circuit simulator ;-).
My suggestion is to settle on a minimal set of primitives. When I began mine (which I plan to resume one of these days...) I had two primitives:
Source: zero inputs, one output that's always 1.
Transistor: two inputs A and B, one output that's A and not B.
Obviously I'm misusing the terminology a bit, not to mention neglecting the niceties of electronics. On the second point I recommend abstracting to wires that carry 1s and 0s like I did. I had a lot of fun drawing diagrams of gates and adders from these. When you can assemble them into circuits and draw a box round the set (with inputs and outputs) you can start building bigger things like multipliers.
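To illustrate (a Python sketch, taking the same liberties with the terminology): the ordinary gates can all be assembled from just those two primitives:

```python
def source():          # zero inputs, one output that's always 1
    return 1

def transistor(a, b):  # two inputs, one output that's (a AND NOT b)
    return int(bool(a) and not b)

# Ordinary gates assembled from the two primitives:
def NOT(a):    return transistor(source(), a)
def AND(a, b): return transistor(a, NOT(b))          # a AND NOT(NOT b)
def OR(a, b):  return NOT(transistor(NOT(a), b))     # NOT(NOT a AND NOT b)

# e.g. OR(0, 1) == 1, AND(1, 1) == 1, NOT(1) == 0
```

From NOT, AND and OR you can then draw a box round adders, multiplexers and so on, exactly as described above.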
If you want anything with loops you need to incorporate some kind of delay -- so each component needs to store the state of its outputs. On every cycle you update all the new states from the current states of the upstream components.
Edit: Regarding your concerns about scalability, how about defaulting to the first-principles method of simulating each component in terms of its state and upstream neighbours, but providing ways of optimising subcircuits:
If you have a subcircuit S with inputs A[m] with m < 8 (say, giving a maximum of 256 rows) and outputs B[n] and no loops, generate the truth table for S and use that. This could be done automatically for identified subcircuits (and reused if the subcircuit appears more than once) or by choice.
If you have a subcircuit with loops, you may still be able to generate a truth table. There are fixed-point finding methods which can help here.
If your subcircuit has delays (and they are significant to the enclosing circuit) the truth table can incorporate state columns. E.g. if the subcircuit has input A, inner state B, and output C, where C <- A and B, B <- A, the truth table could be:
A B | B C
0 0 | 0 0
0 1 | 0 0
1 0 | 1 0
1 1 | 1 1
If you have a subcircuit that the user asserts implements a particular known pattern such as "adder", provide an option for using a hard-coded implementation for updating that subcircuit instead of by simulating its inner parts.
When I made a circuit emulator (sadly, also incomplete and also unreleased), here's how I handled loops:
Each circuit element stores its boolean value
When an element "E0" changes its value, it notifies (via the observer pattern) all who depend on it
Each observing element evaluates its new value and does likewise
When the E0 change occurs, a level-1 list is kept of all elements affected. If an element already appears on this list, it gets remembered in a new level-2 list but doesn't continue to notify its observers. When the sequence which E0 began has stopped notifying new elements, the next queue level is handled. Ie: the sequence is followed and completed for the first element added to level-2, then the next added to level-2, etc. until all of level-x is complete, then you move to level-(x+1)
This is in no way complete. If you ever have multiple oscillators doing infinite loops, then no matter what order you take them in, one could prevent the other from ever getting its turn. My next goal was to alleviate this by limiting steps with clock-based sync'ing instead of cascading combinatorials, but I never got this far in my project.
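The clock-based synchronisation mentioned at the end can be sketched like this (Python, with a hypothetical representation of gates): every element stores its current output, and each tick computes all next outputs from the current ones before committing any of them, which makes feedback loops -- including oscillator rings -- well-defined:

```python
def tick(gates, state):
    """gates: name -> (function, [input names]); state: name -> current 0/1.
    Two-phase update: read all current outputs, then commit all new ones."""
    return {name: fn(*(state[src] for src in srcs))
            for name, (fn, srcs) in gates.items()}

NOT = lambda a: 1 - a

# Two inverters in a ring: under synchronous update this simply
# oscillates, one step per clock tick, instead of looping forever.
gates = {"p": (NOT, ["q"]), "q": (NOT, ["p"])}
state = {"p": 0, "q": 0}
state = tick(gates, state)   # {"p": 1, "q": 1}
state = tick(gates, state)   # back to {"p": 0, "q": 0}
```

Because no gate sees another gate's new value until the next tick, competing oscillators can't starve each other of turns.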
You might want to take a look at the From Nand To Tetris in 12 steps course software. There is a video talking about it on youtube.
The course page is at: http://www1.idc.ac.il/tecs/
If you can disallow loops (outputs linking back to inputs), then you can significantly simplify the problem. In that case, for every input there will be exactly one definite output. Cycles, however, can make the output undecidable (or rather, constantly changing).
Evaluating a circuit without loops should be easy - just use the BFS algorithm with "junctions" (connections between logic gates) as the items in the list. Start off with all the inputs to all the gates in an "undefined" state. As soon as a gate has all inputs "defined" (either 1 or 0), calculate its output and add its output junctions to the BFS list. This way you only have to evaluate each gate and each junction once.
If there are loops, the same algorithm can be used, but the circuit can be built in such a way that it never comes to a "rest" and some junctions are always changing between 1 and 0.
Oops - actually, this algorithm can't be used in that case, because the looped gates (and the gates depending on them) would forever stay "undefined".
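Here is a Python sketch of that evaluation for the loop-free case, using the questioner's A.B + C example (the gate representation is invented for illustration). As noted just above, a cycle would leave some gates undefined, and this version would re-queue them forever:

```python
from collections import deque

AND = lambda a, b: a & b
OR  = lambda a, b: a | b

def evaluate(circuit, inputs):
    """circuit: gate name -> (function, [source names]); a source is a
    primary input or another gate.  Loop-free circuits only."""
    values = dict(inputs)            # start with the primary inputs defined
    pending = deque(circuit)
    while pending:
        gate = pending.popleft()
        fn, sources = circuit[gate]
        if all(s in values for s in sources):   # all inputs defined?
            values[gate] = fn(*(values[s] for s in sources))
        else:
            pending.append(gate)                # not ready yet, retry later
    return values

# A.B + C: the OR gate is deferred until the AND's output is defined.
circuit = {"or": (OR, ["and", "C"]), "and": (AND, ["A", "B"])}
values = evaluate(circuit, {"A": 1, "B": 1, "C": 0})   # values["or"] == 1
```

Each gate is evaluated exactly once as soon as all of its inputs are defined, which is the behaviour the BFS description above calls for.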
You could introduce them to the concept of Karnaugh maps, which would help them simplify truth values for themselves.
You could hard code all the common ones. Then allow them to build their own out of the hard coded ones (which would include low level gates), which would be evaluated by evaluating each sub-component. Finally, if one of their "chips" has less than X inputs/outputs, you could "optimize" it into a lookup table. Maybe detect how common it is and only do this for the most used Y chips? This way you have a good speed/space tradeoff.
You could always JIT compile the circuits...
As I haven't really thought about it, I'm not really sure what approach I'd take.. but it would possibly be a hybrid method and I'd definitely hard code popular "chips" in too.
When I was playing around making a "digital circuit" simulation environment, I had each defined circuit (a basic gate, a mux, a demux and a couple of other primitives) associated with a transfer function (that is, a function that computes all outputs, based on the present inputs), an "agenda" structure (basically a linked list of "when to activate a specific transfer function), virtual wires and a global clock.
I arbitrarily set the wires to hard-modify the inputs whenever the output changed and the act of changing an input on any circuit to schedule a transfer function to be called after the gate delay. With this at hand, I could accommodate both clocked and unclocked circuit elements (a clocked element is set to have its transfer function run at "next clock transition, plus gate delay", any unclocked element just depends on the gate delay).
Never really got around to build a GUI for it, so I've never released the code.