Splay tree insertion - language-agnostic

Going through some exercises to hone my binary tree skills, I decided to implement a splay tree, as outlined in Wikipedia: Splay tree.
One thing I'm not getting is the part about insertion.
It says:
First, we search x in the splay tree. If x does not already exist, then we will not find it, but its parent node y. Second, we perform a splay operation on y which will move y to the root of the splay tree. Third, we insert the new node x as root in an appropriate way. In this way either y is left or right child of the new root x.
My question is this: The above text seems overly terse compared to the other examples in the article, why is that? It seems there are some gotchas left out here. For instance, after splaying the y node up to the root, I can't just blindly replace root with x, and tack y onto x as either left or right child.
Let's assume the value does not already exist in the tree.
I have this tree:
      10
     /  \
    5    15
   / \     \
  1   6     20
and I want to insert 8. With the description above, I will end up finding the 6-node, and in a normal binary tree, 8 would be added as a right child of the 6-node; however, here I first have to splay the 6-node up to root:
    6
   / \
  5   10
 /      \
1        15
           \
            20
then either of these two are patently wrong:
  8                          8
   \                        /
    6                      6
   / \                    / \
  5   10                 5   10
 /      \               /      \
1        15            1        15
           \                      \
            20                     20
6 is not greater than 8    10 is not less than 8
It seems to me that the only way to do the splaying first, and then correctly add the new value as root, would mean I have to check the following criteria (for adding the splayed node as the left child of the new root):
the node I splayed to the root is less than the new root (6 < 8)
the rightmost child of the node I splayed to the root is also less than the new root (20 < 8, which does not hold here)
However, if I were to split up the node I splayed, by taking the right child and appending it as the right child of the new node, I would get this:
      8
     / \
    6   10
   /      \
  5        15
 /           \
1             20
But, is this simple alteration always going to give me a correct tree? I'm having a hard time coming up with an example, but could this lead to the following:
The new value I want to add is higher than the temporary root (the node I splayed to the root), but also higher than the leftmost child of the right-child of the temporary root?
I.e., a tree that would basically look like this after splaying, but before I replace the root?
  10
 /  \
5    15
    /  \
  11    20
and I want to add 13, which would make the new tree like this:
      13
     /  \
   10    15
   /    /  \
  5   11    20   <-- 11, on the wrong side of 13
or can this never happen?
My second question is this: Wouldn't it be much easier to just rewrite the operation as follows:
First, we search x in the splay tree. If x does not already exist, then we will not find it, but its parent node y. Then we add the new node as either a left or right child of the parent node. Thirdly, we perform a splay operation on the node we added which will move the new value to the root of the splay tree.
emphasis mine to show what I changed.

I don't see how the problem you describe could happen. If you want to insert 13 into this tree you first have to find where it would be:
  10
 /  \
5    15
    /  \
  11    20
From 10 you go right, from 15 you go left, from 11 you go right... and then you have no more elements. If 13 had been in the tree, we would have found it as a right child of 11. So according to the rule we perform a splay operation on 11 which will move 11 to the root of the splay tree:
    11
   /  \
  10   15
 /       \
5         20
Then we add 13 as the new root, with 11 as the left child:
      13
     /  \
   11    15
   /       \
  10        20
 /
5
Now there is no problem.
First, we search x in the splay tree. If x does not already exist, then we will not find it, but its parent node y. Then we add the new node as either a left or right child of the parent node. Thirdly, we perform a splay operation on the node we added which will move the new value to the root of the splay tree.
This sounds to me like it would work too, but if I were you, I'd just try to implement the version as it is described in Wikipedia, since lots of people have tested that and it is already well documented.
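For what it's worth, here is a minimal Python sketch of both variants, so they can be tried on the trees from the question. The names (Node, splay, insert_by_split, insert_then_splay, example_tree) are made up for illustration, and the recursive splay is just one common formulation; it can produce slightly different shapes than a textbook bottom-up splay, although the result is still a valid search tree.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Bring key to the root, or the last node on its search path if absent."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:        # zig-zig
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:      # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return rotate_right(root) if root.left is not None else root
    if root.right is None:
        return root
    if key > root.right.key:           # zig-zig
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:         # zig-zag
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return rotate_left(root) if root.right is not None else root


def insert_by_split(root, key):
    """Wikipedia-style: splay the would-be parent y, then split it around key."""
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root                    # already present
    if key < root.key:
        node = Node(key, left=root.left, right=root)
        root.left = None               # y keeps only its right subtree
    else:
        node = Node(key, left=root, right=root.right)
        root.right = None              # y keeps only its left subtree
    return node


def insert_then_splay(root, key):
    """Alternative: ordinary BST insert, then splay the new node to the root."""
    if root is None:
        return Node(key)
    node = root
    while node.key != key:
        side = "left" if key < node.key else "right"
        if getattr(node, side) is None:
            setattr(node, side, Node(key))
        node = getattr(node, side)
    return splay(root, key)


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def example_tree():
    # The first tree from the question: 10 / (5: 1, 6) / (15: -, 20)
    return Node(10, Node(5, Node(1), Node(6)), Node(15, None, Node(20)))


root = insert_by_split(example_tree(), 8)
print(root.key, root.left.key, root.right.key)  # 8 6 10
```

The point of the split step is that the splayed node hands one of its subtrees over to the new root, which is why the new root ends up with two children instead of one.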

"Splay Tree" immediately made me remember an article in CUJ I read a while ago, you might find some insight there: Implementing Splay Tree in C++.
Third, we insert the new node x as root in an appropriate way. In this way either y is left or right child of the new root x.
Yes, but this new root x has to have 2 children, that's why this sentence might sound confusing.

The new node would be added to the tree just like in a normal binary search tree. Then the new node would be splayed up to be the root or the first level from the root. Also, when we insert a new node, we need to find the location to put it, so we do a find, and all operations on a splay tree, including find, trigger a splay operation. Maybe that's why the Wikipedia article describes it like that. I just insert the new node and splay it up. Either way the tree becomes better balanced than it was. It works just fine here.

Turing machine for addition and comparison of binary numbers

Good Day everyone!
I am trying to solve this exercise for learning purposes. Can someone guide me in solving these 3 questions?
I tried the 1st question, the addition of 2 binary numbers separated by '+'. I know how to add 2 numbers in unary, where each number is represented by that many 1's or zeros (e.g. 5 = 1 1 1 1 1 or 0 0 0 0 0) and the result comes out in the same format. But I have no clue how to represent and add 2 binary numbers separated by '+'. Will the head of the Turing machine start from the left, reach the plus sign, and then move left and right of the + sign? And how will the addition itself be performed? As far as my limited knowledge goes, a TM cannot simply add binary numbers; we have to build some logic around their representation, as in the case of simple addition of 2 numbers. Is the same true for the comparison of 2 binary numbers?
Regards
The following program, inspired by the edX / MITx course Paradox and Infinity, shows how to perform binary addition with a Turing machine, where the numbers to be added are input to the Turing machine and are separated by a blank.
The Turing machine uses the second number as a counter: it decrements the second number by one and increments the first number by one, until the second number becomes 0.
The following animation of the simulation of the Turing machine shows how 13 (binary 1101) and 5 (binary 101) are added to yield 18 (binary 10010).
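The counting idea can be sketched in a few lines of Python, leaving out the tape and head movement. This is only an illustration of the arithmetic, not the actual machine from the course; the function names are made up, and the numbers are assumed to be binary strings with the most significant bit first.

```python
def increment(bits):
    """Add one in place: turn trailing 1s into 0s, then the next 0 into 1."""
    i = len(bits) - 1
    while i >= 0 and bits[i] == "1":
        bits[i] = "0"
        i -= 1
    if i < 0:
        bits.insert(0, "1")   # the number grew by one digit
    else:
        bits[i] = "1"


def decrement(bits):
    """Subtract one in place; assumes the number is greater than zero."""
    i = len(bits) - 1
    while bits[i] == "0":
        bits[i] = "1"
        i -= 1
    bits[i] = "0"


def add_by_counting(first, second):
    first, second = list(first), list(second)
    while "1" in second:      # the second number is the counter
        decrement(second)
        increment(first)
    return "".join(first)


print(add_by_counting("1101", "101"))  # 10010
```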
I'll start with problems 2 and 3 since they are actually easier than problem 1.
We'll assume we have valid input (non-empty binary strings on both sides with no leading zeroes), so we don't need to do any input validation. To check whether the numbers are equal, we can simply bounce back and forth across the = symbol and cross off one digit at a time. If we find a mismatch at any point, we reject. If we have a digit remaining on the left and can't find one on the right, we reject. If we run out of digits on the left and still have some on the right, we reject. Otherwise, we accept.
Q    T     Q'    T'    D
q0   0     q1    X     right   // read the next (or first) symbol
q0   1     q2    X     right   // of the first binary number, or
q0   =     q7    =     right   // recognize no next is available
q1   0     q1    0     right   // skip ahead to the = symbol while
q1   1     q1    1     right   // using state to remember which
q1   =     q3    =     right   // symbol we need to look for
q2   0     q2    0     right
q2   1     q2    1     right
q2   =     q4    =     right
q3   X     q3    X     right   // skip any crossed-out symbols
q3   0     q5    X     left    // in the second binary number
q3   1,b   rej   1,b   left    // then, make sure the next
q4   X     q4    X     right   // available digit exists and
q4   0,b   rej   0,b   left    // matches the one remembered
q4   1     q5    X     left    // otherwise, reject
q5   X     q5    X     left    // find the = while ignoring
q5   =     q6    =     left    // any crossed-out symbols
q6   0     q6    0     left    // find the last crossed-out
q6   1     q6    1     left    // symbol in the first binary
q6   X     q0    X     right   // number, then move right
                               // and start over
q7   X     q7    X     right   // we ran out of symbols
q7   b     acc   b     left    // in the first binary number,
q7   0,1   rej   0,1   left    // make sure we already ran out
                               // in the second as well
This TM could first sanitize input by ensuring both binary strings are non-empty and contain no leading zeroes (crossing off any it finds).
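To try the table out, here is a small Python sketch of a simulator for it. The helper names (rule, accepts_equal) are made up, and I am assuming that a row listing several symbols stands for one rule per symbol and that such rows write back whatever symbol was read; b is the blank.

```python
BLANK = "b"
RIGHT, LEFT = 1, -1
RULES = {}


def rule(state, reads, new_state, write, move):
    """Register one rule per symbol in `reads`; write=None writes the symbol back."""
    for symbol in reads:
        RULES[(state, symbol)] = (new_state, write or symbol, move)


rule("q0", "0", "q1", "X", RIGHT)
rule("q0", "1", "q2", "X", RIGHT)
rule("q0", "=", "q7", None, RIGHT)
rule("q1", "01", "q1", None, RIGHT)
rule("q1", "=", "q3", None, RIGHT)
rule("q2", "01", "q2", None, RIGHT)
rule("q2", "=", "q4", None, RIGHT)
rule("q3", "X", "q3", None, RIGHT)
rule("q3", "0", "q5", "X", LEFT)
rule("q3", "1" + BLANK, "rej", None, LEFT)
rule("q4", "X", "q4", None, RIGHT)
rule("q4", "0" + BLANK, "rej", None, LEFT)
rule("q4", "1", "q5", "X", LEFT)
rule("q5", "X", "q5", None, LEFT)
rule("q5", "=", "q6", None, LEFT)
rule("q6", "01", "q6", None, LEFT)
rule("q6", "X", "q0", None, RIGHT)
rule("q7", "X", "q7", None, RIGHT)
rule("q7", BLANK, "acc", None, LEFT)
rule("q7", "01", "rej", None, LEFT)


def accepts_equal(a, b):
    """Run the machine on the tape a=b and report whether it accepts."""
    tape = dict(enumerate(a + "=" + b))   # sparse tape, blank everywhere else
    state, head = "q0", 0
    while state not in ("acc", "rej"):
        symbol = tape.get(head, BLANK)
        state, write, move = RULES[(state, symbol)]
        tape[head] = write
        head += move
    return state == "acc"


print(accepts_equal("1011", "1011"), accepts_equal("1011", "1001"))  # True False
```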
To do "greater than", you could easily do the following:
1. Check to see if the length of the first binary number (after removing leading zeroes) is greater than, equal to, or less than the length of the second binary number (after removing leading zeroes). If the first one is longer than the second, accept. If the first one is shorter than the second, reject. Otherwise, continue to step 2.
2. Check for equality as in the other problem, but accept if at any point you have a 1 in the first number and find a 0 in the second. This works because we know there are no leading zeroes, the numbers have the same number of digits, and we are checking digits in descending order of significance. Reject if you find the other mismatch or if you determine the numbers are equal.
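The same two checks, written as ordinary Python on strings instead of a tape, might look like this. It is only a sketch of the logic (the function name is made up), not a Turing machine:

```python
def greater_than(a, b):
    """True if binary string a (MSB first) is strictly greater than b."""
    a = a.lstrip("0") or "0"          # drop leading zeroes, keep a lone 0
    b = b.lstrip("0") or "0"
    if len(a) != len(b):              # first check: the longer number wins
        return len(a) > len(b)
    for x, y in zip(a, b):            # second check: first mismatch decides
        if x != y:
            return x == "1"
    return False                      # equal numbers are not "greater"


print(greater_than("1011", "1001"))  # True
```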
To add numbers, the problem says to increment and decrement, but I feel like just adding with carry is not going to be significantly harder. An outline of the procedure is this:
Begin with carry = 0.
Go to least significant digit of first number. Go to state (dig=X, carry=0)
Go to least significant digit of second number. Go to state (sum=(X+Y+carry)%2, carry=(X+Y+carry)/2)
Go after the second number and write down the sum digit.
Go back and continue the process until one of the numbers runs out of digits.
Then, continue with whatever number still has digits, adding just those digits and the carry.
Finally, erase the original input and copy the sum backwards to the beginning of the tape.
An example of the distinct steps the tape might go through:
#1011+101#
#101X+101#
#101X+10X#
#101X+10X=#
#101X+10X=0#
#10XX+10X=0#
#10XX+1XX=0#
#10XX+1XX=00#
#1XXX+1XX=00#
#1XXX+XXX=00#
#1XXX+XXX=000#
#XXXX+XXX=000#
#XXXX+XXX=0000#
#XXXX+XXX=00001#
#XXXX+XXX=0000#
#1XXX+XXX=0000#
#1XXX+XXX=000#
#10XX+XXX=000#
#10XX+XXX=00#
#100X+XXX=00#
#100X+XXX=0#
#1000+XXX=0#
#1000+XXX=#
#10000XXX=#
#10000XXX#
#10000XX#
#10000X#
#10000#
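Stripped of the tape mechanics, the carry procedure boils down to something like the following Python sketch (the function name is made up; inputs are assumed to be non-empty binary strings, most significant bit first). Like the tape version, it produces the sum digits least significant first and reverses them at the end.

```python
def add_with_carry(a, b):
    i, j = len(a) - 1, len(b) - 1     # start at the least significant digits
    carry = 0
    digits = []                       # sum digits, least significant first
    while i >= 0 or j >= 0:
        x = int(a[i]) if i >= 0 else 0   # a number that ran out contributes 0
        y = int(b[j]) if j >= 0 else 0
        total = x + y + carry
        digits.append(str(total % 2))
        carry = total // 2
        i -= 1
        j -= 1
    if carry:
        digits.append("1")
    return "".join(reversed(digits))


print(add_with_carry("1011", "101"))  # 10000
```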
There are two ways to solve the addition problem. Assume your input tape is in the form ^a+b$, where ^ and $ are symbols telling you you've reached the front and back of the input.
You can increment b and decrement a by 1 each step until a is 0, at which point b will be your answer. This is assuming you're comfortable writing a TM that can increment and decrement.
You can implement a full adding TM, using carries as you would if you were adding binary numbers on paper.
For either option, you need code to find the least significant bit of both a and b. The problem specifies that the most significant bit is first, so you'll want to start at + for a and $ for b.
For example, let's say we want to increment 1011$. The algorithm we'll use is to start at the least significant digit. If it's a 0, replace it with a 1 and stop. If it's a 1, replace it with a 0 and move left, carrying into the next digit.
Start by finding $, moving the read head there. Move the read head to the left.
You see a 1. Write 0. Move the read head to the left.
You see a 1. Write 0. Move the read head to the left.
You see a 0. Write 1.
Return the read head to $. The binary number is now 1100$.
To compare two numbers, you need to keep track of which values you've already looked at. This is done by extending the alphabet with "marked" characters. 0 could be marked as X, 1 as Y, for example. X means "there's a 0 here, but I've seen it already."
So, for equality, we can start at ^ for a and = for b. (Assuming the input looks like ^a=b$.) The algorithm is to find the start of a and b, comparing the first unmarked bit of each. The first time you get to a different value, halt and reject. If only one of the numbers runs out of unmarked bits, halt and reject. If both run out at the same time (you get to = and $), halt and accept.
Let's look at input ^11=10$:
1. Read head starts at ^.
2. Move the head right until we find an unmarked bit.
3. Read a 1. Write Y. Tape reads ^Y1=10$. We're in a state that represents having read a 1.
4. Move the head right until we find =.
5. Move the head right until we find an unmarked bit.
6. Read a 1. This matches the bit we read before. Write a Y.
7. Move the head left until we find ^.
8. Go to step 2.
This time, we'll read a 1 in a and read the 0 in b. We'll halt and reject.
Hope this helps to get you started.

Importing Nodes with Coordinates to Gephi from CSV

This question seems pretty stupid but I actually fail to find a simple solution to this. I have a csv file that is structured like this:
0 21 34.00 34.00
1 23 35.00 25.00
2 25 45.00 65.00
The first column is the node's id, the second is an unimportant attribute. The 3rd and 4th attributes are supposed to be the x and y positions of the nodes.
I can import the file into the Data Laboratory without problems, but I can't get Gephi to use the x and y attributes as the corresponding node properties. All I want to achieve is that Gephi sets the x property to the value of the x attribute (and likewise for y). Also see picture.
Thanks for your help!
In the Layout window, you can select "Geo Layout" and define which columns are used as Latitude and Longitude.
The projection might come in weird if you do not actually have GeoData, but for me, this is fine.
In Gephi 0.8 there was a plugin called Recast column. This plugin is unfortunately not ported to Gephi 0.9 yet, but it allowed you to set Standard (hidden) Columns in the Node Table, from visible values in the nodes table. Thus if you have two columns of type Float or Decimal that represent your coordinates, you could set the coordinate values of your nodes.

Weka Decision Tree

I am trying to use weka to analyze some data. I've got a dataset with 3 variables and 1000+ instances.
The dataset references movie remakes and
how similar they are (0.0-1.0)
the difference in years between the movie and the remake
and lastly if they were made by the same studio (yes or no)
I am trying to make a decision tree to analyze the data. Using J48 (because that's all I have ever used) I only get one leaf. I'm assuming I'm doing something wrong. Any help is appreciated.
Here is a snippet from the data set:
Similarity YearDifference STUDIO TYPE
0.5 36 No
0.5 9 No
0.85 18 No
0.4 10 No
0.5 15 No
0.7 6 No
0.8 11 No
0.8 0 Yes
...
If interested the data can be downloaded as a csv here http://s000.tinyupload.com/?file_id=77863432352576044943
Your data set is not balanced, because there are almost 5 times more "No" than "Yes" values for the class attribute. That's why J48 produces a tree that is actually just one leaf, which classifies everything as "No". You can do one of these things:
Sample your data set so you have an equal number of No and Yes instances.
Try using a better classification algorithm, e.g. Random Forest (it's located a few entries below J48 in the Weka Explorer GUI).
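If you want to do the sampling outside of Weka before loading the file, a rough Python sketch of the first option could look like this. The function name and the tiny in-line data are made up for illustration; Weka's own resampling filters would be the more direct route if you stay inside the GUI.

```python
import random


def downsample(rows, label_index, seed=0):
    """Keep an equal number of rows per class by sampling the larger classes."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_index], []).append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)          # fixed seed so the result is repeatable
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Hypothetical rows in the shape of the snippet: similarity, year difference, studio
rows = [
    (0.5, 36, "No"), (0.5, 9, "No"), (0.85, 18, "No"),
    (0.4, 10, "No"), (0.5, 15, "No"), (0.8, 0, "Yes"),
]
balanced = downsample(rows, label_index=2)
print(len(balanced))  # 2
```

Note that downsampling throws data away, so with very few "Yes" rows the resulting set can become too small to learn much from.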

Is there any (opposite of newline) char?

Was wondering if we could print from right to left, bottom to top... I got this thought when trying to write a program to print the following square (for an input 'n', here n=4 )
 1  2  3  4
12 13 14  5
11 16 15  6
10  9  8  7
This could be solved many ways, by storing into a 2D array and printing the array... (Any language: Perl, C, C++, Java).
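For reference, the 2D-array approach mentioned above could be sketched like this in Python (the function name is just for illustration): fill the grid while walking clockwise, turning whenever the next cell is outside the grid or already filled, then print row by row.

```python
def spiral(n):
    grid = [[0] * n for _ in range(n)]
    r, c = 0, 0
    dr, dc = 0, 1                      # start by moving right
    for value in range(1, n * n + 1):
        grid[r][c] = value
        nr, nc = r + dr, c + dc
        if not (0 <= nr < n and 0 <= nc < n) or grid[nr][nc]:
            dr, dc = dc, -dr           # turn clockwise: right, down, left, up
            nr, nc = r + dr, c + dc
        r, c = nr, nc
    return grid


for row in spiral(4):
    print(" ".join(f"{value:2d}" for value in row))
```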
The long answer is that you can do whatever the terminal supports. There are many kinds of terminals (or “character output devices”), and many of them support cursor motions. (You can look at the Termcap Library project to get a picture of what different terminal types do.) There is a terminal command for moving up a line, so essentially yes, you should be able to do that. After poking around in the termcap database, I came up with the following:
$ printf "\n"; printf '\e[A'; echo Foo
Foo
In other words, the \e[A string has a non-zero chance to get you one line up. On some terminals :)
Basically this is possible, but not on a traditional line-based terminal. When accessing the screen pixel by pixel, it's quite easy to solve this problem. At any rate, there is no real counterpart to \n defined in ASCII.
Or maybe this could be achieved by changing the input method of the terminal to some culture which reads right to left and bottom to top.

How do I detect circular logic or recursion in a multi-levels references and dependencies

I have a graph of multi-level dependencies like this, and I need to detect any circular reference in this graph.
A = B
B = C
C = [D, B]
D = [C, A]
Has anybody had a problem like this?
Any solution?
Thanks, and sorry for my English.
========= updated ==========
I had another situation.
1
2 = 1
3 = 2
4 = [2, 3]
5 = 4
In this case, my recursive code iterates twice at the "4" reference, but these references don't generate an infinite loop. My problem is telling apart the case where the function passes through a reference more than once without there being an infinite loop from the case where there really is an infinite loop, so that I can inform the user.
1 = 4
2 = 1
3 = 2
4 = [2, 3]
5 = 4
This case is a bit different from the 2nd example: it generates an infinite loop. How can I know which cases generate an infinite loop and which don't?
Topological sorting. The description on Wikipedia is clear and works for all your examples.
Basically you start with a node that has no dependencies, put it in a list of sorted nodes, and then remove that dependency from every node. For your second example that means you start with 1. Once you remove all dependencies on 1 you're left with 2. You end up sorting them 1,2,3,4,5 and seeing that there's no cycle.
For your third example, every node has a dependency, so there's nowhere to start. Such a graph must contain at least one cycle.
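A small Python sketch of that idea, assuming the dependencies are given as a dict that maps each node to the list of nodes it depends on (the function name and this input format are my own choice):

```python
from collections import deque


def topo_order(deps):
    """Return the nodes in dependency order, or None if there is a cycle."""
    nodes = set(deps)
    for targets in deps.values():
        nodes.update(targets)
    remaining = {n: set(deps.get(n, ())) for n in nodes}   # unmet dependencies
    dependents = {n: [] for n in nodes}
    for n, targets in remaining.items():
        for t in targets:
            dependents[t].append(n)
    ready = deque(sorted(n for n in nodes if not remaining[n]))
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for d in dependents[n]:        # n is resolved, remove it everywhere
            remaining[d].discard(n)
            if not remaining[d]:
                ready.append(d)
    return order if len(order) == len(nodes) else None


second = {1: [], 2: [1], 3: [2], 4: [2, 3], 5: [4]}
third = {1: [4], 2: [1], 3: [2], 4: [2, 3], 5: [4]}
print(topo_order(second))  # [1, 2, 3, 4, 5]
print(topo_order(third))   # None
```

If some nodes are never placed in the order, those leftover nodes are the ones on or behind a cycle, which is what you would report to the user.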
Keep a list of uniquely identified nodes. Walk through the entire tree, checking each node against the list, until you reach a node referred to as a child that is already in the list. Take it from there (handle the loop or simply ignore it, depending on your requirements).
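One caveat with a single global list is that it would also flag the second example, where 2 is reached twice without any loop. A variant that only tracks the nodes on the current path avoids that. The following Python sketch assumes the same dict-of-lists input as in the question's examples; the names are made up.

```python
def find_cycle(deps):
    """Return one cycle as a list of nodes, or None if there is none."""
    done = set()        # nodes fully explored, known to be loop-free
    path = []           # nodes on the current chain, in order
    on_path = set()

    def visit(node):
        if node in on_path:                       # back on our own chain: a loop
            return path[path.index(node):] + [node]
        if node in done:                          # seen before, but not a loop
            return None
        path.append(node)
        on_path.add(node)
        for target in deps.get(node, ()):
            cycle = visit(target)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    for node in deps:
        cycle = visit(node)
        if cycle:
            return cycle
    return None


print(find_cycle({1: [], 2: [1], 3: [2], 4: [2, 3], 5: [4]}))    # None
print(find_cycle({1: [4], 2: [1], 3: [2], 4: [2, 3], 5: [4]}))   # [1, 4, 2, 1]
```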
One way to detect circular dependency is to keep a record of the length of the dependency chains that your ordering algorithm detects. If a chain becomes longer than the total number of nodes (due to repetition over a loop) then there is a circular dependency. This should work both for an iterative and for a recursive algorithm.