If I were to take the expression:
(A + B + C + D + E)
And use de morgan's law to transform it to:
(!A!B!C!D!E)!
Would I have to invert each bit before putting it into a NAND gate? Is there a simpler way?
EDIT: there is no shortcut. You have to do (!A!B!C!D!E)!
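A quick way to convince yourself of the identity is a brute-force truth-table check, sketched here in Python (just a sanity check, not part of the gate design itself):

from itertools import product

# Check over all 32 input combinations that
# A + B + C + D + E == NOT(!A * !B * !C * !D * !E),
# i.e. an OR gate is a NAND gate whose inputs have all been inverted.
for bits in product([False, True], repeat=5):
    or_gate = any(bits)
    nand_of_inverted = not all(not b for b in bits)
    assert or_gate == nand_of_inverted
print("identity holds for all 32 input combinations")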
I think you don't need an inverter on each input; a NAND on its own already gives you an expression like this :)
E.g.: (A' + B' + C' + D' + E'), which by De Morgan's law is just (A·B·C·D·E)', i.e. a single NAND gate.
I don't remember this properly, but I hope it helps.
I'm studying Reinforcement Learning and I'm facing a problem understanding the difference between SARSA, Q-Learning, Expected SARSA, Double Q-Learning, and temporal difference. Can you please explain the differences and tell me when to use each? And what is the effect of ε-greedy versus greedy action selection?
SARSA:
I'm in state St; an action is chosen with the help of the policy and moves me to another state St+1. The policy then picks an action in St+1, and the value of (St, At) is updated using the estimated value of that look-ahead state-action pair:
Q(S, A) ← Q(S, A) + α[R + γQ(S', A') − Q(S, A)]
Q-Learning:
I'm in state St; an action chosen with the help of the policy moves me to state St+1. This time the update doesn't depend on the policy's action in St+1; instead it uses the maximum of the estimated values (the greedy choice) in state St+1, and through it the value of St is updated:
Q(S, A) ← Q(S, A) + α[R + γ max_a Q(S', a) − Q(S, A)]
Expected SARSA:
It's the same as Q-Learning, except that instead of updating my value with the greedy move in St+1, I take the expected value over all actions:
Q(St, At) ← Q(St, At) + α[Rt+1 + γ E[Q(St+1, At+1) | St+1] − Q(St, At)]
Temporal difference:
The current value estimate is updated using the observed reward Rt+1 and the estimated value V(St+1) at time point t+1:
V(St) ← V(St) + α[Rt+1 + γV(St+1) − V(St)]
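To make the differences concrete, here is a minimal sketch of the four update rules as plain Python functions (tabular case; Q and V are dicts, and the argument names are my own, not from any particular library):

def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    # On-policy: uses the action a_next actually chosen by the policy in s_next.
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    # Off-policy: uses the greedy (maximum) value in s_next, regardless of
    # which action the behaviour policy will actually take there.
    best = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])

def expected_sarsa_update(Q, s, a, r, s_next, actions, pi, alpha, gamma):
    # pi(a, s) is an assumed helper returning the policy's probability of
    # action a in state s; the target is the expectation over all actions.
    expected = sum(pi(a2, s_next) * Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += alpha * (r + gamma * expected - Q[(s, a)])

def td0_update(V, s, r, s_next, alpha, gamma):
    # TD(0): the same bootstrapping idea, but on state values V instead of Q.
    V[s] += alpha * (r + gamma * V[s_next] - V[s])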
Is what I have correct, or am I missing something? And what about Double Q-Learning?
With 0.5 probability:
Q1(S, A) ← Q1(S, A) + α[R + γ Q2(S', argmax_a Q1(S', a)) − Q1(S, A)]
else:
Q2(S, A) ← Q2(S, A) + α[R + γ Q1(S', argmax_a Q2(S', a)) − Q2(S, A)]
Can someone explain it please!!
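For reference, here is a minimal sketch of that Double Q-Learning update in Python (same dict-based tabular setup as above; my own reading of the pseudocode):

import random

def double_q_update(Q1, Q2, s, a, r, s_next, actions, alpha, gamma):
    # Flip a coin to decide which table to update. The chosen table picks
    # the greedy action in s_next, but the *other* table evaluates it;
    # decoupling selection from evaluation reduces the maximization bias
    # of plain Q-Learning.
    if random.random() < 0.5:
        a_star = max(actions, key=lambda a2: Q1[(s_next, a2)])
        Q1[(s, a)] += alpha * (r + gamma * Q2[(s_next, a_star)] - Q1[(s, a)])
    else:
        a_star = max(actions, key=lambda a2: Q2[(s_next, a2)])
        Q2[(s, a)] += alpha * (r + gamma * Q1[(s_next, a_star)] - Q2[(s, a)])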
I have been working through a machine learning course and am currently on classification. I implemented the classification algorithm and obtained the parameters as well as the cost. The assignment already has a function for plotting the decision boundary, and it worked, but I was trying to read their code and cannot understand these lines.
plot_x = [min(X(:,2))-2, max(X(:,2))+2];
% Calculate the decision boundary line
plot_y = (-1./theta(3)).*(theta(2).*plot_x + theta(1));
Can anyone explain?
I'm also taking the same course as you. I guess what the code does is to generate two points on the decision line.
As you know you have the function:
theta0 + theta1 * x1 + theta2 * x2 = 0
This can be rewritten as:
c + mx + ky = 0
where x and y are the axes corresponding to x1 and x2, c is theta0 (the intercept term), m is theta1 (the coefficient of x), and k is theta2 (the coefficient of y).
This equation (c + mx + ky = 0) corresponds to the decision boundary, so the code finds two values for x (or x1) that cover the whole dataset (the -2 and +2 around the min and max in plot_x), and then uses the equation to find the corresponding y (or x2) values. Finally, the decision boundary can be plotted with plot(plot_x, plot_y).
In other words, it uses the equation to generate two points and draws the line through them; the reason for doing this is that Octave cannot plot a line directly from an equation.
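If it helps, here is the same computation translated to Python with matplotlib (my own sketch; theta is assumed to be ordered [theta0, theta1, theta2], and the numeric values below are made up, not from the course):

import numpy as np
import matplotlib.pyplot as plt

theta = np.array([-25.0, 0.2, 0.2])   # example values only
x1 = np.array([30.0, 100.0])          # two x-values spanning the data (min-2, max+2)

# Solve theta0 + theta1*x1 + theta2*x2 = 0 for x2:
x2 = (-1.0 / theta[2]) * (theta[1] * x1 + theta[0])

plt.plot(x1, x2)                      # two points are enough to draw the line
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()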
Hope this helps; sorry for any mistakes in grammar or unclear explanation ^.^
Rearranging the equations helped me, so I'm adding that here:
plot_y = (-1/theta2) * (theta1*plot_x + theta0)
Note that indexing in Octave starts at 1, not 0, so theta(3) = theta2, theta(2) = theta1, and theta(1) = theta0.
This plot_y equation is equivalent to:
c + mx + ky = 0 <=>
-ky = mx + c <=>
y = -1/k (mx + c)
x=2, y=1
x=3, y=3
x=4, y=6
x=5, y=10
x=6, y=15
x=7, y=21
x=8, y=28
I know that f(x) = f(x-1) + (x-1)
But is that the correct mathematical function? And what would the Big O notation be?
The correct (or at least significantly more efficient than the recursive one) equation would be
f(x) = x * (x - 1) / 2
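A quick Python sanity check (my own) that the closed form reproduces the values in the question:

def f_recursive(x):
    # The recurrence from the question: f(1) = 0, f(x) = f(x-1) + (x-1)
    return 0 if x == 1 else f_recursive(x - 1) + (x - 1)

def f_closed(x):
    # Closed form: constant time instead of x-1 recursive calls.
    return x * (x - 1) // 2

for x in range(2, 9):
    assert f_recursive(x) == f_closed(x)
print([f_closed(x) for x in range(2, 9)])   # [1, 3, 6, 10, 15, 21, 28]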
Looks like homework. You should mark it with the homework tag.
Did you mean f(x) = f(x-1) + (x-1)?
To solve for the function:
http://en.wikipedia.org/wiki/Recurrence_relation#Solving
To get the complexity:
http://en.wikipedia.org/wiki/Master_theorem
Yes, the function is right: the difference between consecutive y values increases by 1 each time.
Edit: thanks to trutheality for the comment.
For the complexity of the function, you can expand y like this:
y = 1 + 2 + 3 + ... + (n-1)
This sum is n(n-1)/2, whose highest-degree term is n^2, so
y = O(n^2)
The way to correctly state the problem is:
f(x) = f(x - 1) + (x - 1)
f(1) = 0
You want to solve f(x) in terms of x.
There are many ways to solve these kinds of recursive formulas. I like to use Wolfram Alpha; it has an easy interface.
Wolfram Alpha query "f(x)=f(x-1)+(x-1)"
That gives you the precise answer; in big-O notation you would say the function f is in O(x^2).
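For completeness, the recurrence also telescopes by hand, without any tools:

f(x) = f(1) + \sum_{k=2}^{x} (k-1) = \sum_{j=1}^{x-1} j = \frac{x(x-1)}{2}, \quad \text{so } f \in O(x^2)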
I'm implementing the system detailed in this paper.
On page 3, section 4 it shows the form that tensors take within the system:
R [ cos(2t), sin(2t); sin(2t), -cos(2t) ]
In my system, I only store R and t, since everything can be calculated from them.
However, I've got to the point where I need to sum two of these tensors (page 4, section 5.2). How can I find values for R and t after summing two tensors of this form?
I guess that's what you are looking for:
x = R_1*cos(2*t_1) + R_2*cos(2*t_2)
y = R_1*sin(2*t_1) + R_2*sin(2*t_2)
R_result = sqrt(x*x+y*y)
t_result = atan2(y,x)/2
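A small numerical check of these formulas in Python (my own verification, comparing the recovered R and t against a direct matrix sum):

import numpy as np

def tensor(R, t):
    # The 2x2 tensor form from the paper: R * [cos 2t, sin 2t; sin 2t, -cos 2t]
    return R * np.array([[np.cos(2*t),  np.sin(2*t)],
                         [np.sin(2*t), -np.cos(2*t)]])

def sum_params(R1, t1, R2, t2):
    # Recover (R, t) of the sum directly from the two (R, t) pairs.
    x = R1 * np.cos(2*t1) + R2 * np.cos(2*t2)
    y = R1 * np.sin(2*t1) + R2 * np.sin(2*t2)
    return np.hypot(x, y), np.arctan2(y, x) / 2

R, t = sum_params(1.5, 0.3, 2.0, 1.1)   # arbitrary test values
assert np.allclose(tensor(R, t), tensor(1.5, 0.3) + tensor(2.0, 1.1))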
Each term reduces to
R_1 trg(2 t_1) + R_2 trg(2 t_2) = R_1 trg_1 + R_2 trg_2
where trg represents either sin or cos and the indexed version takes the obvious meaning. So this is just an ordinary problem in trigonometric identities, repeated a couple of times.
Let
Q = (R_1 + R_2)/2
S = (R_1 - R_2)/2
then
R_1 trg(2 t_1) + R_2 trg(2 t_2) = Q (trg_1 + trg_2) + S (trg_1 - trg_2)
which involves identities you can look up.
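For reference, the identities in question are the standard sum-to-product formulas:

\cos a + \cos b = 2\cos\tfrac{a+b}{2}\,\cos\tfrac{a-b}{2}
\cos a - \cos b = -2\sin\tfrac{a+b}{2}\,\sin\tfrac{a-b}{2}
\sin a + \sin b = 2\sin\tfrac{a+b}{2}\,\cos\tfrac{a-b}{2}
\sin a - \sin b = 2\cos\tfrac{a+b}{2}\,\sin\tfrac{a-b}{2}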
Sorry, adding two tensors is nothing more than algebra. The two matrices have to be the same size, and you add them term by term.
You can't just add the radii and angles and plug them back into the tensor. Do the addition properly and it'll work. Here's the first term:
R1*cos(2t1) + R2*cos(2t2) = ?
Here's the answer from Wolfram Alpha. As you can see, it doesn't simplify into a nice, neat expression with an R and a t for you.
In case you haven't thought of it, put the tensor sum into Wolfram Alpha and see what it gives you. They're better at algebra than anyone on this site. Why not get an independent check of your work?
I need to determine whether two grid positions are adjacent. By adjacent, I only mean one unit left, right, up, or down; diagonals don't count. You know the x,y grid coordinates of both positions.
Ultimately this is for AS3, but answers in pseudocode would be sufficient.
abs(a.x - b.x) + abs(a.y - b.y) == 1
(a.x - b.x)^2 + (a.y - b.y)^2 == 1 (mathematically equivalent; note that in AS3 ^ is bitwise XOR, so write the squares as multiplications)
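For completeness, both checks as a tiny Python sketch (my own; the tuple coordinates stand in for the question's x,y fields):

def is_adjacent(a, b):
    # Manhattan distance of exactly 1: one unit left, right, up, or down.
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

def is_adjacent_squared(a, b):
    # Equivalent for integer grids: squared Euclidean distance of exactly 1.
    return (a[0] - b[0])**2 + (a[1] - b[1])**2 == 1

print(is_adjacent((2, 3), (2, 4)))   # True: one unit apart vertically
print(is_adjacent((2, 3), (3, 4)))   # False: diagonals don't count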