What are the states and rewards in the reward matrix? - reinforcement-learning

This code :
R = ql.matrix([ [0,0,0,0,1,0],
[0,0,0,1,0,1],
[0,0,100,1,0,0],
[0,1,1,0,1,0],
[1,0,0,1,0,0],
[0,1,0,0,0,0] ])
is from :
https://github.com/PacktPublishing/Artificial-Intelligence-By-Example/blob/47bed1a88db2c9577c492f950069f58353375cfe/Chapter01/MDP.py
R is defined as the "Reward matrix for each state". What are the states and rewards in this matrix?
# Reward for state 0
print('R[0,]:' , R[0,])
# Reward for state 1
print('R[1,]:' , R[1,])
prints :
R[0,]: [[0 0 0 0 1 0]]
R[1,]: [[0 0 0 1 0 1]]
Is [0 0 0 0 1 0] state0 & [0 0 0 1 0 1] state1 ?

According to the book that uses that example, R represents the reward of the transitions from one current state s to another next state s'.
Specifically, R is associated with the following graph:
Each row in the matrix R represents a letter from A to F (the current state), and each column represents a letter from A to F (the next state). The 1 values represent the edges of the graph. For example, R[0,]: [[0 0 0 0 1 0]] means that you can go from state s=A to next state s'=E and receive a reward of 1. Similarly, R[1,]: [[0 0 0 1 0 1]] means that you receive a reward of 1 if you go from B to D or F. The goal seems to be reaching and remaining in C, which yields the largest reward (100).
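To make the row/column reading concrete, here is a minimal sketch that lists every rewarded transition in R. It uses plain Python lists rather than the numpy matrix from the question, and the helper name rewarded_transitions and the label string "ABCDEF" are assumptions made for illustration, not part of the book's code.

```python
# Reward matrix copied from the question: rows are the current state,
# columns are the next state. The labels A..F follow the description above.
R = [
    [0, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 100, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
]
states = "ABCDEF"


def rewarded_transitions(reward_matrix, labels):
    """Return (current state, next state, reward) for every nonzero entry."""
    return [
        (labels[s], labels[s_next], reward)
        for s, row in enumerate(reward_matrix)
        for s_next, reward in enumerate(row)
        if reward != 0
    ]


transitions = rewarded_transitions(R, states)
for src, dst, reward in transitions:
    print(f"{src} -> {dst}: reward {reward}")
```

Read this way, the only entry with reward 100 is the C -> C transition, which matches the observation that staying in C is the goal.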


How to find the nodes of a triangle from an adjacency matrix in Octave

I know how to find the number of triangles in an adjacency matrix.
tri = trace(A^3) / 6
But I need to find the nodes themselves, so that I can then look up the values of the edges in the adjacency matrix, since it's a signed graph. Is there an existing function that does this?
Taking the power of the adjacency matrix loses information about the intermediate nodes. Instead of a 2-dimensional matrix, we need 3 dimensions.
Given a graph:
and its adjacency matrix:
A =
0 0 0 0 1 1 0 1 0 0
0 0 0 1 0 1 0 0 0 0
0 0 0 1 0 0 0 1 0 1
0 1 1 0 1 0 1 0 0 0
1 0 0 1 0 0 1 0 0 0
1 1 0 0 0 0 0 1 1 0
0 0 0 1 1 0 0 0 1 0
1 0 1 0 0 1 0 0 0 0
0 0 0 0 0 1 1 0 0 0
0 0 1 0 0 0 0 0 0 0
Compute the 3d matrix T such that T(i,j,k) == 1 iff there is a path in the graph i=>j=>k=>i.
T = and(A, permute(A, [3 1 2]))
This is the equivalent of squaring the adjacency matrix, but keeping the path information. The and function is used here instead of multiplication in case A is a weighted adjacency matrix. If you sum along the 2nd dimension, you'll get A^2:
>> isequal(squeeze(sum(T,2)), A^2)
ans = 1
Now that we've got the paths of length 2, we just need to filter so we keep only the paths that return to their starting points.
T = and(T, permute(A.', [1 3 2])); % Transpose A in case graph is directed
Now, if T(i,j,k) == 1, then there is a triangle starting at node i, through nodes j and k and returning to node i. If you want to find all such paths:
[M,N,P] = ind2sub(size(T), find(T));
P = [M,N,P];
P will be a list of all triangular paths:
P =
8 6 1
6 8 1
7 5 4
5 7 4
7 4 5
4 7 5
8 1 6
1 8 6
5 4 7
4 5 7
6 1 8
1 6 8
In this case we get 12 paths. Every triangle in an undirected graph appears 6 times in this list: once for each of its 3 starting points, times 2 directions. This gives the same result as trace:
>> trace(A^3)
ans = 12
If you want to remove the duplicates, the simplest way for triangles is to sort the vertices within each row and then take the unique rows of the list. This works for triangles only because all permutations of the nodes in the cycle are present. For longer cycles, this will not work.
P = unique(sort(P, 2), 'rows');
P =
1 6 8
4 5 7
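For comparison, here is a minimal Python sketch that checks every node triple of the same adjacency matrix directly. It is not a translation of the Octave code above; it assumes the graph is undirected (A is symmetric), and the function name find_triangles is made up for illustration. Node numbers are reported 1-based to match the Octave output.

```python
from itertools import combinations

# Adjacency matrix copied from the answer above (undirected, so symmetric).
A = [
    [0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 1],
    [0, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 1, 0],
    [1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
]


def find_triangles(adj):
    """Return each triangle once as a sorted 1-based node triple."""
    n = len(adj)
    return [
        (i + 1, j + 1, k + 1)
        for i, j, k in combinations(range(n), 3)  # i < j < k, so no duplicates
        if adj[i][j] and adj[j][k] and adj[i][k]
    ]


triangles = find_triangles(A)
print(triangles)  # [(1, 6, 8), (4, 5, 7)]
```

Because each triple is visited once with i < j < k, no deduplication step is needed. For a signed graph, the three edge values of a triangle (i, j, k) can then be read from adj[i-1][j-1], adj[j-1][k-1] and adj[i-1][k-1]. The triple loop is cubic in the number of nodes, so this is only a sketch for small graphs.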
Here is a solution using matrix multiplication:
C = (A * A.') & A;
[x, y] = find(tril(C));
n = numel(x);
D = sparse([x; y], [1:n 1:n].', 1, size(A,1), n);
[X, ~, V] = find(C * D);
tri = [x y X(V == 2)]
tri = unique(sort(tri, 2), 'rows');
First we need to define triangle nodes: two nodes belong to a triangle if they are neighbors of each other and also have a common neighbor.
We use this definition to compute an adjacency matrix C that keeps only the edges between triangle nodes; all other edges are removed.
The expression A * A.' selects pairs of nodes that have common neighbors, and the & A operator requires that those pairs also be neighbors of each other.
Now we can use [x, y] = find(tril(C)); to extract the first and second nodes of each triangle as x and y respectively.
For the third node we need to find a node that has both x and y as its neighbors. As before, we can use the boolean matrix multiplication trick to speed up the computation.
Finally, the result tri has duplicates that should be removed using unique and sort.

Octave element wise comparisons [duplicate]

Let us consider the following code for an impulse function:
function y=impulse_function(n);
y=0;
if n==0
y=1;
end
end
this code
>> n=-2:2;
>> i=1:length(n);
>> f(i)=impulse_function(n(i));
>>
returns result
f
f =
0 0 0 0 0
while this code
>> n=-2:2;
>> for i=1:length(n);
f(i)=impulse_function(n(i));
end
>> f
f =
0 0 1 0 0
In both cases i is 1 2 3 4 5, so what is different?
Your function is not defined to handle vector input.
Modify your impulse function as follows:
function y=impulse_function(n)
[a b]=size(n);
y=zeros(a,b);
y(n==0)=1;
end
In your definition of impulse_function, the whole array is compared to zero, and the return value is a single number instead of a vector.
In the first case you are comparing an array to the value 0. This gives the result [0 0 1 0 0], which is not a simple true or false; an if condition on an array is true only when all of its elements are nonzero. So the statement y=1; will not get executed, the function returns the scalar 0, and f will be [0 0 0 0 0] as shown.
In the second case you iterate through the array value by value and pass each one to the function. Since the array contains the value 0, you get 1 back from the function for that element, so f is [0 0 1 0 0], which is an impulse.
You'll need to modify your function to take array inputs.
Perhaps this example will clarify the issue further:
cond = 0;
if cond == 0
disp(cond) % This will print 0 since 0 == 0
end
cond = 1;
if cond == 0
disp(cond) % This won't print since 1 ~= 0 (not equal)
end
cond = [-2 -1 0 1 2];
if cond == 0
disp(cond) % This won't print since not every element of [-2 -1 0 1 2] equals 0
end
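The same rule can be sketched outside Octave. The Python snippet below is only an analogy: octave_if is a hypothetical helper that mimics how an if statement treats an array condition (true only when the array is non-empty and every element is nonzero), and the two impulse functions contrast the whole-array comparison with an element-wise one.

```python
def octave_if(condition):
    """Mimic Octave's `if` on an array: true only if non-empty and all nonzero."""
    return len(condition) > 0 and all(condition)


def impulse_whole_array(n):
    # Mirrors the original function when it is handed the whole vector:
    # the comparison n == 0 is element-wise, but `if` collapses it to one answer.
    y = 0
    if octave_if([x == 0 for x in n]):
        y = 1
    return y


def impulse_elementwise(n):
    # Mirrors the fixed function: one output value per input element.
    return [1 if x == 0 else 0 for x in n]


n = list(range(-2, 3))  # [-2, -1, 0, 1, 2]
print(impulse_whole_array(n))   # 0, a single scalar
print(impulse_elementwise(n))   # [0, 0, 1, 0, 0]
```

The scalar 0 from the first function is what gets assigned to every position of f in the vectorized call, which is why f came out as all zeros.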
You could define your impulse function simply as this one -
impulse_function = @(n) (1:numel(n)).*n==0
Sample run -
>> n = -6:4
n =
-6 -5 -4 -3 -2 -1 0 1 2 3 4
>> out = impulse_function(n)
out =
0 0 0 0 0 0 1 0 0 0 0
Plot code -
plot(n,out,'o') %// data points
hold on
line([0 0],[1 0]) %// impulse point
Plot result -
You can write an even simpler function:
function y = impulse_function(n)
y = (n == 0); % element-wise comparison, one result per input element
end
Note that this will return y as a type logical array but that should not affect later numerical computations.

How to apply countvectorizer to bigrams in a pandas dataframe

I'm trying to apply the countvectorizer to a dataframe containing bigrams to convert it into a frequency matrix showing the number of times each bigram appears in each row but I keep getting error messages.
This is what I tried using
cereal['bigrams'].head()
0 [(best, thing), (thing, I), (I, have),....
1 [(eat, it), (it, every), (every, morning),...
2 [(every, morning), (morning, my), (my, brother),...
3 [(I, have), (five, cartons), (cartons, lying),...
.........
bow = CountVectorizer(max_features=5000, ngram_range=(2,2))
train_bow = bow.fit_transform(cereal['bigrams'])
train_bow
Expected results
(best,thing) (thing, I) (I, have) (eat,it) (every,morning)....
0 1 1 1 0 0
1 0 0 0 1 1
2 0 0 0 0 1
3 0 0 1 0 0
....
I see you are trying to convert a pd.Series into a count representation of each term.
That's a bit different from what CountVectorizer does.
From the function description:
Convert a collection of text documents to a matrix of token counts
The official usage example is:
>>> from sklearn.feature_extraction.text import CountVectorizer
>>> corpus = [
... 'This is the first document.',
... 'This document is the second document.',
... 'And this is the third one.',
... 'Is this the first document?',
... ]
>>> vectorizer = CountVectorizer()
>>> X = vectorizer.fit_transform(corpus)
>>> print(vectorizer.get_feature_names())
['and', 'document', 'first', 'is', 'one', 'second', 'the', 'third', 'this']
>>> print(X.toarray())
[[0 1 1 1 0 0 1 0 1]
[0 2 0 1 0 1 1 0 1]
[1 0 0 1 1 0 1 1 1]
[0 1 1 1 0 0 1 0 1]]
So, as one can see, it takes as input a list where each item is a "document" (a string).
That's probably the cause of the errors you are getting: you are passing a pd.Series where each item is a list of tuples.
To use CountVectorizer you would have to transform your input into the proper format.
If you have the original corpus/text, you can easily apply CountVectorizer on top of it (with the ngram_range parameter) to get the desired result.
Otherwise, the best solution would be to treat the data as what it is, a series of lists of items, which must be counted/pivoted.
Sample workaround:
(it would be a lot easier if you just used the text corpus instead)
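The workaround's original code is not reproduced here, so the following is only a minimal sketch of the "count and pivot" idea using the standard library. The bigram_rows list is a hypothetical stand-in for cereal['bigrams'] (the elided parts of each row are simply left out), and bigram_count_matrix is a made-up helper name.

```python
from collections import Counter

# Hypothetical stand-in for cereal['bigrams']: one list of bigram tuples per row.
bigram_rows = [
    [("best", "thing"), ("thing", "I"), ("I", "have")],
    [("eat", "it"), ("it", "every"), ("every", "morning")],
    [("every", "morning"), ("morning", "my"), ("my", "brother")],
    [("I", "have"), ("five", "cartons"), ("cartons", "lying")],
]


def bigram_count_matrix(rows):
    """Return (vocabulary, matrix) where matrix[r][c] counts bigram c in row r."""
    vocabulary = []
    seen = set()
    for row in rows:
        for bigram in row:
            if bigram not in seen:  # keep first-seen order for the columns
                seen.add(bigram)
                vocabulary.append(bigram)
    counts = [Counter(row) for row in rows]
    matrix = [[row_counts[bigram] for bigram in vocabulary] for row_counts in counts]
    return vocabulary, matrix


vocabulary, matrix = bigram_count_matrix(bigram_rows)
print(vocabulary[:3])
for row in matrix:
    print(row)
```

If pandas is available, the result can be wrapped as pd.DataFrame(matrix, columns=vocabulary) to get the layout shown in the expected results. CountVectorizer also accepts a callable as its analyzer parameter, which may let it consume pre-tokenized rows directly, but that route is not shown or tested here.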
Hope it helps!

How to convert a 3 input AND gate into a NOR gate?

I know that I can convert a 2-input AND gate into a NOR gate by simply inverting the two inputs, because of DeMorgan's theorem.
But how would you do the equivalent on a 3-input AND gate?
Say...
____
A___| \
B___| )___
C___|____ /
I'm trying to understand this because my homework asks me to take a circuit and convert it using NOR synthesis to only use nor gates, and I know how to do it with 2 input gates, but the gate with 3 inputs is throwing me for a spin.
DeMorgan's theorem for 2-input AND would produce:
AB
(AB)''
(A' + B')'
So, yes, the inputs are inverted and fed into a NOR gate.
DeMorgan's theorem for 3-input AND would similarly produce:
ABC
(ABC)''
(A' + B' + C')'
Which is, again, inputs inverted and fed into a (3-input) NOR gate:
___
A--O\ \
B--O ) )O---
C--O/___ /
@SailorChibi has truth tables that show equivalence.
If I haven't made any mistakes, it is pretty much the same: invert all 3 of the inputs and you get a NOR.
Table:
AND with inverted inputs (each row shows the inverted inputs)
1 1 1 = 1
1 1 0 = 0
1 0 1 = 0
1 0 0 = 0
0 1 1 = 0
0 1 0 = 0
0 0 1 = 0
0 0 0 = 0
is exactly the same as NOR with the original inputs
0 0 0 = 1
0 0 1 = 0
0 1 0 = 0
0 1 1 = 0
1 0 0 = 0
1 0 1 = 0
1 1 0 = 0
1 1 1 = 0
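The equivalence can also be checked exhaustively. The sketch below models the gates as small Python functions (the names and3, nor3 and inv are chosen here for illustration) and compares ABC with (A' + B' + C')' for all eight input combinations. The inverter is itself built from a NOR gate with its inputs tied together, so the right-hand side uses NOR gates only.

```python
from itertools import product


def nor3(a, b, c):
    """3-input NOR: 1 only when all inputs are 0."""
    return int(not (a or b or c))


def inv(x):
    """Inverter built from a NOR gate with all inputs tied together."""
    return nor3(x, x, x)


def and3(a, b, c):
    """3-input AND, used as the reference."""
    return int(a and b and c)


rows = []
for a, b, c in product((0, 1), repeat=3):
    lhs = and3(a, b, c)                   # ABC
    rhs = nor3(inv(a), inv(b), inv(c))    # (A' + B' + C')'
    rows.append((a, b, c, lhs, rhs))
    print(a, b, c, "AND:", lhs, "NOR of inverted inputs:", rhs)

equivalent = all(lhs == rhs for *_, lhs, rhs in rows)
print("equivalent:", equivalent)
```

This covers only the single 3-input AND gate; a full NOR synthesis of a larger circuit would apply the same substitution gate by gate.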

Not able to understand output of CSR Representation in CUSP

I am trying to use the CUSP library. I am reading .txt files that contain a sparse matrix in COO representation, and I am using CUSP to convert it into CSR format.
When I print the matrix with cusp::print() it prints the correct outcome for COO representation. However when I convert the matrix into CSR, I have written my own function for printing but the outcome is not what I want.
Here is the snippet
int main()
{
//.
//bla bla
//..
//create a 2d coo matrix
cusp::coo_matrix<int, int, cusp::host_memory> D(nRows_data, nCols_data, nnz_data);
// Load data from file into sparse matrices
//fill 2D coo matrix
fill2DCooMatrixFromFile( fNameData, D );
std::cout<<"\n----------------------------\n";
cusp::print( D );
cusp::csr_matrix<int, int, cusp::host_memory> csrD = D;
std::cout<<"\n----------------------------\n";
printCSRMatrix( csrD );
}
//print csr matrix
void printCSRMatrix( cusp::csr_matrix<int, int, cusp::host_memory> csr )
{
std::cout<<"csr matrix <"<<csr.num_rows<<", "<<csr.num_cols<<"> with "<<csr.num_entries<<" enteries\n";
std::cout<<"V :: ";
for( int i=0 ; i<csr.values.size() ; i++ )
std::cout<<csr.values[i]<<" ";
std::cout<<"\n";
std::cout<<"CI :: ";
for( int i=0 ; i<csr.column_indices.size() ; i++ )
std::cout<<csr.column_indices[i]<<" ";
std::cout<<"\n";
std::cout<<"RO :: ";
for( int i=0 ; i<csr.row_offsets.size() ; i++ )
std::cout<<csr.row_offsets[i]<<" ";
std::cout<<"\n";
}
Assume that fill2DCooMatrixFromFile fills in the following matrix
1 0 1 0 0
0 0 0 1 0
0 0 0 0 0
0 1 0 0 0
0 0 0 1 0
Following is the output I get with the code
sparse matrix <5, 5> with 5 entries
0 0 1
0 2 1
1 3 1
3 1 1
4 3 1
----------------------------
csr matrix <5, 5> with 5 enteries
V :: 1 1 1 1 1
CI :: 0 2 3 1 3
RO :: 0 2 3 3 4 5
I am not able to understand the row offsets (RO) in the output.
RowOffset is a cumulative count of the entries: RO[i] is the number of nonzero entries in all rows before row i. It always starts with 0 and ends with the number of nonzeros contained in the sparse matrix.
RO :: 0 2 3 3 4 5
Hence, you should read the line as follows: before the first row of your sparse matrix there are zero entries (RO[0] == 0). The first row contains two entries, so RO[1] == 2; their columns are CI[0] and CI[1] and their values are V[0] and V[1]. The second row adds one more entry, hence RO[2] == 3, and it is located at column CI[2] with value V[2].
As you can see, RO does not change between the third and fourth numbers (both are 3), which indicates an empty row in the matrix (here the third row).
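To see where those numbers come from, here is a minimal Python sketch of a COO to CSR conversion applied to the matrix from the question. It is not CUSP code and makes no claim about how CUSP implements the conversion; coo_to_csr and row_entries are hypothetical helper names.

```python
def coo_to_csr(num_rows, rows, cols, vals):
    """Convert COO triplets to CSR arrays (row_offsets, column_indices, values)."""
    # Count the entries in each row, stored one slot to the right.
    row_offsets = [0] * (num_rows + 1)
    for r in rows:
        row_offsets[r + 1] += 1
    # A running sum turns the per-row counts into cumulative offsets.
    for i in range(num_rows):
        row_offsets[i + 1] += row_offsets[i]
    # Order the entries by (row, column) so each row is contiguous.
    order = sorted(range(len(rows)), key=lambda e: (rows[e], cols[e]))
    column_indices = [cols[e] for e in order]
    values = [vals[e] for e in order]
    return row_offsets, column_indices, values


def row_entries(row_offsets, column_indices, values, r):
    """Return the (column, value) pairs stored for row r."""
    start, stop = row_offsets[r], row_offsets[r + 1]
    return list(zip(column_indices[start:stop], values[start:stop]))


# COO triplets as printed by cusp::print in the question.
coo_rows = [0, 0, 1, 3, 4]
coo_cols = [0, 2, 3, 1, 3]
coo_vals = [1, 1, 1, 1, 1]

ro, ci, v = coo_to_csr(5, coo_rows, coo_cols, coo_vals)
print("RO ::", ro)  # [0, 2, 3, 3, 4, 5]
print("row 0:", row_entries(ro, ci, v, 0))  # [(0, 1), (2, 1)]
print("row 2:", row_entries(ro, ci, v, 2))  # [] because the row is empty
```

The slice bounds RO[r] and RO[r+1] are what make the format work: an empty row simply has equal consecutive offsets.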
Hope that clarifies how the CSR matrix format works. Otherwise feel free to ask more.