How to make the sum of the output equal 1 - deep-learning

The sum of my (PyTorch) model's output isn't 1. This is the structure of the model:
LSTM(4433, 64)
LSTM(64, 64)
Linear(64, 4433)
Sigmoid()
And this is the predicted output of the model:
Input
[1, 0, 0, …, 0, 0]
Output
[.7842, .5, .5, …, .5, .5]
Do you know any function that can make its sum 1?

The sigmoid activation function maps each input to a value in [0, 1] without taking the other elements of the input vector into account. Softmax performs a similar transformation, but its output vector sums to 1.
TL;DR: use softmax instead of sigmoid.
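For illustration, a minimal sketch of the swap in PyTorch; the logits tensor is a random stand-in for the output of the Linear(64, 4433) layer above, and the variable names are mine:
import torch
import torch.nn as nn

logits = torch.randn(1, 4433)       # stand-in for the final Linear(64, 4433) output
probs = nn.Softmax(dim=-1)(logits)  # normalizes across all 4433 outputs
print(probs.sum(dim=-1))            # sums to 1 (up to floating-point error)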

Related

How can I use scipy interp1d with N-D array for x without for loop

How can I use scipy.interpolate.interp1d when my x array is an N-D array, instead of a 1-D array, without using a loop?
The function f returned by interp1d then needs to be used with numpy.percentile, with one of the arrays as input.
I think there should be a way to do it with a list comprehension or lambda function, but I am still learning these tools.
(Note that this is different from my recent question here, because I mixed up the x and y arrays in that posted question, and the problem was not reproducible.)
Problem statement/example:
# a is y in interp1d docs
a = np.array([97, 4809, 4762, 282, 3879, 17454, 103, 2376, 40581])
# b is x in interp1d docs
b = np.array([
    [0.14, 0.11, 0.29, 0.11, 0.09, 0.68, 0.09, 0.18, 0.5],
    [0.32, 0.25, 0.67, 0.25, 0.21, 1.56, 1.60, 0.41, 1.15],
])
Just trying this, below, fails with ValueError: x and y arrays must be equal in length along interpolation axis. The expected return is [array(97.), array(2376.)]. I am using the median here, but will also need the 10th, 90th, etc. percentiles.
f = interpolate.interp1d(b, a, axis=0)
f(np.percentile(b, 50, axis=0))
However, this, below, works and prints array(97.):
f = interpolate.interp1d(b[0,:], a, axis=0)
f(np.percentile(b[0,:], 50, axis=0))
A loop works, but I am wondering if there is a solution using list comprehensions, lambda functions, or some other technique.
l = []
for _i in range(b.shape[0]):
    _f = interpolate.interp1d(b[_i, :], a, axis=0)
    l.append(_f(np.percentile(b[_i, :], 50, axis=0)))
print(l)
# returns
# [array(97.), array(2376.)]
Efforts:
I understand I can loop through the b array with a list comprehension.
[b[i,:] for i in range(b.shape[0])]
# returns
# [array([0.14, 0.11, 0.29, 0.11, 0.09, 0.68, 0.09, 0.18, 0.5 ]),
# array([0.32, 0.25, 0.67, 0.25, 0.21, 1.56, 1.6 , 0.41, 1.15])]
And I also understand that I can use a list comprehension to create the scipy interpolation function f for each row of b:
[interpolate.interp1d(b[i, :], a, axis=0) for i in range(b.shape[0])]
# returns
# [<scipy.interpolate.interpolate.interp1d at 0x1b72e404360>,
# <scipy.interpolate.interpolate.interp1d at 0x1b72e404900>]
But I don't know how to combine these two list comprehensions to apply the np.percentile function.
Using Python 3.8.3, NumPy 1.18.5, SciPy 1.3.2
If you have large data arrays, you want to stay away from for loops, map, np.vectorize, and comprehensions: they will all be slow. Instead, use vectorized numpy or scipy operations whenever possible.
In this particular case, you can implement the vectorization pretty trivially yourself. interp1d defaults to a linear interpolation, which is very simple to code by hand. For a general interpolator, the first step would be to sort x and y, which is why scipy can't support multiple x for a given y. If the x rows all have different sort order, what do you do with the y?
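To illustrate the "simple to code by hand" point, here is a minimal sketch for a single row, assuming x is sorted and strictly increasing (the function name lerp is mine):
import numpy as np

def lerp(x, y, q):
    # first index j with x[j] >= q, clipped so that j-1 and j are both valid
    j = np.clip(np.searchsorted(x, q), 1, len(x) - 1)
    x0, x1 = x[j - 1], x[j]
    return y[j - 1] + (y[j] - y[j - 1]) * (q - x0) / (x1 - x0)

print(lerp(np.array([0.0, 1.0, 2.0]), np.array([0.0, 10.0, 20.0]), 1.5))  # 15.0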
Luckily, there are a couple of things you can do to make this much faster than having to build a full interpolator or argsort y multiple times. For example, start by argsorting x:
idx = b.argsort(axis=1)
idx is now an array such that b[np.arange(2)[:, None], idx] gives the sorted version of b along axis 1, and a[idx] gives the corresponding y-values. Since you are taking the median (50th percentile), and the rows have an odd number of elements, the x value is just the middle element of each sorted row, and y is given by
a[idx[:, len(a) // 2]]
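Concretely, with the a and b arrays from the question:
idx = b.argsort(axis=1)
a[idx[:, len(a) // 2]]
# returns
# array([  97, 2376])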
If you had an even number of elements, you would have to average the elements surrounding the middle:
i = len(a) // 2 - 1
a[idx[:, i:i + 2]].mean(axis=1)
You can reduce algorithmic complexity by using np.argpartition instead of a full-blown np.argsort to get the middle element(s).
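A sketch of that variant; np.argpartition only guarantees that the element at the requested position lands in its sorted place, which is all the median lookup needs:
mid = len(a) // 2
idx = np.argpartition(b, mid, axis=1)
a[idx[:, mid]]
# returns
# array([  97, 2376])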
interp1d and other interpolators from scipy.interpolate only support 1D x arrays. So you'll need to loop over the dimensions of x manually.

plotting multiple graphs on the same figure in Octave

I'm trying to plot multiple graphs on a single figure in Octave. Here is my code; the graphs represent the decrease of the cost function on each iteration of gradient descent:
% Init Theta and Run Gradient Descent
theta = zeros(3, 1);
[theta, J_history] = gradientDescentMulti(X, y, zeros(3, 1), alpha, num_iters);
[theta1,J1]=gradientDescentMulti(X, y, zeros(3, 1), 0.05, num_iters);
[theta3,J3]=gradientDescentMulti(X, y, zeros(3, 1), 0.03, num_iters);
% Plot the convergence graph
figure;
plot(1:numel(J_history), J_history, 'g', 'LineWidth', 2);
hold on;
plot(1:50, J2, 'r');
plot(1:50, J3, 'b');
xlabel('Number of iterations');
ylabel('Cost J');
However, when I run the code, I get only one graph on the figure, without even the labels. The best I was able to do was to put two graphs on the same figure:
Is there something wrong with my code?

Converting from fractional to decimal representation in Octave

I'm getting the following warning:
warning: Using rat() heuristics for double-precision input (is this what you wanted?)
and my resulting calculation uses the rational representation when I would like the decimal form. How can I force the computation to convert the rational result to a decimal representation?
Here is the code:
pkg load symbolic
syms a b c d real
C = [1, 0, 0, 0; 0, 1, 0, 0; 0, 0, 0, 1; 0, 0, 1, 0]
H = (1/sqrt(2))*[1, 1; 1, -1]
I = [1, 0; 0, 1]
X = [a, b, c, d]
s = kron(H, I)
s*C*X'
The rational representation can be converted to a floating-point one using vpa. vpa(x,n) evaluates x to at least n significant digits. If you want to use the current value of digits, you can omit n.
vpa(s*C*X.',4)
% Above line evaluates the result to at least 4 significant digits
Also note that ' is not the transpose; it is the complex conjugate transpose. Use the transpose operator (i.e. .') when you mean to take the transpose. That's why I made that replacement in the code above.
Regarding the warning message, it can be turned off by:
warning ('off', 'OctSymPy:sym:rationalapprox');
You can turn it on again by replacing off with on in the above code.

Index of Embedding layer with zero padding in Keras

I am building an RNN model in Keras for sentences with word embeddings from gensim. I am initializing the embedding layer with GloVe vectors. Since this is a sequential model and sentences have variable lengths, vectors are zero-padded. e.g.
[0, 0, 0, 6, 2, 4]
Let's say the GloVe vectors have dimensions [NUM_VOCAB, EMBEDDING_SIZE]. The zero index is masked (ignored), so to get the proper indexing of words, do we add an extra row to the GloVe matrix so the dimensions are [NUM_VOCAB+1, EMBEDDING_SIZE]?
Seems like there is an unnecessary vector that the model will estimate unless there is a more elegant way.
glove = Word2Vec.load_word2vec_format(filename)
embedding_matrix = np.vstack([np.zeros(EMBEDDING_SIZE), glove.syn0])
model = Sequential()
# -- this uses Glove as inits
model.add(Embedding(NUM_VOCAB, EMBEDDING_SIZE, input_length=maxlen, mask_zero=True,
                    weights=[embedding_matrix]))
# -- sequence layer
model.add(LSTM(32, return_sequences=False, init='orthogonal'))
model.add(Activation('tanh'))
...
Thanks
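For what it's worth, here is a sketch of the indexing being described, assuming the Embedding layer's input_dim must match the row count of its weight matrix, so it becomes NUM_VOCAB + 1 once the zero row is stacked on top. The sizes are illustrative and random vectors stand in for the real GloVe data:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

NUM_VOCAB, EMBEDDING_SIZE, maxlen = 10000, 100, 6           # illustrative sizes
glove_vectors = np.random.randn(NUM_VOCAB, EMBEDDING_SIZE)  # stand-in for glove.syn0

# row 0 is the (masked) padding vector; word i lives at row i for i in 1..NUM_VOCAB
embedding_matrix = np.vstack([np.zeros(EMBEDDING_SIZE), glove_vectors])

model = Sequential()
model.add(Embedding(NUM_VOCAB + 1, EMBEDDING_SIZE, input_length=maxlen,
                    mask_zero=True, weights=[embedding_matrix]))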

prolog all binary numbers

I need a predicate that will produce all the binary numbers of N digits.
For instance, the predicate binary(2, L)
will return L = [[0, 0], [0, 1], [1, 0], [1, 1]].
Please do not use findall.
Once you have a list representing all the numbers with N bits, generating all the numbers of N+1 bits is just a matter of unfolding every N-number [a,b,c,...] into two N+1-numbers: [0,a,b,c,...] and [1,a,b,c,...].
Update:
unfold([], []).
unfold([H|T], [[0|H], [1|H]|L]) :-
    unfold(T, L).

bn(N, L) :-
    (   N = 0
    ->  L = [[]]
    ;   N1 is N - 1,
        bn(N1, L1),
        unfold(L1, L)
    ).
If you need to avoid findall/3, then you need an aggregator to collect the binary numbers:
binary(N, L) :-
    collect_binaries(N, [], L).
You then generate one binary at a time and check whether it's already present in the aggregated list:
collect_binaries(N, R, L) :-
    length(B, N),
    make_binary(B),          % make a binary number of length N
    \+ memberchk(B, R),
    !,
    collect_binaries(N, [B|R], L).
If generating another binary fails, you are done:
collect_binaries(_, L, L).
Generating binaries is simple (I'm using the format you gave in your question: a list of 0/1 values). You iterate over all positions in the list and use either 1 or 0:
make_binary([]).
make_binary([H|T]) :-
    member(H, [1,0]),
    make_binary(T).
Result:
?- binary(2, L).
L = [[0, 0], [0, 1], [1, 0], [1, 1]]
Yes (0.00s cpu)