Octave dataframe's problem with isna() function - octave

I use the Octave dataframe package to read a CSV file. For some columns isna() can count the number of NA values, but for others it cannot. How can I fix this?
I suspect the columns that do not work are the ones with char type. I want to use Octave for data science practice too. Thank you in advance.

Related

how to convert/match a handwritten list of names? (HWR)

I would like to see if I can scan a sign-in sheet for a class. The good news is I know 90% of the names that might be written.
My idea was to use Tesseract to parse an image of the names, then use the Levenshtein algorithm to compare each line with the list of names in my database. If a line is a reasonably close match to a known name, I treat that name as the right one.
Does this approach sound like a good one? If not, other ideas?
I tried using tesseract on a sample sheet (see below)
I used:
tesseract simple.png -psm 4 outtxt
Tesseract Open Source OCR Engine v3.05.01 with Leptonica
Warning. Invalid resolution 0 dpi. Using 70 instead.
Error in boxClipToRectangle: box outside rectangle
Error in pixScanForForeground: invalid box
I am assuming it didn't like line 2 because I went below the line.
The results I got were:
1.. AM: (harm;
l. ’E (J 22 a 00k
2‘ wau \\) [HQ
4. KIM TAYLOE
5. LN] Davis
6‘ Mzflé! Ha K
Obviously not the greatest, my guess is the distance matches for 4 & 5 would work, but the rest are not even close.
I have control of my sign-in sheet, but not of the handwriting of the people coming in, so if there are any changes I can make to the sheet that would help, please let me know.
Since your goal is to get names only, I would suggest restricting tessedit_char_whitelist to uppercase English letters, digits, and the period ("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789.") so that you do not get unexpected characters such as \\) [ in the output.
Your initial approach of calculating the Levenshtein distance is fine if you succeed in extracting text from the handwritten image (which is a hard task for Tesseract).
I would also suggest running some preprocessing on your image. For example, you can remove the horizontal lines and extract text ROIs around them. In the best case you will be able to extract separate characters, but even if you cannot, you will get better results and will be able to distinguish the resulting names line by line.
You should also try the other recommended steps for improving output quality, which you can find in the Tesseract OCR wiki.
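The matching step described in the question could look roughly like the Python sketch below. It is only an illustration under assumptions: the roster in KNOWN_NAMES, the helper name best_match, and the 0.4 cutoff are all made up for this example and would need tuning against real sheets.

```python
def levenshtein(a, b):
    """Return the edit distance between a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def best_match(ocr_line, known_names, max_ratio=0.4):
    """Return the closest known name, or None if nothing is close enough.

    max_ratio is an assumed cutoff: edit distance divided by the longer length.
    """
    # Keep only letters and spaces, so list numbers and OCR punctuation are dropped.
    cleaned = "".join(ch for ch in ocr_line.upper() if ch.isalpha() or ch == " ").strip()
    if not cleaned:
        return None
    best_name, best_dist = None, None
    for name in known_names:
        dist = levenshtein(cleaned, name.upper())
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist / max(len(cleaned), len(best_name)) <= max_ratio:
        return best_name
    return None


# Hypothetical roster; in practice this would come from the class database.
KNOWN_NAMES = ["Kim Taylor", "Lin Davis", "Adam Charm"]

for line in ["4. KIM TAYLOE", "5. LN] Davis", "wau ) [HQ"]:
    print(repr(line), "->", best_match(line, KNOWN_NAMES))
```

Normalizing the distance by the string length keeps short names from matching too easily; lines that return None can be flagged for manual review.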

How to convert .caffemodel into .mat format

I want to use my fine-tuned Caffe model in MatConvNet for further processing. But the script provided by Andrea Vedaldi at
https://github.com/vlfeat/matconvnet/blob/master/utils/import-caffe.py
produces the following error, which I am not able to fix:
google.protobuf.text_format.ParseError: 8:1 : Message type "caffe.NetParameter" has no field named "layer".
When I pass my caffe.proto file as an argument to import-caffe.py, I get the same error:
google.protobuf.text_format.ParseError: 8:1 : Message type "caffe.NetParameter" has no field named "layer".
I would be grateful to hear from anyone who has already solved this issue.
Thanks in advance
Tharun
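Basically, the error message says that the message type caffe.NetParameter, as defined in the proto file being used, has no field named layer, yet your prototxt file uses that field. It may be a typo or a mismatch between the two files. Check the prototxt file at line 8, column 1. My guess is that the name of the field should be layers.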
Good luck!

Cannot plot solution of an ODE using a large domain

I'm trying to solve a system of ODEs. I'm using the ode45d command like so:
[t,x] = ode45d(@f, (t = linspace(0,100,1000)'),
[sh0; ih0; rh0; sm0; lm0; im0; se0; ie0], [7], ones(8,1));
When I enter the system of ODEs and this command in Octave, I get the right graph. The problem is that when I extend the x-axis, the graph changes to something very strange.
When I make the domain of the ODEs larger than 100, the only thing I get is a vertical line on the graph.
Is there someone who knows the function ode45d and its limitations, and who can tell me why this is happening?
Thanks in advance, and sorry for my English.

New to Python - proficient with Matlab: getting error "IndexError: list index out of range"

As the title says, I'm proficient with Matlab and already have this function written there, where it works great. I wanted to learn a new language and was pointed to Python, so I figured I would write a simple function to get used to the syntax and have something to validate against. I wrote the function "Xfcn" (the non-dimensional mass flow in rocket problems), and it gives me the correct number if I pass only one value. Now I'd like to plot the X-function versus Mach and validate it against my Matlab version. I need to loop through a Mach vector and then plot it; plotting comes later. I'm getting the error mentioned above, and I think it's a simple indexing problem, although I can't figure out what it is. I've looked here and in Python's documentation, so hopefully we can resolve this quickly. I've also checked the type of "i" and printed range(len(Ms)), which gives 0 to 49 in steps of 1, as I expect. The values of Ms run from 0 to 1 in equally spaced increments, also as I expect, so I cannot figure out where my error is. My code is below.
from Xfcn import Xfcn
import pylab as pyl
import numpy as np
Ms = np.linspace(0,1,endpoint=True)
X = []
for i in range(len(Ms)):
    X[i][0] = Xfcn(Ms[i])
print X
print 'Done.'
Thanks for the help!
BL
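You created X as an empty one-dimensional list and are trying to access it as if it were two-dimensional. Unlike in Matlab, assigning to an index does not grow a Python list, so X[i] already raises the IndexError on the first iteration, before the [0] is even reached. Append each value to the list instead.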
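A minimal sketch of the fix follows. Since the real Xfcn is not shown in the question, xfcn_stub is a hypothetical stand-in, and the Mach values are built without NumPy so the snippet runs on its own; they mirror np.linspace(0, 1), which defaults to 50 points.

```python
def xfcn_stub(mach):
    """Hypothetical placeholder for the asker's Xfcn; just squares its input."""
    return mach * mach


# 50 equally spaced Mach numbers on [0, 1].
machs = [i / 49 for i in range(50)]

# Option 1: start with an empty list and grow it with append.
x_values = []
for m in machs:
    x_values.append(xfcn_stub(m))

# Option 2: the same result as a list comprehension.
x_values_2 = [xfcn_stub(m) for m in machs]

print(len(x_values), x_values[0], x_values[-1])
print('Done.')
```

If a preallocated array is preferred, as is common in Matlab, np.zeros(len(Ms)) followed by X[i] = Xfcn(Ms[i]) should also work, because a NumPy array of fixed size can be assigned by index.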

How to connect SentiWordNet to RapidMiner?

SentiWordNet is a text file. In RapidMiner, the 'Open WordNet Dictionary' operator can only be used to access exe files. How can I extract the sentiment scores from SentiWordNet for further processing?
Thanks in Advance.
Of course you can. With a little bit of code you can read the SentiWordNet scores from the text file.
The problem is that the same word may have several different meanings, each with its own score.
To handle this, you can simply take the average score or do word sense disambiguation.
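As a rough illustration of the averaging option, the Python sketch below reads lines in what I assume is the SentiWordNet 3.0 layout (tab-separated POS, ID, PosScore, NegScore, SynsetTerms, Gloss, with comment lines starting with #) and averages PosScore minus NegScore over all senses of each word. The function name load_sentiwordnet and the SAMPLE lines are made up for this example; the scores in SAMPLE are not real SentiWordNet values.

```python
from collections import defaultdict


def load_sentiwordnet(lines):
    """Map each word to PosScore - NegScore averaged over all of its senses."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip lines that do not match the assumed layout
        pos_score, neg_score = float(fields[2]), float(fields[3])
        # SynsetTerms looks like "good#1 fine#2"; drop the "#sense" suffix.
        for term in fields[4].split():
            word = term.rsplit("#", 1)[0]
            totals[word] += pos_score - neg_score
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


# Made-up sample lines in the assumed format, not real SentiWordNet data.
SAMPLE = [
    "# POS\tID\tPosScore\tNegScore\tSynsetTerms\tGloss",
    "a\t00000001\t0.75\t0\tgood#1\tmade-up gloss",
    "a\t00000002\t0.5\t0.25\tgood#2 fine#1\tmade-up gloss",
    "a\t00000003\t0\t0.625\tbad#1\tmade-up gloss",
]

scores = load_sentiwordnet(SAMPLE)
print(scores)
```

With a real file, the lines could come from open(path, encoding="utf-8"). The resulting word-to-score table can then be written out as CSV and joined to the tokens in RapidMiner, though that last step is only a suggestion.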