I am having a hard time working with Tesseract. Is there a way to improve the accuracy? How do I train it myself, if needed?
The only thing I am reading is the following characters: XYZ:-0123456789
That's it! The pictures always look that way.
Thanks!
The output of Tesseract 4.00alpha with your image is
$ tesseract ICKcj.png - -l eng
*: 4606 Y; 4809 Z; 698
Warning. Invalid resolution 0 dpi. Using 70 instead.
Resample the picture to 50% and set the dpi to 300:
The output with this image is slightly better, and the warning disappears:
$ tesseract ICKcj-50.png - -l eng
X: 4606 Y: 4809 Z: 698
The only things missing are the minus signs, which are printed quite irregularly (a better resolution in the picture could help). It is also possible to restrict the output pattern in Tesseract. Alternatively, you can try to guess the minus sign afterwards from the spacing between the X, Y, Z labels and the numbers.
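For the post-processing route, a minimal Python sketch of pulling the three values out of the recognized line might look like this. The name parse_readout and the tolerance for a misread ';' in place of ':' are my assumptions, based only on the two outputs shown above. It cannot recover a minus sign that Tesseract never emitted.

```python
import re

# One axis label (X, Y or Z), then ':' or a misread ';',
# then an optionally signed integer.
FIELD = re.compile(r"([XYZ])\s*[:;]\s*(-?\d+)")

def parse_readout(text):
    """Return a dict mapping each axis label found in text to its integer value."""
    return {axis: int(value) for axis, value in FIELD.findall(text)}
```

A label missing from the result (such as X in the first output, where it was read as '*') is a cheap signal that the image should be re-run with different preprocessing.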
When my colleague runs:
>> fir1(8,.5)
on his 2018b installation, he gets
ans =
-0.0000 -0.0227 0.0000 0.2740 0.4974 0.2740 0.0000 -0.0227 -0.0000
but when I run it on my 2017b installation, I get:
ans =
0.0093 0.0364 0.1157 0.2113 0.2547 0.2113 0.1157 0.0364 0.0093
We believe his result is correct, because when he runs
>> clf; plot(abs(freqz(fir1(100, 0.5 ),1, 800) ) ); grid on;
the plot crosses 0.5 at half the Nyquist (400 on the x axis) as expected.
But when I run it, the plot crosses 0.5 at around 130 on the x axis.
I did this to check the path:
>> which fir1
C:\Program Files\MATLAB\R2017b\toolbox\signal\signal\fir1.m
I am running 2017b; he is running 2018b. I did a WinMerge comparison between the two fir1.m files. They were different, but not in any way that might explain this, at least not that I could see.
Before running these tests I renamed three functions that had been shadowing Matlab functions and did
>> clear all
to be certain that those functions were no longer shadowing. They had been: gamma, isvector and xor.
I would be surprised if 2017b had a broken fir1. What could be causing this?
I have some images containing only digits and a colon.
Example:
You can see more here: https://imgur.com/a/54dsl6h
They seem pretty clean and straightforward to me, but Tesseract considers them as empty "pages" (Empty page!!).
I tried both with oem 1 and oem 0 with a character list:
tesseract processed/35.0.png stdout -c tessedit_char_whitelist=0123456789: --oem 0
tesseract processed/35.0.png stdout
What can I do to get Tesseract to recognize the characters better?
Tesseract still gives me pretty bad results overall, but making the text bolder with a simple dilation algorithm helped a bit.
In the end, since the font is really square, I used a trick: I defined a set of segments for each digit, and depending on which segments intersect or don't intersect the digit, I can determine with 99% accuracy which digit it is.
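A rough Python sketch of that kind of segment test is below. It is not the code I actually used. It assumes the glyph is already cropped and binarized into rows of '#' (ink) and '.' (background), it probes a single point per segment instead of a whole line, and it borrows the standard seven-segment layout. The names SEGMENT_PROBES, DIGITS and classify are made up for the illustration.

```python
# Seven-segment names: a=top, b=upper right, c=lower right,
# d=bottom, e=lower left, f=upper left, g=middle.
# Each probe is (row fraction, column fraction) inside the glyph box.
SEGMENT_PROBES = {
    "a": (0.0, 0.5),
    "b": (0.25, 1.0),
    "c": (0.75, 1.0),
    "d": (1.0, 0.5),
    "e": (0.75, 0.0),
    "f": (0.25, 0.0),
    "g": (0.5, 0.5),
}

# Which segments are inked for each digit.
DIGITS = {
    frozenset("abcdef"): 0,
    frozenset("bc"): 1,
    frozenset("abdeg"): 2,
    frozenset("abcdg"): 3,
    frozenset("bcfg"): 4,
    frozenset("acdfg"): 5,
    frozenset("acdefg"): 6,
    frozenset("abc"): 7,
    frozenset("abcdefg"): 8,
    frozenset("abcdfg"): 9,
}

def classify(glyph):
    """Return the digit for a list of equal-length strings, or None if unknown."""
    rows, cols = len(glyph), len(glyph[0])
    inked = set()
    for name, (row_frac, col_frac) in SEGMENT_PROBES.items():
        r = round(row_frac * (rows - 1))
        c = round(col_frac * (cols - 1))
        if glyph[r][c] == "#":
            inked.add(name)
    return DIGITS.get(frozenset(inked))
```

For a real font the probe positions would have to be tuned by hand, and testing a short line of pixels per segment is more robust to noise than a single point.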
I have a project where I have to recognize the frequency from an audio file. For this I use a single tone of 10 kHz to see if I can get it working.
Since I am pretty new to Octave, I tried this example with my own audio file.
I tried to understand what happens by researching all the functions involved.
My question is this. If I let specgram plot the figure without specifying its output:
specgram(y,fftn,Fs,hanning(window),step);
it gives a line at 10kHz which is what I want.
But if I specify the output for the specgram function
[S,f,t]= specgram(y,fftn,Fs,hanning(window),step);
and let it plot, it plots the line at 18 kHz.
I figured it has to be in the inputs for the figure, and I tried modifying these a bit, but every time I do that Octave gives an error.
I need the frequency as a returned output, since I have to do some calculations with it, so I figured I need to specify the frequency output.
This is the part of the code that specify the plot for the spectrogram:
step= fix(5*Fs/1000); % stepsize of the window
window= fix(90*Fs/1000); % window size
fftn =2^nextpow2(window); % Size of the FFT block
[S,f,t]= specgram(y,fftn,Fs,hanning(window),step);
S= abs(S(2:fftn*12000/Fs,:)); % Keep the magnitudes of the bins up to 12 kHz (drops the phase)
S= S/max(S(:)); % Normalize the energy
S= max(S, 10^(-40/10)); % Clip values below -40 dB
S= min(S, 10^(-3/10)); % Clip values above -3 dB
figure
imagesc(t,f,(log(S)));
Can anyone help me get the frequency data from the audio file so I can use it in some calculations?
I have already searched the Octave manual and various Matlab sites for answers. I also checked many posts here, such as:
How does Octave spectrogram 'specgram' from signal work?
Methodology of FFT for Matlab spectrogram / short time Fourier transform functions
P.S. Sorry for my bad English, it's not my native language
I found the answer myself. It turns out the problem is in this line of code:
S= abs(S(2:fftn*12000/Fs,:));
If I delete this line, the lines are placed at the right frequency in the figure. To me it looks like this line just takes a small part of the FFT and replaces it with other frequencies, but I'm not sure about that.
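A possible explanation, as far as I can tell: the line keeps only the rows of S up to about 12 kHz, while f still runs up to Fs/2. imagesc(t,f,...) only uses the first and last entries of f to label the axis, so the 0 to 12 kHz rows get stretched over the full 0 to Fs/2 range. The small Python calculation below checks that idea. The sample rate of 44100 Hz is my assumption (it is not stated above), and displayed_frequency is just an illustrative name.

```python
def displayed_frequency(true_hz, kept_max_hz, axis_max_hz):
    """Position at which a tone is drawn when rows covering 0..kept_max_hz
    are stretched over an axis labelled 0..axis_max_hz."""
    return true_hz * axis_max_hz / kept_max_hz

fs = 44100  # assumed sample rate in Hz, not given in the question
print(displayed_frequency(10_000, 12_000, fs / 2))  # 18375.0
```

That lands close to the 18 kHz I saw. If this is right, trimming f with the same index range as S before plotting should keep the axis aligned without deleting the line.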
I made this graph in wolfram alpha by accident:
Can you write code to produce a larger version of this pattern?
Can you make similar looking patterns?
Readable code in any language is good, but something that can be run in a browser would be best (i.e. JavaScript / Canvas). If you write code in other languages, please include a screenshot.
Notes:
The input formula for the above image is: arg(sin(x+iy)) = sin^(-1)((sqrt(2) cos(x) sinh(y))/sqrt(cosh(2 y)-cos(2 x))) (link)
You don't have to use the above formula. Anything which produces a similar result would be cool. But "reverse engineering" Wolfram Alpha would be best
The two sides of the equation are equal (I think), so WA should probably have returned only 'true' instead of the graph
The pattern is probably the result of rounding errors.
I don't know if the pattern was generated by iterating over every pixel or if it's vector based (points and lines). My guess is with vector.
I don't know what causes this type of pattern ('Rounding errors' is the best guess.)
IEEE floating point standard does not say how sin or cos, etc should work, so trig functions vary between platforms and architectures.
No Brownian motion plots please
Finally, here's another example which might help in your mission: (link)
As you asked for similar looking patterns in any language, here is the Mathematica code (really easy since Wolfram Alpha is based on Mathematica)
Edit
It is indeed a roundoff effect:
If we set:
and make a plot
Plot3D[f[x, y], {x, 7, 9}, {y, -8, -9},WorkingPrecision -> MachinePrecision]
The result is:
But if we extend the precision of the plot to 30 digits:
Plot3D[f[x, y], {x, 7, 9}, {y, -8, -9},WorkingPrecision -> 30]
We get
and the roughness is gone (which caused your scribbly pattern)
BTW, your f[x,y] is a very nice function:
So if I managed to copy your formulas without errors (which should be considered a miracle), both sides of your equation are equal only in certain periodic ranges in x, probably of the form [2 n Pi, (2 n + 1) Pi]
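For anyone without Mathematica, here is a hedged Python sketch of the same check. It assumes f is the difference between the two sides of the equation, since the definition is not reproduced above. The names lhs, rhs and max_residual are illustrative. Since |sin(x+iy)|^2 = (cosh(2y) - cos(2x))/2, the right-hand side is asin of the imaginary part over the modulus, which can only equal the argument when the real part sin(x) cosh(y) is non-negative, and that would fit the periodic ranges guessed above.

```python
import cmath
import math

def lhs(x, y):
    # arg(sin(x + iy))
    return cmath.phase(cmath.sin(complex(x, y)))

def rhs(x, y):
    # asin( sqrt(2) cos(x) sinh(y) / sqrt(cosh(2y) - cos(2x)) )
    numerator = math.sqrt(2) * math.cos(x) * math.sinh(y)
    denominator = math.sqrt(math.cosh(2 * y) - math.cos(2 * x))
    return math.asin(numerator / denominator)

def max_residual(n=50):
    """Largest |lhs - rhs| on an n-by-n grid over x in [7, 9], y in [-9, -8]."""
    worst = 0.0
    for i in range(n):
        for j in range(n):
            x = 7 + 2 * i / (n - 1)
            y = -9 + j / (n - 1)
            worst = max(worst, abs(lhs(x, y) - rhs(x, y)))
    return worst
```

On the plotted window the residual should be at the level of double-precision noise, which a plot then magnifies into visible roughness. At a point such as x = 4, where sin(x) is negative, the two sides differ by an amount far too large to be rounding.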
95 bytes currently in python
I,V,X,L,C,D,M,R,r=1,5,10,50,100,500,1000,vars(),lambda x:reduce(lambda T,x:T+R[x]-T%R[x]*2,x,0)
Here are a few test results. It should work for 1 to 3999 (assuming the input contains only valid characters).
>>> r("I")
1
>>> r("MCXI")
1111
>>> r("MMCCXXII")
2222
>>> r("MMMCCCXXXIII")
3333
>>> r("MMMDCCCLXXXVIII")
3888
>>> r("MMMCMXCIX")
3999
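For readers who want to see what the one-liner does, here is an ungolfed Python 3 sketch of the same fold. The names VALUES and roman_to_int are only for illustration, and like the golfed version it assumes valid input.

```python
from functools import reduce

VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    def step(total, symbol):
        value = VALUES[symbol]
        # total % value picks out a smaller-valued tail that was already
        # added (e.g. the I in IX). Subtract it twice: once to undo the
        # addition and once to apply the subtraction.
        return total + value - 2 * (total % value)
    return reduce(step, numeral, 0)
```

For "IX" the running total goes 1, then 1 + 10 - 2*1 = 9.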
And this is not a duplicate of this; it is the reverse conversion.
So, is it possible to make that shorter in Python, or could other languages like Ruby do it in fewer bytes?
Shortest solutions from codegolf.com
There was a "Roman to decimal" competition over at Code Golf some time ago. (Well, actually it's still running because they never end.) A Perl golfer by the name of eyepopslikeamosquito decided to win all four languages (Perl, PHP, Python, and Ruby), and so he did. He wrote a fascinating four-part series "The golf course looks great, my swing feels good, I like my chances" (part II, part III, part IV) describing his approaches over at Perl Monks.
Here are his solutions:
Ruby, 53 strokes
n=1;$.+=n/2-n%n=10**(494254%C/9)%4999while C=getc;p$.
Perl, 58 strokes
$\+=$z-2*$z%($z=10**(19&654115/ord)%1645)for<>=~/./g;print
He also has a 53-stroke solution, but it probably doesn't work right now: (it uses the $^T variable during a few second period in 2011!)
$\+=$z-2*$z%($z=10**(7&$^T/ord)%1999)for<>=~/./g;print
PHP, 70 strokes
<?while(A<$c=fgetc(STDIN))$t+=$n-2*$n%$n=md5(o²Ûö¬Ñ.$c)%1858+1?><?=$t;
The six weird characters in the md5(..) are chr(111).chr(178).chr(219).chr(246).chr(172).chr(209) in Perl notation.
Python, 78 strokes
t=p=0
for r in raw_input():n=10**(205558%ord(r)%7)%9995;t+=n-2*p%n;p=n
print t
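The Python entry is Python 2 and reads from stdin. A Python 3 sketch of the same idea as plain functions is below, so the magic-number hash can be checked by hand. The names symbol_value and roman_to_int_hashed are mine, not the golfer's.

```python
def symbol_value(ch):
    # Magic-number hash from the golfed solution: it maps the character
    # codes of I, V, X, L, C, D, M to 1, 5, 10, 50, 100, 500, 1000.
    return 10 ** (205558 % ord(ch) % 7) % 9995

def roman_to_int_hashed(numeral):
    total = prev = 0
    for ch in numeral:
        value = symbol_value(ch)
        # (2 * prev) % value equals 2 * prev when the previous symbol is
        # smaller (a subtractive pair) and 0 otherwise, for valid numerals.
        total += value - 2 * prev % value
        prev = value
    return total
```

For example, 205558 % ord("V") is 18, 18 % 7 is 4, and 10**4 % 9995 is 5.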
Python - 94 chars
cheap shot :)
I,V,X,L,C,D=1,5,10,50,100,500
M,R,r=D+D,vars(),lambda x:reduce(lambda T,x:T+R[x]-T%R[x]*2,x,0)
Actually defining my own fromJust is smaller, a total of 98
r=foldl(\t c->t+y c-t`mod`y c*2)0 --34
y x=f$lookup x$zip"IVXLCDM"[1,5,10,50,100,500,1000] --52
f(Just x)=x --12
-- assumes correct input
Haskell gets close.
import Data.Maybe --18
r=foldl(\t c->t+y c-t`mod`y c*2)0 --34
y x=fromJust$lookup x$zip"IVXLCDM"[1,5,10,50,100,500,1000] --59
total bytes = 111
Would be 93 if I didn't need the import for fromJust
Adopting a response from Jon Skeet to a previously asked similar question:
In my custom programming language "CPL1839079", it's 3 bytes:
r=f