How to improve Tesseract read accuracy? - ocr

I want to get the following expected result. Can you give me any suggestions to improve the result?
Input image
Expected result
流 動 資 産
固 定 資 産
Actual result
産 産
資 資
動 定
To reproduce the result
$ git clone https://github.com/zono/ocr.git
$ cd ocr
$ git checkout 0f2541eac302dd1fe2efbbd3b36e7ba40a99d232
$ docker-compose up -d
$ docker exec -it ocr /bin/bash
# /usr/local/bin/tesseract /ocr/src/bssample7.png stdout -l jpn
産 産
資 資
動 定
Versions
$ docker -v
Docker version 19.03.5, build 633a0ea
# tesseract -v
tesseract 4.1.1-rc2-22-g08899
leptonica-1.79.0
libjpeg 8d (libjpeg-turbo 1.5.2) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11

You need to use another page segmentation method to get the expected result.
Try appending --psm 6 to your command so it looks like this:
$ tesseract /ocr/src/bssample7.png outputfilename -l jpn --psm 6
Here you can read about the different methods:
https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#page-segmentation-method
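If --psm 6 alone doesn't do it, a quick sketch (reusing the image path from the reproduction steps above) to compare a few segmentation modes and pick the one that reads best would be:
$ for psm in 3 5 6 11 12; do echo "--- psm $psm ---"; tesseract /ocr/src/bssample7.png stdout -l jpn --psm $psm; done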
Kind regards

I found the solution in Tesseract OCR Read Horizontally rather than Vertically C#
# /usr/local/bin/tesseract /ocr/src/bssample7.png stdout -l jpn --psm 6
流 動 資 産
固 定 資 産

How to generate code coverage by running lcov processes at the same time

I have a large project whose unit test binaries run on other machines, so the gcda files are generated there. I then download them to my local machine, but into different directories. Each of those directories contains the source code.
For example: dir gcda1/src/{*.gcda, *.gcno, *.h, *.cpp}..., dir gcda2/src/{*.gcda, *.gcno, *.h, *.cpp}....
Because the project is very large, I have to run multiple lcov processes at the same time to generate the info files, and then merge them.
The problem is that when I merge these info files, the directory prefixes are kept, for example:
gcda1/src/unittest1.cpp
gcda2/src/unittest1.cpp
I want this:
src/unittest1.cpp
#src/unittest1.cpp # this is expected to merge with above
The commands I use:
$ cd gcda1
$ lcov --rc lcov_branch_coverage=1 -c -d ./ -b ./ --no-external -o gcda1.info
$ cd ../gcda2
$ lcov --rc lcov_branch_coverage=1 -c -d ./ -b ./ --no-external -o gcda2.info
$ cd ..
$ lcov -a gcda1/gcda1.info -a gcda2/gcda2.info -o gcda.info
$ genhtml gcda.info -o output
The root dir contains the source code.
Description
Well, I finally found a method to solve this problem.
The info files lcov generates are plain text files, so we can edit them directly.
Once you open one of these files, you will see that every file entry starts with SF, like below:
SF:/path/to/your/source/code.h
SF:/path/to/your/source/code.cpp
...
Problem
In my problem, these will be:
// file gcda1.info
SF:/path/to/root_dir/gcda1/src/unittest1.cpp
// file gcda2.info
SF:/path/to/root_dir/gcda2/src/unittest1.cpp
And, after lcov merge, it will be:
// file gcda.info
SF:/path/to/root_dir/gcda1/src/unittest1.cpp
SF:/path/to/root_dir/gcda2/src/unittest1.cpp
But, I expect this:
// file gcda.info
SF:/path/to/root_dir/src/unittest1.cpp
Method
My method to solve the problem is editing the info files directly.
First, edit gcda1.info and gcda2.info, change /path/to/root_dir/gcda1/src/unittest1.cpp to /path/to/root_dir/src/unittest1.cpp, and /path/to/root_dir/gcda2/src/unittest1.cpp to /path/to/root_dir/src/unittest1.cpp.
Then merge them like below and generate html report:
$ lcov -a gcda1.info -a gcda2.info -o gcda.info
$ genhtml gcda.info -o output
In a large project we cannot edit every info file by hand; that would be overwhelming.
We can use sed to do the rewriting for us, like below:
$ sed 's/\(^SF:.*\/\)gcda[0-9][0-9]*\/\(.*\)/\1\2/g' gcda_tmp.info > gcda.info
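As a hedged end-to-end sketch (assuming the gcda1/gcda2 layout from the question, and using gcda_tmp.info as the intermediate merged file), the whole flow could look like this, with the two captures running in parallel:
$ ( cd gcda1 && lcov --rc lcov_branch_coverage=1 -c -d ./ -b ./ --no-external -o gcda1.info ) &
$ ( cd gcda2 && lcov --rc lcov_branch_coverage=1 -c -d ./ -b ./ --no-external -o gcda2.info ) &
$ wait
$ lcov -a gcda1/gcda1.info -a gcda2/gcda2.info -o gcda_tmp.info
$ sed 's/\(^SF:.*\/\)gcda[0-9][0-9]*\/\(.*\)/\1\2/g' gcda_tmp.info > gcda.info
$ genhtml gcda.info -o output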

Cannot change delimiter of output csv for es2csv cli

Could you please suggest what I am doing wrong? I cannot change the delimiter of the output file using the es2csv CLI tool.
es2csv -q '*' -i test_index -o test.csv -f id name -d /t
Actually this issue has been reported here: https://github.com/taraslayshchuk/es2csv/issues/51
If you don't want to wait for the fix to be released, you can change line 212 of es2csv.py like this and it will work:
csv_writer = csv.DictWriter(output_file, fieldnames=self.csv_headers, delimiter=unicode(self.opts.delimiter))
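Note also that /t in the question's command is not a tab character. Assuming a bash-like shell, a literal tab can be passed with ANSI-C quoting (the rest of the options are copied from the question):
$ es2csv -q '*' -i test_index -o test.csv -f id name -d $'\t'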

How to verify CuDNN installation?

I have searched many places but ALL I get is HOW to install it, not how to verify that it is installed. I can verify my NVIDIA driver is installed, and that CUDA is installed, but I don't know how to verify CuDNN is installed. Help will be much appreciated, thanks!
PS.
This is for a caffe implementation. Currently everything is working without CuDNN enabled.
The installation of CuDNN is just copying some files. Hence, to check whether CuDNN is installed (and which version you have), you only need to check those files.
Install CuDNN
Step 1: Register an NVIDIA developer account and download cuDNN here (about 80 MB). You might need nvcc --version to get your CUDA version.
Step 2: Check where your CUDA installation is. For most people it will be /usr/local/cuda/. You can check it with which nvcc.
Step 3: Copy the files:
$ cd folder/extracted/contents
$ sudo cp include/cudnn.h /usr/local/cuda/include
$ sudo cp lib64/libcudnn* /usr/local/cuda/lib64
$ sudo chmod a+r /usr/local/cuda/lib64/libcudnn*
Check version
You might have to adjust the path. See step 2 of the installation.
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
edit: In later versions this might be the following (credits to Aris)
$ cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
Notes
When you get an error like
F tensorflow/stream_executor/cuda/cuda_dnn.cc:427] could not set cudnn filter descriptor: CUDNN_STATUS_BAD_PARAM
with TensorFlow, you might consider using CuDNN v4 instead of v5.
Ubuntu users who installed it via apt: https://askubuntu.com/a/767270/10425
My answer shows how to check the version of CuDNN installed, which is usually something that you also want to verify. You first need to find the installed cudnn file and then parse this file. To find the file, you can use:
whereis cudnn.h
CUDNN_H_PATH=$(whereis cudnn.h)
If that doesn't work, see "Redhat distributions" below.
Once you find this location you can then do the following (replacing ${CUDNN_H_PATH} with the path):
cat ${CUDNN_H_PATH} | grep CUDNN_MAJOR -A 2
The result should look something like this:
#define CUDNN_MAJOR 7
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
Which means the version is 7.5.0.
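As a quick sanity check of the CUDNN_VERSION formula above, plugging those numbers in gives the combined version number:
$ echo $(( 7 * 1000 + 5 * 100 + 0 ))
7500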
Ubuntu 18.04 (via sudo apt install nvidia-cuda-toolkit)
This method of installation installs cuda in /usr/include and /usr/lib/cuda/lib64, hence the file you need to look at is in /usr/include/cudnn.h.
CUDNN_H_PATH=/usr/include/cudnn.h
cat ${CUDNN_H_PATH} | grep CUDNN_MAJOR -A 2
Debian and Ubuntu
From CuDNN v5 onwards (at least when you install via sudo dpkg -i <library_name>.deb packages), it looks like you might need to use the following:
cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
For example:
$ cat /usr/include/x86_64-linux-gnu/cudnn_v*.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 6
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 21
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
indicates that CuDNN version 6.0.21 is installed.
Redhat distributions
On CentOS, I found the location of CUDA with:
$ whereis cuda
cuda: /usr/local/cuda
I then used the procedure above on the cudnn.h file found at this location:
$ cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
To check the CUDA installation, run the commands below; if it is installed properly, they will not throw any error and will print the correct version of the library.
function lib_installed() { /sbin/ldconfig -N -v $(sed 's/:/ /' <<< $LD_LIBRARY_PATH) 2>/dev/null | grep $1; }
function check() { lib_installed $1 && echo "$1 is installed" || echo "ERROR: $1 is NOT installed"; }
check libcuda
check libcudart
To check the CuDNN installation, run the commands below; if CuDNN is installed properly, you will not get any error.
function lib_installed() { /sbin/ldconfig -N -v $(sed 's/:/ /' <<< $LD_LIBRARY_PATH) 2>/dev/null | grep $1; }
function check() { lib_installed $1 && echo "$1 is installed" || echo "ERROR: $1 is NOT installed"; }
check libcudnn
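If you want all three checks in one go, a small sketch that reuses the check helper defined above would be:
$ for lib in libcuda libcudart libcudnn; do check $lib; done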
OR
you can run the command below from any directory:
nvcc -V
It should give output something like this:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
Installing CuDNN just involves placing the files in the CUDA directory. If you have specified the paths and the CuDNN option correctly while configuring caffe, it will be compiled with CuDNN.
You can check that using cmake. Create a directory caffe/build and run cmake .. from there. If the configuration is correct you will see these lines:
-- Found cuDNN (include: /usr/local/cuda-7.0/include, library: /usr/local/cuda-7.0/lib64/libcudnn.so)
-- NVIDIA CUDA:
-- Target GPU(s) : Auto
-- GPU arch(s) : sm_30
-- cuDNN : Yes
If everything is correct, just run the make commands from there to build and install caffe.
Getting cuDNN Version [Linux]
Use following to find path for cuDNN:
cat $(whereis cudnn.h) | grep CUDNN_MAJOR -A 2
If above doesn't work try this:
cat $(whereis cuda)/include/cudnn.h | grep CUDNN_MAJOR -A 2
Getting cuDNN Version [Windows]
Use following to find path for cuDNN:
C:\>where cudnn*
C:\Program Files\cuDNN6\cuda\bin\cudnn64_6.dll
Then use this to dump version from header file,
type "%PROGRAMFILES%\cuDNN6\cuda\include\cudnn.h" | findstr "CUDNN_MAJOR CUDNN_MINOR CUDNN_PATCHLEVEL"
Getting CUDA Version
This works on Linux as well as Windows:
nvcc --version
When installing on Ubuntu via .deb, you can use sudo apt search cudnn | grep installed
I have cuDNN 8.0 and none of the suggestions above worked for me. The desired information was in /usr/include/cudnn_version.h, so
cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
did the trick.
On Ubuntu 20.04 LTS:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR
returns the expected result.
torch.backends.cudnn.version()
should do the trick
How about checking with Python code:
from tensorflow.python.platform import build_info as tf_build_info
print(tf_build_info.cudnn_version_number)
# 7 in v1.10.0
Run ./mnistCUDNN in /usr/src/cudnn_samples_v7/mnistCUDNN
Here is an example:
cudnnGetVersion() : 7005 , CUDNN_VERSION from cudnn.h : 7005 (7.0.5)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 30 Capabilities 6.1, SmClock 1645.0 Mhz, MemSize (Mb) 24446, MemClock 4513.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
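If the sample binary is not already built, the usual sequence (a sketch assuming the v7 samples package installed under /usr/src as above) is to copy the samples somewhere writable, build, and run:
$ cp -r /usr/src/cudnn_samples_v7/ ~
$ cd ~/cudnn_samples_v7/mnistCUDNN
$ make clean && make
$ ./mnistCUDNN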
For cuDNN 8.1 and above, use the following command:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
torch.backends.cudnn.is_available()

Creating custom DVD for RHEL 7 with kickstart

I am trying to create a custom CD/DVD to deploy RHEL 7 with kickstart file. Here is what I did:
Edited isolinux.cfg (in the ISOLinux folder) and grub.cfg file (in the EFI\BOOT folder).
Created ISO using mkisofs.
But it is not working. Am I using correct files/method?
Edit the ISO image and add the ks.cfg file that you have created.
Preferably, put the ks.cfg file inside a ks directory. More information can be found here.
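For the installer to actually use that file, the boot entry has to point at it. A hedged sketch of an isolinux.cfg entry (the volume label and paths are assumptions; adjust them to your media):
label kickstart
  menu label ^Install RHEL 7 with kickstart
  kernel vmlinuz
  append initrd=initrd.img inst.stage2=hd:LABEL=RHEL-7.1\x20Server.x86_64 inst.ks=cdrom:/ks/ks.cfg quiet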
You need to rebuild the ISO after adding the kickstart. Here is an example of what will work:
Add the kickstart file to your downloaded and extracted ISO.
Run this command from the directory containing the extracted ISO and kickstart, pointing the output to another location:
genisoimage -r -v -V "OEL6 with KS for OVM Manager" -cache-inodes -J -l -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -o OEL6U6_OVM_Manager.iso /var/www/html/Template/ISO/
I found the way to create custom DVD from the RHEL7 page.
Mount the downloaded image
mount -t iso9660 -o loop path/to/image.iso /mnt/iso
Create a working directory - a directory where you want to place the contents of the ISO image.
mkdir /tmp/ISO
Copy all contents of the mounted image to your new working directory. Make sure to use the -p option to preserve file and directory permissions and ownership.
cp -pRf /mnt/iso /tmp/ISO
Unmount the image.
umount /mnt/iso
Make sure your current working directory is the top-level directory of the extracted ISO image - e.g. /tmp/ISO/iso. Create the new ISO image using genisoimage:
genisoimage -U -r -v -T -J -joliet-long -V "RHEL-7.1 Server.x86_64" -Volset "RHEL-7.1 Server.x86_64" -A "RHEL-7.1 Server.x86_64" -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -eltorito-alt-boot -e images/efiboot.img -no-emul-boot -o ../NEWISO.iso .
Hope the answer is helpful.
I am editing my answer due to the comments posted. Here is a more comprehensive solution:
(A) You need to create the ISO properly. I found helpful information in this URL.
Here is the line that I actually ended up with, for my MBR/UEFI ISO creation:
mkisofs -U -A "<Volume Header>" -V "RHEL-7.1 x86_64" -volset "RHEL-7.1 x86_64" -J -joliet-long -r -v -T -x ./lost+found -o ${OUTPUT}/${HOST}.iso -b isolinux/isolinux.bin -c isolinux/boot.cat -no-emul-boot -boot-load-size 4 -boot-info-table -eltorito-alt-boot -e images/efiboot.img -no-emul-boot -boot-load-size 18755 /dir/where/sources/for/ISO/are/located
Be careful with the -V parameter, as it has to match what the kernel has defined for inst.stage2. In the default grub.conf included in the boot disk, it is configured to be "hd:LABEL=RHEL-7.1\x20x86_64" which matches with the settings above.
(B) You need the correct EFI setup for RHEL7. For some reason this has changed from RHEL6, where you could just use /EFI/BOOT/BOOTX64.conf; now the boot disk uses /EFI/BOOT/grub.cfg. Common wisdom from the Red Hat manuals says to add the inst.ks= parameter to the kernel line. The grub.cfg that comes in the /EFI/BOOT directory of the RHEL7 boot ISO actually uses the linuxefi parameter instead of the kernel one, but I would guess they work the same. If you are including the KS file on the CD, this should get you there.
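For reference, a hedged sketch of what such a grub.cfg menu entry might look like once inst.ks= is added (the label and file paths are assumptions based on the mkisofs example above; adjust them to your ISO):
menuentry 'Install RHEL 7.1 with kickstart' {
  linuxefi /images/pxeboot/vmlinuz inst.stage2=hd:LABEL=RHEL-7.1\x20x86_64 inst.ks=cdrom:/ks.cfg quiet
  initrdefi /images/pxeboot/initrd.img
}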
Good Luck!

Root access required for CUDA?

I am using a GeForce 8400M GS on Ubuntu 10.04 and I am learning CUDA programming. I am writing and running a few basic programs. I was using cudaMalloc, and it kept giving me an error until I ran the code as root. However, I had to run the code as root only once; after that, even if I run the code as a normal user, I no longer get an error on cudaMalloc. What's going on?
This is probably due to your GPU not being properly initialized at boot. I've come across this problem when using Ubuntu Server and other installations where an X server isn't being started automatically. Try the following to fix it:
Create a directory for a script to initialize your GPUs. I usually use /root/bin. In this directory, create a file called cudainit.sh with the following code in it (this script came from the Nvidia forums).
#!/bin/bash
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    N3D=`/usr/bin/lspci | grep -i NVIDIA | grep "3D controller" | wc -l`
    NVGA=`/usr/bin/lspci | grep -i NVIDIA | grep "VGA compatible controller" | wc -l`
    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i;
    done
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi
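Make the script executable so rc.local can invoke it:
chmod +x /root/bin/cudainit.sh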
Now we need to make this script run automatically at boot. Edit /etc/rc.local to look like the following.
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
#
# Init CUDA for all users
#
/root/bin/cudainit.sh
exit 0
Reboot your computer and try to run your CUDA program as a regular user. If I'm right about what the problem is, then it should be fixed.
To work with Ubuntu 14.04 I followed https://devtalk.nvidia.com/default/topic/699610/linux/334-21-driver-returns-999-on-cuinit-cuda-/ to add nvidia-uvm to /etc/modules, and to add a custom udev rule. Create /etc/udev/rules.d/70-nvidia-uvm.rules with this line:
KERNEL=="nvidia_uvm", RUN+="/bin/bash -c '/bin/mknod -m 666 /dev/nvidia-uvm c $(grep nvidia-uvm /proc/devices | cut -d \ -f 1) 0;'"
I don't understand why sudo modprobe nvidia-uvm works to create a proper /dev/nvidia-uvm (as does sudo cuda_program) but the /etc/modules listing requires the udev rule.
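To confirm that the module and the udev rule actually took effect after a reboot, the device nodes should exist, for example:
ls -l /dev/nvidia*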