We are using Google Vision OCR to extract text from receipts.
In some cases the receipt has some text printed vertically, such as VAT information.
The problem is that Google Vision only reads the text in the main orientation (horizontal, for example) and discards any text on the same receipt that is written vertically.
Is there a parameter that tells Google Vision to also pick up the vertically oriented text?
I have put an example image online with text in two orientations:
https://drive.google.com/file/d/0B8kZz-q27lGGSUl5V3RjXzBLNnc/view?usp=sharing
Text recognized by Google Vision:
Horizontal text line
Text I expected to be recognized:
Horizontal text line
Vertical text line
I know this is a late response, but maybe somebody will benefit from it in the future.
You can force the detector to recognize ONLY vertical text by rotating the frame before applying the detector, like this.
In the setRotation() method in the CameraSource, write:
outputFrame = new Frame.Builder()
        .setImageData(mPendingFrameData,
                mPreviewSize.getWidth(),
                mPreviewSize.getHeight(),
                ImageFormat.NV21)
        .setId(mPendingFrameId)
        .setTimestampMillis(mPendingTimeMillis)
        .setRotation(mRotation)
        .build();
mRotation = 2; // for vertical text running from bottom to top
mRotation = 1; // for vertical text running from top to bottom
I think that's a limitation of the Google Vision API. I searched for a way to do this too, and eventually used this solution. If, like me, you need only one orientation (vertical or horizontal), you can rotate the image on the client side (please see here on how to crop and rotate before upload).
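As a rough sketch of the client-side rotation idea, the snippet below uses Pillow to produce rotated copies of an image so that each copy can be sent to the OCR service separately. The helper name rotated_variants and the blank stand-in image are made up for illustration, and which angle makes your vertical text horizontal depends on the receipt, so treat the angles as something to experiment with.

```python
from PIL import Image


def rotated_variants(image, angles=(0, 90, 270)):
    """Return {angle: rotated copy}; Pillow rotates counter-clockwise.

    expand=True grows the canvas so nothing is cropped by the rotation.
    """
    return {angle: image.rotate(angle, expand=True) for angle in angles}


# Stand-in for a scanned receipt (hypothetical; load your own file with Image.open).
receipt = Image.new("RGB", (400, 200), "white")
variants = rotated_variants(receipt)

for angle, variant in variants.items():
    # Each variant would be uploaded to the OCR service on its own.
    print(angle, variant.size)
```

You would then run OCR once per variant and merge the results yourself; the merging step is not shown here.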
I have a problem when I use fitz to detect PDF layout: two paragraphs are detected as one text block if the line spacing between them is small.
For example, I want to detect the text and the isolated formula as two text blocks, but for now fitz detects them as one text block. How can I handle this?
Should I detect word coordinates and sort them into normal reading order, or use some method like that?
PyMuPDF also has ways to adjust the granularity of text extraction: there are more levels between and beyond block extraction and word extraction.
You can extract by line, by text span (both are a higher level than word) and by character (the level below word). All of them deliver the wrapping rectangles of the respective text, plus a plethora of font properties (font size, font weight, font style, font color) and the writing direction.
Here is an example that extracts lines of text:
details = page.get_text("dict", flags=fitz.TEXTFLAGS_TEXT)  # skips images!
for block in details["blocks"]:  # delivers the block level
    for line in block["lines"]:  # the lines in this block
        bbox = fitz.Rect(line["bbox"])  # wraps this line
        line_text = "".join([span["text"] for span in line["spans"]])
Please do have a look at this picture in the documentation - it shows an overview of the dictionary layout: https://pymupdf.readthedocs.io/en/latest/_images/img-textpage.png.
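Once you have one rectangle per line, one possible way to separate the formula from the surrounding text is to start a new group whenever the vertical gap between consecutive lines exceeds a threshold. The following is only a sketch in plain Python: split_by_gap, the max_gap value and the sample coordinates are all made up for illustration, and the tuples stand in for (bbox.y0, bbox.y1, line_text) collected in the loop above.

```python
def split_by_gap(lines, max_gap):
    """Group (y0, y1, text) lines into paragraphs by vertical distance.

    A new group starts when the top of a line is more than max_gap
    below the bottom of the previous line.
    """
    groups = []
    prev_bottom = None
    for y0, y1, text in sorted(lines):  # top-to-bottom order
        if prev_bottom is None or y0 - prev_bottom > max_gap:
            groups.append([])
        groups[-1].append(text)
        prev_bottom = y1
    return groups


# Hypothetical line data: two text lines, an isolated formula, more text.
sample = [
    (100.0, 112.0, "Some body text that"),
    (114.0, 126.0, "continues on a second line."),
    (140.0, 156.0, "E = mc^2"),
    (170.0, 182.0, "More body text."),
]
paragraphs = split_by_gap(sample, max_gap=6.0)
print(paragraphs)
```

The threshold has to be tuned per document, for example relative to the font size of the lines, and this simple version assumes a single-column page.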
I cannot find the Google documentation for updateParagraphStyle in batchUpdate.
I am attempting to align text in text boxes in Slides. The horizontal and vertical alignments seem to be in different places. Is horizontal alignment part of the paragraph styling? I have also read the entries for Docs and Sheets looking for the correct syntax. I don't want anyone to write my code; I just want to find the documentation.
My guess for vertical alignment within shape:
'updateShapeProperties': {
  ,"shapeProperties": {
    "contentAlignment": 'MIDDLE' // TOP, MIDDLE, BOTTOM
My guess for horizontal alignment within shape:
updateShapeProperties
'updateParagraphStyle': {
  'style': {
    "alignment": // CENTER, START, END, JUSTIFIED
My best guesses above. I never found paragraph style in slides documentation.
Your guess is correct.
Vertical Alignment can be set using UpdateShapePropertiesRequest -> ShapeProperties -> contentAlignment
Horizontal Alignment can be set using UpdateParagraphStyleRequest -> ParagraphStyle -> alignment
References:
Changing paragraph formatting
Updating text style
Updating paragraph style
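To make the two request types concrete, here is a sketch in Python that builds a batchUpdate body containing both of them. The helper name alignment_requests and the object ID "my_text_box_id" are placeholders; the field names follow the Slides API request types named above, and each request carries a field mask in "fields".

```python
def alignment_requests(object_id, vertical="MIDDLE", horizontal="CENTER"):
    """Build Slides API requests for vertical and horizontal text alignment."""
    return [
        {
            # Vertical alignment lives on the shape itself.
            "updateShapeProperties": {
                "objectId": object_id,
                "shapeProperties": {"contentAlignment": vertical},
                "fields": "contentAlignment",
            }
        },
        {
            # Horizontal alignment is a paragraph style on the shape's text.
            "updateParagraphStyle": {
                "objectId": object_id,
                "textRange": {"type": "ALL"},
                "style": {"alignment": horizontal},
                "fields": "alignment",
            }
        },
    ]


# "my_text_box_id" is a placeholder for the page element's real object ID.
body = {"requests": alignment_requests("my_text_box_id")}
print(body)
```

With the Google API Python client, a body like this would typically be passed to presentations().batchUpdate(presentationId=..., body=body).execute().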
I am adding a picture (some latex converted into a PNG using matplotlib) to my text using the following code:
par = doc.add_paragraph()
par.add_run().text = 'foo bar baz'
par.add_run().add_picture('pic.png')
par.add_run().text = 'blah blah blah'
This works OK, except that the picture pic.png is not vertically aligned with the rest of the text in the document:
I can get the alignment manually in MS Word by adding a character style with the advanced vertical alignment property set to "lowered by 10pt":
The problem is that I have no idea how to do this programmatically using python-docx. Conceptually, the steps would be to compute the size of the image, create a character style that is lowered by half that size minus half the font size, and apply the style to the run containing the picture. How do you create a raised or lowered font style in python-docx?
For reference, here is pic.png:
Your image has a fairly large (transparent) border around it. I added a single pixel border inside its extents here to make it visible:
I expect Word is aligning the bottom of the image with the baseline (as expected). One approach would be to see if there was a way you could specify zero bottom border.
You could also try subscript on that image run. I'm not sure what it would do but it's worth a try. So something like this:
run = par.add_run()
run.add_picture('x.png')
run.font.subscript = True
If you find the run that you manually set to "lowered by 10pt", you can view the XML for it like this (aircode):
run = vertically_adjusted_run() # however you get ahold of it
print(run._element.xml)
I expect you'll see something like this:
<w:r>
  <w:rPr>
    <w:position w:val="20"/>
    ...
... where the w:position element sets the adjustment from the baseline. The value is specified in half-points.
Anyway, neither this adjustment nor even that low-level element are supported by python-docx yet, so you'd need to get in there with lxml calls to do the needful if you wanted it badly enough.
I am quite new to the ZPL II language and have some trouble with writing text in reverse mode with the ^GB and ^FR commands. As far as I understood the ZPL language, when I want to print a text in reverse mode (white over black) I have to first draw a graphic box with the ^GB command and then set the field to be written to in reverse mode with the special ^FR command.
The problem I have is that I would like to fit the graphic box's width to the text's width. With the font I use, the ^A0 font, I couldn't find out the algorithm to calculate the correct width of the graphic box.
Depending on the text, if there are numbers or letters or both, the graphic box's width is not just (number of characters)*(width of one character)...
Here is the code I use:
^XA
^FO64,0,^GB70,20,10^FS
^FO64,0,^FR^A0N,32,37^FD0001^FS
^XZ
When using a mix of numbers and letters the graphic box doesn't fit anymore:
^XA
^FO64,0,^GB70,20,10^FS
^FO64,0,^FR^A0N,32,37^FDAW01^FS
^XZ
I would be very grateful to anyone who could give me the correct approach to my problem.
I do not believe that there is a way to make the graphic box auto-size based on the actual length of the text. I'd recommend using a mono-spaced font. That should easily allow you to calculate the width of the box based on the number of characters. Use Zebra Utilities to download a mono-spaced font to the printer.
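With a mono-spaced font the calculation becomes simple enough to do in whatever program generates the label. Here is a small Python sketch of that idea. The function name reverse_text_zpl, the 20-dot character width and the font command "^AAN,32,20" are placeholders: substitute the command for the font you actually downloaded and its real per-character advance width. The box thickness is set equal to its height on the assumption that this makes ^GB draw a solid box.

```python
def reverse_text_zpl(text, x, y, char_width, height, font_cmd):
    """Build a ZPL label with a filled box sized to a mono-spaced text field.

    char_width is the advance width of one character in dots; it must match
    the font selected by font_cmd or the box will not fit.
    """
    box_width = len(text) * char_width
    return "\n".join([
        "^XA",
        # Filled box behind the text (thickness == height, assumed solid).
        f"^FO{x},{y}^GB{box_width},{height},{height}^FS",
        # Same origin, field reverse print, then the text itself.
        f"^FO{x},{y}^FR{font_cmd}^FD{text}^FS",
        "^XZ",
    ])


# Placeholder font command and width; replace with your mono-spaced font's values.
label = reverse_text_zpl("AW01", x=64, y=0, char_width=20, height=32,
                         font_cmd="^AAN,32,20")
print(label)
```

Because every character has the same width, "0001" and "AW01" produce the same box, which is exactly what the proportional ^A0 font could not guarantee.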
When using QGraphicsTextItem for editing and rendering text the distance between bullet points and text (in any kind of list) is very small. Is there any way to increase this?
I tried setting a default style sheet on the QTextDocument but can't find a proper CSS property for this specific change.
Here is a sample of what it looks like. The red arrow shows the gap I am talking about:
Thanks,
Fabian