How to get a single contiguous text block for all text from Google Cloud Vision OCR?

The problem is that I want to use Google Cloud Vision to scan receipts. A receipt is always one block of contiguous text, but the default response from the documentTextDetection API is sectioned into blocks, often splitting item name and price in variable ways. That is useful in general, but not for this case.
Is there a way to tell the documentTextDetection API to return one single block? If not, is there an example of stitching the vertices together to get the same result?
Sample Receipt Image input:
Beer £2.99
Coffee £6.99
Chocolate £0.99
Response:
{textAnnotations: [
  // block 1: prices
  {description: '£2.99'},
  {description: '£6.99'},
  {description: '£0.99'},
  // block 2: item names
  {description: 'Beer'},
  {description: 'Coffee'},
  {description: 'Chocolate'}
]}
Desired response, where each item name is matched to its price:
{textAnnotations: [
  {description: 'Beer £2.99'},
  {description: 'Coffee £6.99'},
  {description: 'Chocolate £0.99'}
]}

You need to assemble the response by hand, iterating over full_text_annotation: pages, blocks, paragraphs, words, symbols. The symbol.property.detected_break attribute tells you when there is a break; in your case, if a symbol is the last one on its line, its detected_break type is 3 (EOL_SURE_SPACE).
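For example, here is a minimal sketch of that traversal with the Python client, assuming response is the AnnotateImageResponse returned by client.document_text_detection(image=image) (note that in older client versions the break field is named type rather than type_):
# Rebuild line-oriented text from full_text_annotation.
# Break type values (TextAnnotation.DetectedBreak.BreakType):
#   1 = SPACE, 2 = SURE_SPACE, 3 = EOL_SURE_SPACE, 5 = LINE_BREAK
lines = []
line = ""
for page in response.full_text_annotation.pages:
    for block in page.blocks:
        for paragraph in block.paragraphs:
            for word in paragraph.words:
                for symbol in word.symbols:
                    line += symbol.text
                    break_type = symbol.property.detected_break.type_
                    if break_type in (1, 2):    # space between words
                        line += " "
                    elif break_type in (3, 5):  # last symbol on the line
                        lines.append(line)
                        line = ""
if line:
    lines.append(line)
single_block = "\n".join(lines)
Note this still walks the text block by block; if an item name and its price land in different blocks, you may additionally need to group words by the Y coordinates of their bounding-box vertices before joining each line, which is the "stitching the vertices together" part.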

Related

Autodesk Forge PDF Viewer and Measuring

I am working on an app that needs to calculate measurements like areas, lengths, etc. Luckily, today we can do that using the Autodesk Forge viewer. I have looked into this blog post:
https://aps.autodesk.com/blog/fast-pdf-viewingmarkup-inside-forge-viewer
and also into the docs:
https://aps.autodesk.com/en/docs/viewer/v7/reference/Extensions/MeasureExtension/
I am looking for a way to insert the measured values into my database, so that I can view them again whenever I want or reload the page without losing them, and similarly for markups with callouts and text.
Lastly, I am wondering how much it costs to translate the PDF files using Forge?
Thanks
You can retrieve the array of objects describing the measurements made, using this line:
NOP_VIEWER.getExtension('Autodesk.Measure').measureTool.getMeasurementList()
You can store the result in your DB, together with the view state and additional info such as the URN and the viewable GUID.
To restore it, you can first activate the tool
NOP_VIEWER.getExtension('Autodesk.Measure').activate()
Then set the measurement list using the values you read from the DB
NOP_VIEWER.getExtension('Autodesk.Measure').measureTool.setMeasurements(listMeasurements)
Where listMeasurements will be something like:
var listMeasurements = [
{
angle: "0.0 °",
arc: "0.0 mm",
area: "0.0 mm²",
deltaX: "1569.7 mm",
deltaY: "6463.7 mm",
deltaZ: "162.0 mm",
distance: "6653.6 mm",
from: "Vertex",
location: "X: 0.0 mm\nY: 0.0 mm\nZ: 0.0 mm",
picks: [
{intersection: {x:43.5168342590332,y:-60.37924575805664,z: 8.858267784118652}, modelId: 2, viewportIndex2d: null, snapNode: 2587},
{intersection: {x: 38.367037573210276,y: -39.17272345572108,z: 8.32677173614502}, modelId: 2, viewportIndex2d: null, snapNode: 3521}
],
precision: 1,
text: "",
to: "Vertex",
type: "Distance",
unitType: "mm"
}
]
Now, you can deactivate it with one line of code
NOP_VIEWER.getExtension('Autodesk.Measure').deactivate()
Instead of using NOP_VIEWER, refer to your viewer instance through the variable defined in your code

How do I couple segmented CF traffic data (SS) with shape data (SHP)?

The "Road Shape and Road Class Filters" resource's example seems to imply that CF data is mapped to the current shape. This is because CF and SHP tags are siblings.
<?xml version="1.0" encoding="UTF-8"?>
<TRAFFICML_REALTIME CREATED_TIMESTAMP="2017-06-02T18:10:48Z" MAP_VERSION="" UNITS="imperial" VERSION="3.2" xmlns="http://traffic.nokia.com/trafficml-flow-3.2">
<RWS EBU_COUNTRY_CODE="1" EXTENDED_COUNTRY_CODE="A0" MAP_VERSION="201702" TABLE_ID="7" TY="TMC" UNITS="imperial">
<RW DE="Binford Blvd" LI="107+01100" PBT="2017-06-02T18:10:13Z" mid="1fe417f0-f17e-47b8-b0b0-b67a71eec11d|">
<FIS>
<FI>
<TMC DE="E Kessler Boulevard East Dr" LE="1.5983" PC="8367" QD="-"/>
<SHP FC="3">39.8405,-86.11263 39.84072,-86.11237</SHP>
<SHP FC="3">39.84072,-86.11237 39.8413,-86.11168</SHP>
<SHP FC="3">39.8413,-86.11168 39.84181,-86.11106 39.84235,-86.11039 39.84307,-86.10953 39.84487,-86.10738 39.84663,-86.10527 39.84747,-86.10427 39.84793,-86.10369</SHP>
<SHP FC="3">39.84793,-86.10369 39.84886,-86.10255 39.84949,-86.10172 39.85041,-86.10046 39.85088,-86.09985 39.85137,-86.09926 39.85169,-86.09888 39.85203,-86.09854 39.85237,-86.09821 39.85272,-86.09789 39.85307,-86.09758 39.85343,-86.09729 39.8542,-86.09673 39.85502,-86.09616</SHP>
<SHP FC="3">39.85502,-86.09616 39.85534,-86.09595 39.85631,-86.09528 39.85691,-86.09487 39.85751,-86.09443</SHP>
<SHP FC="3">39.85751,-86.09443 39.85808,-86.09399 39.85836,-86.09379</SHP>
<CF CN="0.97" FF="47.85" JF="1.39455" SP="39.84" SU="39.84" TY="TR"/>
</FI>
<!-- ... -->
</FIS>
</RW>
</RWS>
</TRAFFICML_REALTIME>
This is useful, since it tells me the exact road shape and its corresponding traffic data.
This is not the case when the Flow Item is broken up into multiple segments. Here is a sample JSON that I'm working with:
...
{
"FIS":[
{
"FI":[
{
...
"SHP":[
{
"value":[
"51.24274,7.13212 51.24311,7.13263 51.2432,7.13277 "
],
"FC":3
},
{
"value":[
"51.2432,7.13277 51.24345,7.13314 51.24363,7.13346 51.24382,7.13381 51.24398,7.13408 51.24408,7.13423 51.24418,7.13436 "
],
"FC":3
},
...
],
"CF":[
{
"SSS":{
"SS":[
{
"LE":1.07,
"SP":50.0,
"SU":52.63,
"FF":49.18,
"JF":0.0
},
{
"LE":0.37,
"SP":25.67,
"SU":25.67,
"FF":26.74,
"JF":0.37504
},
...
As you can see, the CF segments are decoupled from the shape of the road, unlike the previous XML example.
Is there any way to interpret this data that couples traffic congestion with the shape of the road?
Each traffic flow item consists of location data for a road segment, represented in different location references (TMC, SHP), and a current flow (CF) field describing the current traffic conditions.
If there are different traffic conditions within the same road segment, then sub-segments (SS) are included to provide more granular conditions, while CF carries the aggregated information.
The data model does not provide location data for each sub-segment, but it can be derived from the length (LE) available on each SS: traverse the shape points, accumulating distance, until you reach the sub-segment's length (preferably as a proportion of the total length, since lengths may vary between map versions and the length computed from shape points does not match the map exactly), then continue the same way until the last SS.
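A rough sketch of that traversal in Python (the helper names are mine, and the haversine distance plus proportional scaling implement the heuristic described above, not something the API provides; since each LE is scaled against the total, the unit LE is expressed in does not matter):
import math

def haversine_km(p, q):
    # Great-circle distance between two (lat, lon) points, in km.
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def split_shape_by_subsegments(points, subsegment_lengths):
    # points: flat list of (lat, lon) tuples from all SHP values of one FI.
    # subsegment_lengths: the LE of each SS, in document order.
    edge_lengths = [haversine_km(points[i], points[i + 1])
                    for i in range(len(points) - 1)]
    total = sum(edge_lengths)
    scale = total / sum(subsegment_lengths)
    # Cumulative targets along the shape, scaled to the shape's true length.
    targets, cum = [], 0.0
    for le in subsegment_lengths:
        cum += le * scale
        targets.append(cum)
    pieces, cursor, walked = [], 0, 0.0
    for target in targets:
        piece = [points[cursor]]
        while (cursor < len(edge_lengths)
               and walked + edge_lengths[cursor] <= target + 1e-9):
            walked += edge_lengths[cursor]
            cursor += 1
            piece.append(points[cursor])
        # Whole shape points are assigned to each piece; for exact cut points
        # you would interpolate along the final edge instead.
        pieces.append(piece)
    return pieces
Pair pieces[i] with SS[i] to couple each sub-segment's SP/SU/JF values with its stretch of the road shape.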

How to extract invoices data from an image in android app?

My task is to extract text from a scanned document/JPG and then pick out only the 6 values mentioned below, so that I can auto-fill a form in my next screen/activity.
I used the Google Cloud Vision API in my Android app on the Blaze (paid) tier, and I got the result as a text block, but I want to extract only some of the information out of it. How can I achieve that?
Bills and receipts can look different every time, but I want 6 things out of every invoice's text block, for example:
Vendor
Account
Description
Due Date
Invoice Number
Amount
Is there any tool or 3rd-party library available that I can use in my Android development?
Note - I don't think a sample receipt or bill image is needed here, because it can be any type of bill or invoice; we just need to extract the 6 mentioned things from the extracted text.
In the next scenarios I will create two fictive bill formats, then write the code algorithm to parse them. I will write only the algorithm, because I don't know Java.
Imagine two bill pictures on one side, and on the other the text data obtained from the OCR software. It's like a simple text file, with no logic to it. But we know certain keywords that give it meaning. Below is the algorithm that translates the meaningless file into perfectly logical JSON.
// Text obtained from BILL format 1
var TEXT_FROM_OCR = `Invoice no 12 Amount 55$
Vendor name BusinessTest 1 Account No 1213113
Due date 2019-12-07
Description Lorem ipsum dolor est`;
// Text obtained from BILL format 2
var TEXT_FROM_OCR = `BusinessTest22
Invoice no 19 Amount 12$
Account 4564544 Due date 2019-12-15
Description
Lorem ipsum dolor est
Another description line
Last description line`;
// This configuration object describes the logic behind each bill's text
var TEMPLATES = {
  "bill_template_1": {
    "vendor": {
      "line_no_start": null, // Unknown; will be ignored by our text parser
      "line_no_end": null,   // Unknown; will be ignored by our text parser
      "start_delimiter": "Vendor name", // Searched value starts immediately after this delimiter
      "end_delimiter": "Account",       // Searched value ends just before this delimiter
      "value_found": null               // Save here the value we found
    },
    "account": {
      "line_no_start": null,
      "line_no_end": null,
      "start_delimiter": "Account No",
      "end_delimiter": null, // Extract everything until the end of the current line
      "value_found": null
    },
    "description": {
      // apply same logic as above
    },
    "due_date": {
      // apply same logic as above
    },
    "invoice_number": {
      // apply same logic as above
    },
    "amount": {
      // apply same logic as above
    }
  },
  "bill_template_2": {
    "vendor": {
      "line_no_start": 0, // Extract data from line zero
      "line_no_end": 0,   // Extract data until line zero
      "start_delimiter": null, // Ignored, because our delimiter is a complete line
      "end_delimiter": null,   // Ignored, because our delimiter is a complete line
      "value_found": null      // Save here the value we found
    },
    "account": {
      "line_no_start": null,
      "line_no_end": null,
      "start_delimiter": "Account",  // Searched value starts immediately after this delimiter
      "end_delimiter": "Due date",   // Searched value ends just before this delimiter
      "value_found": null
    },
    "description": {
      "line_no_start": 6,   // Extract data starting at line 6
      "line_no_end": 99999, // Extract until line 99999 (a very big number meaning EOF)
      "start_delimiter": null,
      "end_delimiter": null,
      "value_found": null
    },
    "due_date": {
      // apply same logic as above
    },
    "invoice_number": {
      // apply same logic as above
    },
    "amount": {
      // apply same logic as above
    }
  }
};
// ALGORITHM
// 1. Convert the TEXT_FROM_OCR variable into an array (each index is one line of the file)
TEXT_FROM_OCR = TEXT_FROM_OCR.split(/\r?\n/);
var MAXIMUM_SCORE = 6; // we are looking to extract 6 values out of 6
for (var TEMPLATE_TO_PARSE in TEMPLATES) {
    var PARSE_METADATA = TEMPLATES[TEMPLATE_TO_PARSE];
    var SCORE = 0; // for each field we find, we increment the score
    for (var SEARCHED_FIELD_NAME in PARSE_METADATA) {
        var DELIMITERS_METADATA = PARSE_METADATA[SEARCHED_FIELD_NAME];
        // Search by line numbers first
        if (DELIMITERS_METADATA['line_no_start'] != null && DELIMITERS_METADATA['line_no_end'] != null) {
            // Initiate the value with an empty string
            DELIMITERS_METADATA['value_found'] = '';
            // Concatenate the value found across these lines
            var LAST_LINE = Math.min(DELIMITERS_METADATA['line_no_end'], TEXT_FROM_OCR.length - 1);
            for (var LINE_NO = DELIMITERS_METADATA['line_no_start']; LINE_NO <= LAST_LINE; LINE_NO++) {
                // Add the lines one by one, as defined by your delimiters
                DELIMITERS_METADATA['value_found'] += TEXT_FROM_OCR[LINE_NO] + '\n';
            }
            // We have found a good value, continue to the next field
            SCORE++;
            continue;
        }
        // Otherwise, search by text delimiters
        if (DELIMITERS_METADATA['start_delimiter'] != null) {
            // Search for the delimiter inside each line of the file
            for (var i = 0; i < TEXT_FROM_OCR.length; i++) {
                var LINE_CONTENT = TEXT_FROM_OCR[i];
                var START_OFFSET = LINE_CONTENT.indexOf(DELIMITERS_METADATA['start_delimiter']);
                // If we found start_delimiter on this line, then let's parse it
                if (START_OFFSET > -1) {
                    // The searched value starts immediately after the start delimiter
                    var START_POSITION = START_OFFSET + DELIMITERS_METADATA['start_delimiter'].length;
                    // By default we extract everything from START_POSITION until the end of the current line
                    var END_POSITION = LINE_CONTENT.length;
                    // However, if an end delimiter is defined and found on this line, use its offset instead
                    if (DELIMITERS_METADATA['end_delimiter'] != null) {
                        var END_OFFSET = LINE_CONTENT.indexOf(DELIMITERS_METADATA['end_delimiter'], START_POSITION);
                        if (END_OFFSET > -1) {
                            END_POSITION = END_OFFSET;
                        }
                    }
                    // Extract the value we found (substring takes start and end positions, unlike substr)
                    DELIMITERS_METADATA['value_found'] = LINE_CONTENT.substring(START_POSITION, END_POSITION).trim();
                    // We have found a good value, increment the score
                    SCORE++;
                    // Break this loop, as we found a good value and need to move to the next field
                    break;
                }
            }
        }
    }
    console.log(TEMPLATE_TO_PARSE + ' obtained a score of ' + SCORE + ' out of ' + MAXIMUM_SCORE);
}
At the end you will know which template extracted the most data and, based on that, which one to use for that bill. Feel free to ask anything in the comments. If I spent 45 minutes writing this answer, I'll surely answer your comments as well. :)

How can I search pipeline with another pipeline value on google cloud dataflow

I would like to search stream data for text which includes a specified word, using Google Cloud Dataflow.
In detail, I will deal with the following two streams.
stream A: each element of the stream is a "word"
stream B: each element of the stream is a "text", and each text consists of "words". A text may contain a "word" from stream A.
Many "texts" flow into stream B frequently. On the other hand, a "word" flows into stream A only occasionally.
When a "word" flows into stream A, I would like to find every "text" which contains that "word" and flowed into stream B within the last 5 minutes.
Example
time stream A : stream B
00:01 - this is an apple
00:02 - this is an orange
00:03 - I have an apple
00:04 apple <= "this is an apple" and "I have an apple" are found
00:05 this <= "this is an apple" and "this is an orange" are found
Can I search text with google cloud dataflow?
If I understand your question correctly, there are multiple ways to achieve something like what you want. I will describe two variations.
The basic idea in my example code is to use an inner join and SlidingWindows of five minutes. You can implement the join using ParDo side inputs or CoGroupByKey, depending on your data sizes.
Here is how you set up your inputs and windowing:
PCollection<String> streamA = ...;
PCollection<String> streamB = ...;
PCollection<String> windowedStreamA = streamA.apply(
Window.into(
SlidingWindows.of(Duration.standardMinutes(5)).every(...)));
PCollection<String> windowedStreamB = streamB.apply(
Window.into(
SlidingWindows.of(Duration.standardMinutes(5)).every(...)));
You may want to adjust the size of windows or period to meet your specification & performance needs.
Here is a sketch of how to do the join with side inputs. This will iterate over the entire five minute window of streamB for each element of streamA, so performance will suffer if windows get large.
PCollectionView<Iterable<String>> streamBview =
    windowedStreamB.apply(View.<String>asIterable());
PCollection<String> matches = windowedStreamA.apply(
    ParDo.withSideInputs(streamBview).of(new DoFn<String, String>() {
      @Override
      public void processElement(ProcessContext context) {
        for (String text : context.sideInput(streamBview)) {
          if (split(text).contains(context.element())) {
            context.output(text);
          }
        }
      }
    }));
Here is a sketch of how to do this with CoGroupByKey by pre-splitting the text and joining each keyword with the lines that contain that keyword. There is similar logic in the TfIdf example included with the SDK.
PCollection<KV<String, Void>> keyedStreamA = windowedStreamA.apply(
    MapElements
        .via(word -> KV.of(word, (Void) null))
        .withOutputType(new TypeDescriptor<KV<String, Void>>() {}));
PCollection<KV<String, String>> keyedStreamB = windowedStreamB.apply(
    FlatMapElements
        .via(text -> split(text).stream()
            .map(word -> KV.of(word, text))
            .collect(Collectors.toList()))
        .withOutputType(new TypeDescriptor<KV<String, String>>() {}));
TupleTag<Void> tagA = new TupleTag<Void>() {};
TupleTag<String> tagB = new TupleTag<String>() {};
KeyedPCollectionTuple<String> coGbkInput = KeyedPCollectionTuple
    .of(tagA, keyedStreamA)
    .and(tagB, keyedStreamB);
PCollection<String> matches = coGbkInput
    .apply(CoGroupByKey.<String>create())
    .apply(FlatMapElements
        .via(result -> result.getValue().getAll(tagA).iterator().hasNext()
            ? result.getValue().getAll(tagB)   // the word occurred: emit its texts
            : Collections.<String>emptyList()) // no stream A word for this key
        .withOutputType(new TypeDescriptor<String>() {}));
The best approach will depend on your data. If you are OK with getting matches from more than just the last five minutes, you can tune the amount of data duplication in the windows by enlarging your sliding windows and using a larger period. You can also use triggers to tune when output is produced.

Highstock: add all series dynamically

I have a Highstock line chart. Sometimes I need to show just one series; other times I need to draw two or three series.
Obviously, this is a case of adding series dynamically. I put:
$(function() {
  var chart = new Highcharts.StockChart({
    // ...
    series: []
    // ...
  });
  chart.addSeries({name: "Value1", data: value1Data});
  chart.addSeries({name: "Value2", data: value2Data});
  chart.addSeries({name: "Value3", data: value3Data});
});
But it is not working: the chart needs at least one series with data values in the "series" node; it does not allow an empty series node like the one above.
I need to add all my series dynamically.
Can anyone help me? Thanks.
Example: http://jsfiddle.net/8aP69/1/
FYI: The graph only draws the navigation bar.
After many little tests, someone I really appreciate gave me the solution.
I need to set the initial series node like this:
series: [{ name: "Serie1", data: [null] }] // the first series has to start with null data
And after that, add the data points for the first series:
chart.series[0].setData(data1); // adding its data here