Does Google Cloud Vision API detect formatting in OCRed text like bold, italics, font name (Helvetica or Times New Roman), etc.?

The quick brown fox jumps over the lazy dog
In a case like this, assuming there are different font families too, can the Cloud Vision API detect this? Or can any other OCR API detect this cleanly? Tesseract has this capability, but it is very inaccurate.

Does Google Cloud Vision API detect formatting in OCRed text like
bold, italics, font name (Helvetica or Times New Roman), etc.?
Unfortunately, no.
In my project, I use ABBYY Cloud OCR SDK for this purpose. If you want to try it, you can start a free trial which includes 500 free requests (pages). After you create your trial account, you will receive an email from ABBYY containing your Application ID and Application password. Use these two values to create your authentication header according to the Authentication documentation.
See the following example:
Perform a processImage request, passing your image in the request body.
Request:
POST https://cloud.ocrsdk.com/v2/processImage?exportFormat=xml&profile=documentConversion&xml:writeFormatting=true
Authorization: <your token>
Response:
{
  "taskId": "a226a0b6-6705-4d6f-9f4c-517fa9b4e28e",
  "registrationTime": "2020-07-26T09:42:39Z",
  "statusChangeTime": "2020-07-26T09:42:39Z",
  "status": "Queued",
  "filesCount": 1,
  "requestStatusDelay": 10000
}
Perform a getTaskStatus request to check whether your task is completed. Use the taskId from the response of the previous step.
Request:
GET https://cloud.ocrsdk.com/v2/getTaskStatus?taskId=a226a0b6-6705-4d6f-9f4c-517fa9b4e28e
Authorization: <your token>
Response:
{
  "taskId": "a226a0b6-6705-4d6f-9f4c-517fa9b4e28e",
  "registrationTime": "2020-07-26T09:42:39Z",
  "statusChangeTime": "2020-07-26T09:42:40Z",
  "status": "Completed",
  "filesCount": 1,
  "requestStatusDelay": 0,
  "resultUrls": [
    "https://ocrsdk.blob.core.windows.net/files/a226a0b6-6705-4d6f-9f4c-517fa9b4e28e.result?sv=2012-02-12&se=2020-07-26T19%3A00%3A00Z&sr=b&si=downloadResults&sig=4k9FcRoBfhodq%2BMj%2Ffj%2BGLBfwK2BsO7sj15JQOLcArk%3D"
  ]
}
Download the result (see resultUrls from the response of the previous step).
I tested it with a sample picture and received an XML result in which the recognized text fragments carry formatting attributes such as the font family and bold/italic flags.

ABBYY Cloud OCR is quite accurate, but in the end everything depends on your fonts and scan quality.
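For reference, here is a minimal Python sketch of the three steps above. This is my own sketch, not official ABBYY sample code; it assumes HTTP Basic authentication with the Application ID and password from the trial email, and a local file named sample.png:

import time
import requests

BASE = "https://cloud.ocrsdk.com/v2"
AUTH = ("YOUR_APP_ID", "YOUR_APP_PASSWORD")  # from the ABBYY trial email

# 1. Submit the image; ask for XML output with formatting information.
with open("sample.png", "rb") as f:
    task = requests.post(
        BASE + "/processImage",
        params={"exportFormat": "xml",
                "profile": "documentConversion",
                "xml:writeFormatting": "true"},
        data=f,
        auth=AUTH,
    ).json()

# 2. Poll getTaskStatus until the task leaves the queue.
while task["status"] in ("Queued", "InProgress"):
    time.sleep(task.get("requestStatusDelay", 5000) / 1000)
    task = requests.get(
        BASE + "/getTaskStatus",
        params={"taskId": task["taskId"]},
        auth=AUTH,
    ).json()

# 3. Download the XML result, which contains the formatting attributes.
xml = requests.get(task["resultUrls"][0]).text
print(xml[:500])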

Related

Postman interceptor request running forever

I am trying to intercept a website: https://www.kroger.com/pl/chicken/05002. In the Chrome network tab, I see the request below, with the details of the products nicely listed as JSON.
I copied the request as a cURL (bash) command and imported it as raw text in Postman. It ran forever without any response. Then I used the intercept feature, and still it runs forever.
When both requests are exactly the same, why does it work in Chrome and not in Postman? What am I missing? Any help is appreciated; thanks in advance.
This is probably happening because they don't want you to do what you are trying to do. Note the "filter.verified" param in the URL.
You may want to try reaching out to them for an external API token - especially if you are creating an app or extension to compare competitive prices with the intention of distributing said app or extension - regardless of if it is for financial compensation or not.
Ethically questionable workaround (which would definitely need to be improved upon; this is simply an example of how you could solve your problem):
GET https://www.kroger.com/search?query=chicken&searchType=default_search&fulfillment=all
// Postman test script: the `responseBody` global and the bundled cheerio
// library are available in the Postman sandbox.
const html = cheerio(responseBody);
const results = [];
html.find('div[class="AutoGrid-cell min-w-0"] > div').each(function (i, e) {
    results.push({
        // Walk the raw DOM nodes cheerio exposes: the item name is a text
        // node a few levels down; the price sits in a `value` attribute.
        "Item": e.children[e.children.length - 3].children[0].children[0].children[0]["data"],
        "Price": e.children[e.children.length - 4].children[0].attribs["value"]
    });
});
console.log(results);
If you are unable to obtain an API token from them, this would probably be a legal way to accomplish what you want.

What is a useful Azure IoT Hub JSON message structure for consumption in Time Series Insights

The title sounds quite comprehensive, but my baseline question is quite simple, I guess.
Context
In Azure, I have an IoT hub that I am sending messages to. I use a modified version of one of the samples from the Azure IoT SDK for Python.
Sending works fine. However, instead of a string, I send a JSON structure.
When I watch the events flowing into the IoT hub, using the Cloud shell, it looks like this:
PS /home/marcel> az iot hub monitor-events --hub-name weathertestiothub
This extension 'azure-cli-iot-ext' is deprecated and scheduled for removal. Please remove and add 'azure-iot' instead.
Starting event monitor, use ctrl-c to stop...
{
  "event": {
    "origin": "raspberrypi-zero-wh",
    "payload": "{ \"timestamp\": \"1608643863720\", \"locationDescription\": \"Attic\", \"temperature\": \"21.941\", \"relhumidity\": \"71.602\" }"
  }
}
Issue
The data seems fine, except the payload looks strange here. BUT, the payload is literally what I send from the device, using the SDK sample.
Is this the correct way to do it? In the end, I have a very hard time actually getting the data into the Time Series Insights model, so I guess my structure is to blame.
Question
What is a recommended JSON data structure to send to the IoT hub for later use?
You should add the following two lines to your message in your Python SDK sample:
msg.content_encoding = "utf-8"
msg.content_type = "application/json"
This should resolve your formatting concern.
We've also updated our samples to reflect this: https://github.com/Azure/azure-iot-sdk-python/blob/master/azure-iot-device/samples/sync-samples/send_message.py
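For context, a minimal end-to-end sketch of a send with those two lines in place (assuming the azure-iot-device package and a device connection string; the payload fields mirror the question):

from azure.iot.device import IoTHubDeviceClient, Message
import json

client = IoTHubDeviceClient.create_from_connection_string("<device connection string>")

payload = {"locationDescription": "Attic", "temperature": 21.9, "relhumidity": 71.6}
msg = Message(json.dumps(payload))
msg.content_encoding = "utf-8"         # tell IoT Hub how the body bytes are encoded
msg.content_type = "application/json"  # mark the body as JSON so it can be parsed downstream

client.send_message(msg)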
I ended up using the tip by @elhorton, but it was not the key change. Nonetheless, the formatting in the Azure Shell monitor now looks much better:
"event": {
"origin": "raspberrypi-zero-wh",
"payload": {
"temperature": 21.543947753906245,
"humidity": 69.22964477539062,
"locationDescription": "Attic"
}
}
The key was:

1. Include the message source time in ISO format:

from datetime import datetime

# Use UTC, to match the "-utc" property name
timestampIso = datetime.utcnow().isoformat()
message.custom_properties["iothub-creation-time-utc"] = timestampIso

2. Use the locationDescription as the Time Series ID Property; see https://learn.microsoft.com/en-us/azure/time-series-insights/how-to-select-tsid. (Maybe I could also have taken the iothub-connection-device-id, but I did not test that alone specifically.)
I guess using "iothub-connection-device-id" would make "raspberrypi-zero-wh" the name of the time series instance. I agree with your choice of "locationDescription" as the TSID; "Attic" becomes the time series instance name, and temperature and humidity will be your variables.

What is the reason why the OneNote APIs won't return all the pages in a notebook?

I am reading around here and I am seeing multiple messages about the /pages endpoint not working as expected.
It seems that the OneNote APIs (MS Graph or Office 365) are not returning all the pages that the user can see. In particular, recent pages are not shown as available.
This message is for those of you who work for Microsoft and keep an eye on this forum. If you have any explanation or workaround for this, we would like to hear about it.
If this is work in progress, we would also like to know when the APIs can be considered stable and reliable enough for production use.
Update:
Permissions or scopes
scopes=[
"Notes.Read",
"Notes.Read.All",
"Notes.ReadWrite",
]
This is for a device authorization flow; the device is acting as a Microsoft Online account. The app is registered in Azure as a personal app, but the enterprise one behaves the same.
The authorization process is described here
What type of app/authentication flow should I select to read my cloud OneNote content using a Python script and a personal Microsoft account?
After that I am using this endpoint to get the notebooks
https://graph.microsoft.com/v1.0/users/user-id/onenote/notebooks
From the returned JSON I pick the notebook I want to read and access the link stored in notebook['sectionsUrl']. This call returns a sections JSON.
From this I pick the section I want and access the link stored in section['pagesUrl'].
Each call returns the expected info except the last one, where I get an arbitrarily low number of pages for the section I want to explore. There is nothing wrong with the format of the info; it is just incomplete or not up to date.
Not sure if this is related, but when I try to access the pages in a section from MS Graph Explorer I see the same behavior (not all the pages are reported). This is a shared notebook and I am using the owner account for all of the above, so it should not be a permission problem.
from msal import PublicClientApplication
import requests

authority = "https://login.microsoftonline.com/consumers"
app = PublicClientApplication(client_id=client_id, authority=authority)
flow = app.initiate_device_flow(scopes=scopes)
# There is an interactive part here that I automated using Selenium; you
# are supposed to use a link to enter a code and then authorize the
# device. Code not shown.
result = app.acquire_token_by_device_flow(flow)
token = result['access_token']
headers = {'Authorization': 'Bearer ' + token}

endpoint = "https://graph.microsoft.com/v1.0/users/c5af8759-4785-4abf-9434-xxxxxxxxx/onenote/notebooks"
notebooks = requests.get(endpoint, headers=headers).json()
for notebook in notebooks['value']:
    print(notebook['displayName'])
    print(notebook['sectionsUrl'])
    print(notebook['sectionGroupsUrl'])

# I pick a certain notebook and fetch its sections
sections = requests.get(notebook['sectionsUrl'], headers=headers).json()['value']
section = [s for s in sections if s['displayName'] == "Test"][0]

# Fetch the pages of that section; this is the call that comes back incomplete
pages = requests.get(section['pagesUrl'], headers=headers).json()
for page in pages['value']:
    print(page['title'])
Update2
If I use this endpoint
https://graph.microsoft.com/v1.0/users/user-id/onenote/sections/section-id/pages
I would expect to get the complete list of pages for that section, but that is not working.
After reading the docs again and again, my understanding is that the approach is to call https://graph.microsoft.com/v1.0/users/user-id/onenote/pages with $filter or $search, etc.
Is this correct?
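For illustration, the pattern I have in mind would look roughly like this, reusing the headers from above (the $filter expression and the @odata.nextLink loop are my assumptions based on standard Graph paging, untested):

endpoint = ("https://graph.microsoft.com/v1.0/users/user-id/onenote/pages"
            "?$filter=parentSection/id eq 'section-id'&$top=100")
pages = []
while endpoint:
    data = requests.get(endpoint, headers=headers).json()
    pages.extend(data.get('value', []))
    endpoint = data.get('@odata.nextLink')  # keep following paging links
for page in pages:
    print(page['title'])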
Also, I vaguely remember there is a way to search for a section and have it expanded so that the search returns the children too.
Am I close to understanding this?
Thank you
MM

How does pass_thread_control work in Facebook Handover Protocol?

I'm trying to test the pass_thread_control function on Facebook Messenger, to have my Dialogflow bot hand an ongoing conversation over to a human operator. So far I'm stuck at even getting a "success" code in Graph API Explorer. I have reviewed Facebook's documentation (https://developers.facebook.com/docs/messenger-platform/reference/handover-protocol/pass-thread-control/) and carefully looked through different threads here and elsewhere. I have:
Subscribed my Facebook Page to receive messaging_handovers.
Set the Dialogflow chatbot app as the Primary Receiver.
Set the Page Inbox as Secondary Receiver.
...And I keep on getting various errors. For example, I try this request in Graph API explorer:
POST to https://graph.facebook.com/v5.0/me/pass_thread_control
with params:
{
  "recipient": {
    "id": "myPageID"
  },
  "target_app_id": "263902037430900"
}

The response is an error:

{
  "error": {
    "message": "Unsupported post request. Object with ID 'me' does not exist, cannot be loaded due to missing permissions, or does not support this operation. Please read the Graph API documentation at https://developers.facebook.com/docs/graph-api",
    "type": "GraphMethodException",
    "code": 100,
    "error_subcode": 33,
    "fbtrace_id": "AipGijCLKQOwOl6L792ZEgG"
  }
}
Maybe the issue is the recipient PSID? This is the only parameter I have no idea where to get. What is the page-scoped ID (PSID), and how do I get it?
Or maybe I missed some permissions...?
Any help getting me unstuck much appreciated...
Okay, I actually managed to figure it out.
First and foremost, I discovered that the Page I was trying to pass thread control to wasn't linked to my Facebook Business account, while the chatbot app was. I added the Page in Facebook Business Manager so that it's linked to the same business account as the chatbot app. NOTE: I am not sure this is a prerequisite for everyone, so take caution; it might not be required in all scenarios.
To retrieve the PSID, which can then be used in the 'recipient' param of the POST request to pass_thread_control, I sent a GET request using Graph API Explorer as shown here: https://developers.facebook.com/docs/facebook-login/connecting-accounts#examples
Even though the example request does not contain the appsecret_proof param, I used it and haven't tested the request without it. A very simple way of generating appsecret_proof using PHP is shown here: https://developers.facebook.com/docs/graph-api/securing-requests#generate-proof
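For what it's worth, in Python the appsecret_proof is just an HMAC-SHA256 of the access token keyed with the app secret (APP_SECRET and ACCESS_TOKEN are placeholders):

import hashlib
import hmac

# appsecret_proof = HMAC-SHA256 of the access token, keyed with the app secret
appsecret_proof = hmac.new(
    APP_SECRET.encode("utf-8"),
    ACCESS_TOKEN.encode("utf-8"),
    hashlib.sha256,
).hexdigest()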
Then, when providing the PSID obtained using the method shown above, I got "success": true while testing pass_thread_control, which did pass the thread control to the Secondary Receiver (Page Inbox). Yaaay! 😊
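Putting it together, a rough sketch of the final call (my reconstruction, not verbatim; it assumes a Page access token, PAGE_ACCESS_TOKEN and USER_PSID are placeholders, and the target_app_id is the one from the question, i.e. the Page Inbox):

import requests

resp = requests.post(
    "https://graph.facebook.com/v5.0/me/pass_thread_control",
    params={"access_token": PAGE_ACCESS_TOKEN,
            "appsecret_proof": appsecret_proof},
    json={
        "recipient": {"id": USER_PSID},    # the PSID retrieved above
        "target_app_id": 263902037430900   # Page Inbox (the Secondary Receiver)
    },
)
print(resp.json())  # {'success': True} when the handover works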
All of the above is described also in this thread, which helped me figure it out, so credit to Sunil: Is Facebook Messenger PSID PageScope constant for User

Get JSON from Google Apps Script URL via Erlang

Good Evening!
I've been looking into the possibility of using GAS (Google Apps Script) to host a small bit of JavaScript that lets me use the new Google finance apps API. The intention is to use the stock information for a project that involves stock data. I know that there are a few ways to get stock information from Google, but the data that the finance app returns is more in line with other sources we are using. (One constraint on this project is that we have multiple sources.)
I've written the JavaScript, and I can make an httpc:request to the script URL Google gave me. In the browser the JS returns the JSON object as I want it; however, when the call is made from Erlang I get a list of ASCII character codes. From checking the values, it appears to be an HTML document.
Below is the javascript and the url to see the json:
https://script.google.com/macros/s/AKfycbzEvuuQl4jkrbPCz7hf9Zv4nvIOzqAkBxL1ixslLBxmSEhksQM/exec
function doGet() {
var stock = FinanceApp.getStockInfo('LON:TSCO');
return ContentService.createTextOutput(JSON.stringify(stock))
.setMimeType(ContentService.MimeType.JSON);
}
For the Erlang side, it's a simple request, but I've not been doing Erlang long, so perhaps I've messed something up here (the URL being the one mentioned above). I've got crypto / ssl / inets started when I'm testing this on the command line.
{ok, {Version, Headers, Body}} = httpc:request(get, {URL, []}, [], []).
I think it's also worth mentioning that when I curl it from Cygwin, I get a massive load of HTML. I've linked it below; if you see it, you'll thank me for not posting it in here! http://pastebin.com/UtJHXjRm
I've been updating the script as I go with the new versions but I'm at a bit of a loss as to why it's not returning correctly.
If anyone can give me any pointers I'd be very grateful! I get the feeling that it's not intended to be used this way, perhaps only within other Google products and such.
Cheers!
It would be necessary to review how you are deploying the Web App, specifically the "Who has access to the app" setting; for access without authentication it should be set to "Anyone, even anonymous".
See Deploying Your Script as a Web App from the documentation.
In my test, by running:
curl -L https://script.google.com/macros/s/************/exec
I get the following result:
{
  "priceopen": 358,
  "change": 2.199981689453125,
  "high52": 388.04998779296875,
  "tradetime": "2013-10-11T15:35:18.000Z",
  "currency": "GBX",
  "timezone": "Europe/London",
  "low52": 307,
  "quote": 357.8999938964844,
  "name": "Tesco PLC",
  "exchange": "LON",
  "marketcap": 28929273763,
  "symbol": "TSCO",
  "volumedelay": 0,
  "shares": 8083060703,
  "pe": 23.4719295501709,
  "eps": 0.15248000621795654,
  "price": 357.8999938964844,
  "has_stock_data": true,
  "volumeavg": 14196534,
  "volume": 8885809,
  "changepct": 0.6184935569763184,
  "high": 359.5,
  "datadelay": 0,
  "low": 355.8999938964844,
  "closeyest": 355.70001220703125
}
Possibly your GET is not following the redirect that happens when you use ContentService. Look at the HTML returned; there is a redirect in there.
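As a quick cross-check of the redirect theory (Python purely for illustration; like curl -L, the requests library follows redirects by default):

import requests

url = "https://script.google.com/macros/s/AKfycbzEvuuQl4jkrbPCz7hf9Zv4nvIOzqAkBxL1ixslLBxmSEhksQM/exec"
resp = requests.get(url)    # allow_redirects=True is the default
print(resp.url)             # final URL after the ContentService redirect
print(resp.json()["name"])  # "Tesco PLC" if the stock data comes back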