Analyze data volume of API calls with Invantive SQL - exact-online

The SQL engine hides away all the details of which API calls are being made. However, some cloud solutions have pricing per API call.
For instance:
select *
from transactionlines
retrieves all Exact Online transaction lines of the current company, but:
select *
from transactionlines
where financialyear = 2016
effectively filters on the REST API of Exact Online to just that year, reducing data volume. And:
select *
from gltransactionlines
where year_attr = 2016
retrieves all data, since the where-clause is not forwarded to this XML API of Exact.
Of course I can attach Fiddler or Wireshark and try to analyze the data volume, but is there an easier way to analyze the data volume of API calls with Invantive SQL?

First of all, all calls handled by Invantive SQL are logged in the Invantive Cloud together with:
the time
data volume in both directions
duration
to enable consistent API use monitoring across all supported cloud platforms. The actual data is not logged and travels directly.
You can query the same numbers from within your session, for instance:
select * from exactonlinerest..projects where code like 'A%'
retrieves all projects with a code starting with 'A'. And then:
select * from sessionios#datadictionary
shows you the API calls made.
You can also put a query like the following at the end of your session before logging off:
select main_url
, sum(bytes_received) bytes_received
, sum(duration_ms) duration_ms
from ( select regexp_replace(url, '\?.*', '') main_url
, bytes_received
, duration_ms
from sessionios#datadictionary
)
group
by main_url
with a result listing the total bytes received and duration per main URL.
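To zoom in on the data volume of a single statement, you can restrict sessionios#datadictionary to the calls for that entity. A minimal sketch, assuming the url column contains the entity name and that you compare the outcome after running each variant of the transactionlines query:
select count(*) number_of_calls
,      sum(bytes_received) bytes_received
,      sum(duration_ms) duration_ms
from   sessionios#datadictionary
where  url like '%transactionlines%'
Since sessionios#datadictionary accumulates over the session, run each variant in a fresh session (or note the delta) to see how much the financialyear filter reduces bytes_received.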

Related

Aggregate JSON events into an array in Azure Stream Analytics

I'm new to Azure Stream Analytics and its query language. I have an ASA job which reads JSON data coming from my IoT Hub and feeds it to different functions based on one of the values. This is what I have now:
SELECT
*
INTO
storage
FROM
iothub
SELECT
*
INTO
storageQueueFunction
FROM
iothub
WHERE
recType LIKE '3'
SELECT
*
INTO
deviceTwinD2CFunctionApp
FROM
iothub
WHERE
recType LIKE '50'
SELECT
*
INTO
heartbeatD2CFunctionApp
FROM
iothub
WHERE
recType LIKE '51'
SELECT
*
INTO
ackC2D
FROM
iothub
WHERE
recType LIKE '54'
I'm pretty sure this could be done more efficiently but it's working for now.
My problem is that when a large number of events come in with recType 54, I think it is overloading my Function App "ackC2D".
My idea is to batch these types of events into a json array using something like a rolling window of 5 seconds, then send that array to the output where I can parse through the array event by event.
I haven't been able to find anything like this online, the closest I can find is aggregating data then outputting a calculation on the aggregate.
Is what I'm trying to do possible?
Thanks!
When configuring the Azure Function output, you can specify the 'Max batch size' and 'Max batch count' properties. If a lot of input events arrive rapidly, keeping a high value for these properties will result in fewer calls to your Azure Function output (by automatically batching many output events in a single HTTP request).
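If you do want to deliver the events to the function as one array per time window, as suggested in the question, a sketch along the following lines may help. It assumes the Stream Analytics Collect() aggregate is available and that the input can be timestamped by the EventEnqueuedUtcTime property; adjust the names to your job:
SELECT
    Collect() AS events
INTO
    ackC2D
FROM
    iothub TIMESTAMP BY EventEnqueuedUtcTime
WHERE
    recType LIKE '54'
GROUP BY
    TumblingWindow(second, 5)
Each output record then contains an array with the recType 54 events that arrived in that 5-second window, which the function can parse event by event.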

Get list of blocked Exact Online divisions

We have a few thousand companies in Exact Online, of which a certain percentage run their own accounting and have their own license. However, there is a daily changing group of companies that are behind with their payments to Exact and therefore have their companies blocked.
For all companies we run Invantive Data Replicator to replicate all Exact Online companies into a SQL Server datawarehouse for analytical reporting and continuous monitoring.
In the SystemDivisions table, the state of such a blocked company remains at 1 (Active). It does not change to 2 (Archive) or 0 (upcoming). Nor is there any enddate set in the past.
However, when the XML or REST APIs are used through a query from Invantive SQL or directly from Python on such a blocked company, there are a lot of fuzzy error messages.
Currently we have to open each company which had an error during replication individually each day and check whether a block by Exact is causing the error and for what reason.
It seems that there is no way to retrieve the list of blocked companies.
Is there an alternative?
Although it is not supported and disadvised, you can access a limited number of screens in Exact Online using native requests. It is rumoured that this is not possible for all screens.
However, you are lucky. The blocking status of a company can be requested using the following queries:
insert into NativePlatformScalarRequests(url, orig_system_group)
select /*+ ods(false) */ 'https://start.exactonline.nl/docs/SysAccessBlocked.aspx?_Division_=' || code
, 'BLOCK-DIV-CHECK-' || code
from systemdivisions
create or replace table currentlyblockeddivisions#inmemorystorage
as
select blockingstatus
, divisioncode
from ( select regexp_replace(result, '.*<table class="WizardSectionHeader" style="width:100%;"><tr><th colspan="2">([^<]*)</th>.*', '$1', 1, 0, 'n') blockingstatus
, replace(orig_system_group, 'BLOCK-DIV-CHECK-', '') divisioncode
from NativePlatformScalarRequests
where orig_system_group like 'BLOCK-DIV-CHECK-%'
)
where blockingstatus not like '%: Onbekend%'
Please note that the hyperlink with '.nl' needs to be replaced when you run in a different country. The same holds for searching on the Dutch term 'Onbekend' ('Unknown' in English).
This query runs several thousand HTTP requests, each requesting the screen with the blocking status of a company. However, when the company is not blocked, the screen reports back a reason of 'Unknown'.
The companies with an 'Unknown' reason are probably not blocked. The rest are.
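The resulting in-memory table can be queried like any other table, for instance to list the blocked divisions found by the native requests above:
select divisioncode
,      blockingstatus
from   currentlyblockeddivisions#inmemorystorage
order
by     divisioncode
From there the list can be joined to systemdivisions or loaded into the data warehouse used for the daily monitoring.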

Register custom audit events for billing

We are running a custom app on Invantive Data Access Point which adds business functionality to Exact Online. For billing purposes, we would like to somehow register actual use of the software as defined in business terms instead of memory used, CPU, SQL statements executed, etc.
We do not yet have custom tables and I would like to keep it that way, so the whole state is kept in memory and in Exact Online only. So "insert into mytable#sqlserver..." is not an option. Neither does Exact Online offer the possibility to create custom tables as with Salesforce.
How can we somehow register billable events, such as "Performed an upload of 8 bank transactions" under this condition?
For billing purposes, you can piggyback on the Customer Service infrastructure, which is similar to the functionality offered by AWS or Apple for this purpose in their ecosystems. The "table" which stores the billing events, like a Call Detail Record of a PBX, is managed by the Customer Service infrastructure.
There are two options:
Your apps use the default audit and license event registrations like "User logged on", "First use of partition #xyz", etc. each with a specific message code like 'itgenlic125'.
Your apps define their own event types like "Performed an upload of bank transactions", with a message code 'mybillingmessagecode123' and the number '8' as quantity in the natural key.
The first option is applied automatically and always. This data is also used to manage resource consumption and detect runaways.
The second option is best done using Invantive SQL with the data dictionary table "auditevents". All records inserted into auditevents are automatically and asynchronously forwarded to Customer Service. To see the audit events registered since the start of the application:
select *
from auditevents#datadictionary
where:
occurrence_date: when it happened.
logging_level: always "Audit".
message_code: code identifying the type of event.
data_container_id: ID of the data container, used with distributed SQL transactions.
partition: partition within the data container for platforms such as Exact Online or Microsoft SQL Server which store multiple databases under one customer/instance.
session_id: ID of the session.
user_message: actual text.
last_nk: last used natural key.
application_name: name of the application.
application_user: user as known to the application.
gui_action: action within the GUI.
And some auditing and licensing information fields.
To register a custom event:
insert into auditevents#datadictionary select * from auditevents#datadictionary
Only some fields can be provided; the rest are automatically determined:
message_code
user_message
last_natural_key
application_name
application_user
gui_action
gui_module
partition
provider_name
reference_key
reference_table_code
session_id
To receive the billing events yourself from the infrastructure, you will need to access the Customer Service APIs or have them automatically forwarded to a mail, Slack, RocketChat or Mattermost channel.
A sample SQL:
insert into auditevents#datadictionary
( message_code
, user_message
, last_natural_key
, application_name
, gui_action
, gui_module
, reference_key
, reference_table_code
, partition
)
select 'xxmycode001' message_code
, 'Processed PayPal payments in Exact Online for ' || divisionlabel user_message
, 'today' last_natural_key
, 'PayPalProcessor' application_name
, 'xx-my-paypal-processor-step-2' gui_action
, 'xx-my-paypal-processor' gui_module
, clr_id reference_key
, 'clr' reference_table_code
, division partition
from settings#inmemorystorage
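To verify afterwards which custom billing events have been registered in the current session, you can query the same data dictionary table again; a minimal sketch using only the columns described above:
select occurrence_date
,      message_code
,      user_message
from   auditevents#datadictionary
where  message_code = 'xxmycode001'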

How can this query be optimized for speed?

This query creates an export for UPS from the deliveries history:
select 'key'
, ACC.Name
, CON.FullName
, CON.Phone
, ADR.AddressLine1
, ADR.AddressLine2
, ADR.AddressLine3
, ACC.Postcode
, ADR.City
, ADR.Country
, ACC.Code
, DEL.DeliveryNumber
, CON.Email
, case
when CON.Email is not null
then 'Y'
else 'N'
end
Ship_Not_Option
, 'Y' Ship_Not
, 'ABCDEFG' Description_Goods
, '1' numberofpkgs
, 'PP' billing
, 'CP' pkgstype
, 'ST' service
, '1' weight
, null Shippernr
from ExactOnlineREST..GoodsDeliveries del
join ExactOnlineREST..Accounts acc
on ACC.ID = del.DeliveryAccount
join ExactOnlineREST..Addresses ADR
on ADR.ID = DEL.DeliveryAddress
join ExactOnlineREST..Contacts CON
on CON.ID = DEL.DeliveryContact
where DeliveryDate between $P{P_SHIPDATE_FROM} and $P{P_SHIPDATE_TO}
order
by DEL.DeliveryNumber
It takes many minutes to run. The number of deliveries and accounts grows by several hundred each day. Addresses and contacts are mostly 1:1 with accounts. How can this query be optimized for speed in Invantive Control for Excel?
Probably this query is run at most once every day, since the deliverydate does not contain a time component. Therefore, the number of rows selected from ExactOnlineREST..GoodsDeliveries is several hundred. Based upon the statistics given, the number of accounts, delivery addresses and contacts is also approximately several hundred.
Normally, such a query would be optimized by a solution such as the one in "Exact Online query with joins runs more than 15 minutes", but that solution will not work here: the third value of a join_set(soe, orderid, 100) hint is the maximum number of rows on the left-hand side to be used with index joins. At this moment, that maximum is something like 125, based upon constraints on the URL length of OData requests to Exact Online. Please remember that the actual OData query is a GET using a URL, not a POST with unlimited size for the filter.
The alternatives are:
Split volume
Data Cache
Data Replicator
Have SQL engine or Exact Online adapted :-)
Split Volume
In a separate query select the eligible GoodsDeliveries and put them in an in-memory or database table using for instance:
create or replace table gdy#inmemorystorage as select ... from ...
Then create a temporary table per 100 or similar rows such as:
create or replace table gdysubpartition1#inmemorystorage as select ... from ... where rowidx$ between 0 and 99
... etc for 100, 200, 300, 400, 500
And then run the query several times, each time with a different gdysubpartition1..gdysubpartition5 instead of the original from ExactOnlineREST..GoodsDeliveries.
Of course, you can also avoid the use of intermediate tables by using an inline view like:
from (select * from goodsdeliveries where date... limit 100)
or alike.
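As an illustration of the split-volume route, a sketch (the date placeholders are the same report parameters as in the original query, and it is an assumption that rowidx$ can be applied directly to the in-memory table):
create or replace table gdy#inmemorystorage
as
select *
from   ExactOnlineREST..GoodsDeliveries
where  DeliveryDate between $P{P_SHIPDATE_FROM} and $P{P_SHIPDATE_TO}

create or replace table gdysubpartition1#inmemorystorage
as
select *
from   gdy#inmemorystorage
where  rowidx$ between 0 and 99
The original UPS export query is then run once per gdysubpartitionN#inmemorystorage table, with that table replacing ExactOnlineREST..GoodsDeliveries in the from-clause, and the result sets are concatenated.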
Data Cache
When you run the query multiple times per day (unlikely, but I don't know), you might want to cache the Accounts in a relational database and update it every day.
You can also use 'local memorize results clipboard' and 'local save results clipboard to ...' to save the last results to a file manually, and later restore them using 'local load results clipboard from ...' and 'local insert results clipboard in table ...'. And maybe then 'insert into ... select ... from exactonlinerest..accounts where datecreated > trunc(sysdate)'.
Data Replicator
With Data Replicator enabled, you can have replicas created and maintained automatically within an on-premise or cloud relational database for Exact Online API entities. For low latency, you will need to enable the Exact webhooks.
Have SQL Engine or Exact adapted
You can also register a request to have the SQL engine allow a higher number in the join_set hint, which would require addressing the Exact Online APIs in another way. Or register a request at Exact to also allow POST requests to the API with the filter in the body.

Using Google BigQuery to run multiple queries back to back

I'm currently working on a project where I'm using Google BigQuery to pull data from spreadsheets. I'm VERY new to SQL, so I apologize. I'm currently using the following code:
Select *
From my_data
Where T1 > 1000
And T2 > 2000
So keeping the Select and From the same, I want to be able to run multiple queries where I can just keep changing the values I'm looking for between t1 and t2. Around 50 different values. I'd like for BigQuery to run through these 50 different values back to back. Is there a way to do this? Thanks!
I'm VERY new to SQL
... and I assume to BigQuery as well ..., so
Below is one of the options for new users who are not yet familiar with the BigQuery API and/or clients other than the BigQuery Web UI.
BigQuery Mate adds a parameters feature to the BigQuery Web UI.
What you need to do is:
Save your query using the Save Query button.
Notice <var_t1> and <var_t2>.
Those are the parameters identifiable by BigQuery Mate.
Now you can set those parameters.
Click QB Mate and then Parameters to get to the parameters form.
Now you can set the parameters to whatever values you want to run with.
Click on the Replace Parameters OK button and those values will appear in the editor, so now you can run your query.
To run another round with new parameters, load your saved query into the editor again by clicking the Edit Query button, and repeat setting the parameters, and so on.
You can find the BigQuery Mate Chrome extension here.
Disclaimer: I am the author and the only developer of this tool.
You may be interested in running parameterized queries. The idea would be to have a single query string, e.g.:
SELECT *
FROM YourTable
WHERE t1 > @t1_min AND
t2 > @t2_min;
You would execute this multiple times, where each time you bind different values of the t1_min and t2_min parameters. The exact logic would depend on the API through which you are using the client libraries, and there are language-specific examples in the first link that I provided.
If you are not concerned about SQL injection and just want to iteratively swap out parameters in queries, you might want to look into the mustache templating language (available in R as 'whisker').
If you are using R, you can iterate/automate this type of query with the condusco R package. Here's a complete R script that will accomplish this kind of iterative query using both whisker and condusco:
library(bigrquery)
library(condusco)
library(whisker)

# create a simple function that will create a query
# using {{{mustache}}} placeholders for any parameters
create_results_table <- function(params){

  destination_table <- '{{{dataset_id}}}.{{{table_prefix}}}_results_{{{year_low}}}_{{{year_high}}}'

  query <- '
    SELECT *
    FROM `bigquery-public-data.samples.gsod`
    WHERE year > {{{year_low}}}
      AND year <= {{{year_high}}}
  '

  # use whisker to swap out {{{mustache}}} placeholders with parameters
  query_exec(
    whisker.render(query, params),
    project = whisker.render('{{{project}}}', params),
    destination_table = whisker.render(destination_table, params),
    use_legacy_sql = FALSE
  )
}

# create an invocation query to provide sets of parameters to create_results_table
invocation_query <- '
  SELECT
    "<YOUR PROJECT HERE>" as project,
    "<YOUR DATASET_ID HERE>" as dataset_id,
    "<YOUR TABLE PREFIX HERE>" as table_prefix,
    num as year_low,
    num+1 as year_high
  FROM `bigquery-public-data.common_us.num_999999`
  WHERE num BETWEEN 1992 AND 1995
'

# call condusco's run_pipeline_gbq to iteratively run create_results_table
# over invocation_query's results
run_pipeline_gbq(
  create_results_table,
  invocation_query,
  project = '<YOUR PROJECT HERE>',
  use_legacy_sql = FALSE
)