Mass upload pictures for performance testing - exact-online

To test performance with many large binary objects, I need 1,000 or more document attachments to be available within Exact Online.
Loading them through the user interface is quite cumbersome: many clicks per document and there is no easy drag & drop. Email might be an option, but with 1,000 documents that is still a hassle.
How can I quickly load a large volume of document attachments into Exact Online?

The easiest way was to use the website lorempixel.com, which serves a new random picture with configurable dimensions on every unique hit. Throughput is approximately 1,000 KB of document attachments created per second.
set use-http-cache false /* Each httpget below must fetch a fresh picture instead of a cached one. */

use 868056 /* Set division. */

insert into exactonlinerest..documents
( subject
, type
)
select 'Sample document #' || value
, 101
from range(1000)#datadictionary /* Load 1,000 documents. */

/* Each document gets one additional attachment per run of this insert.
   Run it 10 times to load 10 document attachments per document. */
insert into exactonlinerest..DocumentAttachmentFiles
( attachment
, document
, filename
)
select httpget('http://lorempixel.com/400/400/?val=' || value) attachment
, dct.id
, value || '.jpg'
from range(1000, 1)#datadictionary
join exactonlinerest..documents dct
on dct.type = 101
and dct.subject like 'Sample document #%'
and dct.subject = 'Sample document #' || value
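
To verify the load, the attachments can be counted per document afterwards (a minimal sketch, reusing the table and column names from the inserts above):

select dct.subject
, count(*) attachments
from exactonlinerest..documents dct
join exactonlinerest..DocumentAttachmentFiles dae
on dae.document = dct.id
where dct.subject like 'Sample document #%'
group
by dct.subject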


Why am I getting a runtime error on my new computer but not my 2 older computers when running the same VBA macro?

The following Excel VBA macro code works on my laptop and my old desktop, but part of it no longer works on my new desktop, which raises "Runtime error 424: Object required" when it reaches the Low, High, Average and SI_score lines shown below. I've commented out the second line referring to the current price because it doesn't work either, but the current price is also shown at the top of the same webpage, and I can retrieve that value without a problem (call it a workaround for now). The other values I'm trying to retrieve are on the same webpage, but located in a separate panel beside the main body called Analysts' Price Targets.
Note that my Excel file is stored on OneDrive so that I can share the same file between all 3 computers. All 3 computers have the latest Windows 10 64-bit build, are signed into the same Office 365 subscription account, and use the same version of the latest Firefox browser. Please note that I have tested the same file on all 3 computers, not copies of the file, so Excel and this VBA macro should run the same on all devices. But something has caused or is causing the runtime error to occur only on my new computer, and only on the specific lines of code I have identified.
Here's an example webpage that the macro scrapes data values from...
https://ca.finance.yahoo.com/quote/ENB.TO/analysis?p=ENB.TO
...and enters the data into the "Instructions" worksheet of my workbook. In summary, my 2 older computers have no trouble scraping all 5 data values, plus the Now() value. But my new desktop can only scrape the "CurrentPrice" value (and the Now() value), which is on the same webpage as the other 4 values.
html.body.innerHTML = response
' Get the values from the specified elements on the page.
With Sheets("Instructions")
    .Range("CurrentPrice") = html.getElementsByClassName("Trsdu(0.3s) Fw(b) Fz(36px) Mb(-4px) D(ib)").Item(0).innerText
    ' .Range("CurrentPrice") = html.getElementsByClassName("Px(10px)")(0).getElementsByTagName("span")(1).innerText
    .Range("LowPrice") = html.getElementsByClassName("Px(10px)")(0).getElementsByTagName("span")(7).innerText
    .Range("HighPrice") = html.getElementsByClassName("Px(10px)")(0).getElementsByTagName("span")(9).innerText
    .Range("AvgPrice") = html.getElementsByClassName("Px(10px)")(0).getElementsByTagName("span")(3).innerText
    .Range("SI_score") = html.querySelector("[data-test='rec-rating-txt']").innerText ' Recommendation score
    Sheets("Instructions").Range("Date_PriceTargets") = Now()
End With
Any ideas as to why my new computer can no longer run through these specific lines of code?

Mass Upload Files To Specific Contacts Salesforce

I need to upload some 2000 documents to specific users in Salesforce. I have a CSV file that has the Salesforce-assigned ContactID, as well as a direct path to the files on my desktop. Each contact's specific file path has been included in the CSV. How can I upload them all at once and, especially, to the correct contact?
You indicated in the comments / chat that you want it as "Files".
The "Files" object is a bit more complex than Attachments; you'll need to do it in 2-3 steps. What you see as a File (you might see it referred to in documentation as Chatter Files or Salesforce Content) is actually several tables. There's
ContentDocument, which acts as a kind of file header (title, description, language, tags, linkage to many other areas in SF - because it can be standalone, it can be uploaded to a certain SF Content Library, it can be linked to Accounts, Contacts, $_GOD knows what else)
ContentVersion, which is, well, the actual payload. Only the most recent version is displayed out of the box, but if you really want you can go back in time
and more
The crap part is that you can't insert ContentDocument directly (there's no create() call in its list of operations).
Theory
So you'll need:
Insert ContentVersion (v1 will automatically create the parent ContentDocuments for you... it does sound a bit ass-backwards but it works). After this is done you'll have a bunch of standalone documents loaded, but not linked to any Contacts
Learn the Ids of their parent ContentDocuments
Insert ContentDocumentLink records that will connect Contacts and their PDFs
Practice
This is my C:\stacktest folder. It contains some SF cheat sheet PDFs.
Here's my file for part 1 of the load
Title PathOnClient VersionData
"Lightning Components CheatSheet" "C:\stacktest\SF_LightningComponents_cheatsheet_web.pdf" "C:\stacktest\SF_LightningComponents_cheatsheet_web.pdf"
"Process Automation CheatSheet" "C:\stacktest\SF_Process_Automation_cheatsheet_web.pdf" "C:\stacktest\SF_Process_Automation_cheatsheet_web.pdf"
"Admin CheatSheet" "C:\stacktest\SF_S1-Admin_cheatsheet_web.pdf" "C:\stacktest\SF_S1-Admin_cheatsheet_web.pdf"
"S1 CheatSheet" "C:\stacktest\SF_S1-Developer_cheatsheet_web.pdf" "C:\stacktest\SF_S1-Developer_cheatsheet_web.pdf"
Fire up Data Loader, select Insert, and tick the option to show all Salesforce objects. Find ContentVersion. The load should be straightforward (if you're hitting memory issues, set the batch size to something low, even 1 record at a time if really needed).
You'll get back a "success file", but it's useless here: we don't need the Ids of the generated content versions, we need their parents. Fire up "Export" in Data Loader, show all objects again, and pick ContentDocument. Use a query similar to this:
Select Id, Title, FileType, FileExtension
FROM ContentDocument
WHERE CreatedDate = TODAY AND CreatedBy.FirstName = 'Ethan'
You should see something like this:
"ID","TITLE","FILETYPE","FILEEXTENSION"
"0690g0000048G2MAAU","Lightning Components CheatSheet","PDF","pdf"
"0690g0000048G2NAAU","Process Automation CheatSheet","PDF","pdf"
"0690g0000048G2OAAU","Admin CheatSheet","PDF","pdf"
"0690g0000048G2PAAU","S1 CheatSheet","PDF","pdf"
Use Excel and the magic of VLOOKUP (or similar) to link them back by title to the Contacts. You wrote that you already have a file with Contact Ids and titles, so there's hope... Create a file like this:
ContentDocumentId LinkedEntityId ShareType Visibility
0690g0000048G2MAAU 0037000000TWREI V InternalUsers
0690g0000048G2NAAU 0030g000027rQ3z V InternalUsers
0690g0000048G2OAAU 0030g000027rQ3a V InternalUsers
0690g0000048G2PAAU 0030g000027rPz4 V InternalUsers
The 1st column is the file Id, then the contact Id, then some black magic you can read about & change if needed in the ContentDocumentLink docs.
Load it as an insert into (again, show all objects) ContentDocumentLink.
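To sanity-check the result, the new links can be queried back (a sketch; note that SOQL requires ContentDocumentLink to be filtered on ContentDocumentId or LinkedEntityId):

Select ContentDocumentId, LinkedEntityId, ShareType, Visibility
FROM ContentDocumentLink
WHERE LinkedEntityId IN ('0037000000TWREI', '0030g000027rQ3z')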
Woohoo! Beer time.
Your CSV should contain the following fields:
- ParentID = the Id of the object you want to link the attachment to (the ID of the contact)
- Name = the name of the file
- ContentType = the extension (.xls or .pdf or ...)
- OwnerId = if empty, I believe it takes your user as the owner
- Body = the location of the file on your machine (for instance: C:\SFDC\Files\test.pdf)
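An example row (with hypothetical values) could look like this:
ParentID Name ContentType Body
0030g000027rQ3z test.pdf .pdf C:\SFDC\Files\test.pdf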
Use this CSV to insert the records (via Data Loader) into the Attachment object.
You will then see, for each contact, that records have been added to the 'Notes & Attachments' related list.

How do I download only my purchase invoice documents from Exact Online with Invantive Query Tool?

To comply with regulations, I'm trying to download the purchase invoice documents (as PDF files) from some of my divisions and save them on disk for archiving purposes.
I use Invantive Query Tool to do this. I'd like to know which table to use and how to export only the attachments that belong to purchase invoice documents.
You can indeed do this by using the export options in Invantive Query Tool or Invantive Data Hub.
What you need is a query that hooks up the document information of type 20 (purchase invoices) with the actual attachment files. You can find a list of types and their descriptions in the DocumentTypes view. You can find the document attachment files in the DocumentAttachmentFiles table.
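For reference, the available document types and their descriptions can be listed first (a minimal sketch, assuming the view lives in the same exactonlinerest schema as the tables above):

select *
from exactonlinerest..DocumentTypes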
When you have hooked up the documents and their attachment files, you can export the documents from that query to disk using a local export documents statement.
The full query is here:
use 123456
select /*+ join_set(dae, document, 10000) */ attachmentfromurl
, dct.division || '/' || dae.id || '-' || filename
filepath
from exactonlinerest..documents dct
join DocumentAttachmentFiles dae
on dae.division = dct.division
and dae.document = dct.id
where dct.Type = 20
order
by dct.division
, dae.id
local export documents in attachmentfromurl to "c:\temp\docs" filename column Filepath
Make sure to set the division ID correctly in the use statement (this is the technical ID, not the 'division number', which can contain duplicates). You can find it in the top menu bar under Partitions. Or simply use 'use all' to get the documents from all divisions (this might take a while).
Also set the file path where it now says c:\temp\docs. Then press F5 in the Query Tool to execute, or run the script from Data Hub.

How can this query be optimized for speed?

This query creates an export for UPS from the deliveries history:
select 'key'
, ACC.Name
, CON.FullName
, CON.Phone
, ADR.AddressLine1
, ADR.AddressLine2
, ADR.AddressLine3
, ACC.Postcode
, ADR.City
, ADR.Country
, ACC.Code
, DEL.DeliveryNumber
, CON.Email
, case
when CON.Email is not null
then 'Y'
else 'N'
end
Ship_Not_Option
, 'Y' Ship_Not
, 'ABCDEFG' Description_Goods
, '1' numberofpkgs
, 'PP' billing
, 'CP' pkgstype
, 'ST' service
, '1' weight
, null Shippernr
from ExactOnlineREST..GoodsDeliveries del
join ExactOnlineREST..Accounts acc
on ACC.ID = del.DeliveryAccount
join ExactOnlineREST..Addresses ADR
on ADR.ID = DEL.DeliveryAddress
join ExactOnlineREST..Contacts CON
on CON.ID = DEL.DeliveryContact
where DeliveryDate between $P{P_SHIPDATE_FROM} and $P{P_SHIPDATE_TO}
order
by DEL.DeliveryNumber
It takes many minutes to run. The number of deliveries and accounts grows with several hundreds each day. Addresses and contacts are mostly 1:1 with accounts. How can this query be optimized for speed in Invantive Control for Excel?
This query is probably run at most once per day, since the delivery date does not contain a time component. Therefore, the number of rows selected from ExactOnlineREST..GoodsDeliveries is several hundred. Based upon the statistics given, the number of accounts, delivery addresses and contacts is also approximately several hundred.
Normally, such a query would be optimized by a solution such as the one in 'Exact Online query with joins runs more than 15 minutes', but that solution will not work here: the third value of a join_set(soe, orderid, 100) hint is the maximum number of rows on the left-hand side to be used with index joins. At this moment, the maximum number on the left-hand side is something like 125, based upon constraints on the URL length of OData requests to Exact Online. Remember that the actual OData query is a GET using a URL, not a POST with unlimited size for the filter.
The alternatives are:
Split volume
Data Cache
Data Replicator
Have SQL engine or Exact Online adapted :-)
Split Volume
In a separate query, select the eligible GoodsDeliveries and put them in an in-memory or database table using, for instance:
create or replace table gdy#inmemorystorage as select ... from ...
Then create a temporary table per 100 rows or so, such as:
create or replace table gdysubpartition1#inmemorystorage as select ... from ... where rowidx$ between 0 and 99
... etc. for 100, 200, 300, 400, 500
And then run the query several times, each time with a different gdysubpartition1..gdysubpartition5 table instead of the original ExactOnlineREST..GoodsDeliveries. Put together, the partitioning could look like the sketch below.
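A sketch only; it assumes rowidx$ numbers the rows in the in-memory copy, and reuses the date parameters from the original query:

create or replace table gdy#inmemorystorage
as
select *
from ExactOnlineREST..GoodsDeliveries
where DeliveryDate between $P{P_SHIPDATE_FROM} and $P{P_SHIPDATE_TO}

create or replace table gdysubpartition1#inmemorystorage
as
select *
from gdy#inmemorystorage
where rowidx$ between 0 and 99

/* Repeat for 100-199, 200-299, and so on, then run the export query once per partition table. */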
Of course, you can also avoid the use of intermediate tables by using an inline view like:
from (select * from goodsdeliveries where date... limit 100)
or similar.
Data Cache
When you run the query multiple times per day (unlikely, but I don't know), you might want to cache the Accounts in a relational database and update them every day.
You can also use 'local memorize results clipboard' and 'local save results clipboard to ...' to save the last results to a file manually, and later restore them using 'local load results clipboard from ...' and 'local insert results clipboard in table ...'. And maybe then an 'insert into ... select * from exactonlinerest..accounts where datecreated > trunc(sysdate)' to add the rows created since.
Data Replicator
With Data Replicator enabled, you can have replicas created and maintained automatically within an on-premises or cloud relational database for Exact Online API entities. For low latency, you will need to enable the Exact Online webhooks.
Have SQL Engine or Exact adapted
You can also register a request to have the SQL engine allow a higher number in the join_set hint, which would require addressing the Exact Online APIs in another way. Or register a request at Exact to also allow POST requests to the API with the filter in the body.

Comparison of sets in MySQL

I have a challenge with the following database structure:
HEADER table called 'DOC' containing document details, among which is the document ID
DETAIL table called 'DOC_SET' containing data related to the document.
The header table contains approximately 16,000 records. The detail table contains on average 75 records per header record (1.2 million records in total).
I have one source document and its related set (the source set). I'd like to compare this source set to the sets of the other documents (which I refer to as the destination documents and sets). Through my application I have the list of IDs of the source set available, and therefore also its length (shown in the example below as a list of 46 elements), which I can use directly in the query.
What I need per destination document is the length of the intersection (the number of shared elements) of the source and destination sets, and the length of the difference (the number of elements that are in the source set but not in the destination set), for display. I also need a filter that retrieves only the records for which the intersection covers at least 75% of the source set.
Currently I have a query which does this using subselects containing expressions, but it is utterly slow, and the results need to be available at page refresh in a web application. The point is that I only need to display about 20 results at a time, but when sorting on the calculated fields I need to calculate every destination record before I can sort and paginate.
The query is something like this:
select
DOC.id,
calc_subquery._calcSetIntersection,
calc_subquery._calcSetDifference
from
DOC
inner join
(
select
DOC.id as document_id,
(
select
count(*)
from
DOC_SET
where
DOC_SET.doc_id = DOC.id and
DOC_SET.element_id in (60,114,130,187,267,394,421,424,426,603,604,814,909,1035,1142,1223,1314,1556,2349,2512,4953,5134,6318,6339,6344,6455,6528,6601,6688,6704,6705,6731,6894,6895,7033,7088,7103,7119,7129,7132,7133,7137,7154,7159,7188,7201)
) as _calcSetIntersection
,46-(
select
count(*)
from
DOC_SET
where
DOC_SET.doc_id = DOC.id and
DOC_SET.element_id in (60,114,130,187,267,394,421,424,426,603,604,814,909,1035,1142,1223,1314,1556,2349,2512,4953,5134,6318,6339,6344,6455,6528,6601,6688,6704,6705,6731,6894,6895,7033,7088,7103,7119,7129,7132,7133,7137,7154,7159,7188,7201)
) as _calcSetDifference
from
DOC
where
DOC.id = 2599
) as calc_subquery
on
DOC.id = calc_subquery.document_id
where
DOC.id = 2599 and
_calcSetIntersection / 46 > 0.75;
I'm wondering:
whether this is possible in < 100 msec or so on an average-spec server running MySQL fully in memory (24 GB);
whether I should use a better-suited solution for this, perhaps a NoSQL solution;
whether I should use some sort of temporary table or cache containing the calculated values. This is an issue for me, as the source set of IDs might change between queries, and then the whole thing needs to be calculated again.
Anyway, some thoughts or solutions are really appreciated.
Kind regards,
Eric
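
For what it's worth, a single pass over DOC_SET can compute both numbers for all destination documents at once, avoiding the correlated subselects (a sketch, using the same 46-element source set as in the query above):

select
  DOC_SET.doc_id,
  count(*) as _calcSetIntersection,
  46 - count(*) as _calcSetDifference
from
  DOC_SET
where
  DOC_SET.element_id in (60,114,130,187,267,394,421,424,426,603,604,814,909,1035,1142,1223,1314,1556,2349,2512,4953,5134,6318,6339,6344,6455,6528,6601,6688,6704,6705,6731,6894,6895,7033,7088,7103,7119,7129,7132,7133,7137,7154,7159,7188,7201)
group by
  DOC_SET.doc_id
having
  count(*) / 46 > 0.75;

A composite index on DOC_SET (element_id, doc_id) would let this run as an index-only scan over just the 46 source elements.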