Recommendation for storing and querying Data Factory run logs?

I'd like to store and query the OUTPUT and ERROR data generated during a Data Factory run. The data is returned when calling Get-AzDataFactoryV2ActivityRun.
The intention is to use it to monitor pipeline execution errors, durations, etc. in an easy and fast way.
The data resembles JSON. It would be nice to visualize a summary of each execution through some HTML. Should I store this log in MongoDB?
Is there an easier and better way to centralize the log info from multiple executions of different pipelines?
ResourceGroupName : Test
DataFactoryName : DFTest
ActivityRunId : 00000000-0000-0000-0000-000000000000
ActivityName : If Condition1
PipelineRunId : 00000000-0000-0000-0000-000000000000
PipelineName : Test
Input : {}
Output : {}
LinkedServiceName :
ActivityRunStart : 03/07/2019 11:27:21
ActivityRunEnd : 03/07/2019 11:27:21
DurationInMs : 000
Status : Succeeded
Error : {errorCode, message, failureType, target}
Activity 'Output' section:
"firstRow": {
"col1": 1
}
"effectiveIntegrationRuntime": "DefaultIntegrationRuntime (West Europe)"

This is probably not the best way to monitor your ADF pipelines.
Have you considered using Azure Monitor?
Find out more:
- https://learn.microsoft.com/en-us/azure/data-factory/monitor-using-azure-monitor
- https://learn.microsoft.com/en-us/azure/azure-monitor/visualizations
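If you do still want to keep the raw cmdlet output yourself (for example to feed a small HTML summary), a minimal PowerShell sketch along these lines could work; the pipeline run ID variable, the time window and the output file name are placeholders, and the property names are taken from the cmdlet output shown in the question:

# Fetch the activity runs for one pipeline run and persist them as JSON for later querying.
$runs = Get-AzDataFactoryV2ActivityRun -ResourceGroupName "Test" -DataFactoryName "DFTest" `
    -PipelineRunId $pipelineRunId -RunStartedAfter (Get-Date).AddDays(-1) -RunStartedBefore (Get-Date)
$runs |
    Select-Object PipelineName, ActivityName, Status, DurationInMs, ActivityRunStart, ActivityRunEnd, Error |
    ConvertTo-Json -Depth 10 |
    Out-File "activity-runs.json"

From there any document store (MongoDB, Cosmos DB) can index the JSON, but Azure Monitor with diagnostic settings saves you from writing that plumbing yourself.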

OneNote Api - copyToNotebook hanging

The OneNote-Api recently started to hang on this call:
https://www.onenote.com/api/beta/me/notes/sections/{id}/copyToNotebook
Polling the result (as always) now returns the following:
{
"#odata.context": "https://www.onenote.com/api/beta/$metadata#me/notes/operations/$entity",
"id": "copy-645387ea-eb06-4a0d-bcde-09d276e4e3d6fe0e14f6-3e53-421e-aa6c-8adcc998a4dd",
"status": "not started",
"createdDateTime": "2017-10-04T16:57:45.9599909Z",
"lastActionDateTime": "2017-10-04T16:57:45.9599909Z"
}
The lastActionDateTime never updates and the command doesn't complete despite returning the correct 202 code and subsequent 200 codes.
Any help would be appreciated (especially in a live working environment)!
You are calling the API correctly - we introduced a problem in our service a couple of days ago. This should be fixed now.
Thanks for reporting this. Feel free to report anything to us at: https://twitter.com/onenotedev?lang=en.
I performed the same operation and got back the same response. When checking my account to verify the copy worked, I get back an error.
The JSON object returned is below. The copied file seems to be corrupted on OneNote's side.
{
  "Responses": [
    [
      6,
      {
        "AvailableFileAccess": 1,
        "CellId": "40c4a0be-3ff1-49c7-b169-ba9d74e0724c|1",
        "ContentBytes": 0,
        "ContextId": "null",
        "FileId": "WOPIsrc=https%3A%2F%2Fwopi%2Eonedrive%2Ecom%2Fwopi%2Ffiles%2F1438AEF7B187116%21122&access_token=4wfG%2D4rgaZ6xmefLfMpPpnlTOOxmBy0LNKwftRv4WQeE1YRkcD72ADoSfR%2DC2ZxSi11DuWIEGx0iSqZb8JP88aA36k2o8KKqF1hPyoFTblc3TyLt6k65eXe%2DL7QEcINnMhnvAA9aeuP5W2ttwIYE6dDJQVh9xkv5JcUndBG1d%5F3Ldp3%2DlcE3gNO7IEZtvzf7B0mUkKoerjtJr3OBKxsQzHx1PRCfh99BtCNPNvVVAq91thnpmeuVOATVGgWlMWcHTt29l9a8%2DrbHa3jknZWee6F6DBxU%5FzgW7YpWbZ4LtW1zOx33SQVm3XjRQ628TsgAV7%5Fy%2DJ4IvxCnGMVhpwvC%2DXLGnP35DAJW7LuetWKJ93B%5Fs&access_token_ttl=1510611198523",
        "OperationId": 1,
        "RawCellStorageErrorCode": "InternalError.4",
        "RevisionList": [],
        "RootCellId": "null",
        "ServerPageStatsTrace": "",
        "StatusCode": 126
      }
    ]
  ]
}

Is there a way to create a logic condition in packer provisioner?

I am trying to add a condition so my volume_size can get two different values depending on what is passed in "-var role=".
I've tried something like this:
"volume_size": [{
"Fn::If" : [
".ami_id_bar",
{"foo" : "50"},
{"foo" : "20"}
]
}],
.ami_id_bar is set from:
"environment_vars": [
"ami_id_bar={{user `role`}}"
],
which gets its value from the command line when executing Packer.
This is the error I get:
error(s) decoding:
'launch_block_device_mappings[0].volume_size' expected type 'int64', got unconvertible type '[]interface {}'
Is it impossible, or what am I doing wrong?
Thank you in advance!
No, you can't. Packer templates are logicless. The standard way of achieving what you want is to use -var-file, where each vars file represents a role, e.g.
packer build -var-file=role-A.json template.json
In more complex cases, we recommend that you preprocess your template and other files and wrap packer in a build script such as make.
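For example, a role-specific vars file and the corresponding template reference might look like this (a sketch; the variable name volume_size, the file names and the device name are illustrative, not taken from the question):

role-A.json:
{
  "volume_size": "50"
}

template.json (excerpt):
"variables": {
  "volume_size": "20"
},
"launch_block_device_mappings": [{
  "device_name": "/dev/sda1",
  "volume_size": "{{user `volume_size`}}"
}]

Packer's weakly typed decoding should convert the interpolated string into the int64 the builder expects, so each role's file simply carries its own size.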

REST POST JSON body to support complex queries - advice

I'm fairly new to REST. All of our legacy web services were SOAP-based with enterprise (Oracle or DB2) databases. We are now moving to REST/Couchbase.
Our team is looking into implementing a complex query method. We have already implemented simple query methods using GET; for example, GET returns all entries and GET /067e6162-3b6f-4ae2-a171-2470b63dff00 returns the entry for 067e6162-3b6f-4ae2-a171-2470b63dff00.
We want to support a query method that accepts several query parameters, such as a list of IDs and date ranges. The number of IDs can run into a few thousand, and because of this we realize we cannot pass these query parameters on a GET request, since there are limits on URL and header size.
We are starting to look into passing our query parameters in the JSON body of a POST request. For example, we could have the client pass in a few thousand IDs as an array and also pass in a date range, so each query param/filter would be an object and the JSON body would carry an array of such objects. For example:
{
  "action": "search",
  "queryParameters": [
    {
      "operation": "in",
      "key": "name.of.attribute.Id",
      "value": [ { "id": "067e6162-3b6f-4ae2-a171-2470b63dff00" }, { "id": "next id" } ]
    },
    {
      "operation": "greater",
      "key": "name.of.attribute",
      "value": "8/20/2016"
    },
    {
      "operation": "less",
      "key": "name.of.attribute",
      "value": "8/31/2016"
    }
  ]
}
The back-end code would then receive the POST and read the body. It would see that the action is a search and then look for any entries that are in the list of IDs and fall within the date range > 8/20/2016 and < 8/31/2016.
I've been trying to look online for tips/best practices on how best to structure the JSON body for complex queries but have not found much. So any tips, guidance or advice would be greatly appreciated.
Thanks.

Difference Between Two Mongo Queries

What is the difference between these two mongo queries?
db.test.find({"field" : "Value"})
db.test.find({field : "Value"})
The mongo shell accepts both.
There is no difference in your example.
The problem arises when your field names contain characters which cannot be part of an identifier in JavaScript (because the query is evaluated in a JavaScript REPL/shell), for example user-name, because there is a hyphen in it.
Then you would have to query like this: db.test.find({"user-name" : "Value"})
For the mongo shell there is no actual difference, but in some other languages it does matter.
The real point here is presenting what is valid JSON; as a personal example, I try to do this in responses on this forum and others, since JSON is a data format that can easily be "parsed" into native data structures, whereas the alternate "JavaScript" notation may not translate so easily.
There are certain cases where the quoting is required, as in:
db.test.find({ "field-value": 1 })
or:
db.test.find({ "field.value": 1 })
As the keys would otherwise be invalid JavaScript identifiers.
But the real point here is adhering to the JSON form.
You can understand this with an example: suppose you have a test collection with two documents:
{
'_id': ObjectId("5370a826fc55bb23128b4568"),
'name': 'nanhe'
}
{
'_id': ObjectId("5370a75bfc55bb23128b4567"),
'your name': 'nanhe'
}
db.test.find({'your name':'nanhe'});
{ "_id" : ObjectId("5370a75bfc55bb23128b4567"), "your name" : "nanhe" }
db.test.find({your name:'nanhe'});
SyntaxError: Unexpected identifier

MongoDB vs MySQL Performance - Simple Query

I am doing a comparison of MongoDB with respect to MySQL and have imported the MySQL data into a MongoDB collection (>500,000 records).
The collection looks like this:
{
"_id" : ObjectId(""),
"idSequence" : ,
"TestNumber" : ,
"TestName" : "",
"S1" : ,
"S2" : ,
"Slottxt" : "",
"DUT" : ,
"DUTtxt" : "",
"DUTver" : "",
"Voltage" : ,
"Temperature" : ,
"Rate" : ,
"ParamX" : "",
"ParamY" : "",
"Result" : ,
"TimeStart" : new Date(""),
"TimeStop" : new Date(""),
"Operator" : "",
"ErrorNumber" : ,
"ErrorText" : "",
"Comments" : "",
"Pos" : ,
"SVNURL" : "",
"SVNRev" : ,
"Valid" :
}
When comparing the queries (which both return 15 records):
mysql -> SELECT TestNumber FROM db WHERE Valid=0 AND DUT=68 GROUP BY TestNumber
with
mongodb -> db.results.distinct("TestNumber", {Valid:0, DUT:68}).sort()
The results are equivalent, but it takes in the region of 17 seconds from MongoDB, compared with 0.03 seconds from MySQL.
I appreciate that it is difficult to make a comparison between the two database architectures, and I further appreciate that one of the skills of MongoDB administration is to organise the data structure accordingly (so it is not a fair test to just import the MySQL structure). Ref: MySQL vs MongoDB 1000 reads
But the difference in response time is too great to be just a tuning issue.
My (default) MongoDB log file reads:
Wed Mar 05 04:56:36.415 [conn4089] command NTV_Results.$cmd command: { distinct: "results", key: "TestNumber", query: { Valid: 0.0, DUT: 68.0 } } ntoreturn:1 keyUpdates:0 numYields: 6 locks(micros) r:21764672 reslen:250 16525ms
I have also tried the query:
db.results.group( {
key: { "TestNumber": 1 },
cond: {"Valid": 0, "DUT": 68 },
reduce: function ( curr, result ) { },
initial: { }
} )
This gives similar (17-second) results. Any clues as to what I am doing wrong?
Both services are running on the same octo-core i7 3770 desktop PC with Windows 7 and 16 GB RAM.
There can be many reasons for slow performance, most of which involve too much detail to go into here, but I can offer you a "starter pack", as it were.
Creating indexes on your Valid and DUT fields is going to improve results for these and other queries. Consider this compound form for this case, using the ensureIndex command:
db.collection.ensureIndex({ "Valid": 1, "DUT": 1})
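To confirm that the new index is actually being used for your filter, you could check the query plan from the shell (a sketch; the exact output fields vary by MongoDB version):
db.results.find({ "Valid": 0, "DUT": 68 }).explain()
Look for a BtreeCursor / IXSCAN on Valid_1_DUT_1 rather than a full collection scan.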
Also the use of aggregate is recommended for these types of operations:
db.collection.aggregate([
{$match: { "Valid": 0, "DUT": 68 }},
{$group: { _id: "$TestNumber" }}
])
This should be the equivalent of the SQL you are referring to.
There is a SQL to Aggregation Mapping Chart that may give you some assistance with the thinking. It is also worth familiarizing yourself with the different aggregation operators in order to write effective queries.
I have spent many years writing very complex SQL for advanced tasks, and I find the aggregation framework a breath of fresh air for various problem-solving cases. It is worth your time to learn.
Also worth noting: your "default" MongoDB log file is reporting those operations because they are considered to be "slow queries" and are brought to your attention by default. You can see more or less information, as you require, by tuning the database profiler to meet your needs.
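For example, from the shell you can change the profiling level and the slow-operation threshold, and then inspect what was captured (the 100 ms threshold is only an illustration):
db.setProfilingLevel(1, 100)   // profile operations slower than 100 ms
db.system.profile.find().sort({ ts: -1 }).limit(5)   // most recent profiled operations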