Foundry writebacks - is it possible to restore an edited record to its unedited version (BaseVersion) - palantir-foundry

Palantir-Foundry - We have a workflow that needs updates from the backing dataset of an object with a writeback to persist in the writeback, but this fails on rows that have previously been edited. Due to the "Edits-win" model, the writeback will always choose the edited version of the row, which makes sense. Short of re-architecting the entire app, I am looking into ways to take care of this by using the Foundry REST API.
Is it possible to revert an edited row in Foundry writebacks to the original unedited version? I found some API documentation in our instance for phonograph2 BaseVersion, but I have not been able to find/understand anything that would restore a row to BaseVersion. I would need to be able to do this from a functions repository using TypeScript, on certain events.

One way to overwrite the edits with the values from the backing dataset is to build a transform off of the backing dataset that makes a new, identical dataset. Then you can use the new dataset as the backing dataset for a new object.
Transform using a simple code repo:
from transforms.api import transform_df, Input, Output

@transform_df(
    Output(".../static_guests"),
    source_df=Input("<backing dataset RID>"),
)
def compute(source_df):
    return source_df
You can then build up the ontology with a static object type that will always equal the copied backing dataset.
Then create an action that will modify your edited object (in my example that is Test Guest) by reverting a value to equal a value in the static object type.
You can then use the Apply Action API to automatically apply this action to certain values on a schedule or based on a certain condition. Documentation for the API is here.
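The Apply Action call itself is just an authenticated HTTPS POST, so it can be issued from a TypeScript function's HTTP client, a script, or a scheduler. The sketch below only illustrates the request shape; the host, ontology RID, action API name, parameter names, and token handling are all placeholders that you should verify against the Apply Action documentation in your own instance:
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Threading.Tasks;

class ApplyActionSketch
{
    static async Task Main()
    {
        // Placeholders - confirm the real route and payload in your instance's API docs.
        var host = "https://your-foundry-host";
        var ontologyRid = "ri.ontology.main.ontology.your-ontology";
        var actionApiName = "revert-test-guest-to-static-value";
        var token = Environment.GetEnvironmentVariable("FOUNDRY_TOKEN");

        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Bearer", token);

            // The action parameters identify the edited object to revert (illustrative names).
            var body = "{\"parameters\": {\"guestId\": \"guest-12345\"}}";
            var url = host + "/api/v1/ontologies/" + ontologyRid + "/actions/" + actionApiName + "/apply";

            var response = await client.PostAsync(url, new StringContent(body, Encoding.UTF8, "application/json"));
            response.EnsureSuccessStatusCode();
        }
    }
}
The same POST, with the same body, can be made from a TypeScript function's fetch call; only the auth token handling differs.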

Related

Access CSV Data Set Config variables in backend listener

I'm trying to fetch my variables from CSV Data config and add them to my backend listener in a distributed testing environment like this. FYI, it works on my local machine.
Here is my test plan (screenshot).
CSV Data Set Config (screenshot).
My csv looks like this:
SELECT count(*) FROM github_events;simpleQuery
SELECT count(*) FROM github_events;medium
SELECT count(*) FROM github_events;complexQuery
SELECT count(*) FROM github_events;simpleQuery
Backend Listener (screenshot).
I'm setting the CSV config variables in the beanshell pre-processor like this:
props.put("query", "${QUERY}");
props.put("query_type", "${QUERY_TYPE}");
and that's why I have the ${__P(query)} ${__P(query_type)} in the backend listener.
The goal is to grab the QUERY and QUERY_TYPE from the CSV data config and send it to the backend listener.
Any help would be appreciated. Let me know if I need to add more info on here. Thank you!
Solution:
How I got this to work... kind of hacky but it'll work for what I need:
I created a JSR223 Postprocessor on my JDBC Request and added the following code:
import groovy.json.*

def my_query = vars.get("QUERY")
def my_query_type = vars.get("QUERY_TYPE")
def json = JsonOutput.toJson([myQuery: my_query, myQueryType: my_query_type])
// json is already a JSON string, so pretty-print it directly rather than encoding it a second time
prev.setSamplerData(JsonOutput.prettyPrint(json))
This won't work if you need whatever is in your response data, but in my case it was okay to replace it. By the way, this only works with my distributed test; to make it work locally, use prev.setResponseData instead. Hope this helps someone.
I don't think you can. As of JMeter 5.4.1, all fields of the Backend Listener are populated in the "testStarted" phase, and the same applies to your custom listener.
This means that the JMeter Variables originating from the CSV Data Set Config don't exist yet at the time the Backend Listener is being initialized, so your reference to JMeter Properties returns the default value of 1 because no such properties have been set.
If you're looking to dynamically send metrics to Azure, you will need to replicate the code from the Azure Backend Listener in a JSR223 Listener using the Groovy language.
The only way this could work on your local machine is:
You run your test plan in GUI mode the 1st time - it fails, but it sets the properties.
You run your test plan in GUI mode the 2nd time - it passes, but uses the last values of the properties.
etc.

How do I retrieve the json representation of an azure data factory pipeline?

I want to track pipeline changes in source control, and I'm looking for a way to programmatically retrieve the json representation from the ADF.
The .Net routines return the objects, but sadly ToString() does not return json (wouldn't THAT be convenient?), so right now I'm looking at copying the json down by hand (shoot me now!), or possibly trying to recreate the json from the .Net objects (shoot me later!).
Please tell me I'm being dense and there is an obvious way to do this.
You can serialize the object using Newtonsoft Json.
See https://azure.microsoft.com/en-us/documentation/articles/data-factory-create-data-factories-programmatically/ for how to connect via the ADF SDK:
var aadTokenCredentials = new TokenCloudCredentials(ConfigurationManager.AppSettings["SubscriptionId"], GetAuthorizationHeader());
var resourceManagerUri = new Uri(ConfigurationManager.AppSettings["ResourceManagerEndpoint"]);
var manager = new DataFactoryManagementClient(aadTokenCredentials, resourceManagerUri);
var pipeline = manager.Pipelines.Get(resourceGroupName, dataFactoryName, pipelineName);
var pipelineAsJson = JsonConvert.SerializeObject(pipeline.Pipeline, Formatting.Indented);
I was expecting something more complex, but looking at the SDK source on GitHub, it is not doing anything special.
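Since the goal is tracking the pipelines in source control, you can then simply write that string out to a file in your repo and commit it. A minimal follow-on sketch, continuing from the snippet above (the output folder is illustrative):
using System.IO;

// Write the serialized pipeline definition into a source-controlled folder (path is illustrative).
var outputPath = Path.Combine(@"C:\src\adf-definitions", pipelineName + ".json");
Directory.CreateDirectory(Path.GetDirectoryName(outputPath));
File.WriteAllText(outputPath, pipelineAsJson);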
Our team has a deployment tool that takes git changes and deploys them appropriately. Everything is done asynchronously and is controlled and versioned through git.
In a nutshell, our deployment has the following flow:
Any completed git merge request triggers a VSO build. This simply builds the whole solution via MSBuild.
Every successful build is given a Git tag for tracking of the Last Known Good.
Next (if the build succeeded), our .NET ADFPublisher takes only the changed data factory files and asynchronously publishes them based on their git operation (modified, added, deleted, etc.).
For some failure cases our ADFPublisher will perform a retry.
This whole process (build + publish) takes ~65 seconds and has already saved us from several bugs. It also allows us to move definitions from one environment to another very easily.
Let me know if this is something you would be interested in and I will set up a way to share it with you.

SSIS SQL Task Map Result Set to Project Parameter

I am implementing a custom auditing framework, logging ETL events such as start, end, error, insertrows etc.
As well as logging at a package level, I'm implementing "session logging" where a sequence of package executions, i.e. a controller package that executes several packages, is a session. In order to keep track of the "session", the stored procedures always return a SessionLogID.
I was hoping I could map this result set to a project parameter, as otherwise I will have to save it to a user variable and then pass it around between packages via parameters. This will mean every single package will have a Package Parameter and User Variable called SessionLogID. I don't want to do this if I don't need to.
Open to other suggestions.
Thanks,
Adam
Parameters cannot change at runtime. They are a set-once kind of deal, whereas variables can change at any time. You can set the variable once in the parent package and map it to the child package using a parameter.
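For illustration only: assuming the parent package binds its User::SessionLogID variable to a child package parameter named SessionLogID through the Execute Package Task's parameter bindings, a Script Task in the child package could read it like this (the names and the int type are assumptions to adjust to your framework):
// Main method of a Script Task's ScriptMain class in the child package (SSIS 2012+).
// Add $Package::SessionLogID to the task's ReadOnlyVariables collection first.
public void Main()
{
    // Read the parameter that the parent package bound to its User::SessionLogID variable.
    int sessionLogId = Convert.ToInt32(Dts.Variables["$Package::SessionLogID"].Value);

    // ... pass sessionLogId to your auditing stored procedures ...
    bool fireAgain = true;
    Dts.Events.FireInformation(0, "Audit", "SessionLogID = " + sessionLogId, string.Empty, 0, ref fireAgain);

    Dts.TaskResult = (int)ScriptResults.Success;
}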

SSIS best way to load configuration to custom script

I am using SSIS 2012 and I need to figure out the best way to load multiple configuration files to be used in a custom script.
This is the way it goes:
I need to use a custom script to access a NoSQL database.
In this case, the NoSQL database has no rigid schema, therefore the attributes change from document to document.
I want to use configuration files to tell how the columns are supposed to be renamed, and to configure other basic rules there.
The above task is easily done in C#; however, if possible I would like to read the configuration files using an SSIS component (to read a flat file, Excel file or database rules). Therefore I want to know how I can feed the custom script with the data from the stream: the script consumes the stream (the stream contains the configuration), and after consuming the entire stream, the script component generates rows.
An example case would be:
the script reads an entire stream of numbers,
the script orders the numbers in the stream,
the script discards duplicates,
and outputs the ordered sequence of numbers without duplicates.
If I understood correctly, the NoSQL database and configuration files are just background to the problem, and what you really need is an asynchronous script component that reads everything from the pipeline, then does something, and finally sends the results back to the pipeline?
If so, then what you need is to create a script component with its output buffer's SynchronousInputID property set to None.
The example of the numbers to be deduped and sorted that you posted could then be solved along the following lines (assume you create an output column called "numberout" in the script component's output buffer, and that the output buffer property SynchronousInputID is set to None):
// Inside the Script Component's ScriptMain class (Output 0 is asynchronous, i.e. SynchronousInputID = None).
// Add "using System.Collections.Generic;" at the top of the script file.
private List<int> numbers;

public override void PreExecute()
{
    base.PreExecute();
    // Create the collection that will hold every number read from the pipeline
    numbers = new List<int>();
}

public override void PostExecute()
{
    base.PostExecute();
    // Sort and dedupe, then emit each remaining number to the asynchronous output
    List<int> result = new List<int>(new HashSet<int>(numbers));
    result.Sort();
    foreach (int n in result)
    {
        Output0Buffer.AddRow();
        Output0Buffer.numberout = n;
    }
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Buffer each incoming value; "Number" is whatever your input column is called
    numbers.Add(Row.Number);
}

SSIS: How to read WebSphere MQ, transform, and write to flat file?

I have data on a WebSphere MQ queue. I've written a script task to read the data, and I can output it to a variable or a text file. But I want to use that as input to a dataflow step and transform the data. The ultimate destination is a flat file.
Is there a way to read the variable as a source into a dataflow step? I could write the MQ data to a text file, and read the text file in the dataflow, but that seems like a lot of overhead. Or I could skip the dataflow altogether and write all the transformations in a script (but then why bother with SSIS in the first place?)
Is there a way to write a Raw File out of the script step, to pass into the dataflow component?
Any ideas appreciated!
If you've got the script that consumes the webservice, you can skip all the intermediary outputs and simply use it as a source in your dataflow.
Drag a Data Flow Task onto the canvas and then add a Script Component. Instead of selecting Transformation (last option), select Source.
Double-click on the Script Component and choose the Input and Output Properties. Under Output 0, select Output Columns and click Add Column for however many columns the web service has. Name them appropriately and be certain to correctly define their metadata.
Once the columns are defined, click back to the Script tab, select your language and edit the script. Take all of your existing code that consumes the service and we'll use it here.
In the CreateNewOutputRows method, you will need to iterate through the results of the Websphere MQ request. For each row that is returned, you would apply the following pattern.
public override void CreateNewOutputRows()
{
// TODO: Add code here or in the PreExecute to fill the iterable object, mqcollection
foreach (var row in mqcollection)
{
// Adds a new row into the downstream buffer
Output0Buffer.AddRow();
// Assign all the data to the correct locations
Output0Buffer.Column = row.Column;
Output0Buffer.Column1 = row.Column1;
// handle nulls appropriately
if (string.IsNullOrEmpty(row.Column2))
{
Output0Buffer.Column2_IsNull = true;
}
else
{
Output0Buffer.Column2 = row.Column2;
}
}
}
You must handle nulls via the _IsNull attribute or your script will blow up. It's tedious work versus a normal source but you'll be far more efficient, faster and consume fewer resources than dumping to disk or some other staging mechanism.
Since I ran into some additional "gotchas", I thought I'd post my final solution.
The script I am using does not call a webservice, but directly connects and reads the WebSphere queue. However, in order to do this, I have to add a reference to amqmdnet.dll.
You can add a reference to a Script Task (which sits on the Control Flow canvas), but not to a Script Component (which is part of the Data Flow).
So I have a Script Task, with reference and code to read the contents of the queue. Each line in the queue is just a fixed width record, and each is added to a List. At the end, the List is put into a Read/Write object variable declared at the package level.
The Script feeds into a Data Flow task. The first component of the Data Flow is a Script Component, created as a Source, as billinkc describes above. This script casts the object variable back to a list, then parses each item in the list into fields in the Output Buffer (a sketch of this step follows below).
Various split and transform tasks take over from there.
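As a rough sketch of that cast-and-parse step: assuming the Script Task stored the queue lines in a package-level object variable named User::MqMessages, and assuming two illustrative fixed-width fields (an ID in the first 10 characters and a name in the next 30) mapped to output columns Id and Name, the source Script Component's CreateNewOutputRows could look roughly like this. All names, offsets and widths are placeholders for your actual record layout:
// Source Script Component: add User::MqMessages to ReadOnlyVariables and define
// output columns Id and Name on Output 0. All names, offsets and widths are illustrative.
// (Requires "using System.Collections.Generic;" at the top of the script.)
public override void CreateNewOutputRows()
{
    // Cast the package-level object variable back to the list the Script Task populated
    List<string> lines = (List<string>)Variables.MqMessages;

    foreach (string line in lines)
    {
        // Parse the fixed-width record into the output buffer
        Output0Buffer.AddRow();
        Output0Buffer.Id = line.Substring(0, 10).Trim();
        Output0Buffer.Name = line.Substring(10, 30).Trim();
    }
}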
Try using the Q program available in the MA01 MQ supportpac instead of your script.