I am using SSIS 2012 and I need to figure out the best way to load multiple configuration files to be used in a custom script.
This is the way it goes:
I need to use a custom script to access a NoSQL database.
In this case, the NoSQL database has no rigid schema, so the attributes change from document to document.
I want to use configuration files to specify how the columns should be renamed and to configure other basic rules there.
The above task is easily done in C#. However, if possible I would like to read the configuration files using an SSIS component (to read a flat file, Excel file or database rules). Therefore I want to know how I can feed the custom script with the data from the stream: the script consumes the stream (which contains the configuration), and after consuming the entire stream, the script component generates rows.
An example case would be:
the script reads an entire stream of numbers
the script orders the numbers in the stream
the script discards duplicates
the script outputs the ordered sequence of numbers without duplicates
If I understood correctly, the NoSQL database and configuration files are just the background of the problem, and what you really need is an asynchronous
script component that reads everything from the pipeline, does something, and finally sends the results back to the pipeline?
If so, then what you need is to create a script component with its output buffer's SynchronousInputID property set to None.
The example you posted, with numbers to be deduped and sorted, could then be solved with code along the following lines
(assume you create an output column in the output buffer of the script component called "numberout",
and that the output buffer's SynchronousInputID property is set to None):
// Requires: using System.Collections.Generic; using System.Linq;
private List<int> numbers;

public override void PreExecute()
{
    base.PreExecute();
    // Create the list that will hold the incoming numbers
    numbers = new List<int>();
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    // Buffer each incoming number (Row.Number is whatever your input column is called)
    numbers.Add(Row.Number);
}

public override void PostExecute()
{
    base.PostExecute();
    // Sort, dedupe and emit the results to the asynchronous output
    foreach (int n in numbers.Distinct().OrderBy(x => x))
    {
        Output0Buffer.AddRow();
        Output0Buffer.numberout = n;
    }
}
Palantir Foundry - We have a workflow that needs updates from the backing dataset of an object with a writeback to persist in the writeback, but this fails on rows that have previously been edited. Due to the "edits-win" model, the writeback will always choose the edited version of the row, which makes sense. Short of re-architecting the entire app, I am looking into ways to take care of this by using the Foundry REST API.
Is it possible to revert an edited row in Foundry writebacks to the original unedited version? I found some API documentation in our instance for phonograph2 BaseVersion, but I have not been able to find/understand anything that would restore a row to BaseVersion. I would need to be able to do this from a functions repository using TypeScript, on certain events.
One way to overwrite the edits with the values from the backing dataset is to build a transform off of the backing dataset that makes a new, identical dataset. Then you can use the new dataset as the backing dataset for a new object.
Transform using a simple code repo:
from transforms.api import transform_df, Input, Output

@transform_df(
    Output(".../static_guests"),
    source_df=Input("<backing dataset RID>"),
)
def compute(source_df):
    return source_df
You can then build up the ontology of a static object that will always equal the writeback dataset.
Then create an action that will modify your edited object (in my example that is Test Guest) by reverting a value to equal a value in the static object type.
You can then use the Apply Action API to automatically apply this action to certain values on a schedule or based on a certain condition. Documentation for the API is here.
In ASP.NET Core, the JsonConfigurationProvider will load configuration from appsettings.json, and then will read in the environment version, appsettings.{Environment}.json, based on what IHostingEnvironment.EnvironmentName is. The environment version can override the values of the base appsettings.json.
Is there any reasonable way to preview what the resulting overridden configuration looks like?
Obviously, you could write unit tests that explicitly test that elements are overridden to your expectations, but that would be a very laborious workaround with upkeep for every time you change a setting. It's not a good solution if you just wanted to validate that you didn't misplace a bracket or misspell an element name.
Back in ASP.NET's web.config transforms, you could simply right-click on a transform in Visual Studio and choose "Preview Transform". There are also many other ways to preview an XSLT transform outside of Visual Studio. Even for web.config parameterization with Parameters.xml, you could at least execute Web Deploy and review the resulting web.config to make sure it came out right.
There does not seem to be any built-in way to preview appsettings.{Environment}.json's effects on the base file in Visual Studio. I haven't been able to find anything outside of VS to help with this either. JSON overriding doesn't appear to be all that commonplace, even though it is now an integral part of ASP.NET Core.
I've figured out you can achieve a preview with Json.NET's Merge function after loading the appsettings files into JObjects.
Here's a simple console app demonstrating this. Provide it the path to where your appsettings files are and it will emit previews of how they'll look in each environment.
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

static void Main(string[] args)
{
    string targetPath = @"C:\path\to\my\app";

    // Parse appsettings.json
    var baseConfig = ParseAppSettings($@"{targetPath}\appsettings.json");

    // Find all appsettings.{env}.json files
    var regex = new Regex(@"appsettings\..+\.json");
    var environmentConfigs = Directory.GetFiles(targetPath, "*.json")
        .Where(path => regex.IsMatch(path));

    foreach (var env in environmentConfigs)
    {
        // Parse appsettings.{env}.json
        var transform = ParseAppSettings(env);

        // Clone baseConfig since Merge mutates the object in place
        var result = (JObject)baseConfig.DeepClone();

        // Merge the two, making sure to overwrite arrays
        result.Merge(transform, new JsonMergeSettings
        {
            MergeArrayHandling = MergeArrayHandling.Replace
        });

        // Write the preview to file
        string dest = $@"{targetPath}\preview-{Path.GetFileName(env)}";
        File.WriteAllText(dest, result.ToString());
    }
}

private static JObject ParseAppSettings(string path)
    => JObject.Load(new JsonTextReader(new StreamReader(path)));
While this is no guarantee that some other config source won't override these once deployed, it will at least let you validate that the interactions between these two files are handled correctly.
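For reference, the override semantics that Merge with MergeArrayHandling.Replace produces can be sketched in a few lines of Python (this is an illustration of the behavior, not Json.NET's actual implementation):

```python
def merge(base, override):
    # Objects merge key-by-key; arrays and scalars from the
    # override replace the base value wholesale.
    if isinstance(base, dict) and isinstance(override, dict):
        out = dict(base)
        for k, v in override.items():
            out[k] = merge(base[k], v) if k in base else v
        return out
    return override

base = {"Logging": {"Level": "Information", "Sinks": ["console", "file"]}}
env = {"Logging": {"Level": "Debug", "Sinks": ["console"]}}
merged = merge(base, env)
# {"Logging": {"Level": "Debug", "Sinks": ["console"]}}
```

Note that the Sinks array is replaced outright rather than merged element-by-element, which is exactly why MergeArrayHandling.Replace is specified above.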
There's not really a way to do that, but I think understanding a bit about how this actually works will help you see why.
With config transforms, there was literal file modification, so it's easy enough to "preview" that, showing the resulting file. The config system in ASP.NET Core is completely different.
It's basically just a dictionary. During startup, each registered configuration provider is run in the order it was registered. The provider reads its configuration source, whether that be a JSON file, system environment variables, command line arguments, etc. and builds key-value pairs, which are then added to the main configuration "dictionary". An "override", such as appsettings.{environment}.json, is really just another JSON provider registered after the appsettings.json provider, which obviously uses a different source (JSON file). Since it's registered after, when an existing key is encountered, its value is overwritten, as is typical for anything being added to a dictionary.
In other words, the "preview" would be the completed configuration object (dictionary), which is composed of a number of different sources, not just these JSON files. Things like environment variables or command line arguments will override even the environment-specific JSON (since they're registered after it), so you still wouldn't technically know whether the environment-specific JSON applied or not, because the value could be coming from another source that overrode it.
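The layering described above can be modeled in a few lines of Python (a toy model of the provider chain, not the actual ASP.NET Core code; the keys and sources here are made up):

```python
def build_config(providers):
    # Providers are applied in registration order; later ones win.
    config = {}
    for provider in providers:
        config.update(provider)
    return config

# Toy sources, in the usual registration order
appsettings = {"LogLevel": "Information", "ConnStr": "base"}
appsettings_env = {"LogLevel": "Debug"}       # appsettings.{environment}.json
env_vars = {"ConnStr": "from-environment"}    # registered after the JSON files

config = build_config([appsettings, appsettings_env, env_vars])
# LogLevel comes from the environment JSON, ConnStr from the environment variable
```

This is why previewing just the two JSON files can never tell you the final value of a key: any provider registered later can still overwrite it.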
You can use the GetDebugView extension method on the IConfigurationRoot with something like:
app.UseEndpoints(endpoints =>
{
    if (env.IsDevelopment())
    {
        endpoints.MapGet("/config", ctx =>
        {
            var config = (Configuration as IConfigurationRoot).GetDebugView();
            return ctx.Response.WriteAsync(config);
        });
    }
});
However, doing this can impose security risks, as it'll expose all your configuration (including secrets such as connection strings), so you should enable this only in development.
You can refer to this article by Andrew Lock to understand how it works: https://andrewlock.net/debugging-configuration-values-in-aspnetcore/
I have a project in Apps Script that uses several libraries. The project needed a more complex logger (logging levels, color coding), so I wrote one that outputs to Google Docs. All is fine and dandy if I immediately print the output to the Google Doc, when I import the logger in all of the libraries separately. However, I noticed that when doing a lot of logging it takes much longer than without. So I am looking for a way to write all of the output in a single go at the end, when the main script finishes.
This would require either:
Being able to define the logging library once (in the main file) and somehow accessing this in the attached libs. I can't seem to find a way to get the main project's closure from within the libraries though.
Some sort of singleton logger object. Not sure if this is possible from within a library; I have trouble figuring it out either way.
Extending the built-in Logger to suit my needs, not sure though...
My project looks as follows:
Main Project
Library 1
Library 2
Library 3
Library 4
This is how I use my current logger:
var logger = new BetterLogger(/* logging level */);
logger.warn('this is a warning');
Thanks!
Instead of writing to the file at each logged message (which is the source of your slowdown), you could write your log messages to the logger library's ScriptDB instance and add a .write() method to your logger that outputs the messages in one go. Your logger constructor can take a messageGroup parameter which can serve as a unique identifier for the lines you would like to write. This would also allow you to use different files for logging output.
As you build your messages into proper output to write to the file (don't write each line individually; batch operations are your friend), you might want to remove the messages from ScriptDB. However, it might also be a nice place to pull back old logs.
Your message object might look something like this:
{
  message: "My message",
  color: "red",
  messageGroup: "groupName",
  level: 25,
  timeStamp: new Date().getTime(), // ScriptDB won't take date objects natively
  loggingFile: "Document Key"
}
The query would look like:
var db = ScriptDb.getMyDb();
var results = db.query({messageGroup: "groupName"}).sortBy("timeStamp",db.NUMERIC);
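Putting this together, the batching side of such a logger could look something like the following plain-JavaScript sketch (BetterLogger's real constructor arguments and the function that performs the single write are assumptions here, not the asker's actual API):

```javascript
// Minimal sketch of a batching logger: messages are buffered in memory
// and flushed in one call, instead of one write per message.
function BetterLogger(level, writeFn) {
  this.level = level;      // minimum level to record
  this.writeFn = writeFn;  // e.g. a function that appends to a Google Doc
  this.buffer = [];
}

BetterLogger.prototype.log = function (levelName, level, msg) {
  if (level >= this.level) {
    this.buffer.push('[' + levelName + '] ' + msg);
  }
};

BetterLogger.prototype.warn = function (msg) {
  this.log('WARN', 25, msg);
};

BetterLogger.prototype.write = function () {
  // One write for the whole batch
  this.writeFn(this.buffer.join('\n'));
  this.buffer = [];
};
```

At the end of the main script, a single logger.write() call flushes everything, avoiding the per-message document writes that cause the slowdown.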
I have data on a WebSphere MQ queue. I've written a script task to read the data, and I can output it to a variable or a text file. But I want to use that as input to a dataflow step and transform the data. The ultimate destination is a flat file.
Is there a way to read the variable as a source into a dataflow step? I could write the MQ data to a text file and read the text file in the dataflow, but that seems like a lot of overhead. Or I could skip the dataflow altogether and write all the transformations in a script (but then why bother with SSIS in the first place?)
Is there a way to write a Raw File out of the script step, to pass into the dataflow component?
Any ideas appreciated!
If you've got the script that consumes the webservice, you can skip all the intermediary outputs and simply use it as a source in your dataflow.
Drag a Data Flow Task onto the canvas and then add a Script Component. Instead of selecting Transformation (the last option), select Source.
Double-click on the Script Component and choose Input and Output Properties. Under Output 0, select Output Columns and click Add Column for however many columns the web service has. Name them appropriately and be certain to correctly define their metadata.
Once the columns are defined, click back to the Script tab, select your language and edit the script. Take all of your existing code that consumes the service and we'll use it here.
In the CreateNewOutputRows method, you will need to iterate through the results of the Websphere MQ request. For each row that is returned, you would apply the following pattern.
public override void CreateNewOutputRows()
{
    // TODO: Add code here or in PreExecute to fill the iterable object, mqcollection
    foreach (var row in mqcollection)
    {
        // Adds a new row into the downstream buffer
        Output0Buffer.AddRow();

        // Assign all the data to the correct locations
        Output0Buffer.Column = row.Column;
        Output0Buffer.Column1 = row.Column1;

        // Handle nulls appropriately
        if (string.IsNullOrEmpty(row.Column2))
        {
            Output0Buffer.Column2_IsNull = true;
        }
        else
        {
            Output0Buffer.Column2 = row.Column2;
        }
    }
}
You must handle nulls via the _IsNull attribute or your script will blow up. It's tedious work versus a normal source, but you'll be far more efficient, faster, and consume fewer resources than dumping to disk or using some other staging mechanism.
Since I ran into some additional "gotchas", I thought I'd post my final solution.
The script I am using does not call a web service, but directly connects to and reads the WebSphere queue. However, in order to do this, I have to add a reference to amqmdnet.dll.
You can add a reference to a Script Task (which sits on the Control Flow canvas), but not to a Script Component (which is part of the Data Flow).
So I have a Script Task, with reference and code to read the contents of the queue. Each line in the queue is just a fixed width record, and each is added to a List. At the end, the List is put into a Read/Write object variable declared at the package level.
The Script Task feeds into a Data Flow task. The first component of the Data Flow is a Script Component, created as Source, as billinkc describes above. This script casts the object variable back to a List, then parses each item in the list into fields in the Output Buffer.
Various split and transform tasks take over from there.
Try using the Q program available in the MA01 MQ supportpac instead of your script.
I have a CakePHP 1.2 application that makes a number of AJAX calls using the AjaxHelper object. The AjaxHelper makes a call to a controller function which then returns some data back to the page.
I would like to log the SQL queries that are executed by the AJAX controller functions. Normally, I would just turn the debug level to 2 in config/core.php, however, this breaks my AJAX functionality because it causes the output SQL queries to be appended to the output that is returned to the client side.
To get around this issue, I would like to be able to log any SQL queries performed to a log file. Any suggestions?
I found a nice way of adding this logging functionality at this link:
http://cakephp.1045679.n5.nabble.com/Log-SQL-queries-td1281970.html
Basically, in your cake/libs/model/datasources/dbo/ directory, you can make a subclass of the dbo that you're using. For example, if you're using the dbo_mysql.php database driver, then you can make a new class file called dbo_mysql_with_log.php. The file would contain some code along the lines of the following:
App::import('Core', array('Model', 'datasource', 'dbosource', 'dbomysql'));

class DboMysqlWithLog extends DboMysql {
    function _execute($sql) {
        $this->log($sql);
        return parent::_execute($sql);
    }
}
In a nutshell, this class modifies (i.e. overrides) the _execute function of the superclass to log the SQL query before doing whatever logic it normally does.
You can modify your app/config/database.php configuration file to use the new driver that you just created.
DebugKit is a fantastic way to debug things like this: https://github.com/cakephp/debug_kit