SSIS - "switchable" file output for debug? - sql-server-2008

In an SSIS data-flow task, I'm using a Multicast transform at a key part of the flow, off which I want to hang a file output destination.
This, in itself, is no problem to do. However, I only want output in the file if I enable it; i.e., I'd be using it to debug the data if the flow fails unexpectedly and it's not immediately obvious from the default log output why this occurred.
My initial thought was to create a File Output whose output file was obtained from a variable, and by default, the variable would contain 'nul' - i.e., the Windows bit-bucket - which I could override through configuration in the event of needing to dig further.
Unfortunately this isn't working: the file destination complains, saying "The filename is a device or contains invalid characters". So it looks like I can't use the bit-bucket.
Is anyone aware of a way to make output "switchable"? This would make enabling debug a less risky proposition than editing the package and dropping a File Output in directly.
I suppose I could have a Conditional Split off the Multicast that only sends output if a variable is set to some given value, but this seems overly messy. I'll be poking at other options, but if anyone has any suggestions/solutions, they'd be welcome.

I'd go for the Conditional Split, redirecting rows to the Konesans Trash Destination adapter if your variable isn't set, and otherwise sending them to your file.
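A minimal sketch of that setup (the variable name is an assumption for illustration): define a Boolean package variable, say User::EnableDebugOutput, defaulting to False and exposed through configuration. In the Conditional Split, add one output with the condition

@[User::EnableDebugOutput] == TRUE

and connect that output to the file destination; the default output goes to the Trash Destination (or can simply be left unconnected, since rows sent to an unattached Conditional Split output are discarded). Flipping the variable in the configuration then switches the debug file on without editing the package.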

Fortran90 - compiled program creates a blank csv file instead of reading the existing one

In short: I am trying to load a csv file but the program always overwrites the existing file as an empty new file.
Longer: I am pretty new to Fortran, so bear with me. I am trying to read data from a csv file into a Fortran program. I didn't write the program and it is pretty big, so I can't post the whole thing here. The program consists of a whole bunch of .f90 files and everything is compiled using a makefile. Since I load the gcc module before compiling, I assume it is compiled with GNU Fortran, because gfortran is part of GCC (I don't know how to verify that, though).
The compiler puts the executable in a different directory. When I execute the program there, it apparently overwrites the existing .csv file with a new blank one, so the program only reads "End of File". I don't know why it always creates a new file; how do I stop it from doing so?
As a side note, the csv file I am trying to read simply consists of a single column of floats, e.g.
"0.01, 0.13, 0.041,..." etc.
The code that I inserted into a subroutine of one of the .f90 files is the following:
real*8, dimension(nz) :: Nsq
integer :: i

open(10, file='Nsq.csv')
do i = 1, 20
    read(10, *) Nsq(i)
enddo
close(10)
I have also tried writing a small test program that runs essentially the same code as above. That one works just fine and outputs the contents of the csv file without any issues; I compile it directly with gfortran.
I have no experience in Fortran at all, so I am completely stumped as to why this happens. I know the chances are slim that you can help me with this, since I can't provide the whole source code, but maybe someone has an idea why this occurs. Or maybe you know an alternative way of reading csv files?
Thanks for your time.
The Fortran open statement, OPEN(connect-spec-list), has a lot of connection specifications which define how an external file should be managed (see the Fortran 2018 standard, section 12.5.6).
When you open a file using the simplest form of the open statement:
OPEN(unit=unitid, file="filename")
a lot of default assumptions are made, such as ACCESS="SEQUENTIAL", ASYNCHRONOUS="NO", BLANK="NULL", and so on. The most important ones, however, are ACTION and STATUS, which define the purpose of the file. The action specification states whether you want to use the file for reading, writing, or both, while the status essentially defines whether you are working on an existing file, and what should be done with it (replace it, keep it, ...).
The defaults for both of these specifications are compiler dependent.
In the Intel compiler suite, the defaults are action="readwrite" and status="unknown" (see here and here).
Intel defines status="unknown" as: "Indicates the file may or may not exist. If the file does not exist, a new file is created and its status changes to 'OLD'."
The GNU compiler suite has a different take on this. The default action is determined by a set of rules that depend on the file's accessibility, if it exists (+rw, +r-w, -r+w) (see here). The behaviour of the default status="unknown" is not documented, but seems to be REWRITE (see Default Status of "Unknown" in Open).
If you know what you want to do with the file, it is advisable to say so explicitly:
OPEN(newunit=unitid, file="filename", action="read", status="old")
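Applied to the snippet from the question, a minimal sketch looks like this (nz is assumed to be available in the enclosing subroutine and to be at least 20, as in the original code):

integer :: unitid, i
real*8, dimension(nz) :: Nsq

! status='old' makes the open fail if Nsq.csv does not exist,
! instead of silently creating a blank file;
! action='read' rules out accidental writes.
open(newunit=unitid, file='Nsq.csv', action='read', status='old')
do i = 1, 20
    read(unitid, *) Nsq(i)
enddo
close(unitid)

Run from a directory that does not contain Nsq.csv, this now stops with a runtime error instead of quietly creating an empty file, which would have surfaced the asker's problem immediately.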

Using Apache NiFi to collect files from a 3rd party REST API - Flow advice

I am trying to create a flow within Apache NiFi to collect files from a 3rd party RESTful API, and I have set up my flow as follows:
InvokeHTTP - ExtractText - PutFile
I can collect the file that I am after, as I have specified it within my Remote URL. However, when I get all of the data from said file, it outputs hundreds of copies of the same file to my output directory.
3 things I need help with:
1: How do I get the flow to output the file as a readable .csv rather than a file with no extension?
2: How can I stop the processor once I have all of the data that I need?
3: The JSON file that I have been supplied with gives me the option to get files from a certain date range:
https://api.3rdParty.com/reports/v1/scheduledReports/877800/1553731200000
Or I can choose a specific file:
https://api.3rdParty.com/reports/v1/scheduledReports/download/877800/201904/CTDDaily/2019-04-02T01:50:00Z.csv
But how can I get NiFi to automatically check for newer files, given that this process will be running daily and we will be downloading a new file each day?
If this is too broad, please help me by letting me know so I can edit this post.
Thanks.
Note: 3rdParty host name has been renamed to comply with security - therefore links will not directly work. Thanks.
1) You can change the filename of the flow file to anything you want using the UpdateAttribute processor. If you want it to have a ".csv" extension, add a property named "filename" with a value of "${filename}.csv" (without the quotes when you enter it).
2) By default most processors have a scheduling strategy of Timer driven with a run schedule of 0 seconds, which means keep running as fast as possible. Go to the processor's configuration, and on the Scheduling tab set an appropriate schedule; it sounds like you probably want CRON scheduling so you can schedule it daily.
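For example (the time of day is an assumption for illustration), a CRON driven strategy with the Quartz expression

0 0 6 * * ?

runs the processor once per day, at 6:00 AM.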
3) You can use NiFi expression language statements to create dynamic time ranges. I don't fully understand the syntax for the API that you have to communicate with, but you could do something like this for the URL:
https://api.3rdParty.com/reports/v1/scheduledReports/877800/${now()}
Where now() returns the current date and time; to get it as epoch milliseconds, matching your example URL, you can use ${now():toNumber()}.
You can also format it to a date string if necessary:
${now():format('yyyy-MM-dd')}
https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html
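Putting these together with the second URL pattern from the question (a sketch only; the fixed path segments are copied from the question's example, and the exact timestamp the API expects may differ):

https://api.3rdParty.com/reports/v1/scheduledReports/download/877800/${now():format('yyyyMM')}/CTDDaily/${now():format("yyyy-MM-dd'T'HH:mm:ss'Z'")}.csv

Each run evaluates the expressions against the current date, so a daily CRON schedule would request that day's file.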

Can I suppress the "For each file enumerator is empty" warning in SSIS?

I have an SSIS package (SQL 2016) that loads files into a database.
At the beginning of the package I have a Foreach Loop container (Foreach File Enumerator). This loop checks to see if there are any files in an error folder. The desired condition is that there are no files in the error folder.
The ETL works well. However, when there are no files in the error folder, the Foreach Loop container generates a warning:
Foreach File - Check Error Folder:Warning: The For Each File enumerator is empty. The For Each File enumerator did not find any files that matched the file pattern, or the specified directory was empty.
Since this is the desired situation (that there are no files) and since my control flow handles the situation either way, is there a way to suppress this warning?
The reason for wanting to suppress the warning is that the warning count on the package is then always at least 1. Sometimes, however, SSIS warnings are important (such as when fields get out of sync). I'd prefer not to have packages that always carry warnings, since they could mask other, genuine, issues.
It sounds like a small thing, so I thought for sure there'd be a way, but I haven't found it. I tried setting an OnWarning event handler on the Foreach loop and setting Propagate to False. But the warning still gets counted as a warning when the package runs.
I think the best way to solve this small issue is to write a very small Script Task. Just pass an input variable with the folder path into the Script Task, check the file count, return it through an output variable, and then use a precedence constraint with an expression:
Dts.Variables["User::GoFurther"].Value = Directory.GetFiles(Dts.Variables["User::Path"].Value.ToString()).Any();
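Fleshed out slightly, the Script Task's Main method might look like this (a sketch: it assumes User::Path is listed under ReadOnlyVariables, User::GoFurther is a Boolean listed under ReadWriteVariables, and System.IO and System.Linq are added to the usings at the top of ScriptMain):

public void Main()
{
    // Folder to inspect, supplied by the package variable.
    string path = Dts.Variables["User::Path"].Value.ToString();

    // True if the error folder contains at least one file.
    Dts.Variables["User::GoFurther"].Value = Directory.GetFiles(path).Any();

    Dts.TaskResult = (int)ScriptResults.Success;
}

On the precedence constraint after the Script Task, set the evaluation operation to Expression and use @[User::GoFurther] (or !@[User::GoFurther]) to branch, so the Foreach Loop only runs when there is something to enumerate and the warning is never generated.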

Totally new to Talend ESB

I'm completely brand new to Talend ESB (not so much to Talend for data integration, but the ESB side is totally new to me).
That being said, I'm trying to build a simple route that watches a specific file path and gets the filename of any file dropped into it. It should then pass that filename to the child job (cTalendJob), and the child job will do something to the file.
I'm able to watch the directory, procure the filename itself, and System.out.println the filename, but I can't seem to pass it down to the child job. When it runs, the route goes into an endless loop.
Any help is GREATLY appreciated.
You must add a context parameter to your Talend job, and then pass the filename from the route to the job by assigning it to the parameter.
In my example I added a parameter named "Param" to my job. In the Context Param view of cTalendJob, click the + button and select it from the list of available parameters, and assign a value to it.
You can then do context.Param in your child job to use the filename.
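A minimal sketch (the parameter name "Param" comes from the example above; CamelFileName is the standard header set by the Camel file component): in the Context Param view of cTalendJob, assign Param the value

${header.CamelFileName}

and inside the child job read it like any other context variable:

// Populated by the route with the value of the CamelFileName header.
String filename = context.Param;
System.out.println("Processing: " + filename);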
I think you are making this more difficult than you need to...
I don't think you need your cProcessor or cSetBody steps.
In your tRouteInput, if you want the filename, map "${header.CamelFileName}" to a field in your schema and you will get the filename. Mapping "${in.body}" would give you the file contents, but if you don't need those you can just map the required header. If your job should read the file as a whole, you can skip that step and just map the message body.
Also, check the default behaviour of the Camel file component: it is intended to put the contents of the file into a message, moving the file to a .camel subdirectory once complete. If your job writes to the directory cFile is monitoring, the route will keep running indefinitely, as it keeps finding a "new" file; you would want to write any updated files to a different directory, or use a filename mask that isn't matched by the cFile component.
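For example (directory and option value are hypothetical), a consumer URI along these lines keeps handled files out of the polled directory:

file:/data/inbox?move=processed

The move option relocates each file to a processed subdirectory once the route has finished with it, in place of the default .camel subdirectory mentioned above.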

The connection "C:\\<path>\\*.txt" is not found. This error is thrown by Connections collection when the specific conn element is not found

I developed an SSIS package that creates several .txt files. These files are zipped, after which the .txt files need to be removed. Using a Foreach File enumerator, I loop through all the .txt files in a specific folder. The folder is retrieved from a variable in configuration and looks something like: C:\Folder\
The foreach loop uses *.txt to gather all .txt files, does not traverse subfolders, and uses the fully qualified name.
In the Variable Mappings, the "FileName" variable is mapped to index 0.
Within the foreach loop I use a File System Task.
This task removes the .txt files generated earlier, using the FileName variable that is filled by the loop.
On the development machine this runs like a charm: all green, no problem at all. Then I copied the package and the configuration file to the test environment. A basic version without the file removal was running perfectly fine there; I replaced the package. Nothing big.
Now I run the SQL Server Agent job and it starts running. I can see all the text files appearing, and disappearing after the zip files are created. However, once all files are removed, the package ends with errors, namely the error shown above in the title.
I tried looking for a connection manager that might have been removed, and looked for connection managers named in the config that don't exist in the package. No such thing found. The annoying part is that the package is fully functional, but it still ends with the error.
EDIT: I noticed that if I run the package using the Execute Package Utility with the dev config, it gives the same errors.
Hopefully someone is able to help me out.
Thanks in advance!
I managed to "fix" the issue: remove the File System Task responsible for deleting the files, then add it back and configure it again.
I think this happens if you accidentally change the General parameters before changing the Operation parameter. The task holds on to metadata for now-irrelevant parameters, and upon execution says: "Wait, you defined this parameter but I don't need it, but I'm checking for it anyway, and it's not there!"
It's a bug, for sure.