How do I direct AutoSys log files to dated folders?

In my autosys jil (Windows), I have logging set up like this:
std_out_file: <servername>\autosys_log\%AUTO_JOB_NAME%-%AUTORUN%.std.out
I want to add the current date to the path, so e.g. for today that would be:
std_out_file: <servername>\autosys_log\2022-11-04\%AUTO_JOB_NAME%-%AUTORUN%.std.out
How can I get that to work? I need some way of dynamically inserting the current date in the output path.
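One workaround, since I'm not aware of a built-in date token for std_out_file on Windows: point the job's command at a small wrapper that builds the dated folder itself and redirects the real command's output into it. Here is a minimal Python sketch of that idea — it assumes AUTO_JOB_NAME and AUTORUN are visible to the job process as environment variables (an assumption, not something the JIL guarantees), and the base path is the placeholder from the question:

import datetime
import os
import subprocess
import sys

def run_with_dated_log(command, base=r"\\servername\autosys_log"):
    # Build ...\autosys_log\2022-11-04 and create it if it doesn't exist yet
    folder = os.path.join(base, datetime.date.today().isoformat())
    os.makedirs(folder, exist_ok=True)
    job = os.environ.get("AUTO_JOB_NAME", "job")  # assumption: set by the agent
    run = os.environ.get("AUTORUN", "0")          # assumption: set by the agent
    log_path = os.path.join(folder, f"{job}-{run}.std.out")
    with open(log_path, "w") as out:
        return subprocess.call(command, stdout=out, stderr=subprocess.STDOUT)

if __name__ == "__main__":
    sys.exit(run_with_dated_log(sys.argv[1:]))

The JIL would then drop std_out_file and invoke the wrapper with the real command as its arguments.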

Related

Data factory copy based off last high water mark value (Dynamic date)

I'm currently working on a project where I need the Data Factory pipeline to copy based on the last run date.
The process breakdown:
Data is ingested into a storage account.
The ingested data lands in the directory format topic/yyyy/mm/dd, i.e., multiple files arrive in a single directory, partitioned by year, month, and day.
The process currently filters on the last high-water-mark (HWM) date, which is updated each time the pipeline runs (triggered daily at 4 AM). Once the copy succeeds, a Set Variable activity increases the HWM value by one day. However, files are not delivered on weekends, and that is the problem.
The HWM date will not increase if no files are brought over, so the pipeline keeps looping over the same date.
How do I get the pipeline to advance to, or look for, the next file in that directory, given that I use the HWM value as the directory path, copy the files, and update the HWM value only when the copy completes? (The original post included screenshots of the current update logic and of the HWM lookup and directory path used for the copy.)
Instead of adding 1 to the last high-water-mark value, you can set the watermark to the current UTC date. That way, even on days when the pipeline finds no new files, the next run still copies data to the correct destination folder. I have tried to repro this in my environment, and below is the approach.
A watermark table is created initially with the watermark value '1970-01-01'.
This table is referenced in a Lookup activity.
A Copy Data activity is added, and in the source the query is given as
select * from tab1 where lastmodified > '@{activity('Lookup1').output.firstRow.watermark_value}'
In the sink, Blob storage is used. To get a year/month/day folder structure,
@concat(formatDateTime(utcnow(),'yyyy'),'/',formatDateTime(utcnow(),'MM'),'/',formatDateTime(utcnow(),'dd'))
is given as the folder path. (Note the capital 'MM' for the month; lowercase 'mm' would return minutes.)
The file is then copied into that dated path (shown in a screenshot in the original answer).
Once the file is copied, the watermark value is updated to the current UTC date:
update watermark_table
set watermark_value='@{formatDateTime(utcnow(),'yyyy-MM-dd')}'
where tab_name='tab1'
When the pipeline is triggered the next day, data is copied from the watermark value onward, and once the copy completes, the watermark is again set to the current UTC date.
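For readers who want the shape of this pattern outside ADF, here is a minimal Python sketch of one run (table, column, and folder names follow the hypothetical ones above; the connection is assumed to be DB-API style, e.g. sqlite3):

import csv
import datetime
import os

def write_to_dated_folder(rows, base="output"):
    # Sink: mirror the year/month/day folder structure from the answer
    now = datetime.datetime.utcnow()
    folder = os.path.join(base, now.strftime("%Y"), now.strftime("%m"), now.strftime("%d"))
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, "data.csv"), "w", newline="") as f:
        csv.writer(f).writerows(rows)

def pipeline_run(conn):
    # Lookup activity: read the current watermark
    last = conn.execute(
        "select watermark_value from watermark_table where tab_name = 'tab1'"
    ).fetchone()[0]
    # Copy Data activity: fetch only rows modified after the watermark
    rows = conn.execute(
        "select * from tab1 where lastmodified > ?", (last,)
    ).fetchall()
    write_to_dated_folder(rows)
    # Update the watermark to today's UTC date only after a successful copy
    today = datetime.datetime.utcnow().strftime("%Y-%m-%d")
    conn.execute(
        "update watermark_table set watermark_value = ? where tab_name = 'tab1'",
        (today,),
    )
    conn.commit()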
Reading the post a couple of times, what I understood is:
You already have watermark logic.
On the weekend, when there are NO files in the folder, the current logic does NOT increment the watermark, and so you are facing issues.
If I understand the ask correctly, please use the dayOfWeek() function. Add an If activity and let the current logic execute only when the day of the week is Monday (2) through Friday (6).
https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#dayofweek
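For the same gate outside ADF, a minimal Python sketch (note the numbering differs: the data-flow dayOfWeek() counts Sunday as 1, while Python's weekday() counts Monday as 0):

import datetime

def is_weekday(day: datetime.date) -> bool:
    # Monday == 0 ... Friday == 4 in Python's weekday() numbering
    return day.weekday() < 5

if is_weekday(datetime.date.today()):
    pass  # run the existing watermark-update logic here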

How do I decouple variable names in external files from the code?

Imagine I have an external file dates.csv in the following format:
Name,Date
start_of_fin_year,01.03.2022
end_of_fin_year,28.02.2023
Obviously, the file may get updated in the future, and the date may change. I create a piece of code that checks the file periodically to extract needed dates and put them into the DB/variables. Roughly speaking, I have this pseudocode:
start_of_fin_year = SELECT Date FROM table WHERE Name = 'start_of_fin_year'
The problem I face: my code will break if I or someone else changes the name in the table. How do I prevent this?
FYI, this is a personal project that I developed on my own, but I will have to give others access to the .csv files so they can update the info. I'm afraid they may accidentally change the names, which is why I'm worried.
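One common mitigation, sketched in Python under the file layout above (the constant and function names here are mine, not from the post): treat the set of names as a contract, validate it every time the file is loaded, and fail with a clear error instead of silently reading nothing.

import csv

EXPECTED_NAMES = {"start_of_fin_year", "end_of_fin_year"}  # the contract with editors

def load_dates(path="dates.csv"):
    with open(path, newline="") as f:
        rows = {row["Name"]: row["Date"] for row in csv.DictReader(f)}
    missing = EXPECTED_NAMES - rows.keys()
    if missing:
        raise ValueError(f"dates.csv is missing expected names: {sorted(missing)}")
    return rows

dates = load_dates()
start_of_fin_year = dates["start_of_fin_year"]

Documenting EXPECTED_NAMES next to the file (or protecting the Name column) gives editors a clear signal not to rename the keys.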

Generating Date Range in Google Sheets

I have a Google Sheet that contains extracted metadata from a large amount of files that I transferred from CDs to a server. I am currently working on creating a description for these materials to include in a finding aid. I found it easiest to work in Excel or Sheets because the PUI we use to output our finding aids utilizes a spreadsheet upload plugin.
I've been using pivot tables in Google Sheets to sort through all of the data with little issue except when I need to generate a date range for the files contained in one CD. Essentially, I'm creating a pivot table that contains rows for the URI, Filename (in this case I'm filtering for folder name only), and date_modified. The data looks something like this:
URI    FILENAME    DATE_MODIFIED
URI1   FOLDER1     2000-01-01
URI1   FOLDER2     2000-01-01
URI1   FOLDER3     2000-02-01
URI1   FOLDER4     1999-12-02
URI2   FOLDER1     2001-01-20
... and so on.
I'd like to generate a date range for each unique URI. I realize I could go through each unique URI and manually extract that data, but I have a LOT of these to go through, so I don't think that is the most efficient use of my time, especially since the dates do not follow chronological order. I'm thinking pivot tables are not going to help me here, so if anyone has other suggestions I'm happy to listen, but brownie points if anyone has a solution that works in Sheets!
Try this on a new tab somewhere.
=QUERY(Sheet1!A:C,"select A,MIN(C),MAX(C) where A<>'' group by A")
Change the range reference to suit.
Then in the next column over, depending on where you output the query, use
=IF(A2="",,TEXT(B2,"yyyy-mm-dd")&"-"&TEXT(C2,"yyyy-mm-dd"))
and drag down to the bottom.
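If the same roll-up is ever needed outside Sheets, here is a pandas sketch of the identical group-by (the CSV file name is hypothetical; column names follow the sample table above):

import pandas as pd

df = pd.read_csv("metadata.csv", parse_dates=["DATE_MODIFIED"])
ranges = (
    df.groupby("URI")["DATE_MODIFIED"]
      .agg(["min", "max"])  # earliest and latest date per URI
      .apply(lambda r: f"{r['min']:%Y-%m-%d}-{r['max']:%Y-%m-%d}", axis=1)
)
print(ranges)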

How to select multiple CSV files based on date and load them into a table

I receive input files daily in a folder called INPUTFILES. The file names include a datetime stamp.
My package is scheduled to run every day. If I receive two files for the day, I need to fetch those two files and load them into the table.
For example, I had these files in my folder:
test20120508_122334.csv
test20120608_122455.csv
test20120608_014455.csv
Now I need to process the files test20120608_122455.csv and test20120608_014455.csv, which belong to the same day.
I solved the issue. I created one variable that checks whether a file exists for that particular day.
If a file exists for that day, the variable is set to 1.
A Foreach Loop Container is used, with this file-exists variable placed on the container.
For Loop properties:
EvalExpression ---- @fileexists==1
If no file exists for that particular day, the expression evaluates to false and the loop does not run.
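Outside SSIS, the same day-based file selection is easy to sanity-check with a short Python sketch (folder and prefix follow the example above):

import datetime
import glob
import os

def todays_files(folder="INPUTFILES", prefix="test"):
    # Filenames look like test20120608_122455.csv; match today's date stamp
    stamp = datetime.date.today().strftime("%Y%m%d")
    return sorted(glob.glob(os.path.join(folder, f"{prefix}{stamp}_*.csv")))

for path in todays_files():
    print(path)  # each of these would then be loaded by the data-flow task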

What should be the appropriate name of a log file

I want to log my exceptions to a file. What should the file be named?
Error-ddmmyyyy.log
Log-ddmmyyyy.err
Log-ddmmyyyy.txt
or anything else?
If the date is important, I would use the yyyymmdd format: it makes it easy to get a sorted list. I would add hhmmss if relevant.
The .log suffix is nice, in my opinion.
I would add the name of the command issuing the exceptions as a prefix of the log file name.
Something like: myCommand-20100315-114235.log
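A minimal sketch of that convention in Python (the command name is a stand-in for whatever program is raising the exceptions):

import datetime

def log_filename(command="myCommand"):
    # yyyymmdd-hhmmss so an alphabetical listing is also chronological
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    return f"{command}-{stamp}.log"

print(log_filename())  # e.g. myCommand-20100315-114235.log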
There are many ways you can name your log files; you have to consider several factors:
Is your file generated by a particular server or application in a group? Then you should add the name of the server so you know where the file came from.
Example:
Server1.log
A log file can contain many levels of logging; if you want, you can configure different files for different levels.
Example
Server1.info.log
Server1.err.log
You should add a date if your application runs for many days or if you want to keep a record of errors for future reference. Use the yyyyMMdd format so the files sort by date automatically in command-line listings, and append it after the .log extension or place it just before the extension, depending on which layout you find more organized:
Server1.info.log.20100315 (Linux)
Server1.info.20100315.log (Win)
You can try different combinations; it all depends on what sorting and archiving style you want to achieve.