For troubleshooting purposes, I would like to obtain the URL to the current step of GitHub Actions logs.
The URL seems fairly easy to calculate:
url="https://github.com/$GITHUB_REPOSITORY/runs/$GITHUB_RUN_ID?check_suite_focus=true#step:$step_number:1"
What's missing is getting the number of the current step - I don't see it listed on https://docs.github.com/en/actions/learn-github-actions/contexts or https://docs.github.com/en/actions/learn-github-actions/environment-variables. Hard-coding the number is not ideal as adding/removing steps before this one will result in wrong/misleading URLs.
Is there perhaps some way to get the current step number that I've overlooked?
Alternatively, the step can have an id. However, it doesn't seem like there's a way to link to a step's log section by its id, is there?
Is there perhaps some way to get the current step number that I've overlooked?
Here is one (very) ugly way:
Give all steps an id. This causes them to be added to the steps object.
Obtain the length of the steps object with jq:
step_nr=$(echo '${{ toJson(steps) }}' | jq length)
Add 2, to get the 1-based step number. (+1 to convert 0-based numbering of length of steps so far to 1-based numbering used by the URL hash parser, and +1 for the "Set up job" step that runs automatically.)
Alternatively, the step can have an id. However, it doesn't seem like there's a way to link to a step's log section by its id, is there?
Looking at the JS code which handles the hash part of the URL, there is:
const e = window.location.hash.match(/^#step:(\d+):(\d+)$/) || [];
So, "no" apparently, at least not via the same mechanism as for indicating the step ID by number.
I am running a KNIME workflow:
It is running over every row of my data. The problem is, I planned to run 7000 iterations and at 6800 it gets stuck. Is there a way to save the csv file? There is a problem with one row, and I want to save the result at this point in time.
If there is a problem with a single input row, then easiest way to debug this in KNIME is often to run the input in a chunk loop. In your case I would set the outer chunk loop to run 1 row at a time, and remove the inner parallel chunk loop until you find the row causing the problem.
Unfortunately, this might take quite some time to run. As an alternative, try as above, but set the chunk size to say 100, and then once you know the block of rows that cause the error, use a row filter before the chunk loop to filter the input table to just that block of 100 rows, and then set the chunk size to 1 to see which row is the problem.
Place a CSV Writer node inside the loop, i.e. connected to the output of your Parallel Chunk End (keeping this also connected to the Loop End).
Configure the If file exists… setting of this CSV Writer to Append.
That should save all the data that is successfully processed by the loop.
When you say there is a problem with one row though, do you know what that problem is? Presumably you'd rather get the whole loop working.
You could also consider using Try and Catch nodes from the Workflow Control > Error Handling section to skip a chunk that causes an error.
I have the following scenario:
I have a remote server that every week gets loaded with 2 files, these files have the following name format:
"FINAL_NAME06Apr16.txt" and
"FINAL_NAME_F106Apr16.txt"
The part in bold is fixed everytime, but the date changes, now, I need to pick, copy to another directory and rename these files. but I'm not sure about how to pick the name of the files to variables to operate with them as I need to put different name to each file.
How can I proceed? I' pretty sure it has to be done with naming a variable with an expression, but I don't know how to do that part.
I think I need some function to calculate the rest of the filename, I believe maybe some approach could be to first rename the part "FINAL_NAME_F1" and then rename the "FINAL_NAME" since some wildcards will pick both if don't do it that way?
Cheers.
You can calculate the date but why go through that complexity?
A Foreach (File) Loop Container, FELC, will handle this just fine. Add two of them to your control flow.
The first one will use a file mask of FINAL_NAME_F1*.txt. Inside that FELC, use a File System task to copy/move/rename the file to your new location.
That first FELC will run, find the target file and move it. It will then look for the next file, find none and go on to the next task.
Create a second FELC but this one will operate on FINAL_NAME*.txt It's crucial that the first FELC run first as this file mask will match both FINAL_NAME_f1-2019-01-01.txt and FINAL_NAME-2019-01-01.txt. By ordering our operations as such, we can reduce the complexity of the logic required.
Sample answer with a FELC to show where to plumb the various bits
Im new to Neo4j and looking for some guidance :-)
Basically I want to create the graph below from the csv below. The NEXT relationship is created between Points based on the order of their property sequence. I would like to be able to ignore if sequences are consecutive. Any ideas?
(s1:Shape)-[:POINTS]->(p1:Point)
(s1:Shape)-[:POINTS]->(p2:Point)
(s1:Shape)-[:POINTS]->(p3:Point)
(p1)-[:NEXT]->(p2)
(p2)[:NEXT]->(p3)
and so on
shape_id,shape_pt_lat,shape_pt_lon,shape_pt_sequence,shape_dist_traveled
"1-700-y11-1.1.I","53.42646060879","-6.23930113514121","1","0"
"1-700-y11-1.1.I","53.4268571616632","-6.24059395687542","2","96.6074531286277"
"1-700-y11-1.1.I","53.4269700485041","-6.24093540883784","3","122.549696670773"
"1-700-y11-1.1.I","53.4270439028769","-6.24106779537932","4","134.591291249566"
"1-700-y11-1.1.I","53.4268623569266","-6.24155684094256","5","172.866609667575"
"1-700-y11-1.1.I","53.4268380666968","-6.2417384245122","6","185.235926544428"
"1-700-y11-1.1.I","53.4268874080753","-6.24203735638874","7","205.851454672516"
"1-700-y11-1.1.I","53.427394066848","-6.24287421729846","8","285.060040065768"
"1-700-y11-1.1.I","53.4275257974236","-6.24327509689195","9","315.473852717259"
"1-700-y11-1.2.O","53.277024711771","-6.20739084216546","1","0"
"1-700-y11-1.2.O","53.2777605784999","-6.20671521402849","2","93.4772699644143"
"1-700-y11-1.2.O","53.2780318605927","-6.2068238246152","3","124.525619356934"
"1-700-y11-1.2.O","53.2786209984572","-6.20894363498438","4","280.387737910482"
"1-700-y11-1.2.O","53.2791038678913","-6.21057305710353","5","401.635418300665"
"1-700-y11-1.2.O","53.2790975844245","-6.21075327761739","6","413.677012879457"
"1-700-y11-1.2.O","53.2792296384738","-6.21116766400758","7","444.981964564454"
"1-700-y11-1.2.O","53.2799500357098","-6.21065767664905","8","532.073870043666"
"1-700-y11-1.2.O","53.2800290799386","-6.2105343995296","9","544.115464622458"
"1-700-y11-1.2.O","53.2815594673093","-6.20949562301196","10","727.987702875002"
It is the 3rd part that I cant finish. Creating the NEXT relationship!
//1. Create Shape
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
With distinct csv.shape_id as ids
Foreach (x in ids | merge (s:Shape {id: x} ));
//2. Create Point, and Shape to Point relationship
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
MATCH (s:Shape {id: csv.shape_id})
with s, csv
MERGE (s)-[:POINTS]->(p:Point {id: csv.shape_id,
lat : csv.shape_pt_lat, lon : csv.shape_pt_lat,
sequence : toInt(csv.shape_pt_sequence), dist_travelled : csv.shape_dist_traveled});
//3.Create Point to Point relationship
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM
'file:///D:\\shapes.txt' AS csv
???
You'll want APOC Procedures installed for this one. It has both a means of batch processing, and a quick way to link all nodes in a collection together.
Since you already have all shapes the the points of the shape in the db, you don't need to do another load csv, just use the data you've got.
We'll use apoc.periodic.iterate() to batch process each shape, and apoc.nodes.link() to link all ordered points in the shape by relationships.
CALL apoc.periodic.iterate(
"MATCH (s:Shape) RETURN s",
"WITH {s} as shape
MATCH (shape)-[:POINTS]->(point:Point)
WITH shape, point
ORDER by point.sequence ASC
WITH shape, COLLECT(point) as points
CALL apoc.nodes.link(points,'NEXT')",
{batchSize:1000, parallel:true}) YIELD batches, total
RETURN batches, total
EDIT
Looks like there may be a bug when using procedure calls within the apoc.periodic.iterate() where no mutating operations occur (attempted this after including a SET operation in the second part of the query to set a property on some nodes, the property was not added).
Unsure if this is a general case of procedure calls being executed within procedure calls, or if this is specific to apoc.periodic.iterate(), or if this only occurs with both iterate() and link().
I'll file a bug if I can learn more about the cause. In the meantime, if you don't need batching, you can forgo apoc.periodic.iterate():
MATCH (shape:Shape)-[:POINTS]->(point:Point)
WITH shape, point
ORDER by point.sequence ASC
WITH shape, COLLECT(point) as points
CALL apoc.nodes.link(points,'NEXT')
Is there a way with the super-csv library to find out the row number of the file that will be processed?
In other word, before i start to process my rows with a loop:
while ((obj = csvBeanReader.read(obj.getClass(),
csvModel.getNameMapping(), processors)) != null) {
//Do some logic here...
}
Can i retrieve with some library class the number of row contained into the csv file?
No, in order to find out how many rows are in your CSV file, you'll have to read the whole file with Super CSV (this is really the only way as CSV can span multiple lines). You could always do an initial pass over the file using CsvListReader (it doesn't do any bean mapping, so probably a bit more efficient) just to get the row count...
As an aside (it doesn't help in this situation), you can get the current line/row number from the reader as you are reading using the getLineNumber() and getRowNumber() methods.