Batch if else not executing consistently - csv

I am trying to set a variable using if else statements. The problem is that the branch else if "%%D" GEQ "%%G" (set var=%%G) is not executing consistently. I am not sure what is wrong.
Here is my code:
@echo on
SETLOCAL EnableDelayedExpansion
FOR /F "tokens=1-8* delims=," %%A IN (results.csv) DO (
if "%%D" equ 0 (
set var=0
) else if "%%D" GEQ "%%G" (
set var=%%G
) else (set var=%%D)
set "y=%%A,%%B,%%C,%%D,%%E,%%F,%%G,%%H,!var!"
echo !y!>>final.csv
)
Here is a sample of my input file.
"01185901095","11","0379-0005","50","001","0","3","3"
"01185901215","11","0379-0013","138","001","0","4","2"
Here is the output I get in final.csv
"01185901095","11","0379-0005","50","001","0","3","3","3"
"01185901215","11","0379-0013","138","001","0","4","2","138"
My expected output is:
"01185901095","11","0379-0005","50","001","0","3","3","3"
"01185901215","11","0379-0013","138","001","0","4","2","4"
Line 2 in the output is the problem. %%D is greater than %%G, so I expect the value of %%G (138 is greater than 4, so I expect 4).

First read my answer on Symbol equivalent to NEQ, LSS, GTR, etc. in Windows batch files for full details on how the internal command IF of cmd.exe compares strings. That is what happens here: a string comparison, not the integer comparison that was expected.
The condition if "%%D" equ 0 becomes if ""50"" EQU 0 when the first line of the input CSV file results.csv is processed. Because of the four double quotes around the value 50, the string ""50"" is compared with the string 0. The first character " of the first string ""50"" has the decimal byte value 34, while the first character 0 of the second string 0 has the decimal byte value 48. The function lstrcmpW therefore returns −1 after comparing just the first character of the two strings. IF then checks whether this return value is equal to the integer value 0; that the operand to the right of EQU also happens to be 0 here is a coincidence. The condition is always false, whatever string is assigned to the loop variable D from results.csv, because the double quotes around %%D always result in comparing the character " with the character 0.
Next, if "%%D" GEQ "%%G" is always evaluated. For the first line of results.csv it compares the string ""50"" with the string ""3"". The first two characters of both strings are equal. The third character 5 of the first string (decimal byte value 53) is greater than the third character 3 of the second string (decimal byte value 51), so the string comparison returns the integer value 1. That is greater than 0, so the second condition happens to be true for the first line of results.csv.
For the second line of results.csv, however, the string ""138"" is compared with the string ""4"". The third character of the first string has a lower byte value than the third character of the second string, so the string comparison returns −1, which is less than 0. The second condition is therefore false for the values in the second line of the CSV file, although the integer value 138 is greater than the integer value 4.
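The difference between the two kinds of comparison can be reproduced in isolation. The following is a minimal sketch that uses the literal values 138 and 4 from the second CSV line instead of loop variables:

```batch
@echo off
rem Quoted operands are compared as strings, character by character,
rem so "138" sorts before "4" because 1 comes before 4.
if "138" GEQ "4" (echo string: 138 GEQ 4) else echo string: 138 LSS 4
rem Unquoted operands consisting only of digits are compared as integers.
if 138 GEQ 4 (echo integer: 138 GEQ 4) else echo integer: 138 LSS 4
```

The first IF should take its else branch and the second its true branch, which is the inconsistency described in the question.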
The solution is to not use " around the loop variable references at all and, in addition, to remove the double quotes around the values read from the CSV file, so that integer comparisons are run instead of string comparisons. That means using %%~D and %%~G.
@echo off
setlocal EnableExtensions EnableDelayedExpansion
(for /F "tokens=1-8* delims=," %%A in (results.csv) do (
if %%~D EQU 0 (
set var=0
) else if %%~D GEQ %%~G (
set var=%%G
) else set var=%%D
echo %%A,%%B,%%C,%%D,%%E,%%F,%%G,%%H,!var!
))>final.csv
endlocal
The code above contains one important performance improvement, along with some less important small ones: the entire for loop is enclosed in round brackets, and everything that echo writes to the standard output stream inside the loop is redirected into the file final.csv.
When this code runs, final.csv is opened for writing once and kept open for as long as for processes the lines read from results.csv. Its data is flushed and the file is closed only after for has finished and closed results.csv.
The code in the question instead opens final.csv for every line read from results.csv, seeks to the end of the file, appends the line output by echo and then closes final.csv again. That makes processing thousands or millions of lines in results.csv much slower than the code above, although the file caching of Windows avoids actually writing the opened, changed and closed file final.csv to the hard disk for each line.
Note: The code as posted here works only if there are no empty field values in results.csv, which means no line in results.csv contains ,,. The current directory when the batch file starts must be the directory containing results.csv, as otherwise final.csv will be an empty file.

I'm not quite sure of the purpose of some of your comparisons, so based purely off best guesses, your input file content and your expected output content, would something like this not do what you wanted:
@( For /F "UseBackQ EOL=, Delims=" %%G In ("results.csv"
) Do @For /F "Tokens=4,7 Delims=," %%H In ("%%~G"
) Do @If %%~H Gtr %%~I (Echo %%G,"%%~I") Else Echo %%G,"%%~H"
) 1>"final.csv"
As you can see, there's no need for delayed expansion, or defining variables.

Related

Batch Script - Delete Columns in csv

I need a batch script that removes all columns in a CSV except columns 1, 2 and 5.
My Code:
(for /f "tokens=1,2,5 delims=;" %%i in (Input.csv) do echo %%i,%%j,%%k) > Output.csv
Input CSV
1;2;3;4;5;6;7;8;9;10
10160;"Some Name";"Something:0.8";;5;;;;;XY
Expected Output:
1;2;5
10160;"Some Name";5
Real Output
1,2,5
10160,"Some Name",XY
Does anyone have any idea why it keeps the tenth column in the second line instead of the fifth?
SETLOCAL ENABLEDELAYEDEXPANSION
(FOR /f "delims=" %%b IN (Input.csv) DO SET "line=%%b"&SET "line=!line:;;=; ;!"&for /f "tokens=1,2,5 delims=;" %%i in ("!line:;;=; ;!") do echo %%i,%%j,%%k)
The problem is that a sequence of delimiters is considered as a single delimiter, so you need to change each delimiter pair so that it contains a string, and repeat the operation for any remaining delimiter-pairs.
Obviously, you would need to take action to take care of a reported field that now contains a single space, and this will alter any quoted field that contains ;;
Note also that any data containing ! or % is likely to be corrupted and certain other symbols (such as &) may also yield unexpected results. If the data is restricted to alphamerics, spaces, commas, etc. it should be fine.
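To see the delimiter collapsing and the effect of the replacement side by side, here is a small sketch. It uses the second input line from the question with the quotes left out for brevity, and keeps ; as the output separator, as in the expected output:

```batch
@echo off
setlocal EnableDelayedExpansion
set "line=10160;Some Name;Something;;5;;;;;XY"
rem As-is: each run of ; counts as one delimiter, so token 5 is the tenth field.
for /F "tokens=1,2,5 delims=;" %%i in ("!line!") do echo before: %%i;%%j;%%k
rem Two passes of the replacement leave no ;; pair behind in this line.
set "line=!line:;;=; ;!"
set "line=!line:;;=; ;!"
for /F "tokens=1,2,5 delims=;" %%i in ("!line!") do echo after:  %%i;%%j;%%k
```

The first loop should report XY as the third value and the second loop should report 5. A single pass is not enough for a run such as ;;;;; because the replacement does not rescan the text it has just inserted.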

Batch File: Reading Floating Point Values from a .csv file

I have made a batch file that reads a .csv file. It then proceeds to take the values from a specific column (in this case, the 4th) and find the highest value. The script works perfectly fine with whole numbers, but once I attempt to pass in a .csv file featuring floating point numbers, the script only reads the first number. ie, 1.546 = 1, 0.896 = 0, etc...
How do I read the floating points normally? In this case, at least 2 points of precision (though the values can be up to 6 points of precision with the real .csv file)
One other thing to note is that this prints out "missing operator" 3 times. I THINK this may be due to spacing, but am not sure where.
The script is as follows:
@echo off
set cur=0
set max=0
for /f "usebackq tokens=1-4 delims=," %%a in ("sample.csv") do (call :func "%%d")
echo Max is %max%
goto :here
:func
set /a cur=%1
if %cur% gtr %max% (set /a max=%cur%)
goto :eof
:here
pause
This is sample.csv, which works fine:
1,2,,3,3,5,,
5,6,,7,12.3,6,,
9,10,,11,11.4,7,,
13,14,,15,10.1,2,,
I threw in some additional commas, just to test the code.
If you were to do actual calculations with fractions, then I would not recommend batch-file for this, but to simply find the highest value we can split the string at the . and compare the parts on either side. You still cannot use set /a to turn the value into an actual integer though:
@echo off & setlocal enabledelayedexpansion
set "num=0" & set "frac=0"
for /f "usebackq tokens=1-4 delims=," %%a in ("sample.csv") do (
for /f "tokens=1* delims=." %%i in ("%%~d") do (
if not "%%j" == "" if %%i gtr !num! (
set "num=%%i"
set "max=%%~d"
)
if %%i geq !num! if %%~j gtr !frac! (
set "frac=%%~j"
set "max=%%~d"
)
)
)
echo Max is %max%
pause
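The splitting step used above can be tried on its own. This is only a sketch with three hard-coded sample values rather than values read from sample.csv:

```batch
@echo off
rem Split each value at the decimal point into a whole part and a fraction.
for %%V in (12.3 11.4 3) do (
    for /F "tokens=1,2 delims=." %%i in ("%%V") do echo %%V: whole=%%i fraction=%%j
)
```

For the value 3 the fraction is empty. Also note that comparing the fraction parts as integers is only meaningful when they have the same number of digits: 25 is greater than 5, although 0.25 is less than 0.5.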
As per your comment on ~ here is an extract from for /?
In addition, substitution of FOR variable references has been enhanced.
You can now use the following optional syntax:
%~I - expands %I removing any surrounding quotes (")
%~fI - expands %I to a fully qualified path name
%~dI - expands %I to a drive letter only
%~pI - expands %I to a path only
%~nI - expands %I to a file name only
%~xI - expands %I to a file extension only
%~sI - expanded path contains short names only
%~aI - expands %I to file attributes of file
%~tI - expands %I to date/time of file
%~zI - expands %I to size of file
%~$PATH:I - searches the directories listed in the PATH
environment variable and expands %I to the
fully qualified name of the first one found.
If the environment variable name is not
defined or the file is not found by the
search, then this modifier expands to the
empty string
The modifiers can be combined to get compound results:
%~dpI - expands %I to a drive letter and path only
%~nxI - expands %I to a file name and extension only
%~fsI - expands %I to a full path name with short names only
%~dp$PATH:I - searches the directories listed in the PATH
environment variable for %I and expands to the
drive letter and path of the first one found.
%~ftzaI - expands %I to a DIR like output line
In the above examples %I and PATH can be replaced by other valid
values. The %~ syntax is terminated by a valid FOR variable name.
Picking upper case variable names like %I makes it more readable and
avoids confusion with the modifiers, which are not case sensitive.
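A short sketch of a few of these modifiers in use. The path is made up for illustration and is only treated as text here:

```batch
@echo off
rem In a batch file the loop variable is written %%I instead of %I.
for %%I in ("C:\Temp\My Data\results.csv") do (
    echo Without quotes: %%~I
    echo Drive and path: %%~dpI
    echo Name and extension: %%~nxI
)
```

The last line should print results.csv, and the first the path with the surrounding double quotes removed.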
As was mentioned in the comments, you could use PowerShell for this task. Here's a basic idea.
For the example file content:
1,2,,3,3,5,,
5,6,,7,12.3,6,,
9,10,,11,11.4,7,,
13,14,,15,10.1,2,,
You could use something like:
Import-Csv -Path ".\sample.csv" -Header ("A","B","C","D","E") | Sort-Object { [Single]$_.E } -Descending | Select-Object -First 1 -ExpandProperty E
Which should return:
12.3
As you can see above, because you had not supplied the header record, I had to create some in order to identify my target field. However, if you already have known header fields, you could simplify the code a little.
For the example file content:
This,Is,My,Actual,Header,Record,,
1,2,,3,3,5,,
5,6,,7,12.3,6,,
9,10,,11,11.4,7,,
13,14,,15,10.1,2,,
You would just name your field according to its header value name, something like:
Import-Csv -Path ".\sample.csv" | Sort-Object { [Single]$_.Header } -Descending | Select-Object -First 1 -ExpandProperty Header
Which should, once again, return:
12.3

Deleting/replacing characters delimited by commas

I'm trying to delete by batch or vbs text delimited by commas (CSV) that are always in the same position. It would not affect the first line, only lines 2 onwards.
Example text from file:
Code,Batch,File #,Reg Hours,O/T,Cost Number,Rate,Earnings,Earnings,Memo Code,Memo Amount,Earnings Code,Earnings Amount,Hours Code,Hours Amount,Earnings Code,Earnings Amount,Adjust Code,Adjust Amount
ABC,123,3980 ,78.52,,12331,10.00,,,,,,,,
ABC,123,4026 ,29.38,,12331,10.00,,,,,,,,
ABC,123,5065 ,64.46,,12331,10.00,,,,,,,,
ABC,123,5125 ,80.00, 0.54,12331,11.00,,,,,,,,
I would like to end up with text:
Code,Batch,File #,Reg Hours,O/T,Cost Number,Rate,Earnings,Earnings,Memo Code,Memo Amount,Earnings Code,Earnings Amount,Hours Code,Hours Amount,Earnings Code,Earnings Amount,Adjust Code,Adjust Amount
ABC,123,3980 ,78.52,,12331,,,,,,,,,
ABC,123,4026 ,29.38,,12331,,,,,,,,,
ABC,123,5065 ,64.46,,12331,,,,,,,,,
ABC,123,5125 ,80.00, 0.54,12331,,,,,,,,,
The only difference is the Rate area. It is the 7th separated value from the left, or 9th from the right. The first line remains intact.
Is there a way for the batch/vbs to determine the comma separated value position, delete the value or replace it with 'nothing', and ignore the first line?
For this example, we can assume the file will always be named file.csv, and located in D:\location - 'D:\location\file.csv'
Thank you!
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q46534752.txt"
SET "outfile=%destdir%\outfile.txt"
:: Remove the output file
DEL "%outfile%" >NUL 2>nul
:: To reproduce the first line intact
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO >"%outfile%" ECHO %%a&GOTO hdrdone
:hdrdone
(
REM to process the header line, remove the "skip=1" from the "for...%%a" command
FOR /f "usebackqskip=1delims=" %%a IN ("%filename1%") DO (
REM step 1 - replace all commas with "|," to separate separators
SET "line=%%a"
SET "line=!line:,=|,!"
FOR /f "tokens=1-7*delims=|" %%A IN ("!line!") DO (
SET "line=%%A%%B%%C%%D%%E%%F%%H"
ECHO !line:^|=!
)
)
)>>"%outfile%"
GOTO :EOF
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q46534752.txt containing your data for my testing.
Produces the file defined as %outfile%
Processing of the header line is an issue. The code as presented should do as you ask, but it seems illogical to retain the column name in the resultant file when the process is intended to remove that column. To process the header line also, delete the first for line and remove the skip=1 (which skips the first line) from the second.
The fundamental issue is that batch treats a string of delimiters as a single delimiter, so it's necessary to separate those delimiters. This is not possible against a metavariable, but can be done within a loop by transferring the metavariable into an ordinary environment variable (line) and performing the string-replace ceremony on that ordinary variable in delayed expansion mode.
So - replace each , with |,, then process the resultant string using | as a delimiter. Note that the metavariable is in a different case for the second for - one of the few occasions where cmd is case-sensitive. Reconstruct the string, omitting column 7 (%%G) and using the * token meaning the eighth token (%%H) receives the remainder-of-line after the highest explicitly-mentioned token number (7) and echo it after removing remaining | characters.
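The same idea on a shorter line, as a sketch. The sample line and the choice of dropping the fourth field are made up for illustration:

```batch
@echo off
setlocal EnableDelayedExpansion
set "line=ABC,123,,10.00,,X"
rem Put a | in front of every comma so that no two delimiters are adjacent.
set "line=!line:,=|,!"
echo !line!
rem %%D holds the fourth field with its leading comma; %%E receives the rest.
for /F "tokens=1-4* delims=|" %%A in ("!line!") do (
    set "out=%%A%%B%%C%%E"
    echo !out:^|=!
)
```

The first echo should show ABC|,123|,|,10.00|,|,X and the second ABC,123,,,X. Because each token carries its own leading comma, leaving a token out removes the field together with its separator.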
Note that it is normal policy to refuse code-requests on SO, and only respond in a manner to fix faulty code. In this case however, succeeding browsers may find this response to be the key to doing a similar task and hence refrain from posting unnecessarily. Also, I'm bored witless.

How to drop all but last cell in CSV using CMD

My goal is to write a script that will monitor process memory usage and run a % based comparison on it to determine if there is a memory leak in the said process.
I am using the following command to get the memory usage of the process:
tasklist /fi "imagename eq %PROCESS%" /FO csv | findstr K
SAMPLE:
"cmd.exe","11640","Console","1","3,160 K"
This gives me a CSV file with last cell being the memory usage. I have two problems that I need help with.
Problem 1) How do I drop all but the last cell so that I can then assign the Kb used to a variable for comparison.
Problem 2) How do I get rid of the comma in the number? That kind of makes using comma as delim hard :/
Is there a better command than tasklist for this? I just need the raw number that the program is using, it can be in KB or MB.
I'd love to be able to not have dependencies, but if I have to have dependencies I can include them with the batch.
Also is there any way for findstr to not return the entire line?
Thanks for any help! I've been trying to get this solved for two days now with not much luck.
@ECHO OFF
SETLOCAL
FOR /f "delims=" %%i IN (memcsv.csv) DO CALL :process %%i
GOTO :EOF
:process
SET memsize=%~5
SET memsize=%memsize:,=%
ECHO memsize found = %memsize%
GOTO :eof
This should get your output into a variable called memsize.
It uses a file memcsv.csv as input, but you could replace memcsv.csv with
'tasklist /fi "imagename eq %PROCESS%" /FO csv ^| findstr K'
to operate directly on the output of FINDSTR. Your resultant line would thus be
FOR /f "delims=" %%i IN ('tasklist /fi "imagename eq %PROCESS%" /FO csv ^| findstr K') DO CALL :process %%i
which, for ease of legibility could be entered as
FOR /f "delims=" %%i IN (
'tasklist /fi "imagename eq %PROCESS%" /FO csv ^| findstr K'
) DO CALL :process %%i
Note that the line-breaks are specific - before and after the single-quote.
Also that the single-quotes are REQUIRED and that there is a caret (^) before the pipe (|) which tells cmd that the pipe is part of the command to be executed, not part of the FOR command
Edit to add explanation of HOW.
The output of the tasklist...|findstr... can be used as input to a for/f as if it was a file. All you need do is to surround the command with SINGLE-QUOTES and ensure that redirectors like | < > are "escaped" by a caret.
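A minimal sketch of the same pattern with a command that is available on any Windows installation. The ver command and the search string are stand-ins chosen for illustration, not part of the original task:

```batch
@echo off
rem The caret keeps the pipe inside the quoted command instead of letting
rem cmd apply it to the FOR command itself.
for /F "delims=" %%i in ('ver ^| findstr /I "windows"') do echo Captured: %%i
```

Each line that findstr lets through is handed to the loop body in %%i, exactly as a line read from a file would be.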
FOR /F "reads" the "file" line-by-line, assigning (by default) the first "token" in the line to the "metavariable" (the loop-control variable, %%i in the above case). This behaviour can be modified by the addition of control-clauses to the FOR/F. You may use "tokens=x,y,z" for instance to assign token number x, number y and number z to %%i, %%j, %%k respectively.
TOKENS are counted from 1 and have a value of the line contents up to a (series of) delimiter(s). By default, delimiters are spaces, commas, semicolons and TABs, so a line
TOKEN_ONE TOKEN_2,TOKEN_THREE;Token_FOUR
when seen by
for /f "tokens=1,3,4" %%i in (filecontainingaboveline) do
would set %%i=TOKEN_ONE %%j=TOKEN_THREE %%k=Token_FOUR
Using "delims=" turns OFF the delimiters and hence the ENTIRE line is assigned to the metavariable.
HENCE, in the above code, the entire line is assigned to %%i and delivered to the subroutine :process.
From :process's point-of-view, it has been given the argument "cmd.exe","11640","Console","1","3,160 K" which it interprets as a sequence of 5 parameters separated by commas - and a comma (or any other separator) WITHIN "quotes" is data, not a separator.
Parameter number 5 is accessed by %5 - and that is "3,160 K" - including the quotes and comma.
The variable is set to the value of the fifth parameter - the tilde (~) means "remove enclosing quotes." Hence memsize acquires a value of 3,160 K
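The parameter handling can be checked with a tiny script. This is a sketch in which the label :show is a made-up name and the sample line from the question is passed literally:

```batch
@echo off
call :show "cmd.exe","11640","Console","1","3,160 K"
goto :EOF

:show
rem The comma inside the quotes is data, so the quoted value stays one parameter.
echo With quotes:    %5
echo Without quotes: %~5
goto :EOF
```

The first echo should show "3,160 K" including the quotes and the second 3,160 K without them.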
The next SET replaces the string after the colon in the nominated variable with the string after the = - replace commas with nothing, and assign the result to the memsize variable.
The goto :eof means 'go to the physical end-of-file'. It is very specific - the colon MUST be present. Reaching end-of-file terminates a subroutine or batch-process.
To remove the last 2 characters of the variable, you could use
SET var=%var:~0,-2%
where var is the variable-name.
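Put together, the two string operations turn the tasklist value into a plain number. A minimal sketch, with the sample value from the question hard-coded:

```batch
@echo off
setlocal
set "memsize=3,160 K"
rem Remove every comma, then drop the trailing space and K.
set "memsize=%memsize:,=%"
set "memsize=%memsize:~0,-2%"
echo %memsize%
```

This should print 3160, a value that can then be used in an integer comparison with IF.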
SEE
SET /?
from the prompt for documentation.
Also GOTO /? and FOR/? for more details on these commands...

Output the batch result to CSV text after removing certain data and adding new data

I have never done any batch scripting before and so I need help in building one of them.
We have a file say "GetHistory.bat" and it accepts a parameter ID. The output of the script is as shown below:
Activity Name Status Date
----------------------------------------
Act1 Created 1-Jan-2013
Act2 Submitted 2-Jan-2013
Act3 Approved 2-Jan-2013
Now the problem is I need to export the output in txt as CSV without the header and parameter ID added to each line as below:
1001,Act1,Created,1-Jan-2013
1001,Act2,Submitted,2-Jan-2013
1001,Act3,Approved,2-Jan-2013
Any help to begin the script would be highly appreciated.
Thanks...!
With the assumption that the output of GetHistory.bat has been redirected into a file called history.txt, we could feed that into our new batch file, ParamCSV.bat, like so, with this result:
C:\stackoverflow>ParamCSV.bat 1001 < history.txt
1001,Act1,Created,1-Jan-2013
1001,Act2,Submitted,2-Jan-2013
1001,Act3,Approved,2-Jan-2013
To put together a quick script for this, I've referenced info from:
Read stdin stream in a batch file
What is the best way to do a substring in a batch file?
How to remove trailing and leading whitespace for user-provided input in a batch file?
DOS Batch - Function Tutorial provided a thorough overview of parameter passing in functions.
I came up with this batch script, ParamCSV.bat:
@echo off
:: ParamCSV.bat
::
:: Usage: ParamCSV.bat <Parameter_ID> < history.txt
::
:: Thanks to:
:: https://stackoverflow.com/questions/6979747/read-stdin-stream-in-a-batch-file/6980605#6980605
:: https://stackoverflow.com/questions/636381/what-is-the-best-way-to-do-a-substring-in-a-batch-file
:: https://stackoverflow.com/questions/3001999/how-to-remove-trailing-and-leading-whitespace-for-user-provided-input-in-a-batch
:: Copy input parameter to 'id'
set id=%1
setlocal DisableDelayedExpansion
for /F "skip=2 tokens=*" %%a in ('findstr /n $') do (
set "line=%%a"
setlocal EnableDelayedExpansion
set "line=!line:*:=!"
set "activity=!line:~0,17!"
call:trim !activity! activity
set "status=!line:~17,11!"
call:trim !status! status
set "date=!line:~28,11!"
call:trim !date! date
echo(!id!,!activity!,!status!,!date!
endlocal
)
goto:EOF
::function: trim
::synopsis: Removes leading and trailing whitespace from a string. Two
:: parameters are expected. The first is the text string that
:: is to be trimmed. The second is the name of a variable in
:: the caller's space that will receive the result of the
:: trim operation.
::
::usage: call:trim string_to_trim var_to_update
:: e.g. call:trim %myvar1% myvar2
::trim left whitespace
:trim
setlocal
set input=%~1
for /f "tokens=* delims= " %%a in ("%input%") do set input=%%a
::trim right whitespace (up to 100 spaces at the end)
for /l %%a in (1,1,100) do if "!input:~-1!"==" " set input=!input:~0,-1!
::return trimmed string in place
endlocal&set "%~2=%input%"
There are a number of assumptions that are made here, and if any of them change or are invalid, the script will break:
The output of GetHistory.bat has fixed-width columns, of widths 17, 11 and 11. You didn't provide an example of a two-digit day, so I've assumed the dates are right-aligned.
There are two header lines, which we skip in the for statement.
All output lines are for the same ID, so only one input parameter is expected, and it is the first element in all CSV output lines.