This is my first time working with batch files. I am trying to extract certain columns from the original CSV and write the output to a new CSV. The following code is what I wrote based on this link:
https://stackoverflow.com/a/17557532/16034206
@echo off
setlocal EnableDelayedExpansion
Rem for /f "skip=1 usebackq tokens=1,2,10,11 delims=," %%i in (sample.csv) do @echo %%i,%%j,%%k,%%l >>output.csv
echo "Your script is starting..."
FOR /F "skip=1 usebackq delims=" %%L in (sample.csv) DO (
set "line=%%L,,,,,,,,"
set "line=#!line:,=,#!"
FOR /F "tokens=1,2,10,11 delims=," %%a in ("!line!") DO (
set "param1=%%a"
set "param2=%%b"
set "param10=%%c"
set "param11=%%d"
set "param1=!param1:~1!"
set "param2=!param2:~1!"
set "param10=!param10:~1!"
set "param11=!param11:~1!"
if "%%~A"=="RH" echo !param1!, !param2!, !param10!, !param11! >> output.csv
)
)
echo "Your script has completed"
I want to add logic that checks that param1 contains the substring "@gmail.com" AND that param10 starts with the string "100" before writing that row of 4 columns to the CSV.
I checked how to use if-statement from this link: https://stackoverflow.com/a/17474377/10671013
but I have not found any links on SO discussing "containing substring" or checking for "starting with a string". Please advise.
Remove the substring you are looking for from the first column and compare the result with the original string; if they are not equal, the string contains the substring. Then check the first three characters of the other column. (This substring substitution is case-insensitive.)
if not "!param1:@gmail.com=!" == "!param1!" if "!param10:~0,3!" == "100" echo ...
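As a minimal sketch of the two tests in isolation, the script below runs them against hard-coded values. The values of param1 and param10 are hypothetical stand-ins for one CSV row, and both are assumed to be non-empty.

```batch
@echo off
setlocal EnableDelayedExpansion

rem Hypothetical sample values standing in for columns 1 and 10 of one row.
set "param1=jane.doe@gmail.com"
set "param10=100234"

rem "Contains" test: deleting the substring changes the value only if it was present.
if not "!param1:@gmail.com=!"=="!param1!" (
    rem "Starts with" test: compare the first three characters with 100.
    if "!param10:~0,3!"=="100" echo match: !param1!, !param10!
)
endlocal
```

With these values both tests succeed and the echo line runs. Changing param10 to something like 200234 should make the script print nothing.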
I have a comma-separated CSV file like this:
ID,USER_ID, COL3_STR, COL4_INT
id1,username1,exampleA, 5
id2,username1,exampleB,0
id3,username2,NULL,-1
id4,username3,,3,false,20
Each value from the 2nd column USER_ID must be replaced with testusername (except the header "USER_ID"). The values are different, so I can't search a defined string.
My idea was to use a for-loop and get the second token from each line to get the username. For example:
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
SET currentDir=%~dp0
SET "srcfile=%currentDir%\inputfile.csv"
SET "outfile=%currentDir%\result.csv"
for /f "tokens=2 delims=," %%A IN (%srcfile%) DO (
ECHO %%A)
ECHO done
PAUSE
Output:
USER_ID
username1
username1
username2
username3
So the 2nd column of the (new) csv file must look like:
USER_ID
testusername
testusername
testusername
testusername
I saw another question with a helpful answer.
Example: When each username is "admin":
(
for /f "delims=" %%A in (%srcfile%) do (
set "line=%%A"
for /f "tokens=2 delims=," %%B in ("admin") do set "line=!line:%%B=testuser!"
echo !line!
)
)>%outfile%
But this works only for a defined string. It's my first batch script and I don't know how to "combine" this for my situation. I hope somebody can help me.
Must work for Windows 7 and 10.
You need all the tokens (for writing the modified file), not just the second one:
for /f "tokens=1,2,* delims=," %%A in (%srcfile%) do echo %%A,testuser,%%C
(where * is "the rest of the line, undelimited"). %%B would be the username, so just write the replacement string instead.
You could use an if statement to process the first line differently, or you process it separately:
<"%srcfile%" set /p header=
(
echo %header%
for /f "skip=1 tokens=1,2,* delims=," %%A in (%srcfile%) do echo %%A,testuser,%%C
) > "%outfile%"
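One caveat with the tokens=1,2,* approach, sketched here with a hypothetical sample line: for /F treats adjacent delimiters as a single delimiter, so an empty field shifts the columns that follow it.

```batch
@echo off
rem Hypothetical row whose second field is empty (two commas in a row).
rem for /F collapses ",," into one delimiter, so the tokens shift left.
for /F "tokens=1,2,* delims=," %%A in ("id4,,exampleC,3") do echo [%%A] [%%B] [%%C]
```

This prints [id4] [exampleC] [3]: the third column is taken for the username and the empty field is lost. If the real data can contain empty fields, the one-liner above is not enough on its own.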
The following script (let us call it repl_2nd.bat) replaces the values in the second column of a CSV file and correctly handles empty fields (where separators occur next to each other like ,,):
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem // Define constants here:
set "_FILE=%~1" & rem // (path to input file; `%~1` is first argument)
set "_NVAL=testusername" & rem // (new string for second column)
set "_SEPC=," & rem // (separator character usually `,`)
rem // Especially handle first line:
< "%_FILE%" (
set "HEAD=" & set /P HEAD=""
setlocal EnableDelayedExpansion
echo(!HEAD!
endlocal
)
rem // Read input file line by line:
for /F usebackq^ skip^=1^ delims^=^ eol^= %%L in ("%_FILE%") do (
rem // Store current line string:
set "LINE=%%L"
rem // Toggle delayed expansion to avoid loss of `!`:
setlocal EnableDelayedExpansion
rem /* Replace each separator `,` by `","` and enclose whole line string in `""`,
rem resulting in all items to become quoted, even empty ones, hence avoiding
rem adjacent separators, which would become collapsed to one by `for /F`;
rem then split the edited line string at the first and second separators: */
for /F "tokens=1,2,* delims=%_SEPC% eol=%_SEPC%" %%A in (^""!LINE:%_SEPC%="^%_SEPC%"!"^") do (
rem /* Unquote the first item, then join a separator and the replacement string;
rem then remove the outer pair of quotes from the remaining line string: */
endlocal & set "REPL=%%~A%_SEPC%%_NVAL%" & set "REST=%%~C"
rem // Append the remaining line string with `","` replaced by separators `,`:
setlocal EnableDelayedExpansion & echo(!REPL!%_SEPC%!REST:"%_SEPC%"=%_SEPC%!
)
endlocal
)
endlocal
exit /B
To use the script on a file in the current working directory, use this command line:
repl_2nd.bat "inputfile.csv"
To store the output to another file, use the following command line:
repl_2nd.bat "inputfile.csv" > "outputfile.csv"
I have text file abc.txt with the contents as :
abc.txt :
{"nature":"calm","trees":"uprooted from the main area","name":"usdbuebcowecy821nkwh29y2bnso3ns389ye3wnsiwsn9usj","enrolled":"not yet"}
I need to extract the string "usdbuebcowecy821nkwh29y2bnso3ns389ye3wnsiwsn9usj" associated with name from abc.txt. The strings associated with name vary and are not static. Hence, whatever string is associated with name has to be extracted and written to a sample.json file.
Sample.json :
{
"requisite":{
"name": "usdbuebcowecy821nkwh29y2bnso3ns389ye3wnsiwsn9usj"
},
"land": {
"key": "890"
}
}
The name key in Sample.json should be updated with the name value extracted from the name field of abc.txt.
I tried below code snippet to extract the name field abc.txt file :
For /f "tokens=1 delims=:" %%j in ('dir /b /s "C:\abc.txt" ^|findstr /I ""name":"') do echo "%%j"
echo name is: %%j
However, the loop doesn't search for the name string, and I'm stuck. I'm new to batch scripting. Can anyone help me out?
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "filename1=%sourcedir%\q64985945.txt"
:: Read sourcefile to LINE
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO SET "line=%%a"
:: Change each { } and : to comma
SET "line=%line:{=,%"
SET "line=%line:}=,%"
SET "line=%line::=,%"
:: ensure NAME is not defined
SET "name="
:: process LINE
:: set NAME when the string `name` is detected, use that flag to set NAME to the value following.
:: Note that LINE will not contain {:}, so any of these values can be used as a flag to detect
:: `name` as the last value.
FOR %%a IN (%line%) DO IF DEFINED name (SET "name=%%~a"&GOTO done ) ELSE IF /i "%%~a"=="name" SET "name=:"
:noname
ECHO No name value found
GOTO :eof
:done
SET name
GOTO :EOF
You would need to change the setting of sourcedir to suit your circumstances. The listing uses a setting that suits my system.
I used a file named q64985945.txt containing your data for my testing.
The usebackq option is only required because I chose to add quotes around the source filename.
Unfortunately, you still keep the JSON text a, er, mystery. The sample you've posted doesn't tell us a number of things: whether the first occurrence of name is the one you require, or whether other conditions determine which value of name is to be selected. For instance, your original sample did not include nested brace-pairs. All of this is significant in devising a solution...
The dir command lists file names. You want to process the file contents, not the file name, so pass abc.txt as the parameter of the findstr command.
In that line you want the sixth token when splitting on colons and commas, right?
This works:
For /f "tokens=6 delims=:," %%j in ('findstr /I "name" abc.txt') do echo %%j
However, if the contents of file abc.txt is just one line (as you said in the question), then you don't even need the findstr command...
For /f "tokens=6 delims=:," %%j in (abc.txt) do echo %%j
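To see why the value is the sixth token, here is a small sketch that splits a line on colons and commas and prints the first six tokens. The line is a hypothetical, shortened stand-in for the contents of abc.txt, stored in a variable so that no file is needed.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical shortened line with the same shape as abc.txt.
set "line={"nature":"calm","trees":"uprooted","name":"usd","enrolled":"no"}"
rem Split on : and , and show the first six tokens with their positions.
for /F "tokens=1-6 delims=:," %%a in ("!line!") do (
    echo 1=%%a 2=%%b 3=%%c
    echo 4=%%d 5=%%e 6=%%f
    rem The ~ modifier removes the surrounding quotes from the value.
    echo name value without quotes: %%~f
)
endlocal
```

Note that the token position depends on the order of the keys and on no earlier value containing a colon or comma. If either assumption fails, the sixth token will be something else.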
This is another way to do it that extracts the values of all variables in the line:
@echo off
setlocal
rem Read a line from abc.txt
set /P "line=" < abc.txt
rem For example: {"nature":"calm","trees":"uprooted","name":"usd","enrolled":"not yet"}
rem Remove braces -> "nature":"calm","trees":"uprooted","name":"usd","enrolled":"not yet"
set "line=%line:~1,-1%"
rem Change ":" by = -> "nature=calm","trees=uprooted","name=usd","enrolled=not yet"
set "line=%line:":"==%"
rem Change "," by " & set " -> "nature=calm" & set "trees=uprooted" & set "name=usd" & set "enrolled=not yet"
rem and *execute* such a line inserting a SET command at beginning:
set %line:","=" & set "%
rem Now all variables have their values. For example:
echo {"name":"%name%"}
New solution added
Your first request was to extract the name field from the abc.txt file. However, you have now changed the problem to updating a line of the sample.json file, which in the original question had just one line.
Anyway, here it is a solution to your new problem:
@echo off
setlocal EnableDelayedExpansion
rem Get the value of "name": field in abc.txt file
for /f "tokens=6 delims=:," %%j in (abc.txt) do set "name=%%~j"
rem Get line number of "name": line minus one in sample.json file
for /F "delims=:" %%n in ('findstr /N "\"name\":" sample.json') do set /A "lines=%%n-1"
rem Process sample.json file and create sample.out
< sample.json (
rem Copy first N lines
for /L %%i in (1,1,%lines%) do set /P "line=" & echo !line!
rem Read and update the "name": line as requested
set /P "line="
for /F "delims=:" %%a in ("!line!") do echo %%a: "%name%"
rem Copy the rest of lines
findstr "^"
) > sample.out
move /Y sample.out sample.json
Note that this code is error-prone because Batch files are not designed to process JSON files. If the program fails with your real data because of a detail that differs from the posted data, please do not post here a request to fix the code! :(
batch-file/cmd has no support for JSON at all, so please use a tool like xidel that does.
dot notation:
xidel -s sample.json -e "($json).requisite.name:=json-doc('abc.txt').name"
XQuery:
xidel -s sample.json -e "$json/map:put(.,'requisite',{'name':json-doc('abc.txt')/name})"
Output (to stdout) in both cases:
{
"requisite": {
"name": "usdbuebcowecy821nkwh29y2bnso3ns389ye3wnsiwsn9usj"
},
"land": {
"key": "890"
}
}
To update the input file simply use --in-place:
xidel -s --in-place sample.json -e "[...]"
I am trying to sort a csv file on a specific column using batch scripting.
The CSV file has about 22 columns, and column L(10) contains zip codes. There are multiple records with the same zip code, and I need to sort these records in ascending numerical order.
This is what I've done so far,
for /F "tokens=1-22 delims=," %%a in (test.csv) do (
rem Define the sorting column in next line: %%a=1, %%b=2, etc...
set "line["%%l"]=%%d,%%f,%%l"
)
for /F "tokens=1* delims==" %%a in ('set line[') do echo %%b >> result2.txt
This is my result. It is removing records with duplicate zip codes. I should see multiple rows with the same zip code but, of course, with different names.
"John","Doe","12078"
"John","Doe3","12095"
"John","Doe5","12197"
FOR %%f in (*csv) do (
SET CurrentFile=%%f
SET /a NumLines=0
For /f %%j in ('Find "" /v /c ^< !CurrentFile!') Do (
Set /a NumLines=%%j
(set row=%~1) & (set last=%~1)
For /F "tokens=4-7 delims=," %%D in ('type !CurrentFile!') do (
if not defined row (set row=%%D %%F) else (set last=%%D %%F)
)
echo.
echo. Filename: !CurrentFile!
echo. Record Count: !NumLines!
echo. First Record Name:!row!
echo. Last Record Name: !last!
) >> Result.txt
)
ENDLOCAL
setlocal EnableDelayedExpansion
for /F "tokens=1-22 delims=," %%a in (test.csv) do (
rem Define the sorting column in next *three lines*: %%a=1, %%b=2, etc...
if not defined V%%~l set "V%%~l=1000"
set /A "V%%~l+=1"
set "line[%%~l!V%%~l!]=%%d,%%f,%%l"
)
for /F "tokens=1* delims==" %%a in ('set line[') do echo %%b >> result2.txt
If there are multiple records with the same zip code, then it is necessary to identify each one of them. This solution uses a variable called V<zip code> as counter for each one of the records with the same zip code. Then, the value of such a variable is joined to the zip code itself in order to create a unique key for each record. The program assumes that there is a maximum of 999 records with the same zip code; if this value is not enough, just add a zero in if not defined V%%~l set "V%%~l=1000" line...
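The key construction can be tried on its own. This is only a sketch: the three zip codes are hypothetical, and the loop just prints the keys instead of storing lines.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical zip codes; two of the three records share 12078.
for %%l in (12095 12078 12078) do (
    rem Start a per-zip counter at 1000 the first time a zip code is seen.
    if not defined V%%l set "V%%l=1000"
    set /A "V%%l+=1"
    rem The zip code joined with its counter is unique for every record.
    echo key: %%l!V%%l!
)
endlocal
```

This prints the keys 120951001, 120781001 and 120781002, so the two records with zip code 12078 no longer overwrite each other. Because the set command lists variables sorted by name, these keys compare as text; that matches numeric order as long as all zip codes have the same number of digits.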
@ECHO OFF
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
SET "filename1=%sourcedir%\q56588370.txt"
SET "outfile=%destdir%\outfile.txt"
SET "sortfile=%destdir%\sortfile.txt"
SET /a sortcol=3
(
FOR /f "usebackqdelims=" %%a IN ("%filename1%") DO (
rem full line in %%a
SET "fullline=%%a"
CALL :sub %%a
)
)>"%sortfile%"
(
FOR /f "tokens=1*delims=+" %%a IN (' sort "%sortfile%"') DO (
ECHO %%b
)
)>"%outfile%"
DEL "%sortfile%"
GOTO :EOF
:sub
IF %sortcol% neq 1 FOR /L %%z IN (2,1,%sortcol%) DO SHIFT
ECHO %1+%fullline%
GOTO :eof
You would need to change the settings of sourcedir and destdir to suit your circumstances.
I used a file named q56588370.txt containing some dummy data for my testing.
Produces the file defined as %outfile%. %sortfile% is simply a temporary file having whatever name you desire within reason.
Retrieve each line of your file, and assign its content to a variable fullline, then execute the subroutine :sub with each line, passing the entire line as a parameter. Since each line must be a comma-separated list of items which may either be a quoted string or a string which doesn't contain spaces or commas, it can be decoded by the subroutine, so all that is required is to shift the parameter-list (columnrequired - 1) times and the required sort-data is in %1.
Output %1 followed by a delimiter and the entire line originally read into a temporary file (parenthesising a series of statements and redirecting sends the data that normally appears on the screen to the redirection destination), then sort it and remove the data prefixed to each line using the chosen delimiter.
This way, more than one column could be chosen, and the data manipulated as required - for instance, locally "zip codes" are 4-digit (which can begin 0) and other countries use other formats or the ever-popular extension code that might be applied to a ZIP can be recorded and processed.
Here's my test data:
"John","Doe","12345","moredata 1"
"John","Do, or not","12345","moredata 2"
"John","Doe 4","12344","moredata 3"
"John","Doe 5","12345","moredata 4"
"John","Doe 6","12345","moredata 5"
"John","Doe 7","12344","moredata 6"
and output:
"John","Doe 4","12344","moredata 3"
"John","Doe 7","12344","moredata 6"
"John","Do, or not","12345","moredata 2"
"John","Doe 5","12345","moredata 4"
"John","Doe 6","12345","moredata 5"
"John","Doe","12345","moredata 1"
I haven't found anything on the internet, so I need your help.
I have 2 CSV files that I would like to compare.
The first one looks like:
"Name","PrimarySmtpAddress","EmailAddresses"
The second one looks like:
"Name","$_.TotalItemSize.Value.ToMB()"
The output file must show which names are in both the first and the second file.
As output, I want a file with all the data from the first file, with "$_.TotalItemSize.Value.ToMB()" added at the end of each line.
For example, it would produce something like:
"Name","PrimarySmtpAddress","EmailAddresses","$_.TotalItemSize.Value.ToMB()",
I may not be very clear because my English is not perfect.
Can you please help me? I'm not very good at scripting.
Thank you very much.
edit :
REM @echo off
setlocal enabledelayedexpansion
set var1
set var2
for /f "tokens=1 delims=," %%A in (file2.txt) do (
set var1=%%A
echo %var1%
for /f "tokens=1 delims=," %%B in (file1.txt) do (
set var2=%%B
echo %var2%
if ("%var1%"=="%var2%")
(
echo equal var
)
else
(
echo not equal var
)
pause
)
)
pause
It looks like the IF is not working
For each line in 1.csv, look for the name in 2.csv and print the combined line.
The REGEX may look a bit strange to you, it's:
/rc:: r=use Regex, c:=use string (necessary, as there could be spaces)
^: "Start of string"
\": a literal "
%%~a: the name without quotes
\": another literal "
,: a literal , (optional)
"tokens=1,* delims=," means "put the first token into %%m and all the rest into %%n"
Note: there is no IF. It's replaced by findstr, which extracts just the line you need.
Note: this may be slow with big files (2.csv is read multiple times, once for each line in 1.csv).
@echo off
setlocal enabledelayedexpansion
for /f "tokens=1,* delims=," %%a in (1.csv) do (
for /f "tokens=1,* delims=," %%m in ('findstr /rc:"^\"%%~a\"," 2.csv') do (
echo %%a,%%b,%%n
)
)
Names that aren't in both files are skipped.
I have a simple CSV file with six values per row (%a-%f).
%a is a text string; the other values are all integers.
My problem is that the values %d and %e cannot be odd and must be rounded up.
Should I search for each odd integer one at a time, or is there a simpler way?
My CSV file looks like this:
ww-xx-yy-zzz,1,2,3,4,5
The following script accomplishes what you are trying to do:
@echo off
setlocal EnableExtensions DisableDelayedExpansion
rem Define constants here:
set "CSVFILE=data.csv" & rem (input file)
set "NEWFILE=data_new.csv" & rem (output file)
set "ROUNDEVEN=#" & rem (empty to round to odd, non-empty to round to even)
set "ROUNDUP=#" & rem (empty to round down, non-empty to round up)
if defined ROUNDEVEN (set /A ROUNDEVEN=0) else (set /A ROUNDEVEN=-1)
if defined ROUNDUP (set /A ROUNDUP=1) else (set /A ROUNDUP=0)
> "%NEWFILE%" (
for /F "usebackq eol=, tokens=1-6 delims=," %%A in ("%CSVFILE%") do (
set "TEXT=%%A"
setlocal EnableDelayedExpansion
set "ROUNDED="
for %%Z in (%%D %%E) do (
set /A VALUE=^(%%Z+ROUNDUP-ROUNDEVEN^)/2*2+ROUNDEVEN
set "ROUNDED=!ROUNDED!,!VALUE!"
)
echo(!TEXT!,%%B,%%C!ROUNDED!,%%F
endlocal
)
)
endlocal
exit /B
Here is the input CSV data of your question (file data.csv):
ww-xx-yy-zzz,1,2,3,4,5
...and the corresponding output CSV data (file data_new.csv):
ww-xx-yy-zzz,1,2,4,4,5
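The rounding expression can also be checked in isolation. The sketch below hard-codes the two values the script derives for "round up to even" (ROUNDEVEN=0, ROUNDUP=1) and applies the same formula to a few hypothetical sample integers.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Values the script derives for rounding up to even numbers.
set /A ROUNDEVEN=0
set /A ROUNDUP=1
for %%Z in (1 2 3 4 5) do (
    rem Integer division by 2 followed by *2 drops the odd remainder.
    set /A "VALUE=(%%Z+ROUNDUP-ROUNDEVEN)/2*2+ROUNDEVEN"
    echo %%Z becomes !VALUE!
)
endlocal
```

This prints 2, 2, 4, 4 and 6 for the inputs 1 to 5: odd values move up to the next even number and even values stay unchanged, which matches the 3 becoming 4 in the output above.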
The script only works if the following conditions are fulfilled:
the input CSV file contains exactly 6 columns; too many are simply ignored, too few may disrupt column/field mapping;
none of the columns/fields of the input CSV data is empty;
only the first column of the input CSV data contains text data, all the others contain integers;
none of the integer values to round has got leading zeros;