How to split CSV column values into other columns using batch script - csv

I'm new to batch files and this is a tricky question. In stores.csv there is a column called 'Image' which stores vertical-line-delimited image URLs as values. There are also additional columns called 'AltImage2', 'AltImage3', etc. How can I split the vertical-line-delimited string into columns that start with 'AltImage' for each row in the CSV? 'AltImage' columns only go to AltImage5, and there may not be five image URLs in a given row. I would also like to keep the first image URL in the 'Image' column if possible.
Example of headers and single row of data:
Company,Title,Image,AltImage2,AltImage3,AltImage4,AltImage5
Testco,U2X40,image1.png|image2.png|image3.png
Desired result after running batch:
Company,Title,Image,AltImage2,AltImage3,AltImage4,AltImage5
Testco,U2X40,image1.png,image2.png,image3.png
So far I've tried this:
for /f "tokens=3 delims=, " %%a in ("stores.csv") do (
echo run command here "%%a"
)
But I cannot even echo the values in the Image column.
Here is a solution using Bash script (unfortunately I need batch): How do I split a string on a delimiter in Bash?

@echo off
setlocal
< stores.csv (
rem Read and write the header
set /P "header="
call echo %%header%%
rem Process the rest of lines
for /F "tokens=1-3 delims=|" %%a in ('findstr "^"') do echo %%a,%%b,%%c
)

I think this handles your parsing problem. Pay attention to quotes and the usebackq option.
for /f "skip=1 tokens=3 delims=," %%a in (stores.csv) do for /f "tokens=1-5 delims=|" %%b in ("%%a") do echo %%b %%c %%d %%e %%f
Here's a fuller solution to play with. There may be a more elegant way to handle optional commas. And you'll have to handle directing the output to whichever place is appropriate.
@echo off
setlocal enabledelayedexpansion
echo Company,Title,Image,AltImage2,AltImage3,AltImage4,AltImage5
rem Outer loop: %%a=Company, %%b=Title, %%c=pipe-delimited Image value
for /f "skip=1 tokens=1-3 delims=," %%a in (stores.csv) do (
  rem Inner loop: %%d..%%h receive up to five image URLs
  for /f "tokens=1-5 delims=|" %%d in ("%%c") do (
    set "line=%%a,%%b,%%d"
    rem Append a comma and the next URL only when that URL exists
    if not "%%e"=="" set "line=!line!,%%e"
    if not "%%f"=="" set "line=!line!,%%f"
    if not "%%g"=="" set "line=!line!,%%g"
    if not "%%h"=="" set "line=!line!,%%h"
    echo !line!
  )
)
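As a way to check what the batch script should print, here is a minimal reference sketch of the same transformation. It is written in Python rather than batch, so it is only useful for comparing expected rows, not as a replacement for the script. The function name `split_image_column` and its parameters are made up for this illustration; the sample data is the row from the question.

```python
import csv
import io


def split_image_column(csv_text, image_index=2, max_images=5):
    """Split the pipe-delimited Image field into separate columns.

    image_index and max_images are illustrative parameters (assumptions):
    the Image column is the third field and at most five URLs are kept.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")

    # Copy the header row unchanged.
    writer.writerow(next(reader))

    for row in reader:
        if len(row) <= image_index:
            # Row has no Image field; write it back as it is.
            writer.writerow(row)
            continue
        images = row[image_index].split("|")[:max_images]
        # Keep the columns before Image, then one column per URL.
        writer.writerow(row[:image_index] + images)
    return out.getvalue()


sample = (
    "Company,Title,Image,AltImage2,AltImage3,AltImage4,AltImage5\n"
    "Testco,U2X40,image1.png|image2.png|image3.png\n"
)
result = split_image_column(sample)
print(result, end="")
```

The first URL stays in the Image column and the rest move to the following columns, which is the result the question asks for.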

Read the file line by line and replace | with , (you have to escape the | and use delayed expansion):
@echo off
setlocal enabledelayedexpansion
(
for /f "delims=" %%a in (old.csv) do (
set line=%%a
echo !line:^|=,!
)
)>new.csv

Related

BATCH - Remove some characters from filename

I've files with this name:
FirstPart_SecondPart_ThirdPart.zip_FourthPart_FifthPartX.csv
where X is a one-digit number.
It should be renamed via batch scripting as:
FirstPart_SecondPart_ThirdPart.zip_FifthPartX.csv
So I'd like to remove the FourthPart. Please note that all the parts ALWAYS have the same length: FirstPart is always 7 digits, SecondPart is always 9 digits, etc.
Here is what I've tried:
ren *.zip_*FifthPart?.csv *.zip_FifthPart?.csv
But it does not work.
Any help, please?
Within the appropriate directory
…at the Command prompt:
For /F "EOL=_Tokens=1-4*Delims=_" %A In ('Dir/B/A-D "*_*_*_*_*.csv"') Do @Ren "%A_%B_%C_%D_%E" "%A_%B_%C_%E"
…in a batch file:
@For /F "EOL=_Tokens=1-4*Delims=_" %%A In ('Dir/B/A-D "*_*_*_*_*.csv"'
) Do @Ren "%%A_%%B_%%C_%%D_%%E" "%%A_%%B_%%C_%%E"
Because you are already certain of the file name format and character numbers I feel that the best approach would be to utilise Where instead of Dir.
For example, in a batch file, assuming FirstPart is always 7 digits, SecondPart is always 9 digits, ThirdPart is always 6 digits, FourthPart is always 5 digits and FifthPart is always 8 digits:
@For /F "EOL=_Tokens=1-4*Delims=_" %%A In (
'Where .:"???????_?????????_??????_?????_????????.csv"'
) Do @Ren "%%A_%%B_%%C_%%D_%%E" "%%A_%%B_%%C_%%E"
You can use a for /F loop to split and rebuild the file names:
for /F "delims= eol=|" %%F in ('dir /B /A:-D "*_*_*_*_*.csv"') do (
for /F "tokens=1-4* delims=_ eol=_" %%A in ("%%F") do (
ren "%%F" "%%A_%%B_%%C_%%E"
)
)
Given that none of the parts contain underscores (_) on their own and none of them are empty, a single loop is sufficient:
for /F "tokens=1-4* delims=_ eol=_" %%A in ('dir /B /A:-D "*_*_*_*_*.csv"') do (
ren "%%A_%%B_%%C_%%D_%%E" "%%A_%%B_%%C_%%E"
)
Here is an approach with an additional filter for file names using findstr in order to exclude files that do not match the name specifications:
for /F "tokens=1-4* delims=_ eol=_" %%A in ('
dir /B /A:-D "*_*_*_*_*.csv" ^| findstr /I "^[^_][^_]*_[^_][^_]*_[^_][^_]*_[^_][^_]*_[^_].*\.csv$"
') do (
ren "%%A_%%B_%%C_%%D_%%E" "%%A_%%B_%%C_%%E"
)
Two possibilities to solve this:
(1) As all parts always have the same length, you can just use substrings to cut the FourthPart_ part:
@ECHO OFF
SETLOCAL EnableDelayedExpansion
FOR /F "tokens=*" %%G IN ('DIR /B *.csv') DO (
SET new_filename=%%G
SET new_filename=!new_filename:~0,35!!new_filename:~46!
ECHO REN "%%G" "!new_filename!"
)
(2) Alternatively, if FourthPart_ is a static string, you might also get away with removing the FourthPart_ using search & replace:
@ECHO OFF
SETLOCAL EnableDelayedExpansion
FOR /F "tokens=*" %%G IN ('DIR /B *.csv') DO (
SET new_filename=%%G
SET new_filename=!new_filename:FourthPart_=!
ECHO REN "%%G" "!new_filename!"
)
These batch files will only output the commands to be issued. Remove the ECHO once you've inspected the output and you're confident it does what you want.
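To preview which names the token-based loops above should produce, the split-and-rebuild logic can be sketched outside of batch. The following Python snippet is only a reference for the expected result, not part of any answer; `drop_fourth_part` is a hypothetical helper, and the sample name uses 1 for the digit X.

```python
def drop_fourth_part(filename):
    """Remove the fourth underscore-delimited part from a file name."""
    # Split into at most five pieces: four parts plus "everything else",
    # which plays the role of the trailing * in tokens=1-4*.
    parts = filename.split("_", 4)
    if len(parts) < 5:
        # The name does not have five parts; leave it unchanged.
        return filename
    first, second, third, _fourth, rest = parts
    return "_".join([first, second, third, rest])


old_name = "FirstPart_SecondPart_ThirdPart.zip_FourthPart_FifthPart1.csv"
new_name = drop_fourth_part(old_name)
print(new_name)
```

Like the ECHO REN variants, this only computes the new name and renames nothing.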

Batch: Fill array by text file with escaped characters

I've got two arrays to fill from a text file. One with UTF-8 umlauts and one with escaped.
all_headings_html_umlauts_escaped.txt
^&Uuml^;berblick
^&Auml^;pfel
^&Ouml^;sterreich
all_headings_utf8_umlauts.txt
Überblick
Äpfel
Österreich
My batch file:
@echo off
:: Build array to iterate through
set /A n=0
for /F "usebackq delims=" %%a in ("all_headings_utf8_umlauts.txt") do (
set /A n+=1
REM call echo %%n%%
call set arrayutfeight[%%n%%]=%%a
call set o=%%n%%
)
for /L %%i in (1,1,%o%) do call echo %%arrayutfeight[%%i]%%
pause
:: Build array to iterate through
set /A p=0
for /F "usebackq delims=" %%b in ("all_headings_html_umlauts_escaped.txt") do (
set /A p+=1
REM call echo %%k%%
call set arrayhtmlescaped[%%p%%]=%%b
call set q=%%p%%
)
for /L %%i in (1,1,%q%) do call echo %%arrayhtmlescaped[%%i]%%
pause
The output of the first array works perfectly, as it should, but the output of the second one is "ECHO is off" three times.
Any ideas why, and how I can solve this issue? I really need the batch file to output ^&Uuml^;berblick... from the array.
KR
Mark
Handling the ^ caret character is complicated in a batch file. In certain cases the character is duplicated when it appears in a line. Because of this, the call set "arrayhtmlescaped[%%p%%]=%%b" line stores two carets for each one in the file, so the extra carets must be removed. The simplest way to do that is with Delayed Expansion, but in the echo command the carets are placed outside quotes, so each caret must be escaped with an additional one.
@echo off
setlocal EnableDelayedExpansion
:: Build array to iterate through
set /A p=0
for /F "usebackq delims=" %%b in ("all_headings_html_umlauts_escaped.txt") do (
set /A p+=1
REM call echo %%k%%
call set "arrayhtmlescaped[%%p%%]=%%b"
call set q=%%p%%
)
for /L %%i in (1,1,%q%) do echo !arrayhtmlescaped[%%i]:^^^^=^^!
pause

batch - modify each csv files in subfolders

I am trying to write a batch script which updates the header line of each .csv file in a folder (and its subfolders).
Since I don't want to modify the source files, I create new files with the new headers, the same content and the new extension ".csv.modified".
The script works fine when I have only one .csv (I just remove the /s), but it ignores the content of the other files when there is more than one.
Note: I have many subfolders and some of their names contain whitespace.
Any ideas?
@echo off
cls
setlocal enabledelayedexpansion
set HEADERS=header1,header2
for /f "delims=" %%i in ('dir /b /s *.csv') do (
set filename=%%~i
echo !filename!
echo.
set cpt=1
set new_filename=!filename!.modified
@copy nul "!new_filename!"
echo creating !new_filename!
echo %HEADERS%>"!new_filename!"
for /f %%a in (%%~i) do (
set line=%%a
if !cpt! gtr 2 (
echo Y
echo !line!>>"!new_filename!"
) else (
echo N
)
echo !cpt! %%a
set /a cpt=!cpt!+1
)
)
endlocal
This should be all you need to achieve that:
@SET "HEADERS=header1,header2"
@FOR /F "DELIMS=" %%A IN ('DIR/B/S/A-D-S-L *.csv') DO @((ECHO %HEADERS%
MORE +1 "%%~A")>"%%~A.modified")
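The answer above writes the new header and then uses MORE +1 to copy everything after the first line. As a rough reference for what each ".csv.modified" file should contain, the same idea can be sketched in Python (not batch). The function name `replace_header` and the sample content are assumptions made for this illustration.

```python
def replace_header(csv_text, new_header):
    """Return csv_text with its first line replaced by new_header."""
    lines = csv_text.splitlines()
    # Everything after the first line is kept unchanged,
    # which is the part MORE +1 prints in the batch version.
    body = lines[1:]
    return "\n".join([new_header] + body) + "\n"


original = "old1,old2\n1,2\n3,4\n"
modified = replace_header(original, "header1,header2")
print(modified, end="")
```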

Combined csv via cmd

I'm currently working on a way to combine 3 csv files and have the following script to do so:
Script
@echo off
ECHO Set working directory
pushd %~dp0
setlocal ENABLEDELAYEDEXPANSION
set cnt=1
for %%i in (*.csv) do (
if !cnt!==1 (
for /f "delims=" %%j in ('type "%%i"') do echo %%j >> combined.csv
) else if %%i NEQ combined.csv (
for /f "skip=1 delims=" %%j in ('type "%%i"') do echo %%j >> combined.csv
)
REM increment count by 1
set /a cnt+=1
)
This works like a charm and also strips the header of the other 2 csv files in the working dir.
The script now outputs combined.csv, which is nice but I would like the script to output NL2 ddmmyyyy.csv.
The issue I have and can't seem to figure out is how to make the name of the output file incremental and date-based.
You can parse the date elements from the DATE variable. The format of the date will vary based on locale. Mine is set to YYYY-MM-DD, but yours may be different.
M:>echo %DATE%
2016-01-27
8:23:58.28 \\SWPDCENDWXTK01\D$ M:\DW_Devl\Paul\phs_dmx_config\bin
M:>SET FNDATE=%DATE:~8,2%%DATE:~5,2%%DATE:~0,4%
8:24:08.54 \\SWPDCENDWXTK01\D$ M:\DW_Devl\Paul\phs_dmx_config\bin
M:>ECHO %FNDATE%
27012016
Inside the loop, set a variable and use it for the output.
SET COMBO_FILENAME=NL!cnt!%FNDATE%.csv
ECHO %%j >>!COMBO_FILENAME!
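The substring offsets above depend on the YYYY-MM-DD layout. To double-check them, here is a small Python sketch (not batch) that performs the same slicing on the sample date and also builds the name from a real date object, which does not depend on locale. The helper `dated_filename` and the "NL2 " prefix are illustrative assumptions taken from the file name mentioned in the question.

```python
from datetime import date


def dated_filename(prefix, day):
    """Build '<prefix>ddmmyyyy.csv' from a date object."""
    return f"{prefix}{day.strftime('%d%m%Y')}.csv"


# Same slicing as %DATE:~8,2%%DATE:~5,2%%DATE:~0,4% on "2016-01-27".
stamp_source = "2016-01-27"
fndate = stamp_source[8:10] + stamp_source[5:7] + stamp_source[0:4]
print(fndate)

name = dated_filename("NL2 ", date(2016, 1, 27))
print(name)
```

If %DATE% uses another layout on your machine, the offsets in the batch SET line have to be adjusted accordingly.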

Merge CSV Files (Without Duplicates) in Batch

I want to merge two similar CSV files using batch. I have found a file that was working perfectly and now is not. This may have been due to renaming the CSV files that were used.
I want what is demonstrated below:
File 1:
name1,group1,data1
name2,group2,data2
name3,group3,data3
File 2:
name1,group1,data1,time1
name2,group2,data2,time2
Merged file:
name1,group1,data1,time1
name2,group2,data2,time2
name3,group3,data3
(Note that name3 has no fourth column because it does not appear in File 2.)
The following code was modified from: http://www.experts-exchange.com/OS/Microsoft_Operating_Systems/MS_DOS/Q_27694997.html.
@echo off
set "tmpfile=%temp%\importlist.tmp"
set "csvfile=importlist.csv"
copy nul "%tmpfile%" >nul
echo.
echo Processing all CSV files...
set "header="
for %%a in (%1) do (
if not "%%a"=="%csvfile%" (
set /p =Processing %%a...<nul
for /f "tokens=1* usebackq delims=," %%b in ("%%a") do (
if /i "%%b"=="Keyword" (
if not defined header (
set /p =Found header...<nul
set "header=%%b,%%c"
)
) else (
title [%%a] - %%b,%%c
findstr /b /c:"%%b" /i "%tmpfile%">nul || echo %%b,%%c>>"%tmpfile%"
)
)
echo OK
)
)
echo Finished processing all CSV files
echo.
echo Creating %csvfile%
echo %header%>"%csvfile%"
set /p =Sorting data...<nul
sort "%tmpfile%">>"%csvfile%"
echo OK
del "%tmpfile%"
echo Finished!
title Command Prompt
exit /b
The problem is that when executed it just creates a sorted CSV with all the data from the first file and not the second.
I have attempted to get it working by putting quotation marks around the parameter (%1 - "directory*.csv") to no avail.
You might try this:
@echo off &setlocal disabledelayedexpansion
for /f "delims=" %%a in (file1.csv) do set "#%%~a=7"
for /f "tokens=1-4delims=," %%a in (file2.csv) do (
set "lx1=#%%~a,%%~b,%%~c"
setlocal enabledelayedexpansion
if defined !lx1! (
endlocal
set "#%%~a,%%~b,%%~c="
) else (
endlocal
)
set "#%%~a,%%~b,%%~c,%%~d=4"
)
(for /f "delims==#" %%a in ('set #') do echo %%~a)>merge.csv
type merge.csv
It doesn't work if you have = or # in your data.
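For comparison, the merge rule from the question can be sketched in Python (not batch) as a reference for the expected merged file. It assumes the first three columns identify a row and that the longer version of a row wins; `merge_rows` and `key_columns` are hypothetical names, and the sample rows are the ones from the question.

```python
def merge_rows(file1_lines, file2_lines, key_columns=3):
    """Merge CSV lines, keeping one row per key (the first key_columns fields)."""
    merged = {}
    for line in list(file1_lines) + list(file2_lines):
        line = line.strip()
        if not line:
            continue
        fields = line.split(",")
        key = tuple(fields[:key_columns])
        # Keep whichever version of the row has more columns.
        if key not in merged or len(fields) > len(merged[key]):
            merged[key] = fields
    # Sorted output, similar to the alphabetical listing of "set #".
    return [",".join(merged[key]) for key in sorted(merged)]


file1 = ["name1,group1,data1", "name2,group2,data2", "name3,group3,data3"]
file2 = ["name1,group1,data1,time1", "name2,group2,data2,time2"]
merged_lines = merge_rows(file1, file2)
print("\n".join(merged_lines))
```

The plain split on commas assumes no quoted fields containing commas, which also holds for the batch version.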