Renaming CSV column headers and merging results with PowerShell

So I'm just starting out with this whole PowerShell thing and so far so good - until now. I just can't figure out how to do this!
I'm looking at manipulating CSV files that are output from one system (whose output I can't change): renaming some column headers and merging a couple of the columns into one so that the result matches the input requirements for uploading into another system (again, I can't change those parameters).
So, as an example.
The first file is created:
File1.csv
"A","B","C""1","2","3"
I want a PowerShell script that will output:
File2.csv
"X","Y""1","23"
So I can import it into another system.
I hope that all makes sense, and thanks in advance for any assistance.

I'm going to assume that your actual/desired formats of your files look like this:
"A","B","C"
"1","2","3"
"X","Y"
"1","23"
rather than having everything on one line. If that's correct, you can import File1.csv with Import-Csv, then rename and merge columns with calculated properties:
... | Select-Object @{n='X';e={$_.A}}, @{n='Y';e={$_.B + $_.C}} | ...
and write the result to File2.csv with Export-Csv.
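Putting those pieces together, a minimal sketch of the whole pipeline might look like this (the file names are taken from the question; -NoTypeInformation suppresses the #TYPE comment line that Export-Csv would otherwise prepend):

# Read File1.csv, rename A to X, merge B and C into Y, write File2.csv.
# The properties come in as strings, so "2" + "3" concatenates to "23".
Import-Csv File1.csv |
    Select-Object @{n='X';e={$_.A}}, @{n='Y';e={$_.B + $_.C}} |
    Export-Csv File2.csv -NoTypeInformation

Export-Csv quotes every field by default in Windows PowerShell, so the output matches the "X","Y" format the second system expects.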

Related

Splitting a CSV File by value of specific column

I have multiple CSV files that I need to split into 67 separate files each. Each file has over a million rows and dozens of columns, and one of the columns, "Code", ranges from 1 to 67, which is what I have to base the split on. I have been doing the split manually by selecting all of the rows for each value (1, 2, 3, etc.) and pasting them into their own CSV file, but this is taking way too long. I usually use ArcGIS to create some kind of batch file split, but I am not having much luck this time around. Any tips or tricks would be greatly appreciated!
If you have access to awk there's a good way to do this.
Assuming your file looks like this:
Code,a,b,c
1,x,x,x
2,x,x,x
3,x,x,x
You want a command like this:
awk -F, 'NR > 1 {print $0 >> "code" $1 ".csv"}' data.csv
That will save it to files like code1.csv etc., skipping the header line.
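If you also want the header row repeated in each output file, a variant along these lines should work (a sketch; note that unlike >>, awk's > truncates each output file on first use within the run, so re-running it doesn't duplicate rows):

awk -F, '
    NR == 1 { hdr = $0; next }                                       # remember the header
    !($1 in seen) { seen[$1] = 1; print hdr > ("code" $1 ".csv") }   # header on first use
    { print > ("code" $1 ".csv") }                                   # then the record itself
' data.csv

Like the original command, this assumes no quoted fields contain embedded commas; if yours do, reach for a real CSV parser instead.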

Line feed within a column in CSV

I have a CSV like the one below. Some of the columns contain a line break, like column B below. When I run wc -l file.csv, Unix returns 4, but these are actually 3 records. I don't want to replace the line breaks with spaces, because I am going to load the data into a database using SQL*Loader and want to load it as is. What should I do so that Unix treats a field with an embedded line break as part of a single record?
A,B,C,D
1,"hello
world",sds,sds
2,sdsd,sdds,sdds
Unless you're dealing with trivial cases (no quoted fields, no embedded commas, no embedded newlines, etc.), CSV data is best processed with tools that understand the format. Languages like Perl and Python have CSV parsing libraries available, there are packages like csvkit that provide useful utilities, and more.
Using csvstat from csvkit on your example:
$ csvstat -H --count foo.csv
Row count: 3
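If installing csvkit isn't an option, Python's standard csv module also handles quoted fields with embedded newlines. A minimal sketch, assuming the file is named file.csv as in the question:

#!/usr/bin/env python3
# Count logical CSV records rather than physical lines.
import csv

with open('file.csv', newline='') as f:
    print(sum(1 for _ in csv.reader(f)))   # prints 3 for the sample data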

Does anyone have a script to convert a Chrome Bookmarks file with [sub]*folders into a CSV file?

I want to be able to do Vimdiffs and Vimfolds on Bookmarks files that have been converted to CSV files, i.e. with one description and one URI per line. However, because the Bookmarks file has multiple levels of folders, the CSV file will also need fields for the different levels of folder names on each line.
I am new to jq but it seems like it should be able to do this sort of conversion?
Thanks,
Phil.
Have you tried any free tools like https://json-csv.com/
or json2csv: https://www.npmjs.com/package/json2csv
If neither of those works, perhaps try the following approach.
When I need to restructure data like this I write a set of loops that resolve each property I want for each line of my CSV. Let's say my JSON has Name, Email, and Phone, but for some reason they all sit at different object levels.
First write a loop that resolves Name, then a loop for Email, and one for Phone. At the end of the first loop call the second, and from the second call the third.
Then you can use jq -n, which lets you create JSON with no input.
So each record could be built with something like jq -n --arg name "$Name" '{NewName: $name}'.
Once you have clean JSON with all the data points at the same level, CSV conversion is smooth.
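If you do end up using jq on the Bookmarks file itself, a single recursive function can flatten the folder tree in one pass. A rough sketch, assuming the standard Chrome layout (a roots object whose folder nodes carry type, name and children, and whose leaf nodes carry type "url", name and url):

jq -r '
  def emit($path):
    if .type == "url" then [$path, .name, .url] | @csv
    elif .type == "folder" then (.children // [])[] | emit($path + "/" + .name)
    else empty end;
  .roots[] | objects | emit("")
' Bookmarks

That yields one CSV line per bookmark with the full folder path in the first field; splitting that path into one column per folder level is then a plain string-manipulation step.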
Hope this helps

Adding a prefix to header row in csv file with awk

I am currently working with datasets collected in large CSV files (over 1600 columns and 100 rows). Excel and LibreOffice Calc can't easily handle files this size for concatenating a prefix or suffix to the header row, which is what I would have done with a smaller dataset.
Researching the topic I was able to come up with the following command:
awk 'BEGIN { FS=OFS="," } {if(NR==1){print "prefix_"$0}; if(NR>1){print; next}}' input.csv >output.csv
Unfortunately, this only adds the prefix to the first cell. For example:
Input:
head_1,head_2,head_3,[...],head_n
"value_1","value_2","value_3",[...],"value_n"
Expected Output:
prefix_head_1,prefix_head_2,prefix_head_3,[...],prefix_head_n
"value_1","value_2","value_3",[...],"value_n"
Real Output:
prefix_head_1,head_2,head_3,[...],head_n
"value_1","value_2","value_3",[...],"value_n"
As the number of columns varies across the different CSV files, I would like a solution that doesn't require enumerating all of the columns, as the solutions I've found elsewhere do.
This is necessary because the following step is to combine several (5 or 6) large CSV files into a single CSV database by combining all their columns (the rows refer to the same instances, in the same order, across all files).
Thanks in advance for your time and help.
awk 'BEGIN{FS=OFS=","} NR==1{for (i=1;i<=NF;i++) $i="prefix_"$i} 1' file
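The reason your original command only changed the first cell is that "prefix_"$0 prepends the prefix to the whole record rather than to each field. The one-liner above instead loops over every field of the header row (NR==1) and prefixes each one; the trailing 1 is awk shorthand for "print the current record", which emits both the rewritten header and, untouched, every data row.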

Filter .json files to remove ones with null records

I've got a folder containing thousands of .json files named things like 99.json (the numbers are sequential). Some of them contain valid records, but others just contain null on a single line. I want to filter out the files that only contain null so they don't screw up my next step of processing. Surely this is easy, but I can't immediately see how to do it.
It would help, as an additional step, to combine the valid files (the ones with complete or partially complete records) into a single file as a list. But this is less important.
All suggestions gratefully appreciated. Many thanks.
To list the files that contain nothing but null, you can use grep's -l option (print the names of matching files) together with -x (match whole lines only, so files whose records merely contain a null value somewhere aren't caught):
grep -lx null *.json > badfiles.txt
To list the files with valid values, use -L instead, which prints the names of files that do not match:
grep -Lx null *.json > goodfiles.txt
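For the additional step of combining the valid files into a single list, one option is to slurp them into one JSON array with jq (a sketch, assuming jq is available and the file names contain no whitespace):

grep -Lx null *.json | xargs jq -s '.' > combined.json

One caveat: if there are enough files that xargs splits them across several jq invocations, you'd end up with several concatenated arrays rather than one, so for very large folders it's safer to move the bad files aside first and run jq -s '.' *.json directly.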