Read second column of csv

I have this CSV data file that looks like
a,b
1,2
3,4
data = readdlm("my/local/path", ',')
However, when I access data[1] I only get a; I thought it was supposed to be [a,b]. Something like data[1:2] only gives me values from the first column.
Any idea how I can access the second column?

From the docs for readdlm:
Read a matrix from the source where each line gives one row...
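So use data[row,col] syntax to get each element. A single index such as data[1] or data[1:2] walks down the first column, because the matrix is stored column by column; data[:,2] returns the whole second column.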

Split csv cell into columns

I'd like to split a csv column containing a dictionary-like structure into component columns. For example input/output data, see this spreadsheet. Data will always come in that format ({"key":value,...}), with an arbitrary number of key-value pairs.
I'm not necessarily looking for a full solution here. I'm more curious what my options are for parsing the data to create the output I want. I'm open to using Python for some of this.
Use in B3:
=INDEX(BYROW(A3:A5, LAMBDA(x, IFNA(HLOOKUP(B2:E2,
TRANSPOSE(SPLIT(TRIM(FLATTEN(SPLIT(REGEXREPLACE(x, "[\}\{"",]", ),
CHAR(10)))), ":")), 2, 0)))))
In your case, you can use the formula
=SPLIT(REGEXREPLACE(A1,""".*"":|\{|\}|\n|\r",""),",")
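Since the question mentions being open to Python, here is a minimal sketch of the same split done outside the spreadsheet. It assumes every cell is valid JSON (which the {"key":value,...} format suggests but the thread does not confirm); the function name split_cells and the sample cells are made up for illustration.

```python
import json

def split_cells(cells):
    """Turn JSON-like cell strings into a header plus one row per cell."""
    parsed = [json.loads(cell) for cell in cells]
    # Collect keys in first-seen order so the column order is stable.
    header = []
    for record in parsed:
        for key in record:
            if key not in header:
                header.append(key)
    # A key missing from a cell becomes an empty string in that row.
    rows = [[record.get(key, "") for key in header] for record in parsed]
    return header, rows

# Hypothetical sample cells, not taken from the linked spreadsheet.
cells = ['{"a":1,"b":2}', '{"b":3,"c":4}']
header, rows = split_cells(cells)
```

The header and rows can then be written back out with csv.writer. Cells that are not strict JSON would need a more forgiving parser.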

Remove last blank row in CSV using Logic App

I have a CSV file stored in SFTP where the last row is a blank, so the data looks like this in text:
a,b,c
d,e,f
,,
How can I use Logic App to remove that final row and then save it in BLOB? I have the following but will need some extra steps before the BLOB creation I think.
Considering the same sample, here is my Logic App.
Compose_2 finds the index of the last newline, which is where the empty last row begins. Below is the expression that I used to retrieve that index.
lastIndexOf(variables('Sample'),'\n')
Then in Compose_3 I select the part I want, which is everything before that index:
substring(variables('Sample'),0,outputs('Compose_2'))
NOTE:
Make sure you remove the extra '\' that gets attached to '\n' in the code view of Compose_2.
The final Compose_2 then looks like
lastIndexOf(variables('Sample'),'
')
Updated Answer
If the received data is coming from a CSV, you can split it into rows and use the take() expression to retrieve the rows you want.
Below is the expression in the Compose connector:
take(outputs('Split_To_Get_Rows'),sub(length(outputs('Split_To_Get_Rows')),1))
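To make the two approaches easier to compare, here is a rough Python equivalent of each expression. It is only a sketch of the logic, not something a Logic App runs, and it assumes the text has no trailing newline after the blank ",," row; the function names are made up.

```python
def drop_last_row_by_substring(text):
    # Mirrors substring(text, 0, lastIndexOf(text, '\n')).
    last_newline = text.rfind("\n")
    return text if last_newline == -1 else text[:last_newline]

def drop_last_row_by_take(text):
    # Mirrors take(rows, sub(length(rows), 1)) after splitting into rows.
    rows = text.split("\n")
    return "\n".join(rows[:len(rows) - 1])

sample = "a,b,c\nd,e,f\n,,"
by_substring = drop_last_row_by_substring(sample)
by_take = drop_last_row_by_take(sample)
```

If the file did end with a newline after the blank row, both versions would only strip that final empty line and leave ",," in place, so the input may need trimming first.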

Dynamically merge two CSV files using Dataweave in Mule

I get CSV files of different lengths from different sources. The columns differ from file to file; the only exception is that each CSV file always has an id column, which can be used to tie together records from different files. Two such CSV files need to be processed at a time: take the id column from the first file, match it against the rows of the second file, and create a third file containing the contents of both. An id can be repeated in the first file. An example is given below.
Please note that the first file may have 18 to 19 different combinations of data columns, so I cannot hardcode the transformation in DataWeave, and a new file may be added at any time. I want a dynamic approach, so that once written, the logic still works when a new file is added. These files can also get pretty big.
The sample files are given below.
CSV1.csv
--------
id,col1,col2,col3,col4
1,dat1,data2,data3,data4
2,data5,data6,data6,data6
2,data9,data10,data11,data12
2,data13,data14,data15,data16
3,data17,data18,data19,data20
3,data21,data22,data23,data24
CSV2.csv
--------
id,obectId,resid,remarks
1,obj1,res1,rem1
2,obj2,res2,rem2
3,obj3,res3,rem3
Expected file output -CSV3.csv
---------------------
id,col1,col2,col3,col4,objectid,resid,remarks
1,dat1,data2,data3,data4,obj1,res1,rem1
2,data5,data6,data6,data6,obj2,res2,rem2
2,data9,data10,data11,data12,obj2,res2,rem2
2,data13,data14,data15,data16,obj2,res2,rem2
3,data17,data18,data19,data20,obj3,res3,rem3
3,data21,data22,data23,data24,obj3,res3,rem3
I was thinking of using pluck to get the columns of the first file. The idea was to get the columns in the transformation without hardcoding them, but I am getting errors. After this, I still have the task of searching for the id and getting the values from the second file.
{(
using(keys = payload pluck $$)
(
payload map
( (value, index) ->
{
(keys[index]) : value
}
)
)
)}
I am getting the following error when using pluck
Type mismatch for 'pluck' operator
found :array, :function
required :object, :function
I am thinking of using groupBy on id on the second file to make searching easier, but I need suggestions on how to append the contents in one transformation to form the third file.
Since you want to combine both CSVs without renaming the column names, you can try something like below
var file2Grouped=file2 groupBy ((item) -> item.id)
---
file1 map ((item) -> item ++ ((file2Grouped[item.id])[0] default {}) - 'id')
Output:
id,col1,col2,col3,col4,obectId,resid,remarks
1,dat1,data2,data3,data4,obj1,res1,rem1
2,data5,data6,data6,data6,obj2,res2,rem2
2,data9,data10,data11,data12,obj2,res2,rem2
2,data13,data14,data15,data16,obj2,res2,rem2
3,data17,data18,data19,data20,obj3,res3,rem3
3,data21,data22,data23,data24,obj3,res3,rem3
The working expression is given below. Removing the id has to happen before the default:
var file2Grouped=file2 groupBy ((item) -> item.id)
---
file1 map ((item) -> item ++ ((file2Grouped[item.id])[0] - 'id' default {}))
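For readers who want to check the join logic outside Mule, the same idea can be sketched in Python: index the second file by id, then append its non-id columns to every matching row of the first file. This is an illustration of the technique rather than a DataWeave replacement; merge_on_id and the shortened sample data are made up, and only the id column name comes from the question.

```python
import csv
import io

def merge_on_id(csv1_text, csv2_text):
    file1 = list(csv.DictReader(io.StringIO(csv1_text)))
    file2 = list(csv.DictReader(io.StringIO(csv2_text)))
    # Keep the first file2 row per id, like (file2Grouped[item.id])[0].
    lookup = {}
    for row in file2:
        lookup.setdefault(row["id"], row)
    merged = []
    for row in file1:
        # Drop the duplicate id column; fall back to {} when nothing matches.
        extra = {k: v for k, v in lookup.get(row["id"], {}).items() if k != "id"}
        merged.append({**row, **extra})
    return merged

# Shortened, hypothetical versions of CSV1 and CSV2.
csv1 = "id,col1\n1,x\n2,y\n2,z\n"
csv2 = "id,remarks\n1,rem1\n2,rem2\n"
merged = merge_on_id(csv1, csv2)
```

No column names other than id are hardcoded, so a file with a different set of columns goes through the same code unchanged.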

Import CSV column from different file into new file

I have 2 CSV files almost identical with the following differences:
The first has a column, "date".
The second doesn't have "date" and has 50 fewer rows than the first ("email").
They are lists of subscribers with the date created. The second, however, is the updated list with the subscribers who wanted to be removed taken out, but it no longer has the date created.
Is there any way to import column "date" from 1st CSV into the 2nd CSV by making a reference to the "email" column so I can get the correct date of that subscriber?
Sorry, there doesn't seem to be a ready-made command line tool for this (writing one is probably an evening's worth of effort).
You could approach it in different ways. One more involved way is to load both files into database tables, do the merge with a select and a join on the two tables, and export the result back as CSV.
The simplest I could think of is to use R (assuming your CSVs have header names):
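csv1_data <- read.csv('/path/to/csv1.csv')
csv2_data <- read.csv('/path/to/csv2.csv')
# merge() joins on all column names the two data frames share
merged_csv <- merge(csv1_data, csv2_data)
# row.names=FALSE keeps the header aligned with the data columns
write.table(merged_csv, file="/path/to/merged_csv.csv", sep=",", row.names=FALSE)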
The first two lines load the data into R, the merge() call joins them using the default method (matching on the columns the two files share), and the final line exports the result as a CSV file, with the headers.
Hope this helps!
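If R is not available, the same lookup can be sketched with Python's standard csv module. This assumes both files have a header row with an "email" column and that the first file has a "date" column, as described in the question; the "name" column, the sample rows, and the function name add_dates are invented for the example.

```python
import csv
import io

def add_dates(old_csv_text, new_csv_text):
    # Build an email -> date lookup from the first (older) file.
    old_rows = csv.DictReader(io.StringIO(old_csv_text))
    date_by_email = {row["email"]: row["date"] for row in old_rows}
    # Attach the date to each subscriber still present in the second file.
    new_rows = list(csv.DictReader(io.StringIO(new_csv_text)))
    for row in new_rows:
        row["date"] = date_by_email.get(row["email"], "")
    return new_rows

# Hypothetical sample data.
old_csv = "email,name,date\na@x.com,Ann,2020-01-01\nb@x.com,Bob,2020-02-02\n"
new_csv = "email,name\na@x.com,Ann\n"
result = add_dates(old_csv, new_csv)
```

Subscribers that appear only in the second file get an empty date instead of being dropped, which differs from the default behaviour of merge() in R.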

how to load extracted file name into sql server table in SSIS

I have 3 CSV files in a folder, each containing the fields eid, ename, and country. The file names look like test1_20120116_034512, test1_20120116_035512, test1_20120116_035812, etc. My requirement is to take the latest file based on the timestamp and modified date, which I have done. Now I want to import the extracted file name into the destination table.
My destination table contains these fields:
filepath, filename, eid, ename, country
I posted about this before on this site and got an answer for extracting the file name; now I want to load the extracted FileName into the destination table.
Import most recent csv file to sql server in ssis
My destination table should end up with output like
C:/source test1_20120116_035812 1234 tester USA
In your DataFlow task, add a Derived Column Transformation. The value of CurrentFile will be the fully qualified path to the file. As you only want the file name, I would look to use a replace function on that with the base folder and then strip the remaining slash. This does not strip the file extension but you can add yet another call to REPLACE and substitute an empty string
Derived Column Name: filename
Derived Column:
Expression: REPLACE(REPLACE(@[User::CurrentFile], @[User::RootFolder], ""), "\\", "")
The above expects it to look like
CurrentFile = "C:\source\test1_20120116_035812.csv"
RootFolder = "C:\source"
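As a quick way to sanity-check what the nested REPLACE calls produce, here is the same string logic as a small Python sketch. It is not SSIS expression syntax; file_name_from_path is a made-up helper, and the extra replace for ".csv" is the optional extension-stripping step mentioned above.

```python
def file_name_from_path(current_file, root_folder):
    # Mirrors REPLACE(REPLACE(CurrentFile, RootFolder, ""), "\\", "").
    name = current_file.replace(root_folder, "").replace("\\", "")
    # One more replace drops the extension, as the answer suggests.
    return name.replace(".csv", "")

result = file_name_from_path(r"C:\source\test1_20120116_035812.csv", r"C:\source")
```

With the example values above, result is test1_20120116_035812, which matches the filename column in the desired output.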
Edit
I believe you've done something in your approach that I did not do. You should see a warning about possible truncation but given the values discussed in this and the preceding question, I don't believe the 4k limit on expressions will be of concern.
I will give you a +1 for providing an approach I wasn't aware of, but you'll still need to add a derived column to match your provided format (base path name)
The full path is provided by the custom property. Use the REPLACE expression above to remove the path info, but with the column [FileName] instead of @[User::CurrentFile].
I tried to get the file name through the procedure Billinkc gave, but it throws an error stating that the filename column failed because of a truncation error.
Anyhow, I tried a different approach to load the file name into the table.
Steps I used:
1. Right-click the Flat File Source and click Show Advanced Editor.
2. Select the Component Properties tab.
3. In the Custom Properties section there is a property FileNameColumnName.
I assigned FileName to that property, like
FileNameColumnName ----> FileName
That's it; I am able to get the file name into my destination table.