Extracting the first item in each row in R

I have a column with multiple items in each row, separated by semicolons, but I would like only the first item in each row. My data looks like this:
1 mmSM7.3.54;IGHV14-3*01;musIGHV236
2 mm7183.20.37;IGHV5-17*01;musIGHV219
3 mmIGHV5-9-1*02;musIGHV207;7183.14.25
4 mm7183.20.37;IGHV5-17*01;musIGHV219
5 mmIGHV7-1*03;S107.1.42
6 mmIGHV9-2*01;VH9.13;musIGHV242;VGAM3.8-2-59
7 mmmusIGHV231;SM7.2.49;IGHV14-2*01
I would like a column that has just the first item of each row that looks like this:
1 mmSM7.3.54
2 mm7183.20.37
3 mmIGHV5-9-1*02
4 mm7183.20.37
5 mmIGHV7-1*03
6 mmIGHV9-2*01
7 mmmusIGHV231
Does anyone know a way to do this? Any help would be great. Thank you.

Do you mean the first item in each record, up to the first semicolon (;)? You did not mention the file type, but if it is text, CSV, RTF or XLSX, you can do this quickly and easily in Excel or most other spreadsheet applications.
1) Launch Excel, use File > Open (change the file type to All Files (*.*)) and open your file.
2) Select the column or all the cells that contain your data.
3) Click the DATA tab and choose Text to Columns > Delimited > Next > check the Semicolon box > Finish.
4) The first item in each record will now be in its own column. You can copy this column and save it in a new file, or delete everything you don't want and save the original file in the same format as before.
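For a scripted alternative to the spreadsheet steps, here is a minimal Python sketch of the same idea: split each value at the first semicolon and keep the left part. The `rows` list is copied from the question and `first_item` is a hypothetical helper name. Since the question is tagged R, the equivalent one-liner there would be `sub(";.*", "", x)`, assuming `x` is the character column.

```python
# Sample values copied from the question; in practice they would be
# read from the data column.
rows = [
    "mmSM7.3.54;IGHV14-3*01;musIGHV236",
    "mm7183.20.37;IGHV5-17*01;musIGHV219",
    "mmIGHV5-9-1*02;musIGHV207;7183.14.25",
    "mm7183.20.37;IGHV5-17*01;musIGHV219",
    "mmIGHV7-1*03;S107.1.42",
    "mmIGHV9-2*01;VH9.13;musIGHV242;VGAM3.8-2-59",
    "mmmusIGHV231;SM7.2.49;IGHV14-2*01",
]


def first_item(value, sep=";"):
    """Return the text before the first separator.

    If the value contains no separator, the whole value is returned.
    """
    return value.split(sep, 1)[0]


first_items = [first_item(row) for row in rows]

for item in first_items:
    print(item)
```

Splitting with a limit of one split avoids scanning the rest of the string, which matters little here but keeps the intent explicit.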

Related

Remove last row in CSV file

An automatically generated CSV file has an extra line at the bottom, which is creating a problem in a Power Automate flow for me. I don't know how typical this is, but when opening the CSV in Notepad, each item is separated by 6 empty rows. The last item has 7 empty rows, which I believe is the problem. Can I delete that last row of the CSV file using Power Automate? I don't think I have access to PowerShell, nor do I know how to use it.

Talend: how to read the first 2 lines of a .txt/.csv file and get the date from the line beginning with "Generated:"

I read data from a .csv file the usual way with a tFileInputDelimited component and output it to my local PostgreSQL database.
But my problem is that I need to get the date from the first 2 lines of the file.
The first 2 lines are not column separated; they are just a 2-line header.
I need to know which components to use and how to set them up to:
read the first 2 lines
get the line which starts with "Generated:"
get the date which comes just after the ":"
Example header (the first 2 lines):
Report Title:this_is_the_title
Generated: Nov-27-2020, 14:03:01 CET
Is it possible to do that with Talend, and which components would be best?
I do not know all the components yet. I tried to use tFileInputDelimited, but it does not seem to work for this.
==== EDIT ====
I am trying to do it with tFileInputRegex, this could work...
Use this schema for the input file:
In the tFileInputDelimited, specify "#" as the field separator (so that each entire line is read as 1 record) and set the limit to 2 to read only the first 2 lines:
In the tFilterRow, click Advanced Mode and add this code to keep only the "Generated" line:
In the tJavaRow, add this code to extract the date:
output_row.line = input_row.line.substring("Generated:".length() + 1);
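Outside Talend, the same three steps (limit to 2 lines, keep the "Generated:" line, cut off the prefix) can be sketched in Python. This is only an illustration of the logic, not the Talend job itself; `extract_generated_date` is a hypothetical helper, and the sample text is the header from the question.

```python
import io


def extract_generated_date(lines, prefix="Generated:", header_rows=2):
    """Return the text after `prefix` found in the first `header_rows` lines.

    Returns None when no header line starts with the prefix.
    """
    # zip with range stops after header_rows lines, like the row limit
    # set on the input component.
    for _, line in zip(range(header_rows), lines):
        line = line.strip()
        if line.startswith(prefix):
            # Drop the prefix and the space that follows it.
            return line[len(prefix):].strip()
    return None


sample = io.StringIO(
    "Report Title:this_is_the_title\n"
    "Generated: Nov-27-2020, 14:03:01 CET\n"
)

generated = extract_generated_date(sample)
print(generated)
```

The result is still a string; converting it to a date type would need a format pattern that matches the month abbreviation and the time zone suffix.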

Rename the fields based on the first record's content in each one

I have the schema below,
phpMyAdmin - Table structure
which was created by importing a CSV file.
How can I rename the fields based on the content of the first record in each one?
For example:
rename: COL 1 to: User ID
rename: COL 2 to: Main - Full Name
rename: COL 3 to: Main - First name
and so on ...
Before importing the CSV data, remove the first line containing the column names from the file. Then, while importing the CSV file in phpMyAdmin, the advanced options let you enter a comma-separated list of column names (that is the name of the input field) matching the CSV columns.
PHP approach: renameField.php
Maybe it's not the most elegant solution, but it does its job.
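If the table has already been imported, another option is to generate the rename statements from the first record and run them in the SQL tab. The Python sketch below only builds the SQL text; it does not connect to a database. The table name `imported_csv` and both helper functions are hypothetical, and `RENAME COLUMN` assumes MySQL 8.0 or later (older servers need `ALTER TABLE ... CHANGE` with the full column definition).

```python
def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_rename_statements(table, current_names, first_record):
    """Build one ALTER TABLE statement per column that needs a new name.

    current_names: existing column names, e.g. "COL 1", "COL 2", ...
    first_record:  values of the first row, which hold the real names.
    """
    statements = []
    for old, new in zip(current_names, first_record):
        new = new.strip()
        if new and new != old:
            statements.append(
                f"ALTER TABLE {quote_identifier(table)} "
                f"RENAME COLUMN {quote_identifier(old)} "
                f"TO {quote_identifier(new)};"
            )
    return statements


statements = build_rename_statements(
    "imported_csv",  # hypothetical table name
    ["COL 1", "COL 2", "COL 3"],
    ["User ID", "Main - Full Name", "Main - First name"],
)

for statement in statements:
    print(statement)
```

After the renames, the first record itself would still need to be deleted, since it holds the header text rather than data.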

Read CSV file and create new CSV file in VBScript?

I have a CSV file with a list of invoices to be paid.
Example:
Account Number;Invoice Number;Amount
11111;ID11111;100.50
11111;ID22222;250.50
22222;ID33333;100.00
11111;ID44444;300.00
Now I want to read this file and create a file like this:
Account Number;Invoice Number;Amount
11111;ID11111, ID22222, ID44444;651.00
22222;ID33333;100.00
The second field has been merged and the third field summed.
But the second field must have a maximum of 50 characters; any further invoice numbers must go on the next line.
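The question asks for VBScript, but the grouping logic can be sketched in Python to make the rules concrete. This is a sketch under assumptions the question does not settle: accounts keep their order of first appearance, and when the 50-character limit forces a new line, each output line carries the sum of only the invoices listed on it. `merge_invoices` is a hypothetical helper name.

```python
import csv
import io
from decimal import Decimal

MAX_FIELD_LEN = 50  # limit on the merged invoice field, from the question


def merge_invoices(rows, max_len=MAX_FIELD_LEN):
    """Group rows by account, joining invoice numbers and summing amounts.

    A new output row is started for the same account whenever adding an
    invoice would push the joined field past max_len characters.
    """
    groups = {}  # account -> list of [invoices, subtotal]; keeps insertion order
    for account, invoice, amount in rows:
        chunks = groups.setdefault(account, [])
        value = Decimal(amount)  # Decimal avoids float rounding on money
        if chunks and len(", ".join(chunks[-1][0] + [invoice])) <= max_len:
            chunks[-1][0].append(invoice)
            chunks[-1][1] += value
        else:
            chunks.append([[invoice], value])
    return [
        [account, ", ".join(invoices), str(subtotal)]
        for account, chunks in groups.items()
        for invoices, subtotal in chunks
    ]


source = """Account Number;Invoice Number;Amount
11111;ID11111;100.50
11111;ID22222;250.50
22222;ID33333;100.00
11111;ID44444;300.00
"""

reader = csv.reader(io.StringIO(source), delimiter=";")
header = next(reader)
merged = merge_invoices(reader)

out = io.StringIO()
writer = csv.writer(out, delimiter=";", lineterminator="\n")
writer.writerow(header)
writer.writerows(merged)
print(out.getvalue())
```

If the full account total should appear on every wrapped line instead, only the subtotal bookkeeping would need to change.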

SSIS: get the 3rd row and all the data from the 9th row

In my CSV file, the data looks like this:
************* file format***************************
filename, abc
date,20141112
count,456765
id,1234
,,
,,
,,
name,address,occupation,id,customertype
sam,hjhjhjh,dr,1,s
michael,dr,2,m
tina,dr,4,s
*********************more than 30000 records in each load *************************************
I receive the file in the above format, and I want to take the date and count from the 2nd and 3rd rows; the data then starts from the 9th row. Is this possible without a Script Task? I am not very good with scripting.
Can anyone please help me with how to do this?
It is possible to do this without a Script Task. The flow is as follows.
Add 2 Data Flow Tasks (DFTs) to your package. The first reformats your text file and splits it into 2 separate text files: one for the 2nd and 3rd rows, and another for everything from the 9th row onwards. The second DFT does the rest of your processing, which is quite simple.
1st DFT --> Flat File Source --> Row Number Transformation (you can get this additional transformation for your SQL Server version from <http://microsoft-ssis.blogspot.in/p/ssis-addons.html>) --> Conditional Split (output 1: RowNumber == 2 || RowNumber == 3, output 2: RowNumber > 8) --> write the results to 2 different flat files, named for example _1 and _2.
Now you have the 2 flat files you need as sources for your 2nd DFT.
*If it solves your problem, mark it as answer.
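The row-number split described in the answer can be illustrated with a short Python sketch. It mirrors the Conditional Split conditions (rows 2 and 3 for the metadata, rows after 8 for the data) but is not an SSIS component; `split_report` is a hypothetical helper, and the sample assumes the asterisk banner lines are not part of the actual file.

```python
import csv
import io


def split_report(lines, meta_rows=(2, 3), data_start=9):
    """Split a report into header metadata and data rows.

    Row numbers are 1-based, matching the RowNumber conditions
    (RowNumber == 2 || RowNumber == 3, and RowNumber > 8).
    """
    metadata, data = {}, []
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if row_number in meta_rows:
            # Metadata rows are "key,value" pairs such as "date,20141112".
            metadata[row[0].strip()] = row[1].strip()
        elif row_number >= data_start:
            data.append(row)
    return metadata, data


sample = io.StringIO(
    "filename, abc\n"
    "date,20141112\n"
    "count,456765\n"
    "id,1234\n"
    ",,\n"
    ",,\n"
    ",,\n"
    "name,address,occupation,id,customertype\n"
    "sam,hjhjhjh,dr,1,s\n"
    "michael,dr,2,m\n"
    "tina,dr,4,s\n"
)

metadata, data = split_report(sample)
print(metadata)
print(len(data), "data rows")
```

Because the rows are streamed one at a time, the same approach works for loads of 30000 records or more without holding the whole file in memory twice.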