I am processing a CSV file in Power Automate. Here is one record:
server2,usa,"rebooted,by citrix",25,good
Because "rebooted,by citrix" is data from a single field, splitting the record on commas produces one element too many and the array indexes no longer line up with the columns.
I want to replace the comma inside the double quotes with a hyphen. The expected output is: server2,usa,"rebooted-by citrix",25,good
PrasadAthalye has a nice solution approach for this.
You could first split on the " character instead. Per item you could replace the , character and append the results to a new array. After that you should be able to apply your normal split.
https://powerusers.microsoft.com/t5/Building-Flows/Setting-up-specific-expression-to-remove-comma-inside-strings/m-p/646040/highlight/true#M86288
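The approach above can be sketched outside Power Automate. This is a minimal Python illustration of the same idea, not a Power Automate expression; the function name `replace_commas_in_quotes` is hypothetical, and it assumes the quotes in each record are balanced.

```python
def replace_commas_in_quotes(line: str, replacement: str = "-") -> str:
    """Replace commas that sit inside double-quoted sections of a line."""
    # After splitting on the quote character, the segments at odd indexes
    # are the ones that were inside quotes (assumes balanced quotes).
    parts = line.split('"')
    for i in range(1, len(parts), 2):
        parts[i] = parts[i].replace(",", replacement)
    # Put the quotes back so the record keeps its original shape.
    return '"'.join(parts)


record = 'server2,usa,"rebooted,by citrix",25,good'
fixed = replace_commas_in_quotes(record)
print(fixed)             # the record with the quoted comma replaced
print(fixed.split(","))  # a plain comma split now yields five fields
```

Once the quoted commas are gone, the normal split on comma gives one element per column.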
I have a set of CSV files in a folder that I receive every month for ETL into a database. My data also contains ; in many columns. For example, the location column holds values like New York; USA, which I want to appear in a single column instead of being split across several columns. How do I specify the delimiter then?
I think you cannot have the field separator inside the field content unless you enclose those values in double quotes. For example:
blabla;"New York; USA";blabla
Another solution: change the field delimiter to a more specific (and unused) character.
I'm afraid there is no better solution.
Regards,
TRF
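To illustrate why quoting solves the problem, here is a small Python sketch (not Talend) that parses the quoted example above with the standard `csv` module. It assumes the semicolon is the delimiter and the double quote is the text enclosure.

```python
import csv
import io

# The example record: the middle field contains the delimiter but is quoted.
data = 'blabla;"New York; USA";blabla\n'

# A quote-aware parser keeps the quoted semicolon inside one field.
rows = list(csv.reader(io.StringIO(data), delimiter=";", quotechar='"'))
print(rows)
```

The parser returns three fields, with New York; USA kept intact as the second one.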
As TRF mentioned, you can't have the delimiter as part of the non-delimiting text in your file.
My workaround for that would be the following:
1) Read the file with a tFileInputFullRow (https://help.talend.com/display/TalendComponentsReferenceGuide54EN/tFileInputFullRow)
2) Use a tReplace to replace the ; with some other character, say -, in the problem cells (in your case, replace "New York;USA" with "New York-USA"). You can also use the regex option in the tReplace component to make it a generic rule.
3) Save that output into another file
4) Now read the new file using ; as the delimiter
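The steps above can be sketched in Python as a rough stand-in for the tFileInputFullRow and tReplace pair. The regex and the helper name `neutralize` are assumptions for illustration: the pattern only catches problem cells that are wrapped in double quotes, so unquoted cells would need a rule specific to your data.

```python
import re

# Matches one double-quoted cell (assumes no escaped quotes inside it).
QUOTED = re.compile(r'"[^"]*"')


def neutralize(line: str, delimiter: str = ";", replacement: str = "-") -> str:
    """Replace the delimiter with another character inside quoted cells only."""
    return QUOTED.sub(
        lambda m: m.group(0).replace(delimiter, replacement), line
    )


raw = 'blabla;"New York;USA";blabla'
cleaned = neutralize(raw)      # step 2: rewrite the problem cells
fields = cleaned.split(";")    # step 4: split on ; as usual
print(cleaned)
print(fields)
```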
References:
1) tReplace: https://help.talend.com/display/TalendOpenStudioComponentsReferenceGuide521EN/18.16+tReplace
2) Regex: https://docs.oracle.com/javase/tutorial/essential/regex/
I am loading a flat file into a SQL database. The flat file is comma delimited. Some of the column values contain a comma without being enclosed in double quotes (e.g. HPPV,TYRE). When I try to use the comma as the text qualifier, I get a message saying that the column delimiter and the text qualifier cannot be the same.
I want to somehow use the comma as the text qualifier so that the flat file keeps the value HPPV,TYRE as one single entity (HPPVTYRE or HPPV TYRE) instead of spilling it over into the next column.
Is there any way to use the comma as the text qualifier when it is already the column delimiter?
No, I don't think so but I searched and found this article that might help.
I'm trying to import a csv into Hive. I have a column which is a dollar value and is reported within the CSV as '$123,244.00.' I would like to convert this value into a float in Hive.
So I've loaded the csv into a temporary table, treating that column as a string. Next I want to load it into the final table, and in the process convert that string into a float or decimal.
Any suggestions on the best way to go about doing this?
This should work:
select float(regexp_replace(substr('$123,244.00', 2, length('$123,244.00')), ',', '')) from table;
You need to remove any commas as well as the dollar sign. You may find this link helpful as well: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-NumericTypes
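The same cleanup logic can be shown outside Hive. This is a minimal Python sketch, not HiveQL; the helper name `parse_dollars` is hypothetical, and it assumes values always use $ as the currency sign, commas for digit grouping and a period as the decimal point.

```python
from decimal import Decimal


def parse_dollars(text: str) -> Decimal:
    """Strip the dollar sign and grouping commas, then convert to a number."""
    return Decimal(text.replace("$", "").replace(",", ""))


amount = parse_dollars("$123,244.00")
print(amount)
```

Decimal is used here instead of a float so that currency amounts keep their exact value.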
How is a CSV file built in general? With commas or semicolons?
Any advice on which one to use?
In Windows it is dependent on the "Regional and Language Options" customize screen where you find a List separator. This is the char Windows applications expect to be the CSV separator.
Of course this only has effect in Windows applications, for example Excel will not automatically split data into columns if the file is not using the above mentioned separator. All applications that use Windows regional settings will have this behavior.
If you are writing a program for Windows whose CSV output will be imported into other applications, and you know that the list separator on your target machines is a comma, then go for it. Otherwise I prefer the semicolon, since it causes fewer problems with decimal points and digit grouping and rarely appears in text.
CSV is a standard format, outlined in RFC 4180 (in 2005), so there IS no lack of a standard. https://www.ietf.org/rfc/rfc4180.txt
And even before that, the C in CSV has always stood for Comma, not for semiColon :(
It's a pity Microsoft keeps ignoring that and is still sticking to the monstrosity they turned it into decades ago (yes, I admit, that was before the RFC was created).
One record per line, unless a newline occurs within quoted text (see below).
COMMA as column separator. Never a semicolon.
PERIOD as decimal point in numbers. Never a comma.
Text containing commas, periods and/or newlines enclosed in "double quotation marks".
Quotation marks inside the text are escaped by doubling them, but only if the text is enclosed in double quotation marks. These examples represent the same three fields:
1,"this text contains ""quotation marks""",3
1,this text contains "quotation marks",3
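As a quick check of the quoting rules, the two example lines can be run through Python's standard `csv` reader. This is only a sketch of one parser's behavior: Python's reader is lenient about quotation marks in an unquoted field, whereas a strict reading of RFC 4180 does not allow double quotes inside unquoted fields, so other parsers may reject the second line.

```python
import csv

lines = [
    '1,"this text contains ""quotation marks""",3',  # quoted, quotes doubled
    '1,this text contains "quotation marks",3',      # unquoted field
]

# csv.reader accepts any iterable of strings, one record per item.
parsed = list(csv.reader(lines))
for row in parsed:
    print(row)
```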
The standard does not cover date and time values. Personally, I try to stick to the ISO 8601 format to avoid day/month/year versus month/day/year confusion.
I'd say stick to comma as it's widely recognized and understood. Be sure to quote your values and escape your quotes though.
ID,NAME,AGE
"23434","Norris, Chuck","24"
"34343","Bond, James ""master""","57"
Also relevant, though specific to Excel: look at this answer and this other one, which suggest inserting a line at the beginning of the CSV with
"sep=,"
to tell Excel which separator to expect.
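Both pieces of advice (quote every value, and optionally prepend the Excel separator hint) can be sketched with Python's `csv` module. The function name `to_excel_friendly_csv` is hypothetical, and the hint is written here as a bare sep=, line, which is an assumption about the exact form Excel expects. Parsers other than Excel will most likely read that first line as ordinary data.

```python
import csv
import io


def to_excel_friendly_csv(rows, delimiter=","):
    """Write rows with every value quoted, preceded by a sep= hint line."""
    buf = io.StringIO()
    # Excel-specific hint naming the separator (assumed form: sep=<char>).
    buf.write(f"sep={delimiter}\n")
    writer = csv.writer(
        buf, delimiter=delimiter, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    # QUOTE_ALL wraps every field in quotes and doubles embedded quotes.
    writer.writerows(rows)
    return buf.getvalue()


text = to_excel_friendly_csv([
    ["ID", "NAME", "AGE"],
    ["23434", "Norris, Chuck", "24"],
    ["34343", 'Bond, James "master"', "57"],
])
print(text)
```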
1. Change the file format to .CSV (semicolon delimited)
To achieve the desired result, temporarily change the separator settings in the Excel Options:
Go to File -> Options -> Advanced -> Editing Section.
Uncheck the “Use system separators” setting and put a comma in the “Decimal Separator” field.
Now save the file in the .CSV format and it will be saved in the semicolon-delimited format.
Initially it was meant to be a comma. However, since the comma is often used as the decimal separator, it is not such a good field separator, hence alternatives like the semicolon, mostly depending on the country.
http://en.wikipedia.org/wiki/Comma-separated_values#Lack_of_a_standard
CSV stands for comma-separated values. Generally the delimiter is a comma, but I have seen many other characters used as delimiters; they are just not as frequently used.
As for advising you on what to use, we need to know your application. Is the file specific to your application/program, or does this need to work with other programs?
To change comma to semicolon as the default Excel separator for CSV - go to Region -> Additional Settings -> Numbers tab -> List separator
and type ; instead of the default ,
Just to put in a word for the semicolon: in a lot of countries the comma, not the period, is used as the decimal separator. That is mostly continental Europe and its former colonies, which make up about half of the world; the other half follows the UK convention (how did the UK get so big? O_O). So using the comma as the delimiter for data that includes numbers creates a lot of headaches, because Excel refuses to recognize it as the delimiter.
Likewise, my country, Viet Nam, follows the French convention, while our partner in Hong Kong uses the UK one, so the comma makes CSV unusable between us. We use \t or ; instead for international exchange, although that is still not "standard" according to the CSV specification.
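A minimal Python sketch of that convention: write the file with ; or a tab as the delimiter so that numbers can keep a decimal comma. The function name `write_rows` and the sample data are made up for illustration, and formatting to two decimals is an assumption.

```python
import csv
import io


def write_rows(rows, delimiter=";"):
    """Write (name, number) rows using a non-comma delimiter and decimal commas."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter, lineterminator="\n")
    for name, value in rows:
        # Format with a period first, then switch to a decimal comma.
        writer.writerow([name, f"{value:.2f}".replace(".", ",")])
    return buf.getvalue()


semicolon_text = write_rows([("widget", 1234.5)])
tab_text = write_rows([("widget", 1234.5)], delimiter="\t")
print(semicolon_text)
print(tab_text)
```

Because the comma is no longer the delimiter, the decimal comma needs no quoting.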
The best way is to write it to a text file with a .csv extension:
Sub ExportToCSV()
    ' Declare each variable with its own type ("Dim i, j As Integer" leaves i as Variant)
    Dim i As Long
    Dim fileName As String
    Dim pathfile As String
    Dim fs As Object
    Dim stream As Object
    Dim ws As Worksheet

    Set fs = CreateObject("Scripting.FileSystemObject")
    Set ws = ThisWorkbook.Worksheets(1)

    ' Timestamped file name
    fileName = Format(Now(), "ddmmyyHHmmss")
    pathfile = "D:\1\" & fileName & ".csv"

    ' Overwrite = False raises error 58 if the file already exists; Unicode = True
    On Error GoTo fileexists
    Set stream = fs.CreateTextFile(pathfile, False, True)
    On Error GoTo 0

    ' Start at row 15 and stop at the first empty cell in column A
    i = 15
    Do Until IsEmpty(ws.Cells(i, 1).Value)
        ' Column A, then column F with the decimal point replaced by a comma
        stream.WriteLine ws.Cells(i, 1).Value & ";" & Replace(ws.Cells(i, 6).Value, ".", ",")
        i = i + 1
    Loop
    stream.Close
    Exit Sub

fileexists:
    If Err.Number = 58 Then
        MsgBox "File already exists"
        'Your code here
    Else
        MsgBox Err.Description
    End If
End Sub