Read Dataset CSV with Line Feeds in Cells - csv

We are using the following code to read a CSV file from the Application Server:
OPEN DATASET file_str FOR INPUT IN TEXT MODE ENCODING DEFAULT.
*--------------------------------------------------*
* process and display output
*--------------------------------------------------*
DO.
  CLEAR: lv_record, idat.
  READ DATASET file_str INTO lv_record.
  IF sy-subrc NE 0.
    EXIT.
  ELSE.
The problem we encounter now is that the CSV file holds line feeds inside the cells:
If we read it with the above code, READ DATASET splits the record in the middle of the cell instead of at the end.
What is the best way of handling this? We tried to read the file with the line feeds and do a replace-all, but we can't seem to see the line feeds in the result of READ DATASET.
Thanks for your help!

This is a standard string-handling issue - nothing specific to ABAP; you would encounter the same problem with BufferedReader.readLine(). Just check whether the line is complete (either it contains the correct number of fields, or it contains an even number of unescaped text delimiters, i.e. "), and if it doesn't, read the next line, append it with CL_ABAP_CHAR_UTILITIES=>CR_LF, and repeat.
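A minimal sketch of that idea using the quote-count check - this is only an illustration, not the poster's final code; it assumes the multi-line cells are enclosed in " and the variable names lv_line, lv_logical, lv_quotes and lv_odd are made up:
DATA: lv_line    TYPE string,
      lv_logical TYPE string,
      lv_quotes  TYPE i,
      lv_odd     TYPE i.

OPEN DATASET file_str FOR INPUT IN TEXT MODE ENCODING DEFAULT.
DO.
  READ DATASET file_str INTO lv_line.
  IF sy-subrc <> 0.
    EXIT.
  ENDIF.
  IF lv_logical IS INITIAL.
    lv_logical = lv_line.
  ELSE.
    " re-insert the line break that READ DATASET consumed
    CONCATENATE lv_logical lv_line INTO lv_logical
      SEPARATED BY cl_abap_char_utilities=>cr_lf.
  ENDIF.
  " an odd number of quotes means a quoted cell is still open
  CLEAR lv_quotes.
  FIND ALL OCCURRENCES OF '"' IN lv_logical MATCH COUNT lv_quotes.
  lv_odd = lv_quotes MOD 2.
  IF lv_odd = 0.
    " the logical record is complete - process it here, then start the next one
    CLEAR lv_logical.
  ENDIF.
ENDDO.
CLOSE DATASET file_str.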

This is the solution:
OPEN DATASET file_str FOR INPUT IN TEXT MODE ENCODING DEFAULT.
*--------------------------------------------------*
* process and display output
*--------------------------------------------------*
DATA: len TYPE i.
DATA: test TYPE string.
DATA: lv_new TYPE i,
      lv_last_char TYPE c.
DATA: lv_concat TYPE string.

DO.
  CLEAR: lv_record, idat, lv_concat.
  READ DATASET file_str INTO lv_record.
  IF sy-subrc NE 0.
    EXIT.
  ELSE.
    "-- Get the string length
    CALL FUNCTION 'STRING_LENGTH'
      EXPORTING
        string = lv_record
      IMPORTING
        length = lv_new.
    "-- Check if the string ends correctly (last character is the closing quote)
    lv_new = lv_new - 1.
    lv_last_char = lv_record+lv_new(1).
    IF lv_last_char EQ '"'.
      CONTINUE.
    ELSE.
      "-- Record is incomplete: keep appending lines until one ends with a quote
      CONCATENATE lv_concat lv_record INTO lv_concat.
      CLEAR lv_record.
      WHILE lv_last_char NE '"'.
        READ DATASET file_str INTO lv_record.
        CALL FUNCTION 'STRING_LENGTH'
          EXPORTING
            string = lv_record
          IMPORTING
            length = lv_new.
        lv_new = lv_new - 1.
        lv_last_char = lv_record+lv_new(1).
        CONCATENATE lv_concat lv_record INTO lv_concat.
      ENDWHILE.
    ENDIF.
    IF lv_concat IS NOT INITIAL.
      CLEAR lv_record.
      MOVE lv_concat TO lv_record.
    ENDIF.
    "-- ... process lv_record here ...
  ENDIF.
ENDDO.
CLOSE DATASET file_str.

Related

Unable to Extract simple Csv file using U-SQL

I have this CSV file.
Almost all the records are getting processed fine; however, there are two cases in which I am experiencing an issue.
Case 1:
A record containing quotes within quotes:
"some data "some data" some data"
Case 2:
A record containing a comma within quotes:
"some data, some data some data"
I have looked into this issue and found a way around it via the quoting parameter of the extractor, but I have observed that setting (quoting: false) solves case 1 and fails for case 2, while setting (quoting: true) solves case 2 but fails for case 1.
Constraints: there is no room for changing the data file; the future data will be tailored accordingly, but for this existing data I have to resolve it.
Try this: import the records as single rows, then fix the row text by doubling the embedded quotes (do the same for the commas):
DECLARE @input string = @"/Samples/Data/Sample1.csv";
DECLARE @output string = @"/Output/Sample1.txt";

// Import records as one row
@data =
    EXTRACT rowastext string
    FROM @input
    USING Extractors.Text('\n', quoting: false);

// Fix the row text using double quotes
@query =
    SELECT Regex.Replace(rowastext, "([^,])\"([^,])", "$1\"\"$2") AS rowascsv
    FROM @data;

OUTPUT @query
TO @output
USING Outputters.Csv(quoting: false);

Capture any standard report to JSON or XML?

I know that I can use LIST_TO_ASCI to convert a report to ASCII, but I would like to have a higher-level data format like JSON, XML, or CSV.
Is there a way to get something that is easier to handle than ASCII?
Here is the report I'd like to convert:
The conversion needs to be executed in ABAP on a result that was produced like this:
SUBMIT <REPORT_NAME> ... EXPORTING LIST TO MEMORY AND RETURN.
You can get access to the SUBMIT list in memory like this:
call function 'LIST_FROM_MEMORY'
  TABLES
    listobject = t_list
  EXCEPTIONS
    not_found  = 1
    others     = 2.
if sy-subrc <> 0.
  message 'Unable to get list from memory' type 'E'.
endif.

call function 'WRITE_LIST'
  TABLES
    listobject = t_list
  EXCEPTIONS
    empty_list = 1
    others     = 2.
if sy-subrc <> 0.
  message 'Unable to write list' type 'E'.
endif.
And the final step of the solution (conversion of the result table to JSON) was already answered in your previous question.
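For completeness, a hedged sketch of that last step - it assumes the extracted data has ended up in an internal table named lt_data (that name, and the use of the identity transformation with an sXML JSON writer, are assumptions on my part, not part of the original answer):
DATA lv_xml         TYPE string.
DATA lv_json        TYPE string.
DATA lo_json_writer TYPE REF TO cl_sxml_string_writer.

" XML via the identity transformation
CALL TRANSFORMATION id SOURCE data = lt_data RESULT XML lv_xml.

" JSON via the identity transformation and an sXML JSON writer (NetWeaver 7.40+)
lo_json_writer = cl_sxml_string_writer=>create( type = if_sxml=>co_xt_json ).
CALL TRANSFORMATION id SOURCE data = lt_data RESULT XML lo_json_writer.
lv_json = cl_abap_codepage=>convert_from( lo_json_writer->get_output( ) ).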
I found a solution here: http://zevolving.com/2015/07/salv-table-22-get-data-directly-after-submit/
This is the code:
DATA: lt_outtab TYPE STANDARD TABLE OF alv_t_t2.
FIELD-SYMBOLS: <lt_outtab> LIKE lt_outtab.
DATA lo_data TYPE REF TO data.

" Let the model know
cl_salv_bs_runtime_info=>set(
  EXPORTING
    display  = abap_false
    metadata = abap_false
    data     = abap_true ).

SUBMIT salv_demo_table_simple AND RETURN.

TRY.
    " get data from the SALV model
    cl_salv_bs_runtime_info=>get_data_ref(
      IMPORTING
        r_data = lo_data ).
    ASSIGN lo_data->* TO <lt_outtab>.
    BREAK-POINT.
  CATCH cx_salv_bs_sc_runtime_info.
ENDTRY.
Big thanks to Sandra Rossi, who gave me the hint about cx_salv_bs_sc_runtime_info.
Related answer: https://stackoverflow.com/a/52834118/633961

SSIS write DT_NTEXT into an UTF-8 csv file

I need to write the result of an SQL query into a CSV file in UTF-8 (I need this encoding as there are French letters). One of the columns is too large (more than 20,000 characters), so I can't use DT_WSTR for it. The input type is DT_TEXT, so I use a Data Conversion to change it to DT_NTEXT. But then, when I want to write it to the file, I get this error message:
Error 2 Validation error. The data type for "input column" is
DT_NTEXT, which is not supported with ANSI files. Use DT_TEXT instead
and convert the data to DT_NTEXT using the data conversion component
Is there a way I can write the data to my file?
Thank you
I have had this kind of issue too. When working with data larger than 255 characters, SSIS treats it as blob data and will always handle it as such.
I then converted this blob stream data to readable text with a Script Component. After that, other transformations should be possible.
This was the case in the SSIS that came with SQL Server 2008, and I believe it hasn't changed yet.
I ended up doing just what Samyne says: I used a script.
First I modified my SQL stored procedure; instead of having several columns, I put all the info in one single column, like this:
Select Column1 + '^' + Column2 + '^' + Column3 ...
Then I used this code in a script:
string fileName = Dts.Variables["SLTemplateFilePath"].Value.ToString();

using (var stream = new FileStream(fileName, FileMode.Truncate))
{
    using (var sw = new StreamWriter(stream, Encoding.UTF8))
    {
        OleDbDataAdapter oleDA = new OleDbDataAdapter();
        DataTable dt = new DataTable();
        oleDA.Fill(dt, Dts.Variables["FileData"].Value);

        foreach (DataRow row in dt.Rows)
        {
            foreach (DataColumn column in dt.Columns)
            {
                sw.WriteLine(row[column]);
            }
        }
        sw.WriteLine();
    }
}
Putting all the info in one column is optional; I just wanted to avoid handling it in the script. This way, if my stored procedure changes, I don't need to modify the SSIS package.

To save list in CSV file python?

I want to transpose rows into columns and then save the words in a CSV file. The problem is that only the last value of each column after the transpose is saved in the file, and if I append the string to a list, it is saved to the file as characters rather than words.
Can anyone help me sort this out? Thanks in advance.
import re
import csv

app = []
with open('afterstem.csv') as f:
    words = [x.split() for x in f]
    for x in zip(*words):
        for y in x:
            res = y
            newstr = re.sub('"', r'', res)
            app = app + list(res)
            #print("AFTER", newstr)

with open(r"removequotes.csv", "w") as output:
    writer = csv.writer(output, lineterminator='\n', delimiter='\t')
    for val in app:
        writer.writerow(val)
output.close()
The output saved in the file looks like this:
But I want "Bank" in one cell.
Simply use
for column in zip(*words):
    newrows = [[word.replace('"', '')] for word in column]
    app.extend(newrows)
to put all columns one after another into the first column.
newrows = [[word.replace('"', '')] for word in column] creates a new list for each column, with the double quotes stripped and each word wrapped into its own list, and app.extend(newrows) appends all of these lists to your result variable app.
You got your result because of your inner loop, and in particular its last line:
for y in x:
    ...
    app = app + list(res)
The for-loop takes each word in each column, and list(res) converts the string holding the word into a list of characters. So "Bank" becomes ['B', 'a', 'n', 'k'], etc. Then app = app + list(res) creates a new list that contains every item from app plus the characters from the word, and assigns that to app.
In the end you got an array containing every letter from the file instead of an array with all the words of the file in the right order. The call to writer.writerow(val) then wrote each letter as its own row.
BTW: If your input also uses tabs to delimit columns, it might be easier to use list(csv.reader(f, lineterminator='\n', delimiter='\t')) instead of your simple read with split() and stripping of quotes.

How to import comma delimited text file into datawindow (powerbuilder 11.5)

Hi, good day. I'm very new to PowerBuilder and I'm using PB 11.5.
Does someone know how to import a comma-delimited text file into a DataWindow?
Example Text file
"1234","20141011","Juan, Delacruz","Usa","001992345456"...
"12345","20141011","Arc, Ino","Newyork","005765753256"...
How can I import the third column, which is the full name, and the last column, which is the account number? I want to transfer the name and account number into my external DataWindow. I've tried to use ImportString(), but all the rows are being transferred into one column only. I have three fields in my external DataWindow: the name and the account number.
Here's the code
ls_File = dw_2.Object.file_name[1]
li_FileHandle = FileOpen(ls_File)
li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
    li_Count ++
    li_FileRead = FileRead(li_FileHandle, ls_Text)
    ll_row = dw_1.ImportString(ls_Text, 1)
LOOP
Please help me with the code! Thank You
It seems that PB expects a tab-separated file by default (even though the 'c' in 'csv' stands for 'comma'...).
Add the csv! enumerated value to the arguments of ImportString() and it should fix the problem (it does in my test box).
Also, the columns defined in your dataobject must match the columns in the csv file (at least for the first columns you are interested in). If there are more columns in the csv file, they will be ignored. But if you want to get the 1st (or 2nd) and 3rd columns, you need to define the first 3 columns. You can always hide column #1 or #2 if you do not need it.
BTW, your code has some issues:
you should always test the return value of functions like FileOpen() so that you can stop processing in case of a non-existent / non-readable file
you are reading the first line of the text file twice: once before the while and again inside the loop. Or maybe that is intended, to skip a first line with column headers?
FWIW, here is working code based on yours:
string ls_file = "c:\dev\powerbuilder\experiment\data.csv"
string ls_text
int li_FileHandle, li_fileread, li_count
long ll_row

li_FileHandle = FileOpen(ls_File)
if li_FileHandle < 1 then
    return
end if

li_FileRead = FileRead(li_FileHandle, ls_Text)
DO WHILE li_FileRead > 0
    li_Count ++
    ll_row = dw_1.ImportString(csv!, ls_Text, 1)
    li_FileRead = FileRead(li_FileHandle, ls_Text) // read next line
LOOP
fileclose(li_fileHandle)
Use the datawindow_name.ImportFile(CSV!, file_path) method.