I am trying to export my dataset from SAS to Excel, in either CSV or XLS format. However, when I do this, the columns with line breaks mess up my Excel file. Is there a way to export a SAS dataset to Excel preserving line breaks? I also need to display labels instead of column names, and the dataset is fairly large, approx. 150,000 rows.
Here is what I did,
proc export data=Final_w_label
outfile='work/ExtractExcel.csv'
dbms=csv label replace;
run; quit;
Thank you in advance.
See the bottom of the post for sample data.
One effective way to create an export that Excel will open easily and display embedded newlines is to use XML.
/* Write the dataset out through the XMLV2 libname engine */
libname xmlout xmlv2 'c:\temp\want.xml';
data xmlout.want;
    set have;
run;
libname xmlout clear; /* release the libref and close the XML file */
In Excel (365) choose File > Open, select the want.xml file, and then select "As an XML table" in the secondary Open XML dialog that appears.
Other ways
There are other ways to move SAS data into a form that Excel can parse. PROC EXPORT will create a text file with embedded carriage returns in the character variables (which Excel uses for in-cell newlines):
proc export dbms=csv data=have label replace file='c:\temp\want.csv';
run;
The problem with this export is that Excel will not import the data properly using its wizards. There might be a VBScript solution for reading the export, but that is probably more trouble than it's worth.
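If you want to stay with a CSV, one workaround is to write the file yourself in a DATA step, translating each embedded carriage return ('0D'x) to a line feed ('0A'x), the character Excel itself inserts for Alt+Enter in-cell breaks, and quoting every field. Recent Excel versions generally honour quoted embedded line breaks when a .csv is opened directly (the legacy import wizard still will not). This is a hedged sketch, not part of the original answer; the path and the header row are illustrative:
data _null_;
    set have;
    file 'c:\temp\want_lf.csv' dlm=',' dsd termstr=crlf;
    if _n_ = 1 then put 'Identity,Poem name,Author'; /* label header row; extend with the stanza labels as needed */
    array chr _character_;
    do over chr;
        chr = translate(chr, '0A'x, '0D'x); /* CR -> LF so Excel shows in-cell breaks */
    end;
    put (_all_) (~); /* the ~ modifier with DSD quotes each value */
run;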
Another form of export is dbms=excel, which creates .xlsx files:
proc export dbms=excel data=have label replace file='c:\temp\want.xlsx';
run;
This export can be opened by Excel and the columns will all be correct. However, cells with embedded carriage returns will not initially appear to contain newlines. Examining a cell in F2 edit mode will show that the embedded newlines are there, and pressing Enter (to accept the edit) will cause the cell to display them. You don't want to have to F2 every cell to get it to render as expected.
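Yet another route, offered as a hedged sketch rather than part of the answer above: send the data through ODS EXCEL, replacing each embedded carriage return with the ODS inline-formatting sequence ^{newline}, which ODS destinations (ODS EXCEL included) generally render as an in-cell line break, so no F2 is needed. PROC PRINT's LABEL option also covers the requirement of showing labels instead of variable names. Note the replacement string is ten characters long, so the stanza variable lengths may need widening:
ods escapechar='^';
ods excel file='c:\temp\want_ods.xlsx';
data have_nl;
    set have;
    array chr _character_;
    do over chr;
        chr = tranwrd(chr, '0D'x, '^{newline}'); /* CR -> ODS line-break sequence */
    end;
run;
proc print data=have_nl noobs label;
run;
ods excel close;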
Sample Data
data have (label="Lines within stanza are separated by newline character");
    attrib
        id               length=8    label='Identity'
        name             length=$50  label='Poem name'
        auth             length=$50  label='Author'
        stanza1-stanza20 length=$250
    ;
    array stz(*) stanza:; /* all twenty stanza variables */
    id + 1;
    section = 1;
    infile cards eof=last;
    do while (1=1); /* read lines until the '--' marker, or jump to LAST: at end of file */
        linenum + 1;
        input;
        select;
            when (_infile_ = '--') leave;       /* end of poem */
            when (linenum = 1) name = _infile_; /* first line is the title */
            when (linenum = 2) auth = _infile_; /* second line is the author */
            when (_infile_ = '') section + 1;   /* blank line starts a new stanza */
            otherwise stz(section) = catx('0d'x, stz(section), _infile_); /* append the line, CR-separated */
        end;
    end;
    last:
    output;
datalines4;
Trees
Joyce Kilmer
I think that I shall never see
A poem lovely as a tree.
A tree whose hungry mouth is prest
Against the earth’s sweet flowing breast;
A tree that looks at God all day,
And lifts her leafy arms to pray;
A tree that may in Summer wear
A nest of robins in her hair;
Upon whose bosom snow has lain;
Who intimately lives with rain.
Poems are made by fools like me,
But only God can make a tree.
--
;;;;
run;
Related
I am currently working on an input file and I have a column which contains 3 different values in one cell. Although this data is not used in the transformation, I need to read it from the source and then ignore it when it is loaded into the staging table.
But the issue I face is that it gets loaded into separate rows rather than one cell.
This particular column is input as a string datatype. What change do I need to make to resolve this issue? Please let me know if more details are needed to answer the question.
I have uploaded a sample file to google drive https://drive.google.com/file/d/17hn8xmRd4CWsgKBzHgdwnR9W4jTJ9lTn/view?usp=sharing
The following is a screenshot of the csv data as opened in a text editor
Having downloaded sample.csv from your link, the first thing I did was open it in a text editor (Notepad++, TextPad, Visual Studio, etc) and just looked at what you have.
Row 1 is column headers
Encoded in UTF-8 with BOM (byte order mark)
Line Endings are CR/LF (Carriage Return & Line Feed)
Column delimiter appears to be a comma ,
Double Quote, ", is used as the text qualifier but only when needed
There are CR/LF characters in the actual data
I then define my flat file connection manager based on that data
Finally, I have a data flow with a Flat File Source to a Derived Column and drop a Data Viewer between them
As you can see, configuring your Flat File Connection Manager as I show will allow all the data to flow into your table as expected.
What is happening now is that the CRLF, which is our row delimiter, takes precedence over the embedded CRLF in the column data. By setting the double quote as the Text Qualifier, the data reader correctly "skips" the embedded CRLF until a CRLF is encountered outside of the quotes.
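To make the qualifier logic concrete, here is a constructed two-row illustration (the column names are made up):
id,comment
1,"line one
line two"
2,plain value
While the parser is inside the quotes of row 1, the CRLF between "line one" and "line two" is kept as column data; the CRLF after the closing quote, and the one after "plain value", are treated as row delimiters.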
I am writing because I have a problem with my Results Viewer in SAS 9.4.
When the results load they are the same as in the Output window: instead of a table with cells, I get a table drawn with -+| characters. Layout options do not work, and when exporting to Excel, for example, each line is contained in one cell (with | where it should change cells).
Is there something I can do?
Because like this I am losing a lot of the options SAS offers.
(I already checked Tools > Options > Preferences > Results and ticked Create HTML.)
Thank you in advance for your help!
Here's an example of how you'd put results in Excel.
I'm guessing you're copying from HTML to Excel?
ods excel file = '/home/fkhurshed/Demo/demo.xlsx';
proc means data=sashelp.class;
run;
proc freq data=sashelp.class;
tables sex * age / norow nocol nopercent; /* sashelp.class has SEX and AGE; it has no GENDER variable */
run;
ods excel close;
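If you need more control over the workbook, ODS EXCEL also accepts suboptions. A small hedged example (the sheet name and option values here are just illustrations):
ods excel file='/home/fkhurshed/Demo/demo.xlsx'
    options(sheet_name='Class summary' sheet_interval='proc'); /* one worksheet per procedure */
proc means data=sashelp.class;
run;
proc freq data=sashelp.class;
    tables sex / nopercent;
run;
ods excel close;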
Might be going about this completely the wrong way - happy to be shown the error of my ways.
In a nutshell, I've got 50-odd files of mixed types (csv and excel) that I want to import (each file to its own table) to an SQL database.
In the control flow I've got an SQL task that returns:
The source data filename
The source data filetype (csv / xlsx)
What I want to name the table to import to.
This object gets passed to a Foreach loop that loops through this object and puts these 3 fields into variables.
I want to then say "if the filetype variable is csv, go and do a flat file import. If it's .xlsx, go and do an excel import"
So inside my for each container I've got a dataflow task.
I want the first thing the dataflow task does to check the filetype variable, and then do the appropriate import.
I think it's got to be in the dataflow, because there isn't an "If" style control I can see in the control flow?
But I'm at a loss as to how I pass a variable into the conditional split.
Any thoughts welcome.
OR! - just had a thought. Is the best way to do this to get a list of all the csv file types, process them in a dataflow, then get a list of all the .xlsx ones and process them - so I'd have:
Get csv filenames & tablenames
for each to loop through these
dataflow to import data from csv
get xlsx filenames and tablenames
for each through these
dataflow to import data from xlsx.
Just doesn't seem as elegant?
Cheers
Here is what I use.
OS: Linux Mint 18
Editor: LibreOffice Writer 5.1.6.2
Situation
Consider the following foo.csv file (just an example; the raw data contains hundreds of lines):
A,B,C
1,2,3
To create a table in Writer with the data from foo.csv, usually one creates the table via the toolbar and then types the contents (possibly using TAB to navigate between cells).
Here is the result of the procedure above:
Goal: since the whole foo.csv contains hundreds of lines, how should one proceed?
1st try: copying and pasting the data from foo.csv into the table does not work, as seen below.
2nd try: copying and pasting the data from foo.csv into the table with all cells selected does not work either, as seen below.
Question: is it possible to edit an odt file in some way, writing some code (like we could do with tags in HTML), to produce such a table?
Embedding a Calc spreadsheet is not acceptable.
Just use the "Text to Table" feature:
Insert the csv as "plain text" into your writer document (not into a table, just anywhere else);
Select the inserted lines;
Select Menu "Table" -> "Convert" -> "Text to Table";
Adjust the conversion properties as needed (set the separator to comma: select "Other" and enter a comma in the box to the right):
Hit OK - LO Writer will convert the text content of your CSV into a nice Writer table.
Please note that if you use this solution, there's nothing like a "connection" between the Writer table and the csv data. Changing the csv won't affect the Writer table. That would be possible only by embedding an object (but this won't result in a Writer table...).
If the csv data is the only content of the odt (Writer) file, there's another option: use LibreOffice Base to create a LO database from the csv file (dynamically updated if the csv changes), and use the Report feature to get a tabular output of the csv data. LO Base will store the output layout as a report, making it easy to create an up-to-date report.
I'm pasting tab-delimited data from Notepad++ into Excel (about 50k rows and 3 columns). No matter how many different ways I try it, Excel wants to merge everything from a cell containing one " through the next instance of " into a single cell.
For Example, if my data looked like this:
"Apple 1.0 Store
Banana 1.3 Store
"Cherry" 2.5 Garden
Watermelon 4.0 Field
The excel file looks like this:
Apple1.0StoreBanana1.3Store
Cherry 2.5GardenWatermelon4.0Field
One way to get around this is to open the file as a CSV in Excel; however, this leads to Excel reformatting the number values into simplified ones using Excel's "General" format. So the data would look like the following:
"Apple 1 Store
Banana 1.3 Store
"Cherry" 2.5 Garden
Watermelon 4 Field
The data I'm getting comes from SQL Server Studio, so my options for file formats are:
.CSV
.Txt (Tab-delimited)
Copy Pasting from Query results
The solution I'm looking for is to have the data represented in Excel with no excel processing taking place on the quotations, numbers or any other cell contents.
Don't open the file directly in Excel. Instead, import it and control the data types and file layout.
Open a new Excel document.
Select the Data menu.
Select From Text in the Get External Data section.
Select the file to import.
On step 1 of the import wizard select Delimited.
Click Next.
Select the Tab checkbox and change the text qualifier to {none}.
Click Next.
Set the column data types to General, Text, Text.
Click Finish.
Excel auto-imports the data as best it can when you open a file directly. You lose flexibility and control when this happens. Better to import and control it yourself to get the fine adjustments you're looking for.
You end up with something like this:
By treating the numbers like text, the zeros don't get messed up.
By setting the text qualifier to none, the quotes don't get messed up.
Have you tried opening it via Text Import?
Go to the Data tab > From Text (third from left by default).
You will get a window similar to Text To Columns.
Select the correct delimiter, remember to remove the quote sign from Text Qualifier, and mark all columns as text to avoid Excel autoformatting.
(Screenshots of wizard steps 1-3 not shown.)
EXCEL TIP: SAVING TIME WHEN IMPORTING CSV FILES INTO EXCEL: If you pre-set your Text-To-Columns delimiter parameters correctly in Excel (e.g. specify tabs as the delimiter) and then copy and paste the CSV data, Excel will import the paste directly into the correct columns without you having to go through the Text-To-Columns rigmarole. This was particularly time-saving when I had to import hundreds of bank statements into Excel.
However, if your Text-To-Columns delimiters are pre-specified incorrectly as e.g. comma and you are importing tab-delimited files, then Excel will dump all the data into one column, and you will have to go through the time-consuming process of converting Text-To-Columns for each statement.
EXCEL LOOKS AT THE EXISTING Text-To-Columns DELIMITERS TO SEE IF IT CAN USE THOSE TO MAKE YOUR LIFE EASIER WHEN PASTING DATA.
Hope that tip helps (it saved me several hours).