Is there any way to read CSV column names automatically in a mainframe environment, given that PROC IMPORT is not supported on the mainframe? I tried the code below; it works in PC SAS but not in mainframe SAS.
FILENAME FILEIN "ABC.CUST.FILE" DISP=SHR RECFM=V;
DATA VARNAMES;
INFILE FILEIN DELIMITER=',' DSD OBS=1 LRECL=32000;
INPUT VARNAME $ @@;
RUN;
Thanks in advance.
You can use PROC IMPORT on PC and copy the generated datastep code to mainframe.
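If the goal is only to capture the header names, one alternative is to load the first record and split it yourself. The step below is a minimal sketch that reuses the FILEIN fileref from the question; the data set name VARNAMES and the variable POSITION are illustrative, and it has not been verified on z/OS, so treat it as a starting point rather than a confirmed fix.

```sas
/* Sketch only: FILEIN is the fileref defined in the question. */
data varnames;
  infile filein obs=1 lrecl=32000 truncover;
  length varname $32;
  input;                       /* load the header record into _INFILE_ */
  do position = 1 to countw(_infile_, ',', 'q');
    /* the 'q' modifier ignores commas inside quoted names;
       DEQUOTE removes surrounding quotes if present */
    varname = dequote(strip(scan(_infile_, position, ',', 'q')));
    output;
  end;
run;
```

Each output row holds one column name and its position in the header.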
I'm new to SAS and I wish to import a csv file. This file has a column containing character values that start with a 0 (for instance, 01000 or 05200) and are 5 characters long.
When I open my file with a calc software, no problem. But when I import in SAS with:
proc import file="myfile.csv"
out=output
dbms=csv;
run;
The column is then considered as numerical, and so the first 0 gets deleted. Changing the format afterwards doesn't solve my problem.
Is there a way to specify the column format before the csv is read, or to force all the columns to be imported as character?
Thanks a lot!
The easiest solution is to read the file with a program instead of forcing SAS to guess how to read the file. PROC IMPORT will actually generate a program that you could use as a model. But it is not hard to write your own. Then you will have complete control over how the variables are defined: NAME; TYPE (numeric or character); storage LENGTH; LABEL; FORMAT to use for display; INFORMAT to use for reading the values from the line.
Just define the variables, attach any required formats and/or informats, and then read them. For example, this step reads two numeric and two character variables from the file. I made one of the numeric variables hold DATE values so you can see how you might attach a format and/or informat to a variable that requires one. Most variables need neither an informat nor a format, as SAS knows how to read and write both numbers and character strings.
data output;
infile "myfile.csv" dsd firstobs=2 truncover;
length var1 $10 var2 8 var3 $30 var4 8;
informat var4 date.;
format var4 yymmdd10.;
input var1 var2 var3 var4;
run;
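Applied to the leading-zero case from the question, the same idea might look like the sketch below. The column names CODE, AMOUNT and CODE_NUM and the data set names REPAIRED and IMPORTED are hypothetical, since the real layout of myfile.csv is not shown.

```sas
/* Hypothetical layout: a 5-character code followed by a numeric amount. */
data output;
  infile "myfile.csv" dsd firstobs=2 truncover;
  length code $5 amount 8;   /* $5 makes CODE character, so 01000 stays 01000 */
  input code amount;
run;

/* If a column has already been imported as numeric, the Z5. format
   rebuilds the zero-padded text. IMPORTED and CODE_NUM are placeholders. */
data repaired;
  set imported;
  length code $5;
  code = put(code_num, z5.);
run;
```

Defining the variable as character before the INPUT statement is what prevents SAS from treating the values as numbers.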
I've got (and will receive in the future) many CSV files that use the semicolon as delimiter and the comma as decimal separator.
So far I could not find out how to import these files into SAS using proc import -- or in any other automated fashion without the need for messing around with the variable names manually.
Create some sample data:
%let filename = %sysfunc(pathname(work))\sap.csv;
data _null_;
file "&filename";
put 'a;b';
put '12345,11;67890,66';
run;
The import code:
proc import out = sap01
datafile= "&filename"
dbms = dlm;
delimiter = ";";
GETNAMES = YES;
run;
After the import, a value for the variable "AMOUNT" such as 350,58 (which corresponds to 350.58 in the US format) would look like 35,058 (meaning thirty-five thousand...) in SAS (and after re-export to the German EXCEL it would look like 35.058,00).
A simple but dirty workaround would be the following:
data sap02; set sap01;
AMOUNT = AMOUNT/100;
format AMOUNT best15.2;
run;
I wonder if there is a simple way to define the decimal separator for the CSV import (similar to the specification of the delimiter), or any other "cleaner" solution compared to my workaround.
Many thanks in advance!
You technically should use dbms=dlm not dbms=csv, though it does figure things out. CSV means "Comma separated values", while DLM means "delimited", which is correct here.
I don't think there's a direct way to make SAS read in with the comma via PROC IMPORT. You need to tell SAS to use the NUMXw.d informat when reading in the data, and I don't see a way to force that setting in SAS. (There's an option for output with a comma, NLDECSEPARATOR, but I don't think that works here.)
Your best bet is either to write data step code yourself, or to run the PROC IMPORT, go to the log, and copy/paste the read in code into your program; then for each of the read-in records add :NUMX10. or whatever the appropriate maximum width of the field is. It will end up looking something like this:
data want;
infile "whatever.txt" dlm=';' lrecl=32767 missover;
input
firstnumvar :NUMX10.
secondnumvar :NUMX10.
thirdnumvar :NUMX10.
fourthnumvar :NUMX10.
charvar :$15.
charvar2 :$15.
;
run;
PROC IMPORT will also generate lots of informat and format code. As an alternative to adding the informat in the INPUT statement, you can change the generated informats from BEST. to NUMX10. You can also just remove the informats, unless you have date fields.
data want;
infile "whatever.txt" dlm=';' lrecl=32767 missover;
informat firstnumvar secondnumvar thirdnumvar fourthnumvar NUMX10.;
informat charvar $15.;
format firstnumvar secondnumvar thirdnumvar fourthnumvar BEST12.;
format charvar $15.;
input
firstnumvar
secondnumvar
thirdnumvar
fourthnumvar
charvar $
;
run;
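For the two-column sample file created in the question, a data step along these lines should read the comma decimals directly. It is a sketch that assumes the macro variable &filename from the question is still defined; the width 12 is an arbitrary choice.

```sas
/* Reads the sample file written in the question (header line: a;b). */
data sap01;
  infile "&filename" dlm=';' dsd firstobs=2 truncover;
  input a :numx12. b :numx12.;   /* NUMXw.d treats the comma as the decimal separator */
  format a b 12.2;
run;
```

With the sample record 12345,11;67890,66 this should give a=12345.11 and b=67890.66, so no division by 100 is needed afterwards.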
"Your best bet is either to write data step code yourself, or to run the PROC IMPORT, go to the log, and copy/paste the read in code into your program"
This has a drawback. If the structure of the csv file changes, for example if the column order changes, then the code in the SAS program has to change as well.
So it is safer to change the input instead: replace the comma with a dot in the numeric fields and pass the modified input to SAS.
The first idea was to use a perl program for this, and then use in SAS a filename with a pipe to read the modified input.
Unfortunately there is a SAS restriction in the proc import: The IMPORT procedure does not support device types or access methods for the FILENAME statement except for DISK.
So one has to create a workfile on disk with the adjusted input.
I used the Text::CSV_PP package to read the csv file.
testdata.csv contains the csv data to read.
substitute_commasep.perl is the name of the perl program
perl code:
# use lib "/........"; # specify, if Text::CSV_PP is locally installed. Otherwise error message: Can't locate Text/CSV_PP.pm in ....;
use Text::CSV_PP;
use strict;
my $csv = Text::CSV_PP->new({ binary => 1
,sep_char => ';'
}) or die "Error creating CSV object: ".Text::CSV_PP->error_diag ();
open my $fhi, "<", "$ARGV[0]" or die "Error reading CSV file: $!";
while ( my $colref = $csv->getline( $fhi) ) {
foreach (@$colref) { # analyze each column value
s/,/\./ if /^\s*[\d,]*\s*$/; # substitute, if the field contains only numbers and ,
}
$csv->print(\*STDOUT, $colref);
print "\n";
}
$csv->eof or $csv->error_diag();
close $fhi;
SAS code:
filename readcsv pipe "perl substitute_commasep.perl testdata.csv";
filename dummy "dummy.csv";
data _null_;
infile readcsv;
file dummy;
input;
put _infile_;
run;
proc import datafile=dummy
out=data1
dbms=dlm
replace;
delimiter=';';
getnames=yes;
guessingrows=32767;
run;
I exported my SAS table in the form of a csv file into a different folder for me to use with a different program using this code that worked:
PROC EXPORT data=CA_ISO_policyBYpolicy_&thestate.
outfile="&whichfolder.CA_ISO_policyBYpolicy_&thestate..csv"
dbms=dlm replace;
delimiter=",";
run;
Using a different program in a different folder I am trying to import the data via this code:
LIBNAME Home "/sasdata/sasperm2/act_cfr/fr/SJR/AmFam_vs_ISO_Compare/" ;
%let Filepath = /sasdata/sasperm2/act_cfr/fr/SJR/AmFam_vs_ISO_Compare/;
%sdwlogin;
RUN;
%let thestate = OR;
%let policyyr = 2012;
/*---- ISO_Compare ----*/
data Work.CA_ISO_policyBYpolicy_&thestate.;
length Policy $10.;
infile "&Filepath/CA_ISO_policyBYpolicy_&thestate..csv" DELIMITER=',' TERMSTR=CRLF LRECL=2500 FIRSTOBS=2 MISSOVER DSD;
input Policy;
run;
The program runs but I am getting no data. I shortened the variable list to make the code easier to read. When I manually copy and re-paste the data into a different csv file and re-name it the same "CA_ISO_policyBYpolicy_OR.csv" then it works in my program. My initial reason to incorporate this code was to get rid of the manual process... so if anybody has any hints I would be very thankful.
As Joe suggested in the comments, unless you need the csv files for another reason, it would be better to create a SAS data library for this.
Another way to do this would be to proc import it pretty much the same way you used proc export:
proc import datafile="&Filepath./CA_ISO_policyBYpolicy_&thestate..csv"
out=Work.CA_ISO_policyBYpolicy_&thestate. dbms=dlm replace;
delimiter=",";
getnames=yes; *this will create variable names from your first line;
*The opposite of what proc export did;
run;
The other thing I can think of is that this line:
%let Filepath = /sasdata/sasperm2/act_cfr/fr/SJR/AmFam_vs_ISO_Compare/;
might be causing problems because of the forward slash. Try it like:
%let Filepath = %str(/sasdata/sasperm2/act_cfr/fr/SJR/AmFam_vs_ISO_Compare/);
Also, does the sasdata directory actually exist right on the root directory or is it a subdirectory of the current directory where your sas program is located? If it's the current directory you need to lose the initial forward slash (or put a . in front of it):
%let Filepath = %str(./sasdata/sasperm2/act_cfr/fr/SJR/AmFam_vs_ISO_Compare/);
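One more thing that may be worth checking is the TERMSTR=CRLF option in the original INFILE statement. The paths suggest a Unix host, and a file written there by PROC EXPORT would normally end its lines with LF only. In that case SAS finds no CRLF terminators, and FIRSTOBS=2 can skip everything it read. That would also fit the observation that a copy pasted by hand works, since that copy may have picked up Windows line endings. The sketch below assumes this explanation and changes only that option.

```sas
/* Assumes the file has Unix (LF) line endings; everything else follows the question. */
data Work.CA_ISO_policyBYpolicy_&thestate.;
  length Policy $10;
  infile "&Filepath.CA_ISO_policyBYpolicy_&thestate..csv"
         delimiter=',' termstr=lf lrecl=2500 firstobs=2 missover dsd;
  input Policy;
run;
```

Leaving TERMSTR out altogether lets SAS use the host default, which should behave the same way on a Unix system.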
I got a file in this format.
abc;def;"ghi
asdasd
asdasd
asd
asd
aas
d
"
Now I want to import it with SAS. How do I handle the multiline values?
The answer might depend on what causes the linefeeds to be there, what kind of linefeeds they are, and possibly also on the OS you're running SAS on as well as the version of SAS you're using. Not knowing any of the answers to these questions, here are a couple of suggestions:
First, you could try this infile statement on your data step:
infile "C:\test.csv" dsd delimiter=';' termstr=crlf;
The termstr=crlf option tells SAS to treat only Windows line endings (carriage return plus linefeed) as the end of a record.
Alternatively, you could have SAS pre-process your file byte by byte to ensure that any linefeeds within paired quotes are replaced (perhaps with spaces):
data _null_;
infile 'C:\test.csv' recfm=n;
file 'C:\testFixed.csv' recfm=n;
input a $char1.;
retain open 0;
if a='"' then open=not open;
if (a='0A'x or a='0D'x) and open then put '00'x @;
else put a $char1. @;
run;
This is adapted from here for your reference. You might need to tinker around with this code a bit to get it working. The idea is that you would then read the resulting csv into SAS with a standard data step.
The output we need to produce is a standard delimited file but instead of ascii content we need binary. Is this possible using SAS?
Is there a specific Binary Format you need? Or just something non-ascii? If you're using proc export, you're probably limited to whatever formats are available. However, you can always create the csv manually.
If anything will do, you could simply zip the csv file.
Running on a *nix system, for example, you'd use something like:
filename outfile pipe "gzip -c > myfile.csv.gz";
Then create the csv manually:
data _null_;
set mydata;
file outfile;
put var1 "," var2 "," var3;
run;
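As a variation on the step above, the DSD option on the FILE statement can write the commas for you, and it quotes any value that itself contains a comma. This is a sketch that uses SASHELP.CLASS as stand-in data and an illustrative output name (class.csv.gz); it assumes gzip is on the path and that the SAS session allows pipes.

```sas
/* Sketch: compress a comma-delimited export on the fly (Unix-style host). */
filename outfile pipe "gzip -c > class.csv.gz";

data _null_;
  set sashelp.class;
  file outfile dsd;            /* DSD defaults the delimiter to a comma */
  put name sex age height weight;
run;

filename outfile clear;
```

No header row is written here; a PUT statement guarded by `if _n_ = 1` could add one.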
If this is PC/Windows SAS, I'm not as familiar, but you'll probably need to install a command-line zip utility.
This link from SAS suggests using winzip, which has a freely downloadable version. Otherwise, the code is similar.
http://support.sas.com/kb/26/011.html
You can actually make a CSV file as a SAS catalog entry; CSV is a valid SAS Catalog entry type.
Here's an example:
filename of catalog "sasuser.test.class.csv";
proc export data=sashelp.class
outfile=of
dbms=dlm;
delimiter=',';
run;
filename of clear;
This little piece of code exports SASHELP.CLASS to a SAS Catalog entry of entry type CSV.
This way you get a binary format you can move between SAS installations on different platforms with PROC CPORT/CIMPORT, not having to worry if the used binary package format is available to your SAS session, since it's an internal SAS format.
Are you saying you have binary data that you want to output to csv?
If so, I don't think there is necessarily a defined standard for how this should be handled.
I suggest trying it (proc export comes to mind) and seeing if the results match your expectations.
Using SAS, output a .csv file; Open it in Excel and Save As whichever format your client wants. You can automate this process with a little bit of scripting in ### as well. (Substitute ### with your favorite scripting language.)