• Ekta Aggarwal

IMPORTING data in SAS (using PROC IMPORT)

In this tutorial we shall learn about how to import data in SAS using PROC IMPORT.

We will mainly focus on importing:

  • CSV files

  • EXCEL files

Let us first set our library using the following code: My directory name is mylib

libname mylib '/home/u50182927/My_datasets';

For this tutorial I have uploaded following 2 files:

CSV File:

Employee_info
.csv
Download CSV • 155B

Excel File:

Mydata
.xlsx
Download XLSX • 10KB

I am using SAS' University edition thus I have uploaded my files on the SAS server as follows:





















Importing a CSV file


PROC IMPORT DATAFILE = '/home/u50132927/My_datasets/Employee_info.csv'
OUT = mylib.imported_CSV_file
DBMS  =csv REPLACE;
GETNAMES=Yes;
GUESSINGROWS = MAX;
RUN;

Using the above syntax we can import our CSV file in SAS. Following is the explanation of keywords in PROC IMPORT.


DATAFILE : Location of file and file name which needs to be imported.

OUT : Defining the library and new dataset name which we will use for referring in SAS.

DBMS = CSV (for CSV files)

REPLACE: Replace the already existing dataset. In our case if imported_CSV_file exists in our mylib then that will be replaced by our new file.

GETNAMES : If GETNAMES = Yes then first row would be treated as our header row, if your data does not have any headers then set GETNAMES = No

GUESSINGROWS: By default SAS takes only some of the rows to guess the informats of the dataset. By setting GUESSINGROWS = MAX we are telling SAS to guess the informats of the columns using all the rows.

Importing CSV file in SAS









Importing an Excel file


For importing an excel file we use DBMS = xlsx or DBMS = xls depending upon which version you have chosen to create that excel file.

PROC IMPORT DATAFILE = '/home/u50132927/My_datasets/Mydata.xlsx'
OUT = mylib.imported_excel_file
DBMS =xlsx REPLACE;
GETNAMES=No;
RUN;

If we set GETNAMES = No then our first row would be treated as a part of our data and default column names are provided by SAS.

Datarow= 3 means start reading the rows form 3rd line i.e. from 3rd row including the header.

PROC IMPORT DATAFILE = '/home/u50132927/My_datasets/Mydata.xlsx'
OUT = mylib.imported_excel_file
DBMS =xlsx REPLACE;
DATAROW = 3
GETNAMES=Yes;
RUN;
Importing excel file in SAS

SHEET = ... If our excel file has multiple sheets then we can define the sheetname as SHEET = ... By default PROC IMPORT reads first sheet in an excel.

PROC IMPORT DATAFILE = '/home/u50132927/My_datasets/Mydata.xlsx'
OUT = mylib.imported_excel_file
DBMS =xlsx REPLACE;
SHEET = 'Sheet2';
GETNAMES=Yes;
RUN;


Tags: