In this tutorial we shall learn about how to do data manipulations using IF ELSE in SAS.
IF Statements in SAS can be used for:
Data Manipulation and creation of new variables.
Filtering rows.
Let us first set our library using the following code: My directory name is mylib
libname mylib '/home/u50182927/My_datasets';
For this tutorial we will leverage SAS' inbuilt dataset SASHELP.SHOES.
Creation of new variables...
Let us say you plan to meet a friend (That friend who usually cancels the plan at the end moment; Yes! That traitor who ditches you at the very last moment!) You always keep a second option with you:
Option 1: Meet that friend.
Option 2: If the friend doesn't turn up then you will play a video game
Option 3: If your friend does not turn up and you don't feel like playing a video game then you will go to sleep.
These what if scenarios are represented by IF-ELSE statements in SAS.
In SAS, you can create variables as well as datasets when certain conditions apply.
Syntax:
if (condition 1) THEN ### Do something when condition 1 is TRUE;
else if(condition 2) THEN ### Do something when condition 1 is FALSE but condition 2 is TRUE;
else ### Do something when conditions 1 and 2 are FALSE;
Task: Create a new variable Sales_info which indicates low sales if sales are < 40000 , otherwise Good Sales.
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES < 40000 THEN Sales_info = 'Low sales';
ELSE Sales_info = 'Good Sales';
run;
Length statement
Note that in the above output Sales_info variable is getting truncated. This is because by default length of a character variable in SAS is 8 characters. To serve our purpose we need to define the maximum length using LENGTH function.
Here we have defined the maximum length of our Sales_info variable as 20 characters.
DATA mylib.if_else;
SET SASHELP.SHOES;
LENGTH Sales_info $20.;
IF SALES < 40000 THEN Sales_info = 'Low sales';
ELSE Sales_info = 'Good Sales';
run;
Multiple conditions
Task: Create a new variable Sales_info which indicates low sales if sales are < 40000 , Medium sales if sales are between 40000 and 120000, otherwise Good Sales.
DATA mylib.if_else;
SET SASHELP.SHOES;
LENGTH Sales_info $20.;
IF SALES < 40000 THEN Sales_info = 'Low sales';
ELSE IF SALES < 120000 THEN Sales_info = 'Medium sales';
ELSE Sales_info = 'Good Sales';
run;
Multiple conditions in IF
We can create variables having multiple conditions. To learn how to filter data in SAS using IF keywords refer to this tutorial.
Task: In the above IF ELSE scenario also add Returns < 4000 condition in IF and ELSE IF part.
DATA mylib.if_else;
SET SASHELP.SHOES;
LENGTH Sales_info $20.;
IF (SALES < 40000 AND Returns <4000) THEN Sales_info = 'Low sales';
ELSE IF (SALES < 120000 AND Returns <4000)THEN Sales_info = 'Medium sales';
ELSE Sales_info = 'Good Sales';
run;
Doing multiple things in IF ELSE: DO - END block
Sometimes we need to do multiple tasks when a condition is satisfied. In such a situation after THEN keyword we define DO; .... END; block (within which we write the tasks which need to be accomplished when a particular condition gets satisfied).
Task: Create new variable Sales_info and Status which indicates :
low sales and status =1 if sales are < 40000 ,
Medium sales and status = 2 if sales are between 40000 and 120000,
otherwise Good Sales and status = 3.
DATA mylib.if_else;
SET SASHELP.SHOES;
LENGTH Sales_info $20.;
IF SALES < 40000 THEN do;
Sales_info = 'Low sales';
Status = 1;
end;
ELSE IF SALES < 120000 THEN do;
Sales_info = 'Medium sales';
Status = 2;
end;
ELSE do;
Sales_info = 'Low Sales';
Status = 3;
end;
run;
Filtering rows
We can also filter datasets using IF statement in SAS
Task: Filter only those rows where Sales > 40000
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES > 40000 THEN OUTPUT;
RUN;
Alternatively
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES > 40000;
RUN;
THEN OUTPUT in above codes implies that output should be considered if Sales > 40000.
Task: Delete those rows where Sales > 40000
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES > 40000 THEN DELETE;
RUN;
THEN DELETE in above code implied that rows having Sales > 40000 should be deleted.
We can also filter for a specific number of rows.
Task: Filter for data for rows after 200.
DATA mylib.if_else;
SET SASHELP.SHOES;
IF _N_ >200;
RUN;
In the above code _N_ is a temporary variable in SAS which denotes the iteration (row number) which SAS currently is dealing.
Difference between if IF-IF and IF-ELSE IF -ELSE
If we write multiple IFs as follows then they will be considered as separate conditions and all of them would be treated individually. While writing the conditions in IF-ELSE chunk make them one set of condition and SAS checks if the first condition in IF is not true then only it goes to the ELSE IF or ELSE part.
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES > 40000 THEN Status = 1;
IF Returns > 1500 Then second_status = 2;
RUN;
In SAS following codes with multiple IF conditions (with no THEN statement) means filter the data where Sales > 40000 and Returns > 1500 (both conditions are satisfied by the rows).
DATA mylib.if_else;
SET SASHELP.SHOES;
IF SALES > 40000 ;
IF Returns > 1500;
RUN;
Comments