SAS DIF function – Comparing previous records
The SAS DIF function is another useful tool for operating on previous records.
Lets say you want to know the difference in the last year's revenue and this year's revenue. Here is the data:
Year Revenue
2001 10,000
2002 30,000
2003 60,000
2004 100,000
data revenue1;
set revenue;
Rev_increase=dif1(revenue);
run;
Year Revenue Rev_increase
2001 10,000 .
2002 30,000 20,000
2003 60,000 30,000
2004 100,000 40,000
The dif functions compares the current record to the nth previous record.
SAS LAG funciton
The LAG function is a powerful tool in the SAS programming toolset. It is also one that has unintentional results. First, lets demonstrate how to use it.
data new_ds;
set old_ds;
last_price=lag(price);
run;
The above will return the last value processed for the price variable. The resulting dataset may look like this.
obs price last_price
1 10 .
2 15 10
3 8 15
4 3 8
5 9 3
Now its misuse. Conditional processing will create unexpected results when incorporated with the lag function. Lag returns the last value processed even if it doesn't appear in the resulting dataset. It is also important to note that you shouldn't use lag if you aren't reading dataset records sequentially.
This function can be handy when calculating moving averages and the like.
The following calculates the sum of the last 5 periods:
data new_ds;
set old_ds;
last_5_periods=lag(price)+lag2(price)+lag3(price)+lag4(price)+lag5(price);
ave_5_periods=last_5_periods/5;
run;