AFHood Group Blog The thoughtless yammerings of marketing junkies..

22Oct/090

SAS DIF function – Comparing previous records

The SAS DIF function is another useful tool for operating on previous records.

Lets say you want to know the difference in the last year's revenue and this year's revenue. Here is the data:

Year          Revenue

2001           10,000

2002          30,000

2003          60,000

2004          100,000

data revenue1;

set revenue;

Rev_increase=dif1(revenue);

run;

Year          Revenue         Rev_increase

2001           10,000         .

2002          30,000         20,000

2003          60,000         30,000

2004          100,000       40,000

The dif functions compares the current record to the nth previous record.

22Oct/090

SAS LAG funciton

The LAG function is a powerful tool in the SAS programming toolset. It is also one that has unintentional results. First, lets demonstrate how to use it.

data new_ds;

set old_ds;

last_price=lag(price);

run;

The above will return the last value processed for the price variable. The resulting dataset may look like this.

obs   price   last_price

1   10   .

2   15   10

3   8   15

4   3   8

5   9   3

Now its misuse. Conditional processing will create unexpected results when incorporated with the lag function. Lag returns the last value processed even if it doesn't appear in the resulting dataset. It is also important to note that you shouldn't use lag if you aren't reading dataset records sequentially.

This function can be handy when calculating moving averages and the like.

The following calculates the sum of the last 5 periods:

data new_ds;

set old_ds;

last_5_periods=lag(price)+lag2(price)+lag3(price)+lag4(price)+lag5(price);

ave_5_periods=last_5_periods/5;

run;