AFHood Group Blog The thoughtless yammerings of marketing junkies..

2Feb/094

Search a String Variable with the SAS Index Function – INDEX, INDEXC, INDEXW

The SAS index function is a basic tool for any programmer. The job of the index function is to search a string variable for a given value. The function returns a numeric value for the position of the value that matches your search criteria. In the event that there are multiple matches, it will return the first.

Here is a basic example:

proc print data = an_old_dataset;
run;
Obs    var_name

 1     AFHood
 2     Analytics
 3     Group
 4     AAG
 5     AFHood Analytics Group

data a_new_dataset;

set an_old_dataset;

got_value = index(var_name,"AFHood");

run;

Obs    var_name              got_value

 1     AFHood                 1
 2     Analytics              0
 3     Group                  0
 4     AAG                    0
 5     AFHood Analytics Group 1

There is another function that is similar and equally as useful. That is indexc. The function will return the first occurrence of any of the characters in any of the search strings.

data a_new_dataset;

set an_old_dataset;

got_value = indexc(var_name,"Ads", "tc");

run;

Obs    var_name              got_value

 1     AFHood                 1
 2     Analytics              1
 3     Group                  0
 4     AAG                    1
 5     AFHood Analytics Group 1

Lastly is the indexw function. It will return the first occurrence of a word in the search string.
data a_new_dataset; set an_old_dataset; got_value = indexc(var_name,"AFHood"); run;
Obs    var_name              got_value
 1     AFHood                 1
 2     Analytics              1
 3     Group                  0
 4     AAG                    1
 5     AFHood Analytics Group 1

Summary:
INDEX function - searches for patterns as separate words or as parts of words
INDEXC function - searches for any characters that are present in the string
INDEXW function - searches for strings that are words
Comments (4) Trackbacks (0)
  1. When I try INDEXC, it only allows for multiple, one character searches, not multiple string searches. What version of SAS are you using? or am I doing something wrong?
    Thanks,
    Jen

    This is what I tried.
    T2DMED = INDEXC(MEDS, “METFORMIN”, “INSULIN”, “ACTOS”, “PIOGLITIZONE”, “TAKEDA”, “AMARYL”, “GLIMEPIRIDE”, “AVENTIS”, “SULFONYLUREA”, “AVANDIA”, “ROSIGLITAZONE”, “AVANDAMET”, “BYETTA”, “EXANTIDE”, “DUETACT”, “GALVUS”, “VILDAGLIPTIN”, “GLIPIZIDE”, “GLUCOPHAGE”, “BIGUANIDE”, “GLIMEPIRIDE”, “AMARYL”, “GLYBURIDE”, “GLUCOVANCE”, “GLUCOTROL”, “GLIPIZIDE”, “GLYSET”, “JANUVIA”, “JANUMET”, “PRANDI”, “PRECOSE”, “ACARBOSE”, “REZULIN”, “TROGLITAZONE”, “STARLIX”, “NATEGLINIDE”, “NOVARTIS”, “GLYBURIDE”);

  2. Jen,

    You are right. We have misrepresented this function. Although you can specify a group of characters to search, it will return the first instance of ANY of those characters and not as a pattern. We will update the information accordingly and greatly appreciate your willingness to seek this out.

    Best regards,

    AAG

  3. if the string contains repeated characters how can u find the character postion

  4. I am trying to use this function along with a substring. Essentially, I need to find the value that is stored in the column PROD_SERV_NM in the column Gen_Name.

    How do I combine the two?


Leave a comment

No trackbacks yet.