Pages

Sunday, August 4, 2013

Widgets

Simulating Proc FREQ counts with Preloaded Formats

Proc FREQ is commonly used to create frequency counts on cross tabulations. The following code demonstrates this by counting the frequency of Initials by Gender:
* make up some data... count the initials by gender;
data class1;
set sashelp.class;
initial=substrn(name,1,1);
run;
* tabulate frequencies;
proc freq data=class1;
table initial*sex / out=class_freq noprint;
run;
proc print;
run;
However, what if we wanted to always show a count for both genders, even when there are none present in the raw data? One possibility is to use the SPARSE option on the TABLE statement. This works well given there is at least one record somewhere in the cross for each level to be reported.
What can we do if no data exist on the set being analysed for a given level? For example, in a survey "Strongly Disagree" was a valid response but no records were ever observed. This is where preloaded formats provide a robust solution.
Several procedures such as Proc MEANS, SUMMARY, TABULATE and REPORT allow the use of preloaded formats. Each value that is needed in the output is defined in the format. The following code shows an example of creating the same frequency counts as Proc FREQ:
* Ensure all the permutations are reported using a format
and proc summary;
proc format ;
value $gender
'F'='Female'
'M'='Male';
run;
proc summary data=class1 completetypes nway ;
class initial;
class sex / preloadfmt;
output out=class_summ(drop=_type_) / autoname;
run;
proc print;
run;

No comments:

Post a Comment