Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.

Monday, January 27, 2014

Produce SAS Proc Freq Output in R

I have never been a huge fan or advocate of SAS, and I actually recommend using alternatives, but I am somewhat addicted to the SAS Proc Freq procedure and its output tables. It's a nice way not only to visualize the data but also to get some useful summary statistics. I have been using the R table command for a while, and in most cases, combined with margin.table or prop.table, it suffices to summarize the data. Recently, I have needed summary statistics for the tables as well. This can be accomplished easily enough with the R chisq.test or fisher.test commands, but that still doesn't quite provide the fluid integration of data visualization and summary statistics that the SAS Proc Freq output provides. Today I came across the R function CrossTable. This function, part of the gmodels package, provides formatted output very similar to that of Proc Freq. So much so that it even uses the same ASCII characters to delineate cell boundaries. It takes a bit of playing around with the options to get the exact statistics and percentages you would like, but overall it is a very nice (and, not to mention, free) alternative to SAS's Proc Freq output. Below is an example table I created.
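
For anyone who wants to try this, here is a minimal sketch of the workflow described above. The data are made up purely for illustration (a hypothetical 2x2 table of genotype by case/control status), and the CrossTable options shown are just one reasonable starting point, not necessarily the settings behind my own table. The CrossTable call is skipped if the gmodels package is not installed.

```r
# Hypothetical example data, invented for illustration only:
# 100 samples cross-classified by genotype and case/control status.
genotype <- factor(c(rep("carrier", 30), rep("non-carrier", 70)))
status   <- factor(c(rep("case", 20), rep("control", 10),
                     rep("case", 25), rep("control", 45)))

# Base R approach: build the table, then ask for margins,
# proportions, and tests in separate steps.
tab <- table(genotype, status)
print(tab)
print(margin.table(tab, 1))   # row totals
print(prop.table(tab, 1))     # row proportions
print(chisq.test(tab))        # chi-square test of independence
print(fisher.test(tab))       # Fisher's exact test

# gmodels approach: one call gives a Proc Freq style table with
# counts, row/column/total proportions, and the requested tests.
if (requireNamespace("gmodels", quietly = TRUE)) {
  gmodels::CrossTable(genotype, status,
                      prop.chisq = FALSE,  # hide per-cell chi-square contributions
                      chisq = TRUE,        # add chi-square test
                      fisher = TRUE,       # add Fisher's exact test
                      format = "SAS",      # SAS-like layout
                      dnn = c("Genotype", "Status"))
}
```

Setting prop.r, prop.c, or prop.t to FALSE trims the corresponding percentages from each cell if the default output is too busy.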
