Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.

Wednesday, June 4, 2014

Easy Forest Plots in R


Forest plots are great ways to visualize individual group estimates as well as investigate heterogeneity of effect.  Most forest plot programs will display combined effect estimates and give you an indicator of whether there is evidence for heterogeneity among subgroups.  Fortunately, the R metafor package makes meta-analysis and plotting forest plots relatively easy.  Below is a sample R script that will combine beta and standard error estimates from an input file (input.txt) and create simple forest plots with an overall estimate as well as p-values for association and heterogeneity.

In general, the input.txt file should either

(1) Have the columns:
group - name for the study or group
beta - the log odds ratio for the effect of interest
se - the standard error for the log odds ratio estimate

or

(2) Have the columns:
group - name for the study or group
OR - odds ratio for the effect of interest
LCL - lower confidence interval for the odds ratio
UCL - upper confidence interval for the odds ratio

For the second case where you have an odds ratio and 95% confidence estimates, beta and se need to be estimated.  This is done by uncommenting lines 8 and 9 of the script.  Of note: due to rounding error the final forest plot may have 95% CI limits that are one digit off.

The R script that uses the metafor package as well as an example input.txt are below.

# Load metafor library
library(metafor)
# Load input.tyx
data <- read.table("input.txt",as.is=T, header=T, sep="\t")
# # If OR's and 95% CI's, run these commented out lines
# data$beta <- log(data$OR)
# data$se <- (log(data$UCL)-log(data$LCL))/(2*1.96)
# Assign values for plotting
labs <- data$group
yi <- data$beta
sei <- data$se
# Combine data into summary estimate
res <- rma(yi=yi, sei=sei, method="FE")
summary(res)
# Plot combined data
forest(res, transf=exp, refline=1, xlab="Odds Ratio (95%CI)", slab=labs, mlab="Summary Estimate")
mtext(paste("Association p-value=",summary(res)$pval),side=3, line=-1)
mtext(paste("Heterogeneity p-value=",summary(res)$QEp),side=3, line=-2.25)
view raw forest_plot.R hosted with ❤ by GitHub

group beta se
Study 01 0.635422995842 0.239880139026
Study 02 -0.043430833437 0.255182175384
Study 03 0.808777362855 0.278587375378
Study 04 0.233386044408 0.316660371211
Study 05 0.237572509459 0.195872577686
Study 06 0.541210360393 0.264106181227
Study 07 0.201281009793 0.179269971554
Study 08 -0.009615458699 0.443981914977
Study 09 0.569423560036 0.127545364349
Study 10 0.673278977343 0.293661994161
view raw input.txt hosted with ❤ by GitHub


2 comments:

  1. Hi,
    thanks for this script, i have been looking for something similar for a long time!
    I was just wondering if it's ok to use it for a linear regression, and with standardized beta.
    Thanks.

    ReplyDelete