
Friday, November 8, 2013

How to Estimate a P-value from a Confidence Interval

Confidence intervals (CIs) convey the level of certainty around an estimated effect size, and I recommend including one whenever an effect size is reported.  Sometimes, however, a manuscript reports a CI but no p-value, and it is of interest to back-calculate the p-value from the CI.  To do so, we need to remember the basic equations for a confidence interval and a p-value.  For a 95% CI, we take the effect size and subtract/add 1.96 times its standard error to get the lower and upper bounds.  For the p-value, we divide the effect estimate by its standard error to get a z score (testing the null hypothesis of no effect), from which we can calculate the p-value.  Therefore, if we are given an effect size and confidence interval, all we need to do is back-calculate the standard error and combine it with the effect size to get the z score.  This assumes the reported CI was built from a normal approximation and is symmetric around the estimate on the scale being analyzed.  Below are two examples to illustrate how to do this.
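For reference, the relationships described above can be written out compactly. The symbols here are my own shorthand and not from any particular paper: the effect estimate is \hat{\theta}, its standard error is SE, L and U are the reported lower and upper 95% limits, and \Phi is the standard normal cumulative distribution function.

```latex
\begin{align*}
(L,\ U) &= \hat{\theta} \mp 1.96 \cdot SE \\
SE &= \frac{U - L}{2 \times 1.96} \\
z &= \frac{\hat{\theta}}{SE} \\
p_{\text{two-sided}} &= 2\,\bigl(1 - \Phi(|z|)\bigr)
\end{align*}
```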

Suppose we have an estimate of a risk difference and a respective 95 percent confidence interval of 3.60 (0.70, 6.50).  Here are the steps to follow:
(1) Subtract the lower limit from the upper limit to get the difference and divide by 2: (6.50-0.70)/2=2.9
(2) Divide the difference by 1.96 (for a 95% CI) to get the standard error estimate: 2.9/1.96=1.48
(3) Divide the risk difference estimate by the standard error estimate to get a z score: 3.60/1.48=2.43
(4) Look up the z score using Python, R (ex: 2*pnorm(-abs(z))), Excel (ex: 2*(1-NORMSDIST(ABS(z)))), or an online calculator to get the p-value.  Usually the two-sided p-value is reported: p=0.015 (two-sided)
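
The four steps above can be sketched in Python using only the standard library.  The function name p_from_ci and its z_crit parameter are my own choices for illustration, and the sketch assumes a symmetric, normal-approximation CI as in the example.

```python
from statistics import NormalDist

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Back-calculate SE, z score, and two-sided p-value from an estimate and its CI.

    z_crit is the critical value used to build the CI (1.96 for a 95% CI).
    """
    # Steps 1-2: half the CI width divided by the critical value gives the SE
    se = (upper - lower) / (2 * z_crit)
    # Step 3: z score for a test against a null value of 0
    z = estimate / se
    # Step 4: two-sided p-value from the standard normal distribution
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

se, z, p = p_from_ci(3.60, 0.70, 6.50)
print(f"SE = {se:.2f}, z = {z:.2f}, p = {p:.3f}")
```

For the risk difference example this should reproduce the hand calculation (SE of about 1.48, z of about 2.43, p of about 0.015).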

For an odds ratio, things are a bit trickier because we need to first take the natural log of the estimate and 95% confidence interval before we can carry out the back calculation of the standard error for calculating the p-value.  Suppose we have an odds ratio and 95 percent confidence interval of 1.28 (1.05, 1.57).  Here are the steps to follow:
(1) Take the natural log (ln) of the odds ratio and of each limit of the 95% CI: 0.25 (0.05, 0.45)
(2) Subtract the lower limit from the upper limit and divide by 2: (0.45-0.05)/2=0.2
(3) Divide the difference by 1.96 (for a 95% CI) to get the standard error estimate: 0.2/1.96=0.10
(4) Divide the log odds ratio by the standard error estimate to get a z score: 0.25/0.10=2.50
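(5) Look up the z score using Python, R (ex: 2*pnorm(-abs(z))), Excel (ex: 2*(1-NORMSDIST(ABS(z)))), or an online calculator to get the p-value.  Usually the two-sided p-value is reported: p=0.012 (two-sided).  Note that rounding at each step matters here: carrying the unrounded logs through the calculation gives a z score of about 2.41 and a p-value of about 0.016.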
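
The odds ratio steps can be sketched the same way, with the log transform applied first.  Again, the function name p_from_ratio_ci is hypothetical, and the sketch assumes the CI is symmetric on the log scale.  Because the code keeps full precision instead of rounding at each step, its result differs slightly from a hand calculation with two-decimal values.

```python
from math import log
from statistics import NormalDist

def p_from_ratio_ci(ratio, lower, upper, z_crit=1.96):
    """Back-calculate SE (log scale), z score, and two-sided p-value for a ratio measure."""
    # Step 1: move the estimate and CI limits to the natural log scale
    log_est = log(ratio)
    # Steps 2-3: half the log-scale CI width divided by the critical value
    se = (log(upper) - log(lower)) / (2 * z_crit)
    # Step 4: z score against a null of ln(1) = 0, i.e. an odds ratio of 1
    z = log_est / se
    # Step 5: two-sided p-value
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p

se, z, p = p_from_ratio_ci(1.28, 1.05, 1.57)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

With unrounded intermediate values this gives a z score of roughly 2.41 and a p-value of roughly 0.016.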

Hopefully these are good examples to get you started.  As you can imagine, you can also go from a p-value to a 95% confidence interval by running these steps in the opposite direction, but in practice it is somewhat unlikely an author would report an effect size and p-value while leaving out the 95% confidence interval.
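
For completeness, here is a minimal sketch of the reverse direction: recovering an approximate 95% CI from an effect estimate and a two-sided p-value.  The function name ci_from_p is my own, and the sketch assumes the p-value came from a z test of the estimate against 0 and lies strictly between 0 and 1.

```python
from statistics import NormalDist

def ci_from_p(estimate, p_two_sided, z_crit=1.96):
    """Approximate CI from an estimate and its two-sided p-value (z test against 0)."""
    # Recover |z| from the two-sided p-value
    z = NormalDist().inv_cdf(1 - p_two_sided / 2)
    # The standard error is the estimate divided by its z score
    se = abs(estimate) / z
    # Rebuild the interval around the estimate
    return estimate - z_crit * se, estimate + z_crit * se

lower, upper = ci_from_p(3.60, 0.015)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Using the risk difference example, this should land close to the original interval of (0.70, 6.50).  For an odds ratio, the same idea applies on the log scale: pass in the log odds ratio, then exponentiate the two limits.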

1 comment:

1. Thanks for the article! Is there a way to calculate the p-value from confidence intervals from a mixed-model regression?