where:
p is the risk allele frequency for locus k
r is the per allele odds ratio for locus k.
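As a quick sanity check, here is a worked example with made-up values p = 0.3 and r = 1.2 (illustrative only, not taken from any study). It uses the per-locus expression in the form the R script below evaluates it:

```latex
\[
\lambda_k = \frac{p r^2 + (1 - p)}{\bigl(p r + (1 - p)\bigr)^2}
          = \frac{0.3 \times 1.44 + 0.7}{(0.3 \times 1.2 + 0.7)^2}
          = \frac{1.132}{1.1236} \approx 1.0075
\]
```

So a single common SNP with a modest odds ratio contributes a familial relative risk only slightly above 1 on its own.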
To make this easy, I wrote a simple R script that does all the calculations automatically. The input for the script is a whitespace-delimited file with no header row and three columns:
(1) Annotation for the SNP - this can be anything, for example: RS number, chromosomal coordinates, etc.
(2) Risk allele frequency - this is the frequency of the risk allele (range: 0-1) equal to p in the above equation.
(3) Per allele odds ratio - odds ratio for every one unit increase in the number of risk alleles.
Note that column 2 is the frequency of the risk allele, not the minor allele frequency. The script also needs an estimate of the familial relative risk (lambda 0), which can usually be found in previous familial studies of the disease.
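If your source reports the minor allele frequency and an odds ratio for the minor allele, a conversion step like the one sketched here can put the data into the risk-allele convention. The helper name `to_risk_allele`, the column names `maf` and `or_minor`, and all the numbers are hypothetical and only for illustration:

```r
# Hypothetical helper: express each SNP in terms of its risk allele,
# i.e. the allele with odds ratio >= 1.
to_risk_allele <- function(maf, or_minor) {
  protective <- or_minor < 1  # minor allele lowers risk, so the major allele is the risk allele
  data.frame(
    p = ifelse(protective, 1 - maf, maf),
    r = ifelse(protective, 1 / or_minor, or_minor)
  )
}

# Made-up example values, not real association results
reported <- data.frame(
  id       = c("snp_a", "snp_b", "snp_c"),
  maf      = c(0.30, 0.10, 0.45),
  or_minor = c(1.20, 0.80, 1.10)
)
snp_input <- data.frame(id = reported$id,
                        to_risk_allele(reported$maf, reported$or_minor))
print(snp_input)

# Write the three-column file without a header row, as the script expects
write.table(snp_input, "snp_lst.txt",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
```

The important point is that the frequency and the odds ratio on each row refer to the same allele. Flipping both together (p to 1 - p, r to 1/r) leaves the per-locus lambda unchanged, but pairing the minor allele frequency with a risk-allele odds ratio does not.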
Here is the R script:
# Import command line arguments: input file name and familial relative risk
args <- commandArgs(trailingOnly = TRUE)
snp_filename <- args[1]
fd_fam_risk <- as.numeric(args[2])
# Import SNP data (no header; columns: annotation, risk allele frequency, per-allele odds ratio)
snps <- read.table(snp_filename, as.is = TRUE, header = FALSE)
names(snps) <- c("id", "p", "r")
# Observed familial risk to first degree relatives
print("Familial Risk Estimate (OR):")
print(fd_fam_risk)
# Calculate familial risk due to each locus
snps$lambda <- with(snps, (p * r^2 + (1 - p)) / (p * r + (1 - p))^2)
print(snps)
# Contribution of known SNPs to familial risk
print("Familial Risk Explained (%):")
print(sum(log(snps$lambda)) / log(fd_fam_risk) * 100)
It can be run from the command line, for example:
Rscript familial_risk_snps.R snp_lst.txt 4
where:
familial_risk_snps.R is the name of the script.
snp_lst.txt is the input file with three columns described above.
4 is the estimate of the familial relative risk of the disease.
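For experimenting interactively without an input file, the same calculation can be wrapped in a function. This is only a sketch: the function name `familial_risk_explained` is hypothetical, and the SNP values are made up for illustration:

```r
# Sketch: percentage of the familial relative risk explained by a set of SNPs.
# p = risk allele frequencies, r = per-allele odds ratios,
# fd_fam_risk = familial relative risk to first degree relatives (lambda 0).
familial_risk_explained <- function(p, r, fd_fam_risk) {
  lambda <- (p * r^2 + (1 - p)) / (p * r + (1 - p))^2  # per-locus familial risk
  100 * sum(log(lambda)) / log(fd_fam_risk)
}

# Made-up SNPs, not real association results
p <- c(0.30, 0.90, 0.45)
r <- c(1.20, 1.25, 1.10)
pct <- familial_risk_explained(p, r, fd_fam_risk = 4)
print(round(pct, 2))

# A SNP with an odds ratio of 1 contributes nothing
print(familial_risk_explained(0.5, 1, fd_fam_risk = 4))
```

Because the per-locus contributions are summed on the log scale, adding or dropping a SNP changes the total by exactly that SNP's own share.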
I am having difficulty deriving the original formula used by Cox et al., or your arithmetically equivalent version given here, for the contribution of lambda due to allele k to the overall familial risk. I am wondering how one might derive this formula from a simple pedigree, assuming one affected individual and a first-degree relative.