Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.

Thursday, February 20, 2014

Zero Fill Coverage Gaps in Samtools Depth Output

Merging depth output from multiple .bam files can be difficult since Samtools only outputs depth counts for coordinates with non-zero coverage.  If you want to merge depth output from these .bam files you first need to fill in the base pair positions of no coverage with zero values so the depth output for all .bam files is the same length.  Then using a simple UNIX cat you can merge multiple .bam file depth output into one file for comparison and analysis.

Here is a simple Python script to zero fill Samtools depth output:

import sys
filename=sys.argv[1]
start=int(sys.argv[2])
stop=int(sys.argv[3])
data=open(filename).readlines()
out=open(filename.strip(".txt")+".depth", "w")
for i in range(len(data)):
chr,bp_str,count=data[i].strip().split()
bp=int(bp_str)
if start<bp:
for j in range(start,bp):
print >> out, chr+"\t"+str(j)+"\t0"
print >> out, chr+"\t"+bp_str+"\t"+str(count)
start=bp+1
elif start==bp:
print >> out, chr+"\t"+bp_str+"\t"+str(count)
start=bp+1
if start<stop:
for j in range(start,stop+1):
print >> out, chr+"\t"+str(j)+"\t0"
out.close()
view raw depth_fill.py hosted with ❤ by GitHub


And below is an example of how to use it:

# Define BAM Region
chr=1
start=247000
stop=257000
coord=`echo "chr$chr:$start-$stop"`
# Get Depth for BAM Region
samtools depth -r $coord id_1.bam > id_1.txt
samtools depth -r $coord id_2.bam > id_2.txt
samtools depth -r $coord id_3.bam > id_3.txt
# Zero Fill Depth Output
python depth_fill.py id_1.txt $start $stop
python depth_fill.py id_2.txt $start $stop
python depth_fill.py id_3.txt $start $stop
# Merge Depth Files
cut -f 3 id_2.depth > id_2.temp
cut -f 3 id_3.depth > id_3.temp
cat id_1.depth id_2.temp id_3.temp > all_ids.depth
view raw merge_depth.sh hosted with ❤ by GitHub

No comments:

Post a Comment