Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.
Showing posts with label count. Show all posts

Monday, July 7, 2014

Friday, June 13, 2014

Find Total, Mapped, and Unmapped Alignments in a BAM File

Often one of the first descriptive statistics of interest for a .bam file is the total number of alignments it contains. An alignment is where a read from a next-generation sequencing run maps to the reference genome. There are a few ways to count the mapped, unmapped, and total alignments, but in my opinion samtools provides the most powerful and efficient means of doing this. Here are some simple example scripts to count total alignments and total reads in a .bam file.
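As a rough illustration of what these counts mean, here is a minimal Python sketch (not the samtools approach itself) that tallies total, mapped, and unmapped records from SAM-formatted text, such as the output of `samtools view`. It relies on the SAM FLAG field, where bit 0x4 marks a record as unmapped. The function name `count_alignments` and the sample records are made up for this example.

```python
# Bit 0x4 of the SAM FLAG field is set when the read is unmapped.
UNMAPPED_FLAG = 0x4


def count_alignments(sam_lines):
    """Count total, mapped, and unmapped records in SAM-formatted lines."""
    total = mapped = unmapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with "@".
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        total += 1
        if flag & UNMAPPED_FLAG:
            unmapped += 1
        else:
            mapped += 1
    return {"total": total, "mapped": mapped, "unmapped": unmapped}


# Hypothetical records for illustration only (not real data).
example = [
    "@HD\tVN:1.6",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tchr1\t250\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_alignments(example))
```

To use it on a real file, the same function could be fed `sys.stdin` with the output of `samtools view` piped in. For large files, samtools is the faster route: `samtools view -c` counts all records, and adding `-F 4` or `-f 4` restricts the count to mapped or unmapped records respectively.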

Thursday, December 12, 2013

Sum Overlapping Base Pairs of Features from Chromosomal BED File

I had a .bed file of genomic features on a chromosome and wanted to measure how much the features overlap, both to investigate commonly covered genes and to find positions where features were likely to form. The goal was a plot similar to a coverage depth plot from next-generation sequencing reads. I am sure more efficient methods exist, but here is some Python code that takes in a .bed file of features (features.bed) and creates an output file (features.depth) with the feature overlap "depth" every 5,000 base pairs across the areas that contain features in your chromosomal .bed file.
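The idea can be sketched roughly as follows. This is a minimal illustration under a few assumptions, not necessarily the original script: the .bed file holds features from a single chromosome, coordinates are treated as 0-based and half-open (the BED convention), and depth is sampled at multiples of the step size between the first feature start and the last feature end. The function names and the tab-separated output layout are hypothetical.

```python
def read_bed(path):
    """Read (start, end) pairs from the first three columns of a BED file."""
    features = []
    with open(path) as handle:
        for line in handle:
            # Skip blank lines and common BED header lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            _chrom, start, end = line.split()[:3]
            features.append((int(start), int(end)))
    return features


def overlap_depth(features, step=5000):
    """Return (position, depth) pairs sampled every `step` base pairs."""
    if not features:
        return []
    first = min(start for start, _ in features)
    last = max(end for _, end in features)
    pos = -(-first // step) * step  # first multiple of step at or after `first`
    depths = []
    while pos < last:
        # A feature covers pos when start <= pos < end (half-open interval).
        depth = sum(1 for start, end in features if start <= pos < end)
        depths.append((pos, depth))
        pos += step
    return depths


def write_depth(depths, path):
    """Write one tab-separated 'position<TAB>depth' line per sampled position."""
    with open(path, "w") as out:
        for pos, depth in depths:
            out.write(f"{pos}\t{depth}\n")
```

Calling `write_depth(overlap_depth(read_bed("features.bed")), "features.depth")` would produce the depth file. Note that this sketch also reports positions with a depth of zero between features, and it checks every feature at every sampled position, which is fine for a few thousand features but slow for very large files.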