Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.
Showing posts with label header. Show all posts

Friday, November 22, 2013

One Line Command to Remove Header from File in UNIX

Sometimes you need to remove a header, footer, or range of lines from a file to make it easier to work with in UNIX. Here are some quick one-liners to do so:
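A minimal sketch of such one-liners, using standard `tail` and `sed`. The file names (`example.txt` and the outputs) are placeholders I made up for illustration, and the first line only builds a toy file to try the commands on.

```shell
# Build a small example file: a header, three data rows, and a footer.
printf 'header\nrow1\nrow2\nrow3\nfooter\n' > example.txt

# Remove the header: print everything starting at line 2.
tail -n +2 example.txt > no_header_tail.txt

# Same result with sed: delete line 1.
sed '1d' example.txt > no_header_sed.txt

# Remove the footer: delete the last line ($ means "last line" in sed).
sed '$d' example.txt > no_footer.txt

# Remove a range of lines: delete lines 2 through 4.
sed '2,4d' example.txt > no_range.txt
```

On a real file you would skip the `printf` line and point the commands at your own file. For a multi-line header, change the numbers, for example `tail -n +4` to drop the first three lines.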



Thursday, June 20, 2013

Does a BAM File Use hg18 or hg19 Coordinates?

How do you tell which coordinates are used in a .bam file? Well, it's pretty easy. Just pull up the .bam file header using Samtools (samtools view -H).
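As a sketch, here is that command wrapped in a small shell function. The function name `bam_header` and the file name `sample.bam` are placeholders of mine, and the snippet assumes Samtools is installed and on your PATH.

```shell
# Print only the header of a .bam file.
# $1 is the path to the .bam file.
bam_header() {
    # -H tells samtools view to output the header and skip the alignments.
    samtools view -H "$1"
}

# Usage (needs samtools and a real file):
#   bam_header sample.bam
```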


Then check out the rows that begin with @SQ followed by SN:chr## and LN:########. SN is the chromosome name and LN is its length in base pairs. Compare the lengths of a few of the chromosomes to the list of lengths below. Whichever list the lengths match tells you which build's coordinates the .bam file uses.
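If you would rather not compare by eye, here is a rough sketch that checks only chr1. It assumes the UCSC chr1 lengths of 247249719 for hg18 and 249250621 for hg19. The function name `guess_build` is hypothetical, and it reads header text on standard input so it can be fed from `samtools view -H`.

```shell
# Read a SAM/BAM header on stdin and report which build the chr1 length matches.
# Assumed chr1 lengths: hg18 = 247249719, hg19 = 249250621.
guess_build() {
    awk -F '\t' '
        $1 == "@SQ" {
            name = ""; len = ""
            # Header fields are TAG:value pairs; pick out SN (name) and LN (length).
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            # Accept both "chr1" and "1" naming styles.
            if (name == "chr1" || name == "1") {
                if (len == "247249719") build = "hg18"
                else if (len == "249250621") build = "hg19"
                else build = "unknown"
            }
        }
        END { print (build == "" ? "chr1 not found" : build) }'
}

# Usage (needs samtools and a real file):
#   samtools view -H sample.bam | guess_build
```

Checking a single chromosome is a shortcut. If the answer is "unknown", the file may use a different build or species, so it is worth looking at the @SQ lines directly.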


Hope this is helpful in determining which coordinates are used in your .bam files.

Thursday, May 9, 2013

Change .bam File Header

Each .bam file has an important header that describes a number of characteristics about the read sequences it contains. The header is usually multiple lines and has information on the chromosomes and samples included in the .bam file. Samtools can be used to view the header of a .bam file with the following command.
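The command in question is `samtools view -H`, where `-H` prints the header only. As a small sketch on top of it, the helper below counts how many header lines of each record type there are (for example @HD, @SQ, @RG, @PG). The function name `count_header_records` and the file name `sample.bam` are placeholders I chose.

```shell
# Count header lines by record type.
# Reads header text on stdin; the record type is the first tab-separated field.
count_header_records() {
    awk -F '\t' '{ n[$1]++ } END { for (t in n) print t, n[t] }' | sort
}

# Usage (needs samtools and a real file):
#   samtools view -H sample.bam                          # view the full header
#   samtools view -H sample.bam | count_header_records   # summarize it
```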

If the need arises, Samtools can also be used to modify the header of a .bam file.  Samtools uses the reheader command to do this.  Below is an example where I changed the length of chrM in the header from 16569 to 16571 base pairs.
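A sketch of how that chrM change could be done: dump the header, edit the length, then hand the edited header to `samtools reheader`, which writes the reheadered .bam to standard output. The function names and file names (`new_header.sam`, the input and output .bam files) are placeholders of mine, and the snippet assumes the sequence is named chrM in your header.

```shell
# Filter: in header text on stdin, change the chrM length from 16569 to 16571.
fix_chrm_length() {
    awk 'BEGIN { FS = OFS = "\t" }
         $1 == "@SQ" && $2 == "SN:chrM" {
             for (i = 3; i <= NF; i++) if ($i == "LN:16569") $i = "LN:16571"
         }
         { print }'
}

# $1 = input .bam, $2 = output .bam
reheader_bam() {
    # Write the edited header to a temporary SAM header file.
    samtools view -H "$1" | fix_chrm_length > new_header.sam
    # Replace the header; the alignments themselves are left unchanged.
    samtools reheader new_header.sam "$1" > "$2"
}

# Usage (needs samtools and a real file):
#   reheader_bam original.bam reheadered.bam
```

Note that this only rewrites the header. It does not shift any read coordinates, so it is only appropriate when the header is what needs correcting.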

Merge Multiple .bam Files

I had multiple .bam files from different subjects I wanted to merge into one master .bam file. It seemed like Samtools would easily combine these files using the merge option, but after reading a few online posts it looks like there are some issues with creating an appropriate header for the new merged .bam file from all the original .bam files. Picard seemed to be the best way to do this. Here's some code that uses Picard to merge multiple .bam files.
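A sketch of the Picard call, using the MergeSamFiles tool with one `I=` argument per input file and `O=` for the output. The location of `picard.jar`, the function names, and the file names are assumptions on my part. Older Picard releases shipped each tool as its own jar (MergeSamFiles.jar), so adjust the `java -jar` part to match your install.

```shell
# Print one I=<file> argument per input .bam, one per line.
picard_inputs() {
    for f in "$@"; do
        printf 'I=%s\n' "$f"
    done
}

# $1 = output .bam; all remaining arguments = input .bam files.
merge_bams() {
    out="$1"; shift
    # The unquoted $(...) is split into separate arguments on purpose,
    # so input file names must not contain spaces.
    java -jar picard.jar MergeSamFiles $(picard_inputs "$@") O="$out"
}

# Usage (needs Java, Picard, and real files):
#   merge_bams merged.bam subject1.bam subject2.bam subject3.bam
```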