Wednesday, June 26, 2013

Remove a List of Reads from a BAM File

Sometimes it is necessary to remove a subset of reads from a .bam file. In my case, I wanted to remove a few chimeric reads in which reads from different amplicons appeared to have fused together before entering the sequencer. Here is a line of code where I use Samtools and grep to remove a list of read IDs from the original .bam file and create a new, filtered .bam file. I hope it is useful for other applications as well.
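
The approach described above can be sketched roughly as follows. This is a minimal sketch under a few assumptions: the function names are hypothetical, the read IDs are assumed to sit in a plain text file with one ID per line and no blank lines, and the grep flags shown are one reasonable choice rather than the only one.

```shell
# Sketch only: function names and input file names are illustrative placeholders.

# Read SAM text on stdin and drop every line that contains a listed read ID.
#   -v  keep the lines that do NOT match
#   -w  match whole words only, so an ID that is a prefix of a longer ID is not matched
#   -F  treat each ID as a fixed string, not a regular expression
#   -f  read the IDs from the file given as the first argument
drop_listed_reads() {
    grep -v -w -F -f "$1"
}

# Stream a BAM file as SAM (header included), filter it, and write a new BAM file.
# The final "-" makes the second samtools call read from the pipe.
filter_bam() {
    samtools view -h "$1" | drop_listed_reads "$2" | samtools view -b -o "$3" -
}
```

With placeholder input names, calling `filter_bam sample1.bam read_ids_to_remove.txt sample1_filter.bam` would write the filtered file. Samtools 0.1.x releases also needed `-S` on the second `samtools view` to read SAM input; newer releases detect the input format automatically.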


Note the trailing hyphen at the end: it tells samtools to read its input from standard input, that is, from the pipe.

6 comments:

  1. Thank you so much. I learned that downstream tools won't work when lines are removed manually from SAM files that are then converted to BAM and indexed. It's much better to use commands for such tasks.

    1. HeMan, glad you found the post helpful!

  2. Hello!
    I am trying to use this script for my RNA-seq project.
    I have an accepted_hits.bam file from TopHat, and to create a count table with HTSeq I need to remove three reads with the following read names from this BAM file:
    1) HWI-ST538:357:D2BKUACXX:1:1105:13318:13823.
    2) HWI-ST538:357:D2BKUACXX:1:1107:19710:10717
    3) HWI-ST538:357:D2BKUACXX:1:2314:13745:61117
    After running the script above, samtools creates a 0-byte sample1_filter.bam file.
    How can I fix this issue?
    Thank you for your help.

  3. May I ask what's the purpose of the trailing hyphen at the end?
