Genome Toolbox: Samtools Download BAM Region Only

Sunday, May 5, 2013

Samtools Download BAM Region Only

Publicly available sequencing data can often serve as a useful reference for a sequencing project. The 1000 Genomes Project is a great source, especially with its newly released high-coverage Complete Genomics data. The example UNIX script for this post shows how to create BAM files containing only the genomic regions of interest from whole-genome BAM files hosted on an FTP server, without having to download each entire BAM file first. The bai_file_list.txt file contains a unique identifier for each BAM file, extracted from a previously selected list of BAI files of interest. Here I am extracting just the BRCA1 and BRCA2 regions of the genome. The extracted reads are sorted and saved as a BAM file, and an associated BAI index file is created. The final step removes the extra BAI files that Samtools downloads and uses to extract the regions of interest from the BAM files on the FTP server.
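
The steps described above can be sketched roughly as follows. This is a minimal sketch, not the original script: the FTP base URL and the `<identifier>.bam` naming scheme are placeholders I made up, the helper names `remote_bam_url` and `extract_regions` are hypothetical, and the BRCA1/BRCA2 coordinates are approximate GRCh37/hg19 gene boundaries that should be checked against the genome build and chromosome naming of the BAM files you actually use. It also assumes a Samtools build that can read remote files.

```shell
#!/bin/sh
# Sketch: extract only the BRCA1/BRCA2 reads from remote whole-genome BAMs.
set -eu

# ASSUMPTION: placeholder URL; replace with the real FTP directory.
BASE_URL="${BASE_URL:-ftp://ftp.example.org/path/to/bams}"
# ASSUMPTION: approximate GRCh37/hg19 coordinates for BRCA1 and BRCA2.
# Check whether the BAM header uses "17" or "chr17" style names.
REGIONS="${REGIONS:-17:41196312-41277500 13:32889617-32973809}"
ID_LIST="${ID_LIST:-bai_file_list.txt}"

# Build the remote BAM URL for one identifier (hypothetical naming scheme).
remote_bam_url() {
    printf '%s/%s.bam\n' "$BASE_URL" "$1"
}

# Extract, sort, and index the regions of interest for one identifier.
extract_regions() {
    sample_id=$1
    # Samtools downloads the remote .bai into the working directory and
    # then reads only the parts of the BAM that overlap the regions.
    # $REGIONS is unquoted on purpose so each region is a separate argument.
    samtools view -b "$(remote_bam_url "$sample_id")" $REGIONS \
        > "${sample_id}.unsorted.bam"
    samtools sort -o "${sample_id}.BRCA.bam" "${sample_id}.unsorted.bam"
    samtools index "${sample_id}.BRCA.bam"
    # Remove the intermediate file and the downloaded whole-genome index.
    rm -f "${sample_id}.unsorted.bam" "${sample_id}.bam.bai"
}

# Process every identifier in the list, if the list file is present.
if [ -f "$ID_LIST" ]; then
    while IFS= read -r id || [ -n "$id" ]; do
        [ -n "$id" ] || continue
        extract_regions "$id"
    done < "$ID_LIST"
fi
```

The `samtools sort -o` form is the current syntax; older 0.1.x releases expect `samtools sort in.bam out_prefix` instead. The base URL and regions can be overridden from the environment, for example `BASE_URL=ftp://host/dir REGIONS="17:41196312-41277500" sh extract_brca.sh`, where `extract_brca.sh` is whatever name the script is saved under.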
