Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.
Showing posts with label merge. Show all posts

Friday, April 4, 2014

Merge Changes from Multiple Word Files into One Document

Collaborations get you access to lots of data. However, collaborations lead to long author lists; long author lists lead to many comments from co-authors; and many comments from co-authors can lead to great headaches when you try to track changes and assemble a final clean manuscript. Fortunately, Microsoft Word has a built-in feature that lets users merge changes from many different contributors into one master document (.doc or .docx file). The merge is done iteratively, two documents at a time, until all the comments from reviewers are in one merged MS Word document. To do this, follow these steps:


1) Open a blank document in Microsoft Word
2) Go to the Review tab and click the Compare icon and then select Combine....
3) In the dialogue box that pops up, input your original file name in the Original document field and one of the changed document file names into the Revised document field.
4) Click the More button and ensure that the radio button next to Original document is selected under the Show changes in... heading.
5) Click OK and a document will be generated that merges changes from your original and revised document.
6) Repeat steps 2-5 for each remaining revised document you want to combine with the merged document.

It is a bit repetitive, but eventually all the changes from each file will be combined and tracked into one master document.  Ideally, the developers at Microsoft will improve the functionality of this so that many changes from many documents can be merged into one document in a single step.  A final note is that Word can only store one set of formatting changes at a time, so if formatting does change from draft to draft a dialogue box will appear asking you which formatting you want to use.  Hope this saves you a lot of time and frustration.

Thursday, February 20, 2014

Zero Fill Coverage Gaps in Samtools Depth Output

Merging depth output from multiple .bam files can be difficult since Samtools only outputs depth counts for coordinates with non-zero coverage. To merge depth output from these .bam files, you first need to fill in the base pair positions that have no coverage with zero values, so that the depth output for every .bam file is the same length. Then a simple UNIX command (for example, paste for side-by-side columns) can merge the depth output from multiple .bam files into one file for comparison and analysis.
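
The final merge step can be sketched in Python as follows. This is only an illustration under stated assumptions: every depth file has already been zero filled over the same region, so line i of each file describes the same position, and each line has three tab-separated columns (chromosome, position, depth). The function name merge_depth_columns is hypothetical.

```python
def merge_depth_columns(per_sample_lines):
    """Join zero-filled depth outputs side by side.

    per_sample_lines: one list of "chrom<TAB>pos<TAB>depth" lines per sample.
    Returns lines of the form "chrom<TAB>pos<TAB>depth1<TAB>depth2...".
    """
    # Split every non-empty line into its tab-separated fields.
    tables = [
        [line.rstrip("\n").split("\t") for line in lines if line.strip()]
        for lines in per_sample_lines
    ]
    # Zero-filled inputs over one region must all have the same number of rows.
    if len({len(table) for table in tables}) > 1:
        raise ValueError("inputs differ in length; zero fill them over the same region first")

    merged = []
    for rows in zip(*tables):
        chrom, pos = rows[0][0], rows[0][1]
        # Guard against files that cover different positions.
        if any(row[0] != chrom or row[1] != pos for row in rows):
            raise ValueError("inputs disagree at %s:%s" % (chrom, pos))
        merged.append("\t".join([chrom, pos] + [row[2] for row in rows]))
    return merged
```

The length and position checks are there because a side-by-side join silently produces misaligned rows when one input still has coverage gaps.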

Here is a simple Python script to zero fill Samtools depth output:
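
A minimal sketch of such a script follows; it is an illustration rather than the original script. It assumes three-column, tab-separated samtools depth output (chromosome, 1-based position, depth) for a single .bam file, and a region supplied by the user. The function name zero_fill and the command-line arguments are hypothetical.

```python
import sys


def zero_fill(depth_lines, chrom, start, end):
    """Yield (chrom, pos, depth) for every position from start to end.

    Positions are 1-based and inclusive. Any position missing from the
    samtools depth input is reported with a depth of 0.
    """
    observed = {}
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3 or fields[0] != chrom:
            continue  # skip blank lines and other chromosomes
        observed[int(fields[1])] = int(fields[2])
    for pos in range(start, end + 1):
        yield chrom, pos, observed.get(pos, 0)


# Assumed command-line form: depth lines on stdin, then chrom, start, end.
if __name__ == "__main__" and len(sys.argv) == 4:
    region_chrom, region_start, region_end = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])
    for row in zero_fill(sys.stdin, region_chrom, region_start, region_end):
        print("\t".join(str(value) for value in row))
```

The sketch holds the observed depths for the region in a dictionary, which keeps the logic simple; for whole-chromosome regions a streaming version that walks the sorted input once would use less memory.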



And below is an example of how to use it:
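
One possible invocation is sketched below in Python. It assumes a zero-fill script saved under the hypothetical name zero_fill.py that reads samtools depth output on standard input and takes the chromosome, start, and end as arguments; that interface is an assumption, not the original script's documented usage. The samtools depth -r option restricts output to a region.

```python
import subprocess
import sys


def depth_command(bam_path, region):
    """Build the samtools depth call for one region (e.g. "chr1:100-200")."""
    return ["samtools", "depth", "-r", region, bam_path]


def zero_fill_command(script_path, region):
    """Build the call to the (hypothetical) zero-fill script for the same region."""
    chrom, _, span = region.rpartition(":")
    start, _, end = span.partition("-")
    return [sys.executable, script_path, chrom, start, end]


def run_zero_fill(bam_path, region, out_path, script_path="zero_fill.py"):
    """Pipe samtools depth into the zero-fill script and write the result to out_path."""
    with open(out_path, "w") as out:
        depth = subprocess.Popen(depth_command(bam_path, region), stdout=subprocess.PIPE)
        fill = subprocess.Popen(zero_fill_command(script_path, region),
                                stdin=depth.stdout, stdout=out)
        depth.stdout.close()  # let samtools see a closed pipe if the script exits early
        fill_status = fill.wait()
        depth_status = depth.wait()
    if depth_status != 0 or fill_status != 0:
        raise RuntimeError("samtools depth or the zero-fill script failed")
```

Calling run_zero_fill once per .bam file with the same region gives depth files of identical length, which is what the merge step needs.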

Monday, May 20, 2013

Merge VCF Files

VCFtools can be used to merge .vcf files using the following command.
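
A minimal sketch of that call, wrapped in Python, is shown below. It relies on the vcf-merge program that ships with VCFtools and on bgzip to compress the result, and it assumes the input files are already bgzip-compressed and tabix-indexed, which vcf-merge requires. The function names and file paths are hypothetical.

```python
import subprocess


def vcf_merge_command(vcf_paths):
    """Build the vcf-merge call for two or more bgzipped, tabix-indexed VCF files."""
    if len(vcf_paths) < 2:
        raise ValueError("need at least two .vcf.gz files to merge")
    return ["vcf-merge"] + list(vcf_paths)


def merge_vcfs(vcf_paths, out_path):
    """Run vcf-merge and write bgzip-compressed output to out_path."""
    with open(out_path, "wb") as out:
        merge = subprocess.Popen(vcf_merge_command(vcf_paths), stdout=subprocess.PIPE)
        bgzip = subprocess.Popen(["bgzip", "-c"], stdin=merge.stdout, stdout=out)
        merge.stdout.close()  # vcf-merge's output is now owned by bgzip
        bgzip_status = bgzip.wait()
        merge_status = merge.wait()
    if merge_status != 0 or bgzip_status != 0:
        raise RuntimeError("vcf-merge or bgzip failed")
```

For example, merge_vcfs(["A.vcf.gz", "B.vcf.gz"], "merged.vcf.gz") corresponds to running vcf-merge A.vcf.gz B.vcf.gz | bgzip -c > merged.vcf.gz in a shell.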

Thursday, May 9, 2013

Merge Multiple .bam Files

I had multiple .bam files from different subjects that I wanted to merge into one master .bam file. It seemed like Samtools would easily combine these files using the merge option, but after reading a few online posts, it looks like there are some issues with creating an appropriate header for the new merged .bam file from all the original .bam files. Picard seemed to be the better method for this. Here's some code that uses Picard to merge multiple .bam files.
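
A minimal sketch of the Picard call, wrapped in Python, might look like the following. It uses Picard's MergeSamFiles tool with one INPUT= argument per .bam file and a single OUTPUT= argument. The single-jar layout (picard.jar) and the file paths are assumptions; older Picard releases shipped a separate MergeSamFiles.jar instead, so adjust the command to your installation.

```python
import subprocess


def picard_merge_command(bam_paths, out_path, picard_jar="picard.jar"):
    """Build a Picard MergeSamFiles call (picard_jar is an assumed location)."""
    if len(bam_paths) < 2:
        raise ValueError("need at least two .bam files to merge")
    command = ["java", "-jar", picard_jar, "MergeSamFiles"]
    command += ["INPUT=" + path for path in bam_paths]  # one INPUT= per file
    command.append("OUTPUT=" + out_path)
    return command


def merge_bams(bam_paths, out_path, picard_jar="picard.jar"):
    """Run the merge and raise CalledProcessError if Picard exits with an error."""
    subprocess.run(picard_merge_command(bam_paths, out_path, picard_jar), check=True)
```

Building the argument list in a separate function makes it easy to print and inspect the command before running it on large files.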