Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.

Tuesday, April 1, 2014

Create Empty Data Frame in R with Specified Dimensions

Sometimes it is necessary to create an empty data frame in R to fill with output.  An example would be a for loop that iterates over SNPs and calculates an association p-value for each one, storing the results in the empty data frame.  The matrix functionality in R makes this easy to do.  A call such as as.data.frame(matrix(NA, nrow=100, ncol=2)) creates a 100 by 2 matrix and converts it into a data frame with 100 rows and 2 column variables.  The nrow and ncol options are what designate the size of the resulting data frame.  The data frame is filled with NA values so that any cells that are never filled in are treated as missing values.

Thursday, February 20, 2014

Zero Fill Coverage Gaps in Samtools Depth Output

Merging depth output from multiple .bam files can be difficult since, by default, Samtools only outputs depth counts for coordinates with non-zero coverage (newer Samtools releases also offer a -a option for samtools depth that reports every position).  If you want to merge depth output from these .bam files, you first need to fill in the base pair positions with no coverage with zero values so the depth output for all .bam files is the same length.  Then, using a simple UNIX cat, you can merge the depth output of multiple .bam files into one file for comparison and analysis.

Here is a simple Python script to zero fill Samtools depth output:
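
As a minimal sketch of the zero-filling idea (not necessarily the original script), the function below assumes single-file samtools depth output with three tab-separated columns (chromosome, 1-based position, depth).  The function name zero_fill_depth and its chrom, start, and end parameters are hypothetical names chosen for illustration.

```python
def zero_fill_depth(lines, chrom, start, end):
    """Yield (chrom, pos, depth) for every position in [start, end].

    Positions that do not appear in the input are reported with depth 0.
    Assumes three tab-separated columns: chromosome, position, depth.
    """
    depths = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and lines for other chromosomes.
        if len(fields) < 3 or fields[0] != chrom:
            continue
        pos = int(fields[1])
        if start <= pos <= end:
            depths[pos] = int(fields[2])
    # Walk the full interval, filling gaps with zero.
    for pos in range(start, end + 1):
        yield chrom, pos, depths.get(pos, 0)


# Small in-memory example standing in for a depth file.
example = ["chr1\t101\t5\n", "chr1\t102\t7\n", "chr1\t105\t2\n"]
filled = list(zero_fill_depth(example, "chr1", 100, 106))
for chrom, pos, depth in filled:
    print(f"{chrom}\t{pos}\t{depth}")
```

For a real file, an open file handle of saved depth output can be passed in place of the example list.  Because every .bam file is filled over the same interval, the resulting outputs all have the same number of lines.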



And below is an example of how to use it:

Thursday, January 23, 2014

Formatting Excel Cells with Zero Filling

While Microsoft Excel could use some improvements for data management and analysis, it remains my program of choice for putting together summary tables, particularly the descriptive statistics of most Table 1's.  I'm pretty particular with formatting and wanted to create a Table 1 with column percentages that all lined up nicely.  To do this I needed to zero fill numbers both before and after the decimal point so that each number, when formatted, took up the same amount of space in the column.  In the past I would do this by pasting the table as text (without formulas) and then manually filling in zeros.  This was tedious, especially when I had to redo tables after sample numbers changed.

Today I found out there is a way to have Excel automatically include these zeros.  You can do this by creating a custom number format.  Here's how to do so.

1). Right click on the cell you want to format and choose Format Cells...
2). Click the Number tab and select Custom in the Category: list.
3). Put in your desired formatting.  You can do this by building off other format types.  In my case, I wanted to have a format so that the numbers 3.5562 and 55 appeared as (03.6) and (55.0), respectively.  To do this the Type: box needed to have the format (00.0).  This will zero fill both before and after the decimal point as well as round all numbers to one decimal place.
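
To check what a format like (00.0) should produce before applying it to a whole table, the same padding and rounding can be sketched outside Excel.  The snippet below is only an analogy in Python, not Excel's own formatting engine, and the helper name zero_fill_percent is hypothetical.

```python
def zero_fill_percent(value):
    """Mimic the Excel custom format (00.0).

    Width 4 with zero padding gives two digits before the decimal point,
    and .1f rounds to one digit after it.
    """
    return f"({value:04.1f})"


for number in (3.5562, 55):
    print(number, "->", zero_fill_percent(number))
```

With the two numbers from the example above, this gives (03.6) and (55.0), matching the output described for the Excel format.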

As you can imagine, you can customize this to zero fill based on your particular needs or desired format type.  Below is an example Table 1 excerpt to show how the formatting looks.