
Tuesday, July 9, 2013

Calculate Minor Allele Frequencies from VCF File Variants

Today I needed to calculate minor allele frequencies (MAFs) for sequence variants called in a .vcf file. I couldn't find any program that would do this for me, so I wrote a quick Python script to do it.


This can be run in python from the command prompt by typing:


where project.vcf is the VCF file you want to calculate MAFs for. It will return a project.txt file that contains the calculated MAF values. This script only works for SNPs; it does not handle insertions and deletions.
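For readers who want to see the idea in code, here is a minimal sketch of a MAF calculation of this kind. It is not the original script: the function names (`snp_maf`, `write_mafs`) and the output columns are my own assumptions, and it assumes biallelic SNPs with a GT entry in the FORMAT column. Missing genotypes (`.`) are skipped.

```python
import os
import sys

BASES = {"A", "C", "G", "T"}


def snp_maf(vcf_line):
    """Return (chrom, pos, ref, alt, maf) for a biallelic SNP line, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/genotype columns to count alleles from
    chrom, pos, _id, ref, alt = fields[:5]
    # Keep only SNPs: one-base REF and exactly one one-base ALT.
    if ref.upper() not in BASES or alt.upper() not in BASES:
        return None
    fmt = fields[8].split(":")
    if "GT" not in fmt:
        return None
    gt_index = fmt.index("GT")
    ref_count = alt_count = 0
    for sample in fields[9:]:
        parts = sample.split(":")
        if gt_index >= len(parts):
            continue
        # Genotypes look like 0/1 (unphased) or 0|1 (phased).
        for allele in parts[gt_index].replace("|", "/").split("/"):
            if allele == "0":
                ref_count += 1
            elif allele == "1":
                alt_count += 1
    total = ref_count + alt_count
    if total == 0:
        return None  # every genotype was missing
    alt_freq = alt_count / total
    return chrom, pos, ref, alt, min(alt_freq, 1 - alt_freq)


def write_mafs(vcf_path, out_path):
    """Write one tab-separated MAF row per SNP in vcf_path to out_path."""
    with open(vcf_path) as vcf, open(out_path, "w") as out:
        out.write("CHROM\tPOS\tREF\tALT\tMAF\n")
        for line in vcf:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            result = snp_maf(line)
            if result is not None:
                chrom, pos, ref, alt, maf = result
                out.write(f"{chrom}\t{pos}\t{ref}\t{alt}\t{maf:.4f}\n")


if __name__ == "__main__" and len(sys.argv) == 2:
    # project.vcf -> project.txt
    write_mafs(sys.argv[1], os.path.splitext(sys.argv[1])[0] + ".txt")
```

Saved under a hypothetical name such as vcf_maf.py, the sketch would be run as `python vcf_maf.py project.vcf` and would write project.txt next to the input.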

Alternatively, if Python scares you, there is a slightly roundabout way to do this too. First use vcftools to convert your .vcf file into PLINK-compatible .ped and .map files.


Then, open PLINK and run the --freq option on the newly created .ped file.
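The two steps above can be sketched as a small Python wrapper that builds both command lines and, if asked, runs them. The function names and the `project` prefix are my own assumptions; the flags (`--vcf`, `--plink`, `--out` for vcftools and `--file`, `--freq`, `--out` for PLINK) are the standard ones, though older PLINK 1.07 installs may also need `--noweb`.

```python
import shutil
import subprocess


def plink_freq_commands(vcf_path, prefix):
    """Build the two command lines: VCF -> PED/MAP, then PLINK --freq."""
    # Shell equivalent: vcftools --vcf project.vcf --plink --out project
    convert = ["vcftools", "--vcf", vcf_path, "--plink", "--out", prefix]
    # Shell equivalent: plink --file project --freq --out project
    # (--file reads prefix.ped and prefix.map; results go to prefix.frq)
    freq = ["plink", "--file", prefix, "--freq", "--out", prefix]
    return [convert, freq]


def run_pipeline(vcf_path="project.vcf", prefix="project"):
    """Run both steps in order, stopping at the first failure."""
    for cmd in plink_freq_commands(vcf_path, prefix):
        if shutil.which(cmd[0]) is None:
            raise FileNotFoundError(f"{cmd[0]} was not found on the PATH")
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` in a directory containing project.vcf should leave a project.frq file whose MAF column holds the values.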

**UPDATE**
Today I found a way to use vcftools to calculate the MAF values directly. It just takes the --freq option. Here is some example code:
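A typical invocation would be `vcftools --vcf project.vcf --freq --out project`, which writes project.frq. That file lists the frequency of every allele at each site rather than a single MAF column, so for a biallelic site the MAF is the smaller of the two values. Below is a minimal sketch of pulling it out in Python; the function name `maf_from_frq_line` is my own assumption, and sites that are not biallelic are simply skipped.

```python
import math


def maf_from_frq_line(line):
    """Return (chrom, pos, maf) for a biallelic .frq data line, else None.

    Expected tab-separated layout of a vcftools .frq data line:
    CHROM  POS  N_ALLELES  N_CHR  ALLELE:FREQ  ALLELE:FREQ ...
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 6:
        return None  # not a biallelic site (or not a data line)
    chrom, pos = fields[0], int(fields[1])
    # Each entry looks like "A:0.9"; keep the number after the last colon.
    freqs = [float(entry.rsplit(":", 1)[1]) for entry in fields[4:]]
    if any(math.isnan(f) for f in freqs):
        return None  # no called genotypes at this site
    return chrom, pos, min(freqs)


def read_mafs(frq_path):
    """Collect (chrom, pos, maf) tuples from a .frq file, skipping the header."""
    results = []
    with open(frq_path) as frq:
        next(frq, None)  # header line starts with CHROM
        for line in frq:
            row = maf_from_frq_line(line)
            if row is not None:
                results.append(row)
    return results
```

For example, `read_mafs("project.frq")` would return one tuple per biallelic site in the file.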