Sunday, June 1, 2014

Generate Random Genomic Positions

Generating random genomic positions or coordinates can be useful for comparing the characteristics of a set of genomic loci to what would be expected from permutations of the underlying genomic distribution. Below is a Python script to aid in selecting random genomic positions. The script chooses a chromosome with probability proportional to its length and then draws a position from a uniform distribution along that chromosome. A gap check is included to ensure the chosen position lies within the accessible genome. You can choose the number of positions you want, the number of permutations to conduct, the size of the genomic regions, and the genomic build of interest. A UNIX shell script is included as a wrapper to automatically download the needed chromosomal gap and cytoband files and to run the Python script. Usage for the UNIX script can be seen by typing ./make_random.sh from the command line after giving the script executable privileges. An example command would be ./make_random.sh 100 10 1000 hg19. This command would make 10 .bed files, each with 100 random 1 kb genomic regions from the hg19 genome build. Below are the make_random.sh and make_random.py scripts.
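To make the sampling logic concrete, here is a minimal sketch of the approach described above. It is not the make_random.py script itself. The chromosome lengths, gap intervals, and function names (overlaps_gap, random_regions) are made up for illustration, and it assumes Python 3.6 or later for random.choices. A real run would read the lengths and gaps from the downloaded files for the chosen build.

```python
import random

# Hypothetical toy inputs for illustration only; these are NOT real hg19 values.
CHROM_LENGTHS = {"chr1": 1000000, "chr2": 500000}
GAPS = {"chr1": [(0, 10000), (400000, 450000)], "chr2": [(0, 5000)]}


def overlaps_gap(chrom, start, end, gaps):
    """Return True if the half-open interval [start, end) overlaps a gap on chrom."""
    return any(start < g_end and end > g_start
               for g_start, g_end in gaps.get(chrom, []))


def random_regions(n, size, lengths, gaps, rng=None, max_tries=100000):
    """Draw n random regions of the given size that avoid all gaps.

    A chromosome is picked with probability proportional to its length,
    then a start position is drawn uniformly along that chromosome.
    Draws that overlap a gap are rejected and redrawn.
    """
    rng = rng or random.Random()
    chroms = sorted(lengths)
    weights = [lengths[c] for c in chroms]
    regions = []
    tries = 0
    while len(regions) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place all regions outside gaps")
        chrom = rng.choices(chroms, weights=weights, k=1)[0]
        if lengths[chrom] < size:
            continue  # region cannot fit on this chromosome
        start = rng.randint(0, lengths[chrom] - size)  # inclusive bounds
        end = start + size
        if overlaps_gap(chrom, start, end, gaps):
            continue  # reject positions in the inaccessible genome
        regions.append((chrom, start, end))
    return regions


# Example: 100 regions of 1 kb, printed as tab-delimited BED lines.
for chrom, start, end in random_regions(100, 1000, CHROM_LENGTHS, GAPS)[:3]:
    print("\t".join([chrom, str(start), str(end)]))
```

Rejection sampling keeps the code short, but it assumes gaps cover only a small fraction of each chromosome. The max_tries guard stops the loop if that assumption does not hold.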



3 comments:

  1. Hi,

    Thank you for the function but what about chrX and chrY?
    I see that all the BED files do not include any locations from these chromosomes.

    Thanks!

    1. Hi Michal,

      You are correct: this will only produce random genomic coordinates for the autosomes and not the sex chromosomes. The code can be modified to include the sex chromosomes. In the UNIX shell script, for hg18, include the chrX and chrY gap files. In the Python script, comment out lines 23 and 24. I haven't tested this, but it should get you pretty close to producing what you need.

  2. I tried this script and this is the error I got:

    Genome Build: hg19
    Fetching gap and length files...
    Download complete.
    Traceback (most recent call last):
    File "make_random.py", line 72, in ?
    def sort_coords(coords,cols=itemgetter(1,2)):
    TypeError: itemgetter expected 1 arguments, got 2
    Program completed.

    Any idea why? Thanks.
