Thursday, November 7, 2013

How to Fully Utilize All Cores of a UNIX Compute Node

Parallelizing tasks can drastically reduce the run time of a program.  One way to do this is to make sure your code uses all available cores of a processor, which means writing it so that several tasks run simultaneously in the background.  In a shell script this is done by appending an ampersand (&) to a command: the shell starts the command in the background and immediately moves on to the next line.  If the number of background tasks running equals the number of cores on the compute node, you are fully utilizing the resource.

The final necessary piece is the wait command.  It tells the shell to pause until all of its background tasks have completed before moving on to the next line of code.  The wait command is what keeps the number of background tasks from exceeding the number of processor cores.  If you launch many more tasks than cores, the tasks compete with each other for CPU time and memory, which can slow everything down.  To prevent this, the idea is to submit a number of tasks equal to the number of cores, use wait to let those jobs finish, and only then submit more tasks.  Here is some example code I put together to utilize all 8 cores of a node when running 1,000 permutations of a program:
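A minimal sketch of the batch-and-wait pattern described above. The function `run_permutation`, the variable names, and the temporary output directory are all hypothetical stand-ins, so swap in the real program and paths:

```bash
#!/bin/bash
# Sketch: run N_PERM permutations, N_CORES at a time.
# run_permutation is a hypothetical placeholder for the real program.

N_CORES=8
N_PERM=1000
OUT_DIR=$(mktemp -d)     # assumed scratch location for the results

run_permutation() {
    # Replace this line with the real command for permutation number $1
    echo "permutation $1" > "$OUT_DIR/perm_$1.txt"
}

for ((i = 1; i <= N_PERM; i++)); do
    run_permutation "$i" &        # & sends the job to the background
    if (( i % N_CORES == 0 )); then
        wait                      # pause until the current batch has finished
    fi
done
wait                              # catch a partial final batch, if any

echo "Results written to $OUT_DIR"
```

The second wait after the loop does nothing when the number of permutations divides evenly by the number of cores, as it does here, but it keeps the script correct for other counts.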

This will run a script with the following commands:
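Written out in full, the commands take roughly the following shape. This is a sketch under assumptions: `permute` is a hypothetical stand-in for the real program, and only the first two batches of eight are shown.

```bash
#!/bin/bash
OUT_DIR=$(mktemp -d)     # assumed scratch location for the results

# Hypothetical stand-in for the real program; $1 is the permutation number
permute() { echo "permutation $1" > "$OUT_DIR/perm_$1.txt"; }

# Batch 1: one background job per core
permute 1 &
permute 2 &
permute 3 &
permute 4 &
permute 5 &
permute 6 &
permute 7 &
permute 8 &
wait    # block until all eight jobs have exited

# Batch 2: submitted only after batch 1 has finished
permute 9 &
permute 10 &
permute 11 &
permute 12 &
permute 13 &
permute 14 &
permute 15 &
permute 16 &
wait

# ...and so on, in batches of eight, through permutation 1000
```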

Of course, this could be parallelized further by running permutations simultaneously on different compute nodes as well. I hope this example helps you fully utilize your compute nodes and speed up your processing time.


  1. There is one more thing you can do:
    Let's say you are doing 800 permutations and you have 1 node with 8 cores. To make it more efficient, you can put 100 commands, separated by ";", on each of 8 lines, run each line in the background, and then put a single wait after those 8 lines. This way you don't wait after every 8 commands, only once at the end. If all the permutations take the same amount of time, then what you have done is fine, but if they take unequal amounts of time, this little trick can save some more time.

  2. Good tip, but I think parallelization is much easier to achieve using Linux Makefiles.

    1. Great tip, Christian! Can you provide a link in the comments to a resource on how to do the Linux makefiles approach? Thanks.
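The first comment's suggestion (one long chain of commands per core, with a single wait at the end) can be sketched as follows. As before, `run_permutation` and the variable names are hypothetical stand-ins. One detail to watch: in bash, `a; b; c &` backgrounds only `c`, so each chain has to be grouped, here with a parenthesized subshell, before the `&` is applied.

```bash
#!/bin/bash
# Sketch: split N_PERM permutations into N_CORES sequential chains,
# run the chains in parallel, and wait once at the end.

N_CORES=8
N_PERM=800
PER_CORE=$(( N_PERM / N_CORES ))   # 100 permutations per chain
OUT_DIR=$(mktemp -d)               # assumed scratch location for the results

# Hypothetical stand-in for the real program; $1 is the permutation number
run_permutation() { echo "permutation $1" > "$OUT_DIR/perm_$1.txt"; }

for ((c = 0; c < N_CORES; c++)); do
    (
        # This subshell runs its share of permutations one after another
        start=$(( c * PER_CORE + 1 ))
        end=$(( (c + 1) * PER_CORE ))
        for ((i = start; i <= end; i++)); do
            run_permutation "$i"
        done
    ) &                            # the whole chain goes to the background
done
wait                               # single wait for all eight chains

echo "Results written to $OUT_DIR"
```

With this layout a core starts its next permutation as soon as the previous one finishes, rather than idling until the slowest job in a batch of eight is done.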