Okay, you have been granted access to a UNIX cluster to work on a project, but have no idea how to get started with things. No problem. This post is designed to clue you in on the essentials so you can get up and running in no time.
SSH Client
This is the first thing you will need. SSH clients are programs that enable your computer to connect remotely to the UNIX cluster. SSH stands for Secure SHell, a network protocol for securely logging in to, and exchanging data with, another computer.
For Windows users, one of the most widely used SSH clients is PuTTY. It is free and easy to set up. Download putty.exe and move it to a spot where you can easily access it. Double-clicking on it may bring up a security warning. Select Run and PuTTY will open. For the Host Name, enter the host name or IP address of the cluster (ex: computer.university.edu). In general, this is all you really need to do before pressing the Open button, so just accept all the other defaults for now. You can also choose to save these settings as a session for easy access in the future. PuTTY will then connect to the computer you specified.
If you are a Mac user, this is much easier to do. Just open the pre-installed Terminal application and type "ssh username@computer.university.edu", where username is your account name and computer.university.edu is the host name of the computer you want to connect to.
The first time you connect you will get a warning that the remote computer's host key is not yet known to your computer. Check that you typed the address correctly, then choose the option to proceed. Next, you will usually be asked by the remote computer for a username and password (only the password if you already gave the username on the command line). Input what was given to you by the cluster administrator. You will not see the password as you type it in. After your credentials have been accepted, you will have access to the remote computer. Congratulations, you are now connected!
For Windows users, there are also other flavors of PuTTY that can be used. I like PuTTYTray for the added options built into it, as well as MTPuTTY, which allows multiple tabs to be open simultaneously.
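If you connect from a Mac (or Linux) terminal often, retyping the full address gets tedious. As a rough sketch, OpenSSH lets you define a short alias in the file ~/.ssh/config. The helper below only prints such an entry. The function name make_ssh_host_entry and the alias "cluster" are made up for illustration, and the host and user names are the placeholder values used above, not real ones.

```shell
# Print an OpenSSH client config entry that maps a short alias to a cluster login.
# Arguments: alias, host name, user name (all hypothetical placeholder values here).
make_ssh_host_entry() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
}

# Show the entry for the example address used in this post.
make_ssh_host_entry cluster computer.university.edu username
```

Appending that output to ~/.ssh/config (for example by adding ">> ~/.ssh/config" to the last line) should then let you type "ssh cluster" instead of the full address.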
File Transfer Application
Next, you will need a way to transfer files and scripts from your computer to the remote computer (cluster) and vice versa. WinSCP is an excellent free program to use from a Windows operating system. I have not found a Mac equivalent I like as well (if you have any suggestions, let me know), although the scp and sftp commands that come with the Mac Terminal will do the job from the command line. Once WinSCP is downloaded and set up, open the program and select New. Fill in the Host name and User name. I usually leave the password blank (in which case I will be asked to provide it manually later), but this is up to your discretion. From there you can log in. After your session has been authenticated, you will be greeted with a window having two major panes. The left side is your computer and the right side is the remote computer. Just drag files from one pane to the other to transfer them from one computer to the other. You now have a way to transfer files to and from the remote computer!
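For Mac users, or anyone who prefers the command line, the scp program that ships with OpenSSH can do the same job as dragging files between panes. Below is a minimal sketch. The helper names (remote_path, upload_to_cluster, download_from_cluster) and the CLUSTER variable are my own illustrative choices, and the address is the placeholder used earlier in this post.

```shell
# Placeholder login; replace with the details from your cluster administrator.
CLUSTER="username@computer.university.edu"

# Build the "user@host:path" form that scp expects for a remote file.
remote_path() {
    printf '%s:%s\n' "$CLUSTER" "$1"
}

# Copy a local file to a path on the cluster.
upload_to_cluster() {
    scp "$1" "$(remote_path "$2")"
}

# Copy a file from the cluster to a local path.
download_from_cluster() {
    scp "$(remote_path "$1")" "$2"
}

# Example usage (file names are made up):
#   upload_to_cluster analysis.R scripts/analysis.R
#   download_from_cluster results/output.txt .
```

Remote paths that do not start with a slash are taken relative to your home directory on the cluster, and scp will ask for your password just as ssh does.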
X Server
Some programs on the cluster display graphical windows using X11, and to see those windows you need an X server running on your own computer. This essentially allows you to open an interactive window on your computer in which you can interact with the remote computer. Some programs such as R, Python, and Java have modules that can interact in this fashion. Should you find out you need an X server, I would highly recommend Xming for Windows users. It is free and just needs to be open in the background while you have the terminal open. Older versions of Mac OS X have this functionality built in, while newer versions need XQuartz to be installed first. X11 forwarding has to be turned on when you log in to the cluster, so set it up before you connect. On Macs, simply add the -X option when connecting through the terminal (ex: ssh -X username@computer.university.edu). For Windows, open PuTTY and in the left menu select Connection > SSH > X11 and check the box enabling X11 forwarding. Then go back to the top of the menu to Session and log in as before. To make sure this is working, type the command "xeyes" at the cluster prompt. Do you see two eyeballs looking at you? If so, it works!
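If xeyes does not appear, a quick thing to check is the DISPLAY environment variable, which ssh sets on the remote side when X11 forwarding is active. The following is a small sketch to run at the cluster prompt. The function name check_x11 is hypothetical, and it only checks that DISPLAY is set, not that your local X server is actually running.

```shell
# Report whether this login session appears to have X11 forwarding turned on.
# Only checks that DISPLAY is set; it cannot tell if your local X server is up.
check_x11() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; reconnect with X11 forwarding enabled"
        return 1
    fi
    echo "DISPLAY is set to $DISPLAY"
}

# Example usage: only launch xeyes when forwarding looks active.
#   check_x11 && xeyes
```

If DISPLAY turns out to be empty, log out and reconnect with the -X option (or with the X11 forwarding box checked in PuTTY) before trying xeyes again.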
Text Editor
If you like Vi or Emacs, this is not for you. For everyone else, there is a wide variety of text editors that make it easier to write code in many programming languages. I prefer Notepad++ for Windows. For Macs, however, I am still searching for a worthy Notepad++ alternative.
Well, I hope this was helpful in getting you up and running in a UNIX computing environment. From personal experience, I know the learning curve can be steep. Best wishes as you learn to navigate your way around a remote cluster. If you have any helpful suggestions, I highly encourage you to post a comment below.
A repository of programs, scripts, and tips essential to
genetic epidemiology, statistical genetics, and bioinformatics
Welcome to the Genome Toolbox! I am glad you navigated to the blog and hope you find the contents useful and insightful for your genomic needs. If you find any of the entries particularly helpful, be sure to click the +1 button on the bottom of the post and share with your colleagues. Your input is encouraged, so if you have comments or are aware of more efficient tools not included in a post, I would love to hear from you. Enjoy your time browsing through the Toolbox.