When downloading data over SFTP, FTP, or from websites, you may have noticed that many large files and program packages have checksum values posted alongside them. These are essentially "digital fingerprints" that let you confirm the file you downloaded is complete and identical to the one that was published. So how exactly do you find the checksum value of the file you downloaded? This is relatively simple to do in UNIX. First, download the file and change to the directory where it is stored. Then check which type of checksum was published for the file. Here are a few popular checksum types and the UNIX commands used to calculate them:
Checksum - UNIX command
md5 - md5sum
sha1 - sha1sum
sha256 - sha256sum
Pretty simple command structure. Usually you just add "sum" after the checksum name and there is an aptly named UNIX command that will perform the function. As far as usage goes, simply type the command followed by the name of the file you want to check. The command prints the computed checksum followed by the file name, and you can compare that value against the published one.
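As a minimal sketch of that usage, assuming the GNU coreutils versions of these commands are installed, and using a small demo file named filename.txt that is created here purely for illustration (substitute the file you actually downloaded):

```shell
# Create a small demo file (hypothetical name and contents, for illustration only).
printf 'hello\n' > filename.txt

# Each command prints the hex digest, two spaces, and then the file name.
md5sum filename.txt
sha1sum filename.txt
sha256sum filename.txt
```

The digests get longer from md5 to sha1 to sha256 (32, 40, and 64 hex characters), so the length of the published value is often a quick hint as to which command to use.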
Alternatively, to have the checksum program do the comparison for you, use the -c (or --check) option and pipe in a line containing the known checksum value followed by the file name (for example with echo). Note the two spaces between the checksum and the file name, and the dash at the end of the command, which tells the program to read that line from standard input.
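The check described above might look like the following sketch. It again assumes GNU coreutils and the same hypothetical demo file, filename.txt, containing the single line "hello"; the checksum in the echo command is the md5 of that demo file, standing in for the value a download site would publish.

```shell
# Demo file with known contents (hypothetical; substitute your real download).
printf 'hello\n' > filename.txt

# Known md5 value, then two spaces, then the file name.
# The trailing "-" tells md5sum to read that checksum line from standard input.
echo "b1946ac92492d2347c6235b8b2611184  filename.txt" | md5sum -c -
```

The same pattern should work with sha1sum and sha256sum, as long as the pasted value is of the matching checksum type.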
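If the checksums are equal you will get "filename.txt: OK". If not, you will get "filename.txt: FAILED" along with a warning such as "WARNING: 1 of 1 computed checksum did NOT match" (the exact wording varies between versions), and the command exits with a non-zero status. Hope this is helpful for checking file integrity. Please add any suggestions you have in the comments section below.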