Genome Toolbox: Remove Rows with NA Values From R Data Frame

Tuesday, January 14, 2014

Remove Rows with NA Values From R Data Frame

Rows with NA values can be a pesky nuisance when trying to analyze data in R. Here is a short primer on how to remove them.

There are two primary options for getting rid of NA values in R: the na.omit/is.na commands and the complete.cases command. All of them ship with a standard R installation (is.na is in the base package; na.omit and complete.cases are in the stats package, which is loaded by default), so no additional library or package needs to be loaded. Below are examples of how the two work with a data frame called data and a variable called var.

The na.omit/is.na commands work as follows:
na.omit(data) - will only select rows with complete data in all columns
data[rowSums(is.na(data[,c(2,3,5)]))==0,] - will only select rows with complete data in columns 2, 3, and 5
var[!is.na(var)] - will only select values of a variable not equal to NA
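
As a quick sketch of the three commands above, here they are run on a small made-up data frame. The column names (id, a, b, c, d) and all values are assumptions for illustration only, chosen so that columns 2, 3, and 5 exist.

```r
# Hypothetical example data: five rows, five columns, NAs scattered around.
data <- data.frame(
  id = 1:5,
  a  = c(1.2, NA, 3.1, 4.8, 5.0),   # column 2
  b  = c("x", "y", NA, "z", "w"),   # column 3
  c  = c(NA, 2, 3, 4, 5),           # column 4
  d  = c(10, 20, 30, NA, 50)        # column 5
)
var <- c(4, NA, 7, NA, 9)

# Keep rows with no NA in any column.
all_complete <- na.omit(data)

# Keep rows with no NA in columns 2, 3, and 5 (an NA in column 4 is ignored).
# is.na() gives a logical matrix; rowSums() counts the NAs in each row.
subset_complete <- data[rowSums(is.na(data[, c(2, 3, 5)])) == 0, ]

# Keep the non-NA values of a vector.
var_clean <- var[!is.na(var)]

print(all_complete)
print(subset_complete)
print(var_clean)
```

With this data, only the row with id 5 is complete in every column, while the rows with id 1 and 5 are complete in columns 2, 3, and 5.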

The complete.cases command works as follows:
data[complete.cases(data),] - will only select rows with complete data in all columns
data[complete.cases(data[,c(2,3,5)]),] - will only select rows with complete data in columns 2, 3, and 5
var[complete.cases(var)] - will only select values of a variable not equal to NA
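
The same selections can be sketched with complete.cases. The data frame below is again made up for illustration (the column names and values are assumptions); the point is that complete.cases returns a logical vector with one TRUE/FALSE per row, which is then used as a row index.

```r
# Hypothetical example data: five rows, five columns, NAs scattered around.
data <- data.frame(
  id = 1:5,
  a  = c(1.2, NA, 3.1, 4.8, 5.0),   # column 2
  b  = c("x", "y", NA, "z", "w"),   # column 3
  c  = c(NA, 2, 3, 4, 5),           # column 4
  d  = c(10, 20, 30, NA, 50)        # column 5
)
var <- c(4, NA, 7, NA, 9)

# One logical value per row: TRUE when the row has no NA at all.
keep <- complete.cases(data)

# Keep rows with no NA in any column.
all_complete <- data[keep, ]

# Keep rows with no NA in columns 2, 3, and 5, selected by position...
subset_complete <- data[complete.cases(data[, c(2, 3, 5)]), ]

# ...or, equivalently here, by (hypothetical) column name.
subset_by_name <- data[complete.cases(data[, c("a", "b", "d")]), ]

# complete.cases also accepts a plain vector.
var_clean <- var[complete.cases(var)]

print(keep)
print(subset_complete)
print(var_clean)
```

Selecting columns by name rather than position tends to be safer if the column order might change later. One difference worth knowing: na.omit(data) also attaches an na.action attribute recording which rows were dropped, whereas indexing with complete.cases returns a plain subset.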

I use both commands at times, but ultimately prefer the complete.cases command for its cleaner syntax and because the same pattern works for whole data frames, column subsets, and single variables. Hope this helps you remove those NAs from your data. If you have additional tips or questions, please leave a comment below.

3 comments:

Krishna, December 13, 2014 at 12:16 PM
This helped a lot! Thanks!
Unknown, December 17, 2014 at 8:09 PM
Thanks a bunch! It's the simple stuff that drives me nuts!
Mauricio Murillo, May 1, 2015 at 3:58 PM
Thanks!!! This helps me data[complete.cases(data[,c(2,3,5)]),]

:-.)