Rows with NA values can be a pesky nuisance when trying to analyze data in R. Here is a short primer on how to remove them.
There are two primary options for getting rid of NA values in R: the na.omit/is.na commands and the complete.cases command. All of them ship with a standard R installation (is.na is in the base package; na.omit and complete.cases are in the stats package, which is attached by default), so no additional library or package needs to be loaded. Below are examples of how the two work with a data frame called data and a variable called var.
The na.omit/is.na commands work as follows:
na.omit(data) - will only select rows with complete data in all columns
data[rowSums(is.na(data[,c(2,3,5)]))==0,] - will only select rows with complete data in columns 2, 3, and 5
var[!is.na(var)] - will only select values of a variable not equal to NA
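To make the three commands above concrete, here is a minimal sketch on a made-up data frame. The column names and values are assumptions chosen only for illustration; they are not from any real dataset.

```r
# Hypothetical example data: each of rows 1-4 has exactly one NA, row 5 has none.
data <- data.frame(
  id = 1:5,
  a  = c(1, NA, 3, 4, 5),            # column 2
  b  = c("x", "y", NA, "z", "w"),    # column 3
  c  = c(NA, 2, 3, 4, 5),            # column 4
  d  = c(10, 20, 30, NA, 50)         # column 5
)
var <- c(1, NA, 3)

# Keep rows with no NA in any column (only row 5 qualifies here).
all_complete <- na.omit(data)

# Keep rows with no NA in columns 2, 3, and 5; the NA in column 4 is ignored,
# so row 1 survives alongside row 5.
subset_complete <- data[rowSums(is.na(data[, c(2, 3, 5)])) == 0, ]

# Keep only the non-NA values of a vector.
var_clean <- var[!is.na(var)]
```

Note that na.omit also records which rows it dropped in an "na.action" attribute on the result, which the is.na subsetting approach does not do.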
The complete.cases command works as follows:
data[complete.cases(data),] - will only select rows with complete data in all columns
data[complete.cases(data[,c(2,3,5)]),] - will only select rows with complete data in columns 2, 3, and 5
var[complete.cases(var)] - will only select values of a variable not equal to NA
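A small sketch may help show what complete.cases actually returns: a logical vector with one TRUE/FALSE per row, which is then used as a row index. The data frame below is made up for illustration, and the names keep, rows_all, and rows_subset are hypothetical.

```r
# Hypothetical example data; values are assumptions for illustration only.
data <- data.frame(
  id = 1:4,
  a  = c(1, NA, 3, 4),           # column 2
  b  = c("x", "y", NA, "z"),     # column 3
  c  = c(NA, 2, 3, 4),           # column 4
  d  = c(10, 20, 30, 40)         # column 5
)
var <- c(1, NA, 3)

# One logical value per row: TRUE where columns 2, 3, and 5 have no NA.
keep <- complete.cases(data[, c(2, 3, 5)])

# Use the logical vector to index rows.
rows_subset <- data[keep, ]

# Checking every column is the same call without the column selection.
rows_all <- data[complete.cases(data), ]

# The same function works on a plain vector.
var_clean <- var[complete.cases(var)]

# The mask matches the rowSums(is.na(...)) == 0 approach element by element.
same_mask <- all(keep == (rowSums(is.na(data[, c(2, 3, 5)])) == 0))
```

Storing the mask in a variable such as keep is optional, but it makes it easy to count how many rows will be dropped (sum(!keep)) before actually subsetting.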
I use both commands at times, but ultimately prefer the complete.cases command for its cleaner syntax and generalizability. Hope this helps you remove those NAs from your data. If you have additional tips or questions, please leave a comment below.
This helped a lot! Thanks!
Thanks a bunch! It's the simple stuff that drives me nuts!
Thanks!!! This helps me: data[complete.cases(data[,c(2,3,5)]),]
:-.)