Jan 12, 2011

Usage of R functions "table" & "ifelse" when NA's exist

Every now and then I come across help posts asking why the total count of observations does not match after using the R functions "table" and "ifelse". This usually frustrates new or part-time practitioners, who end up doubting R and reverting to their earlier tool.

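However, this mismatch in the total count happens only when you have NA's in the data: by default "table" leaves NA's out of its counts, and "ifelse" returns NA wherever its test is NA.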
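To see the mismatch, here is a small sketch. The vector x and its values are made up for illustration and do not come from any real data set.

```r
# Hypothetical data: 6 observations, two of them NA
x <- c(95, 105, NA, 108, NA, 120)

length(x)                 # 6 observations in total

# table() leaves NA out by default, so the counts add up to 4, not 6
n_table <- sum(table(x))
n_table

# ifelse() returns NA wherever the test itself is NA
flag <- ifelse(x > 100 & x <= 110, 1, 0)
flag                      # 0 1 NA 1 NA 0

# Tabulating the recoded variable loses the same two observations
n_flag <- sum(table(flag))
n_flag
```

Both totals come out as 4 while the data have 6 observations, which is exactly the gap people report.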

Thus, to always get the full total count, we should make a practice of using the following options with the R functions mentioned above:
table(varname1, varname2, useNA = "ifany") # with "table", use the "useNA" option
ifelse(is.na(varname1) == T, ***, ifelse(varname1 > 100 & varname1 <= 110, 1, 0))
# with "ifelse", test for NA first using "is.na"

Replace *** with the value to be used when the variable is NA.
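A minimal sketch of both options on made-up data follows. The vectors x and g are hypothetical, and coding NA observations as -1 is only an assumption for this example; choose whatever value suits your analysis.

```r
# Hypothetical data: 6 observations, with NAs in both variables
x <- c(95, 105, NA, 108, NA, 120)
g <- c("a", "b", "a", NA, "b", "a")

# useNA = "ifany" adds an NA row/column only when NAs are present
tab <- table(x, g, useNA = "ifany")
sum(tab)                  # all 6 observations are counted

# Assumption for this sketch: NA observations are coded as -1
flag <- ifelse(is.na(x), -1, ifelse(x > 100 & x <= 110, 1, 0))
flag                      # 0 1 -1 1 -1 0

sum(table(flag))          # 6, matching length(x)
```

With "ifany" the NA category appears only when it is needed; useNA = "always" would show it even when its count is zero.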

Happy R Programming. The author can be reached at mavuluri.pradeep@gmail.com.

2 comments:

Numerator said...

I had a similar concern, though about 'missing' categories rather than NAs. I'd recommend addressing both issues.

http://jointposterior.blogspot.com/2011/01/table-in-r.html

Richie Cotton said...

You don't need to compare to TRUE in the is.na case.

is.na(varname1) == T is exactly the same as plain is.na(varname1).