subset - R: Warning when subsetting a dataframe with a factor, but not with a character


Let's start with some data:

set.seed(0)
data <- data.frame('group' = rep(c('control', 'disease'), 10),
                   'sv_ml' = rnorm(20),
                   'co_l' = rnorm(20))

Now let's create a factor out of the two variables of interest, sv_ml and co_l.

var <- as.factor(colnames(data)[colnames(data) != 'group']) 

Subsetting based on sv_ml works whether or not I first convert to character:

mean(data[data$group == 'control', var[1]]) # 0.2077689
mean(data[data$group == 'control', as.character(var[1])]) # 0.2077689

But subsetting based on co_l only works if I first convert to character:

mean(data[data$group == 'control', var[2]]) # NA
mean(data[data$group == 'control', as.character(var[2])]) # 0.194133

The line that returns NA also gives the following warning:

Warning message:
argument is not numeric or logical: returning NA

I understand that I can avoid the problem by converting factors to characters before using them to subset a dataframe. However, I'd like to understand why this is happening, and why it happens with one factor value but not the other.

A warning to anyone who comes across this post.

Thanks to the answer below, I now know that when you attempt to subset a dataframe based on a factor, R uses the numeric representation of the factor. In this case, the numeric representation of sv_ml is 2 and that of co_l is 1 (based on the default alphabetical ordering of the levels). It so happened that the first column of my dataframe was the non-numeric group column, so I got the warning and NA. The second column happened to be sv_ml, so I (quote unquote) "luckily" got the right answer.
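To make the mapping concrete, here is a minimal sketch of how the factor's integer codes line up with column positions. The names `codes`, `cols`, `picked` and `intended` are introduced only for illustration, and `cols` is typed out by hand to mirror the column order of the first dataframe above.

```r
# Rebuild the factor from the question (the column names other than 'group').
var <- as.factor(c('sv_ml', 'co_l'))

# Levels are sorted alphabetically by default, so 'co_l' comes before 'sv_ml'.
levels(var)

# The integer codes index into those levels: sv_ml -> 2, co_l -> 1.
codes <- as.integer(var)

# Column names of the first dataframe, in order (written out by hand here).
cols <- c('group', 'sv_ml', 'co_l')

# What positional indexing by the codes picks, next to what was intended.
picked <- cols[codes]
intended <- as.character(var)
data.frame(intended, code = codes, picked)
```

The `picked` column only agrees with `intended` for sv_ml, which is a coincidence of the column order rather than anything the factor guarantees.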

Let's say things had been set up differently.

set.seed(0)
data <- data.frame('group' = rep(c('control', 'disease'), 10),
                   'x' = rnorm(20),
                   'sv_ml' = rnorm(20),
                   'co_l' = rnorm(20))

var <- as.factor(colnames(data)[colnames(data) != 'group'])

In this case, x is the first element of the factor, and its numeric representation is 3. Therefore, subsetting based on the factor silently returns the mean of the wrong column (the third column, sv_ml, instead of x).

mean(data[data$group == 'control', var[1]]) # 0.194133
mean(data[data$group == 'control', 'x']) # 0.2077689

Dearie, dearie me. We must be careful, mustn't we?
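One way to stay out of trouble, sketched under the second setup above, is to keep the column names as a plain character vector and always index by name. The helper `group_mean` and the variable `vars` are hypothetical names made up for this example, not part of base R.

```r
set.seed(0)
data <- data.frame(group = rep(c('control', 'disease'), 10),
                   x = rnorm(20),
                   sv_ml = rnorm(20),
                   co_l = rnorm(20))

# Keep the column names as a character vector instead of a factor.
vars <- setdiff(names(data), 'group')

# Hypothetical helper: mean of one column within one group,
# always indexing the dataframe by column name.
group_mean <- function(df, grp, col) {
  col <- as.character(col)        # guards against a factor being passed in
  stopifnot(col %in% names(df))   # fail loudly on an unknown column name
  mean(df[df$group == grp, col])
}

by_name   <- group_mean(data, 'control', vars[1])
by_factor <- group_mean(data, 'control', as.factor(vars)[1])

# Both calls resolve to column 'x', so the results agree.
identical(by_name, by_factor)
```

Converting inside the helper means a caller who passes a factor by accident still gets the column they named, rather than whichever column sits at the factor's integer code.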

The reason is that when you do not convert the factors to character, they are treated as numeric in subsetting.

var
[1] sv_ml co_l
Levels: co_l sv_ml
as.numeric(var)
[1] 2 1

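Hence, sv_ml is treated as 2 and gives the second column, as intended, whereas co_l is treated as 1 and returns the first column, which is the group column. Taking the mean of a factor gives the warning you see and returns NA. (Since R 4.0.0, data.frame() keeps strings as character by default, so group is a character column there, but mean() of a character vector gives the same warning and NA.)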

mean(data$group)
[1] NA
Warning message:
In mean.default(data$group) :
  argument is not numeric or logical: returning NA
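The explanation above can be checked directly. The following sketch uses the first dataframe from the question and compares factor indexing against indexing by the factor's integer code. The variable names `rows`, `same_as_code`, `is_group`, `group_is_numeric` and `group_mean_is_na` are assumptions made for this example.

```r
set.seed(0)
data <- data.frame(group = rep(c('control', 'disease'), 10),
                   sv_ml = rnorm(20),
                   co_l = rnorm(20))
var <- as.factor(c('sv_ml', 'co_l'))
rows <- data$group == 'control'

# Indexing with the factor selects the same column as its integer code.
same_as_code <- identical(data[rows, var[2]], data[rows, as.integer(var[2])])

# That column is 'group', not the intended 'co_l'.
is_group <- identical(data[rows, var[2]], data$group[rows])

# 'group' is not numeric, which is why mean() gives up and returns NA.
group_is_numeric <- is.numeric(data$group)
group_mean_is_na <- is.na(suppressWarnings(mean(data$group)))

c(same_as_code, is_group, group_is_numeric, group_mean_is_na)
```

The warning is suppressed here only so the check runs quietly. In interactive use it is the one useful hint that the wrong column was selected.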
