Mutate Sample NA Count in mass_dataset Object — mutate_sample_na_number

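Adds a column named na_number to the sample_info slot of a mass_dataset object. For each sample, the column holds the number of NA (missing) values found across the specified variables. If sample_info already has an na_number column, the new column is added alongside it (as na_number.1 in the example below) rather than overwriting it.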

mutate_sample_na_number(object, according_to_variables = "all")

Arguments

object: A mass_dataset object.
according_to_variables: A character vector of variable IDs to include when counting NA values. The default, "all", counts across every variable in the object.
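
As a rough illustration of what the count represents, the base R sketch below computes per-sample NA counts on a toy matrix laid out like expression_data (variables in rows, samples in columns). The toy data and the helper name count_sample_na are hypothetical and exist only for this illustration; this is not the package's internal implementation.

```r
# Toy expression matrix: 4 variables (rows) x 3 samples (columns).
# All names and values are made up for illustration.
expr <- matrix(
  c(1,  NA, 3,
    NA, NA, 6,
    7,  8,  9,
    NA, 11, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("v1", "v2", "v3", "v4"), c("s1", "s2", "s3"))
)

# Hypothetical helper that mirrors the documented behaviour:
# count NA values per sample, optionally restricted to some variables.
count_sample_na <- function(expr, according_to_variables = "all") {
  if (!identical(according_to_variables, "all")) {
    # Keep only the requested variable rows; drop = FALSE keeps a matrix
    expr <- expr[according_to_variables, , drop = FALSE]
  }
  colSums(is.na(expr))
}

# Counts across all variables
na_all <- count_sample_na(expr)

# Counts restricted to variables v1 and v2
na_subset <- count_sample_na(expr, according_to_variables = c("v1", "v2"))
```

Restricting the variables can only lower or preserve each sample's count, which is consistent with na_number.1 being smaller than na_number in the example output further down.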

Value

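A mass_dataset object whose sample_info slot gains the NA-count column. As the example shows, sample_info_note receives a matching row and the call is recorded in the processing information.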

Author

Xiaotao Shen shenxt1990@outlook.com

Examples

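library(massdataset)
library(dplyr) # provides %>%, filter(), and pull() used below

## Load the demo tables shipped with the package
data("expression_data")
data("sample_info")
data("variable_info")

## Assemble them into a mass_dataset object
object <-
  create_mass_dataset(
    expression_data = expression_data,
    sample_info = sample_info,
    variable_info = variable_info
  )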

object
#> -------------------- 
#> massdataset version: 1.0.28 
#> -------------------- 
#> 1.expression_data:[ 1000 x 8 data.frame]
#> 2.sample_info:[ 8 x 4 data.frame]
#> 8 samples:Blank_3 Blank_4 QC_1 ... PS4P3 PS4P4
#> 3.variable_info:[ 1000 x 3 data.frame]
#> 1000 variables:M136T55_2_POS M79T35_POS M307T548_POS ... M232T937_POS M301T277_POS
#> 4.sample_info_note:[ 4 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> -------------------- 
#> Processing information
#> 1 processings in total
#> create_mass_dataset ---------- 
#>       Package         Function.used                Time
#> 1 massdataset create_mass_dataset() 2023-10-01 23:24:34

## Count NA values per sample across all variables
object2 <-
  mutate_sample_na_number(object = object)

colnames(extract_sample_info(object))
#> [1] "sample_id"       "injection.order" "class"           "group"          
colnames(extract_sample_info(object2))
#> [1] "sample_id"       "injection.order" "class"           "group"          
#> [5] "na_number"      
object2@sample_info_note
#>              name         meaning
#> 1       sample_id       sample_id
#> 2 injection.order injection.order
#> 3           class           class
#> 4           group           group
#> 5       na_number       na_number

## Count NA values per sample using only variables with mz > 100
variable_id <-
  object2 %>%
  activate_mass_dataset(what = "variable_info") %>%
  filter(mz > 100) %>%
  pull(variable_id)

## object2 already has na_number, so this call adds a second column
object3 <-
  mutate_sample_na_number(object = object2,
                          according_to_variables = variable_id)

object3
#> -------------------- 
#> massdataset version: 1.0.28 
#> -------------------- 
#> 1.expression_data:[ 1000 x 8 data.frame]
#> 2.sample_info:[ 8 x 6 data.frame]
#> 8 samples:Blank_3 Blank_4 QC_1 ... PS4P3 PS4P4
#> 3.variable_info:[ 1000 x 3 data.frame]
#> 1000 variables:M136T55_2_POS M79T35_POS M307T548_POS ... M232T937_POS M301T277_POS
#> 4.sample_info_note:[ 6 x 2 data.frame]
#> 5.variable_info_note:[ 3 x 2 data.frame]
#> 6.ms2_data:[ 0 variables x 0 MS2 spectra]
#> -------------------- 
#> Processing information
#> 2 processings in total
#> create_mass_dataset ---------- 
#>       Package         Function.used                Time
#> 1 massdataset create_mass_dataset() 2023-10-01 23:24:34
#> mutate_sample_na_number ---------- 
#>       Package             Function.used                       Time
#> 1 massdataset mutate_sample_na_number() 2023-10-01 23:24:34.156077
#> 2 massdataset mutate_sample_na_number()  2023-10-01 23:24:34.16748

head(extract_sample_info(object3))
#>   sample_id injection.order   class   group na_number na_number.1
#> 1   Blank_3               1   Blank   Blank       682         667
#> 2   Blank_4               2   Blank   Blank       702         687
#> 3      QC_1               3      QC      QC       397         385
#> 4      QC_2               4      QC      QC       381         372
#> 5     PS4P1               5 Subject Subject       424         411
#> 6     PS4P2               6 Subject Subject       427         414
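
In the output above, the second call did not overwrite na_number; its result appears as na_number.1. This page does not say how the package derives that name. The suffix does match base R's usual way of de-duplicating names, which make.unique() illustrates in the sketch below. Treat this as an assumption made for illustration, not as the confirmed mechanism.

```r
# Base R de-duplication of repeated names: the second "na_number"
# receives a numeric suffix, as seen in the sample_info output above.
nm <- make.unique(c("sample_id", "na_number", "na_number"))
print(nm)
```

If a more descriptive column name is preferred, one option is to rename the column in the extracted sample_info table after each call.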