The OSAT score is intended to ensure an even distribution of samples across batches and is closely related to the chi-square test on a contingency table (Yan et al., 2012, doi:10.1186/1471-2164-13-689).
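Read against the example output further down, the score appears to be a sum of squared differences between observed and expected counts, whereas Pearson's chi-square statistic additionally divides each term by the expected count. As a sketch, with O_{bs} the observed number of samples from feature combination s in batch b and E_{bs} the corresponding expected count (these symbols are introduced here for illustration and are not taken from the package):

```latex
\mathrm{OSAT} = \sum_{b}\sum_{s} \left(O_{bs} - E_{bs}\right)^{2},
\qquad
\chi^{2} = \sum_{b}\sum_{s} \frac{\left(O_{bs} - E_{bs}\right)^{2}}{E_{bs}}
```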

osat_score(bc, batch_vars, feature_vars, expected_dt = NULL, quiet = FALSE)



bc
BatchContainer with samples, or a data.table/data.frame in which every row is a location in a container together with the sample at that location.

batch_vars
Character vector with the names of the batch variables to take into account when computing the score.

feature_vars
Character vector with the names of the sample variables to take into account when computing the score.

expected_dt
A data.table with the expected number of samples for each combination of sample variables and batch variables. It is not required, but because it does not change during the optimization process, it is a good idea to compute it once and cache it.

quiet
If TRUE, do not warn about NAs in feature columns.


A list with two elements: $score (the numeric score value) and $expected_dt (a data.table of expected counts, for reuse).
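The caching advice for expected_dt can be illustrated with a minimal sketch. It uses only the arguments and return elements documented above; the `layout` data frame, the swap of rows 1 and 4, and the `requireNamespace()` guard are assumptions made for this illustration.

```r
# Sketch: compute the expected counts once, then reuse them for a shuffled layout.
# The block is skipped when the designit package is not installed.
if (requireNamespace("designit", quietly = TRUE)) {
  # Hypothetical layout: six samples on two plates, no empty locations.
  layout <- data.frame(
    SampleType = c("Case", "Case", "Case", "Control", "Control", "Control"),
    Sex        = c("Female", "Female", "Male", "Female", "Female", "Male"),
    plate      = c(1, 1, 1, 2, 2, 2)
  )

  # First call: the expected counts are returned alongside the score.
  first <- designit::osat_score(
    layout,
    batch_vars = "plate",
    feature_vars = c("SampleType", "Sex")
  )

  # Swap two samples between plates, as an optimizer might do.
  shuffled <- layout
  shuffled$plate[c(1, 4)] <- shuffled$plate[c(4, 1)]

  # Second call: pass the cached expected counts instead of recomputing them.
  second <- designit::osat_score(
    shuffled,
    batch_vars = "plate",
    feature_vars = c("SampleType", "Sex"),
    expected_dt = first$expected_dt
  )

  print(c(before = first$score, after = second$score))
}
```

Swapping samples changes which samples sit on which plate, but not the plate sizes or the number of samples per feature combination, which is why the expected counts can be carried over between calls.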


sample_assignment <- tibble::tribble(
  ~ID, ~SampleType, ~Sex, ~plate,
  1, "Case", "Female", 1,
  2, "Case", "Female", 1,
  3, "Case", "Male", 2,
  4, "Control", "Female", 2,
  5, "Control", "Female", 1,
  6, "Control", "Male", 2,
  NA, NA, NA, 1,
  NA, NA, NA, 2,
)

osat_score(sample_assignment,
  batch_vars = "plate",
  feature_vars = c("SampleType", "Sex")
)
#> Warning: NAs in features / batch columns; they will be excluded from scoring
#> $score
#> [1] 3
#> $expected_dt
#> Key: <plate, SampleType, Sex>
#>    plate SampleType    Sex .n_expected
#>    <num>     <char> <char>       <num>
#> 1:     1       Case Female         1.0
#> 2:     1       Case   Male         0.5
#> 3:     1    Control Female         1.0
#> 4:     1    Control   Male         0.5
#> 5:     2       Case Female         1.0
#> 6:     2       Case   Male         0.5
#> 7:     2    Control Female         1.0
#> 8:     2    Control   Male         0.5
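The numbers above can be reproduced by hand in base R. The sketch below is not the package's implementation; it assumes that the expected count is the size of each feature combination multiplied by the plate's share of all locations (empty ones included), and that the score is the sum of squared differences between observed and expected counts. Under these assumptions it arrives at the same expected counts and the same score as the output above. All variable names are illustrative.

```r
# Same toy layout as in the example; NA rows are empty locations.
samples <- data.frame(
  SampleType = c("Case", "Case", "Case", "Control", "Control", "Control", NA, NA),
  Sex        = c("Female", "Female", "Male", "Female", "Female", "Male", NA, NA),
  plate      = c(1, 1, 2, 2, 1, 2, 1, 2)
)

# Share of all locations (including empty ones) provided by each plate.
plate_share <- table(samples$plate) / nrow(samples)

# Keep only filled locations and label each feature combination.
filled <- samples[!is.na(samples$SampleType) & !is.na(samples$Sex), ]
filled$stratum <- paste(filled$SampleType, filled$Sex)

# Observed counts: rows are plates, columns are feature combinations.
observed <- table(
  factor(filled$plate, levels = names(plate_share)),
  filled$stratum
)

# Expected counts: each combination's size split in proportion to plate size.
stratum_size <- table(filled$stratum)
expected <- outer(as.vector(plate_share), as.vector(stratum_size))
dimnames(expected) <- dimnames(observed)

# Assumed score: sum of squared differences between observed and expected.
manual_score <- sum((observed - expected)^2)
print(expected)
print(manual_score)
```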