vignettes/NCS22_talk.Rmd
The examples in this vignette were used in our presentation.
They use a subset of the longitudinal_subject_samples dataset.
library(designit)
library(dplyr) # provides filter(), select(), count(), pull()

data("longitudinal_subject_samples")
# keep groups 1 to 5 and the two visits in weeks 1 and 4
dat <- longitudinal_subject_samples |>
  filter(Group %in% 1:5, Week %in% c(1, 4)) |>
  select(SampleID, SubjectID, Group, Sex, Week)
# for simplicity: remove two subjects that don't have both visits
dat <- dat |>
  filter(SubjectID %in%
    (dat |> count(SubjectID) |> filter(n == 2) |> pull(SubjectID)))
# one row per subject
subject_data <- dat |>
  select(SubjectID, Group, Sex) |>
  unique()
Here is an example of a plate effect. Both the top and bottom rows of the plate are used as controls.
This is the experiment design:
These are the readouts:
Due to the plate effect, the control rows are affected differently. It is virtually impossible to normalize readouts in a meaningful way.
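The following is a minimal simulation of such a situation, not the data from the talk. The plate size, the linear top-to-bottom gradient, and the noise level are all assumptions chosen for illustration.

```r
# Hypothetical 8 x 12 plate with a smooth top-to-bottom gradient (assumption).
set.seed(1)
n_row <- 8
n_col <- 12
plate <- expand.grid(row = 1:n_row, col = 1:n_col)

# The true signal is identical in every well; only the plate effect differs.
true_control <- 100
row_effect <- 5 * (plate$row - 1) # grows from 0 (top row) to 35 (bottom row)
plate$readout <- true_control + row_effect + rnorm(nrow(plate), sd = 2)

# Controls sit in the top and bottom rows, as in the example above.
top_controls <- plate$readout[plate$row == 1]
bottom_controls <- plate$readout[plate$row == n_row]

# The two control sets disagree although they measure the same thing,
# so they do not provide a single reference level for the other rows.
c(top = mean(top_controls), bottom = mean(bottom_controls))
```

In a real experiment the shape of the effect is unknown, which is why two disagreeing control rows give little guidance for the samples between them.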
Gone wrong: Random distribution of 31 grouped subjects into 3 batches turns out unbalanced:
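A purely random allocation can be sketched in base R as follows. The group sizes and subject IDs are hypothetical (they only sum to 31), and the seed is arbitrary, so the resulting table will differ from the one shown in the talk.

```r
set.seed(42) # arbitrary seed

# Hypothetical group sizes summing to 31; the real study's sizes may differ.
subjects <- data.frame(
  SubjectID = sprintf("S%02d", 1:31),
  Group = rep(1:5, times = c(6, 6, 6, 6, 7))
)

# Three batches with 11, 10 and 10 slots, filled in random order.
subjects$batch <- sample(rep(1:3, length.out = 31))

# Batch composition by group.
composition <- table(Group = subjects$Group, batch = subjects$batch)
composition

# Per group: difference between the fullest and the emptiest batch.
apply(composition, 1, function(x) max(x) - min(x))
```

Rerunning with different seeds shows how often chance alone leaves a group over- or under-represented in a batch.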
“Block what you can and randomize what you cannot.” (G. Box, 1978)
BatchContainer class
optimize_design()
bc <- BatchContainer$new(
  dimensions = list("batch" = 3, "location" = 11)
) |>
  assign_random(subject_data)
Batch composition before optimization
bc$get_samples()
batch | location | SubjectID | Group | Sex |
---|---|---|---|---|
1 | 1 | NA | NA | NA |
1 | 2 | P32 | 5 | M |
1 | 3 | P10 | 3 | F |
... | ... | ... | ... | ... |
3 | 9 | P31 | 3 | F |
3 | 10 | P33 | 5 | M |
3 | 11 | P24 | 5 | F |
bc <- optimize_design(
  bc,
  scoring = list(
    group = osat_score_generator(
      batch_vars = "batch",
      feature_vars = "Group"
    ),
    sex = osat_score_generator(
      batch_vars = "batch",
      feature_vars = "Sex"
    )
  ),
  n_shuffle = 1,
  acceptance_func =
    ~ accept_leftmost_improvement(..., tolerance = 0.01),
  max_iter = 150,
  quiet = TRUE
)
Batch composition after optimization
batch | location | SubjectID | Group | Sex |
---|---|---|---|---|
1 | 1 | NA | NA | NA |
1 | 2 | P01 | 1 | F |
1 | 3 | P10 | 3 | F |
... | ... | ... | ... | ... |
3 | 9 | P29 | 5 | F |
3 | 10 | P33 | 5 | M |
3 | 11 | P12 | 3 | F |
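To make the idea behind the optimization above more concrete, here is a simplified base R sketch: propose a swap of two samples, score the batch balance, and keep the swap only if the score improves. The function osat_like_score() is a hypothetical helper written for this illustration. It is not designit's implementation of osat_score_generator(), and the single-score acceptance rule is a simplification of accept_leftmost_improvement().

```r
# Simplified OSAT-style score (assumption): squared deviation of the observed
# batch-by-feature counts from the counts expected under perfect balance.
osat_like_score <- function(batch, feature) {
  observed <- table(batch, feature)
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2)
}

set.seed(7)
feature <- rep(c("A", "B", "C"), each = 10) # hypothetical groups
batch <- rep(1:3, each = 10)                # worst case: one group per batch
start_score <- osat_like_score(batch, feature)

score <- start_score
for (i in 1:150) {
  idx <- sample(length(batch), 2)           # propose swapping two samples
  proposal <- batch
  proposal[idx] <- proposal[rev(idx)]
  new_score <- osat_like_score(proposal, feature)
  if (new_score < score) {                  # keep only strict improvements
    batch <- proposal
    score <- new_score
  }
}

c(start = start_score, end = score)
table(batch, feature)
```

Swapping samples rather than reassigning them keeps the batch sizes fixed, so only the composition of each batch changes.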
Assays are often performed in well plates (24, 96, or 384 wells).
Observed effects
Since plate effects often cannot be avoided, we aim to distribute sample groups of interest evenly across the plate and adjust for the effect computationally.
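One possible way to adjust computationally is to include the plate position as a term in a linear model. The sketch below uses simulated data; the plate size, the group labels, the additive row effect, and the choice of lm() are all assumptions made for illustration, not the method used in the talk.

```r
set.seed(3)

# Hypothetical 4 x 6 plate; both groups appear in every row (assumption).
plate <- expand.grid(row = 1:4, col = 1:6)
plate$group <- factor(ifelse(plate$col %% 2 == 0, "treated", "control"))

# Simulated readout: true group difference of 10 plus an additive row effect.
row_shift <- c(0, 4, 8, 12)
plate$readout <- 50 + 10 * (plate$group == "treated") +
  row_shift[plate$row] + rnorm(nrow(plate), sd = 1)

# Because every row contains both groups, the row term can absorb the
# plate effect without being confounded with the group term.
fit <- lm(readout ~ group + factor(row), data = plate)
group_effect <- unname(coef(fit)["grouptreated"])
group_effect
```

If one group were confined to a single row, the group and row terms could not be separated, which is why the even distribution has to come first.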
set.seed(4)
bc <- BatchContainer$new(
  dimensions = list("plate" = 3, "row" = 4, "col" = 6)
) |>
  assign_in_order(dat)
plot_plate(bc,
  plate = plate, row = row, column = col,
  .color = Group, title = "Initial layout by Group"
)
plot_plate(bc,
  plate = plate, row = row, column = col,
  .color = Sex, title = "Initial layout by Sex"
)
bc1 <- optimize_design(
  bc,
  scoring = list(
    group = osat_score_generator(
      batch_vars = "plate",
      feature_vars = "Group"
    ),
    sex = osat_score_generator(
      batch_vars = "plate",
      feature_vars = "Sex"
    )
  ),
  n_shuffle = 1,
  acceptance_func =
    ~ accept_leftmost_improvement(..., tolerance = 0.01),
  max_iter = 150,
  quiet = TRUE
)
bc2 <- optimize_design(
  bc1,
  scoring = mk_plate_scoring_functions(
    bc1,
    plate = "plate", row = "row", column = "col",
    group = "Group"
  ),
  shuffle_proposal_func = shuffle_with_constraints(dst = plate == .src$plate),
  max_iter = 150,
  quiet = TRUE
)
optimize_multi_plate_design()
We perform the same optimization as before, but use the optimize_multi_plate_design() function to combine the two steps.
bc <- optimize_multi_plate_design(
  bc,
  across_plates_variables = c("Group", "Sex"),
  within_plate_variables = c("Group"),
  plate = "plate", row = "row", column = "col",
  n_shuffle = 2,
  max_iter = 500 # 2000
)
#> 1 ... 2 ... 3 ...
#> Warning: Removed 4509 rows containing missing values or values outside the scale range
#> (`geom_line()`).
#> Warning: Removed 4509 rows containing missing values or values outside the scale range
#> (`geom_point()`).
Goal:
Constraints:
See the vignette invivo_study_design for the full story.
Acknowledgements