Blocking and Two-way ANOVA

6.1 Randomized Complete Block (RCB) Design
(and it’s observational relative)

Prof Beckman

Agenda

Announcements

Spring 2024 MLR Test Scores

mlrTest <- c(21, 23, 28, 28.5, 30.5, 36, 37.5, 37.5, 40, 46, 46, 46, 46.5, 47.5, 
             48, 49, 49.5, 51, 52, 52.5, 53.5, 54, 55, 55, 56, 56, 58.5, 59, 59)

mlrData <- tibble(mlrScore = mlrTest, mlrPercent = (mlrTest/60 * 100))

# summary statistics
# favstats(~ mlrScore, data = mlrData)
favstats(~ mlrPercent, data = mlrData)
##  min   Q1 median Q3   max  mean    sd  n missing
##   35 62.5     80 90 98.33 75.98 18.39 29       0
# score density (%)
mlrData %>%
  ggplot(aes(x = mlrPercent)) + 
  geom_density() +
  geom_rug()

Extensions of Familiar Tools

Review: Reasons we Randomize

Example (FranticFingers Data)

Scientists Scott and Chen published research that compared the effects of caffeine with those of theobromine (a similar chemical found in chocolate) and with those of a placebo. Their experiment used four human subjects and took place over several days. Each day each subject swallowed a tablet containing one of caffeine, theobromine, or the placebo. Two hours later they were timed while tapping a finger in a specified manner (that they had practiced earlier, to control for learning effects). The response is the number of taps in a fixed time interval.

Discussion Questions: In this scenario, what are the…

data("FranticFingers", package = "Stat2Data")
glimpse(FranticFingers)
## Rows: 12
## Columns: 4
## $ ID   <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
## $ Rate <int> 11, 56, 15, 6, 26, 83, 34, 13, 20, 71, 41, 32
## $ Subj <fct> A, B, C, D, A, B, C, D, A, B, C, D
## $ Drug <fct> Pl, Pl, Pl, Pl, Ca, Ca, Ca, Ca, Th, Th, Th, Th
FranticFingers %>%
  pivot_wider(id_cols = "Drug", names_from = "Subj", values_from = "Rate")
## # A tibble: 3 × 5
##   Drug      A     B     C     D
##   <fct> <int> <int> <int> <int>
## 1 Pl       11    56    15     6
## 2 Ca       26    83    34    13
## 3 Th       20    71    41    32

Additional terminology

Virtues of the Paired t-Test

Two Basic Experimental Designs

Completely Randomized Design (CRD)

Randomized Complete Block (RCB) Design

Sources for Blocks

Experiments vs. Observational

Experiment

If the treatment is assigned at random to the units in each block (a unique treatment for each unit) we call it a randomized complete block (RCB) design.

Example: Twin study with random assignment

Observational

A complete block design for observational data has a unique treatment for each unit of the block but no random assignment.

Example: Five students (blocks) each taking four exams (treatments)

Examples [Practice Rmd]

Frantic Fingers Data

Four subjects were given a placebo, caffeine, or theobromine (in a random order), and a finger tap rate was recorded after each drug.

River Iron Data

Iron concentrations (ppm) were measured in three different (upstream, midstream, and downstream) in four different rivers.

Additional terminology

Two Way ANOVA Model

\[Y = \mu + \alpha_i + \beta_j + \epsilon\]

ANOVA with NO block adjustment

NoBlocks <- aov(Rate ~ Drug, data = FranticFingers)
summary(NoBlocks)
##             Df Sum Sq Mean Sq F value Pr(>F)
## Drug         2    872     436    0.68   0.53
## Residuals    9   5810     646

ANOVA with randomized complete blocks (RCB)

rcbModel <- aov(Rate ~ Drug + Subj, data = FranticFingers)
summary(rcbModel)
##             Df Sum Sq Mean Sq F value Pr(>F)    
## Drug         2    872     436    7.88 0.0210 *  
## Subj         3   5478    1826   33.00 0.0004 ***
## Residuals    6    332      55                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Additional Remarks…

ASSESS Model Conditions

Some obstacles

Handling “Subjects”

Essential R Today

Disclaimer: Intended to recap some of the key R functions used today, but generally will not include everything. The “Notes” and “Practice” Rmd files accompanying class discussions generally include more detail and show how these tools are used in context.

# analysis of variable model
myModel <- aov(Response ~ Treatments + Blocks, data = DataSetName)
summary(myModel)

# important: "Treatments" & "Blocks" must be modeled as a "factor" data type
#       the following alternative can solve simple problems with variable type
myModel <- aov(Response ~ factor(Treatments) + factor(Blocks), data = DataSetName)
summary(myModel)


# Plot of treatment effect after adjusting for blocks
DataSetName %>%
  mutate(GrandMean = mean(Response)) %>%   # overall mean response
  group_by(Block) %>%                      # group by blocks for calculations
  mutate(BlockMean = mean(Response)) %>%   # mean response for each block
  mutate(BlockEffect = BlockMean - GrandMean) %>%
  ungroup() %>%                            # finished grouping by blocks
  gf_point(Response - BlockEffect ~ Treatment, data = . )