Preschoolers flexibly adapt to linguistic input in a noisy channel

Experiment 1

Demographics
Exposure trials
Test trials

Experiment 2

Demographics
Exposure trials
Test trials
Exposure predicts test

Experiment 3

Demographics
Exposure trials
Test trials
Control condition

Experiment 1

We created a paradigm to independently manipulate expectations about speaker plausibility and perceptual noise. We introduced preschoolers (and adults) to either a Plausible or Implausible Speaker who initially uttered unambiguously different sentences like “my cat has three little [kittens/hammers].” Participants were then asked to resolve the intended meaning for ambiguous sentences like the “I had carrots and bees for dinner,” which could either be produced by a perceptual error, or could convey implausible content. If children integrate speaker expectations and channel noise, their interpretations should be a product of both.

Load in data

exp1_child_data <- read_csv("data/exp1_child.csv")
exp1_adult_data <- read_csv("data/exp1_adult.csv")

exp1_data <- bind_rows(exp1_child_data, exp1_adult_data) %>%
  mutate(response = factor(response, levels = c("Implausible", "Plausible")))

Munge data

child_demo_data <- exp1_child_data %>%
  distinct(subject, .keep_all = TRUE) %>%
  group_by(condition) %>%
  summarise(n = n(),
            num_girls = sum(sex == "female"),
            min_age = min(age),
            mean_age = mean(age),
            max_age = max(age))

adult_demo_data <- exp1_adult_data %>%
  distinct(subject, .keep_all = TRUE) %>%
  group_by(condition) %>%
  summarise(n = n())

exp1_group_data <- exp1_data %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(group,condition,trial_type) %>%
  multi_boot_standard("response", na.rm = T)

Demographics

kable(child_demo_data,
      col.names = c("Speaker Condition", "Num Participants",
                    "Num Girls","Min Age","Mean Age","Max. Age"),
      caption = "Child demographic data")

Child demographic data
Speaker Condition	Num Participants	Num Girls	Min Age	Mean Age	Max. Age
Implausible	20	10	4.08	4.74	5.40
Plausible	23	12	4.00	4.63	5.26

kable(adult_demo_data,
      col.names = c("Speaker Condition", "Num Participants"),
      caption = "Adult demographic data")

Adult demographic data
Speaker Condition	Num Participants
Implausible	23
Plausible	27

Exposure trials

exp1_group_exposure <- exp1_data %>%
  filter(trial_type == "Exposure") %>%
  group_by(group) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(model %>% map(function(x) tidy_lmer(x, c("Intercept", "Plausible")))) %>%
  arrange(group)

kable(exp1_group_exposure)

group	term	estimate	std. error	$z$ value	$p$ value	stars
adult	Intercept	-3.81	0.769	-4.95	< .001	***
adult	Plausible	9.41	1.613	5.83	< .001	***
child	Intercept	-1.91	0.342	-5.57	< .001	***
child	Plausible	4.54	0.548	8.28	< .001	***

exp1_exposure_es <- exp1_data %>%
  filter(trial_type == "Exposure") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(group, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$group) %>%
  map(mes_df) %>%
  bind_rows(.id = "group")

kable(exp1_exposure_es)

group	d	ci_lower	ci_upper	condition
adult	24.30	19.4	29.22	Plausible
child	5.56	4.2	6.92	Plausible

exp1_exposure_lm <- glmer(response ~ group * condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp1_data,trial_type == "Exposure"))


kable(tidy_lmer(exp1_exposure_lm, c("Intercept", "Children",  "Plausible", 
                                "Children x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-3.86	0.542	-7.13	< .001	***
Children	1.97	0.590	3.33	0.001	***
Plausible	9.52	1.210	7.87	< .001	***
Children x Plausible	-5.01	1.217	-4.12	< .001	***

Verify Wald Z-test p values using log-likelihood to compare the interaction model to the simple effects model.

exp1_exposure_lm_simple <- glmer(response ~ group + condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp1_data,trial_type == "Exposure"))

kable(tidy_anova(anova(exp1_exposure_lm_simple, exp1_exposure_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp1_exposure_lm_simple	5	339	362	-164	329
exp1_exposure_lm	6	315	343	-152	303	25.741	1	< .001	***

ggplot(filter(exp1_group_data,trial_type == "Exposure"), 
       aes(x=condition, y=mean, fill=group)) +
  facet_grid(. ~ group) +
  geom_bar(stat="identity",position=position_dodge(1))+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  scale_fill_manual(values = colors) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nSpeaker Condition")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

Thus, both children and adults were sensitive to the speaker manipulation during Exposure trials, selecting the appropriate referent whether or not the request was implausible, although adults selected the correct referent more often in both conditions.

Test trials

exp1_group_test <- exp1_data %>%
  filter(trial_type == "Test") %>%
  group_by(group) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(model %>% map(function(x) tidy_lmer(x, c("Intercept", "Plausible")))) %>%
  arrange(group)

kable(exp1_group_test)

group	term	estimate	std. error	$z$ value	$p$ value	stars
adult	Intercept	-4.185	1.013	-4.13	< .001	***
adult	Plausible	3.112	0.874	3.56	< .001	***
child	Intercept	-0.464	0.323	-1.44	0.151
child	Plausible	1.099	0.311	3.53	< .001	***

exp1_test_es <- exp1_data %>%
  filter(trial_type == "Test") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(group, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$group) %>%
  map(mes_df) %>%
  bind_rows(.id = "group")

kable(exp1_test_es)

group	d	ci_lower	ci_upper	condition
adult	1.09	0.48	1.70	Plausible
child	1.10	0.44	1.76	Plausible

exp1_test_lm <- glmer(response ~ group * condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp1_data,trial_type == "Test"))



kable(tidy_lmer(exp1_test_lm, c("Intercept", "Children", "Plausible", 
                            "Children x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-3.15	0.606	-5.19	< .001	***
Children	2.62	0.792	3.31	0.001	***
Plausible	2.33	0.530	4.40	< .001	***
Children x Plausible	-1.05	0.693	-1.51	0.130

Verify Wald Z-test p values using log-likelihood to compare the interaction model to the simple effects model.

exp1_test_lm_simple <- glmer(response ~ group + condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp1_data,trial_type == "Test"))

kable(tidy_anova(anova(exp1_test_lm_simple, exp1_test_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp1_test_lm_simple	5	690	713	-340	680
exp1_test_lm	6	690	717	-339	678	2.341	1	0.126

#quartz(width=6,height=4)
ggplot(filter(exp1_group_data,trial_type == "Test"), 
       aes(x=condition, y=mean, fill=group)) +
  facet_grid(. ~ group) +
  geom_bar(stat="identity",position=position_dodge(1))+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  scale_fill_manual(values = colors) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nSpeaker Condition")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

Thus, children and adults, to the same degree, were more likely to select the plausible referent on ambiguous Test trials when the speaker had previously referred to plausible referent on unambiguous Exposure trials. When children and adults were exposed to a speaker who was likely to produce semantically implausible utterances (e.g. “my cat has three little hammers”), they were more likely to interpret ambiguous utterances literally instead of error-correcting to a more semantically plausible alternative. Intriguingly, the size of this adaptation was nearly identical in both groups, suggesting that 4- and 5-year-olds are already adapting as rapidly as adults. Children were, however, more likely overall to pick the plausible referent during ambiguous test trials, suggesting that they generally rely more on their expectations than do adults. ***

Experiment 2

Experiment 2 replicates Experiment 1 in a larger and developmentally-broader sample of children. We ask two related questions: (1) Does the use of speaker-expectations increase over development, and (2) if so, is due to improving abilities to form these expectations or to bring them bear in processing ambiguous utterances.

Load data

exp2_data <- read_csv("data/exp2.csv") %>%
  mutate(response = factor(response, levels = c("Implausible", "Plausible")),
         age_group = floor(age))

Munge data

exp2_indiv_data <- exp2_data %>%
  distinct(subject, .keep_all = TRUE)
  
exp2_low_english <- exp2_indiv_data %>%
  filter(english <= 25) %>%
  nrow()

exp2_demo_data <- exp2_indiv_data %>%
  filter(english > 25) %>%
  group_by(condition, age_group) %>%
  summarise(n = n(),
            num_girls = sum(sex == "female"),
            min_age = min(age),
            mean_age = mean(age),
            max_age = max(age))

exp2_group_data <- exp2_data %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(age_group, condition, trial_type) %>%
  multi_boot_standard("response", na.rm = T)

Demographics

kable(exp2_demo_data,
      col.names = c("Speaker Condition", "Age Group", "Num Participants",
                    "Num Girls","Min Age","Mean Age","Max. Age"),
      caption = "Child demographic data")

Child demographic data
Speaker Condition	Age Group	Num Participants	Num Girls	Min Age	Mean Age	Max. Age
Implausible	3	21	8	3.02	3.49	3.93
Implausible	4	28	23	4.00	4.51	4.94
Implausible	5	20	12	5.05	5.48	5.90
Plausible	3	27	16	3.00	3.50	3.93
Plausible	4	21	7	4.22	4.60	4.97
Plausible	5	23	19	5.01	5.49	5.95

Exposure trials

exp2_age_group_exposure <- exp2_data %>%
  filter(trial_type == "Exposure") %>%
  group_by(age_group) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(model %>% map(function(x) tidy_lmer(x, c("Intercept", "Plausible")))) %>%
  arrange(age_group)

kable(exp2_age_group_exposure)

age_group	term	estimate	std. error	$z$ value	$p$ value	stars
3	Intercept	-0.500	0.279	-1.79	0.073	.
3	Plausible	1.434	0.386	3.71	< .001	***
4	Intercept	-0.943	0.271	-3.48	< .001	***
4	Plausible	3.101	0.467	6.63	< .001	***
5	Intercept	-1.915	0.385	-4.97	< .001	***
5	Plausible	5.550	0.751	7.39	< .001	***

exp2_exposure_es <- exp2_data %>%
  filter(trial_type == "Exposure") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(age_group, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$age_group) %>%
  map(mes_df) %>%
  bind_rows(.id = "age_group")

kable(exp2_exposure_es)

age_group	d	ci_lower	ci_upper	condition
3	1.20	0.61	1.80	Plausible
4	2.66	1.87	3.44	Plausible
5	8.27	6.37	10.17	Plausible

exp2_exposure_lm <- glmer(response ~ age * condition + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp2_data,trial_type == "Exposure"))

kable(tidy_lmer(exp2_exposure_lm, c("Intercept", "Age",  "Plausible", 
                                "Age x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	2.194	0.788	2.78	0.005	**
Age	-0.737	0.176	-4.18	< .001	***
Plausible	-6.157	1.181	-5.21	< .001	***
Age x Plausible	2.116	0.278	7.61	< .001	***

Verify Wald Z-test p values using log-likelihood to compare the interaction model to the simple effects model. Also, because there are significant sex-imbalances across conditions, check to see whether this affects the Exposure results.

exp2_exposure_lm_simple <- glmer(response ~ age + condition + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp2_data,trial_type == "Exposure"))


kable(tidy_anova(anova(exp2_exposure_lm_simple, exp2_exposure_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_exposure_lm_simple	5	1196	1222	-593	1186
exp2_exposure_lm	6	1140	1171	-564	1128	58.390	1	< .001	***

exp2_exposure_lm_sex_simple <- glmer(response ~ age * condition + sex + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp2_data,trial_type == "Exposure"))

exp2_exposure_lm_sex_interaction <- glmer(response ~ age * condition * sex + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp2_data,trial_type == "Exposure"),
                          control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_lmer(exp2_exposure_lm_sex_simple, c("Intercept", "Age",  "Plausible", 
                                        "Male", "Age x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	2.237	0.800	2.796	0.005	**
Age	-0.741	0.177	-4.190	< .001	***
Plausible	-6.158	1.181	-5.215	< .001	***
Male	-0.068	0.215	-0.314	0.754
Age x Plausible	2.117	0.278	7.606	< .001	***

kable(tidy_lmer(exp2_exposure_lm_sex_interaction, c("Intercept", "Age",  "Plausible", 
                                        "Male", "Age x Plausible", "Age x Male", 
                                        "Plausible x Male", "Age x Plausible x Male")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	2.871	1.048	2.741	0.006	**
Age	-0.920	0.235	-3.919	< .001	***
Plausible	-6.488	1.543	-4.204	< .001	***
Male	-1.681	1.517	-1.108	0.268
Age x Plausible	2.282	0.363	6.284	< .001	***
Age x Male	0.470	0.345	1.362	0.173
Plausible x Male	1.123	2.291	0.490	0.624
Age x Plausible x Male	-0.482	0.544	-0.885	0.376

kable(tidy_anova(anova(exp2_exposure_lm_sex_simple, 
                       exp2_exposure_lm_sex_interaction)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_exposure_lm_sex_simple	7	1142	1177	-564	1128
exp2_exposure_lm_sex_interaction	10	1141	1192	-561	1121	6.870	3	0.076	.

kable(tidy_anova(anova(exp2_exposure_lm_sex_simple, exp2_exposure_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_exposure_lm	6	1140	1171	-564	1128
exp2_exposure_lm_sex_simple	7	1142	1177	-564	1128	0.099	1	0.754

kable(tidy_anova(anova(exp2_exposure_lm_sex_interaction, exp2_exposure_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_exposure_lm	6	1140	1171	-564	1128
exp2_exposure_lm_sex_interaction	10	1141	1192	-561	1121	6.968	4	0.138

ggplot(filter(exp2_group_data,trial_type == "Exposure"), 
       aes(x=condition, y=mean, fill=as.factor(age_group))) +
  facet_grid(. ~ age_group) +
  geom_bar(stat="identity",position=position_dodge(1))+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  scale_fill_manual(values = colors) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nSpeaker Condition")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

Older children showed a greater sensitivity to the speaker on the unambiguous Exposure trials.

Test trials

exp2_age_group_test <- exp2_data %>%
  filter(trial_type == "Test") %>%
  group_by(age_group) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(model %>% map(function(x) tidy_lmer(x, c("Intercept", "Plausible")))) %>%
  arrange(age_group)

kable(exp2_age_group_test)

age_group	term	estimate	std. error	$z$ value	$p$ value	stars
3	Intercept	-0.026	0.202	-0.128	0.898
3	Plausible	0.055	0.261	0.212	0.832
4	Intercept	-0.424	0.284	-1.494	0.135
4	Plausible	1.006	0.350	2.872	0.004	**
5	Intercept	-1.089	0.438	-2.486	0.013	*
5	Plausible	1.355	0.342	3.963	< .001	***

exp2_test_es <- exp2_data %>%
  filter(trial_type == "Test") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(age_group, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$age_group) %>%
  map(mes_df) %>%
  bind_rows(.id = "age_group")

kable(exp2_test_es)

age_group	d	ci_lower	ci_upper	condition
3	0.06	-0.49	0.60	Plausible
4	0.82	0.22	1.42	Plausible
5	1.24	0.57	1.91	Plausible

exp2_test_lm <- glmer(response ~ age * condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp2_data,trial_type == "Test"))

kable(tidy_lmer(exp2_test_lm, c("Intercept", "Age", "Plausible", 
                            "Age x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	1.471	0.725	2.03	0.042	*
Age	-0.430	0.156	-2.75	0.006	**
Plausible	-1.708	0.937	-1.82	0.068	.
Age x Plausible	0.538	0.209	2.58	0.010	**

exp2_test_lm_simple <- glmer(response ~ age + condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp2_data,trial_type == "Test"))

kable(tidy_anova(anova(exp2_test_lm_simple, exp2_test_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_test_lm_simple	5	1365	1389	-677	1355
exp2_test_lm	6	1360	1390	-674	1348	6.561	1	0.010	*

exp2_test_lm_sex_simple <- glmer(response ~ age * condition + sex +
                                   (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp2_data,trial_type == "Test"))

kable(tidy_lmer(exp2_test_lm_sex_simple, c("Intercept", "Age",  "Plausible", 
                                        "Male", "Age x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	1.251	0.727	1.72	0.085	.
Age	-0.407	0.155	-2.63	0.009	**
Plausible	-1.730	0.927	-1.86	0.062	.
Male	0.318	0.181	1.76	0.078	.
Age x Plausible	0.540	0.206	2.61	0.009	**

exp2_test_lm_sex_interaction <- glmer(response ~ age * condition * sex + 
                                        (1|word) + (1|subject), 
                          family = "binomial",
                          data = filter(exp2_data,trial_type == "Test"),
                          control = glmerControl(optimizer = "bobyqa", 
                                                 optCtrl = list(maxfun = 2e5)))

kable(tidy_lmer(exp2_test_lm_sex_interaction, c("Intercept", "Age",  "Plausible", 
                                        "Male", "Age x Plausible",
                                    "Age x Male", "Plausible x Male", 
                                    "Age x Plausible x Male")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	1.302	0.944	1.380	0.168
Age	-0.419	0.203	-2.061	0.039	*
Plausible	-1.460	1.216	-1.200	0.230
Male	0.192	1.396	0.138	0.890
Age x Plausible	0.482	0.266	1.812	0.070	.
Age x Male	0.031	0.313	0.099	0.921
Plausible x Male	-0.742	1.895	-0.391	0.695
Age x Plausible x Male	0.166	0.426	0.390	0.696

kable(tidy_anova(anova(exp2_test_lm_sex_simple, exp2_test_lm_sex_interaction)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_test_lm_sex_simple	7	1359	1394	-673	1345
exp2_test_lm_sex_interaction	10	1365	1414	-672	1345	0.476	3	0.924

kable(tidy_anova(anova(exp2_test_lm_sex_simple, exp2_test_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_test_lm	6	1360	1390	-674	1348
exp2_test_lm_sex_simple	7	1359	1394	-673	1345	3.078	1	0.079	.

kable(tidy_anova(anova(exp2_test_lm_sex_interaction, exp2_test_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_test_lm	6	1360	1390	-674	1348
exp2_test_lm_sex_interaction	10	1365	1414	-672	1345	3.554	4	0.470

#quartz(width=8,height=4)
ggplot(filter(exp2_group_data, trial_type == "Test"), 
       aes(x = condition, y=mean, fill=as.factor(age_group))) +
  facet_grid(. ~ age_group, 
             labeller=as_labeller(function(value) sprintf("%s-years", value))) +
  geom_bar(stat="identity",position=position_dodge(1))+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  scale_fill_manual(values = colors) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nSpeaker Condition")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

In line with previous work, these results show that 3-year-old children have trouble using top-down speaker expectations when processing ambiguous utterances. However, children appear to improve significantly over the next two years.

Older children were thus more sensitive to the speaker’s utterances on unambiguous Exposure trials, and relied more on their speaker-expectations on ambiguous Test trials. Did older children rely more on their speaker-expectations because they had built stronger expectations on Exposure trial? If so, individual differences in children’s performance on Exposure trials should explain away the effect of age on Test trials. In contrast, if the ability to leverage these expectations is improving over development, age should predict additional variance in ambiguous Test trial responses over and above children’s responses on Exposure trials.

To answer this question, we fit an additional model including the proportion of Exposure trials on which individual children had selected the plausible referent. We assume that children who more frequently selected the plausible referent in the Plausible referent in the Plausible Speaker condition, or less frequently selected the plausible referent in the Implausible Speaker condition were encoding more information about the Speaker’s plausibility.

Exposure predicts test

exp2_bykid_data <- exp2_data %>%
  filter(trial_type == "Exposure") %>%
  group_by(age, age_group, condition, subject) %>%
  summarise(Exposure = mean(response == "Plausible")) %>%
  left_join(filter(exp2_data, trial_type == "Test"))

exp2_bykid_lmer <- glmer(response ~  age + condition * Exposure + (1|subject) +
                      (1|word), family = "binomial", data = exp2_bykid_data,
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_lmer(exp2_bykid_lmer, c("Intercept", "Age", "Plausible", 
                              "Exposure", "Plausible x Exposure")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-1.577	0.647	-2.435	0.015	*
Age	0.066	0.117	0.564	0.573
Plausible	1.917	0.566	3.385	0.001	***
Exposure	2.840	0.618	4.595	< .001	***
Plausible x Exposure	-3.323	0.913	-3.638	< .001	***

Compare this model to both simpler and more complex models. Also show that the same results hold when only the 4- and 5-year-olds kids were analyzed.

exp2_bykid_lmer_simple <- glmer(response ~  age + condition + Exposure + (1|subject) +
                      (1|word), family = "binomial", data = exp2_bykid_data,
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_anova(anova(exp2_bykid_lmer_simple, exp2_bykid_lmer)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_bykid_lmer_simple	6	1359	1389	-673	1347
exp2_bykid_lmer	7	1348	1382	-667	1334	13.005	1	< .001	***

exp2_bykid_lmer_3way <- glmer(response ~  age * condition * Exposure + (1|subject) +
                      (1|word), family = "binomial", data = exp2_bykid_data,
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_anova(anova(exp2_bykid_lmer_3way, exp2_bykid_lmer)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_bykid_lmer	7	1348	1382	-667	1334
exp2_bykid_lmer_3way	10	1350	1399	-665	1330	3.978	3	0.264

exp2_bykid_lmer_2way_other1 <- glmer(response ~  age * condition + Exposure +
                                       (1|subject) + (1|word), 
                                     family = "binomial", data = exp2_bykid_data,
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_anova(anova(exp2_bykid_lmer_2way_other1, exp2_bykid_lmer)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_bykid_lmer_2way_other1	7	1359	1394	-673	1345
exp2_bykid_lmer	7	1348	1382	-667	1334	11.335	0	< .001	***

exp2_bykid_lmer_2way_other2 <- glmer(response ~  age * Exposure + condition +
                                       (1|subject) + (1|word), 
                                     family = "binomial", data = exp2_bykid_data,
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_anova(anova(exp2_bykid_lmer_2way_other2, exp2_bykid_lmer)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp2_bykid_lmer_2way_other2	7	1357	1391	-671	1343
exp2_bykid_lmer	7	1348	1382	-667	1334	8.989	0	< .001	***

exp2_bykid_lmer_younger <- glmer(response ~  age + condition * Exposure + (1|subject) +
                      (1|word), family = "binomial", 
                      data = filter(exp2_bykid_data, age >= 4),
                    control=glmerControl(optimizer="bobyqa",
                                         optCtrl=list(maxfun=2e5)))

kable(tidy_lmer(exp2_bykid_lmer_younger, c("Intercept", "Age", "Plausible", 
                              "Exposure", "Plausible x Exposure")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-1.237	1.230	-1.01	0.315
Age	-0.056	0.226	-0.25	0.803
Plausible	3.835	1.302	2.94	0.003	**
Exposure	3.351	0.781	4.29	< .001	***
Plausible x Exposure	-5.426	1.647	-3.29	0.001	***

exp2_cor_data <- exp2_bykid_data %>%
  group_by(age, Exposure, condition, subject) %>%
  summarise(Test = mean(response == "Plausible")) 

ggplot(aes(x = Exposure, y = Test, color = age), data = exp2_cor_data) +
  facet_grid(~ condition) + 
  geom_jitter() + 
  geom_smooth(method = "lm") +
  theme_bw(base_size=14) +
  theme(panel.grid=element_blank()) +
  scale_x_continuous(name = "\nExposure Proportion plausible",
                   limits = c(-.1, 1.1), breaks = seq(0, 1, .2)) +
  scale_y_continuous(name = "Test Proportion plausible",
                     limits=c(-.1, 1.1), breaks = seq(0, 1, .2))

Thus, it appears that older children relied more on speaker expectations because they had formed stronger expectations rather than because they rely on expectations differently. Experiments 1 and 2 thus show that children’s reliance on speaker-expectations remains relatively constant across the 3–6 year range, but that their ability to build these expectations improves gradually across development.

Experiment 3

Experiment 3 tests a second prediction of noisy channel processing: As speech becomes noisier, and thus less reliable, children should rely more on their expectations.

Load data

exp3_data <- read_csv("data/exp3.csv") %>%
  mutate(response = factor(response, levels = c("Implausible", "Plausible")),
         condition = factor(condition, levels = c("Implausible", "Plausible", "Control"),
                            labels = c("Implausible", "Plausible", 
                                       "Implausible (Control)")))

Munge data

exp3_demo_data <- exp3_data %>%
  distinct(subject, .keep_all = TRUE) %>%
  group_by(condition, noise) %>%
  summarise(n = n(),
            num_girls = sum(sex == "female", na.rm = T),
            min_age = min(age),
            mean_age = mean(age),
            max_age = max(age))

exp3_group_data <- exp3_data %>%
  mutate(response = as.numeric(response)-1) %>%
  group_by(condition,noise,trial_type) %>%
  multi_boot_standard("response", na.rm = T)

Demographics

kable(exp3_demo_data,
      col.names = c("Speaker Condition", "Noise Level", "Num Participants",
                    "Num Girls","Min Age","Mean Age","Max. Age"))

Speaker Condition	Noise Level	Num Participants	Num Girls	Min Age	Mean Age	Max. Age
Implausible	No Noise	20	12	4.00	4.98	5.83
Implausible	Noisy	24	11	4.01	5.01	5.92
Plausible	No Noise	21	8	4.10	4.89	5.93
Plausible	Noisy	26	12	4.02	4.98	5.94
Implausible (Control)	Noisy	20	11	4.15	4.96	5.91

Exposure trials

exp3_group_exposure <- exp3_data %>%
  filter(trial_type == "Exposure") %>%
  group_by(noise) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(map2(model, list(c("Intercept", "Plausible", "Implausible (Control)"), 
                       c("Intercept", "Plausible")), 
              function(x,y) tidy_lmer(x, y))) %>%
  arrange(noise)


kable(exp3_group_exposure)

noise	term	estimate	std. error	$z$ value	$p$ value	stars
No Noise	Intercept	-2.882	0.578	-4.987	< .001	***
No Noise	Plausible	6.922	1.126	6.148	< .001	***
Noisy	Intercept	-1.890	0.358	-5.277	< .001	***
Noisy	Plausible	4.906	0.594	8.255	< .001	***
Noisy	Implausible (Control)	0.351	0.432	0.813	0.416

exp3_exposure_es <- exp3_data %>%
  filter(trial_type == "Exposure") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(noise, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$noise) %>%
  map(mes_df) %>%
  bind_rows(.id = "noise")

kable(exp3_exposure_es)

noise	d	ci_lower	ci_upper	condition
No Noise	7.94	6.05	9.82	Plausible
Noisy	4.46	3.40	5.52	Plausible
Noisy	0.21	-0.40	0.83	Implausible (Control)

exp3_exposure_lm <- glmer(response ~ noise * condition + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp3_data,trial_type == "Exposure"))

kable(tidy_lmer(exp3_exposure_lm, c("Intercept", "Noisy", "Plausible",
                                    "Implausible (Control)", 
                                    "Plausible x Noisy")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-2.750	0.437	-6.297	< .001	***
Noisy	0.863	0.495	1.743	0.081	.
Plausible	6.622	0.746	8.878	< .001	***
Implausible (Control)	0.355	0.442	0.803	0.422
Plausible x Noisy	-1.707	0.793	-2.153	0.031	*

Verify Wald Z-test p values using log-likelihood to compare the interaction model to the simple effects model.

exp3_exposure_lm_simple <- glmer(response ~ noise + condition + (1|word) +
                            (1|subject), family = "binomial",
                          data = filter(exp3_data,trial_type == "Exposure"))

anova(exp3_exposure_lm_simple, exp3_exposure_lm)

## Data: filter(exp3_data, trial_type == "Exposure")
## Models:
## exp3_exposure_lm_simple: response ~ noise + condition + (1 | word) + (1 | subject)
## exp3_exposure_lm: response ~ noise * condition + (1 | word) + (1 | subject)
##                         Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
## exp3_exposure_lm_simple  6 604 633   -296      592                        
## exp3_exposure_lm         7 602 635   -294      588   4.8      1      0.028
##                          
## exp3_exposure_lm_simple  
## exp3_exposure_lm        *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

ggplot(filter(exp3_group_data,trial_type == "Exposure"), 
       aes(x = condition, y = mean)) +
  facet_grid(. ~ noise) +
  geom_bar(stat="identity", position=position_dodge(1), fill = colors[2])+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nSpeaker Condition")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

Compared to the Implausible Speaker condition, children exposed to the Plausible Speaker were more likely to pick the plausible referent on Exposure trials. Children exposed to the Control speaker did not perform differently on Exposure trials from those exposed to the Implausible speaker, as predicted. Further, the model showed a marginal effect of Noise level and a significant interaction between Noise level and Speaker, indicating that the addition of noise moved children in both conditions closer to chance.

Test trials

exp3_group_test <- exp3_data %>%
  filter(trial_type == "Test") %>%
  group_by(noise) %>%
  nest() %>%
  mutate(model = map(data, ~ glmer(response ~ condition + (1|word) +
                            (1|subject), family = "binomial", data = .))) %>%
  unnest(map2(model, list(c("Intercept", "Plausible", "Implausible (Control)"), 
                       c("Intercept", "Plausible")), 
              function(x,y) tidy_lmer(x, y))) %>%
  arrange(noise)


kable(exp3_group_test)

noise	term	estimate	std. error	$z$ value	$p$ value	stars
No Noise	Intercept	-1.288	0.352	-3.65	< .001	***
No Noise	Plausible	1.172	0.315	3.72	< .001	***
Noisy	Intercept	-0.515	0.335	-1.54	0.125
Noisy	Plausible	1.570	0.379	4.14	< .001	***
Noisy	Implausible (Control)	2.628	0.529	4.97	< .001	***

exp3_test_es <- exp3_data %>%
  filter(trial_type == "Test") %>%
  mutate(response = as.numeric(response) - 1) %>%
  group_by(noise, condition, subject) %>%
  summarise(response = mean(response)) %>%
  group_by(condition) %>%
  split(.$noise) %>%
  map(mes_df) %>%
  bind_rows(.id = "noise")

kable(exp3_test_es)

noise	d	ci_lower	ci_upper	condition
No Noise	1.17	0.49	1.85	Plausible
Noisy	1.17	0.55	1.78	Plausible
Noisy	1.93	1.19	2.67	Implausible (Control)

exp3_test_lm <- glmer(response ~ noise * condition + (1|word) + (1|subject), 
                      family = "binomial",
                      data = filter(exp3_data,trial_type == "Test",
                                    condition != "Implausible (Control)"))


kable(tidy_lmer(exp3_test_lm, c("Intercept", "Noisy", 
                            "Plausible",
                            "Noisy x Plausible")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	-1.313	0.352	-3.732	< .001	***
Noisy	0.810	0.366	2.211	0.027	*
Plausible	1.201	0.376	3.198	0.001	**
Noisy x Plausible	0.329	0.504	0.653	0.514

Verify Wald Z-test p values using log-likelihood to compare the interaction model to the simple effects model.

exp3_test_lm_simple <- glmer(response ~ noise + condition + (1|word) + (1|subject), 
                      family = "binomial",
                      data = filter(exp3_data,trial_type == "Test",
                                    condition != "Implausible (Control)"))

kable(tidy_anova(anova(exp3_test_lm_simple, exp3_test_lm)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp3_test_lm_simple	5	882	905	-436	872
exp3_test_lm	6	884	912	-436	872	0.422	1	0.516

plotting_test_data <- filter(exp3_group_data, trial_type == "Test") %>%
  mutate(width = ifelse(condition == "Implausible (Control)", .4, .9)) %>%
  mutate(test_type = ifelse(condition == "Implausible (Control)", 
                            "Plausible", "Implausible"))

#quartz(width=7.5,height=4)
ggplot(plotting_test_data, 
       aes(x=noise, y=mean, fill=test_type)) +
  facet_grid(. ~ condition, scales = "free_x") +
  geom_bar(aes(width = width),
           stat="identity", position=position_dodge(1))+
  geom_linerange(aes(ymin = ci_lower,
                      ymax = ci_upper),
                  size = .8,
                  show.legend = FALSE,
                 position=position_dodge(1)) +
  scale_fill_manual(values = colors[2:3]) +
  geom_hline(aes(yintercept=.5),lty=2)+
  theme_bw(base_size=14) +
  theme(legend.position="none", panel.grid=element_blank()) +
  scale_x_discrete(name = "\nNoise Level")+
  scale_y_continuous(name = "Proportion Choosing Plausible",
                     limits=c(0,1))

As predicted, children showed sensitivity to both speaker reliability and acoustic noise. Children selected the plausible referent at Test, correcting the error in their acoustic input, more often when the speaker had said plausible things on Exposure trials. In addition, regardless of Speaker plausibility, children selected the plausible referent more frequently when the acoustic input was noisy. To quantify this pattern, we again fit a mixed-effects regression predicting choice on test trials from Speaker type and Noise level as well as their interaction. As predicted, both main effects were significant, but their interaction was not

Finally, one alternative explanation for the difference between Speaker conditions is that children simply followed their expectations at all times, e.g., that those exposed to the Implausible speaker chose “silly” responses regardless of the question. To test this alternative, we asked whether children who responded to an Implausible speaker on Exposure trials always chose the implausible referent on Test trials even when the speaker referred to the plausible referent (Implausible Control condition).

Control condition

control_data <- exp3_data %>%
  filter(trial_type == "Test", noise == "Noisy") %>%
  mutate(condition = factor(condition, levels = c("Plausible", "Implausible", 
                                                  "Implausible (Control)")))
  
exp3_control_lm <-  glmer(response ~ condition + (1|word) + (1|subject), 
                          family = "binomial",
                          data = control_data)

kable(tidy_lmer(exp3_control_lm, c("Intercept", 
                               "Implausible", 
                               "Implausible (Control)")))

term	estimate	std. error	$z$ value	$p$ value	stars
Intercept	1.05	0.334	3.15	0.002	**
Implausible	-1.57	0.379	-4.14	< .001	***
Implausible (Control)	1.06	0.516	2.05	0.040	*

exp3_control_lm_simple <-  glmer(response ~  (1|word) + (1|subject), 
                          family = "binomial",
                          data = control_data)

kable(tidy_anova(anova(exp3_control_lm, exp3_control_lm_simple)))

term	df	AIC	BIC	logLik	deviance	$\chi^{2}$ value	Chi.Df	$p$ value	stars
exp3_control_lm_simple	3	647	660	-320	641
exp3_control_lm	5	622	644	-306	612	28.444	2	< .001	***

Thus, children who were asked for the plausible referent at Test selected it, even when the speaker had previously always referred to the implausible referent. This Control condition provides further evidence that children were attending to and responding to the acoustic input from the speaker on Test trials, integrating it with their prior expectations

Preschoolers flexibly adapt to linguistic input in a noisy channel

Daniel Yurovsky, Sarah Case, & Michael C. Frank

2016-08-17

Experiment 1

Demographics

Exposure trials

Test trials

Experiment 2

Demographics

Exposure trials

Test trials

Exposure predicts test

Experiment 3

Demographics

Exposure trials

Test trials

Control condition