Simple choices (e.g., eating an apple vs. an orange) are made by integrating noisy evidence that is sampled over time and influenced by visual attention; as a result, fluctuations in visual attention can affect choices. But what determines what is fixated and when? To address this question, we model the decision process for simple choice as an information sampling problem, and approximate the optimal sampling policy. We find that it is optimal to sample from options whose value estimates are both high and uncertain. Furthermore, the optimal policy provides a reasonable account of fixations and choices in binary and trinary simple choice, as well as the differences between the two cases. Overall, the results show that the fixation process during simple choice is influenced dynamically by the value estimates computed during the decision process, in a manner consistent with optimal information sampling.
As illustrated in Fig 1, we model attention by assuming that the DM can only sample from one item at each time point, the item she is fixating on. This sets up a fundamental problem: How should she allocate fixations in order to make good decisions without incurring too much cost? Specifically, at each time point, the DM must decide whether to select an option or continue sampling, and in the latter case, she must also decide which item to sample from. Importantly, she cannot simply allocate her attention to the item with the highest true value because she does not know the true values. Rather, she must decide which item to attend to based on her current value estimates and their uncertainty.
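We model the control of attention as the selection of cognitive operations, c_t, that specify either an item to sample or the termination of sampling. If the DM wishes to sample from item c at time step t, she selects c_t = c and receives a signal

$$x_t \sim \mathrm{Normal}\left(u(c),\ \sigma_x^2\right) \tag{1}$$

where u(c) is the unknown true value of the item being sampled, and σx is a free parameter specifying the amount of noise in each signal. The belief state is then updated in accordance with Bayesian inference. Writing the belief about item c as a Gaussian with mean μ_t(c) and precision λ_t(c), the update is

$$\lambda_{t+1}(c) = \lambda_t(c) + \sigma_x^{-2}, \qquad \mu_{t+1}(c) = \frac{\sigma_x^{-2}\, x_t + \lambda_t(c)\, \mu_t(c)}{\lambda_{t+1}(c)} \tag{2}$$

with the beliefs about the items that were not sampled left unchanged. The cognitive cost of each step of sampling and updating is given by a free parameter, γsample. We additionally impose a switching cost, γswitch, that the DM incurs whenever she samples from an item other than the one sampled on the last time step (i.e., makes a saccade to a different item). Thus, the cost of sampling is

$$\mathrm{cost}(c_t) = \gamma_{\mathrm{sample}} + \mathbb{1}\left(c_t \neq c_{t-1}\right)\gamma_{\mathrm{switch}} \tag{3}$$

Note that the model includes the special case in which there are no switching costs (γswitch = 0).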
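The sampling step, belief update, and cost described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the belief about each item is a Gaussian stored as a mean `mu` and a precision `lam`, and it assumes that no switching cost is charged on the very first sample (`prev_c is None`). The function and argument names are hypothetical.

```python
import numpy as np

def sample_and_update(mu, lam, c, true_value, sigma_x, rng):
    """Draw one noisy signal from item c and apply a Gaussian conjugate update.

    mu, lam: arrays of posterior means and precisions (assumed representation).
    Returns updated copies; beliefs about unsampled items are left unchanged.
    """
    x = rng.normal(true_value[c], sigma_x)  # signal ~ Normal(u(c), sigma_x^2)
    obs_precision = sigma_x ** -2           # precision contributed by one sample
    new_mu, new_lam = mu.copy(), lam.copy()
    new_lam[c] = lam[c] + obs_precision
    new_mu[c] = (obs_precision * x + lam[c] * mu[c]) / new_lam[c]
    return new_mu, new_lam

def sampling_cost(c, prev_c, gamma_sample, gamma_switch):
    """Cost of one sample; the switch cost applies when the sampled item changes.

    Assumption: the first sample (prev_c is None) incurs no switching cost.
    """
    switched = prev_c is not None and c != prev_c
    return gamma_sample + (gamma_switch if switched else 0.0)
```

Setting `gamma_switch = 0` recovers the special case without switching costs.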
The model has five free parameters: the standard deviation of the sampling distribution σx, the cost per sample γsample, the cost of switching attention γswitch, the prior bias α, and the inverse temperature of the softmax policy used to select cognitive operations, β. This last parameter controls the amount of noise in the fixation decisions. In order to fit the model, we need to make an assumption about the time that it takes to acquire each sample, which we take to be 100 ms. Note, however, that this choice is not important: changing the assumed duration leads to a change in the fitted parameters, but not in the qualitative model predictions.
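To make the role of β concrete, the following is a minimal sketch of a softmax rule over cognitive operations. The `values` argument is a hypothetical vector of scores, one per operation (each item that could be sampled, plus termination); how those scores are computed is not specified in this section, so they are treated here as given.

```python
import numpy as np

def softmax_policy(values, beta):
    """Selection probabilities under a softmax with inverse temperature beta.

    values: hypothetical score for each cognitive operation (assumed input).
    """
    z = beta * np.asarray(values, dtype=float)
    z -= z.max()      # subtract the maximum for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

With β = 0 every operation is equally likely, and as β grows the policy concentrates on the highest-scoring operation, which is the sense in which β controls the noise in fixation decisions.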
We fit these parameters to choice and fixation data using an approximate maximum likelihood method, which is described in the Methods section. Importantly, since the same model can be applied to N-item choices, we fit a common set of parameters jointly to the pooled data in both datasets. Thus, any differences in model predictions between binary and trinary choices are a priori predictions resulting from the structure of the model, and not differences in the parameters used to explain the two types of choices. We estimate the parameters using only the even trials, and then simulate the model in odd trials in order to compare the model predictions with the observed patterns out-of-sample. Because the policy optimization and likelihood estimation methods that we use are stochastic, we display simulations using the 30 top performing parameter configurations to give a sense of the uncertainty in the predictions. The parameter estimates were (mean ± std) σx = 2.60 ± 0.216, α = 0.581 ± 0.118, γswitch = 0.00995 ± 0.001, γsample = 0.00373 ± 0.001, and β = 364 ± 81.2. As explained in the Methods, the units of these parameter estimates are standard deviations of value.
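The even/odd split and the mean ± std summary over the best configurations could be computed along these lines. This is an illustrative sketch only: the helper names are hypothetical, trials are assumed to be indexed from 0 (the text does not say whether "even" is 0-based or 1-based), and the standard deviation is assumed to be the population form (`ddof=0`).

```python
import numpy as np

def split_even_odd(trials):
    """Split trials into (even-indexed, odd-indexed) lists; 0-based indexing assumed."""
    trials = list(trials)
    return trials[0::2], trials[1::2]

def summarize_top_configs(configs, log_likelihoods, k=30):
    """Mean and std of each parameter across the k highest-likelihood configurations.

    configs: array-like of shape (n_configs, n_params).
    log_likelihoods: one score per configuration (higher is better).
    """
    configs = np.asarray(configs, dtype=float)
    order = np.argsort(log_likelihoods)[::-1][:k]  # indices of the k best scores
    top = configs[order]
    return top.mean(axis=0), top.std(axis=0)
```

The even-trial subset would be passed to the fitting routine, and the odd-trial subset reserved for comparing simulated and observed patterns.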
In order to explore the role of the prior, we also fit versions of the model in which the prior bias term was fixed to α = 0 or α = 1. The former corresponds to a strongly biased prior and the latter corresponds to a completely unbiased prior. For α = 0, the fitted parameters were σx = 3.16 ± 0.409, γswitch = 0.00875 ± 0.002, γsample = 0.00319 ± 0.001, and β = 326 ± 81.2. For α = 1, they were σx = 2.66 ± 0.272, γswitch = 0.0118 ± 0.002, γsample = 0.00506 ± 0.001, and β = 330.0 ± 97.9.
We now investigate the extent to which the predictions of the model, fitted on the even trials, are able to account for observed choice, reaction time and fixation patterns in the out-of-sample odd trials.
Fig 3. Each panel compares human data (black) and model predictions for binary choice (left, two dots) and trinary choice (right, three dots). The main model predictions are shown in purple. The restricted model predictions for the case of a strongly biased prior mean (α = 0) are shown in blue; the case of a completely unbiased prior mean (α = 1) is shown in pink. These colors were chosen to illustrate that the main model falls between these two extremes. The aDDM predictions are shown in dashed green. Error bars (human) and shaded regions (model) indicate 95% confidence intervals computed from 10,000 bootstrap samples (the model confidence intervals are often too small to be visible). Note that the methods used to optimize the policy and estimate the model parameters are stochastic. To provide a sense of the effect of this noise on the main model predictions, we depict the predictions of the thirty best-fitting parameter configurations. Each light purple line depicts the predictions for one of those configurations, whereas the darker purple line shows the mean prediction. In order to keep the plot legible, only the mean predictions of the fixed-prior models (α = 0 and α = 1) are shown. (A) Choice probability as a function of relative rating. (B) Kernel density estimation for the distribution of total fixation time. Quartiles (25%, 50%, and 75% quantiles) for the data, aDDM and main model predictions are shown at the bottom. (C) Total fixation time as a function of the relative rating of the highest rated item. (D) Total fixation time as a function of the mean of all the item ratings (overall value).
Fig 3B plots the distribution of total fixation times. This measure is similar to reaction time except that it excludes time not spent fixating on one of the items. We use total fixation time instead of reaction time because the model accounts for neither the initial fixation latency nor the time spent saccading between items (although it does account for the opportunity cost of that time, through the γsample parameter). As shown in the figure, the model provides a reasonable qualitative account of the distributions, although it underpredicts the mode in the case of two items and the skew in both cases.
Fig 3C shows the relationship between total fixation time and trial difficulty, as measured by the relative liking rating of the best item. We find that the model provides a reasonable account of how total fixation time changes with difficulty. This prediction follows from the fact that fewer samples are necessary to detect a large difference than to either detect a small difference or determine that the difference is small enough to be unimportant. However, the model exhibits considerable variation in the predicted intercept and substantially overpredicts total fixation time in difficult trinary choices.
The original binary and trinary choice papers [9, 10] observed a systematic change in fixation durations over the course of the trial, as shown in Fig 4C. Although the model tends to underpredict the duration of the first two fixations in the three-item case, it captures three key patterns well: (a) the final fixation is shorter, (b) later (but non-final) fixations are longer, and (c) fixations are substantially longer in the two-item case. Pattern (c) is especially striking given that the model uses the same set of fitted parameters for both datasets. The model predicts shorter final fixations because they are cut off when a choice is made [9, 10]. The model predicts the other two patterns because more evidence is needed to alter beliefs when their precision is already high; this occurs late in the trial, especially in the two-item case, where samples are split between fewer items.
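The claim that more evidence is needed to alter precise beliefs can be made explicit with a short worked equation. This is a sketch under the assumption that the belief about item c is Gaussian with mean μ_t(c) and precision λ_t(c) and is updated by the standard conjugate rule; λ_0 (prior precision) and n (number of samples already taken from item c) are symbols introduced here for illustration.

```latex
% Shift in the posterior mean caused by a single sample x_t from item c:
\mu_{t+1}(c) - \mu_t(c)
  = \frac{\sigma_x^{-2}}{\lambda_t(c) + \sigma_x^{-2}} \bigl( x_t - \mu_t(c) \bigr)

% After n earlier samples from item c, \lambda_t(c) = \lambda_0 + n\,\sigma_x^{-2},
% so the weight placed on the new sample is
\frac{\sigma_x^{-2}}{\lambda_0 + (n+1)\,\sigma_x^{-2}}
  = \frac{1}{\lambda_0 \sigma_x^{2} + n + 1}
```

The weight on each new sample shrinks as n grows, so later samples move the estimate less. With fewer items competing for samples, n per item grows faster, which is consistent with the longer fixations in the two-item case.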
Fig 4 also shows that the main model provides a more accurate account than the aDDM of how the number of fixations changes with trial difficulty, and of how fixation duration evolves over the course of a trial. One difficulty in making this comparison is that the aDDM assumes that non-final fixation durations are sampled from the observed empirical distribution, conditional on a number of observable variables, and thus the accuracy of its predictions regarding fixation duration and fixation number depends on the details of this sampling. To maximize comparability with the existing literature, here we use the same methods as in the original implementations [9, 10].
Fig 5C explores this further by looking at the location of new fixations in the three-item case, as a function of the difference in cumulative fixation time between the two possible fixation targets. Although the more-previously-fixated item is always less likely to be fixated, the probability of such a fixation actually increases as its fixation advantage grows. This counterintuitive model prediction results from the competing effects of value and uncertainty on attention. Since items with high estimated value are fixated more, an item that has been fixated much less than the others is likely to have a lower estimated value, and is therefore less likely to receive more fixations. However, the predicted effect is much stronger than the observed effect, and the aDDM provides a better account of this pattern than our main model. Note, though, that the accuracy of the aDDM's fit follows from the fact that it samples fixation locations and durations from the empirical distribution, conditioned on the previous three fixation locations and the item ratings.