Are commonly used lab‐based measures of food value and choice predictive of self‐reported real‐world snacking? An ecological momentary assessment study

Abstract

Objectives: While the assessment of actual food intake is essential in the evaluation of behaviour change interventions for weight loss, it may not always be feasible to collect this information within traditional experimental paradigms. For this reason, measures of food preference (such as measures of food value and choice) are often used as more accessible alternatives. However, the predictive validity of these measures (in relation to subsequent food consumption) has not yet been studied. Our aim was to investigate the extent to which three commonly used measures of preference for snack foods (explicit food value, unhealthy food choice and implicit preference) predicted self-reported real-world snacking occasions.

Design: Ecological Momentary Assessment (EMA) design.

Method: Over a seven-day study period, participants (N = 49) completed three daily assessments where they reported their healthy and unhealthy snack food consumption and completed the three measures of preference (explicit food value, unhealthy food choice and implicit preference).

Results: Our findings demonstrated some weak evidence that unhealthy Visual Analogue Scale scores predicted between-subject increases in unhealthy snacking frequency (OR = 1.018 [1.006, 1.030], p = .002). No other preference measures significantly predicted self-reported healthy or unhealthy snacking occasions (ps > .05).

Conclusions: These findings raise questions in relation to the association between measures of preference and self-reported real-world snack food consumption. Future research should further evaluate the predictive and construct validity of these measures in relation to food behaviours and explore the development of alternative assessment methods within eating behaviour research.

It is therefore important to evaluate the extent to which easily administered measures of value and choice are related to reports of real-world snack food consumption (Field et al., 2020).
Therefore, the aim of this study was to investigate whether three commonly used measures of food value and choice (implicit preferences, unhealthy food choices, explicit food value) predicted self-reported snacking behaviour across a 7-day period. We hypothesized that the measures of preference, choice and value would significantly predict healthy and unhealthy snacking occasions within the same assessment window over a 7-day study period. The study was pre-registered on OSF: https://osf.io/tswb2/. We also investigated the associations between implicit and explicit proxy measures in exploratory analyses.

METHOD

Participants
In line with our pre-registered sampling strategy, we recruited 50 participants (based on recommendations for multi-level modelling approaches [Maas & Hox, 2005]) and required a minimum of 50% assessment compliance for inclusion within the sample. Forty-nine participants completed at least 11 (50%) study period assessments in addition to baseline measurements and were retained. Participants were aged between 18 and 51 years (M = 26.82 ± 9.58), with 24 males (M = 32.92 ± 8.62) and 25 females (M = 20.96 ± 6.27), and an average Body Mass Index (BMI) of 23.38 kg/m² (±3.30). To be eligible for participation, participants were required to be aged over 18, self-report no history of eating disorders, follow an omnivorous or vegetarian diet, have access to a smartphone with a camera and not be attempting to lose weight (or have recently dieted). Participants were recruited through online advertisements and the wider student and staff community at the University of Liverpool. Participants recruited through online advertisements received a shopping voucher, with the value dependent upon the number of EMA assessments completed (>70% completed = £20 voucher, 50%-69% completed = £10 voucher). University of Liverpool students could participate for course credit, where a similar compensation structure was used (>70% completed = 10 points, 50%-69% completed = 5 points). The study was approved by the Local Research Ethics Committee (approval code: 7617). Testing took place during the COVID-19 pandemic (November-December 2020).

Implicit preference
The Brief Implicit Association Task (BIAT; Sriram & Greenwald, 2009) was used to measure implicit preference for healthy (e.g. banana, carrots) and unhealthy (e.g. biscuits, cheese) food items. Participants completed 4 blocks of 20 trials (80 trials total) in addition to two short unrecorded practice blocks (14 trials each). During each block, participants were asked to sort words (positive and negative) and images (healthy and unhealthy food items) into either a combined category (e.g. healthy foods and positive words) or an 'anything else' category. Participants responded using the on-screen keyboard, pressing 'I' (if the item belonged to the combined category) or 'E' (if the item belonged to the 'anything else' category). The combined category labels were either healthy-positive (i.e. healthy foods and positive words) or unhealthy-positive (i.e. unhealthy foods and positive words) combinations, with response latencies recorded for each trial. Participants completed two blocks of each type, the order of which was counterbalanced dependent upon session number. In line with recommendations (Nosek et al., 2014), the D algorithm for the BIAT was used to calculate implicit preference scores, which included the removal of trials >10,000 ms in length in addition to the removal of assessments where more than 10% of trials were completed in less than 300 ms (N = 55 assessments total, 6% of completed assessments). Positive scores indicated a preference towards healthy food items, and negative scores indicated a preference towards unhealthy food items.
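The screening and scoring conventions above can be sketched as follows. This is a simplified Python illustration of the exclusion rules and the general D-score logic (difference in mean latencies scaled by the pooled standard deviation), not a reproduction of the full Nosek et al. (2014) BIAT algorithm; the function name and data format are hypothetical.

```python
from statistics import mean, pstdev

def biat_d_score(trials):
    """Compute a simplified BIAT D score.

    trials: list of (block_type, rt_ms) tuples, where block_type is
    'healthy_positive' or 'unhealthy_positive'.
    Returns None when the whole assessment should be discarded
    (more than 10% of trials faster than 300 ms).
    """
    # Discard the assessment if >10% of trials are faster than 300 ms
    fast = sum(1 for _, rt in trials if rt < 300)
    if fast > 0.10 * len(trials):
        return None
    # Remove individual trials longer than 10,000 ms
    kept = [(block, rt) for block, rt in trials if rt <= 10_000]
    hp = [rt for block, rt in kept if block == 'healthy_positive']
    up = [rt for block, rt in kept if block == 'unhealthy_positive']
    pooled_sd = pstdev([rt for _, rt in kept])
    # Positive D = faster responding in healthy+positive blocks,
    # i.e. an implicit preference for healthy foods
    return (mean(up) - mean(hp)) / pooled_sd
```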

Explicit choice
Explicit preference for healthy and unhealthy food items was assessed through the use of a forced choice task, where participants were required to select 2 out of 8 snack food images (4 healthy options, 4 unhealthy options) that represented the foods that they would most like to consume at that moment (e.g. Hollands & Marteau, 2016). The images presented consisted of equal numbers of sweet (e.g. ice cream, pineapple) and savoury (e.g. pretzels, celery sticks) items. To prevent fatigue from repeated assessments, set blocks of images were randomly presented to participants at each assessment (ensuring that identical images were not presented in subsequent assessments and that images reflected equal numbers of healthy/unhealthy and sweet/savoury options). Healthy food choices were scored as +1 and unhealthy food choices were scored as 0, which when combined resulted in an explicit preference score ranging from 0 (two unhealthy selections) to 2 (two healthy selections).
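The scoring rule above can be expressed compactly (an illustrative Python sketch; the function name and labels are hypothetical):

```python
def explicit_choice_score(choices):
    """Score a forced-choice trial: each healthy selection = +1,
    each unhealthy selection = 0, giving a score between 0 and 2."""
    assert len(choices) == 2  # participants select exactly two items
    return sum(1 for c in choices if c == 'healthy')
```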

Food value
Participants were presented with 10 images of snack food items (5 unhealthy, 5 healthy) and asked to rate each item on a Visual Analogue Scale (VAS) ranging from −100 (not at all appealing) to +100 (extremely appealing) to assess image appeal ('How appealing do you find this image?') (e.g. Burger et al., 2011; Masterton et al., 2021). To avoid habituation, the 5 images presented for each category during the task were randomly selected from a possible 12 snack food items. Mean appeal scores were calculated at each assessment for healthy and unhealthy snack food items. Ten images were used within each assessment to reduce assessment duration and participant burden.

Snack food recall
At each assessment, participants were provided with several free recall boxes and asked to report any healthy and unhealthy snack food items (defined as any food item not consumed as part of a main meal [Hess et al., 2016]) that they had consumed since the last assessment ('Please list all healthy and unhealthy snack food items consumed since the last assessment. Please be as specific as possible (i.e. 30 g cashew nuts). Snack foods are classified as items consumed outside of a main meal').
Participants were asked to provide as much detail as possible (in relation to serving size/amount consumed and brand) for consumed foods, and were also asked to take photographs of snack food packaging (and servings) prior to consumption and send them to the research team. Participants were prompted to upload images at least once per day, but could upload images at any point throughout the study period. Although only the free text recall was compulsory, previous work has demonstrated that the use of images in dietary assessments supports participant recall and increases reporting accuracy (Zhao et al., 2021). The use of food images and free text recall also supported the research team in the extraction of accurate nutritional information for specific products, and in the identification of portion sizes (where this information was not provided by participants) (see König et al., 2021). A combined time-based (free text recall) and event-based (image upload) approach increases the accuracy and ecological validity of EMA assessments, as limitations associated with solely event-based approaches (i.e. the inability to identify occasions where snacking did not take place) are eliminated (Maugeri & Barchitta, 2019). Therefore, while time-based assessments were used to measure snack food consumption, these data were validated by additional information provided through event-based assessments, improving data quality and accuracy.
The UK Nutrient Profiling Model 2004/5 (UKNPM) was used to individually profile each food item consumed by participants as 'healthy' or 'less healthy' (Department of Health, 2011). The UKNPM categorizes food items based on the healthy (fibre; protein; fruit, nuts and vegetables) and unhealthy (saturated fat; sugar; salt) components of the product (per 100 g) in addition to the amount of energy provided by the product (kJ). A score of 4 or above indicated that the product was a 'less healthy' snack food item (referred to as unhealthy onwards), with foods scoring 3 and below categorized as healthy. A randomly selected sample (20%) of food scores were also independently profiled by a second researcher, with an excellent agreement rate of 95% (note: scoring discrepancies would not have resulted in any changes to food categorization [healthy/unhealthy] and were resolved within the research team).
Where brand information was available (through participant descriptions and/or uploaded images), nutritional (and portion size) information was obtained through either the manufacturer's website or from the Tesco UK website (the largest UK supermarket chain). Where specific brand or product information was not available, information was extracted from an equivalent Tesco 'own brand' product for categorization and portion size information. Across all participants, 282 unique food items were profiled, with 50 categorized as 'healthy' and 232 as 'unhealthy'.
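The categorization cut-off and the double-coding agreement check described above might look like the following (an illustrative Python sketch only; the full UKNPM point-scoring from nutrient values per 100 g is not reproduced here, and the function names are hypothetical):

```python
def uknpm_category(score):
    """Apply the UKNPM cut-off: scores of 4 or above are
    'less healthy' (unhealthy); scores of 3 and below are healthy."""
    return 'unhealthy' if score >= 4 else 'healthy'

def agreement_rate(coder_a, coder_b):
    """Proportion of items on which two independent coders
    assigned the same score."""
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)
```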

Procedure
Participants who responded to study advertisements were provided with an information sheet (via email) providing key study details including exclusion criteria, type of tasks and measures, study duration and minimum participation thresholds. Eligible participants were then sent a URL link to the baseline assessment and prompted to install the Inquisit 6 (Millisecond Software, SA) application on their smartphone, where all assessments related to the study were completed. The baseline assessment included demographic measurements (age, sex, height and weight) and the creation of a unique ID number (for future correspondence), in addition to a familiarization session (including implicit food preference, food value and explicit choice measures). Self-reported height and weight information was used to calculate BMI (weight (kg)/height (m²)). After completion of the baseline assessment, participants were sent further documentation in relation to accurately recording and reporting food consumption and were asked to contact the researcher should any issues arise with the application or completion of measures. We chose to recruit and conduct all testing online, as completely online EMA studies have similar levels of compliance and data quality to in-person recruitment (Carr et al., 2020).
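The BMI calculation from self-reported height and weight is simply:

```python
def bmi(weight_kg, height_m):
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2
```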
Starting the day after the initial baseline assessment, participants were emailed a URL link to the Inquisit application three times per day at fixed intervals (12, 4 and 8 PM) for 7 consecutive days. Each assessment began with the snack food recall, followed by the measures of preference, choice and value (counterbalanced). A full list of food items (and example images) used within preference and value measures can be found at https://osf.io/tswb2/, and all images used within the study were of unbranded snack food items presented on a plain white background to avoid the potential influence of specific brand/flavour preferences. Participants were instructed to not backdate missed assessments, and where multiple assessments were completed within the same time period, data from the first valid assessment completed within that period were retained for analysis. After the 7-day study period, participants were contacted by email, thanked for their participation and fully debriefed and reimbursed (where appropriate).

Data reduction and analyses
We conducted multilevel logistic regressions using the 'glmer' function from the 'lme4' package in R (v1.1-27.1; Bates et al., 2015). Our predictor variables included IAT D′ score, explicit food choices and explicit value ratings of healthy and unhealthy food items. Our primary outcome variables were healthy and unhealthy snacking occasions within each assessment period (as reported by participants since their last assessment). These variables were lagged to ensure the predictor and consumption variables reflected the same assessment period(s). We also conducted exploratory analyses using the reported number of portions of unhealthy food consumed since the last assessment. In each model we also examined age, sex and BMI as predictors. Assessment-level predictor variables were centred against the participant average (Paccagnella, 2006) to examine within-participant variance. To disaggregate between-participant variance, the participant average was centred against the sample average (Curran & Bauer, 2011; Wang & Maxwell, 2015). Given that studies often observe a reduction in compliance over time in EMA designs (see Jones et al., 2018, 2020), we also included session number as a predictor (1-21) to reduce any confounding.
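The within/between disaggregation described above (person-mean centring) can be illustrated as follows. This is a Python sketch of the centring step only, under an assumed dictionary data layout; the models themselves were fitted with 'glmer' in R.

```python
import statistics

def centre_predictor(values_by_participant):
    """Split an assessment-level predictor into within- and
    between-participant components via person-mean centring.

    values_by_participant: dict mapping participant ID to a list
    of that participant's assessment-level values.
    """
    grand_mean = statistics.mean(
        v for vals in values_by_participant.values() for v in vals)
    within, between = {}, {}
    for pid, vals in values_by_participant.items():
        person_mean = statistics.mean(vals)
        # Within-participant: deviation of each assessment from the person mean
        within[pid] = [v - person_mean for v in vals]
        # Between-participant: deviation of the person mean from the grand mean
        between[pid] = person_mean - grand_mean
    return within, between
```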
To examine whether a multilevel model (with a random intercept of participant, and no predictors) was a better fit than a single-level model (with no random intercept of participant, and no predictors), we examined whether there was a reduction in the AIC values for each (smaller AIC values are indicative of better-fitting models, using the same data set). Here, we used an AIC change of >10 as indicative of substantial support for a multilevel model (Burnham & Anderson, 2004). Multicollinearity was assessed via Variance Inflation Factors, using the 'performance' package. To assess between-participant associations, we computed total healthy and unhealthy snacking occasions per participant, and used assessment-level averages of IAT D′ score, explicit food choices and explicit value ratings of healthy and unhealthy food items as predictors in standard regression models. Aggregating assessment-level EMA data can lead to more reliable person-level indices (Shiffman et al., 2008).
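The model-comparison rule above can be sketched as follows (illustrative Python; in practice, AIC values come directly from the fitted models, and the function names here are hypothetical):

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2*ln(L),
    where k is the number of estimated parameters."""
    return 2 * k - 2 * log_likelihood

def prefer_multilevel(aic_single, aic_multi, threshold=10):
    """A drop in AIC of more than `threshold` (here >10, following
    Burnham & Anderson, 2004) is taken as substantial support for
    the multilevel model over the single-level model."""
    return (aic_single - aic_multi) > threshold
```

For example, the null-model comparison reported below (single-level AIC = 1073.3 vs. multilevel AIC = 976.3) yields a drop of 97, well above the threshold.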
For compliance analyses, participants were deemed to have complied with a session if they had provided information on snacking behaviour in that assessment. Compliance was binary coded (0 = non-compliance, 1 = compliance) for each assessment. We conducted a generalized linear mixed model to examine whether compliance was predicted by demographic variables (age, sex, BMI), or assessment number/day of assessment (data and analysis scripts are online: https://osf.io/tswb2).

Descriptive statistics for assessment-level and outcome variables
A breakdown of assessment-level variables is shown in Table 1. Intraclass correlation coefficients demonstrate significant within-person variability across all assessment-level predictors. A breakdown of assessment-level variables by assessment day (1-7) is shown in Table S1.

Predictors of 'unhealthy' snacking occasions within and between individuals
There were 328 unhealthy snacking occasions. The AIC for the null model was 1073.3 and the AIC for the multi-level model was 976.3, indicating the multi-level model was a substantially better fit of the data. The only significant predictor in the model was session number (OR = .962 [95% CI: .929, .995]), which was associated with a reduction in snacking over time (see Table 2). The model including predictors had a substantial reduction in AIC value (AIC = 760.0). There was some evidence of moderate multicollinearity (explicit choice between-participants VIF = 5.79). Removal of this variable from the model led to unhealthy food VAS becoming a significant between-participants predictor (OR = 1.018 [95% CI: 1.006, 1.030], Z = 3.091, p = .002) alongside session number. There was no significant improvement in AIC (761.8).

Predictors of 'healthy' snacking occasions within and between individuals
There were 160 healthy snacking occasions. The AIC for the null model was 797.7 and the AIC for the multilevel model was 665.0, indicating the multilevel model was a better fit of the data. The only significant predictor in the model was session number (OR = .927 [95% CI: .930, .996]), which was associated with a reduction in snacking over time (see Table 3). There was some evidence of multicollinearity (explicit choice between-participants VIF = 5.05). Removal of this variable from the model did not influence the pattern of results.

Do measures of food value predict unhealthy snack portions?
Of the 328 unhealthy snacking occasions we examined the number of portions of unhealthy snacks as an outcome. The average number of portions was 1.63 (±1.33). There were no significant predictors (see Appendix S1 for full model reporting). We did not replicate this analysis with healthy snacks, due to the smaller number of snacking occasions.

TABLE 1 Mean values (±SD) of assessment-level variables (overall and split by session number over 7-day assessment period). Note: IAT D′ scores range between −2 (strong preference for unhealthy foods) and +2 (strong preference for healthy foods). Explicit choice scores range between 0 (2 unhealthy choices) and +2 (2 healthy choices). Food value scores range from −100 (not at all appealing) to +100 (extremely appealing).

DISCUSSION
The aim of this study was to investigate the predictive validity of commonly used measures of food value and choice (food value, explicit choice, implicit preference) in relation to self-reported real-world healthy and unhealthy snack food consumption. The results demonstrated that, aside from unhealthy food VAS ratings, the preference measures were not robust predictors of healthy or unhealthy snacking occasions, and they also failed to predict the number of unhealthy snack portions consumed by participants. There were also no robust significant associations between individual measures of preference and choice, with the exception of healthy food value and IAT D′ score, which may suggest that each of these measures is unlikely to relate to the same underlying construct. Due to the extensive use of these measures throughout the literature, we predicted that the measures would be significant predictors of both healthy and unhealthy snack food consumption. However, this does not appear to be the case, as only unhealthy food VAS scores significantly predicted self-reported consumption behaviour within the study, and only within a model from which parameters contributing to multicollinearity had been removed (and this model was not an improved fit of the data). These findings are important as they may help to explain poor or inconsistent translations (in relation to theoretical predictions and behaviour change) between laboratory studies and clinical interventions where measures of food preference and choice have been used to evaluate outcomes: Field et al. (2020) suggest that while experiments can demonstrate causality within a controlled environment, interventions based upon these manipulations may not be feasible should outcomes not equate to desirable (and sustained) behavioural change. Significant changes to food preference and choice using measures similar to those tested in this study have been documented within several intervention studies (e.g. Chen et al., 2018, 2019; Hensels & Baines, 2016; Kakoschke et al., 2018); however, based on the present research it remains unclear whether these would translate to changes in snacking behaviour in the real world.
One potential reason for a lack of consilience between preference measures and actual eating behaviour may be related to the nature of choice and preference measures within appetite research: responses have no real consequences for participants (Klein & Hilbig, 2019); therefore they may not be motivated to respond in a way that reflects their true food preferences or current underlying motivation. The findings from this study raise questions in relation to the ability of food value and choice measures to predict future consumption behaviours, which has implications for the development and evaluation of current and future weight-loss interventions.
Interestingly, the results also revealed that different preference measures did not necessarily relate to each other within individuals (the association between IAT D′ and healthy VAS scores aside). Given that these measures are hypothesized to measure similar constructs, some level of association would be anticipated between these variables (i.e. an implicit preference for healthy foods would be associated with increased healthy food value and healthier explicit choices). This finding may help to explain some of the inconsistencies observed within previous research: while Hollands and Marteau (2016) found that exposure to negative health-related images led to increased explicit preference for fruit (within a forced choice task), there was no significant parallel effect on implicit preferences. The lack of association between preference measures could be related to the manner in which tasks are presented: explicit choice tasks are often relatively short, and participants are able to easily control and manipulate their responses, unlike implicit preference measures, which are indirect and more complex (with the 'desirable' response less obvious) (Goodall, 2011).
We demonstrated that compliance with EMA assessments decreased over time, which is common within EMA studies (Maugeri & Barchitta, 2019). The results also revealed that both healthy and unhealthy snacking significantly decreased during the study period (despite participants not reporting attempting to lose or reduce weight before participating). While it is possible that continued self-monitoring of behaviour reduced snack food consumption over time (e.g. Humphreys et al., 2021; Michie et al., 2009), reductions may be indicative of reduced engagement with assessments, or participants may have deliberately chosen to miss assessments/not report snacking occasions towards the end of the study (due to pressures associated with continual monitoring of food intake/study duration (Doherty et al., 2020)). As such, a potential limitation of this research is that we were not modelling naturalistic snacking behaviour or capturing all potential snacking outcomes. The EMA procedure we adopted is widely used, but its validity as a measure of snacking behaviour has not been tested. In addition, because snacking behaviour was self-reported (and will therefore be prone to bias), it may be the case that participants chose not to report snacking occasions in an attempt at impression management/self-presentation (Vartanian, 2015). Therefore, future research should examine if preference measures would be more strongly associated with objectively measured snacking behaviour (such as data collected through wearable technology devices [Skinner et al., 2020]).
Whilst BMI was included within both models, it was not a significant predictor of healthy or unhealthy snack food occasions. The average participant BMI fell within the 'healthy' range, and while previous work has found no significant association between BMI and laboratory assessments of food consumption (Robinson et al., 2017), it is possible that individuals with overweight or obesity may exhibit specific consumption (and preference) behaviours not observed within healthy weight groups (Mattes, 2014;Rodrigues et al., 2012). As individuals with overweight and obesity are often a key target for weight reduction interventions, future research should investigate associations between preference and consumption within this specific group to identify any potential differences in predictive validity of choice and preference measures (based upon weight status). Future work could also measure additional participant level factors (such as dietary restraint and hunger) to investigate potential associations between these variables and measures of food preference/consumption.
The use of an EMA design allowed for the examination of real-world snack food consumption and preference over a seven-day period; however, there were limitations associated with this approach. Participants completed assessments within fixed time periods, which may have introduced issues in relation to recall accuracy (as participants would have to wait for the next assessment to report snack foods consumed, irrespective of snack timing). While participants were asked to photograph consumed snack foods and upload images (to support recall between assessments), future research could explore the incorporation of event-contingent assessments within studies, where participants initiate assessments at each consumption occasion (although this reduces reporting and can make reviewing compliance more difficult (Maugeri & Barchitta, 2019)). Additionally, while EMA allows participants to complete assessments in environments of their choice (increasing ecological validity), research demonstrates that environmental cues (such as advertisements, social cues and snack availability [Elliston et al., 2017]) are important predictors of consumption behaviours. Environmental variations between (and within) participants may have influenced (or prompted) snack choice and preference responses, and future research should attempt to further examine these factors by collecting information related to the context in which each assessment was completed. It is also possible that our between-participant effects are underpowered; indeed, N = 49 would only allow detection of relatively moderate associations in cross-sectional analysis (rs ~ .22). However, we note that lab-based studies have demonstrated effects greater than this for food-liking and consumption (r = .27: Robinson et al., 2017) and VAS motivation measures and consumption (rs ~ .48: Hammond et al., 2022).
Finally, it is worth noting that this study took place during the COVID-19 pandemic, and research has demonstrated changes in snacking and unhealthy behaviours during this time (Bakaloudi et al., 2021; Robinson et al., 2021). Replication of these findings post-pandemic is warranted.
In conclusion, using an EMA design, this study investigated the predictive validity of three commonly used measures of value and choice (food value, explicit preference, implicit preference) in relation to real-world snack food consumption. The results demonstrated unconvincing evidence for their prediction of self-reported healthy or unhealthy snacking occasions, or the number of unhealthy snack food portions consumed by participants. These findings raise uncertainties about the use of food value and preference measures as predictors of snack food consumption across the wider literature. However, it is possible that limitations with the EMA design (i.e. influencing naturalistic snacking, non-reporting) may have obscured any relationships between these variables.

AUTHOR CONTRIBUTIONS
Sarah Masterton:
Conceptualization; formal analysis; investigation; methodology; writing -original draft; writing -review and editing. Charlotte A. Hardman: Conceptualization; methodology; writing -review and editing. Emma Boyland: Writing -review and editing. Eric Robinson: Writing -review and editing. Harriet E. Makin: Formal analysis; writing -review and editing. Andrew Jones: Conceptualization; formal analysis; methodology; writing -review and editing.

CONFLICT OF INTEREST
None.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available through the Open Science Framework at http://doi.org/10.17605/OSF.IO/TSWB2.

ORCID
Sarah Masterton https://orcid.org/0000-0002-6248-347X