Analysis of dietary data is a two-step process involving:
The conversion of reported dietary intakes into estimates of food and nutrient intake first requires the data to be coded. The accuracy of the data obtained and of the final estimates depends on the precision and completeness of the initial measurement of consumption, consistency and precision during coding, and the use of a representative and comprehensive food composition database. The following sections outline the processes involved in converting collected dietary data to estimates of food and nutrient intake.
Computer-based dietary assessment (e.g. INTAKE24 [2]) facilitates data entry and processes dietary data automatically. The underlying algorithms and assumptions of an automated system are the same as those needed for traditional dietary assessment.
The coding process used depends on the dietary assessment method:
In each case, a reported or fixed item can be matched to multiple food items on the system. For example, reported consumption of ‘pizza’ may be matched to:
This process requires assumptions of typical dietary consumption in a population and depends on the availability of items in a food composition table. Ideally, the matching procedure is developed with detailed observations of dietary consumption in a population (e.g. multiple-day weighed food diaries) and validated by objective methods (e.g. duplicate diet).
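The matching step can be sketched as a weighted average over candidate food items. The following is a minimal illustration only, not the procedure of any particular system: the food names, weights and energy values are hypothetical placeholders, and in practice the weights would be derived from detailed consumption data for the population being studied.

```python
# Hypothetical candidate matches for a generic reported item.
# Weights are assumed shares of consumption in the study population
# and must sum to 1. Energy values (kcal per 100 g) are placeholders,
# not taken from any food composition table.
CANDIDATES = {
    "pizza": [
        {"food": "pizza, cheese and tomato", "weight": 0.5, "kcal_per_100g": 250.0},
        {"food": "pizza, meat topped", "weight": 0.3, "kcal_per_100g": 270.0},
        {"food": "pizza, vegetable topped", "weight": 0.2, "kcal_per_100g": 220.0},
    ],
}


def composite_energy(reported_item: str) -> float:
    """Return the weighted-average energy (kcal per 100 g) for a reported item."""
    matches = CANDIDATES[reported_item]
    total_weight = sum(m["weight"] for m in matches)
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(m["weight"] * m["kcal_per_100g"] for m in matches)


def energy_for_portion(reported_item: str, grams: float) -> float:
    """Convert a reported amount in grams to energy in kcal."""
    return composite_energy(reported_item) * grams / 100.0
```

Keeping the weights in a separate table means the matching assumptions can be reviewed, and replaced for a different population, without changing the calculation itself.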
Potential sources of error
(i) Difficulties in interpreting written details in food diaries or 24-hour recalls, for example:
(ii) Coding which does not best match the food or beverage actually consumed, for example:
(iii) Human error, i.e. mistakes made during food coding, for example:
(iv) Personal characteristics of the food coder, which may influence judgement when interpreting a food record [1], for example:
Standardisation of the coding process
The coding process involves numerous assumptions and judgements. In the process of coding open-ended dietary data, some degree of estimation and intuition is involved, irrespective of the skill and experience of the coder. Research groups have endeavoured to minimise the errors involved with coding by providing training for dietary coders and implementing quality control checks [15].
Attempts have been made to standardise the coding process [5]. For instance, a code book was designed for use in the INTERMAP study [3] which acted as a ‘rule book’. Its aim was to remove the need for coders to make subjective decisions.
The number of errors during coding may also be minimised if ‘coding rules’ are established to deal with incomplete or ambiguous entries in the food record [3]. For instance, the dietary coding manual (see Figure D.5.1) developed for use in the Infant Feeding Peer Support Trial provides standard infant portion sizes for commonly consumed food and drinks as well as listing default entries to aid coders in the coding process.
Figure D.5.1 Example of dietary coding manual page for baby foods from Infant Feeding Peer Support Trial.
Source: Department of Public Health and Epidemiology, University College London.
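Coding rules with default entries and standard portions can be represented as simple lookup tables. The sketch below is an assumption-laden illustration: the food codes, descriptions and portion weights are invented and are not taken from the Infant Feeding Peer Support Trial manual. It records whether a default was applied so that such entries can be audited later.

```python
from typing import NamedTuple, Optional


class CodedEntry(NamedTuple):
    food_code: str
    grams: float
    used_default_food: bool
    used_default_portion: bool


# Hypothetical rule book: every value below is an invented placeholder.
DEFAULT_FOOD = {"yoghurt": "yoghurt, whole milk, fruit"}
FOOD_CODES = {
    "yoghurt, whole milk, fruit": "F001",
    "yoghurt, low fat, plain": "F002",
}
STANDARD_PORTION_G = {"F001": 60.0, "F002": 60.0}


def code_entry(description: str, grams: Optional[float] = None) -> CodedEntry:
    """Apply the coding rules to one food-record entry."""
    key = description.strip().lower()
    used_default_food = key not in FOOD_CODES
    if used_default_food:
        # Ambiguous entry: fall back to the rule-book default.
        # A KeyError here means no rule exists and the entry needs manual review.
        key = DEFAULT_FOOD[key]
    code = FOOD_CODES[key]
    used_default_portion = grams is None
    if used_default_portion:
        # No amount recorded: use the standard portion for this food code.
        grams = STANDARD_PORTION_G[code]
    return CodedEntry(code, grams, used_default_food, used_default_portion)
```

For example, `code_entry("Yoghurt")` would resolve to the default yoghurt entry with its standard portion, with both flags set, whereas a fully specified entry with a recorded weight would pass through unchanged.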
Key considerations
Key considerations for improving the coding process include the following [1, 3]:
Coding programmes for a food frequency questionnaire
Paper-based food frequency questionnaires are often designed so that participants' responses can be scanned and encoded. Once the dietary intake information is in electronic form, a computer program is run to generate data on dietary intakes. A wide range of computer systems (of varying quality) is available to facilitate the processing.
Analysis software links consumption data to food composition data (e.g. that provided by McCance and Widdowson’s The Composition of Foods [7]).
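The link between consumption data and food composition data can be illustrated with a short sketch. It assumes a common approach in which each frequency category is converted to times per day, multiplied by a standard portion, and then by the nutrient content per 100 g. The frequency categories, food items, portion sizes and nutrient values are all hypothetical and do not come from any published questionnaire or composition table.

```python
# Hypothetical mapping of FFQ frequency categories to times per day.
FREQ_PER_DAY = {
    "never": 0.0,
    "once a week": 1 / 7,
    "2-4 per week": 3 / 7,
    "once a day": 1.0,
}

# Hypothetical FFQ items: standard portion (g) and nutrients per 100 g.
FFQ_ITEMS = {
    "apples": {
        "portion_g": 100.0,
        "per_100g": {"energy_kcal": 50.0, "fibre_g": 2.0},
    },
    "white bread": {
        "portion_g": 36.0,
        "per_100g": {"energy_kcal": 220.0, "fibre_g": 2.5},
    },
}


def daily_nutrient_intake(responses: dict) -> dict:
    """Sum nutrient intake per day over all FFQ items a participant answered."""
    totals = {}
    for item, category in responses.items():
        spec = FFQ_ITEMS[item]
        grams_per_day = FREQ_PER_DAY[category] * spec["portion_g"]
        for nutrient, amount in spec["per_100g"].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + grams_per_day * amount / 100.0
    return totals
```

A call such as `daily_nutrient_intake({"apples": "once a day", "white bread": "2-4 per week"})` returns one total per nutrient. Any item or category missing from the tables raises a `KeyError`, which is one simple way to surface coding gaps rather than silently dropping them.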
Universities or research centres tend to develop their own software and databases which may contain specific information on particular nutrients or food constituents. Analysis programs have been specifically developed for use in large cohort studies. Examples include DINER (Data Into Nutrients for Epidemiological Research) [15], CAFE (Compositional Analyses from Frequency Estimates) [14] and FETA (FFQ EPIC Tool for Analysis) designed for use in the EPIC-Norfolk study into diet and cancer [11].
Commercial coding programmes are also available, for example:
Such coding programmes can vary greatly in their design, utility, and target audience, e.g. health professionals, catering establishments, sports industries, nutritionists and dietitians, or personal use.
Which coding programme to use?
As seen above, a large range of options is available. The choice of coding programme will depend on the population being studied: the programme chosen should incorporate appropriate foods and portion sizes within its nutritional database. For example, if the nutritional intakes of infants and young children are being examined, foods normally consumed by these age groups should be available within the programme.
Ideally, analysis programmes should be flexible and easy to update, so that new variables can be added and the database can keep pace with the changing food supply. Approximately 10,000 foods are estimated to be modified, newly generated, or discontinued each year in the UK [15].
Types of variables generated from consumption data
Depending on available data collected and research interests, diverse variables can be generated.
A data processing algorithm can also incorporate an analysis of nutritional adequacy, at the level of individual nutrients or of the whole diet. Such an algorithm is useful when the purpose of dietary assessment is to screen for nutritional adequacy or when a researcher plans to provide timely feedback to participants.
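As a rough illustration of such an algorithm, the sketch below compares estimated daily intakes with reference values nutrient by nutrient and then summarises the result across the diet. The reference values and nutrient names are hypothetical placeholders; a real analysis would use dietary reference values appropriate to each participant's age and sex, and a simple ratio is only one of several possible adequacy measures.

```python
# Hypothetical daily reference values; placeholders only.
REFERENCE = {"iron_mg": 10.0, "vitamin_c_mg": 40.0}


def adequacy_report(intake: dict, reference: dict) -> dict:
    """Nutrient-level check: ratio of intake to reference for each nutrient."""
    report = {}
    for nutrient, ref in reference.items():
        # Nutrients missing from the intake data are treated as zero intake.
        ratio = intake.get(nutrient, 0.0) / ref
        report[nutrient] = {
            "ratio": round(ratio, 2),
            "below_reference": ratio < 1.0,
        }
    return report


def share_meeting_reference(report: dict) -> float:
    """Whole-diet summary: fraction of assessed nutrients at or above reference."""
    met = sum(1 for r in report.values() if not r["below_reference"])
    return met / len(report)
```

The nutrient-level report could feed directly into participant feedback, while the whole-diet summary gives a single screening figure per participant.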
Food composition databases
Food composition databases provide detailed information on the nutritional composition of foods. These databases may be:
Currently, there are over 150 food composition tables and electronic databases worldwide [6]. The LanguaL website [4], an international framework for food description, provides links to food composition databases from various countries.
Variation in food composition databases
Food composition databases vary greatly in terms of the number and detail of nutrients and other food chemicals or properties included. Please see the dedicated page for more detail on the potential sources of error associated with food composition databases.