Guidance for reporting observational studies in nutritional epidemiology

This advice is an extension of the STROBE statement to help you report a nutritional epidemiology study.

Title and abstract

1a Title

Indicate the study’s design with a commonly used term in the title or the abstract.

Readers should be able to easily identify the design that was used from the title or abstract. An explicit, commonly used term for the study design also helps ensure correct indexing of articles in electronic databases.

Example/s:

Leukaemia incidence among workers in the shoe and boot manufacturing industry: a case-control study.

nut-1

State the dietary/nutritional assessment method(s) used in the title or in the abstract.

Reporting the dietary and nutritional assessment method or methods in the title, abstract, or keywords with accurate terminology contributes to the completeness of the manuscript. This may be particularly relevant for methodologic research articles, which are used as reference articles in association studies. In addition, it will facilitate the accuracy of indexing in electronic databases as well as ease literature searches, through the use of keywords.

Due to the growing number of scientific journals, indexing of articles increasingly applies both automated summaries and manual approaches. If reports from dietary or nutritional research use standard terminology or approved Medical Subject Headings (MeSH), a step is taken toward reducing the number of incomplete or unusable research reports. Readability should be ensured at all times, and journal specifications with regard to style and word count apply. Guides to appropriate terminology can be found online.

Example/s:

The consumption of sugar-sweetened beverages was derived from 7 repeated FFQs administered between 1980 and 2002.

1b Abstract

Provide in the abstract an informative and balanced summary of what was done and what was found.

The abstract provides key information that enables readers to understand a study and decide whether to read the article. Typical components include a statement of the research question, a short description of methods and results, and a conclusion. Abstracts should summarize key details of studies and should only present information that is provided in the article. We advise presenting key results in a numerical form that includes numbers of participants, estimates of associations and appropriate measures of variability and uncertainty (e.g., odds ratios with confidence intervals). We regard it insufficient to state only that an exposure is or is not significantly associated with an outcome.

A series of headings pertaining to the background, design, conduct, and analysis of a study may help readers acquire the essential information rapidly. Many journals require such structured abstracts, which tend to be of higher quality and more readily informative than unstructured summaries.

Example/s:

Background: The expected survival of HIV-infected patients is of major public health interest.

Objective: To estimate survival time and age-specific mortality rates of an HIV-infected population compared with that of the general population.

Design: Population-based cohort study.

Setting: All HIV-infected persons receiving care in Denmark from 1995 to 2005.

Patients: Each member of the nationwide Danish HIV Cohort Study was matched with as many as 99 persons from the general population according to sex, date of birth, and municipality of residence.

Measurements: The authors computed Kaplan-Meier life tables with age as the time scale to estimate survival from age 25 years. Patients with HIV infection and corresponding persons from the general population were observed from the date of the patient's HIV diagnosis until death, emigration, or 1 May 2005.

Results: 3990 HIV-infected patients and 379,872 persons from the general population were included in the study, yielding 22,744 (median, 5.8 y/person) and 2,689,287 (median, 8.4 years/person) person-years of observation. Three percent of participants were lost to follow-up. From age 25 years, the median survival was 19.9 years (95% CI, 18.5 to 21.3) among patients with HIV infection and 51.1 years (CI, 50.9 to 51.5) among the general population. For HIV-infected patients, survival increased to 32.5 years (CI, 29.4 to 34.7) during the 2000 to 2005 period. In the subgroup that excluded persons with known hepatitis C coinfection (16%), median survival was 38.9 years (CI, 35.4 to 40.1) during this same period. The relative mortality rates for patients with HIV infection compared with those for the general population decreased with increasing age, whereas the excess mortality rate increased with increasing age.

Limitations: The observed mortality rates are assumed to apply beyond the current maximum observation time of 10 years.

Conclusions: The estimated median survival is more than 35 years for a young person diagnosed with HIV infection in the late highly active antiretroviral therapy era. However, an ongoing effort is still needed to further reduce mortality rates for these persons compared with the general population.

Introduction

2. Background / rationale

Explain the scientific background and rationale for the investigation being reported.

The scientific background of the study provides important context for readers. It sets the stage for the study and describes its focus. It gives an overview of what is known on a topic and what gaps in current knowledge are addressed by the study. Background material should note recent pertinent studies and any systematic reviews of pertinent studies.

Example/s:

Concerns about the rising prevalence of obesity in children and adolescents have focused on the well documented associations between childhood obesity and increased cardiovascular risk [1] and mortality in adulthood [2]. Childhood obesity has considerable social and psychological consequences within childhood and adolescence [3], yet little is known about social, socioeconomic, and psychological consequences in adult life.

A recent systematic review found no longitudinal studies on the outcomes of childhood obesity other than physical health outcomes [3] and only two longitudinal studies of the socioeconomic effects of obesity in adolescence. Gortmaker et al found that US women who had been obese in late adolescence in 1981 were less likely to be married and had lower incomes seven years later than women who had not been overweight, while men who had been overweight were less likely to be married [4]. Sargent et al found that UK women, but not men, who had been obese at 16 years in 1974 earned 7.4% less than their non-obese peers at age 23 [5].

The study of adult outcomes of childhood obesity is difficult because obesity often continues into adult life and therefore poorer socioeconomic and educational outcomes may actually reflect confounding by adult obesity. Yet identifying outcomes related to obesity confined to childhood is important in determining whether people who are obese in childhood and who later lose weight remain at risk for adult adversity and inequalities.

We used longitudinal data from the 1970 British birth cohort to examine the adult socioeconomic, educational, social, and psychological outcomes of childhood obesity. We hypothesised that obesity limited to childhood has fewer adverse adult outcomes than obesity that persists into adult life.

3. Objectives

State specific objectives, including any prespecified hypotheses.

Objectives are the detailed aims of the study. Well crafted objectives specify populations, exposures and outcomes, and parameters that will be estimated. They may be formulated as specific hypotheses or as questions that the study was designed to address. In some situations objectives may be less specific, for example, in early discovery phases. Regardless, the report should clearly reflect the investigators' intentions. For example, if important subgroups or additional analyses were not the original aim of the study but arose during data analysis, they should be described accordingly (see also items 4, 17 and 20).

Example/s:

Our primary objectives were to 1) determine the prevalence of domestic violence among female patients presenting to four community-based, primary care, adult medicine practices that serve patients of diverse socioeconomic background and 2) identify demographic and clinical differences between currently abused patients and patients not currently being abused.

Methods

4. Study design

Present key elements of study design early in the paper.

We advise presenting key elements of study design early in the methods section (or at the end of the introduction) so that readers can understand the basics of the study. For example, authors should indicate that the study was a cohort study, which followed people over a particular time period, and describe the group of persons that comprised the cohort and their exposure status. Similarly, if the investigation used a case-control design, the cases and controls and their source population should be described. If the study was a cross-sectional survey, the population and the point in time at which the cross-section was taken should be mentioned. When a study is a variant of the three main study types, there is an additional need for clarity. For instance, for a case-crossover study, one of the variants of the case-control design, a succinct description of the principles is given in the example below [28].

We recommend that authors refrain from simply calling a study 'prospective' or 'retrospective' because these terms are ill defined [29]. One usage sees cohort and prospective as synonymous and reserves the word retrospective for case-control studies [30]. A second usage distinguishes prospective and retrospective cohort studies according to the timing of data collection relative to when the idea for the study was developed [31]. A third usage distinguishes prospective and retrospective case-control studies depending on whether the data about the exposure of interest existed when cases were selected [32]. Some advise against using these terms [33] or suggest adopting the alternatives 'concurrent' and 'historical' for describing cohort studies [34]. In STROBE, we do not use the words prospective and retrospective, nor alternatives such as concurrent and historical. We recommend that, whenever authors use these words, they define what they mean. Most importantly, we recommend that authors describe exactly how and when data collection took place.

The first part of the methods section might also be the place to mention whether the report is one of several from a study. If a new report is in line with the original aims of the study, this is usually indicated by referring to an earlier publication and by briefly restating the salient features of the study. However, the aims of a study may also evolve over time. Researchers often use data for purposes for which they were not originally intended, including, for example, official vital statistics that were collected primarily for administrative purposes, items in questionnaires that originally were only included for completeness, or blood samples that were collected for another purpose. For example, the Physicians' Health Study, a randomized controlled trial of aspirin and β-carotene, was later used to demonstrate that a point mutation in the factor V gene was associated with an increased risk of venous thrombosis, but not of myocardial infarction or stroke [35]. The secondary use of existing data is a creative part of observational research and does not necessarily make results less credible or less important. However, briefly restating the original aims might help readers understand the context of the research and possible limitations in the data.

Example/s:

We used a case-crossover design, a variation of a case-control design that is appropriate when a brief exposure (driver's phone use) causes a transient rise in the risk of a rare outcome (a crash). We compared a driver's use of a mobile phone at the estimated time of a crash with the same driver's use during another suitable time period. Because drivers are their own controls, the design controls for characteristics of the driver that may affect the risk of a crash but do not change over a short period of time. As it is important that risks during control periods and crash trips are similar, we compared phone activity during the hazard interval (time immediately before the crash) with phone activity during control intervals (equivalent times during which participants were driving but did not crash) in the previous week.

5. Setting

Describe the setting, locations, and relevant dates, including periods of recruitment, exposure, follow-up, and data collection.

Readers need information on setting and locations to assess the context and generalisability of a study's results. Exposures such as environmental factors and therapies can change over time. Also, study methods may evolve over time. Knowing when a study took place and over what period participants were recruited and followed up places the study in historical context and is important for the interpretation of results.

Information about setting includes recruitment sites or sources (e.g., electoral roll, outpatient clinic, cancer registry, or tertiary care centre). Information about location may refer to the countries, towns, hospitals or practices where the investigation took place. We advise stating dates rather than only describing the length of time periods. There may be different sets of dates for exposure, disease occurrence, recruitment, beginning and end of follow-up, and data collection. Of note, nearly 80% of 132 reports in oncology journals that used survival analysis included the starting and ending dates for accrual of patients, but only 24% also reported the date on which follow-up ended [37].

Example/s:

The Pasitos Cohort Study recruited pregnant women from Women, Infant and Child (WIC) clinics in Socorro and San Elizario, El Paso County, Texas and maternal-child clinics of the Mexican Social Security Institute (IMSS) in Ciudad Juarez, Mexico from April 1998 to October 2000. At baseline, prior to the birth of the enrolled cohort children, staff interviewed mothers regarding the household environment. In this ongoing cohort study, we target follow-up exams at 6-month intervals beginning at age 6 months.

nut-5

Describe any characteristics of the study settings that might affect the dietary intake or nutritional status of the participants, if applicable.

Clear information about the study setting is needed to facilitate the interpretation and generalization of the findings. This includes external conditions that may affect dietary intake or nutritional status of the population, as well as the reporting of these. The time frame for the dietary assessment is also an important factor. Etiological studies mostly focus on dietary intakes over longer time periods, rather than intake during a certain day or week. Because the day-to-day variation as well as the seasonal variation, including holiday periods and special events, may influence observed estimates of habitual intake, the time period covered should be outlined. When using short-term dietary assessment methods, information is required with regard to the time period between examined days, and how weekdays and weekends are covered.

Example/s:

In the Matlab area, an embankment was constructed between 1982 and 1989 on the banks of the rivers Meghna and Dhonagoda to protect the area from seasonal floods. The study villages are therefore also categorized in relation to whether they are situated inside or outside the embankment. This embankment has a great impact on the pattern and production of major crops and fish on both sides and is believed to have an effect on food availability and consumption, which, in turn, could lead to effects on nutritional status.

6a Eligibility

Cohort study: Give the eligibility criteria and the sources and methods of selection of participants. Describe methods of follow-up. Case-control study: Give the eligibility criteria and the sources and methods of case ascertainment and control selection. Give the rationale for the choice of cases and controls. Cross-sectional study: Give the eligibility criteria, and the sources and methods of selection of participants.

Detailed descriptions of the study participants help readers understand the applicability of the results. Investigators usually restrict a study population by defining clinical, demographic and other characteristics of eligible participants. Typical eligibility criteria relate to age, gender, diagnosis and comorbid conditions. Despite their importance, eligibility criteria often are not reported adequately. In a survey of observational stroke research, 17 of 49 reports (35%) did not specify eligibility criteria [5].

Eligibility criteria may be presented as inclusion and exclusion criteria, although this distinction is not always necessary or useful. Regardless, we advise authors to report all eligibility criteria and also to describe the group from which the study population was selected (e.g., the general population of a region or country), and the method of recruitment (e.g., referral or self-selection through advertisements).

Knowing details about follow-up procedures, including whether procedures minimized non-response and loss to follow-up and whether the procedures were similar for all participants, informs judgments about the validity of results. For example, in a study that used IgM antibodies to detect acute infections, readers needed to know the interval between blood tests for IgM antibodies so that they could judge whether some infections likely were missed because the interval between blood tests was too long [41]. In other studies where follow-up procedures differed between exposed and unexposed groups, readers might recognize substantial bias due to unequal ascertainment of events or differences in non-response or loss to follow-up [42]. Accordingly, we advise that researchers describe the methods used for following participants and whether those methods were the same for all participants, and that they describe the completeness of ascertainment of variables (see also item 14).

In case-control studies, the choice of cases and controls is crucial to interpreting the results, and the method of their selection has major implications for study validity. In general, controls should reflect the population from which the cases arose. Various methods are used to sample controls, all with advantages and disadvantages: for cases that arise from a general population, population roster sampling, random digit dialling, neighbourhood or friend controls are used. Neighbourhood or friend controls may present intrinsic matching on exposure [17]. Controls with other diseases may have advantages over population-based controls, in particular for hospital-based cases, because they better reflect the catchment population of a hospital, have greater comparability of recall and ease of recruitment. However, they can present problems if the exposure of interest affects the risk of developing or being hospitalized for the control condition(s) [43,44]. To remedy this problem often a mixture of the best defensible control diseases is used [45].

Example/s:

Participants in the Iowa Women's Health Study were a random sample of all women ages 55 to 69 years derived from the state of Iowa automobile driver's license list in 1985, which represented approximately 94% of Iowa women in that age group. (…) Follow-up questionnaires were mailed in October 1987 and August 1989 to assess vital status and address changes. (…) Incident cancers, except for nonmelanoma skin cancers, were ascertained by the State Health Registry of Iowa (…). The Iowa Women's Health Study cohort was matched to the registry with combinations of first, last, and maiden names, zip code, birthdate, and social security number.

nut-6

Report any particular dietary, physiologic, or nutritional characteristics that were considered when selecting the target population.

Because of the potential influence on study results and generalizability, eligibility and exclusion criteria related to dietary intake or nutritional status are especially important to report in nutritional epidemiologic studies. Such characteristics include age, sex, smoking, BMI, and physiologic status (e.g., pregnancy). Other factors (e.g., physical activity) or conditions (e.g., disease diagnoses or obesity) that may result in dietary changes or potential misreporting of energy intake also require clear descriptions.

Example/s:

Nonsmoking women, 20–50 y of age, not occupationally exposed to cadmium, were recruited. Women were chosen as subjects because they have higher cadmium concentrations in blood and higher body burdens of cadmium than men. Furthermore, low iron stores, which have been associated with increased gastrointestinal absorption of cadmium, are more common among premenopausal women. Because cigarette smoking may significantly increase body burden (kidney concentration) and blood cadmium concentration as much as 5 times, only women who had been nonsmokers for ≥5 y were eligible for the study. None of the women were pregnant or lactating at the time of the study.

6b

Cohort study: For matched studies, give matching criteria and number of exposed and unexposed. Case-control study: For matched studies, give matching criteria and the number of controls per case.

Matching is much more common in case-control studies, but occasionally, investigators use matching in cohort studies to make groups comparable at the start of follow-up. Matching in cohort studies makes groups directly comparable for potential confounders and presents fewer intricacies than with case-control studies. For example, it is not necessary to take the matching into account for the estimation of the relative risk [48]. Because matching in cohort studies may increase statistical precision, investigators might allow for the matching in their analyses and thus obtain narrower confidence intervals.

In case-control studies matching is done to increase a study's efficiency by ensuring similarity in the distribution of variables between cases and controls, in particular the distribution of potential confounding variables [48,49]. Because matching can be done in various ways, with one or more controls per case, the rationale for the choice of matching variables and the details of the method used should be described. Commonly used forms of matching are frequency matching (also called group matching) and individual matching. In frequency matching, investigators choose controls so that the distribution of matching variables becomes identical or similar to that of cases. Individual matching involves matching one or several controls to each case. Although intuitively appealing and sometimes useful, matching in case-control studies has a number of disadvantages, is not always appropriate, and needs to be taken into account in the analysis.

Example/s:

For each patient who initially received a statin, we used propensity-based matching to identify one control who did not receive a statin according to the following protocol. First, propensity scores were calculated for each patient in the entire cohort on the basis of an extensive list of factors potentially related to the use of statins or the risk of sepsis. Second, each statin user was matched to a smaller pool of non-statin-users by sex, age (plus or minus 1 year), and index date (plus or minus 3 months). Third, we selected the control with the closest propensity score (within 0.2 SD) to each statin user in a 1:1 fashion and discarded the remaining controls.
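
To make the matching protocol quoted above concrete, here is a minimal sketch of greedy 1:1 nearest-neighbour propensity-score matching with a caliper of 0.2 SD of the score. It is not the authors' code: the exact matching on sex, age, and index date described in the example is omitted, and the column names and data layout are hypothetical.

```python
# Hedged sketch: greedy 1:1 propensity-score matching with a caliper,
# in the spirit of the protocol quoted above. Column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

def propensity_match(df: pd.DataFrame, treatment_col: str, covariate_cols: list[str],
                     caliper_sd: float = 0.2) -> list[tuple]:
    """Return (treated_index, control_index) pairs matched on the propensity score."""
    model = LogisticRegression(max_iter=1000)
    model.fit(df[covariate_cols], df[treatment_col])
    df = df.assign(ps=model.predict_proba(df[covariate_cols])[:, 1])
    caliper = caliper_sd * df["ps"].std()

    treated = df[df[treatment_col] == 1]
    controls = df[df[treatment_col] == 0].copy()
    pairs = []
    for i, row in treated.iterrows():
        if controls.empty:
            break
        diff = (controls["ps"] - row["ps"]).abs()
        j = diff.idxmin()
        if diff.loc[j] <= caliper:          # accept only matches within the caliper
            pairs.append((i, j))
            controls = controls.drop(j)     # each control is used at most once
    return pairs
```

A report would then state the variables in the propensity model, the caliper, the matching ratio, and how many exposed subjects could not be matched.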

7. Variables

Clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers. Give diagnostic criteria, if applicable.

Authors should define all variables considered for and included in the analysis, including outcomes, exposures, predictors, potential confounders and potential effect modifiers. Disease outcomes require adequately detailed description of the diagnostic criteria. This applies to criteria for cases in a case-control study, disease events during follow-up in a cohort study and prevalent disease in a cross-sectional study. Clear definitions and steps taken to adhere to them are particularly important for any disease condition of primary interest in the study.

For some studies, 'determinant' or 'predictor' may be appropriate terms for exposure variables, and outcomes may be called 'endpoints'. In multivariable models, authors sometimes use 'dependent variable' for an outcome and 'independent variable' or 'explanatory variable' for exposure and confounding variables. The latter is not precise as it does not distinguish exposures from confounders.

If many variables have been measured and included in exploratory analyses in an early discovery phase, consider providing a list with details on each variable in an appendix, additional table or separate publication. Of note, the International Journal of Epidemiology recently launched a new section with 'cohort profiles', which includes detailed information on what was measured at different points in time in particular studies [56,57]. Finally, we advise that authors declare all 'candidate variables' considered for statistical analysis, rather than selectively reporting only those included in the final models (see also item 16a) [58,59].

Example/s:

Only major congenital malformations were included in the analyses. Minor anomalies were excluded according to the exclusion list of European Registration of Congenital Anomalies (EUROCAT). If a child had more than one major congenital malformation of one organ system, those malformations were treated as one outcome in the analyses by organ system (…) In the statistical analyses, factors considered potential confounders were maternal age at delivery and number of previous parities. Factors considered potential effect modifiers were maternal age at reimbursement for antiepileptic medication and maternal age at delivery.

nut-7.1

Clearly define foods, food groups, nutrients, or other food components (e.g., preparation method, taxonomical descriptors, classification, chemical form).

To assess the health benefits of a specific dietary exposure, and to compare findings across studies, it is essential that the examined dietary exposures are clearly defined. Food security indicators or measures should be clearly described when used as a proxy for, or an indicator of, dietary intake. When the exposure variables are food groups, the components of each aggregated food group should be clearly described. When assessing the health properties of specific food items, it is helpful to specify the scientific or taxonomical names of foods, because the nutritional composition of food is strongly related to species, cultivar, and variety. The units used should be clearly presented (e.g., servings per day, grams per day, and liters per week). In reports of complex dietary exposures, it is helpful to use standardized approaches (if available) that uniformly describe, classify, and quantify exposures. For example, recommendations for reporting whole-grain intake in observational and intervention studies have been published.

In some circumstances, a high level of detail may be justified. Thus, it may be helpful to indicate recipes and report whether food intake was based on raw or cooked foods (i.e., food preparation method). In addition, the report should include how food intakes were converted into nutrients or food components by specifying the units, method of calculating intakes, and the food-composition database (see also Nut-8.2). When relevant, the full definition of non-nutrient food components (e.g., chemical form of the compounds), and the units, should be provided. Similarly, information on the method of the biochemical analysis and relevant documentation is helpful.

Example/s:

The definition of whole grains applied in the current study was in accordance with that of the American Association of Cereal Chemists and is as follows: “Whole grains shall consist of the intact, ground, cracked or flaked caryopsis, whose principal anatomical components—the starchy endosperm, germ, and bran—are present in the same relative proportions as they exist in the intact caryopsis.” Cereal species investigated in the current study were rye, wheat, oats, barley, rice, millet, corn, and maize (dried); triticale; and sorghum and durra. Whole-grain intake was expressed by the following 2 different methods to calculate intake: 1) intake of whole-grain products (grams of product per day) was calculated and consisted of 4 product categories that contained either solely whole-grain products (rye bread, whole-grain bread, or oat meal) or were dominated by whole-grain products (>75%; crispbread); 2) to quantify the absolute amount of whole grain consumed, total whole-grain (grams of whole grain per day) intake was calculated.

nut-7.2

When calculating dietary patterns, describe the methods to obtain them and their nutritional properties.

Dietary pattern analysis allows researchers to examine total diet, or combinations of many food components, rather than single nutrients or foods. Dietary patterns can be estimated by statistical data-driven techniques (a posteriori) or by dietary indexes or scores that are hypothesis based (a priori). Data handling and analysis involve many steps that need to be described clearly in order for others to fully understand the procedure and to interpret findings (see also Nut-12.1).

The dietary patterns identified from the data-driven techniques are meant to reflect the dietary habits in the population independent of any previous knowledge about dietary influences on health. The most widely used data-driven approaches are cluster, principal components, and factor analysis. Reduced rank regression is another approach that uses both dietary data and a set of response variables (e.g., plasma concentrations of disease markers) to identify patterns.

Each of these methods has its specific procedures, and researchers are required to make several informed decisions during data handling and analysis. In order for other researchers to fully understand the procedure and to interpret findings, the report should include information on the following: 1) the selection and aggregation of dietary variables, 2) any standardization used, and 3) any approach to energy adjustment. The basis for determining the number of patterns (e.g., correlation or covariance matrices and factor loadings) and the selection criteria should also be presented. A description of the rationale for labeling the dietary pattern, as well as the nutritional properties of the emerging patterns, adds clarity (see also Nut-12.1).

Dietary indexes or scores are constructed on the basis of an a priori hypothesis. Scores are assigned to individuals depending on their adherence to predefined intake amounts or the population median. The development of the dietary index or score should be described, including whether the aim was to reflect adherence to nutrition recommendations, dietary guidelines, or a certain diet, or to predict disease risk. The choice of each index component should be justified, including the cutoff values, because both food and nutrient components could partly reflect similar aspects of the diet, and thus may be highly correlated. Also describe whether there was any weighting of included components and whether variables were energy-adjusted.

Example/s:

We performed exploratory factor analysis to extract patterns that we then confirmed by using confirmatory factor analysis. To avert subjective influences in food grouping, we included all individual food items in the exploratory factor analysis. We considered eigenvalues >1.0, interpretability of factors, and number of items and their frequency to decide how many factors to extract from the data and confirm. We included items with factor loadings of ≥0.20 from exploratory analysis to test specific factor structures by using confirmatory factor analysis; the goodness-of-fit index was high (0.93 for the model including all patterns). Factor scores were calculated for each individual for each pattern by weighting the standardized intakes of the food items by their factor loadings and summing for all items. The scores of each dietary pattern were categorized into quintiles. We derived 4 major dietary patterns: “healthy” (vegetables, fruit, and legumes), “Western/Swedish” (red meat, processed meat, poultry, rice, pasta, eggs, fried potatoes, and fish), “alcohol” (wine, liquor, beer, and some snacks), and “sweets” (sweet baked goods, candy, chocolate, jam, and ice cream).
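
As a complement to the description above, the following is a minimal, hedged sketch of an a posteriori (data-driven) pattern analysis using principal components, one of the methods named in the text. It is not the exploratory/confirmatory factor-analysis procedure of the quoted study; the food groups, simulated intakes, and the eigenvalue-greater-than-1 retention rule are illustrative assumptions.

```python
# Hedged sketch: data-driven dietary patterns via principal components analysis.
# Food groups and intake values are simulated; a real analysis would use study data.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
intake = pd.DataFrame(
    rng.gamma(shape=2.0, scale=50.0, size=(500, 6)),   # grams/day, simulated
    columns=["vegetables", "fruit", "red_meat", "fish", "whole_grains", "sweets"],
)

Z = StandardScaler().fit_transform(intake)             # standardize each food group
pca = PCA().fit(Z)

n_patterns = max(1, int((pca.explained_variance_ > 1.0).sum()))   # eigenvalue > 1 rule
loadings = pd.DataFrame(
    pca.components_[:n_patterns].T,
    index=intake.columns,
    columns=[f"pattern_{k + 1}" for k in range(n_patterns)],
)
scores = pd.DataFrame(Z @ pca.components_[:n_patterns].T, columns=loadings.columns)
quintile = scores.apply(lambda s: pd.qcut(s, 5, labels=False) + 1)   # 1 = lowest fifth

print(loadings.round(2))   # loadings are inspected to label patterns (e.g., "healthy")
```

As the item recommends, a report would state the standardization, the retention rule, the loadings used to label each pattern, and how pattern scores were categorized.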

8. Data sources and measurement

For each variable of interest, give sources of data and details of methods of assessment (measurement). Describe comparability of assessment methods if there is more than one group. Give information separately for exposed and unexposed groups if applicable.

The way in which exposures, confounders and outcomes were measured affects the reliability and validity of a study. Measurement error and misclassification of exposures or outcomes can make it more difficult to detect cause-effect relationships, or may produce spurious relationships. Error in measurement of potential confounders can increase the risk of residual confounding [62,63]. It is helpful, therefore, if authors report the findings of any studies of the validity or reliability of assessments or measurements, including details of the reference standard that was used. Rather than simply citing validation studies (as in the first example), we advise that authors give the estimated validity or reliability, which can then be used for measurement error adjustment or sensitivity analyses (see items 12e and 17).

In addition, it is important to know if groups being compared differed with respect to the way in which the data were collected. This may be important for laboratory examinations (as in the second example) and other situations. For instance, if an interviewer first questions all the cases and then the controls, or vice versa, bias is possible because of the learning curve; solutions such as randomising the order of interviewing may avoid this problem. Information bias may also arise if the compared groups are not given the same diagnostic tests or if one group receives more tests of the same kind than another (see also item 9).

Example/s:

Total caffeine intake was calculated primarily using US Department of Agriculture food composition sources. In these calculations, it was assumed that the content of caffeine was 137 mg per cup of coffee, 47 mg per cup of tea, 46 mg per can or bottle of cola beverage, and 7 mg per serving of chocolate candy. This method of measuring (caffeine) intake was shown to be valid in both the NHS I cohort and a similar cohort study of male health professionals (...) Self-reported diagnosis of hypertension was found to be reliable in the NHS I cohort

Samples pertaining to matched cases and controls were always analyzed together in the same batch and laboratory personnel were unable to distinguish among cases and controls.
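
As a worked instance of the arithmetic in the first example above, the sketch below multiplies reported servings by the quoted per-serving caffeine contents; the serving counts in the usage line are hypothetical.

```python
# Illustrative arithmetic only: daily caffeine intake from reported servings,
# using the per-serving contents quoted in the example (mg per serving).
CAFFEINE_MG_PER_SERVING = {"coffee": 137, "tea": 47, "cola": 46, "chocolate_candy": 7}

def daily_caffeine_mg(servings_per_day: dict) -> float:
    """servings_per_day maps item name to servings/day, e.g. {"coffee": 2, "tea": 1}."""
    return sum(CAFFEINE_MG_PER_SERVING[item] * n for item, n in servings_per_day.items())

print(daily_caffeine_mg({"coffee": 2, "tea": 1, "cola": 0.5}))   # 274 + 47 + 23 = 344 mg
```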

nut-8.1

Describe the dietary assessment method(s) (e.g., portion size estimation, number of days and items recorded, how it was developed and administered, and how quality was ensured); report if and how supplement intake was assessed.

Because each method has different characteristics and utility, clear descriptions of the specific dietary assessment method and the procedure to collect and to analyze dietary data are needed. In addition, factors such as the location and time frame of the study (see Nut-5), as well as the mode of collecting dietary data, could potentially influence both the actual diet and the reports of the habitual diet. It is therefore helpful to describe whether the intake information was reported by participants themselves, by participants with assistance from another person, or by proxy. The mode of administration (e.g., face-to-face interview, telephone interview, questionnaire by mail, Web form) should also be reported. Furthermore, reporting procedures for quality control, how the quality of the collected data was ensured, or both, add clarity. Because dietary assessment is subject to random error and repeated assessments could substantially reduce this error, it is important to clarify whether and how repeated dietary assessments were performed and handled in the dietary analyses, particularly in cohort studies.

FFQs typically include a list of food items with questions about how often these are habitually consumed during a given time span (e.g., the previous 12 mo). Because there are many varieties of FFQs, each questionnaire needs to be judged for its ability to provide the intended dietary intake information of the specific population. Essential information includes the number of food items and frequency-response categories, as well as how portion sizes were handled. Details of food items should be described, including how they were aggregated and classified, because these are questionnaire- or study-specific. If possible, the FFQ should be provided as supplementary material to the article (see Nut-22.2).

Additional details of the FFQ that may be helpful are any control questions included (e.g., number of fish meals consumed per week when the FFQ includes several different items on fish consumption), descriptions of cooking procedures including type of fat used, as well as clear descriptions of questions on dietary supplement use. If the FFQ was intended to capture only certain aspects of the diet (e.g., a short screening questionnaire) or developed for a specific population, this should be clearly stated, and particulars with regard to the validation study should be reported (see also Nut-8.6).

Similar to the FFQ, the dietary history method was originally developed to describe the usual habitual diet of individuals. Because the method has had many adaptations and exists in a variety of combinations, it is helpful to describe the methodology and the data collection carefully. The 24-h recall is a retrospective interview method, aiming to capture the individual's consumption on the preceding day without any previous warning. Any deviation from the original method, such as whether the participants knew in advance on which day the interview would be carried out, or whether the method was a self-instructive Web questionnaire, should be stated. The number of recall days included and the days of the week (i.e., weekday or weekend) should also be stated, as should how portion sizes were assessed. The instructions given to participants before the interview need to be reported, as well as whether interview aids were provided and whether an established interview format was followed.

Food records are collected prospectively, usually by the participants. The number of recorded days (consecutive or not) and the days of the week (i.e., weekday or weekend) should be stated (see also Nut-5). It should be reported whether portion sizes were estimated (e.g., by using photographic aids) or whether foods were weighed or measured (i.e., by using household scales or measures). It is helpful to include information on the level of detail of the written or oral instructions given (e.g., handling of foods easily forgotten, such as water, and the decomposition of recipes), and whether any aids were provided.

Dietary assessment is an area in which considerable methodologic work and development have taken place. Combinations and hybrids of the common assessment methods, and new techniques for recording and reporting (e.g., the Internet and mobile phones), have been developed. When new procedures and techniques, or combinations of them, are used, they should be described in sufficient detail, and further science-based evidence of their specific validity should be provided.

Example/s:

Individual food intake is reported through a semiquantitative FFQ covering the preceding 12-mo period. Between 1992 and 1996, the FFQ included 84 food items, such as edible fats, fruit, vegetables, milk and milk products, bread, potatoes, rice, pasta, fish, meat and meat products, chicken, traditional dishes, hot and cold beverages, sweets, sugar and jam, and snacks. From 1996, this was reduced to 66 food items by deleting entire foods (e.g., liver and kidney) or by merging similar foods (e.g., merging the 2 groups “apples, pears, peaches” and “oranges, mandarines, grapefruit” into one group “apples, pears, peaches, oranges, mandarines, grapefruit”). The 2 data sources have been harmonized and combined into 1 file for the purpose of the food pattern analysis. Portion sizes for the 3 categories of potato/rice/pasta, meat/fish, and vegetables are indicated by participants through comparison with color photos of 4 plates with increasing portion sizes. Frequency of dietary intake is reported on a 9-level scale from none to ≥4 times daily. For the analysis, these frequencies were transformed to a daily frequency.
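
The transformation described in the example (frequency categories and photo-based portion sizes into daily amounts) can be sketched as follows; the frequency-to-occasions mapping and the portion weights are hypothetical values, not those used in the study.

```python
# Hedged sketch: converting a 9-level frequency scale and a photo-based portion
# size into grams/day. The numeric mappings below are assumptions for illustration.
FREQ_PER_DAY = {          # frequency category -> eating occasions per day
    "never": 0.0, "<1/month": 0.02, "1-3/month": 0.07, "1/week": 0.14,
    "2-3/week": 0.36, "4-6/week": 0.71, "1/day": 1.0, "2-3/day": 2.5, ">=4/day": 4.0,
}
PORTION_G = {"A": 100, "B": 150, "C": 200, "D": 250}   # the 4 portion-size photos

def grams_per_day(frequency_category: str, portion_photo: str) -> float:
    return FREQ_PER_DAY[frequency_category] * PORTION_G[portion_photo]

print(grams_per_day("2-3/week", "B"))   # 0.36 * 150 = 54 g/day
```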

nut-8.2

Describe and justify food-composition data used; explain the procedure to match food composition with consumption data; describe the use of conversion factors, if applicable.

In studies of energy, nutrient, and other food component intakes, the food-composition database or other food-composition data need to be described, preferably also giving a reference to the database. Appropriate guidance is needed (e.g., search strategy or references) indicating whether data are directly derived from peer-reviewed publications, monitoring programs, or new analyses. In multicenter studies covering >1 country, the handling of country-specific nutrient values should be described. Factors that influence the quality of the nutrient intake data, such as number of missing values in food-composition data and how these were treated, should be reported. In addition, if applicable, how foods were matched across countries and food databases should be reported. Any conversion factors applied to the consumed food amounts (e.g., raw-to-cooked or precursor-to-bioactive) should be reported, as well as any data handling influencing the food component concentrations (e.g., nutrient retention, yield, or bioactivity).

Example/s:

Total vitamin A was expressed both as retinol equivalents (REs) and as retinol activity equivalents (RAEs) according to the following conversion factors: RE = 1 μg all-trans retinol + 1/6 μg dietary all-trans β-carotene + 1/12 μg other dietary provitamin A carotenoids; RAE = 1 μg all-trans retinol + 1/12 μg dietary all-trans β-carotene + 1/24 μg other dietary provitamin A carotenoids. Total vitamin A values were calculated with and without separation of β-carotene isomers in those foods that displayed data for both trans and cis β-carotene. To calculate vitamin A in REs and RAEs without isomer separation the conversion factor used for all-trans β-carotene was adopted for the values of total β-carotene (trans plus cis β-carotene). Data are shown in the Brazilian Vitamin A Database as micrograms per 100 g edible portion on a fresh-weight basis.
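
The conversion factors quoted above translate directly into a small calculation; the food-composition values in the usage lines are made up for illustration.

```python
# Worked instance of the RE and RAE conversion factors quoted above (micrograms).
def retinol_equivalents(retinol_ug, beta_carotene_ug, other_provitamin_a_ug):
    return retinol_ug + beta_carotene_ug / 6 + other_provitamin_a_ug / 12

def retinol_activity_equivalents(retinol_ug, beta_carotene_ug, other_provitamin_a_ug):
    return retinol_ug + beta_carotene_ug / 12 + other_provitamin_a_ug / 24

# e.g., a food with 50 ug retinol, 600 ug all-trans beta-carotene, 120 ug other carotenoids:
print(retinol_equivalents(50, 600, 120))            # 50 + 100 + 10 = 160 ug RE
print(retinol_activity_equivalents(50, 600, 120))   # 50 + 50 + 5  = 105 ug RAE
```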

nut-8.3

Describe the nutrient requirements, recommendations, or dietary guidelines and the evaluation approach used to compare intake with the dietary reference values, if applicable.

The recommended approach when reporting the intake adequacy of micronutrients is to evaluate observed intakes against the average requirements (e.g., the Estimated Average Requirement [EAR] or the Average Requirement). The proportion of the population with intakes below the EAR, or Average Requirement, is the proportion in the study population at risk of inadequate intakes. Only reporting the mean intake in relation to the Recommended Intake or RDA is not sufficient, because this does not enable the reader to judge the adequacy of the diet. It is helpful to describe any alternative values used. When the EAR is not available for a specific group and is instead calculated (e.g., for children), it is helpful to describe any formulas used.

Example/s:

Estimates of the prevalence of inadequate intakes of essential nutrients from food sources alone were calculated by using the Estimated Average Requirement (EAR) cut-point method. The EARs were primarily derived from the United Kingdom's Dietary Reference Values. In the case of nutrients for which the EAR was not set (vitamin E, selenium, and iodine), values developed by the Food and Nutrition Board of the Institute of Medicine were used as surrogate EARs. Alternative values were used in addition to the EARs for nutrients for which considerable differences exist in dietary recommendations between countries—that is, folate and calcium—or for which vegetarian-specific recommendations exist—that is, iron and zinc.
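
A minimal sketch of the EAR cut-point method described above: the estimated prevalence of inadequacy is simply the proportion of usual intakes falling below the EAR. The intake values and the EAR used here are illustrative, not reference values.

```python
# Hedged sketch of the EAR cut-point method: prevalence of inadequate intake is
# estimated as the proportion of usual intakes below the average requirement.
import numpy as np

usual_zinc_intake_mg = np.array([6.1, 7.4, 8.0, 5.2, 9.6, 6.8, 10.3, 4.9, 7.1, 8.8])
EAR_ZINC_MG = 6.8   # illustrative value only; use the appropriate reference for the group

prevalence_inadequate = np.mean(usual_zinc_intake_mg < EAR_ZINC_MG)
print(f"{prevalence_inadequate:.0%} of the group has intakes below the EAR")   # 30%
```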

nut-8.4

When using nutritional biomarkers, additionally use the STROBE-ME; report the type of biomarkers used and their usefulness as dietary exposure markers.

Biological markers of dietary intakes (nutritional biomarkers) are objective measures that are useful in the validation of diet assessment instruments and in studies of diet and disease. The use of nutritional biomarkers that reflect dietary exposures will, in combination with self-reported dietary data, strengthen the examination of diet-disease associations.

The STROBE-ME provides general guidelines on the reporting in studies that use biomarkers (i.e., not only nutritional biomarkers) (107). Because the type of biological material, sampling method, and choice of analytic method influence the measured concentration of the biomarker, the general guidelines stress the importance of reporting how the samples were collected and handled.

The report needs to indicate if the nutritional biomarker is specific for the dietary exposure, and if it accurately reflects the intake. In addition, it is useful to know if the biomarker is sensitive to an increase in dietary intake (i.e., shows a dose-response association). Readers would also like to know whether the biomarker reflects long- or short-term dietary intake (e.g., through reporting the half-life of the biomarker) and the degree of reliability (reproducibility) of the biomarker.

Example/s:

Urinary sugars, in particular sucrose and fructose, have been investigated and developed as dietary biomarkers of total sugar intake. If 24-h urine collections are available, sucrose and fructose measured in 24-h urine can be used as predictive biomarkers of total sugar intake. We prospectively investigated the association between sucrose intake and risk of overweight and obesity in a sample of the EPIC (European Investigation into Cancer and Nutrition)-Norfolk cohort study by using urinary sugar biomarkers and self-reported dietary data. Self-reported sucrose intake was significantly positively associated with the biomarker. Associations between the biomarker and BMI were positive (β = 0.25; 95% CI: 0.08, 0.43), while they were inverse when using self-reported dietary data (β = −1.40; 95% CI: −1.81, −0.99). The age- and sex-adjusted OR for BMI (kg/m2) >25.0 in participants in the fifth compared with the first quintile was 1.54 (95% CI: 1.12, 2.12; P-trend = 0.003) when using the biomarker and 0.56 (95% CI: 0.40, 0.77; P-trend < 0.001) with self-reported dietary data. Conclusions: Our results suggest that sucrose measured by objective biomarker but not self-reported sucrose intake is positively associated with BMI.

nut-8.5

Describe the assessment of nondietary data (e.g., nutritional status and influencing factors) and timing of the assessment of these variables in relation to dietary assessment.

Nondietary data are essential components in studies of diet and health, either as potential confounders or as effect modifiers and intermediate risk factors, of the association between diet and disease. Such nondietary factors are physical (e.g., sex, age, BMI), socioeconomic (e.g., education), genetic, or lifestyle (e.g., physical activity, sedentary behavior, and smoking and alcohol habits) factors. Failure to consider such relevant factors may distort results and lead to incorrect conclusions.

Physical activity represents a particular issue in studies of diet and disease. It may be independently associated with outcome, a potential dietary confounder, or both. Estimates of physical activity may also be required when evaluating reports of energy intake.

Physical activity may be estimated by participant self-report with the use of questionnaires or diaries, or by means of objective methods such as pedometers, accelerometers, or heart rate monitors. Many different decisions taken during assessment and data handling will influence the estimated level of physical activity; thus, it is important to report such details. For example, it is helpful to explain how different items in a questionnaire are combined to estimate the physical activity level (PAL), or how estimates of the duration of activities on certain intensity levels were obtained, or how compliance with a recommendation was assessed. Information with regard to the evaluation of the procedure should be included.

Descriptions of how nondietary data were assessed are helpful to enable both understanding of the study and its replication. To facilitate the interpretation of findings, readers need to know the timing of the nondietary data and biomarker collection in relation to the dietary data collection (see also Nut-9). In addition, information on the validity of the methods used should be provided.

Anthropometric measurements (e.g., weight, height, and calculated BMI) are often collected because these measurements are relatively easy to obtain and can be used to evaluate both under- and overnutrition (e.g., obesity is a common risk factor for diet-related chronic diseases). Other simple measures are those related to body fat distribution: for example, waist circumference, waist-to-hip ratio, and skinfold thickness. More advanced measurements of adiposity and body composition can also be of interest. It is important to mention whether these data were obtained through self- or proxy reports or as objective measurements.

When the aim of a study is to identify individuals with nutritional deficiencies, it is essential also to include an assessment of biochemical data, clinical signs of deficiency, or both, because dietary intake assessments alone can only estimate the proportion of a population at risk of nutritional deficiencies (see Nut-8.3).

Example/s:

BMI was calculated from weight reported on each biennial questionnaire and height reported at the first questionnaire. Smoking status and number of cigarettes smoked, history of hypertension, aspirin use (number of tablets and frequency of use), regular intake of multivitamins, menopausal status, use of postmenopausal hormone therapy, parity, and age at first birth were assessed every 2 y.

nut-8.6

Report on the validity of the dietary or nutritional assessment methods and any internal or external validation used in the study, if applicable.

The published report from an observational study is improved by including information on measures taken when evaluating the validity of the dietary assessment tool. This will inform the readers whether the tool actually measures the intended aspect of the diet. Relevant information includes sufficient details about the specific dietary aspect validated, the reference method used, the measures of validity, the population studied, and the sample size. If the reference method is another dietary assessment method (i.e., relative validation), details on, for example, number of days, weighed or estimated records, as well as the season and time frame of data collection are useful.

Because no single measure covers all aspects of validity, it is a clear advantage to report >1 approach when describing the validity of a dietary assessment tool. Valuable basic information includes whether there is an overall reporting bias (i.e., under- or overestimation of dietary intake), whether there is a dose-response relation (i.e., from partial or single correlation or linear regression analyses) between the estimated intake and the intake measured with the reference method, and whether the validity of a method differs between subgroups.

The understanding of measurement errors in dietary assessment is increasing, and techniques have been developed to take measurement error into account when assessing diet-disease associations. Understanding these techniques has resulted in additional emphasis on detailed reporting on the procedures assessing the validity of dietary assessment methods.

Example/s:

We compared FFQ-assessed acrylamide intake with a biomarker of acrylamide intake, hemoglobin adducts of acrylamide and its genotoxic metabolite glycidamide, in a sample of 296 nonsmoking women in the Nurses' Health Study (NHS) II cohort. The correlation was 0.34 (P < 0.0001), adjusted for age, energy intake, BMI, and alcohol intake, and corrected for random within-person variation in the adduct measurement.
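
Two of the basic validity measures mentioned above (overall reporting bias and a correlation with the reference method) can be computed as in the sketch below; the paired intakes are invented numbers, and a real validation study would add, for example, energy adjustment and correction for within-person variation.

```python
# Hedged sketch: overall reporting bias (mean FFQ-minus-reference difference) and
# the rank correlation between an FFQ and a reference method. Values are invented.
import numpy as np
from scipy.stats import spearmanr

ffq_fiber_g = np.array([18, 22, 15, 30, 25, 12, 20, 27])      # FFQ estimate, g/day
record_fiber_g = np.array([21, 24, 14, 33, 29, 15, 22, 31])   # weighed-record reference

bias = np.mean(ffq_fiber_g - record_fiber_g)      # negative value: FFQ underestimates
rho, p_value = spearmanr(ffq_fiber_g, record_fiber_g)
print(f"mean difference = {bias:.1f} g/day, Spearman rho = {rho:.2f} (P = {p_value:.3f})")
```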

9. Bias

Describe any efforts to address potential sources of bias.

Biased studies produce results that differ systematically from the truth (see also Box 3). It is important for a reader to know what measures were taken during the conduct of a study to reduce the potential of bias. Ideally, investigators carefully consider potential sources of bias when they plan their study. At the stage of reporting, we recommend that authors always assess the likelihood of relevant biases. Specifically, the direction and magnitude of bias should be discussed and, if possible, estimated. For instance, in case-control studies information bias can occur, but may be reduced by selecting an appropriate control group, as in the first example [64]. Differences in the medical surveillance of participants were a problem in the second example [65]. Consequently, the authors provide more detail about the additional data they collected to tackle this problem. When investigators have set up quality control programs for data collection to counter a possible “drift” in measurements of variables in longitudinal studies, or to keep variability at a minimum when multiple observers are used, these should be described.

Unfortunately, authors often do not address important biases when reporting their results. Among 43 case-control and cohort studies published from 1990 to 1994 that investigated the risk of second cancers in patients with a history of cancer, medical surveillance bias was mentioned in only 5 articles [66]. A survey of reports of mental health research published during 1998 in three psychiatric journals found that only 13% of 392 articles mentioned response bias [67]. A survey of cohort studies in stroke research found that 14 of 49 (28%) articles published from 1999 to 2003 addressed potential selection bias in the recruitment of study participants and 35 (71%) mentioned the possibility that any type of bias may have affected results.

Example/s:

In most case-control studies of suicide, the control group comprises living individuals, but we decided to use a control group of people who had died of other causes. With a control group of deceased individuals, the sources of information are informants who have recently experienced the death of a family member or close associate - and are therefore more comparable to the sources of information in the suicide group than if living controls were used.

Detection bias could influence the association between Type 2 diabetes mellitus (T2DM) and primary open-angle glaucoma (POAG) if women with T2DM were under closer ophthalmic surveillance than women without this condition. We compared the mean number of eye examinations reported by women with and without diabetes. We also recalculated the relative risk for POAG with additional control for covariates associated with more careful ocular surveillance (a self-report of cataract, macular degeneration, number of eye examinations, and number of physical examinations).

nut-9

Report how bias in dietary or nutritional assessment was addressed (e.g., misreporting, changes in habits as a result of being measured, data imputation from other sources).

Information bias and selection bias are concerns in nutrition research, and measures taken to identify or reduce the potential of these biases during all stages of the study (i.e., planning, data collection, data handling, and statistical analysis) need to be reported. When study participants have made changes in their diets (e.g., due to their own or a relative's disease diagnosis), the reported diet may reflect their present diet correctly. However, such reports may be misleading when examining dietary intakes in relation to health and disease, because the development of chronic disease commonly proceeds over many years.

Some population groups may be at particular risk of misreporting their energy intake (e.g., weight-conscious persons, those who eat out frequently), whereas others (e.g., children) may not be able to report their dietary habits. It will help readers to interpret study findings if information is included about the study setting (see Nut-5), handling of misreporting, and use of any imputations (see Nut-6, -13, and -17). Information about sampling and self-selection of participants will make it possible for the reader to evaluate the effect of selection as well as the ability to generalize the study findings to the source (or other) populations. Thus, authors ought to describe how subjects were selected, report the characteristics of nonrespondents and dropouts, and discuss how differences might affect observed associations.

Studies may consider the exclusion of participants with potentially biased dietary reports. However, an examination of the robustness of study findings is encouraged, with a subsequent discussion of potential differences between subgroups.

Example/s:

Diagnoses within 6 mo of food diary completion were excluded to ensure that latent disease without formal diagnosis was not present; otherwise, disease suspected by participants could have influenced their dietary habits. In sensitivity analyses, women with extreme intakes, defined as >1.5 times the IQR above the 75th percentile, were excluded in tests for linear trends. To investigate the robustness of results to missing data, analyses were repeated by using multiple imputation by chained equations, with imputations based on exposure, covariates, and outcome.
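
As a minimal illustration of the exclusion rule quoted above, the following Python sketch flags intakes more than 1.5 times the IQR above the 75th percentile so that the main analysis can be rerun with and without the flagged participants. The data frame and the `intake` column are hypothetical placeholders; the rule itself should be adapted and justified for the study at hand.

```python
# Hypothetical sketch: flag extreme intakes (>1.5 x IQR above the 75th percentile)
# so the main analysis can be repeated with and without these participants.
import pandas as pd

def flag_extreme_intakes(df: pd.DataFrame, col: str = "intake") -> pd.Series:
    q1, q3 = df[col].quantile([0.25, 0.75])
    upper_limit = q3 + 1.5 * (q3 - q1)
    return df[col] > upper_limit

# Usage (illustrative): compare estimates from both analyses in a sensitivity table.
# full_model = fit_main_model(df)                                  # all participants
# restricted_model = fit_main_model(df.loc[~flag_extreme_intakes(df)])  # extremes excluded
```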

10. Study size

Explain how the study size was arrived at.

A study should be large enough to obtain a point estimate with a sufficiently narrow confidence interval to meaningfully answer a research question. Large samples are needed to distinguish a small association from no association. Small studies often provide valuable information, but wide confidence intervals may indicate that they contribute less to current knowledge in comparison with studies providing estimates with narrower confidence intervals. Also, small studies that show ‘interesting' or ‘statistically significant' associations are published more frequently than small studies that do not have ‘significant' findings. While these studies may provide an early signal in the context of discovery, readers should be informed of their potential weaknesses.

The importance of sample size determination in observational studies depends on the context. If an analysis is performed on data that were already available for other purposes, the main question is whether the analysis of the data will produce results with sufficient statistical precision to contribute substantially to the literature, and sample size considerations will be informal. Formal, a priori calculation of sample size may be useful when planning a new study. Such calculations are associated with more uncertainty than implied by the single number that is generally produced. For example, estimates of the rate of the event of interest or other assumptions central to calculations are commonly imprecise, if not guesswork. The precision obtained in the final analysis can often not be determined beforehand because it will be reduced by inclusion of confounding variables in multivariable analyses, the degree of precision with which key variables can be measured, and the exclusion of some individuals.

Few epidemiological studies explain or report deliberations about sample size. We encourage investigators to report pertinent formal sample size calculations if they were done. In other situations, they should indicate the considerations that determined the study size (e.g., a fixed available sample). If the observational study was stopped early when statistical significance was achieved, readers should be told. Do not bother readers with post hoc justifications for study size or retrospective power calculations. From the point of view of the reader, confidence intervals indicate the statistical precision that was ultimately obtained. It should be realized that confidence intervals reflect statistical uncertainty only, and not all uncertainty that may be present in a study (see item 20).
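
Where a formal a priori calculation is reported, making its inputs explicit helps readers judge the assumptions. The Python sketch below shows a standard two-proportion calculation with purely illustrative inputs; it is not a substitute for study-specific considerations such as confounder adjustment or measurement error.

```python
# Illustrative a priori sample size calculation for comparing two cumulative
# incidences (standard two-proportion formula); all inputs are assumptions.
from scipy.stats import norm

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a difference between p1 and p2."""
    z_a = norm.ppf(1 - alpha / 2)
    z_b = norm.ppf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return num / (p1 - p2) ** 2

print(round(n_per_group(0.10, 0.15)))  # about 686 participants per group
```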

11. Quantitative variables

Explain how quantitative variables were handled in the analyses. If applicable, describe which groupings were chosen, and why.

Investigators make choices regarding how to collect and analyse quantitative data about exposures, effect modifiers and confounders. For example, they may group a continuous exposure variable to create a new categorical variable. Grouping choices may have important consequences for later analyses. We advise that authors explain why and how they grouped quantitative data, including the number of categories, the cut-points, and category mean or median values. Whenever data are reported in tabular form, the counts of cases, controls, persons at risk, person-time at risk, etc. should be given for each category. Tables should not consist solely of effect-measure estimates or results of model fitting.

Investigators might model an exposure as continuous in order to retain all the information. In making this choice, one needs to consider the nature of the relationship of the exposure to the outcome. As it may be wrong to assume a linear relation automatically, possible departures from linearity should be investigated. Authors could mention alternative models they explored during analyses (e.g., using log transformation, quadratic terms or spline functions). Several methods exist for fitting a non-linear relation between the exposure and outcome. Also, it may be informative to present both continuous and grouped analyses for a quantitative exposure of prime interest.
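
The following Python sketch, using synthetic data and hypothetical variable names, illustrates one way of presenting both grouped and continuous analyses and of checking for departure from linearity with a quadratic term; splines or fractional polynomials are common alternatives.

```python
# Sketch: examine a quantitative exposure both grouped (quintiles) and continuously.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data purely for illustration.
rng = np.random.default_rng(0)
df = pd.DataFrame({"exposure": rng.gamma(2.0, 1.5, size=2000)})
logit_p = -2 + 0.3 * df["exposure"]
df["case"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Grouped analysis: quintiles of exposure, with counts per category.
df["exposure_q5"] = pd.qcut(df["exposure"], 5, labels=False)
print(df.groupby("exposure_q5")["case"].agg(["sum", "count"]))

# Continuous analysis, with a simple check for departure from linearity.
linear = smf.logit("case ~ exposure", data=df).fit(disp=0)
quadratic = smf.logit("case ~ exposure + I(exposure ** 2)", data=df).fit(disp=0)
lr_stat = 2 * (quadratic.llf - linear.llf)  # compare to a chi-square with 1 df
print(f"LR statistic for the quadratic term: {lr_stat:.2f}")
```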

In a recent survey, two thirds of epidemiological publications studied quantitative exposure variables. In 42 of 50 articles (84%) exposures were grouped into several ordered categories, but often without any stated rationale for the choices made. Fifteen articles used linear associations to model continuous exposure but only two reported checking for linearity. In another survey, of the psychological literature, dichotomization was justified in only 22 of 110 articles (20%).

Example/s:

Patients with a Glasgow Coma Scale less than 8 are considered to be seriously injured. A GCS of 9 or more indicates less serious brain injury. We examined the association of GCS in these two categories with the occurrence of death within 12 months from injury.

nut-11

Explain categorization of dietary/nutritional data (e.g., use of N-tiles and handling of nonconsumers) and the choice of reference category, if applicable.

In nutritional epidemiology, nutrient and food variables are often examined in categories delineated by N-tiles (e.g., quintile cutoffs indicating fifths of the distribution; see also Nut-14). This is one way of handling outliers, exaggerated intakes (i.e., potential measurement errors), and nonconsumption. Nonconsumption is common for certain foods (e.g., meat) and for alcohol.

The design features of dietary assessment tools may result in exaggerated reports of high intakes. For instance, if many different types of a food item (e.g., fish) are listed in an FFQ this may result in a misleadingly inflated intake in absolute terms (see also Nut-8.1). The true intakes of those individuals who report very high intakes may, however, correctly belong to the higher end of the distribution. In addition, foods with a high concentration of certain nutrients (e.g., vitamin A) may be consumed episodically and unequally in the population, potentially resulting in skewed distributions. Categorization of exposure variables is also needed when a specific cutoff has been recognized, and intakes below or above certain levels need to be compared (e.g., to express the compliance with dietary recommendations). A clear description of the selected categories and cutoffs, the mean or median values of categories, the reference category, and how nonconsumers were handled will be helpful to readers.

In studies that estimate disease risks, the preferred reference category should be one that is stable and includes a sufficient number of study subjects. Although the reference category is often the category with the lowest (or highest) nutrient intake, there may be particular reasons for selecting another category. For instance, individuals who report zero consumption of alcohol may be a mix of those who have never tasted alcohol and those who previously consumed large amounts and recently stopped. In such cases, a more suitable reference may be regular consumers of low amounts. Similarly, a midcategory of the intake distribution might be chosen as the reference category when both high and low intakes are proposed to be associated with the outcome (i.e., U-shaped association).

Excluding nonconsumers from the analysis could be informative in both descriptive and etiologic studies but could also bias the findings. In association studies, nonconsumers may serve as the reference category for RR estimates, or be the measure of interest. That is, nonconsumers may be retained in the sample for population mean estimates, or excluded when the average portion size is estimated. If nonconsumers are excluded, their key characteristics should be reported and compared with those of the examined study sample to ensure clarity when interpreting study findings.
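
A minimal sketch of such a categorization is shown below, assuming a hypothetical intake variable with a substantial proportion of nonconsumers: nonconsumers are kept as a separate reference category and consumers are split into quartiles, with category boundaries and medians reported.

```python
# Sketch: categorize an intake variable with nonconsumers kept as a separate
# reference category and consumers split into quartiles. Data are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
intake = pd.Series(np.where(rng.random(1000) < 0.3, 0.0, rng.gamma(2.0, 10.0, 1000)))

consumers = intake > 0
categories = pd.Series("nonconsumer", index=intake.index)
categories[consumers] = pd.qcut(intake[consumers], 4,
                                labels=["Q1", "Q2", "Q3", "Q4"]).astype(str)

# Report category boundaries and median intake within each category so readers
# can compare results across studies.
print(intake.groupby(categories).agg(["count", "min", "median", "max"]))
```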

Example/s:

We combined FFQ items to create variables reflecting intakes of 1) total sugary beverages (combining sugar-sweetened soft drinks, fruit juice, and fruit drinks), 2) sugar-sweetened soft drinks (high-sugar carbonated beverages, such as cola), and 3) artificially sweetened soft drinks (sugar-free carbonated beverages, such as diet cola). We created new intake categories to ensure that an adequate number of participants were retained in each intake group across each variable. Cut points were determined before conducting the main analyses based on the relative distribution of intake for each variable. Total sugary beverage consumption was examined as <1/d (reference), 1–2/d, and >2/d; sugar-sweetened soft drink intake was examined as 0/wk (reference), ≤3/wk, and >3/wk; and artificially sweetened soft drink intake was examined as 0/wk (reference), ≤6/wk, and ≥1/d.

12a Statistical methods

Describe all statistical methods, including those used to control for confounding.

In general, there is no one correct statistical analysis but, rather, several possibilities that may address the same question, but make different assumptions. Regardless, investigators should pre-determine analyses at least for the primary study objectives in a study protocol. Often additional analyses are needed, either instead of, or as well as, those originally envisaged, and these may sometimes be motivated by the data. When a study is reported, authors should tell readers whether particular analyses were suggested by data inspection. Even though the distinction between pre-specified and exploratory analyses may sometimes be blurred, authors should clarify reasons for particular analyses.

If groups being compared are not similar with regard to some characteristics, adjustment should be made for possible confounding variables by stratification or by multivariable regression (see Box 5) [94]. Often, the study design determines which type of regression analysis is chosen. For instance, Cox proportional hazards regression is commonly used in cohort studies [95], whereas logistic regression is often the method of choice in case-control studies [96,97]. Analysts should fully describe specific procedures for variable selection and not only present results from the final model [98,99]. If model comparisons are made to narrow down a list of potential confounders for inclusion in a final model, this process should be described. It is helpful to tell readers if one or two covariates are responsible for a great deal of the apparent confounding in a data analysis. Other statistical analyses such as imputation procedures, data transformation, and calculations of attributable risks should also be described. Non-standard or novel approaches should be referenced and the statistical software used reported. As a guiding principle, we advise statistical methods be described “with enough detail to enable a knowledgeable reader with access to the original data to verify the reported results” [100].
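
As a generic illustration (not the analysis of any study cited here), the sketch below fits a Cox proportional hazards model with pre-specified covariates using the Python lifelines package and its bundled example dataset; the point is that the covariates entered, and the rationale for them, are stated rather than left implicit.

```python
# Sketch: a Cox proportional hazards model with pre-specified confounders,
# using the lifelines example dataset (duration = week, event = arrest).
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()

cph = CoxPHFitter()
cph.fit(df[["week", "arrest", "fin", "age", "prio"]],
        duration_col="week", event_col="arrest")
cph.print_summary()  # hazard ratios with 95% CIs for the exposure and confounders
```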

In an empirical study, only 93 of 169 articles (55%) reporting adjustment for confounding clearly stated how continuous and multi-category variables were entered into the statistical model [101]. Another study found that among 67 articles in which statistical analyses were adjusted for confounders, it was mostly unclear how confounders were chosen.

Example/s:

The adjusted relative risk was calculated using the Mantel-Haenszel technique, when evaluating if confounding by age or gender was present in the groups compared. The 95% confidence interval (CI) was computed around the adjusted relative risk, using the variance according to Greenland and Robins and Robins et al.

12b Subgroups and interactions

Describe any methods used to examine subgroups and interactions.

As discussed in detail under item 17, many debate the use and value of analyses restricted to subgroups of the study population [4,104]. Subgroup analyses are nevertheless often done [4]. Readers need to know which subgroup analyses were planned in advance, and which arose while analysing the data. Also, it is important to explain what methods were used to examine whether effects or associations differed across groups (see item 17).

Interaction relates to the situation when one factor modifies the effect of another (therefore also called ‘effect modification'). The joint action of two factors can be characterized in two ways: on an additive scale, in terms of risk differences; or on a multiplicative scale, in terms of relative risk (see Box 8). Many authors and readers may have their own preference about the way interactions should be analysed. Still, they may be interested to know to what extent the joint effect of exposures differs from the separate effects. There is consensus that the additive scale, which uses absolute risks, is more appropriate for public health and clinical decision making [105]. Whatever view is taken, this should be clearly presented to the reader, as is done in the example below [103]. A layout presenting the separate effects of both exposures as well as their joint effect, each relative to no exposure, might be most informative. It is presented in the example for interaction under item 17, and the calculations on the different scales are explained in Box 8.

Example/s:

Sex differences in susceptibility to the 3 lifestyle-related risk factors studied were explored by testing for biological interaction according to Rothman: a new composite variable with 4 categories (a−b−, a−b+, a+b−, and a+b+) was redefined for sex and a dichotomous exposure of interest where a− and b− denote absence of exposure. RR was calculated for each category after adjustment for age. An interaction effect is defined as departure from additivity of absolute effects, and excess RR caused by interaction (RERI) was calculated:

RERI = RR(a+b+) − RR(a+b−) − RR(a−b+) + 1

where RR(a+b+) denotes RR among those exposed to both factors where RR(a−b−) is used as reference category (RR = 1.0). Ninety-five percent CIs were calculated as proposed by Hosmer and Lemeshow. RERI of 0 means no interaction.
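
A minimal sketch of the point estimate implied by this formula is given below, with illustrative relative risks only; the Hosmer and Lemeshow confidence interval additionally requires the covariance matrix of the estimated coefficients (delta method), which is not shown.

```python
# Sketch: point estimate of the relative excess risk due to interaction (RERI).
# RERI = RR(a+b+) - RR(a+b-) - RR(a-b+) + 1, with RR(a-b-) = 1 as reference.
def reri(rr_ab, rr_a, rr_b):
    return rr_ab - rr_a - rr_b + 1.0

print(reri(rr_ab=3.7, rr_a=1.8, rr_b=1.6))  # 1.3 -> positive departure from additivity
# Confidence intervals (Hosmer and Lemeshow) additionally require the covariance
# matrix of the estimated log relative risks; not shown here.
```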

12c Missing data

Explain how missing data were addressed.

Missing data are common in observational research. Questionnaires posted to study participants are not always filled in completely, participants may not attend all follow-up visits and routine data sources and clinical databases are often incomplete. Despite its ubiquity and importance, few papers report in detail on the problem of missing data [5,107]. Investigators may use any of several approaches to address missing data. We describe some strengths and limitations of various approaches in Box 6. We advise that authors report the number of missing values for each variable of interest (exposures, outcomes, confounders) and for each step in the analysis. Authors should give reasons for missing values if possible, and indicate how many individuals were excluded because of missing data when describing the flow of participants through the study (see also item 13). For analyses that account for missing data, authors should describe the nature of the analysis (e.g., multiple imputation) and the assumptions that were made (e.g., missing at random, see Box 6).

Example/s:

Our missing data analysis procedures used missing at random (MAR) assumptions. We used the MICE (multivariate imputation by chained equations) method of multiple multivariate imputation in STATA. We independently analysed 10 copies of the data, each with missing values suitably imputed, in the multivariate logistic regression analyses. We averaged estimates of the variables to give a single mean estimate and adjusted standard errors according to Rubin's rules.
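
The pooling step of such a procedure (Rubin's rules) can be written in a few lines; the sketch below uses illustrative numbers rather than output from the study quoted above.

```python
# Sketch: pooling estimates from m imputed datasets with Rubin's rules.
# `estimates` and `variances` would come from fitting the same model to each
# imputed copy of the data; the values below are illustrative.
import numpy as np

estimates = np.array([0.42, 0.39, 0.45, 0.40, 0.44])       # e.g., log odds ratios
variances = np.array([0.010, 0.011, 0.009, 0.010, 0.012])  # their squared SEs
m = len(estimates)

pooled = estimates.mean()
within = variances.mean()                    # average within-imputation variance
between = estimates.var(ddof=1)              # between-imputation variance
total_var = within + (1 + 1 / m) * between   # Rubin's rules total variance
pooled_se = np.sqrt(total_var)
print(f"pooled estimate {pooled:.3f}, SE {pooled_se:.3f}")
```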

12d Loss to follow up

Cohort study: if applicable, explain how loss to follow-up was addressed. Case-control study: if applicable, explain how matching of cases and controls was addressed. Cross-sectional study: if applicable, describe analytical methods taking account of sampling strategy.

Cohort studies are analysed using life table methods or other approaches that are based on the person-time of follow-up and time to developing the disease of interest. Among individuals who remain free of the disease at the end of their observation period, the amount of follow-up time is assumed to be unrelated to the probability of developing the outcome. This will be the case if follow-up ends on a fixed date or at a particular age. Loss to follow-up occurs when participants withdraw from a study before that date. This may hamper the validity of a study if loss to follow-up occurs selectively in exposed individuals, or in persons at high risk of developing the disease (‘informative censoring'). In the example below, patients lost to follow-up in treatment programmes with no active follow-up had fewer CD4 helper cells than those remaining under observation and were therefore at higher risk of dying [116].

It is important to distinguish persons who reach the end of the study from those lost to follow-up. Unfortunately, statistical software usually does not distinguish between the two situations: in both cases follow-up time is automatically truncated (‘censored') at the end of the observation period. Investigators therefore need to decide, ideally at the stage of planning the study, how they will deal with loss to follow-up. When few patients are lost, investigators may either exclude individuals with incomplete follow-up, or treat them as if they withdrew alive at either the date of loss to follow-up or the end of the study. We advise authors to report how many patients were lost to follow-up and what censoring strategies they used.
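
The sketch below (Python, with hypothetical dates and column names) shows one way to construct censored follow-up times while keeping the administrative end of follow-up and loss to follow-up distinguishable, so that both can be reported.

```python
# Sketch: computing censored follow-up time, distinguishing administrative
# end of follow-up from loss to follow-up. Column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "entry":      pd.to_datetime(["2000-01-01", "2000-06-01", "2001-03-01"]),
    "event_date": pd.to_datetime([None, "2003-02-15", None]),
    "loss_date":  pd.to_datetime([None, None, "2002-07-01"]),
})
admin_end = pd.Timestamp("2005-12-31")

exit_date = df[["event_date", "loss_date"]].min(axis=1).fillna(admin_end)
df["event"] = df["event_date"].notna() & (df["event_date"] <= exit_date)
df["followup_years"] = (exit_date - df["entry"]).dt.days / 365.25
df["lost_to_followup"] = df["loss_date"].notna() & ~df["event"]
print(df[["event", "followup_years", "lost_to_followup"]])
```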

Example/s:

In treatment programmes with active follow-up, those lost to follow-up and those followed-up at 1 year had similar baseline CD4 cell counts (median 115 cells per μL and 123 cells per μL), whereas patients lost to follow-up in programmes with no active follow-up procedures had considerably lower CD4 cell counts than those followed-up (median 64 cells per μL and 123 cells per μL). (…) Treatment programmes with passive follow-up were excluded from subsequent analyses.

12e Sensitivity analysis

Describe any sensitivity analyses.

Sensitivity analyses are useful to investigate whether or not the main results are consistent with those obtained with alternative analysis strategies or assumptions [121]. Issues that may be examined include the criteria for inclusion in analyses, the definitions of exposures or outcomes [122], which confounding variables merit adjustment, the handling of missing data [120,123], possible selection bias or bias from inaccurate or inconsistent measurement of exposure, disease and other variables, and specific analysis choices, such as the treatment of quantitative variables (see item 11). Sophisticated methods are increasingly used to simultaneously model the influence of several biases or assumptions [124–126].

In 1959 Cornfield et al. famously showed that a relative risk of 9 for cigarette smoking and lung cancer was extremely unlikely to be due to any conceivable confounder, since the confounder would need to be at least nine times as prevalent in smokers as in non-smokers [127]. This analysis did not rule out the possibility that such a factor was present, but it did identify the prevalence such a factor would need to have. The same approach was recently used to identify plausible confounding factors that could explain the association between childhood leukaemia and living near electric power lines [128]. More generally, sensitivity analyses can be used to identify the degree of confounding, selection bias, or information bias required to distort an association. One important, perhaps under recognised, use of sensitivity analysis is when a study shows little or no association between an exposure and an outcome and it is plausible that confounding or other biases toward the null are present.
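
One simple modern variant of this reasoning is the E-value of VanderWeele and Ding, which gives the minimum strength of association an unmeasured confounder would need to have with both exposure and outcome to fully explain an observed relative risk. A minimal sketch follows; it is an illustration of that approach, not of the original Cornfield calculation.

```python
# Sketch: the E-value, a bound in the same spirit as Cornfield's argument.
import math

def e_value(rr):
    rr = max(rr, 1 / rr)          # use the risk ratio on the >1 scale
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(9.0), 2))   # smoking and lung cancer example: about 17.5
print(round(e_value(1.5), 2))   # a modest association: about 2.37
```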

Example/s:

Because we had a relatively higher proportion of ‘missing' dead patients with insufficient data (38/148=25.7%) as compared to live patients (15/437=3.4%) (…), it is possible that this might have biased the results. We have, therefore, carried out a sensitivity analysis. We have assumed that the proportion of women using oral contraceptives in the study group applies to the whole (19.1% for dead, and 11.4% for live patients), and then applied two extreme scenarios: either all the exposed missing patients used second generation pills or they all used third-generation pills.

nut-12.1

Describe any statistical method used to combine dietary or nutritional data, if applicable.

A clear description of the statistical methods ensures transparency and enables other researchers to reproduce the approach in studies of similar design. Assumptions made when combining data should be stated. Studies occasionally combine 2 dietary data collection methods (e.g., FFQ and 24-h recalls). If so, and in order to allow an appropriate interpretation of results, the report should include the method used to combine the dietary or nutritional data and identify the strengths and weaknesses of this approach. When appropriate, a justification for the chosen method is informative.

If dietary patterns are used to represent whole diets, the theoretical basis for the methods should be justified, and any subjective elements in the method clearly identified (see also Nut-7.2). Where possible, the patterns should be fully characterized and the basis presented for any subjective labels (e.g., correlation or covariance matrices, or factor loadings).

The units used should be clearly presented for all variables (e.g., servings per day, portions, grams, millimoles per liter). The same level of detail is equally important for any covariate considered, and the precision of any numbers given should be considered. A detailed description of the time frame for dietary intake will help the reader appreciate the appropriateness of the data collection methods and of any modeling assumptions. Similarly, when differential absorption of food and supplemental sources is present, additional care should be taken to describe the methods and models and to state the assumptions made.

Example/s:

To best represent long-term diet, we used cumulative average acrylamide intake as our main exposure measure. That is, 1980 intake was used for follow-up from 1980 to 1984; the average of 1980 and 1984 intake was used for follow-up from 1984 to 1986; the average of 1980, 1984, and 1986 was used for follow-up from 1986 to 1990; and so on. This exposure measure also reduces random within-person measurement error over time. In secondary analyses, we used baseline (1980) acrylamide intake only. In addition, we did a latency analysis for breast cancer because of the large number of cases. We used our repeated measures of acrylamide intake to analyze the effect of latency time (time from exposure to cancer) by relating each measure of acrylamide intake to breast cancer incidence during specific periods of latency time: 0–4, 4–8, 8–12, and 12–16 y.
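
A minimal sketch of this cumulative averaging, assuming hypothetical columns of repeated FFQ-derived intakes, is given below; missing questionnaire cycles are simply ignored in the running mean.

```python
# Sketch: cumulative averaging of repeated FFQ measurements over follow-up cycles.
# Column names (intake_1980, ...) are hypothetical.
import numpy as np
import pandas as pd

intakes = pd.DataFrame({
    "intake_1980": [12.0, 20.0, 15.0],
    "intake_1984": [14.0, 18.0, np.nan],
    "intake_1986": [13.0, 22.0, 16.0],
})

# Cumulative mean of all questionnaires administered up to each cycle,
# ignoring missing cycles.
cum_counts = intakes.notna().cumsum(axis=1)
cum_avg = intakes.fillna(0).cumsum(axis=1).div(cum_counts)
print(cum_avg)
```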

nut-12.2

Describe and justify the method for energy adjustments, intake modeling, and use of weighting factors, if applicable.

Individuals with high energy intakes might have a higher consumption of many food components. Therefore, failure to adjust nutrient intakes for energy intake could lead to misleading conclusions with regard to the link between dietary intakes and disease. In addition, energy adjustment will potentially remove some of the negative influence of dietary measurement errors. The method used for energy adjustments (i.e., residual or nutrient density) should be described.

It is also recommended to describe whether the energy adjustments include or exclude energy from any particular food or nutrient. For example, in studies in which alcohol is a strong risk factor for the disease (e.g., in studies on breast cancer), and there is a need to examine alcohol use separately, nonalcohol energy may be used instead of total energy when computing nutrient densities or nutrient residuals to examine dietary exposures. It is helpful to describe the statistical techniques used to remove the within-person error when using short-term instruments, such as 24-h dietary recalls, to estimate the proportion of a population below or above a recommendation or cutoff.
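
The residual method can be summarized in a few lines. The sketch below uses synthetic data and is only intended to show the computation (the residual plus the expected nutrient intake at the mean energy intake), not a recommended model specification.

```python
# Sketch: the residual method for energy adjustment. The nutrient is regressed
# on total energy intake; the residual plus the expected nutrient intake at the
# mean energy level is used as the energy-adjusted exposure. Data are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)
energy = rng.normal(2000, 400, 1000)                  # kcal/d
nutrient = 0.02 * energy + rng.normal(0, 5, 1000)     # e.g., g/d
df = pd.DataFrame({"energy": energy, "nutrient": nutrient})

model = sm.OLS(df["nutrient"], sm.add_constant(df["energy"])).fit()
# For OLS with an intercept, the predicted intake at the mean energy level
# equals the mean nutrient intake, so it can be added back as a constant.
df["nutrient_energy_adj"] = model.resid + df["nutrient"].mean()
```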

Example/s:

Example 1.

After examining the distribution of the data, all nutrient intake and biomarker variables were log-transformed to improve normality. We used the residual method to adjust dietary FAs and carotenoids for total energy by regressing nutrient intakes on 1) self-reported total energy intake derived from FFQs and 2) body weight and physical activity.

Example 2.

The macronutrient intake is reported as absolute intake (grams per day) and as a percentage of energy, except for fiber, which is presented as grams per day and grams per mega-Joule. Micronutrient intake is presented as nutrient density (i.e., the amount of reported intake per 4.2 MJ); Nordic nutrition recommendations were used as a reference. For micronutrients, the recommendations were converted to nutrient density by dividing the recommended nutrient intake with the recommended energy intake multiplied by 4.2 MJ.

nut-12.3

Report any adjustments for measurement error (i.e., from a validity or calibration study).

Despite the improvement in dietary assessment methods, random and systematic measurement errors, both within and between individuals, may be present in dietary data. The statistical understanding of dietary measurement errors is increasing, and different methods have been developed to try to correct for measurement errors in analysis when examining associations between dietary exposures and disease risks. Because these methods are all based on specific assumptions, and depend on the type of calibration study and data available, there is a need to clearly describe them in order to improve the interpretation. It is helpful to provide the rationale for the adjustment as well as to describe the adjustment method, including risk estimates with 95% CIs.

Example/s:

A second FFQ was taken from a sample of 1918 (5%) of the cohort, from which the amount of random measurement error was estimated by using a regression calibration approach to obtain individual predicted values of dietary exposure for all participants. Cox proportional hazards regression was then conducted by using the predicted values for each individual categorized into quintiles to give estimated HRs corrected for some of the effects of measurement error. 95% CIs were obtained from bootstrapped estimates.
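
A simplified sketch of regression calibration along these lines is shown below, with synthetic data and hypothetical variable names. In practice the calibration model usually includes covariates, and confidence intervals should account for both estimation steps (e.g., by bootstrapping, as in the example above).

```python
# Sketch of regression calibration: in a substudy with a reference measurement,
# regress the reference value on the FFQ value, then replace each participant's
# FFQ value with its predicted ("calibrated") value in the disease model.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(3)
true_intake = rng.normal(50, 10, 5000)
ffq = true_intake + rng.normal(0, 12, 5000)        # error-prone main measure
df = pd.DataFrame({"ffq": ffq})

# Calibration substudy (5% of the cohort) with a less error-prone reference measure.
sub = df.sample(frac=0.05, random_state=0).copy()
sub["reference"] = true_intake[sub.index] + rng.normal(0, 4, len(sub))

calib = sm.OLS(sub["reference"], sm.add_constant(sub["ffq"])).fit()
df["calibrated_intake"] = calib.params["const"] + calib.params["ffq"] * df["ffq"]
# The calibrated intake (continuously or in categories) would then be entered
# into the Cox or logistic model; bootstrap over both steps for valid CIs.
```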

Results

13a Participants

Report numbers of individuals at each stage of study—eg numbers potentially eligible, examined for eligibility, confirmed eligible, included in the study, completing follow-up, and analysed. Give information separately for exposed and unexposed groups if applicable.

Detailed information on the process of recruiting study participants is important for several reasons. Those included in a study often differ in relevant ways from the target population to which results are applied. This may result in estimates of prevalence or incidence that do not reflect the experience of the target population. For example, people who agreed to participate in a postal survey of sexual behaviour attended church less often, had less conservative sexual attitudes and earlier age at first sexual intercourse, and were more likely to smoke cigarettes and drink alcohol than people who refused [130]. These differences suggest that postal surveys may overestimate sexual liberalism and activity in the population. Such response bias (see Box 3) can distort exposure-disease associations if associations differ between those eligible for the study and those included in the study. As another example, the association between young maternal age and leukaemia in offspring, which has been observed in some case-control studies [131,132], was explained by differential participation of young women in case and control groups. Young women with healthy children were less likely to participate than those with unhealthy children [133]. Although low participation does not necessarily compromise the validity of a study, transparent information on participation and reasons for non-participation is essential. Also, as there are no universally agreed definitions for participation, response or follow-up rates, readers need to understand how authors calculated such proportions [134].

Ideally, investigators should give an account of the numbers of individuals considered at each stage of recruiting study participants, from the choice of a target population to the inclusion of participants' data in the analysis. Depending on the type of study, this may include the number of individuals considered to be potentially eligible, the number assessed for eligibility, the number found to be eligible, the number included in the study, the number examined, the number followed up and the number included in the analysis. Information on different sampling units may be required, if sampling of study participants is carried out in two or more stages as in the example below (multistage sampling). In case-control studies, we advise that authors describe the flow of participants separately for case and control groups [135]. Controls can sometimes be selected from several sources, including, for example, hospitalised patients and community dwellers. In this case, we recommend a separate account of the numbers of participants for each type of control group. Olson and colleagues proposed useful reporting guidelines for controls recruited through random-digit dialling and other methods [136].

A recent survey of epidemiological studies published in 10 general epidemiology, public health and medical journals found that some information regarding participation was provided in 47 of 107 case-control studies (44%), 49 of 154 cohort studies (32%), and 51 of 86 cross-sectional studies (59%) [137]. Incomplete or absent reporting of participation and non-participation in epidemiological studies was also documented in two other surveys of the literature [4,5]. Finally, there is evidence that participation in epidemiological studies may have declined in recent decades [137,138], which underscores the need for transparent reporting.

Example/s:

Of the 105 freestanding bars and taverns sampled, 13 establishments were no longer in business and 9 were located in restaurants, leaving 83 eligible businesses. In 22 cases, the owner could not be reached by telephone despite 6 or more attempts. The owners of 36 bars declined study participation. (...) The 25 participating bars and taverns employed 124 bartenders, with 67 bartenders working at least 1 weekly daytime shift. Fifty-four of the daytime bartenders (81%) completed baseline interviews and spirometry; 53 of these subjects (98%) completed follow-up.

13b Non-participation

Give reasons for non-participation at each stage.

Explaining the reasons why people no longer participated in a study or why they were excluded from statistical analyses helps readers judge whether the study population was representative of the target population and whether bias was possibly introduced. For example, in a cross-sectional health survey, non-participation due to reasons unlikely to be related to health status (for example, the letter of invitation was not delivered because of an incorrect address) will affect the precision of estimates but will probably not introduce bias. Conversely, if many individuals opt out of the survey because of illness, or perceived good health, results may underestimate or overestimate the prevalence of ill health in the population.

Example/s:

The main reasons for non-participation were the participant was too ill or had died before interview (cases 30%, controls < 1%), nonresponse (cases 2%, controls 21%), refusal (cases 10%, controls 29%), and other reasons (refusal by consultant or general practitioner, non-English speaking, mental impairment) (cases 7%, controls 5%).

13c Participant journey

Consider the use of a flow diagram.

An informative and well-structured flow diagram can readily and transparently convey information that might otherwise require a lengthy description [142], as in the example below. The diagram may usefully include the main results, such as the number of events for the primary outcome. While we recommend the use of a flow diagram, particularly for complex observational studies, we do not propose a specific format for the diagram.

Example/s:

Flow diagram from Hay et al. https://doi.org/10.1371/journal.pmed.0040297.g001.

nut-13

Report the number of individuals excluded on the basis of missing, incomplete, or implausible dietary and nutritional data.

Missing and implausible data are omnipresent in dietary assessments and may introduce bias or attenuate associations (see also Nut-9 and -17). Individuals with biologically extreme values are commonly excluded. To enable the reader to better evaluate the study, information with regard to the final study power and any bias is needed. It is helpful to describe the number and characteristics of individuals excluded due to missing or incomplete dietary data. Also describe any sensitivity analyses performed to explore the robustness of study findings.

Example/s:

We excluded participants with cancer, implausible energy intakes (reported as <600 or >3600 kcal/d for women and <800 or >4200 kcal/d for men; 1 kcal = 4.18 kJ), or missing alcohol intake at baseline.

14a Descriptive data

Give characteristics of study participants (eg demographic, clinical, social) and information on exposures and potential confounders. Give information separately for exposed and unexposed groups if applicable.

Readers need descriptions of study participants and their exposures to judge the generalisability of the findings. Information about potential confounders, including whether and how they were measured, influences judgments about study validity. We advise authors to summarize continuous variables for each study group by giving the mean and standard deviation, or when the data have an asymmetrical distribution, as is often the case, the median and percentile range (e.g., 25th and 75th percentiles). Variables that make up a small number of ordered categories (such as stages of disease I to IV) should not be presented as continuous variables; it is preferable to give numbers and proportions for each category (see also Box 4). In studies that compare groups, the descriptive characteristics and numbers should be given by group, as in the example below.

Inferential measures such as standard errors and confidence intervals should not be used to describe the variability of characteristics, and significance tests should be avoided in descriptive tables. Also, P values are not an appropriate criterion for selecting which confounders to adjust for in analysis; even small differences in a confounder that has a strong effect on the outcome can be important.

In cohort studies, it may be useful to document how an exposure relates to other characteristics and potential confounders. Authors could present this information in a table with columns for participants in two or more exposure categories, which makes it possible to judge the differences in confounders between these categories.

In case-control studies potential confounders cannot be judged by comparing cases and controls. Control persons represent the source population and will usually be different from the cases in many respects. For example, in a study of oral contraceptives and myocardial infarction, a sample of young women with infarction more often had risk factors for that disease, such as high serum cholesterol, smoking and a positive family history, than the control group [146]. This does not influence the assessment of the effect of oral contraceptives, as long as the prescription of oral contraceptives was not guided by the presence of these risk factors—e.g., because the risk factors were only established after the event (see also Box 5). In case-control studies the equivalent of comparing exposed and non-exposed for the presence of potential confounders (as is done in cohorts) can be achieved by exploring the source population of the cases: if the control group is large enough and represents the source population, exposed and unexposed controls can be compared for potential confounders [121,147].

Example/s:

Characteristics of the Study Base at Enrolment, Castellana G (Italy), 1985–1986

https://doi.org/10.1371/journal.pmed.0040297.t002.

14b Missing data

Indicate number of participants with missing data for each variable of interest.

As missing data may bias or affect generalisability of results, authors should tell readers amounts of missing data for exposures, potential confounders, and other important characteristics of patients (see also item 12c and Box 6). In a cohort study, authors should report the extent of loss to follow-up (with reasons), since incomplete follow-up may bias findings (see also items 12d and 13). We advise authors to use their tables and figures to enumerate amounts of missing data.

Example/s:

Table. Symptom End Points Used in Survival Analysis

https://doi.org/10.1371/journal.pmed.0040297.t003.

14c Follow-up time

Cohort study: Summarise follow-up time (eg, average and total amount).

Readers need to know the duration and extent of follow-up for the available outcome data. Authors can present a summary of the average follow-up with either the mean or median follow-up time or both. The mean allows a reader to calculate the total number of person-years by multiplying it with the number of study participants. Authors also may present minimum and maximum times or percentiles of the distribution to show readers the spread of follow-up times. They may report total person-years of follow-up or some indication of the proportion of potential data that was captured [148]. All such information may be presented separately for participants in two or more exposure categories. Almost half of 132 articles in cancer journals (mostly cohort studies) did not give any summary of length of follow-up.

Example/s:

During the 4366 person-years of follow-up (median 5.4, maximum 8.3 years), 265 subjects were diagnosed as having dementia, including 202 with Alzheimer's disease.

nut-14

Give the distribution of participant characteristics across the exposure variables, if applicable; specify if food consumption for the total population or consumers only was used to obtain results.

Confounding is a major concern in nutritional epidemiology, because most dietary exposures are interdependent, and many socioeconomic and lifestyle factors covary with dietary exposures (see also Nut-8.5). Reporting participant characteristics across the dietary exposure variables will enable the reader to assess the potential impact of confounders.

15. Outcome data

Cohort study: report numbers of outcome events or summary measures over time. Case-control study: report numbers in each exposure category, or summary measures of exposure. Cross-sectional study: report numbers of outcome events or summary measures.

Before addressing the possible association between exposures (risk factors) and outcomes, authors should report relevant descriptive data. It may be possible and meaningful to present measures of association in the same table that presents the descriptive data (see item 14a). In a cohort study with events as outcomes, report the numbers of events for each outcome of interest. Consider reporting the event rate per person-year of follow-up. If the risk of an event changes over follow-up time, present the numbers and rates of events in appropriate intervals of follow-up or as a Kaplan-Meier life table or plot. It might be preferable to show plots as cumulative incidence that go up from 0% rather than down from 100%, especially if the event rate is lower than, say, 30% [153]. Consider presenting such information separately for participants in different exposure categories of interest. If a cohort study is investigating other time-related outcomes (e.g., quantitative disease markers such as blood pressure), present appropriate summary measures (e.g., means and standard deviations) over time, perhaps in a table or figure.

For cross-sectional studies, we recommend presenting the same type of information on prevalent outcome events or summary measures. For case-control studies, the focus will be on reporting exposures separately for cases and controls as frequencies or quantitative summaries [154]. For all designs, it may be helpful also to tabulate continuous outcomes or exposures in categories, even if the data are not analysed as such.
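
For cohort outcomes, the cumulative-incidence presentation suggested above can be derived directly from a Kaplan-Meier estimate; the sketch below uses the Python lifelines package and synthetic data purely for illustration.

```python
# Sketch: presenting outcome data as cumulative incidence rising from 0%,
# via the Kaplan-Meier estimator. Data are synthetic and illustrative.
import numpy as np
from lifelines import KaplanMeierFitter

rng = np.random.default_rng(4)
durations = rng.exponential(12, 500).clip(max=10)    # years of follow-up
events = (durations < 10) & (rng.random(500) < 0.7)  # True = outcome, False = censored

kmf = KaplanMeierFitter()
kmf.fit(durations, event_observed=events)
cumulative_incidence = 1 - kmf.survival_function_    # goes up from 0
print(cumulative_incidence.tail())
```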

Example/s:

Table. Rates of HIV-1 Seroconversion by Selected Sociodemographic Variables: 1990–1993 https://doi.org/10.1371/journal.pmed.0040297.t004.

16a Main results

Give unadjusted estimates and, if applicable, confounder-adjusted estimates and their precision (eg, 95% confidence interval). Make clear which confounders were adjusted for and why they were included.

In many situations, authors may present the results of unadjusted or minimally adjusted analyses and those from fully adjusted analyses. We advise giving the unadjusted analyses together with the main data, for example the number of cases and controls that were exposed or not. This allows the reader to understand the data behind the measures of association (see also item 15). For adjusted analyses, report the number of persons in the analysis, as this number may differ because of missing values in covariates (see also item 12c). Estimates should be given with confidence intervals.

Readers can compare unadjusted measures of association with those adjusted for potential confounders and judge by how much, and in what direction, they changed. Readers may think that ‘adjusted' results equal the causal part of the measure of association, but adjusted results are not necessarily free of random sampling error, selection bias, information bias, or residual confounding (see Box 5). Thus, great care should be exercised when interpreting adjusted results, as the validity of results often depends crucially on complete knowledge of important confounders, their precise measurement, and appropriate specification in the statistical model (see also item 20) [157,158].

Authors should explain all potential confounders considered, and the criteria for excluding or including variables in statistical models. Decisions about excluding or including variables should be guided by knowledge, or explicit assumptions, on causal relations. Inappropriate decisions may introduce bias, for example by including variables that are in the causal pathway between exposure and disease (unless the aim is to assess how much of the effect is carried by the intermediary variable). If the decision to include a variable in the model was based on the change in the estimate, it is important to report what change was considered sufficiently important to justify its inclusion. If a ‘backward deletion' or ‘forward inclusion' strategy was used to select confounders, explain that process and give the significance level for rejecting the null hypothesis of no confounding. Of note, we and others do not advise selecting confounders based solely on statistical significance testing [147,159,160].

Recent studies of the quality of reporting of epidemiological studies found that confidence intervals were reported in most articles [4]. However, few authors explained their choice of confounding variables [4,5].

Example/s:

Table. Relative Rates of Rehospitalisation by Treatment in Patients in Community Care after First Hospitalisation due to Schizophrenia and Schizoaffective Disorder https://doi.org/10.1371/journal.pmed.0040297.t008.

16b Category boundaries

Report category boundaries when continuous variables were categorized.

Categorizing continuous data has several important implications for analysis (see Box 4) and also affects the presentation of results. In tables, outcomes should be given for each exposure category, for example as counts of persons at risk, person-time at risk, if relevant separately for each group (e.g., cases and controls). Details of the categories used may aid comparison of studies and meta-analysis. If data were grouped using conventional cut-points, such as body mass index thresholds [162], group boundaries (i.e., range of values) can be derived easily, except for the highest and lowest categories. If quantile-derived categories are used, the category boundaries cannot be inferred from the data. As a minimum, authors should report the category boundaries; it is helpful also to report the range of the data and the mean or median values within categories.

Example/s:

Table. Polychlorinated Biphenyls in Cord Serum

https://doi.org/10.1371/journal.pmed.0040297.t005.

16c Relative and absolute risks

If relevant, consider translating estimates of relative risk into absolute risk for a meaningful time period.

The results from studies examining the association between an exposure and a disease are commonly reported in relative terms, as ratios of risks, rates or odds (see Box 8). Relative measures capture the strength of the association between an exposure and disease. If the relative risk is a long way from 1 it is less likely that the association is due to confounding [164,165]. Relative effects or associations tend to be more consistent across studies and populations than absolute measures, but what often tends to be the case may be irrelevant in a particular instance. For example, similar relative risks were obtained for the classic cardiovascular risk factors for men living in Northern Ireland, France, the USA and Germany, despite the fact that the underlying risk of coronary heart disease varies substantially between these countries [166,167]. In contrast, in a study of hypertension as a risk factor for cardiovascular disease mortality, the data were more compatible with a constant rate difference than with a constant rate ratio [168].

Widely used statistical models, including logistic [169] and proportional hazards (Cox) regression [170] are based on ratio measures. In these models, only departures from constancy of ratio effect measures are easily discerned. Nevertheless, measures which assess departures from additivity of risk differences, such as the Relative Excess Risk from Interaction (RERI, see item 12b and Box 8), can be estimated in models based on ratio measures.

In many circumstances, the absolute risk associated with an exposure is of greater interest than the relative risk. For example, if the focus is on adverse effects of a drug, one will want to know the number of additional cases per unit time of use (e.g., days, weeks, or years). The example gives the additional number of breast cancer cases per 1000 women who used hormone-replacement therapy for 10 years [163]. Measures such as the attributable risk or population attributable fraction may be useful to gauge how much disease can be prevented if the exposure is eliminated. They should preferably be presented together with a measure of statistical uncertainty (e.g., confidence intervals as in the example). Authors should be aware of the strong assumptions made in this context, including a causal relationship between a risk factor and disease (also see Box 7) [171]. Because of the semantic ambiguity and complexities involved, authors should report in detail what methods were used to calculate attributable risks, ideally giving the formulae used [172].
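
A back-of-the-envelope translation of a relative risk into absolute terms requires only an assumed baseline risk over a meaningful time period; the sketch below uses illustrative numbers, not the estimates from the hormone-replacement example.

```python
# Sketch: translating a relative risk into an absolute excess risk over a
# meaningful time period. All numbers are illustrative assumptions.
baseline_risk_10y = 0.045          # assumed 10-year risk in the unexposed
relative_risk = 1.24               # estimated RR for the exposure

excess_per_1000 = baseline_risk_10y * (relative_risk - 1) * 1000
print(f"About {excess_per_1000:.0f} additional cases per 1000 exposed over 10 years")
# 0.045 * 0.24 * 1000 = 10.8, i.e., roughly 11 additional cases per 1000.
```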

A recent survey of abstracts of 222 articles published in leading medical journals found that in 62% of abstracts of randomised trials including a ratio measure absolute risks were given, but only in 21% of abstracts of cohort studies [173]. A free text search of Medline 1966 to 1997 showed that 619 items mentioned attributable risks in the title or abstract, compared to 18,955 using relative risk or odds ratio, for a ratio of 1 to 31 [174].

Example/s:

10 years' use of HRT [hormone replacement therapy] is estimated to result in five (95% CI 3–7) additional breast cancers per 1000 users of oestrogen-only preparations and 19 (15–23) additional cancers per 1000 users of oestrogen-progestagen combinations.

nut-16

Specify if nutrient intakes are reported with or without the inclusion of dietary supplement intake, if applicable.

The total intake of nutrients could be underestimated if supplement use is not accounted for. It may be helpful to the reader if nutrient intakes are presented both including and excluding the contribution from supplements (see also Nut-8.1). However, depending on the study aim and the data available, it could be more suitable to present supplement use as a separate exposure, or as a covariate. Because both the chemical form and the dose of nutrients found in supplements often differ compared with nutrients found in foods, dietary supplements may have a different effect than food-derived nutrient exposure. In addition, when only less-detailed dietary supplement data are available (e.g., current, ever, or never use), it may not be possible to combine dietary and supplement data. Where differential absorption of food and supplemental sources is relevant, additional care should be taken to describe the methods of data collection and analysis. Any assumptions made should also be stated.

Example/s:

There was no overall association between intake of vitamin C and the risk of developing hypertension. Comparing individuals in NHS-I whose daily consumption of vitamin C was ≥1500 mg with those whose intake was <250 mg in the other 2 cohorts, the RRs (95% CIs) were 1.02 (0.91, 1.14) in NHS-II and 1.06 (0.97, 1.15) in the Health Professionals Follow-Up Study. In a secondary analysis, we excluded women and men who took supplemental vitamin C (including multivitamin users) and analyzed the association between dietary intake of vitamin C and incident hypertension. Comparing individuals whose daily dietary consumption of vitamin C was ≥250 mg with those who consumed <100 mg/d, the adjusted RRs (95% CIs) were 1.05 (0.97, 1.14) in NHS-I, 1.06 (0.92, 1.23) in NHS-II, and 0.99 (0.84, 1.17) in the Health Professionals Follow-Up Study.

17. Other analyses

Report other analyses done—eg analyses of subgroups and interactions, and sensitivity analyses.

In addition to the main analysis other analyses are often done in observational studies. They may address specific subgroups, the potential interaction between risk factors, the calculation of attributable risks, or use alternative definitions of study variables in sensitivity analyses.

There is debate about the dangers associated with subgroup analyses, and multiplicity of analyses in general [4,104]. In our opinion, there is too great a tendency to look for evidence of subgroup-specific associations, or effect-measure modification, when overall results appear to suggest little or no effect. On the other hand, there is value in exploring whether an overall association appears consistent across several, preferably pre-specified subgroups especially when a study is large enough to have sufficient data in each subgroup. A second area of debate is about interesting subgroups that arose during the data analysis. They might be important findings, but might also arise by chance. Some argue that it is neither possible nor necessary to inform the reader about all subgroup analyses done as future analyses of other data will tell to what extent the early exciting findings stand the test of time [9]. We advise authors to report which analyses were planned, and which were not (see also items 4, 12b and 20). This will allow readers to judge the implications of multiplicity, taking into account the study's position on the continuum from discovery to verification or refutation.

A third area of debate is how joint effects and interactions between risk factors should be evaluated: on additive or multiplicative scales, or should the scale be determined by the statistical model that fits best (see also item 12b and Box 8)? A sensible approach is to report the separate effect of each exposure as well as the joint effect—if possible in a table, as in the first example below [183], or in the study by Martinelli et al. [185]. Such a table gives the reader sufficient information to evaluate additive as well as multiplicative interaction (how these calculations are done is shown in Box 8). Confidence intervals for separate and joint effects may help the reader to judge the strength of the data. In addition, confidence intervals around measures of interaction, such as the Relative Excess Risk from Interaction (RERI), relate to tests of interaction or homogeneity tests. One recurrent problem is that authors use comparisons of P-values across subgroups, which lead to erroneous claims about an effect modifier. For instance, a statistically significant association in one category (e.g., men), but not in the other (e.g., women) does not in itself provide evidence of effect modification. Similarly, the confidence intervals for each point estimate are sometimes inappropriately used to infer that there is no interaction when intervals overlap. A more valid inference is achieved by directly evaluating whether the magnitude of an association differs across subgroups.

Sensitivity analyses are helpful to investigate the influence of choices made in the statistical analysis, or to investigate the robustness of the findings to missing data or possible biases (see also item 12e). Judgement is needed regarding the level of reporting of such analyses. If many sensitivity analyses were performed, it may be impractical to present detailed findings for them all. It may sometimes be sufficient to report that sensitivity analyses were carried out and that they were consistent with the main results presented. Detailed presentation is more appropriate if the issue investigated is of major concern, or if effect estimates vary considerably [59,186].

Pocock and colleagues found that 43 out of 73 articles reporting observational studies contained subgroup analyses. The majority claimed differences across groups but only eight articles reported a formal evaluation of interaction (see item 12b) [4].

Example/s:

Table. Analysis of Oral Contraceptive Use, Presence of Factor V Leiden Allele, and Risk for Venous Thromboembolism

https://doi.org/10.1371/journal.pmed.0040297.t009

Table. Sensitivity of the Rate Ratio for Cardiovascular Outcome to an Unmeasured Confounder https://doi.org/10.1371/journal.pmed.0040297.t010.

nut-17

Report any sensitivity analysis (e.g., exclusion of misreporters or outliers) and data imputation, if applicable.

Misreporting of dietary intake is common and a major challenge to nutritional epidemiology, especially underreporting, which is likely related to personal characteristics and may be associated with health outcomes. Depending on the study design and available data, researchers may select different approaches to examine the robustness of study findings and thus enhance the understanding of the impact of measurement errors. Individuals may have changed their diets before the start of the study due to ill health (e.g., diagnosed with diabetes or hyperlipidemia) or other reasons. In such cases, the reported diet may not be relevant for the outcome assessed, and therefore it may be sensible to repeat analysis excluding subgroups of the study sample.

It is often helpful to compare the reported energy intake with the TEE calculated from estimates of the resting energy expenditure and the PAL. This will enable readers to evaluate whether under- or overestimation of dietary energy is present. Although studies often exclude individuals with high or low reported energy intakes, this may not always be appropriate because some true intakes will be excluded. Alternative solutions could include a separate assessment of these groups (see Text Box 3). If individuals with extreme values (i.e., clearly not compatible with biological function) are excluded, the allowable range for those included should be stated.
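
A sketch of how such a comparison might be implemented is given below, assuming a predicted resting energy expenditure is already available for each participant. The PAL value of 1.6 and the 0.7–1.3 plausibility window are illustrative assumptions only; in practice, study-specific cut-offs (e.g., Goldberg-type limits that account for the number of assessment days and within-person variation) should be chosen and reported.

# Sketch: compare reported energy intake (EI) with estimated total energy
# expenditure (TEE = predicted REE x PAL). The PAL value and the 0.7-1.3
# window are illustrative assumptions, not recommended cut-offs.

def ei_tee_ratio(reported_ei_kcal, predicted_ree_kcal, pal=1.6):
    return reported_ei_kcal / (predicted_ree_kcal * pal)

def classify_reporter(ratio, lower=0.7, upper=1.3):
    if ratio < lower:
        return "possible under-reporter"
    if ratio > upper:
        return "possible over-reporter"
    return "plausible reporter"

ratio = ei_tee_ratio(reported_ei_kcal=1400, predicted_ree_kcal=1500)  # hypothetical participant
print(round(ratio, 2), classify_reporter(ratio))  # -> 0.58 possible under-reporter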

Another concern is missing values in FFQs, especially when dietary information is combined in nutrient intake calculations or in indexes. Some missing values in an FFQ may represent random mistakes, whereas others reflect nonconsumption. To understand the procedure and enable replication of the study, details of any imputation and the statistical handling need to be provided.
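
The sketch below shows one way such handling could be documented so that it can be reproduced, assuming the FFQ responses are held in a pandas data frame of item frequencies. Both rules applied here (excluding questionnaires with more than half of the items blank and treating remaining blanks as nonconsumption, i.e., zero intake) are illustrative assumptions, not recommendations from this guidance.

import pandas as pd

# Sketch: reproducible handling of missing FFQ items. Rows are participants,
# columns are food-item frequencies (servings/day). The 50%-missing exclusion
# threshold and the zero-imputation rule are illustrative assumptions.

def handle_missing_ffq(ffq: pd.DataFrame, max_missing_frac: float = 0.5):
    missing_frac = ffq.isna().mean(axis=1)
    kept = ffq.loc[missing_frac <= max_missing_frac].copy()
    n_excluded = int((missing_frac > max_missing_frac).sum())
    kept = kept.fillna(0)  # remaining blanks interpreted as nonconsumption
    return kept, n_excluded

ffq = pd.DataFrame({"milk": [1.0, None, 0.5], "eggs": [None, None, 0.3]})  # toy data
clean, n_excluded = handle_missing_ffq(ffq)
print(f"{n_excluded} questionnaire(s) excluded; {len(clean)} retained")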

Example/s:

Individuals with dietary change in the past are suspected to have unstable food habits. Dietary change in the past (yes or no) was derived from the questionnaire item: “Have you substantially changed your eating habits because of illness or for some other reason?” All analyses were performed in 1) all individuals, 2) individuals reporting adequate energy intake (i.e., nonadequate reporters were excluded), and 3) individuals reporting stable dietary habits (i.e., individuals reporting dietary change were excluded).

Discussion

18. Key results

Summarise key results with reference to study objectives.

It is good practice to begin the discussion with a short summary of the main findings of the study. The short summary reminds readers of the main findings and may help them assess whether the subsequent interpretation and implications offered by the authors are supported by the findings.

Example/s:

We hypothesized that ethnic minority status would be associated with higher levels of cardiovascular disease (CVD) risk factors, but that the associations would be explained substantially by socioeconomic status (SES). Our hypothesis was not confirmed. After adjustment for age and SES, highly significant differences in body mass index, blood pressure, diabetes, and physical inactivity remained between white women and both black and Mexican American women. In addition, we found large differences in CVD risk factors by SES, a finding that illustrates the high-risk status of both ethnic minority women as well as white women with low SES.

19. Limitations

Discuss limitations of the study, taking into account sources of potential bias or imprecision. Discuss both direction and magnitude of any potential bias.

The identification and discussion of the limitations of a study are an essential part of scientific reporting. It is important not only to identify the sources of bias and confounding that could have affected results, but also to discuss the relative importance of different biases, including the likely direction and magnitude of any potential bias (see also item 9 and Box 3).

Authors should also discuss any imprecision of the results. Imprecision may arise in connection with several aspects of a study, including the study size (item 10) and the measurement of exposures, confounders and outcomes (item 8). The inability to precisely measure true values of an exposure tends to result in bias towards unity: the less precisely a risk factor is measured, the greater the bias. This effect has been described as ‘attenuation’ [201,202], or more recently as ‘regression dilution bias’ [203]. However, when correlated risk factors are measured with different degrees of imprecision, the adjusted relative risk associated with them can be biased towards or away from unity [204–206].
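
For a single continuous exposure subject to non-differential classical measurement error, this attenuation is often written (a standard textbook form, given here for orientation rather than taken from the articles cited above) as

\[ \hat{\beta}_{\mathrm{obs}} \approx \lambda \, \beta_{\mathrm{true}}, \qquad \lambda = \frac{\sigma^2_X}{\sigma^2_X + \sigma^2_E}, \]

where \(\sigma^2_X\) is the between-person variance of the true exposure, \(\sigma^2_E\) the error variance, and \(\lambda \le 1\) the attenuation (reliability) factor, so the observed association is pulled towards the null.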

When discussing limitations, authors may compare the study being presented with other studies in the literature in terms of validity, generalisability and precision. In this approach, each study can be viewed as a contribution to the literature, not as a stand-alone basis for inference and action [207]. Surprisingly, the discussion of important limitations of a study is sometimes omitted from published reports. A survey of authors who had published original research articles in The Lancet found that important weaknesses of the study were reported by the investigators in the survey questionnaires, but not in the published article.

Example/s:

Since the prevalence of counseling increases with increasing levels of obesity, our estimates may overestimate the true prevalence. Telephone surveys also may overestimate the true prevalence of counseling. Although persons without telephones have similar levels of overweight as persons with telephones, persons without telephones tend to be less educated, a factor associated with lower levels of counseling in our study. Also of concern is the potential bias caused by those who refused to participate as well as those who refused to respond to questions about weight. Furthermore, because data were collected cross-sectionally, we cannot infer that counseling preceded a patient's attempt to lose weight.

nut-19

Describe the main limitations of the data sources and assessment methods used and implications for the interpretation of the findings.

Given the complexity of nutritional epidemiology, the discussion of study limitations is an essential part of the scientific reporting. Assumptions with regard to the accuracy of the reported dietary intake should be handled with care. Potential sources of biases and, if relevant, how these were handled, as well as degrees of error related to the dietary assessment need to be reported and thoroughly discussed when interpreting the results. To observe different health outcomes in exposed compared with nonexposed study participants, the dietary exposure gradient needs to be large enough.

Example/s:

However, the dietary history method used has limitations that may have caused some misclassification of subjects. These tend to diminish the associations observed between exposure and outcome. The result of the dietary history interview is always a subjective assessment of the respondent's own dietary habits. A period of 1 y is a lengthy time to recall. Food models were used to diminish errors in recall, and open-ended questions enabled respondents to be more specific in their answers. To minimize possible bias, trained nutrition professionals used a structured questionnaire. In general, the short-term repeatability of the dietary history method was relatively good. However, rather poor repeatability for glucose and fructose hinders the interpretation of the results and the possibility of chance findings increases. The poorer long-term consistency can be partly explained by changes in Finnish dietary habits. Changes in food consumption during follow-up tend to weaken the associations observed. For this reason, follow-up in this study was limited to 12 y.

20. Interpretation

Give a cautious overall interpretation considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence.

The heart of the discussion section is the interpretation of a study's results. Over-interpretation is common and human: even when we try hard to give an objective assessment, reviewers often rightly point out that we went too far in some respects. When interpreting results, authors should consider the nature of the study on the discovery to verification continuum and potential sources of bias, including loss to follow-up and non-participation (see also items 9, 12 and 19). Due consideration should be given to confounding (item 16a), the results of relevant sensitivity analyses, and to the issue of multiplicity and subgroup analyses (item 17). Authors should also consider residual confounding due to unmeasured variables or imprecise measurement of confounders. For example, socioeconomic status (SES) is associated with many health outcomes and often differs between groups being compared. Variables used to measure SES (income, education, or occupation) are surrogates for other undefined and unmeasured exposures, and the true confounder will by definition be measured with error [208]. Authors should address the real range of uncertainty in estimates, which is larger than the statistical uncertainty reflected in confidence intervals. The latter do not take into account other uncertainties that arise from a study's design, implementation, and methods of measurement [209].

To guide thinking and conclusions about causality, some may find criteria proposed by Bradford Hill in 1965 helpful [164]. How strong is the association with the exposure? Did it precede the onset of disease? Is the association consistently observed in different studies and settings? Is there supporting evidence from experimental studies, including laboratory and animal studies? How specific is the exposure's putative effect, and is there a dose-response relationship? Is the association biologically plausible? These criteria should not, however, be applied mechanically. For example, some have argued that relative risks below 2 or 3 should be ignored [210,211]. This is a reversal of the point by Cornfield et al. about the strength of large relative risks (see item 12b) [127]. Although a causal effect is more likely with a relative risk of 9, it does not follow that one below 3 is necessarily spurious. For instance, the small increase in the risk of childhood leukaemia after intrauterine irradiation is credible because it concerns an adverse effect of a medical procedure for which no alternative explanations are obvious [212]. Moreover, the carcinogenic effects of radiation are well established. The doubling in the risk of ovarian cancer associated with eating 2 to 4 eggs per week is not immediately credible, since dietary habits are associated with a large number of lifestyle factors as well as SES [213]. In contrast, the credibility of much debated epidemiologic findings of a difference in thrombosis risk between different types of oral contraceptives was greatly enhanced by the differences in coagulation found in a randomised cross-over trial [214]. A discussion of the existing external evidence, from different types of studies, should always be included, but may be particularly important for studies reporting small increases in risk. Further, authors should put their results in context with similar studies and explain how the new study affects the existing body of evidence, ideally by referring to a systematic review.

Example/s:

Any explanation for an association between death from myocardial infarction and use of second generation oral contraceptives must be conjectural. There is no published evidence to suggest a direct biologic mechanism, and there are no other epidemiologic studies with relevant results. (…) The increase in absolute risk is very small and probably applies predominantly to smokers. Due to the lack of corroborative evidence, and because the analysis is based on relatively small numbers, more evidence on the subject is needed. We would not recommend any change in prescribing practice on the strength of these results.

nut-20

Report the nutritional relevance of the findings, given the complexity of diet or nutrition as an exposure.

The nutritional relevance of the findings depends on a number of factors. The quality of the dietary data will determine the ability to detect significant associations. Small dietary differences without any biological significance could, in large cohort studies, result in statistically significant associations with disease outcomes. Reporting an effect size of intake differences may facilitate the understanding of the practical and theoretical utility of study results. Translating an increased risk into a reduction in survival expressed in months may also make it easier to judge the relevance of findings. The inherent complexities of diet as an environmental exposure pose additional challenges to the interpretation of study findings, which requires careful consideration and nuanced and balanced conclusions.

Nutrients and other bioactive substances are generally not consumed in isolation. Food contains various bioactive substances, and each meal typically consists of a combination of several foods. It might be difficult to distinguish the “true” effect of a single nutrient, because nutrients interact with each other, with other compounds, and with the surrounding food matrix in complex ways. When intercorrelated nutrients (e.g., different FAs) are examined together, there is a risk of attenuated associations; on the other hand, if not analyzed together, the separate effects of intercorrelated nutrients may be impossible to detect. The dietary concentration of a single nutrient may also be too low to detect any health effect. Moreover, dietary habits cluster with other health behaviors. Lifestyle factors other than diet and environmental factors, as well as the physiologic and disease status of study participants, will also influence the impact of dietary exposures. Indicate whether conclusions were based on analyses of dietary intakes alone or whether intakes through diet were combined with dietary supplements (see Nut-16).

The variation in food habits across populations, and across subgroups within populations, further complicates the interpretation and contributes to inconsistencies between studies. For example, meat and meat products are major sources of saturated fat in the United States, whereas dairy products dominate in the Nordic countries, resulting in diverging dietary covariates and potential confounders (i.e., dietary components related to fat intake will vary). Similarly, dietary carbohydrates are largely contributed by fruit and vegetables in Southern European countries, whereas sugary foods, cereals, and potatoes are major contributors in Northern Europe. Thus, the food habits in the population under study should be considered when discussing the generalizability of results, and the consistency of diet-disease associations needs to be examined in different populations.

Example/s:

A dilemma in the present study is the difference in group size. In order to avoid misinterpretations of the results and in order to deepen our understanding when analyzing data, estimations of effect sizes were calculated. Having a large sample increases the risk of overvaluing observed significant differences where the importance of the differences could be quite trivial. This occurred, for example, when we compared the differences of reported intake between the 2 nonceliac referent groups (data not shown) and found many significant differences; however, the estimated effect size revealed that the relevance of these differences was mostly small. On the other hand, a calculated large effect size on nonsignificant differences in a small sample, such as the changes in the previously diagnosed celiac disease group between baseline and follow-up, suggests a need for further research with a larger sample size.

21. Generalisability

Discuss the generalisability (external validity) of the study results.

Generalisability, also called external validity or applicability, is the extent to which the results of a study can be applied to other circumstances [216]. There is no external validity per se; the term is meaningful only with regard to clearly specified conditions [217]. Can results be applied to an individual, groups or populations that differ from those enrolled in the study with regard to age, sex, ethnicity, severity of disease, and co-morbid conditions? Are the nature and level of exposures comparable, and the definitions of outcomes relevant to another setting or population? Are data that were collected in longitudinal studies many years ago still relevant today? Are results from health services research in one country applicable to health systems in other countries?

The question of whether the results of a study have external validity is often a matter of judgment that depends on the study setting, the characteristics of the participants, the exposures examined, and the outcomes assessed. Thus, it is crucial that authors provide readers with adequate information about the setting and locations, eligibility criteria, the exposures and how they were measured, the definition of outcomes, and the period of recruitment and follow-up. The degree of non-participation and the proportion of unexposed participants in whom the outcome develops are also relevant. Knowledge of the absolute risk and prevalence of the exposure, which will often vary across populations, is helpful when applying results to other settings and populations (see Box 7).

Example/s:

How applicable are our estimates to other HIV-1-infected patients? This is an important question because the accuracy of prognostic models tends to be lower when applied to data other than those used to develop them. We addressed this issue by penalising model complexity, and by choosing models that generalised best to cohorts omitted from the estimation procedure. Our database included patients from many countries from Europe and North America, who were treated in different settings. The range of patients was broad: men and women, from teenagers to elderly people were included, and the major exposure categories were well represented. The severity of immunodeficiency at baseline ranged from not measurable to very severe, and viral load from undetectable to extremely high.

Other Information

22. Funding

Give the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based.

Some journals require authors to disclose the presence or absence of financial and other conflicts of interest [100,218]. Several investigations show strong associations between the source of funding and the conclusions of research articles [219–222]. The conclusions in randomised trials recommended the experimental drug as the drug of choice much more often (odds ratio 5.3) if the trial was funded by for-profit organisations, even after adjustment for the effect size [223]. Other studies document the influence of the tobacco and telecommunication industries on the research they funded [224–227]. There are also examples of undue influence when the sponsor is governmental or a non-profit organisation.

Authors or funders may have conflicts of interest that influence any of the following: the design of the study [228]; choice of exposures [228,229], outcomes [230], statistical methods [231], and selective publication of outcomes [230] and studies [232]. Consequently, the role of the funders should be described in detail: in what part of the study they took direct responsibility (e.g., design, data collection, analysis, drafting of manuscript, decision to publish) [100]. Other sources of undue influence include employers (e.g., university administrators for academic researchers and government supervisors, especially political appointees, for government researchers), advisory committees, litigants, and special interest groups.

nut-22.1 Ethics

Describe the procedure for consent and study approval from ethics committee(s).

As stated in the Helsinki Declaration, ethics apply to all types of medical research concerning human subjects, including research on identifiable human material or data. The Council for International Organizations of Medical Sciences has recently published a new version of its International Ethical Guidelines for Health-Related Research Involving Humans. It is useful to provide details about ethical approval, if it has been granted, and by whom. The need for ethical approval for observational studies, however, varies across countries (see also Nut-22.2).

Regardless of the legislation available in the country of research, all research studies collecting data from human participants impose ethical obligations to participants. Therefore, researchers should ensure clarity and describe how they addressed the ethical issues in their research, including the research risks of harm. In addition, the procedures to guarantee data privacy and confidentiality during the analysis and handling of personal data should be clearly described.

Example/s:

Before data collection, written consent was obtained from parent participants in the original data collection for the Early Childhood Longitudinal Programs, Birth Cohort (ECLS-B). The National Center for Education Statistics approved our use of the deidentified and anonymized restricted-use data set for the current analysis. The Johns Hopkins Institutional Review Board deemed that this analysis of deidentified secondary data involved non–human subjects research.

nut-22.2 Data statement

Provide data collection tools and data as online material or explain how they can be accessed.

Traditionally, efforts to share research have focused on manuscripts that provide a narrative summary of the conducted study and the results. However, other research outputs, such as protocols, data collection instruments, software, and algorithms, are essential for the interpretation of findings and the reproduction of the research project. An increasing number of journals allow researchers to upload supplementary online material or to link objects to online sources or repositories. This is an opportunity to maximize the build-up of scholarly knowledge and is increasingly recognized as an integral part of good research practice and academic culture.

There are ongoing discussions about ethical aspects of data-sharing: for instance, within European data infrastructure initiatives (e.g., Biobanking and BioMolecular Resources Research Infrastructure–European Research Infrastructure Consortium (www.bbmri-eric.eu) and the European Clinical Research Infrastructure Network (www.ecrin.org); see also Nut-22.1).

Data-sharing could vary from sharing information with regard to the study design, the mode of data collection and outcomes, the number of participants, and a full list of available measurements, up to the software used and individual, anonymized data. A tangible benefit of sharing protocols and hypotheses for epidemiologic studies in public repositories is that duplication of efforts potentially would be avoided. Legitimate data exploration and discovery could be cost-effective and maximize the impact of available epidemiologic data and should not be limited by preregistration of protocols. Important caveats apply, however, and whether epidemiologic studies should be preregistered or not is debated.

Researchers are encouraged to provide access to the data needed to reproduce the results. Various research-funding agencies, universities, and scientific journals have adopted policies that allow data to be accessible for the reproduction of the study findings; and different data repositories are being developed for such purposes. To ensure an effective use of data for future research and scientific discovery, data-sharing needs to be organized so that interaction with computational agents is facilitated. In other words, data should be “findable, accessible, interoperable, and reusable” (FAIR).

Research information with regard to humans should be managed with the highest and most appropriate ethical standards (see Nut-22.1). Efforts to ensure that research data are available include collection and storage of high-quality information with long-term validity. In order to do this, data must be well documented, so that other researchers can access, understand, and use these data, and add value to the original data independently of the original investigators. Furthermore, there is a need for high-quality stewardship of scientific data and adequate procedures, including long-term care, quality control, and adequate commitment as well as resources to handle the data. Funding bodies such as the United Kingdom Medical Research Council, the NIH, and the European Commission now explicitly require statements around data management. All applicants submitting funding proposals to these (and many other) funding agencies are required to include a Data Management Plan as an integral part of their application.

Example/s:

Dietary information was collected by using a 121-item, self-administered FFQ. Details of which foods were included in each food group are listed in the online Appendix.

To acknowledge this checklist in your methods, please state "We used the STROBE-nut checklist when writing our report [citation]". Then cite this checklist as Lachat C, Hawwash D, Ocké MC, Berg C, Forsum E, Hörnell A, Larsson C, Sonestedt E, Wirfält E, Åkesson A, Kolsteren P, Byrnes G, De Keyzer W, Van Camp J, Cade JE, Slimani N, Cevallos M, Egger M, Huybrechts I. Strengthening the Reporting of Observational Studies in Epidemiology-Nutritional Epidemiology (STROBE-nut): An Extension of the STROBE Statement.

The STROBE-nut checklist is distributed under the terms of the Creative Commons Attribution License (CC-BY).