Key Points

- Inclusion criteria are derived from key questions and must be clear and sufficiently detailed to avoid inconsistent application in study selection.
- The review team should carefully consider how inclusion or exclusion of specific populations, interventions, comparators, outcomes, timeframes, settings, or study designs and characteristics may affect the review conclusions.
- Dual review helps to reduce bias and can identify inclusion criteria that are not sufficiently clear and where subjective judgment may differ.
- Gray literature (e.g., FDA documents, trial registry reports) is important in assessing publication bias or selective outcome reporting biases.

Introduction

Although systematic reviews are intended to reduce bias compared to narrative reviews, reviewers must carefully consider how each decision in the selection and inclusion of studies may introduce bias into the review conclusions. The Methods Guide for AHRQ’s Evidence-based Practice Centers (EPCs) incorporates methods to identify and reduce bias in many chapters; here we focus on the methods that EPC authors can use to reduce bias in the selection and assessment of studies to be included in a systematic review (SR). In the initial stage of an SR, topic development, the EPCs develop potential topics that have been submitted by a broad range of stakeholders. Each topic is then evaluated against predefined criteria by a panel comprising various AHRQ EHC staff, the Scientific Resource Center (SRC), and the John M. Eisenberg Center for Clinical Decisions and Communication Science. AHRQ ultimately makes the final decision about which topics will move forward in the process, with subsequent refinement of specific key questions by the EPC with further input from key informants. 
(1) The whole topic development and refinement process defines the general scope and specific key questions of the topic for the SR; it is, therefore, directly relevant to determining the ultimate selection of studies for a review. The methods used to develop and further refine these topics work towards reducing potential bias, and are discussed elsewhere. (1) While bias may be introduced at these early stages, this paper instead focuses on the methods that EPC authors use to operationalize key questions after they have been finalized. Selecting studies directly influences the resulting conclusions. We carried out a search for studies that examined variation in study inclusion across SRs of the same topic and found a very small number of relevant studies. (2-5) The most relevant example was a prospective study designed to examine variation between review groups in determining study inclusion. (5) Two review groups (on different continents) were commissioned to review observational evidence on the same topic. For the papers retrieved in both searches but regarded as relevant by only one center (52 in center A and 20 in center B), 63 of the 72 discrepancies occurred in screening citations (titles and abstracts); 9 of the 72 discrepancies occurred during review of full-text articles. Of 310 relevant articles, 166 (54 percent) were included by both groups. Of papers included by both groups, 80 percent were described by the same study design. Agreement for inclusion of cohort-type and case-control studies was only about 63 percent, and 50 percent or less for ecological and case series studies. In other studies, published systematic reviews that included trials and appeared to focus on the same research question were examined retrospectively and also differed in their lists of included studies (Table 1). 
(2-4) These findings led us to conclude that variation in the detail, or lack of adequate specificity, of inclusion criteria and in the methods used to apply those criteria yielded quite different sets of included studies, contributing to differing conclusions. Additionally, given the finding that discrepancies in inclusion decisions were concentrated among observational studies, there may be separate issues relating to the inclusion of randomized controlled trials (RCTs) versus observational studies. Although some differences in study inclusion decisions in the studies cited above could have resulted from one review’s having used a best evidence approach (for example, using a hierarchy of study designs as reviewed by Treadwell et al [AHRQ Method Guidance – in press, cite when published]) or excluding poor quality studies while another did not, such decisions should have been identified in the development of, and clearly stated in, the inclusion criteria. In most cases, we could not fully distinguish differences in inclusion decisions attributable to variations in inclusion criteria from those attributable to variations in the search strategies. Other authors have addressed reasons for discrepant results from meta-analyses on (seemingly) the same topics. (6, 7) Ioannidis has examined multiple such scenarios and concluded that the reasons for discrepancy are typically multifactorial, but include differing study questions and inclusion criteria as well as differences in the process of applying the criteria during study selection. He gives examples of situations where inclusion criteria for meta-analyses were apparently specified in a way that would obtain results supporting the viewpoints of the authors rather than reflecting questions of clinical uncertainty. Operationalization of the key questions (decisions about what to include as "evidence") is thus a serious source of bias and, in these examples, led to different conclusions. 
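One practical safeguard against under-specified criteria is to record each eligibility element (population, intervention, comparator, outcome, follow-up, study design) as an explicit, structured checklist that every screener applies in the same way. The sketch below is purely illustrative; the field names and example values are hypothetical, not an AHRQ requirement:

```python
from dataclasses import dataclass

@dataclass
class InclusionCriteria:
    """Explicit PICOTS-style eligibility criteria (hypothetical example)."""
    population: str
    interventions: set
    comparators: set
    outcomes: set
    min_followup_weeks: int
    study_designs: set

# Illustrative criteria for a made-up review question
criteria = InclusionCriteria(
    population="adults with type 2 diabetes",
    interventions={"metformin"},
    comparators={"placebo", "sulfonylurea"},
    outcomes={"HbA1c", "all-cause mortality"},
    min_followup_weeks=12,
    study_designs={"RCT", "prospective cohort"},
)

def is_eligible(study: dict) -> bool:
    """Every element is checked; failing any one criterion excludes the study."""
    return (
        study["design"] in criteria.study_designs
        and study["followup_weeks"] >= criteria.min_followup_weeks
        and bool(criteria.interventions & set(study["interventions"]))
        and bool(criteria.outcomes & set(study["outcomes"]))
    )
```

Writing the criteria down in this form forces the team to state each element precisely before screening begins; any question a screener cannot answer directly from the record signals a criterion that needs sharpening.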
In this chapter we outline the potential for systematic bias and random error in the study selection process of SRs and discuss specific strategies to reduce and avoid potential bias when selecting studies to include in SRs.
IOM Standard 3: Standards for Finding and Assessing Individual Studies
This chapter addresses the identification, screening, data collection, and appraisal of the individual studies that make up a systematic review’s (SR’s) body of evidence. The committee recommends six related standards. The search should be comprehensive and include both published and unpublished research. The potential for bias to enter the selection process is significant and well documented. Without appropriate measures to counter the biased reporting of primary evidence from clinical trials and observational studies, SRs will reflect and possibly exacerbate existing distortions in the biomedical literature. The review team should document the search process and keep track of the decisions that are made for each article. Quality assurance and control are critical during data collection and extraction because of the substantial potential for errors. At least two review team members, working independently, should screen and select studies and extract quantitative and other critical data from included studies. Each eligible study should be systematically appraised for risk of bias; relevance to the review’s populations, interventions, and outcome measures; and fidelity of the implementation of the interventions.
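Agreement between two independent screeners is commonly summarized with Cohen's kappa, which corrects raw agreement for chance. A minimal sketch (the screening decisions below are made-up data, not from any cited study):

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two independent screeners."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of citations with identical decisions
    po = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each screener's marginal inclusion rate
    pa = sum(rater_a) / n
    pb = sum(rater_b) / n
    pe = pa * pb + (1 - pa) * (1 - pb)
    return (po - pe) / (1 - pe)

# 1 = include, 0 = exclude, for six screened citations (illustrative only)
a = [1, 1, 0, 0, 1, 0]
b = [1, 0, 0, 0, 1, 0]
print(round(cohens_kappa(a, b), 3))  # → 0.667
```

A low kappa on a pilot batch of citations is a useful early warning that the inclusion criteria are ambiguous and should be clarified before full screening proceeds.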
PREFACE

This third edition of the Centre for Reviews and Dissemination (CRD) guidance for undertaking systematic reviews builds on previous editions published in 1996 and 2001. Our guidance continues to be recommended as a source of good practice by agencies such as the National Institute for Health Research Health Technology Assessment (NIHR HTA) programme and the National Institute for Health and Clinical Excellence (NICE), and has been used widely both nationally and internationally. Our aim is to promote high standards in commissioning and conduct, by providing practical guidance for undertaking systematic reviews evaluating the effects of health interventions.

WHY SYSTEMATIC REVIEWS ARE NEEDED

Health care decisions for individual patients and for public policy should be informed by the best available research evidence. Practitioners and decision-makers are encouraged to make use of the latest research and information about best practice, and to ensure that decisions are demonstrably rooted in this knowledge.1, 2 However, this can be difficult given the large amounts of information generated by individual studies, which may be biased, methodologically flawed, time and context dependent, and can be misinterpreted and misrepresented.3 Furthermore, individual studies can reach conflicting conclusions. This disparity may be because of biases or differences in the way the studies were designed or conducted, or simply due to the play of chance. In such situations, it is not always clear which results are the most reliable, or which should be used as the basis for practice and policy decisions. Systematic reviews aim to identify, evaluate and summarise the findings of all relevant individual studies, thereby making the available evidence more accessible to decision-makers. 
When appropriate, combining the results of several studies gives a more reliable and precise estimate of an intervention’s effectiveness than one study alone.5-8 Systematic reviews adhere to a strict scientific design based on explicit, pre-specified and reproducible methods. Because of this, when carried out well, they provide reliable estimates about the effects of interventions so that conclusions are defensible. As well as setting out what we know about a particular intervention, systematic reviews can also demonstrate where knowledge is lacking.4, 9 This can then be used to guide future research.10

WHAT IS COVERED IN THE GUIDANCE

The methods and steps necessary to conduct a systematic review are presented in a core chapter (Chapter 1). Additional issues specific to reviews in more specialised topic areas, such as clinical tests (diagnostic, screening and prognostic) and public health, are addressed in separate, complementary chapters (Chapters 2-3). We also consider questions relating to harm (Chapter 4), costs (Chapter 5), and how and why interventions work (Chapter 6). This guide focuses on the methods relating to use of aggregate study level data. Although discussed briefly in relevant sections, individual patient data (IPD) meta-analysis, which is a specific method of systematic review, is not described in detail. The basic principles are outlined in Appendix 1 and more detailed guidance can be found in the Cochrane Handbook11 and specialist texts.12, 13 Similarly, other forms of evidence synthesis, including prospective meta-analysis, reviews of reviews, and scoping reviews, are beyond the scope of this guidance but are described briefly in Appendix 1.

WHO SHOULD USE THIS GUIDE

The guidance has been written for those with an understanding of health research but who are new to systematic reviews; those with some experience but who want to learn more; and for commissioners. 
We hope that experienced systematic reviewers will also find this guidance of value; for example, when planning a review in an area that is unfamiliar or with an expanded scope. This guidance might also be useful to those who need to evaluate the quality of systematic reviews, including, for example, anyone with responsibility for implementing systematic review findings. Given the purpose of the guidance, the audience it is designed for, and the aim to remain concise, it has been necessary to strike a balance between the wide scope covered and the level of detail and discussion included. In addition to providing references to support statements and discussions, recommended reading of more specialist works such as the Cochrane Handbook,14 Systematic Reviews in the Social Sciences,4 and Systematic Reviews in Health Care15 has been given throughout the text.

HOW TO USE THIS GUIDE

The core methods for carrying out any systematic review are given in Chapter 1, which can be read from start to finish as an introduction to the review process, followed step by step while undertaking a review, or referred to section by section as needed. In view of this, and the sometimes iterative nature of the review process, occasional repetition and cross referencing between sections has been necessary. Chapters 2-5 provide supplementary information relevant to conducting reviews in more specialised topic areas. To minimize repetition, they simply highlight the differences or additional considerations pertinent to their speciality and should be used in conjunction with the core principles set out in Chapter 1. Chapter 6 provides guidance on the identification, assessment and synthesis of qualitative studies to help explain, interpret and implement the findings from effectiveness reviews. This reflects the growing recognition of the contribution that qualitative research can make to reviews of effectiveness. 
For the purposes of space and readability:

- The term ‘review’ is used throughout this guidance and should be taken as a short form for ‘systematic review’, except where it is explicitly stated that non-systematic reviews are being discussed.
- ‘Review question’ is used in the singular even though frequently there may be more than one question or objective set. The same process applies to each and every question.

A glossary of terms has been provided to ensure a clear understanding of the use of those terms in the context of this guidance and to facilitate ease of reference for the reader.