
CHAPTER NINE

Evaluating Model Performance and Significance

Evaluating the predictive performance and statistical significance of a model constitutes a critical phase of niche modeling, and researchers should demonstrate that their models are of sufficient quality to meet the needs of the project at hand before using or interpreting them in any way (Peterson 2005a). In chapter 4, in the process of clarifying modeling concepts and developing basic mathematical notations, we also provided an overview of key principles of model evaluation. Here, we develop the topic in considerably greater depth, and discuss a framework for selecting appropriate evaluation strategies for a particular study. We begin by reviewing key concepts, but now in the light of input data characteristics discussed in chapters 5 and 6, which lead us to explore limitations inherent in most occurrence datasets available for model evaluation, and to comment on ways in which they can influence evaluation adversely. The resulting picture contrasts rather sharply with the optimistic panorama that we painted in chapter 4, but shows clearly the need to discuss methods for selecting evaluation data carefully. We then present commonly used quantitative measures of model performance and significance. Because the aims of modeling projects vary (see chapter 3), no single “best” approach to evaluation exists (just as we saw in chapter 7 that no single “best” modeling algorithm is likely to exist); however, we can outline approaches that are more or less suited to rigorous model evaluation. Hence, we discuss various evaluation approaches in light of when they are likely to be appropriate. Finally, we set out a vision of the research agenda for model evaluation, highlighting areas in need of theoretical and/or methodological advances.

PRESENCES, ABSENCES, AND ERRORS

We recall a few critical principles that were outlined in chapter 4.
There, we took for granted several rather optimistic assumptions that may be incorrect or untenable in the arena of modeling species’ ecological niches E_A or E_P (particularly in contrast to the challenges of evaluating the related but distinct species distribution models). Foremost is the elementary assumption that evaluation data are sufficient to allow for unequivocal, transparent empirical observation of Y = 1, 0. This is rarely so, however: occurrence data can be presence-only, presence/background, presence/pseudoabsence, or presence/absence, and even in the latter case the meaning of absence data is manifold (see chapter 5). Although most studies evaluate models based on the same kinds of data used in model calibration, this need not be the case. For example, Elith et al. (2006) developed models based on presence-only or presence/background data, but evaluated them with presence/absence data. Furthermore, even when more kinds of data are available for evaluations, some model evaluation strategies use only presence records. In fact, special considerations arise because the notion of an “absence” is questionable in niche modeling (see chapter 5). That is, omission errors (Y = 1, Ŷ = 0) are usually genuine in indicating model failure, except when identification or georeferencing errors, sink populations, or other misleading factors cause problems. On the other hand, problems surrounding the concept of commission error (Y = 0, Ŷ = 1) are pervasive in niche modeling; in fact, much of the “error” ascribed to commission may not be erroneous at all in such applications. A distinction, at least conceptually, and operationally to the extent possible, between real and apparent commission error components is paramount in model evaluation (Anderson et al. 2003).
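As a concrete illustration of the two error types defined above, the following minimal Python sketch tallies omission (Y = 1, Ŷ = 0) and commission (Y = 0, Ŷ = 1) from paired observed and predicted values. The function name `error_rates`, the example records, and the rate definitions (errors divided by the number of observed presences or observed absences, respectively) are assumptions made here for illustration; they are conventional choices, not formulas taken from this excerpt.

```python
from typing import Dict, Sequence


def error_rates(observed: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Tally omission and commission errors from paired binary values.

    observed  -- Y, 1 = recorded presence, 0 = recorded absence
    predicted -- Y-hat, 1 = predicted suitable, 0 = predicted unsuitable
    (Hypothetical helper; rate denominators are an assumed convention.)
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")

    omission = 0    # Y = 1, Y-hat = 0
    commission = 0  # Y = 0, Y-hat = 1
    presences = 0   # records with Y = 1
    absences = 0    # records with Y = 0
    for y, y_hat in zip(observed, predicted):
        if y == 1:
            presences += 1
            if y_hat == 0:
                omission += 1
        else:
            absences += 1
            if y_hat == 1:
                commission += 1

    return {
        "omission": omission,
        "commission": commission,
        # Rates are undefined (NaN) when the relevant record type is missing,
        # as with presence-only evaluation data, which contain no absences.
        "omission_rate": omission / presences if presences else float("nan"),
        "commission_rate": commission / absences if absences else float("nan"),
    }


# Made-up evaluation records for ten map cells (illustration only).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
rates = error_rates(observed, predicted)
print(rates)
```

Note that the commission count produced this way is only the total: it cannot by itself separate real commission error from the apparent component, because a recorded absence (Y = 0) may simply reflect a site that was never adequately sampled.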
Apparent commission error does not reflect real error in model calibration, but rather may derive from incomplete evaluation data, inappropriate selection of the evaluation region, or both. Two major factors contribute to apparent commission error: incomplete biological sampling across the landscapes being used to evaluate models (which is universal), and nonequilibrium distributions (e.g., owing to dispersal limitations and possibly to biotic interactions), creating absences in areas in which the species could maintain populations (see chapter 8). In figure 3.1, it is clear that G_P may exist outside of M; put another way, G_I is rarely or never empty, meaning that some areas within G_P will be uninhabited by the species. Moreover, even within M, few or no taxonomic groups have been sampled thoroughly across their entire geographic distributions (Soberón et al. 2007), so even within G_O one should expect to see some (often many) undocumented map cells. Demonstrating absence for a species is particularly difficult for analyses at the relatively coarse resolutions typical in niche modeling studies (see chapter 5...

