The Concise Encyclopedia of Applied Linguistics - Carol A. Chapelle
Validity
Validity of test‐score interpretation and use needs to be justified on the basis of evidence for a correspondence between test scores and the integrated ability that the test is intended to measure, as well as evidence for the utility of the test scores. Test developers and researchers need to consider how to elicit such evidence in order to conduct validation research. When language is divided into skill areas, defining the construct appears manageable; questions arise, however, about the construct underlying integrated assessment. Examining writing processes in a thematically linked integrated assessment, Esmaeili (2002) concluded that reading and writing could not be viewed as stand‐alone constructs. In a study of non‐native and native English speakers, Delaney (2008) found that reading‐to‐write tasks elicited processes attributable to unique constructs that were not merely a combination of reading ability and writing skill but also involved discourse synthesis. Plakans (2009) also found evidence of discourse synthesis in writers' composing processes for integrated writing assessment and concluded that the evidence supported interpretation of such a construct from test scores. Using structural equation modeling (SEM) and qualitative methods, Yang and Plakans (2012) found complex interrelated strategies used by writers in reading–listening–writing tasks, further supporting the idea that the processes related to discourse synthesis (selecting, connecting, and organizing) improved test performance. In a similar study focused on summarization tasks, Yang (2014) used SEM to provide evidence that the task required comprehension and construction strategies as well as planning, evaluating, source use, and discourse synthesis strategies.
While research into validity and integrated assessment is building momentum, ongoing attention and research are needed to refine evolving definitions, innovation in task types, and approaches to scoring.