Introduction to Validity

2015-05-30 Zhang Nan

Campus English (校园英语·下旬), 2015, Issue 12


【Abstract】The design of a test paper strictly follows the specifications of the relevant Testing Syllabus, which covers the various skills and abilities needed in real communicative situations. A test with high content validity is likely to have a positive impact on teaching. Such a test can therefore be expected to promote college English education and to facilitate the implementation of the Syllabus.

【Key words】test; validity; content validity

The most important quality of test interpretation or use is validity, that is, the extent to which the inferences or decisions we make on the basis of test scores are meaningful, appropriate and useful. High validity is the guarantee of a good test.

I. Concept of validity

The validity of a test is the extent to which it measures what it is supposed to measure and nothing else. Every test, whether it be a short, informal classroom test or a public examination, should be as valid as the constructor can make it. The test must aim to provide a true measure of the particular skill which it is intended to measure: to the extent that it measures external knowledge and other skills at the same time, it will not be a valid test (Heaton 2000: 159). For example, the following test item is invalid if we wish solely to measure writing ability: “Is photography an art or a science? Discuss.” It is likely to be invalid simply because it demands some knowledge of photography and will consequently favor students who happen to have that knowledge.

II. Classification of validity

Validity is a complex concept, and different scholars classify it in different ways. Li Xiaoju (1997), an expert in the field of language testing in China, classifies it into four categories. Her classification is illustrated in the following table.

Although validity has traditionally been discussed in terms of different types, as pointed out above, psychometricians have increasingly come to view it as a single, unitary concept.

III. Content validity

Content validity, in the conventional measurement sense, is the extent to which the content of the test constitutes a representative sample of the domain to be tested (Bachman 1990: 306). A test is said to be valid if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned. The examination of content validity consists of the study of content relevance and content coverage. The investigation of content relevance requires “the specification of the behavioral domain in question and the attendant specification of the task or test domain” (Messick 1980: 1017, cited in Bachman 1990: 244). The second aspect of examining test content is that of content coverage, or the extent to which the tasks required in the test adequately represent the behavioral domain in question. From the perspective of the test developer, if we had a well-defined domain that specified the entire set, or population, of possible test tasks, we could follow a standard procedure for random sampling to ensure that the tasks required by the test were representative of that domain (Bachman 1990: 245).
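The random-sampling procedure mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the domain can be listed exhaustively as (skill, task number) pairs; the function name `sample_tasks`, the three skills and the domain size are hypothetical and are not taken from Bachman (1990).

```python
import random


def sample_tasks(domain, n_items, seed=None):
    """Draw n_items tasks uniformly at random, without replacement."""
    if n_items > len(domain):
        raise ValueError("cannot sample more tasks than the domain contains")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(domain, n_items)


# Hypothetical, fully enumerated domain: 10 possible tasks for each of 3 skills.
domain = [
    (skill, task_no)
    for skill in ("listening", "reading", "writing")
    for task_no in range(10)
]

# Select 6 tasks for one test form.
test_tasks = sample_tasks(domain, 6, seed=0)
```

In practice a language domain can rarely be enumerated in this way, which is why the comparison of test content with a written specification, discussed next, is the usual basis for judging content validity.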

In practice, a content validity study needs a specification of the skills, structures, etc. that the test is meant to cover. Such a specification should be made at a very early stage in test construction. A comparison of the test specification with the test content is the basis for judgments as to content validity. Ideally these judgments should be made by people who are familiar with language learning and testing but who are not directly concerned with the production of the test in question.
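The comparison of specification and content can be sketched as a simple tally. The sketch below assumes that the specification assigns each area an intended share of the test and that each item has been labelled by a judge with the area it tests; the function name `coverage_report`, the areas and the weights are hypothetical examples, not figures from any syllabus.

```python
from collections import Counter


def coverage_report(specification, item_tags):
    """Compare the intended share of each area with its actual share of items."""
    counts = Counter(item_tags)
    total = len(item_tags)
    report = {}
    for area, intended in specification.items():
        n = counts.get(area, 0)
        report[area] = {
            "intended": intended,
            "actual": n / total if total else 0.0,
            "missing": n == 0,  # area named in the specification but never tested
        }
    # Areas that appear in the test but not in the specification.
    unspecified = sorted(set(item_tags) - set(specification))
    return report, unspecified


# Hypothetical specification: intended proportion of the test per area.
specification = {"listening": 0.3, "reading": 0.4, "writing": 0.3}

# Hypothetical labels assigned by a judge to each of 10 items.
item_tags = ["reading"] * 6 + ["listening"] * 3 + ["grammar"]

report, unspecified = coverage_report(specification, item_tags)
```

Here writing is flagged as missing, reading takes a larger share of the items than intended, and the grammar item falls outside the specification. A tally of this kind only supports the judges' decision; it does not replace it.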

Content validity is important for two reasons:

Firstly, the greater a test's content validity, the more likely it is to be an accurate measure of what it is supposed to measure. A test in which major areas identified in the specification are under-represented, or not represented at all, is unlikely to be accurate. Secondly, a test lacking content validity is likely to have a harmful backwash effect. Areas which are not tested are likely to become areas ignored in teaching and learning. The best safeguard against this is to write full test specifications and to ensure that the test content is a fair reflection of these (Hughes 2002: 23).

References:

[1] Bachman, L. F. (1990). Fundamental Considerations in Language Testing. Oxford: Oxford University Press.

[2] Heaton, J. B. (2000). Writing English Language Tests. Foreign Language Teaching and Research Press.

[3] Li, Xiaoju (1997). 语言测试科学与艺术 [The Science and Art of Language Testing]. Hunan: Hunan Education Press.