What Makes a Good Test?

Psychologists use reliability and validity to judge a test's quality and to compare different tests.

Reliability

Reliability means consistency. It is the ability of a test to produce consistent and stable scores. The simplest way to determine a test's reliability is to give the test to a group and then, after a short time, give it again to the same group. If people's scores are about the same each time, the test is reliable. The problem with this method is that people may remember the answers from the first testing. One way around this problem is to divide the test into two parts and check the consistency of people's scores on the two parts. If the scores generally agree, the test is said to have split-half reliability.

Psychologists express reliability in terms of correlation coefficients, a statistical measure of the degree of linear association between two variables. Correlation coefficients can vary from -1.0 to +1.0. The reliability of intelligence tests is about .90; that is, scores remain fairly stable across repeated testing.
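The two checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: all scores are invented, the function names pearson_r and split_half_r are hypothetical, and splitting the test into odd-numbered and even-numbered items is just one possible way to form the two parts.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / sqrt(var_x * var_y)

def split_half_r(item_scores):
    """Correlate each person's total on odd-numbered items with
    their total on even-numbered items (an assumed way to split the test)."""
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    return pearson_r(odd_totals, even_totals)

# Hypothetical scores for six people tested twice, a short time apart.
first_testing = [98, 105, 110, 120, 87, 101]
second_testing = [100, 103, 112, 118, 90, 99]
test_retest_r = pearson_r(first_testing, second_testing)

# Hypothetical item-level results (1 = correct, 0 = wrong), one row per person.
item_scores = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
split_half = split_half_r(item_scores)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"split-half r  = {split_half:.2f}")
```

A coefficient close to +1.0 in either check indicates that people keep roughly the same standing across the two sets of scores, which is what a reliable test should show.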

Validity

Validity is the ability of a test to measure what it has been designed to measure. Construct validity is the degree to which a test measures the concept it claims to measure. Predictive validity is the degree to which scores on a test predict real-world outcomes. In general, most intelligence tests assess many of the abilities considered to be components of intelligence: concentration, planning, memory, language comprehension, and writing. However, a single test may not cover all the areas of intelligence, and tests differ in their emphasis on the abilities they do measure.

Criterion-related validity refers to the relationship between test scores and independent measures of whatever the test is designed to measure. In the case of intelligence, the most common independent measure is academic achievement. Despite their differences in surface content, most intelligence tests are good predictors of academic success. Based on this criterion, these tests seem to have adequate criterion-related validity.

Tests must be both reliable and valid.

Test-retest reliability is the consistency of scores on a test over time.

Internal reliability is the degree to which the questions on a subtest correlate with the other questions on that subtest. A test has high internal reliability when these correlations are very high.

