Psychological Testing and Assessment: An Introduction To Tests and Measurement, 5/e
Base rate | An index, usually expressed as a proportion, of the extent to which a particular trait, behavior, characteristic, or attribute exists in a population, 169
Bias | As applied to tests, a factor inherent within a test that systematically prevents accurate, impartial measurement, 179
Central tendency error | A type of rating error wherein the rater exhibits a general reluctance to issue ratings at either the positive or negative extreme, and so all or most ratings cluster in the middle of the rating continuum, 182, 331
Concurrent validity | A form of criterion-related validity that is an index of the degree to which a test score is related to some criterion measure obtained at the same time (concurrently), 160, 161-162.
Confirmatory factor analysis (CFA) | A class of mathematical procedures employed when a factor structure that has been explicitly hypothesized is tested for its fit with the observed relationships between the variables, 178, 273-276.
Construct | An informed, scientific idea developed or generated to describe or explain behavior; some examples of constructs include "intelligence," "personality," "anxiety," and "job satisfaction," 14, 173
Construct validity | A judgment about the appropriateness of inferences drawn from test scores regarding individual standings on a variable called a construct, 173-178
Content validity | A judgment regarding how adequately a test or other tool of measurement samples behavior representative of the universe of behavior it was designed to sample, 156-160.
Convergent evidence | With reference to construct validity, data from other measurement instruments designed to measure the same or a similar construct as the test being construct-validated, which all point to the same judgment or conclusion with regard to a test or other tool of measurement; contrast with discriminant evidence, 176-177
Criterion | The standard against which a test or a test score is evaluated; this standard may take many forms, including a specific behavior or set of behaviors, 160-161, 347
Criterion contamination | A state in which a criterion measure is itself based, in whole or in part, on a predictor measure, 161
Criterion-related validity | A judgment regarding how adequately a score or index on a test or other tool of measurement can be used to infer an individual's most probable standing on some measure of interest (the criterion), 160-173
Discriminant evidence | With reference to construct validity, data from a test or other measurement instrument showing little relationship between test scores or other variables with which the scores on the test being construct-validated should not theoretically be correlated; contrast with convergent evidence, 177
Expectancy chart | Graphic representation of an expectancy table, 165, 167
Exploratory factor analysis | A class of mathematical procedures employed to estimate factors, extract factors, or decide how many factors to retain, 178.
Face validity | A judgment regarding how well a test or other tool of measurement measures what it purports to measure, based solely on "appearances" such as the content of the test's items, 155-156.
Factor analysis | A class of mathematical procedures, frequently employed as data reduction methods, designed to identify variables on which people may differ (or factors). In measurement, two types of factor analysis are common: exploratory factor analysis and confirmatory factor analysis.
Factor loading | In factor analysis, a metaphor suggesting that a test (or an individual test item) carries a certain amount of one or more abilities which, in turn, has a determining influence on the test score (or on the response to the individual test item). Unlike other metaphors, however, a factor loading can be quantified, 178, 273
Fairness | As applied to tests, the extent to which a test is used in an impartial, just, and equitable way, 182-186
False negative | (1) In the general context of the miss rate of a test, an inaccurate prediction or classification indicating that a testtaker did not possess a trait or other attribute being measured when in reality the testtaker did; (2) in drug testing, an individual tests negative for drug use when in reality there has been drug use, 169, 525
False positive | (1) In the general context of the miss rate of a test, an inaccurate prediction or classification indicating that a testtaker did possess a trait or other attribute being measured when in reality the testtaker did not; (2) in drug testing, an individual tests positive for drug use when in reality there has been no drug use, 169, 525
Generosity error | Less than accurate rating or evaluation by a rater due to that rater's general tendency to be lenient or insufficiently critical; also referred to as leniency error; contrast with severity error, 181-182, 331
Halo effect | A type of rating error wherein the rater views the object of the rating with extreme favor and tends to bestow ratings inflated in a positive direction; a set of circumstances resulting in a rater's tendency to be positively disposed and insufficiently critical, 182, 331
Hit rate | The proportion of people a test or other measurement procedure accurately identifies as possessing or exhibiting a particular trait, behavior, characteristic, or attribute; contrast with miss rate, false positive, and false negative, 169
Incremental validity | Used in conjunction with predictive validity, an index of the explanatory power of additional predictors over and above the predictors already in use, 164-165
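One common way to put a number on incremental validity is the gain in R² (proportion of criterion variance explained) when a new predictor is added to one already in use. The sketch below assumes exactly two predictors and uses the standard two-predictor formula for R² built from pairwise Pearson correlations. The function names and the data are hypothetical and only illustrate the idea; other indexes of incremental validity exist.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def incremental_validity(existing, added, criterion):
    """Return (R^2 with existing predictor, R^2 with both, gain in R^2).

    Assumes the two predictors are not perfectly correlated.
    """
    r_y1 = pearson_r(existing, criterion)
    r_y2 = pearson_r(added, criterion)
    r_12 = pearson_r(existing, added)
    r2_existing = r_y1 ** 2
    # Two-predictor multiple R^2 expressed through pairwise correlations.
    r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return r2_existing, r2_both, r2_both - r2_existing

# Hypothetical scores: the criterion is constructed as the sum of both predictors,
# so the two predictors together explain all of its variance.
existing_test = [1, 2, 3, 4, 5, 6]
added_test = [2, 1, 4, 3, 6, 5]
criterion = [a + b for a, b in zip(existing_test, added_test)]

r2_existing, r2_both, gain = incremental_validity(existing_test, added_test, criterion)
```

A gain near zero would suggest the added predictor contributes little beyond what is already in use.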
Inference | A logical result or a deduction in a reasoning process, 154
Intercept bias | A reference to the point at which a regression line intercepts the Y-axis, this term refers to a test or measurement procedure systematically underpredicting or overpredicting the performance of members of a group; contrast with slope bias, 180
Leniency error | Also referred to as a generosity error, a rating error that occurs as the result of a rater's tendency to be too forgiving and insufficiently critical, 181-182, 331
Local validation study | The process of gathering evidence relevant to how well a test measures what it purports to measure, for the purpose of evaluating the validity of a test or other measurement tool, typically undertaken in conjunction with a population different from the population for whom the test was originally validated, 154-155
Method of contrasted groups | A procedure for gathering construct validity evidence that entails demonstrating that scores on the test vary in a predictable way as a function of membership in a particular group, 176
Miss rate | The proportion of people a test or other measurement procedure fails to identify accurately with respect to the possession or exhibition of a trait, behavior, characteristic, or attribute; a "miss" in this context is an inaccurate classification or prediction; may be subdivided into false positives and false negatives, 169
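The proportions named in the base rate, hit rate, miss rate, false positive, and false negative entries can be tallied from paired records of what the test predicted and what was actually true. The sketch below is one reading of those definitions: it assumes every rate is expressed as a proportion of all people tested, and it treats the hit rate as the complement of the miss rate. The function name and the data are hypothetical.

```python
def classification_rates(actual, predicted):
    """Tally classification proportions from two equal-length lists of booleans.

    actual[i]    -- True if person i really has the attribute
    predicted[i] -- True if the test says person i has the attribute
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    false_pos = sum(1 for a, p in zip(actual, predicted) if p and not a)
    false_neg = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return {
        "base_rate": sum(1 for a in actual if a) / n,  # how common the attribute is
        "false_positive_rate": false_pos / n,
        "false_negative_rate": false_neg / n,
        "miss_rate": (false_pos + false_neg) / n,      # all inaccurate classifications
        "hit_rate": (n - false_pos - false_neg) / n,   # assumed complement of miss rate
    }

# Hypothetical sample of ten testtakers, three of whom have the attribute.
actual = [True, True, True, False, False, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False, False, False]
rates = classification_rates(actual, predicted)
```

Here one person with the attribute is missed (a false negative) and one without it is flagged (a false positive), so two of the ten classifications are misses.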
Multitrait-multimethod matrix | A method of evaluating construct validity by simultaneously examining both convergent and divergent evidence by means of a table of correlations between traits and methods, 177
Predictive validity | A form of criterion-related validity that is an index of the degree to which a test score predicts some criterion measure, 160, 162-173
Ranking | The ordinal ordering of persons, scores, or variables into relative positions or degrees of value, 182
Rating | A numerical or verbal judgment that places a person or attribute along a continuum identified by a scale of numerical or word descriptors called a rating scale, 181
Rating error | A judgment that results from the intentional or unintentional misuse of a rating scale; two types of rating error are leniency error (or generosity error) and severity error, 181-182
Severity error | Less than accurate rating or error in evaluation due to the rater's tendency to be overly critical; contrast with generosity error, 182, 331
Slope bias | A reference to the slope of a regression line being different between groups, this term refers to a test or measurement procedure systematically yielding different validity coefficients for members of different groups; contrast with intercept bias, 180
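Both slope bias and intercept bias are described in terms of regression lines, so one simple way to look for them is to fit a separate least-squares line (criterion regressed on test score) for each group and compare the slopes and intercepts. The sketch below assumes ordinary least squares with a single predictor. The group labels and scores are hypothetical, and a real analysis would also need a significance test for the differences.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical test scores (x) and criterion scores (y) for two groups.
test_scores = [1, 2, 3, 4]
criterion_group_a = [2, 4, 6, 8]
criterion_group_b = [3, 5, 7, 9]

slope_a, intercept_a = fit_line(test_scores, criterion_group_a)
slope_b, intercept_b = fit_line(test_scores, criterion_group_b)

# Equal slopes with unequal intercepts is the pattern associated with intercept bias:
# a single common line would overpredict for one group and underpredict for the other.
slope_gap = slope_a - slope_b
intercept_gap = intercept_a - intercept_b
```

In this made-up data the two lines are parallel but offset by one criterion point, so the slope gap is zero and the intercept gap is not.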
Test homogeneity | Also simply homogeneity, the extent to which individual test items measure a single construct; contrast with test heterogeneity, 135, 136, 174
Validation | The process of gathering and evaluating validity evidence, 154
Validation study | Research that entails gathering evidence relevant to how well a test measures what it purports to measure for the purpose of evaluating the validity of a test or other measurement tool, 154-155
Validity | A general term referring to a judgment regarding how well a test or other measurement tool measures what it purports to measure; this judgment has important implications regarding the appropriateness of inferences made and actions taken on the basis of measurements, 29-30, 32-33, 154-186
Validity coefficient | A correlation coefficient that provides a measure of the relationship between test scores and scores on a criterion measure, 162-164
Validity generalization (VG), 510, 511-514
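As a worked illustration of the validity coefficient entry, the sketch below computes a Pearson correlation between test scores and criterion scores. Pearson r is assumed here because it is the most familiar correlation coefficient; the entry itself does not prescribe a particular one. The function name and the scores are hypothetical.

```python
from math import sqrt

def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and criterion scores."""
    n = len(test_scores)
    mean_t = sum(test_scores) / n
    mean_c = sum(criterion_scores) / n
    cov = sum((t - mean_t) * (c - mean_c)
              for t, c in zip(test_scores, criterion_scores))
    var_t = sum((t - mean_t) ** 2 for t in test_scores)
    var_c = sum((c - mean_c) ** 2 for c in criterion_scores)
    return cov / sqrt(var_t * var_c)

# Hypothetical data: five testtakers' test scores and later criterion ratings.
test_scores = [1, 2, 3, 4, 5]
criterion_scores = [2, 4, 5, 4, 5]
r = validity_coefficient(test_scores, criterion_scores)  # about 0.77
```

If the criterion is collected at the same time as the test, a coefficient like this bears on concurrent validity; if it is collected later, it bears on predictive validity.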