Saturday, June 14, 2008

Glossary

Absolute Judgment: A criterion-referenced approach to score interpretation which evaluates whether or not a student has achieved mastery over the content domain.

ACT Scores: Customized standard scores with a mean of 18 and a standard deviation of 6, produced on the ACT college admission and placement examination.

Age Equivalent Scores: Standardized scores that convert raw test scores into corresponding age-level equivalents based on the performance of a norm group.

Constructed Response: A supply item format which requires students to construct a response. Items which require an essay answer, for example, are constructed response items.

Criterion-referenced: A score interpretation approach which evaluates student performance against a predetermined set of standards, objectives or criteria. Typical criteria are based on the test’s measurement objectives. Contrasts with norm-referenced.

Criterion-referenced Scoring: A process of scoring that evaluates a test score in relation to a set criterion or standard.

Dichotomous: Only two scoring options are possible.

Equating: The process by which raw scores from different tests or different versions of the same test are translated to a new scale so that direct comparisons can be made.

Equipercentile Equating: A process of equating based on the percentile ranks of scores.
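
As a rough illustration of the idea, the sketch below maps a score on one form to the score on another form that has the closest percentile rank. The function names and the sample data are hypothetical, percentile rank is taken as the percentage of scores strictly below a given score, and no interpolation or smoothing is applied, so this is a simplification rather than an operational procedure.

```python
def percentile_rank(score, group_scores):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)


def equipercentile_equate(score_x, form_x_scores, form_y_scores):
    """Return the Form Y score whose percentile rank is closest to that of score_x on Form X."""
    target = percentile_rank(score_x, form_x_scores)
    candidates = sorted(set(form_y_scores))
    return min(candidates, key=lambda y: abs(percentile_rank(y, form_y_scores) - target))


# Hypothetical score distributions for two forms of the same test.
form_x = [10, 20, 30, 40, 50]
form_y = [15, 25, 35, 45, 55]
equated = equipercentile_equate(30, form_x, form_y)
```

In this made-up data a Form X score of 30 and a Form Y score of 35 both sit above 40 percent of their groups, so 30 on Form X is treated as equivalent to 35 on Form Y.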

Formative Assessment: Assessment during instruction which is meant to provide feedback to students, teachers or both. Formative assessment is meant to help students monitor their own learning and usually does not affect their grade.

Grade Equivalent Scores: Standardized scores that convert raw test scores into corresponding grade-level equivalents based on the performance of a norm group.

Horizontal Equating: A process of equating which allows for meaningful comparisons for the same group of students across time.

Inter-Rater Reliability: The level of measurement precision associated with subjectively scored tests. If two people following the same scoring key and instructions might disagree on the correct number of points to assign, then the scoring system does not have perfect inter-rater reliability.
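
One simple way to put a number on this is percent agreement between two raters. The sketch below is only one of several possible indices, and the function name and ratings are hypothetical.

```python
def percent_agreement(ratings_a, ratings_b):
    """Percentage of responses to which two raters assigned the same number of points."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return 100 * matches / len(ratings_a)


# Hypothetical rubric scores given by two raters to the same five essays.
rater_a = [3, 4, 2, 5, 3]
rater_b = [3, 4, 3, 5, 2]
agreement = percent_agreement(rater_a, rater_b)
```

Here the raters agree on three of five essays. Perfect inter-rater reliability would show up as 100 percent agreement.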

IQ Scores: Customized standard scores produced by many intelligence tests that typically have a mean of 100 and a standard deviation of 15.

Linear Equating: A process of equating which involves specifying the desired mean and standard deviation of the final distribution ahead of time and using those values to directly calculate new scores.
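
The calculation can be sketched as follows, assuming the observed scores are first standardized with their own mean and standard deviation and then rescaled to the chosen values. The function name and the sample scores are hypothetical.

```python
from statistics import mean, pstdev


def linear_equate(scores, target_mean, target_sd):
    """Rescale scores so the result has the chosen mean and standard deviation."""
    observed_mean = mean(scores)
    observed_sd = pstdev(scores)  # population standard deviation of the observed scores
    if observed_sd == 0:
        raise ValueError("all scores are identical; they cannot be rescaled")
    return [target_mean + target_sd * (x - observed_mean) / observed_sd for x in scores]


# Hypothetical raw scores placed on a scale with mean 100 and standard deviation 15.
raw_scores = [10, 12, 14, 16, 18]
scaled_scores = linear_equate(raw_scores, target_mean=100, target_sd=15)
```

A raw score equal to the observed mean (14 here) lands exactly on the target mean, and the order and relative spacing of scores are preserved.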

Local Norms: Typical test performance for a school district.

Mean: The arithmetic average for a given test that is calculated by summing all of the test scores and dividing the sum by the number of students who took the exam.

National Norms: Typical test performance for the nation.

NCEs: See Normal Curve Equivalents.

Norm Group: A group of test takers with known characteristics such as age and grade meant to provide a norm-referenced comparison for purposes of score interpretation.

Norm-referenced: A score interpretation approach which evaluates student performance by comparing students to each other. Contrasts with criterion-referenced.

Norm-referenced Scoring: A process of scoring that evaluates a test score in relation to the test scores of a norm group. A student's score depends on how well she performs on the test in comparison to how well her peers perform on the test.

Normal Curve: A bell-shaped distribution of scores that is assumed to be universal for all large, uniform populations regardless of what is being tested, as long as the scale used allows scores to vary. Properties of the normal curve serve as the basis for scoring virtually all norm-referenced tests.

Normal Curve Equivalents (also NCEs): A system of ranking test scores by dividing the normal distribution of scores into 99 equal intervals. Scores that fall within the 50th NCE represent average performance, while scores in the 1st and 99th NCEs represent the lowest and highest performing groups, respectively.
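
NCEs are conventionally placed on a scale with a mean of 50 and a standard deviation of about 21.06, which makes NCEs of 1, 50 and 99 line up with percentile ranks of 1, 50 and 99. A minimal sketch under that convention follows; the function names are hypothetical and results are limited to the 1 to 99 range.

```python
from statistics import NormalDist

NCE_SD = 21.06  # conventional NCE standard deviation


def nce_from_z(z):
    """Convert a Z score to an NCE, limited to the 1 to 99 range."""
    return min(99.0, max(1.0, 50 + NCE_SD * z))


def nce_from_percentile_rank(percentile_rank):
    """Convert a percentile rank (strictly between 0 and 100) to an NCE via the normal curve."""
    z = NormalDist().inv_cdf(percentile_rank / 100)
    return nce_from_z(z)
```

Unlike percentile ranks, NCEs are equal-interval units, so the two scales agree only at 1, 50 and 99.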

Norms: Typical test performance of a representative group of test takers.

Number Correct: A common scoring system which awards one point for each question answered correctly.

Objective Scoring: Scoring systems which do not require any expertise or opinion. If a test is computer scorable, it is objectively scored.

Percent Correct: A common scoring system which divides the total points received by the total points possible. That proportion is then multiplied by 100 to get a percentage.
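
Number correct and percent correct can be sketched together for a dichotomously scored test. The answer key, the responses and the function names below are hypothetical.

```python
def number_correct(responses, answer_key):
    """Award one point for each response that matches the answer key."""
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


def percent_correct(points_received, points_possible):
    """Divide points received by points possible, then multiply by 100."""
    if points_possible <= 0:
        raise ValueError("points_possible must be positive")
    return 100 * points_received / points_possible


# Hypothetical five-item multiple-choice test.
answer_key = ["B", "D", "A", "C", "A"]
responses = ["B", "D", "C", "C", "A"]
raw = number_correct(responses, answer_key)
pct = percent_correct(raw, len(answer_key))
```

The student misses only the third item, which gives a number correct of 4 and a percent correct of 80.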

Percentile Rank: The percentage of test scores within a group that fall below a given test score.
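
A minimal sketch of this definition follows, counting only scores strictly below the given score. Some conventions also count half of the tied scores; that variant is not shown. The function name and the group scores are hypothetical.

```python
def percentile_rank(score, group_scores):
    """Percentage of scores in the group that fall below `score`."""
    if not group_scores:
        raise ValueError("group_scores must not be empty")
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)


# Hypothetical scores for a group of ten students.
group = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
rank_of_80 = percentile_rank(80, group)
```

Five of the ten scores fall below 80, so a score of 80 has a percentile rank of 50 in this group.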

Percentiles: Test scores below which a particular percentage of test scores fall. The 75th percentile, for example, is the score below which 75 percent of scores fall.

Raw Score: A test score that has not been standardized or transformed in any way; a raw score is typically the total number of items a student answered correctly on a test.

Representativeness: The degree to which a sample matches the population it is drawn from in important characteristics.

Rubric: An organized set of performance criteria, each associated with a range of point values, often used for scoring performance-based assessments, constructed response items and other forms of supply items.

Sampling Error: The difference between the characteristics of a sample and the population that sample is supposed to represent.

SAT Scores: Customized standard scores with a mean of 500 and a standard deviation of 100 that are used in reference to performance on the SAT college admission and placement examination.

Selection Item: A test item format where the student must select the correct answer from among presented alternatives.

Standard Deviation: A measure of variability that summarizes the typical distance of the scores in a distribution from the mean. It is calculated as the square root of the average squared distance of each score from the mean.
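
In practice the standard deviation is computed as the square root of the average squared deviation from the mean. The sketch below uses the population form, which divides by the number of scores; the function names and the sample scores are hypothetical.

```python
from math import sqrt


def mean_score(scores):
    """Sum all of the scores and divide by the number of scores."""
    return sum(scores) / len(scores)


def standard_deviation(scores):
    """Square root of the average squared deviation from the mean (population form)."""
    m = mean_score(scores)
    return sqrt(sum((x - m) ** 2 for x in scores) / len(scores))


# Hypothetical test scores for eight students.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
m = mean_score(scores)
sd = standard_deviation(scores)
```

For these eight scores the mean is 5 and the standard deviation is 2.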

Standardized Scores: Raw scores that have been converted to a common scale or metric to indicate relative performance and allow for comparison across students or across tests.

Stanines: A system of ranking test scores by dividing the normal distribution of scores into nine intervals. The middle seven intervals are each half a standard deviation wide, and the 1st and 9th are open-ended. Scores that fall within the 5th stanine represent average scores, while scores in the 1st and the 9th stanines represent the lowest and highest performing groups, respectively.
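
A stanine can be assigned from a Z score by using half-standard-deviation cut points centered on the mean, with stanine 1 as the lowest band and stanine 9 as the highest. The sketch below assumes that a score exactly on a cut point goes into the higher stanine, which is a convention chosen here for illustration; the names are hypothetical.

```python
from bisect import bisect_right

# Z-score boundaries between adjacent stanines, half a standard deviation apart.
STANINE_CUTS = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]


def stanine_from_z(z):
    """Return the stanine (1 to 9) for a Z score; 1 is the lowest band, 9 the highest."""
    return bisect_right(STANINE_CUTS, z) + 1
```

A Z score of 0 falls in stanine 5, and Z scores of -2 and 2 fall in stanines 1 and 9.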

Subjective Scoring: Scoring systems which require some expertise or experience. If two people following the same scoring key and instructions might disagree on the correct number of points to assign, then the scoring is subjective.

Subscale: A group of items within a larger test that are all focused on a single area, skill or trait.

Summative Assessment: Assessment which occurs after instruction to provide evidence of learning. Student grades are typically based on summative assessments.

Supply Item: A test item format where answer options are not presented and the student must supply the correct answer.

T Scores: Standardized scores with a mean of 50 and a standard deviation of 10. T scores are calculated by multiplying a Z score by 10 and adding 50 to the product.
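
The conversion is a one-line calculation. The function name below is hypothetical.

```python
def t_score(z):
    """Multiply a Z score by 10 and add 50."""
    return 50 + 10 * z
```

A Z score of 1.5 becomes a T score of 65, and a Z score of -1 becomes a T score of 40.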

Vertical Equating: A process of equating which allows for meaningful comparison on a single test across grades or age ranges.

Z Scores: Standardized scores with a mean of zero and a standard deviation of one. Z scores are calculated by subtracting the mean of the test from an individual raw test score and then dividing this difference by the standard deviation of the test. Z scores are the most common type of standardized score.
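
The calculation described above can be sketched as follows. The function name and the example values are hypothetical.

```python
def z_score(raw_score, test_mean, test_sd):
    """Subtract the test mean from the raw score, then divide by the test standard deviation."""
    if test_sd <= 0:
        raise ValueError("test_sd must be positive")
    return (raw_score - test_mean) / test_sd
```

On a test with a mean of 70 and a standard deviation of 10, a raw score of 85 corresponds to a Z score of 1.5, or one and a half standard deviations above the mean.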


Test Scores and Their Interpretation
