Test Construction Terms related

[Analysis of test construction] Reliability
Reliability is the degree to which an examination consistently measures what it is intended to measure, without measurement error. At present, the most common measure of reliability is Cronbach's α, which estimates the reliability of an examination from the internal consistency of its items. The closer Cronbach's α is to 1, the higher the reliability. The KHPLEI uses Cronbach's α to determine reliability. The formula is as follows.
$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma_i^2}{\sigma_X^2}\right)$$
Here $k$ is the number of items, $\sigma_i^2$ is the variance of the scores on item $i$, and $\sigma_X^2$ is the variance of the examinees' total scores.
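As an illustration only, the standard Cronbach's α computation (k/(k-1) times one minus the ratio of summed item variances to total-score variance) can be sketched in Python. The function name `cronbach_alpha` and the data layout (one row of item scores per examinee) are assumptions made for this example and do not describe the KHPLEI's own software.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix.

    scores: list of rows, one per examinee; each row holds that
    examinee's score on each of the k items (hypothetical layout).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("at least two items are required")
    # Variance of each item's scores across examinees.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of the examinees' total scores.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Four examinees, three items scored 1 (correct) or 0 (incorrect).
example_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(example_scores)  # 0.75 for this data
```

Population variance is used for both the item and total scores; using sample variance for both would give the same ratio and therefore the same α.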
[Analysis of test construction] Multiple choice response analysis
This analysis shows how frequently examinees selected each option in a multiple-choice item. It is conducted to determine how effective the distractors are and whether the correct answer functions as intended.
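A response analysis of this kind amounts to a frequency table per item. The sketch below is a minimal illustration; the function name `option_frequencies`, the five-option label set "ABCDE", and the sample responses are assumptions for the example, not data from the source.

```python
from collections import Counter


def option_frequencies(responses, options="ABCDE"):
    """Return {option: (count, proportion)} for one multiple-choice item.

    responses: the option label chosen by each examinee.
    options: all option labels of the item, so unchosen options show as 0.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    counts = Counter(responses)
    total = len(responses)
    return {opt: (counts.get(opt, 0), counts.get(opt, 0) / total) for opt in options}


# Ten hypothetical examinees answering one item.
table = option_frequencies(["A", "C", "C", "B", "C", "E", "C", "A", "C", "C"])
```

In such a table, a distractor that nobody selects (option D in this sample) is a candidate for revision, since it does not attract any examinees.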
[Analysis of test construction] Item discrimination
Item discrimination indicates the extent to which an item distinguishes between examinees according to their ability. If examinees with high scores answer the item correctly and those with low scores answer it incorrectly, the item has discriminating power. A discrimination index can be estimated from the correlation between the score an examinee has earned on an item and the total score he or she has earned on the test. The closer this index is to 1, the higher the item discrimination. The formula is as follows:
$$r_{XY} = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{N}(X_i-\bar{X})^2\,\sum_{i=1}^{N}(Y_i-\bar{Y})^2}}$$
Here $X_i$ is examinee $i$'s score on the item, $Y_i$ is that examinee's total test score, $\bar{X}$ and $\bar{Y}$ are the corresponding means, and $N$ is the number of examinees.
Another method of calculating item discrimination is to compare the number of examinees with high test scores who answered the item correctly with the number of examinees with low test scores who answered it correctly (Johnson, 1951). To form the two groups, examinees may be split into an upper and a lower group of equal size, or into the top 27% and the bottom 27% by score (Kelley, 1939). The KHPLEI uses the upper 27% and lower 27% groups to measure discrimination. The closer the index is to 1, the higher the discrimination. The NHPLEB calculates the item discrimination index as follows:
$$D = \frac{R_U - R_L}{n}$$
Here $R_U$ and $R_L$ are the numbers of examinees in the upper and lower groups who answered the item correctly, and $n$ is the number of examinees in each group.
According to this formula, when fewer examinees in the upper group than in the lower group give the correct answer, the index is negative (-). Although there is no absolute standard for evaluating items based on the index, Ebel (1965), with reference to the reliability of a test, set the guidelines for evaluating item discrimination as follows.
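Both approaches described above can be sketched in a few lines of Python. This is a hedged illustration: the function names, the rounding of the 27% group size, and the handling of tied scores are assumptions for the example, and the upper-lower index is taken to be the difference in correct answers between the two groups divided by the group size.

```python
import math


def upper_lower_index(item_correct, total_scores, fraction=0.27):
    """Upper-lower discrimination index, (R_U - R_L) / n.

    item_correct: 1 if the examinee answered the item correctly, else 0.
    total_scores: each examinee's total test score, in the same order.
    """
    if len(item_correct) != len(total_scores) or not total_scores:
        raise ValueError("inputs must be non-empty and of equal length")
    # Assumption: group size is 27% of examinees, rounded, and at least 1.
    n = max(1, round(len(total_scores) * fraction))
    # Rank examinees from highest to lowest total (ties keep input order).
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    right_upper = sum(item_correct[i] for i in ranked[:n])
    right_lower = sum(item_correct[i] for i in ranked[-n:])
    return (right_upper - right_lower) / n


def item_total_correlation(item_scores, total_scores):
    """Pearson correlation between item scores and total scores."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need two equal-length inputs with at least 2 values")
    mean_x = sum(item_scores) / n
    mean_y = sum(total_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(item_scores, total_scores))
    sxx = sum((x - mean_x) ** 2 for x in item_scores)
    syy = sum((y - mean_y) ** 2 for y in total_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Ten hypothetical examinees, listed from highest to lowest total score.
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
correct = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
d_index = upper_lower_index(correct, totals)  # (3 - 1) / 3
```

Note that the correlation here uses the total score including the item itself; excluding the item from the total is a common variant that this sketch does not implement.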
[Analysis of test construction] Item difficulty
The KHPLEI uses the percentage of correct answers as the index of how difficult an item is. If all examinees give the correct answer, the percentage of correct answers is 100%; if none of them does, it is 0%. Therefore, the closer the percentage of correct answers is to 100%, the easier the item, and the closer it is to 0%, the more difficult the item. Test construction must take item difficulty into account according to the purpose and subjects of the examination. In general, it is desirable for item difficulties to be evenly dispersed between 10% and 90%. (That is, a suitable mix of high-difficulty and low-difficulty items maintains overall discrimination better than a set of items of similar difficulty.) Depending on the purpose and subjects of an examination, a test may be composed only of low-difficulty items or only of high-difficulty items. However, it is desirable for the item difficulties to form a normal distribution centered around 50% to 60%.
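The percentage-correct index is simple to compute. The following minimal sketch uses hypothetical function names and takes the 10% to 90% range mentioned above as default bounds for a simple check.

```python
def percent_correct(item_correct):
    """Item difficulty as the percentage of examinees answering correctly.

    item_correct: 1 if the examinee answered the item correctly, else 0.
    """
    if not item_correct:
        raise ValueError("item_correct must not be empty")
    return 100.0 * sum(item_correct) / len(item_correct)


def within_general_range(percent, low=10.0, high=90.0):
    """True if the difficulty lies within the 10% to 90% range."""
    return low <= percent <= high


# Four hypothetical examinees, three of whom answered correctly.
difficulty = percent_correct([1, 1, 0, 1])  # 75.0
```

A higher value means an easier item, so an item answered correctly by every examinee scores 100.0 and falls outside the general range.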
[Analysis of test construction] Item analysis
Item analysis is the procedure of analyzing the characteristics of each item in an examination in order to evaluate the quality of the test. It can be divided into qualitative analysis, which concerns content validity, and quantitative analysis, which covers item difficulty, item discrimination, and response analysis.
[Knowledge Level] Problem-solving type
This refers to an item requiring the ability to solve specific problems by drawing on the knowledge that an examinee has at his or her disposal. The problem-solving type usually consists of items that require the ability not only to interpret the information given in the item but also to interpret the meaning or purport of the options listed. It usually relates to diagnosis, treatment, structuring, and judgment using clinical data. These are the most comprehensive items, covering everything from memory, understanding, application, analysis, and synthesis to judgment and decision-making ability at each step.
[Knowledge Level] Interpretation type
This type asks an examinee to respond to a new situation based on his or her complete understanding of acquired knowledge. It requires the examinee to remember certain facts, know the reasons behind them, and then give a new interpretation and express it in another form. This type of item is classified as dealing with the procedures required to handle clinical information and data. For example, it presents data such as a medical history, radiological images, electrocardiograms, and the results of an examination, and asks questions requiring interpretation, identification, analysis, and explanation.
[Knowledge Level] Recall type
This type refers to items that can be answered simply by recollecting a memorized fact. The recall type may be similar to the recognition type, but it asks an examinee to respond to given contents. This type of item tests specialized knowledge such as forms, facts, terms, mechanisms, principles, procedures, sequences, types, classifications, methods, concepts, academic rationale, and theories.
[Type of test construction] Extended matching set type (Type R)
This is a type of multiple-choice question. A Type R item set consists of 1) a theme, 2) a lead-in, 3) an option list, and 4) stems. The number of options may range from 4 to 26. Whereas the five options of a Type A item are used for that item only, a Type R set uses a single list of options for all items in the set.
Ex) Theme: fatigue
Lead-in: For each item, select the indicated number of most likely diagnoses from the 14 options.
Options:
A. Acute leukemia
B. Anemia of chronic disease
C. Congestive heart failure
D. Depression
E. Epstein-Barr virus
F. Folic acid deficiency
G. Glucose 6-phosphate dehydrogenase deficiency
H. Hereditary spherocytosis
I. Hypothyroidism
J. Iron deficiency
K. Lyme disease
L. Microangiopathic hemolytic anemia
M. Miliary tuberculosis
N. Vitamin B12 deficiency
Stems:
1. A 19-year-old woman complains of two weeks of fatigue, fever, and sore throat. She has a fever of 38.3°C, cervical lymphadenopathy, and splenomegaly. The leukocyte count was 5,000/mm³ (80% lymphocytes, most of them atypical). Serum aspartate aminotransferase (AST) was 200 U/L, but serum bilirubin and alkaline phosphatase were normal (select one).
2. A 15-year-old girl complains that she has been bruising easily for the last two weeks and has been feeling severe fatigue with pain in her back. She has widespread bruising, pallor, and tenderness over the vertebrae and both femurs. The hemoglobin concentration was 7.0 g/dL, the leukocyte count 2,000/mm³, and the platelet count 15,000/mm³ (select one).
[Type of test construction] Multiple true-false type (Type K)
This item type presents a stem followed by four or more answer choices. The examinee must select the one combination that consists of the correct answers.
Ex) Which of the following are X-linked recessive diseases?
A. Hemophilia A
B. Cystic fibrosis
C. Duchenne muscular dystrophy
D. Tay-Sachs disease
1) A, B and C   2) A and C   3) B and D   4) D   5) A, B, C and D