# Protected: Matching the Right Questions to the Right Students – Rasch Model for Measurement

**16**
*Thursday*
Apr 2015

Posted How-To

in
**≈ Enter your password to view comments.**

**16**
*Thursday*
Apr 2015

Posted How-To

in
**≈ Enter your password to view comments.**

**15**
*Wednesday*
Apr 2015

**Tags**

**Part 5**

Faculty of Medicine UKM is unique since we have students from various franchises beside our own students. They also have their own unique set of matriculation numbers as demonstrated below;

- UKM – The letter A followed by 6 digits, i.e. AXXXXXX (A123456)
- UNPAD-UKM – Initially the letter G followed by 5 digits. Now changed since the 2012 intake to the letter GA followed by 4 digits i.e. GAXXXX (GA1234).
- AUCMS-UKM – This has gone through 3 changes. Initially the letter M followed by 6 digits (M123456). Then BMD followed by 7 digits (BMD1234567) for 2012 intake but later reduced to 6 digits after the fiasco during end-module examination in 2012. Now that AUCMS no longer exists since January 2015, the students’ matriculation numbers follow the UKM-UNPAD numbering system; the letter GA followed by 4 digits i.e. GAXXXX (GA1234).

Weirdly enough, the OMR forms does not support all these variety of matriculation numbering system. The OBA forms only allow for a single alphabet, followed by 6 numbers (i.e. A123456).

The EMI forms only allow for 6 digits and no alphabet (i.e.123456).

So the poor students started having problems filling up the OMR forms. For the UKM-AUCMS students with BMD prefix, they have been told to use only the M prefix.

For the UKM-UNPAD students, they have been told to use only the G prefix and to add two zeros before their own digits i.e. G001234. Instead some of them had been using the A prefix or omitting from marking any of the letters at all. As for the numbers, instead of adding the zeros before their numbers, a few had been adding the zeros after their own matric numbers, i.e. G012340 or G123400.

This will cause havoc for the course/posting coordinators who are trying to tabulate and merge the marks from the various components of the examination. When you have 171 students per posting, it is very hard to detect the mistakes visually, therefore we need a good system and help in highlighting the errors.

This blog post aim to do just that, to demonstrate one such method that will help in detecting the errors.

**Caveat (Warning)**

**The students marks are confidential, therefore the data used in the demonstration here are NOT REAL MARKS! Therefore do not try to postulate anything from these marks or discussion. These are just dummy data.**

In the following demonstration, we will will show you on how to merge the OBA and EMI marks.

**Data Preparation**

Both OBA & EMI data must be sorted in ascending order, taking into account of the alphabets. Since the alphabets will distort the sorting, we have to remove the letters in the OBA matric numbers, so that it will match the matric numbers for EMI marks.

So we first duplicate the matric number column by using the the copy paste command.

Now we select the column B.

Then we remove the alphabets in column B by using the **Find & Replace** command by pressing **Ctrl & F** or **Shift & F5**. First we remove the letter B, followed by H and P.

**Merging the Two Datasets; OBA & EMI**

Now the OBA matric numbers are similar as those matric numbers in EMI. Now we can proceed merging these two data sets.

First we sort the OBA data using column B.

Then we sort the EMI data using the matric numbers.

Now we copy and paste the EMI data into the OBA spreadsheet.

Immediately we noticed the mismatched matric numbers since the students were using different matric numbers for OBA & EMI. The zeros were in a different place for both examinations i.e. OBA using B001170 versus EMI using 011700.

To help identify all the mismatched matric numbers, in cell F2, we will type “=D2-B2”. Then “drag and fill” the rest of the column. If the matric numbers do not match, then the value will not be zero. Then we we can highlight the respective cells that do not match.

Then we start from the top. We will cut out or insert cells to match the matric numbers.

After all that hard work, we are rewarded with the following scenario; all the matric numbers matched except for these 7 matric numbers.

Then we can deduce which marks belong to whom, based on the matric numbers arrangement.

Once done, just sort the data again in ascending order by the matric numbers so that you can merge it with the other marks from the examination i.e. KFQ & OSCE.

Personally I would also check the alphabets. All 7 guilty of entering the wrong matric number seems to have similar number of digits. Therefore they should share the same alphabet.

**Conclusion**

The medical students may be the crème de la crème of the Malaysian education system, but some of them are not bright enough to mark their own matriculation number correctly. Sometimes you wonder how come they get to do medicine?

**06**
*Monday*
Apr 2015

Posted How-To

in**Part 3**

This article is the third in this series of assessing validity & reliability of examination questions. The first two were published earlier last week in the same blog. Kindly read these before going through this blog article;

- Difficulty Index, Discrimination Index and Reliability in UKM OMR Report
- Calculating Difficulty Index, Discrimination Index, Reliability & S.E.M.

The reliability of our examination paper as a measuring instrument is crucial, if we want a good and valid instrument to measure our students’ knowledge and comprehension. However how sure are we, that we are measuring what we want to be measured? Bear in mind, that with this “measurement”, we are actually making some conclusion which may affect someone’s future and we should be responsible enough in exercising due care and diligence. The instrument must have a certain level of difficulty and able to discriminate between the good and poor students. The percentage of correct responses would vary according to each item’s difficulty. The proportion of good or poor students who would respond correctly determine the item’s level of discrimination.

In the earlier articles, we learnt how to calculate the item’s Difficulty Index and Discrimination Index. Then we also show how to determine the reliability of the questions. Based on the results of these calculations, we can determine whether we have been fair to the students or not.

There should be a fair amount of easy, moderate and hard questions. At the same time, there should be no questions with negative or zero discrimination index. If there is any questions with negative or zero discrimination index, we need to check the answer key of that question. A zero or negative index indicates either that topic was not taught at all or has the wrong key.

In terms of KR20 reliability, the value should be 0.7 or more. Whenever our postgraduate students conduct studies, we expect their questionnaire to have a reliability of 0.7 or higher, therefore we as lecturers should adhere to the same standard. But for the last 20 years, the reliability of our examination questions were 0, as reported in the UKM OMR analyses, yet nobody raised any red flags, not even by the MQA and the ISO auditors. I guess in this scenario, “Ignorance (among the auditors) is bliss.”

Therefore any examination, especially those examinations where more than half of the students failed due to the theory paper, should be rigorously examined using the above principles. **In summary these are what every module/posting coordinator should do at every examination for the OMR report;**

- The number of correct answers, wrong answers and blank answers should tally with the number of questions.
- There should be a fair amount of easy, moderate and hard questions. If the difficulty index is less than 10%, please check the answer key for that question.
- There should be no questions with negative or zero discrimination index. If present, please check the answer key for that question.
- The reliability index should be larger than 0.7.
- The standard error of measurement should be very small.

SEM is directly related to the reliability of a test; that is, the larger the SEM, the lower the reliability of the test and the less precision there is in the measures taken and scores obtained. In the earlier example, the SEM was smaller at 1.95 than the standard deviation at 5.64 since it was highly reliable at 0.88. SEM should not be as big as 18.37 for a 30 questions MCQ as in the above OMR report.

Of course the module/posting coordinator should also tally up all the marks from the various components correctly. Always have someone with fresh eyes to check the calculations and formulas within the spreadsheet for the final marks.

Just to illustrate this point, allow me share this tale which happened way back in 2007. However details such as name and places were removed to protect the innocent (and the guilty).

**A Tale of Two OMR Marks**

Once upon a time in September 2007, there was an examination for a clinical posting. It was a terrible time since 82 (70.1%) of 117 students failed their posting. They failed because of very poor marks for their MCQs. A total of 115 (98.3%) failed their MCQs. Only 2 students passed MCQ, both had 55% marks (23 correct out of 40 questions).

Various excuses were given for the high failure rate, the most often repeated excuse was the allegedly poor attitude of the students towards the posting. Fortunately the leadership of the faculty at that time had the foresight to order a post-mortem.

Since the high failure rate was for the MCQs, attention was given to the OMR report. Not even 5 minutes into the post-mortem meeting, the fault was quickly discovered. Although there were 40 MCQ questions, students were only given marks for the first 24 MCQ questions. The marks for the other 16 questions were ignored by the OMR machine. The MCQs were quickly scanned again at a neutral site and the new OMR marks were carefully inserted in place of the old marks.

Alhamdulillah! Praise Allah! Suddenly almost everyone passed that clinical posting. The fault was human error while programming and scanning the OMR sheets. It was a happy ending, and the students lived happily ever after. Except for the Dean of course, who had to explain to the Senate about the whole thing 😉 but he was okay after that.

**Conclusion**

It disturbs me that whenever we have more than 50% failure rate for a clinical posting in theory examination, we blamed it on the allegedly poor attitude of the students towards the posting. Yet no evidence was offered to back it up. At the very least, the affected department should conduct due diligence on the examination and marking process. The current leadership should order a post-mortem since our main business is teaching students, not failing them. When many students failed badly, the teachers should reflect and examine the examination process. No departments should be above scrutiny, all departments should be treated equally, regardless of who are their members.

We are not teaching morons or retards. We are teaching the cream of the cream (crème de la crème) of the Malaysian education system, filtered through the sieve of excellence during the preclinical years. Therefore if we want to blame the students, we must make sure that the blame do not originally lie with us.

**“Jangan sampai pisang berbuah dua kali”.**

**Caveat (warning);**

Tale = “a fictitious narrative or story, especially one that is imaginatively recounted.”

<- Part 1 Understanding the UKM OMR Report

<- Part 2 Calculating OMR Indexes

Part 4 -> Matching the right questions to the right students.

**Update 9th April 2015;**

The OMR software has been rectified for KCKL & PPUKM. It is now reporting the reliability and standard error of measurement correctly.

**02**
*Thursday*
Apr 2015

Posted How-To

in**Part 2**

In the previous article, I explained the usefulness of the OMR report as generated by the UKM OMR machine software. But not all places have such facilities and support from the IT Department. Therefore what if we want to do the same kind of analysis on our own? In this blog post, I will explain how to do such analysis just by using a spreadsheet software such as Microsoft Excel.

Whenever we are conducting an examination for our students, we are basically conducting a study to measure their knowledge and understanding of the subject matter at hand. Using the prepared questions, we can measure their knowledge, but is the resulting scores a true reflection of their comprehension?

Therefore we need to measure the reliability and the validity of the questions being asked. To do that, we need to measure the following;

- Difficulty Index
- Discrimination Index
- Reliability Index (Kuder and Richardson Formula 20 Reliability Index)
- Standard Error of Measurement

**Getting the Raw Data for Analysis**

Basically we have to enter all answers of each students into the spreadsheet. If there were a lot of questions and a large number of students, it would be difficult to enter the answers for each questions and each students into the spreadsheet. So if possible, beg, borrow or steal an OMR machine and scan the students’ answers into a text file.

Then convert the text file into Excel.

If you can’t get hold of an OMR machine, then consider having an online examination instead using Google Form, which will generate an Excel file for you at the end of the examination. Otherwise start typing in the data into Excel yourself, as illustrated above.

Then convert their answers into Correct and Incorrect Answers. Code 1 for correct answers, code 0 for wrong answers, as illustrated below.

Use the “Autosum” command to generate the total marks for each students. Do the top one first, then “drag and Autofill” for the rest of the column. If you don’t understand that, get your son or daughter to demonstrate that for you 🙂 .

Then sort the data using the “Sort Descending” command, so that the highest score is at the top and the lowest score is at the bottom.

**Difficulty Index (D)**

**D = students with correct answer x 100**

**total students**

Using the Excel table prepared earlier, go to the bottom of the first question column (cell B24) and calculate the Difficulty Index using the formula such as this; “**=SUM(B2:B23)/COUNT(B2:B23)*100**“. Then “drag & Autofill” for the rest of the row till the very last question.

Then calculate the **Average Difficulty Index **by typing command such as “**=AVERAGE(B24:AE24)**” at the very last column (cell AF24), as what is illustrated below;

**Discrimination Index (R)**

R = (H – L)

27% of Total

H = number of correct answers from top 27% of students

L = number of corrects answers from bottom 27% of students

Since we have 22 students in this example, 27% out of 22 = 6 students. So we will have to take the number of correct answers from 6 of the top students (H) and deduct the number of correct answers from 6 of the bottom students (L), then divide it by 6.

Using the Excel Table prepared earlier, go to the bottom of the first question column (cell B25) and calculate the Discrimination Index using the formula such as this; “**=(SUM(B2:B7))-(SUM(B18:B23)))/6**“. Then “drag & Autofill” for the rest of the row till the very last question.

Then calculate the **Average Discrimination Index **by typing command such as “**=AVERAGE(B25:AE25)**” at the very last column (cell AF25), as what is illustrated below;

**Transposing the Difficulty and Discrimination Index for Analysis**

We have calculated the Difficulty and Discrimination Index for all 30 questions. Now we need to categorise them and prepare a frequency table. In its current incarnation, that would be difficult. So we need to transpose the current table to facilitate analysis. This is how we do it;

The resulting transposed sheet would look like this, which could be imported into SPSS and analysed.

Questions with low difficulty indexes (below 30) and zero/negative discrimination indexes should be checked again. Either the answer keys were wrong or the questions were poorly prepared.

**Kuder and Richardson Formula 20 Reliability Index**

This index checks the internal consistency of measurements with dichotomous choices (i.e. Correct versus Incorrect). A correct question scores 1 and an incorrect question scores 0. Value of this index ranges from 0 to 1. A high value indicates reliability, while too high a value (in excess of .90) indicates a homogeneous test.

*k*= number of questions- Sum of all PQ’s (p = probability of correct answer, q = probability of wrong answer)
- σ2 = variance of the total scores of all the people taking the test.

Therefore to calculate the reliability index, you need to calculate the following items first;

- p of each questions (p = probability of correct answer)
- q of each questions (q = probability of wrong answer)
- Sum of all PQ’s
- σ2 = variance of the total scores of all the people taking the test.

Sum of all PQ’s

Using the Excel Table prepared earlier, go to the bottom of the first question column (cell B26) and calculate the **P** using the formula such as this; “**=SUM(B2:B23)/COUNT(B2:B23)**“. Then “drag & Autofill” for the rest of the row till the very last question.

Using the Excel Table prepared earlier, go to the bottom of the first question column (cell B27) and calculate the **Q** using the formula such as this; “**=1-B26**“. Then “drag & Autofill” for the rest of the row till the very last question.

Using the Excel Table prepared earlier, go to the bottom of the first question column (cell B28) and calculate the **PQ** using the formula such as this; “**=B26*B27**“. Then “drag & Autofill” for the rest of the row till the very last question.

Then calculate the **Sum of All PQ’s **by typing command such as “**=SUM(B28:AE28)**” at the very last column (cell AF28), as what is illustrated below;

Variance

Please take note that a correct question scores 1 and an incorrect question scores 0. Therefore if you have 30 questions, the minimum possible score is 0 and the maximum possible score is 30. On the OMR machine report, these are reported as the raw scores, not the final marks that is based on 100%.

Variance is calculated by deducting the mean from each observed value and then squaring it. The sum of the squares were then divided with the number of observed value minus one.

Putting it all together to calculate reliability

*k*= 30- Sum of all PQ’s = 4.75
- σ2 = 31.83

Therefore **rho KR20** = (30/29)*(1 – (4.75/31.83)) **= 0.88.**

Since rho = 0.88, therefore the questions are highly reliable.

**Standard Error of Measurement**

**Standard error of measurement is related to test reliability** in that it provides an indication of the dispersion of the measurement errors when you are trying to estimate students’ true scores from their observed test scores.

The S.E.M. formula relies on the correct value for reliability. Therefore if the reliability value is wrong, your S.E.M. would be wrong too.

**Standard Error of Measurement** = Standard deviation * SQRT (1 – Reliability)

**Standard deviation** “provides a sort of average of the differences of all scores from the mean.” This means that it is a measure of the dispersion of scores around the mean. Variance is 31.83 (refer above), therefore since standard deviation is the square root of variance, square root of 31.83 is **5.64**.

Reliability here refers to the Kuder and Richardson Formula 20 Reliability Index as calculated above. The value calculated above was **0.88**.

**Standard Error of Measurement** = 5.64 * SQRT (1 – 0.88) = 5.64 * 0.35 **= 1.95**

So it is **1.95.**

**Conclusion**

If you don’t have the OMR machine, you can still calculate all the required indexes using just Microsoft Excel.

<- Part 1 Understanding the UKM OMR Report

Part 3 -> Implication of a High Failure Rate.

**01**
*Wednesday*
Apr 2015

Posted How-To

in**Part 1**

Working in an established university like Universiti Kebangsaan Malaysia has its advantages. Whenever I send in the students’ OMR (Optical Mark Recognition) forms, after a few days, I would get their marks delivered by email along with a hard copy of the OMR analysis. The report is usually about 11 pages long and is usually discarded by the younger lecturers since they have no idea what it is all about.

To me, what is important is not only the students’ marks. Instead when I go through it, it is to make sure that the marking was done correctly by the OMR machine, and it was done using the correct answer keys.

To make sure the marking was done correctly, I would refer to the following page;

I would look at the number of correct answers, wrong answers and blank answers. Their sum should be the same as the total of questions in the examination papers. In this example, there were 25 correct answers and 5 wrong answers, giving a total of 30, the same as the number of questions.

As for the answer keys, I would check the questions difficulty index and discrimination index page, as illustrated below.

I would quickly identify those questions with

- difficulty index of 30 and below or
- negative discrimination index or less than 0.15

and immediately refer back to the original questions, to ensure whether the keys given were correct or not. This had saved me from making mistakes many times in the past. Remember that we are mere human beings therefore not free from errors. Such technology would help prevent us from making mistakes, therefore make full use of it.

The report also gives the following indexes;

- Average Difficulty Index – this average is calculated using all the questions’ Difficulty Index
- Average Discrimination Index – this average is calculated using all the questions’ Discrimination Index
- Kuder and Richardson Formula 20 Reliability – Values range from 0 to 1. A high value indicates reliability, while too high a value (in excess of .90) indicates a homogeneous test.

The above values must be included in the final examination report. However some of the values **calculated automatically by this OMR software are WRONG!** I am rather surprised that this mistake has never been rectified, since UKM has a large number of esteemed statisticians. It is rather embarrassing, hopefully it will be rectified soon.

**How to Calculate Kuder and Richardson Formula 20 Reliability Index**

The formula to calculate the rho KR20 formula is as illustrated below;

*k*= number of questions- SUM PQ = 4.77 (p = probability correct answer, q = probability wrong answer)
- σ2 = variance of the total scores of all the people taking the test. The answer given by the software is 337.54, which is totally wrong. The correct answer is 31.83.

Therefore **rho KR20** = (30/29)*(1 – (4.77/31.83)) **= 0.88**, not 0 as in the above report. If rho = 0.88, therefore high reliability of the questions.

The mistake of the OMR software calculating the wrong value for reliability lies in calculating the wrong value for variance. The formula that the OMR software used is also rather confusing due to the missing syntax for brackets.

**Other Errors – Standard Deviation & S.E. of Measurement.**

Variance is 31.83, therefore since standard deviation is the square root of variance, square root of 31.83 is 5.64. **Standard deviation is 5.64, NOT 18.37** as in the OMR report.

The other mistake is the **Standard Error of Measurement**. Standard Error of Measurement is **not the same as Standard Deviation**, which is the impression that you will get from the above OMR report since their values in the report are the same.

Standard deviation “provides a sort of average of the differences of all scores from the mean.” This means that it is a measure of the dispersion of scores around the mean.

Whereas **standard error of measurement is related to test reliability** in that it provides an indication of the dispersion of the measurement errors when you are trying to estimate students’ true scores from their observed test scores.

The S.E.M. formula relies on the correct value for reliability. Therefore if the reliability value is wrong, your S.E.M. would be wrong too.

Standard Error of Measurement = Standard deviation * SQRT (1 – Reliability)

**Standard Error of Measurement** = 5.64 * SQRT (1 – 0.88) = 5.64 * 0.35 **= 1.95**

So it is supposed to be **1.95**, not 18.37.

**Conclusion**

Please go through the OMR report after every examination. It will help you in maintaining the high quality of your examination. However please take note of it’s limitation and mistakes. Please be aware that the following values in the OMR report are **WRONG!**

- Reliability
- Variance
- Standard Error of Measurement
- Standard Deviation

Part 2 -> Calculating Difficulty Index, Discrimination Index, Reliability & SEM

**Update 9th April 2015;**

The OMR software has been rectified for KCKL & PPUKM. It is now reporting the reliability and standard error of measurement correctly.