Microsoft is Trying NOT to Trick You: Question Clarity

Liberty Munson (Microsoft)

Well, I think I've hit on a topic that really resonates with my loyal followers. Your comments to my recent blog post, "Microsoft is NOT Trying to Trick You," have provided me with some interesting insight into your exam experiences. As one poster so aptly put it, I really should have said "Microsoft is Trying NOT to Trick You." Clearly, many of you have seen content that you feel is tricky, even if we don't intend it to be. Based on your comments, I've identified three key themes that I'd like to address:

1) Question clarity; poorly written questions

2) Out of scope content

3) Ongoing sustainment of exams (i.e., you tell us something is broken, why does it take so long to fix?)

Of course, if there's a theme that I'm missing, let me know. I'm happy to share with you what I can about how things work at Microsoft.

I have a lot to say on each of these topics, so I will blog about them separately... a series of blogs on 'trickiness,' if you will. I'm going to start with the "question clarity/poorly written questions" issue for three reasons: it was the most frequent theme in the comments on my previous post; it has come up in many other places, including responses to other posts, other blogs, and social media sites (so you clearly have some thoughts about this!); and it is mentioned quite often in the comments you provide at the end of the exam.

So, let's start with the item writing process. Microsoft has specific requirements related to wording, phrasing, and sentence structure. Much of this is related to keeping sentences and words simple for localization. Unfortunately, this can result in weird sentence structures in English. For example, imagine you have to consider both X and Y when coming up with a solution. If we were writing this strictly for ENU (U.S. English) candidates, we'd probably write it like this: "consider X and Y," but conjunctions make sentences grammatically complex, which can be problematic when localizing. As a result, we end up stating this as two separate sentences. This can feel awkward to English-speaking candidates, and awkward sentence structures are perceived as unclear.

Another factor that comes into play here is that questions can be perceived as unclear because candidates feel like we have not provided sufficient information to answer the item correctly. However, we simply cannot lay out every nuance of the scenario described in the question--we already get complaints that our items are too wordy (which several of you mentioned in your comments!). All exam items, not just Microsoft's, require some level of inference that is appropriate to the target audience and level of knowledge/skill/ability being assessed (fewer inferences are expected for knowledge-based items, while more inferences are expected in application types of questions). We expect candidates who are truly qualified to make appropriate inferences based on their knowledge, skills, and experiences with the technology; after all, you do this in real life. When I discuss items with subject matter experts (SMEs), I emphasize that some level of inference is appropriate. If you think that what is required in an item is unclear because information is missing, ask yourself "what information is missing?" Based on the information that we provide, what logical inference would you make about that missing information that would help you answer the question? If you're qualified and have the experience detailed in the audience profile, that inference should be the "correct" one. Make it, and answer the question accordingly. If you conclude that no logical inference can be made, then we may not have provided sufficient information... let us know in your comments.

Most important, though, DON'T overthink the question. I think many overqualified candidates talk themselves out of the correct answer because they read more into the question than what's there. The inferences we expect are the ones that someone in the target audience would make, not someone who is very qualified like you are :).

Although we work closely with our SMEs in item development and during the technical review of the questions, undoubtedly questions that are not as clear as we think they are will get through the process. I identify them in two ways. First, I can actually identify them through their psychometric performance (bet you didn't know that!). I have found one statistic, in particular, that often sheds light on confusing questions; when it falls below my psychometric guideline for acceptable performance, I often learn through conversations with SMEs that the question is unclear. This statistic doesn't always mean that the question is unclear, but it's a good indicator that it might be. Second, as you can tell from the comments to my previous post, some of you take the time to let me know--through your end of exam comments--when an item is unclear. In both cases, we discuss these items with SMEs and get their feedback on the clarity issue. In many cases, the lack of clarity stems from one of the two issues that I discussed above. We will try to reword unclear items when we can to address these issues if the SMEs believe it's necessary and possible, but sometimes, they don't agree with either the statistics or comments about the question's lack of clarity. When that happens, I will keep the item on the exam but monitor both its psychometric performance and your comments... if these issues continue (you'd be surprised that they can work themselves out), then I will remove the item from the item pool.

You see, I have way too much to say on this topic! Stay tuned for the next installment, which will focus on "out of scope content." I can hardly wait to see what you have to say about that one!

Comments
  • Tim Lorge
    |

    Regarding the overthinking and assuming facts not in the question, I have been guilty of this in the past, and will be in the future, so I do appreciate the reassurance.

    However, as one who openly and hotly debates grammar usage with a Chicago Manual of Style in hand, translation may be the excuse but it certainly isn’t acceptable.

    For starters, localizing means translating; so let’s be clear on that. And, there is nothing wrong with that except when someone is cheated in the process. For those of us who get the awkwardly worded sentences as a result of translation either way, we’re the ones who are cheated.

    The way this post is worded indicates that, for the test takers in the US English speaking world, the test creators don’t consider sentence structure important when creating questions. We the test takers just have to deal with it. The fact of the matter is, as award winning journalist P.J. O’Brien, and my grandfather, said back in the 1930s, “There’s nothing wrong in a confusing sentence that a properly placed period and a couple extra words couldn’t fix.”

    If one doesn’t want to do that, it is, with all due respect, simply laziness.

    If no one wants to put the time in to write a sentence that is grammatically correct, it suggests, for some unknown or unstated reason, the test creators are somehow prohibited, unable or unwilling to reword the questions so they aren’t awkward.

    That highly offends me; that the language isn’t considered important when it is all that is there … in a question.

    This is larger than a certification test. The test creators are a purveyor of language. It is their duty to get it right and not take shortcuts because they are inconvenient.

    As a linguist, professional wordsmith, if you will, I understand that translations don’t always occur neatly for the translator. However, it is the translator’s job to make it neat for the listener or reader. Frankly, I don’t care how hard someone else perceives their own job to be. Their only job is to take foreign languages and decode, decipher or otherwise convert it to the native language of the listener, reader or, in this case, the test taker.

  • Liberty Munson (Microsoft)
    |

    Hi Tim,

    I agree with much of what you're saying, and I fear in my attempt to briefly describe the two key factors impacting item clarity, I may have oversimplified the localization considerations that we incorporate into our item writing process.

    We have more than 600 versions of our exams in market if you consider the number of languages into which we localize (roughly 125 exams into 5 languages on average). As a result, we have to maximize our budget, time, and people through a delicate balancing act, especially if we want to have the localized versions of exams in market at about the same time that the ENU version is available. In addition, we have to consider reading level of the typical ENU candidate, which is probably lower than most would expect, as well as that of the typical English as a second language candidate. All of these factors play a role in our item writing guidelines related to sentence and word complexity. In an ideal world, we could apply the principles that you describe below in every situation for every exam. Given our limited resources, we do the best we can with what we have to get as close to those ideals as possible (although, admittedly, we do this better at times than at other times, which is why our psychometric analysis and your comments are so important at addressing clarity issues not only in English but in our localized versions).

  • Thorsten Butz
    |

    Liberty,

    it would be helpful if you could clarify a single, but very annoying issue:

    there are lots of Drag'n Drop questions, where the "correct order" is relevant. Very often these questions do NOT have a SINGLE valid answer. More than one order can solve the subject of the question.

    I would like to know if the exam software is at all capable of accepting more than 1 valid answer in these cases.

    It's just one issue, but it's very irritating to see this kind of imprecise question.

    Thorsten

  • KevinM
    |

    I certainly agree with Thorsten. I just took the 70-414 again yesterday and I ran into several "drag and drop" questions that fit that description. Incidentally, I recall commenting on some of these questions to that effect when I took the 414 the first time...back when it was in beta!

    Which leads me to my next question: I know that an exam may go a year or more between reviews, but when are you going to get around to reviewing the 414 again? I've taken over 40 exams in the past 7 years from multiple companies and I have literally NEVER seen an exam with so many ambiguous and poorly written questions on it. I've taken this exam a couple of times dating back to the beta, and I've commented on these questions every time, yet nothing has changed. It literally is to the point that yesterday I was thinking to myself "hey look, there's that same poorly written question again!"

  • soder
    |

    We are talking about a behemoth company like Microsoft. The only reason this blog is running is to calm down obviously angry people, but personally I have never ever seen any of the issues presented here fixed on any of the exams I have taken in the past 7 years. Period. Nice to have this blog, nice to read articles about the issues. But have no unrealistic expectations: the resolution will stop here. No progress has been promised in this blog, they just talk. If a vendor is not aware of such issues, you can at least forgive the vendor: they are so stupid that the issue reports don't even reach them. However, in this case, MS is aware of the issues. But they don't act on them in a timely manner. Now that makes people angry, and that's actually understandable.

    Slightly related: most of the MS Press books have gone steadily downhill in quality since the 2008 family. New releases come so often (every 2-3 years) that by the time, if ever, the books are revised, the new product family is about to come. So there is in fact no interest from the vendor (MS) in fixing the issues. The corporate communication is always the same: "Just jump to the next release, it will have the previous issues fixed." Yes, some are. But you will get a LOT of new ones. And the circle goes over and over and over.

    Welcome to the IT asylum run by MS.

  • perisa_n
    |

    Adding to what Thorsten says about correct order: suppose, for example, the exam asked you to list, in order, the steps to compact the DHCP database.

    According to this link: technet.microsoft.com/.../hh875589.aspx these are the steps:

    CD %SYSTEMROOT%\SYSTEM32\DHCP

    NET STOP DHCPSERVER

    JETPACK DHCP.MDB TMP.MDB

    NET START DHCPSERVER

    So if on the exam I put NET STOP DHCPSERVER as the first step, would I be wrong?

    I sat the 70-417 exam recently and saw a reference to Windows Server Enterprise 2012, and it was very important to the exam question at hand. That version doesn't even exist.

  • soder
    |

    perisa_n: just for your information, you violated the MS Non-disclosure agreement (NDA) when you posted this exam question example to the public internet. So from now on MS has all the rights to revoke all your MS exams and your earned credentials. Living in a happy world controlled by multinational company interest!

  • Tim Lorge
    |

    Liberty, I totally understand where you're coming from. From your perspective, you have to manage the entire worldwide program.

    For those of us in the trenches, though, the focus is much, much smaller in terms of the tests. It's the one question we didn't answer correctly that perhaps prevents us from getting a raise, a new job or, for trainers, being able to teach another series of courses (which could lead to more money).

    I am not suggesting for a moment that you don't get that. If anything the fact that you've taken the time to explain the process to us allows us to have the conversation. I truly appreciate that.

    I also appreciate the statistical review of the questions. While it doesn't help those who take the tests earlier, those who take it on down the line will see an improved test.

    I read today John Deardurff's article about his testing success and failures. The cynics will say that article is conveniently timed or otherwise contrived; pay them no mind.

    For those who haven't read it, John is an MCT and discusses his attempts to obtain the MCSA SQL Server 2012 certification. Here is someone who knows his stuff. He has previously taken other SQL Certs and he's a trainer. Trainers know the material backwards and forwards. John, with all he knows, still had a problem.  It is comforting to know that we aren't alone in our pursuits of IT greatness.

    To that end, I think many of us in this business are highly competitive. Whether that competition is with colleagues or one's self doesn't matter. It is a source of pride when we can say "Yeah, I passed it first time around".

    However, it is also important to note that were the tests easy, that "I passed the first time" claim would be meaningless.

    All that said, I think there are some legitimate concerns based on the language of some questions and, in the examples cited, the type of questions.

    I, for one, have always disliked "best answer" questions simply because "best" is a subjective term and really based on "how" someone learned something. Book "A" has it this way; Book "B" that way.

    Using perisa_n's post as an example, there is nothing wrong with stopping the DHCP server first. It is debatable as to whether or not that is "the best".

    So, in scoring those types of questions, are they weighted? Is it possible to explain how the scoring process works? Thank you for all your hard work.

  • Veronica Sopher - Microsoft
    |

    @Tim - Liberty's and John's posts were written independently. Our bloggers here are all brilliant and have strong voices, which we love. I couldn't coordinate them that closely even if I tried. ;)

  • John Deardurff (MCT Regional Lead - Eastern US)

    Full Disclosure: I did not see Liberty's post until Tim tagged me in his comment. There was no collusion between our posts. Having said that... I am guilty of overthinking questions and have to remind myself at times not to read too much into the questions. On any exam I've failed, either I was not prepared or I outguessed myself. (With the exception of the Exchange 2013 exams... On those, I think Microsoft is out to get me and wrote all the questions for the sole purpose of making me fail.)

  • Tim Lorge
    |

    @ Veronica & @ John YIKES! Sorry for that bit of confusion.

    I never said that there was any coordination nor was I making any accusation of that. I said the cynics would say that there was coordination.

    John, I thought that your post dove-tailed quite nicely into the discussion.

    Peace! :o)

  • Veronica Sopher - Microsoft
    |

    @Tim - No worries. I posted the reply in case any cynic was reading. ;)

  • Liberty Munson (Microsoft)
    |

    It's starting to look like I might have a few semi related blog posts to address some of these comments. :)

    First, when multiple paths are available to solve a problem, you receive credit if you provide any of those paths. We are adding language to our questions that clarifies that. Second, we have been reviewing the performance and comments related to all of our question types and will be implementing requirements for our item writers to provide more context so that candidates have sufficient information to determine what "best" is in that situation and how we're defining "best" (e.g., least administrative effort, least cost, quickest to implement, etc.).
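    To make the first point concrete, here is a minimal sketch of how an ordering question could be scored when more than one sequence is acceptable. It is purely illustrative: the function name `score_ordering`, the answer-key layout, the all-or-nothing scoring, and the choice of which sequences count as valid are assumptions made for this example, not a description of how the exam software works or of any real answer key. The commands are the DHCP compaction steps quoted earlier in this thread.

```python
from typing import Iterable, Sequence


def _normalize(steps: Sequence[str]) -> tuple:
    """Trim whitespace and ignore case so cosmetic differences don't matter."""
    return tuple(step.strip().upper() for step in steps)


def score_ordering(response: Sequence[str],
                   accepted_orders: Iterable[Sequence[str]]) -> int:
    """Return 1 if the response matches ANY accepted order, otherwise 0.

    Hypothetical all-or-nothing scoring; a real exam engine may differ.
    """
    accepted = {_normalize(order) for order in accepted_orders}
    return 1 if _normalize(response) in accepted else 0


# Hypothetical answer key: changing directory and stopping the service
# are treated as interchangeable first steps (an assumption for illustration).
ACCEPTED_ORDERS = [
    [r"CD %SYSTEMROOT%\SYSTEM32\DHCP", "NET STOP DHCPSERVER",
     "JETPACK DHCP.MDB TMP.MDB", "NET START DHCPSERVER"],
    ["NET STOP DHCPSERVER", r"CD %SYSTEMROOT%\SYSTEM32\DHCP",
     "JETPACK DHCP.MDB TMP.MDB", "NET START DHCPSERVER"],
]

if __name__ == "__main__":
    # A candidate who stops the service first still earns credit.
    print(score_ordering(ACCEPTED_ORDERS[1], ACCEPTED_ORDERS))
```

    Storing the key as a set of complete sequences keeps the check simple; a key with many interchangeable steps might be easier to express as ordering constraints instead, but that is beyond this sketch.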

    I'll keep my eye on your comments to see if this type of information is worth another/separate blog.

  • Thorsten Butz
    |

    <quote>"First, when multiple paths are available to solve a problem, you receive credit if you provide any of those paths"</quote>

    Thanks for the reply, Liberty. I never saw a remark along the lines of: "Order the tasks in ANY correct order."

    So it was clear to me, that the test suite will only recognize 1 distinct answer.

    @MSL: Wouldn't it be necessary to flag this kind of "ambiguous" question during the test to make it recognizable to the student? I took my first exam in 1999 and I have never seen anything like this.

    Thorsten

  • campbell_kerr
    |

    I had a question with 4 available answers, but A and B were the same (different wording), and C and D were the same (different wording).
