True-False (Yes-No) items ask test-takers to decide whether a statement is correct or incorrect. In language testing, they are used to assess listening and reading comprehension: test-takers judge whether each of a set of statements based on a text is true or false. Because the choice is between only two options, this item type is, in effect, a two-option multiple-choice format. Hence its biggest problem: test-takers have a 50-50 chance of answering each item correctly by guessing.

One way around the problem is to introduce a third option, Not Given (Doesn’t Say). However, Not Given items are problematic for several reasons. Firstly, the difference between Not Given and False is often not obvious and can confuse test-takers. Secondly, the difference between Not Given and True can also be vague: test-takers might believe that the item asks for inferencing from the text (working out something that is not stated but can be guessed from the information provided) and mark it as True. Writing high-quality Not Given items is very difficult, even for trained and experienced item writers. Because Not Given items often confuse test-takers, test-takers should be given practice with this item format before it is used to assess them.
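The guessing problem described above can be put into numbers. The following is a minimal sketch, not part of the original discussion: it assumes a hypothetical 10-item task with a pass mark of 6, assumes that test-takers guess blindly and independently on every item, and uses the binomial distribution to compare a two-option format (True/False) with a three-option format (True/False/Not Given). The function name `prob_at_least` and all the figures are illustrative assumptions.

```python
from math import comb


def prob_at_least(n_items: int, min_correct: int, n_options: int) -> float:
    """Probability of at least `min_correct` right answers out of `n_items`
    when every answer is a blind guess among `n_options` options."""
    p = 1 / n_options  # chance of guessing a single item correctly
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Hypothetical task: 10 items, pass mark of 6 correct answers.
two_option = prob_at_least(10, 6, 2)    # True / False
three_option = prob_at_least(10, 6, 3)  # True / False / Not Given

print(f"True/False:           {two_option:.3f}")
print(f"True/False/Not Given: {three_option:.3f}")
```

Under these assumptions, a test-taker who understands nothing of the text reaches the pass mark by guessing in roughly 38% of attempts on the two-option task, against roughly 8% on the three-option task. Real test-takers rarely guess completely at random, so the figures are only a rough indication of the scale of the problem.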

An additional problem with this item type, and particularly with False items, is that when test-takers identify a statement as false, it does not necessarily mean they know what is true. Test-takers might identify a statement as false (thus answering correctly) for a host of reasons that have nothing to do with their actual comprehension of the text. One way out of the problem is to ask test-takers to justify their response. However, Hughes and Hughes (2020) believe that such a modification can introduce construct-irrelevant variance, because True-False items are normally used to assess receptive skills (reading or listening), while a justification requires a written response, which is not part of the construct being tested. Another problem is that the resulting justifications can be very idiosyncratic and difficult, if not impossible, to mark, which leads to low marking reliability.

On the plus side, True-False items are relatively easy to write (with the exception of Not Given items), take test-takers little time to complete, and are quick and easy to mark using a simple marking key. This makes them a useful instrument for low- or no-stakes classroom assessment. However, for all the reasons listed above, including serious validity and reliability issues, True-False tasks are not recommended for high-stakes assessment. They are rarely used in large-scale testing, and those exam bodies that used to include True-False tasks in their tests (e.g., in the now-retired Cambridge PET test) have removed them from the newer versions of their tests.

Recommendations for writing True-False items

  • Use True-False items only to test comprehension of facts. Understanding of opinions and implied meaning (inferencing) cannot be reliably tested with this item type because opinions and inferences are not clear-cut enough to allow a true/false answer.

A fact that can be tested with a True-False item.

An opinion. Unless the text contains an unequivocal statement by John himself that he finds it difficult to get up early, the information cannot be tested with a True-False item.

  • Avoid taking a positive statement from a text and changing it to a negative statement as a way of creating True-False items.

This approach to producing True-False items might be tempting as an easy fix, but the resulting items are often so obvious that the majority of test-takers, even weak ones, answer them correctly.

  • Avoid using qualifiers such as usually, generally, often, sometimes. With such qualifiers, the correct answer depends on each person’s perception of what counts as usual, often, sometimes, etc.

Is the correct answer true or false? For some people, possibly those who never get up early, getting up early every other day counts as very often, so they will mark the statement as True. Those who get up early every day, by contrast, will not perceive ‘every other day’ as often and will respond with False. Both groups of test-takers will have comprehended the text, though.

  • Avoid using negative statements, especially double negatives; they are very confusing to answer.
  • Avoid long, complex sentences: the longer and more complex a sentence is, the more difficult it is to say unequivocally whether the statement is true or false. I would recommend up to 10 words for low-proficiency tests and up to 15 words for higher-proficiency tests.
  • Avoid double-barrelled statements, i.e., statements that include two (or more) ideas. They confuse test-takers as it is unclear whether one of the statements or both statements should be true in order to merit a True answer.

If the first statement (“Jane and Sarah set up a project to research African elephants”) is true but the second (“the two women went to Africa together”) is false – is the correct answer true or false?

  • Make True and False statements of approximately equal length. A frequent problem with this item type is that True statements are longer than False ones because they have to be precise to ensure they are really true. Test-takers, who are often test-wise, will notice this tendency and will mark longer items as True because of the length and not because they have comprehended the text.
  • Include an approximately equal number of True and False statements (and Not Given statements, if you use them) in a task. At the same time, never use a regular pattern (e.g., always 50-50, or always 40-60): your students will notice the pattern and take advantage of it.
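Some of the recommendations above are mechanical enough to be checked automatically before a task goes to human review. The sketch below is only an illustration under stated assumptions: the function `review_items`, the sample statements and the 3-word threshold for the True/False length gap are all hypothetical, and the qualifier list contains only the four words named above. It flags statements over a word limit, statements containing a listed qualifier, and a noticeable difference in average length between True and False statements. It cannot judge the other recommendations, which still need an item writer’s eye.

```python
from statistics import mean

# Qualifiers named in the recommendations above; extend the set as needed.
QUALIFIERS = {"usually", "generally", "often", "sometimes"}


def review_items(items, max_words=15, max_length_gap=3):
    """Return warnings for a list of (statement, key) pairs, key 'T' or 'F'.

    max_words follows the 10/15-word guideline above; max_length_gap
    (in words) is an arbitrary threshold assumed for this sketch.
    """
    warnings = []
    lengths = {"T": [], "F": []}
    for number, (statement, key) in enumerate(items, start=1):
        tokens = statement.split()
        lengths[key].append(len(tokens))
        if len(tokens) > max_words:
            warnings.append(f"Item {number}: {len(tokens)} words (limit {max_words})")
        # Strip surrounding punctuation before matching against the qualifier list.
        words = {t.strip(".,;:!?\"'()").lower() for t in tokens}
        found = sorted(QUALIFIERS & words)
        if found:
            warnings.append(f"Item {number}: qualifier(s): {', '.join(found)}")
    # Compare the average length of True and False statements.
    if lengths["T"] and lengths["F"]:
        gap = mean(lengths["T"]) - mean(lengths["F"])
        if abs(gap) > max_length_gap:
            longer = "True" if gap > 0 else "False"
            warnings.append(f"{longer} statements are {abs(gap):.1f} words longer on average")
    return warnings


# Hypothetical sample task, invented for illustration only.
sample = [
    ("John gets up at six every morning.", "T"),
    ("John often gets up early.", "F"),
    ("Jane and Sarah set up a project to research African elephants "
     "and went to Africa together.", "F"),
]

warnings = review_items(sample)
for warning in warnings:
    print(warning)
```

For a low-proficiency test, the same check can be run with `max_words=10`. A warning is a prompt to look at the item again, not a verdict: a flagged item may still be acceptable, and an unflagged one may still be double-barrelled or ambiguous.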

Further reading

Miller, M. D., Linn, R. L., & Gronlund, N. E. (2009). Measurement and assessment in teaching (10th ed.). Pearson Education Ltd. – pp. 179–187 of this book contain some useful recommendations for constructing True-False tasks, as well as a checklist for task review.

Downing, S. M. (1992). True-false, alternate-choice, and multiple-choice items. Educational Measurement: Issues and Practice, 11(3), 27–30.

Frisbie, D. A., & Becker, D. F. (1991). An analysis of textbook advice about true–false tests. Applied Measurement in Education, 4, 67–83.

Grosse, M., & Wright, B. D. (1985). Validity and reliability of true-false tests. Educational and Psychological Measurement, 45, 1–13.

Kosuliev, A., & Stanev, E. (2020). Betting on answers as a way of engaging students. Paper presented at the 59th Annual Scientific Conference – University of Ruse and Union of Scientists, Bulgaria, 2020. (free access)