Friday, October 28, 2011

Pass/Fail vs Real Status

I just read the article "Addicted to Pass/Fail" on page 17 of Tea Time with Testers (by Rikard Edgren), and an associated blog post, "Binary Disease," by the same author.

I wanted to compare this with the Low Tech Testing Dashboard.

You'll also understand this better if you watch Cem Kaner's Black Box Software Testing course videos on measurement theory in testing, and any related articles you can find by the same author.

This blog post is mostly meant for study.  My questions are biased, because I don't think a metric like "50% of our tests are passing" is meaningful.
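To see why that metric is weak, here's a minimal sketch (hypothetical test names and an assumed 1-5 severity scale, not from the article): two test runs can report the identical pass rate while representing very different product risk.

```python
def pass_rate(results):
    """Naive metric: fraction of tests that passed."""
    return sum(1 for r in results if r["passed"]) / len(results)

def open_risk(results):
    """Sum of severity scores (assumed 1-5 scale) for failing tests."""
    return sum(r["severity"] for r in results if not r["passed"])

# Hypothetical runs: same pass rate, very different meaning.
run_a = [
    {"name": "login",   "passed": False, "severity": 5},  # blocker fails
    {"name": "tooltip", "passed": True,  "severity": 1},
]
run_b = [
    {"name": "login",   "passed": True,  "severity": 5},
    {"name": "tooltip", "passed": False, "severity": 1},  # cosmetic fails
]

# Both runs are "50% passing", yet run_a hides a blocker.
assert pass_rate(run_a) == pass_rate(run_b) == 0.5
print(open_risk(run_a), open_risk(run_b))  # run_a carries far more risk
```

Even this weighted version is only slightly less naive: severity is itself a subjective, fallible judgment, which is the point of the questions below.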

Here are my questions for you (and myself):

  • Why is pass/fail data seemingly useful? To whom?  Why?  What decisions are made with it?
    • How powerful is each test?
    • How important is each test?
    • How important is each failure?
    • Which product areas have more failures?  How serious is each failure?
    • Which failures are acceptable?  Why?  Unacceptable?  Why?
    • Do these measures misconstrue subjective and fallible measures as being precise?
  • Why is the dashboard useful?  How does it compare to the pass/fail metric?
    • Why is a set of subjective measures better?
    • How are they better?
    • Why is it useful to differentiate the effort and quality in different feature areas?
      • Are you focusing on the right feature areas?
    • If not, maybe you need a different choice of feature areas?
    • Whose perspective does the dashboard represent?
    • Whose other perspectives are needed when making decisions?  Why?  How can they be obtained?
    • Who is the dashboard for?  (usually managers, but all information workers generally)
    • What does it report?  (testing progress, test-team confidence, etc.)
  • What might be the dashboard's shortcomings?
    • Is the dashboard the only communication tool? Can it work together with other tools, effectively?
    • Is the dashboard adaptable?
    • What other tools are needed?
    • Does the dashboard, together with other tools, serve a good purpose?  Does it mislead or invite misunderstanding?
      • Multiple dimensions of quality?
      • Multiple values and audiences?  (initial vs late audiences? technical vs non? age/gender/ethnicity/experience/intentions/etc)
    • Is it too complex?  Too simple?
    • Does it enable informed decisions?  Does it only seem to provide good info for decisions?
