Monday, January 17, 2011

Gaming Unit Test Metrics

Unit Testing is a popular buzzword - most developer jobs request it, teams want to say they have it, and most coding leaders actively endorse it. However, you can't hire a team of devs who don't want to write unit tests and then "force" them to do it by measuring certain metrics - developers can game metrics.

1. Metric: Code Coverage
The #1 unit testing metric is code coverage, and it's a good metric, but it's insufficient. For example, a single regular expression to validate an email address could require many different tests - but a single test will give it 100% code coverage. Likewise, you could leverage a mocking framework to artificially get high code coverage by mocking out all the "real" code, so the tests are essentially asserting a bunch of "x = x".
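To make this concrete, here's a minimal Python sketch (the is_valid_email function and its regex are hypothetical, purely for illustration) where one happy-path test reaches 100% line coverage while leaving most of the interesting inputs untested:

import re
import unittest

# Hypothetical validator: a single regex, so any one call executes every line.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def is_valid_email(address):
    return bool(EMAIL_RE.match(address))

class TestEmailValidation(unittest.TestCase):
    def test_valid_address(self):
        # This single assertion yields 100% line coverage of is_valid_email,
        # yet says nothing about edge cases (double dots, missing domain,
        # a bare "@", etc.) that a regex like this typically gets wrong.
        self.assertTrue(is_valid_email("user@example.com"))

if __name__ == "__main__":
    unittest.main()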

2. Metric: Number of unit tests
Sure, all else being equal, 10 tests probably do more work than a single test, but everything is not equal. Developer style and the code being tested both vary. One dev may write a single test with 10 asserts, while another dev puts each of those asserts in its own test. You could also have many tests that are useless because they check for the "wrong" thing or are essentially duplicates of each other.
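As a rough illustration (the add function is made up), the two test classes below exercise exactly the same behavior, yet one shows up as a single test and the other as four:

import unittest

def add(a, b):
    return a + b

class SingleTestStyle(unittest.TestCase):
    def test_add(self):
        # Counts as one test, but performs several distinct checks.
        self.assertEqual(add(1, 2), 3)
        self.assertEqual(add(-1, 1), 0)
        self.assertEqual(add(0, 0), 0)

class ManyTestsStyle(unittest.TestCase):
    # Counts as four tests, doing no more work than the class above.
    def test_positive(self):
        self.assertEqual(add(1, 2), 3)

    def test_mixed_signs(self):
        self.assertEqual(add(-1, 1), 0)

    def test_zero(self):
        self.assertEqual(add(0, 0), 0)

    def test_duplicate_of_positive(self):
        # Pads the count without adding any new information.
        self.assertEqual(add(1, 2), 3)

if __name__ == "__main__":
    unittest.main()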

3. Metric: % of tests passing
If you have a million LOC with merely 5 unit tests, having 100% of tests pass because all 5 pass is meaningless.

4. Metric: Having X tests for every Y lines of code
A general rule of thumb is to have X unit tests (or X lines of unit testing code) for every Y lines of code. However, LOC does not indicate good code. Someone could write bloated unit test code ("copy and paste"), so this could be very misleading.
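For example, here's a sketch (the discount function is invented for illustration) of the same checks written twice - copy-and-paste style and compact style - with very different line counts but identical value:

import unittest

def discount(price, rate):
    return price * (1 - rate)

class BloatedTests(unittest.TestCase):
    # Copy-and-paste style: lots of test LOC for one trivial function,
    # which looks great against a "tests per lines of code" target.
    def test_discount_10_percent(self):
        price = 100.0
        rate = 0.10
        result = discount(price, rate)
        expected = 90.0
        self.assertAlmostEqual(result, expected)

    def test_discount_20_percent(self):
        price = 100.0
        rate = 0.20
        result = discount(price, rate)
        expected = 80.0
        self.assertAlmostEqual(result, expected)

class CompactTests(unittest.TestCase):
    # The same checks in a few lines - far fewer LOC, identical coverage.
    def test_discounts(self):
        for rate, expected in [(0.10, 90.0), (0.20, 80.0)]:
            self.assertAlmostEqual(discount(100.0, rate), expected)

if __name__ == "__main__":
    unittest.main()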

5. Metric: How long it takes for unit tests to run
"We have 5 hours of tests running, so it must be doing something!". Ignore for the moment that such long-running tests are no longer really "unit" tests, but rather integration tests. These could be taking a long time because they're redundant (loop through every of a 1 GB file), or extensively hitting external machines such that it's really testing network access to a database rather than business logic.

6. Metric: Having unit testing on the project plan
"We have unit testing as an explicit project task, so we must be getting good unit tests out of it!" Ignore for the moment that unit tests should be done hand-in-hand with development as opposed to a separate task - merely having tests as an explicit task doesn't mean it's actually going to be used for that.

7. Metric: Having high test counts on methods with high cyclomatic complexity
This is useful, but it boils down to code coverage (see above) - i.e. find the method with high complexity, and then track that method's code coverage.
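As an illustration, here's a small invented function with several branches and one test per branch - the useful signal is whether each branch of this specific high-complexity method is covered, which is just targeted code coverage:

import unittest

def shipping_cost(weight_kg, express, international):
    # Multiple branches mean higher cyclomatic complexity than straight-line code.
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    cost = 5.0 if weight_kg < 1 else 5.0 + weight_kg
    if express:
        cost *= 2
    if international:
        cost += 15.0
    return cost

class ShippingCostTests(unittest.TestCase):
    def test_rejects_non_positive_weight(self):
        with self.assertRaises(ValueError):
            shipping_cost(0, express=False, international=False)

    def test_light_domestic_flat_rate(self):
        self.assertEqual(shipping_cost(0.5, express=False, international=False), 5.0)

    def test_heavy_express_international(self):
        self.assertEqual(shipping_cost(2, express=True, international=True), 29.0)

if __name__ == "__main__":
    unittest.main()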

Conclusion
Obviously, a combination of these metrics would drastically help track unit testing progress. If you have at least N% code coverage, with X tests for every Y lines of code, and they all pass - it's better than nothing. But fundamentally, the best way to get good tests is to have developers who intrinsically value unit testing and who write the tests not because management is "forcing" them with metrics, but because they see unit tests as intrinsically valuable.
