Cyclomatic Complexity and Testing
Contents
In addition to “Make Illegal States Unrepresentable,” another concept that I picked from Seemann’s book was “Cyclomatic Complexity.”
Cyclomatic Complexity
Cyclomatic complexity is a fairly easily computed code metric. For example, the cyclomatic complexity of the following method is two:
|
|
Ie. there are two paths through the code:
- If
im_host
does not end in a slash, one is added. - If the slash is already present, nothing is done (basically an empty and omitted
ELSE
branch)
When drawn as a graph, it would look something like the following:
Control-flow graph 1
Both graphs result in the same cyclomatic complexity when run through the formula M = E - N + 2P
, in which E is the number of edges, N is the number of Nodes and P is the number of connected components.
- Left graph: 3 - 3 + (2 * 1) = 2
- Right graph: 4 - 4 + (2 * 1) = 2
Calculating the cyclomatic complexity is simple even without any tools or graphs:
|
|
The cyclomatic complexity of the above method is six (1 +1 +1 +1 +1 +1). In graph form it would look something like the following:
Control-flow graph 2
Again both produce the same cyclomatic complexity:
- Left: 11 - 7 + 2 = 6
- Right: 15 - 11 + 2 = 6
Interpretation
What good is this metric? Looking back at the two examples, it is easy to see that cyclomatic complexity increases in hand with the number of ifs and loops. The higher the number, the more conditional logic a piece of code contains and the harder it is to understand, presumably.
It’s good to keep in mind that the cyclomatic complexities of two methods might not be directly comparable: a method consisting of a single, huge CASE
with a cyclomatic complexity of 30 (ie. just a CASE
with 30 branches) will be much easier to reason about than a method which reaches the same cyclomatic complexity by putting ifs inside ifs inside loops.
Seemann (2021, 132–134) links cyclomatic complexity to “The Magical Number Seven, Plus or Minus Two,” ie. the limits of human short-term memory. He suggests using seven as the maximum allowed cyclomatic complexity in order to keep methods small and understandable. I can’t vouch for whether or not this works, but I’ve sure felt miserable trying to grok/work with a large method, so I’m sure that setting limits on maximum complexity can’t hurt.
Link to Testing
Cyclomatic complexity has a straightforward relationship to testing: it is the maximum number of tests required to execute all code in the method.
|
|
Because both ifs test different parameters, we could cover all branches of this method by writing only two unit tests. Had the latter if instead been an elseif, we would have needed to write three unit tests, which is the cyclomatic complexity of the method.
But having an equal number of unit tests to the cyclomatic complexity of a method only guarantees that every branch is (potentially) executed. It does not guarantee that all combinations of all branches were tested. Instead, cyclomatic complexity represents the minimum number of tests required for full path coverage (ie. all combinations of all branches). To completely test all paths of the above method, we’d need at least four unit test, which is greater than the cyclomatic complexity of the method. Add another independent if to bring the cyclomatic complexity up to four, and we’d need eight unit tests to cover all combinations of branches. Add still another independent if for a cyclomatic complexity of five and we’d be up to 16 required unit tests. Extrapolate further to see that the method is not too many steps removed from being (practically) untestable.
One should keep in mind that the interpretation of cyclomatic complexity as the number of required tests does not only apply to unit tests, but also to manual testing.
The following ranking by the inventor of the metric, Thomas J. McCabe, Sr., is presented in Wikipedia:
- 1 - 10 Simple procedure, little risk
- 11 - 20 More complex, moderate risk
- 21 - 50 Complex, high risk
- over 50 Untestable code, very high risk
I’ve seen at least one custom report (=program) in a customer’s system where there was little to no modularization; that one had cyclomatic complexity of over 200, and I just counted the IF
s – I couldn’t even begin to estimate what would be required for full path coverage. Apart from singularly catastrophic reports, user/customer exits are other places in SAP systems likely to have nightmarishly high cyclomatic complexities (and, obviously, those are also almost impossible to test in isolation).
Remedies to Cyclomatic Complexity
Using different syntax does not work. The following has exactly the same cyclomatic complexity as the above snippet:
|
|
Using “Extract Method” refactoring (Fowler 2019, 106) helps, as it transforms previously conditional code into a single method call.
Using “Replace Loop With Pipeline” refactoring (Fowler 2019, 231) helps, as it, likewise, removes previously conditional code and replaces it with a method call (though, since ABAP lacks lambda functions/function literals, this is a less applicable remedy).
Sources
- Cyclomatic complexity. <https://en.wikipedia.org/wiki/Cyclomatic_complexity>, fetched 2022-11-13.
- Fowler, M. (2019) Refactoring: improving the design of existing code. Addison-Wesley. 978-0-13-475759-9.
- Seemann, M. (2021) Code That Fits in Your Head: Heuristics for Software Engineering. Addison-Wesley. 978-0-13-746440-1.