Code Entropy

While preparing my talk at Smidig 2008 I kept thinking about the second law of thermodynamics.

Given two snippets of code, A and B, that have exactly the same external behaviour. If an expert programmer is more likely to change A to B than B to A, then snippet A has higher code entropy than snippet B.

Let us consider a very simple example:

{ int a=3; while (a<9) { ... ; a++; } } // snippet A

and

{ for (int a=3; a<9; a++) { ... } } // snippet B

The external behaviour for these two code snippets are exactly the same. However, most programmers would agree that snippet A is better rewritten into snippet B. So, in this example, during a refactoring session, it is likely that someone will change A into B, but unlikely that someone will change B into A. Hence snippet A has higher code entropy than snippet B.

Now, extend this idea into larger functions, classes, modules, applications, software design and architecture. Can entropy be used to describe the state of a codebase?