Restructuring: Improving the modularity of an existing code-base (BruJUG conference)

This article summarizes a BruJUG conference given on 26/04/2012 by Chris Chedgey, the founder of the Structure101 company. The complete conference video is available on Vimeo.

What is restructuring?

Refactoring and restructuring are both terms that imply non-functional changes.

Refactoring means changing the code to make it more readable. This also means invasive code editing. It usually involves only a few classes.

Restructuring means reorganizing the code-base, with as little change to the code itself as possible, to improve modularity and make the code-base easier to understand. The code editing is minimally invasive, but the scope is the whole code-base.

The code-base structure has 3 aspects:

  • the package composition
  • the dependencies between packages
  • and the hierarchy of the “nested levels” of packages

The two code-base quality factors to consider here are complexity and modularity.

And why is it important?

Because a better code-base structure makes your code more understandable, and understandable code is cheaper to maintain and evolve. Changes have a more predictable impact on the code. And, of course, your code-base becomes easier to test and to reuse. In the end, your code has more value.

Complexity

Complexity can be measured in different ways. Two useful measures are fatness and tangles.

Fatness is when you have too much code in one place (number of methods in a class, number of classes in a package, number of packages nested under the same package or in the same component, …).

Tangles occur when some code in a package references code in another package which itself references code in the first package (cyclic dependencies).
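As a rough illustration of how a tangle can be detected automatically, the sketch below runs a depth-first search over a package dependency graph and reports whether it contains a cycle. The names TangleDetector and hasTangle, and the map-based representation of the graph, are assumptions made for this example; they do not come from the talk or from any Structure101 tool.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: detects cyclic dependencies (tangles) between packages. */
final class TangleDetector {

    /**
     * Returns true if the dependency graph contains at least one cycle.
     * Keys are package names; values are the packages they depend on.
     */
    static boolean hasTangle(Map<String, Set<String>> dependencies) {
        Set<String> done = new HashSet<>();       // packages fully explored
        Set<String> inProgress = new HashSet<>(); // packages on the current search path
        for (String pkg : dependencies.keySet()) {
            if (visit(pkg, dependencies, done, inProgress)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, Set<String>> deps,
                                 Set<String> done, Set<String> inProgress) {
        if (inProgress.contains(pkg)) {
            return true;  // we came back to a package on the current path: cycle
        }
        if (done.contains(pkg)) {
            return false; // already explored, no cycle through here
        }
        inProgress.add(pkg);
        for (String target : deps.getOrDefault(pkg, Collections.emptySet())) {
            if (visit(target, deps, done, inProgress)) {
                return true;
            }
        }
        inProgress.remove(pkg);
        done.add(pkg);
        return false;
    }

    private TangleDetector() { }
}
```

A real tool would typically also report which packages take part in each tangle; this sketch only answers yes or no.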

Both fatness and tangles can be approximated automatically by metrics, which makes them good candidates for automatic checking using thresholds (e.g. in your build system).
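A threshold check of that kind could look like the minimal sketch below, which flags packages containing more classes than an allowed maximum. The class FatnessCheck, the method fatPackages and the threshold value used in the usage note are hypothetical; the talk does not prescribe any particular limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical build-time check for one fatness metric: classes per package. */
final class FatnessCheck {

    /**
     * Returns the names of the packages whose class count exceeds maxClasses,
     * sorted alphabetically so that the report is stable between builds.
     */
    static List<String> fatPackages(Map<String, Integer> classesPerPackage, int maxClasses) {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : classesPerPackage.entrySet()) {
            if (entry.getValue() > maxClasses) {
                offenders.add(entry.getKey());
            }
        }
        Collections.sort(offenders);
        return offenders;
    }

    private FatnessCheck() { }
}
```

A build could call fatPackages with an agreed limit (say 40 classes, an arbitrary value chosen for illustration) and fail when the returned list is not empty.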

This diagram shows the link between tangles and fatness: you can eliminate all tangles pretty easily by moving everything to the same place, but then you get 100% fatness. Conversely, you can eliminate fatness by partitioning your code-base, but then you create tangles. What you seek is a compromise between the two. What you really don’t want is a code-base that is both fat and full of tangles.

Modularity

Modularity is best defined by the mantra “high cohesion, low coupling”.

Modularity shows itself in several ways. One is well-defined public interfaces, with the remaining internals kept private. Another is packages that each have a clear responsibility.
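The idea of a public interface with private internals can be sketched in Java as follows. To keep the example in a single file it uses nested types; in a real code-base the interface would more likely be a public type in its own package, with a package-private implementation next to it. All names (Billing, InvoiceService, DefaultInvoiceService) are invented for the illustration.

```java
/** Hypothetical module facade: only InvoiceService is meant to be used from outside. */
final class Billing {

    /** The well-defined public interface of the module. */
    public interface InvoiceService {
        int totalInCents(int unitPriceInCents, int quantity);
    }

    /** Internal implementation, invisible to code outside Billing. */
    private static final class DefaultInvoiceService implements InvoiceService {
        @Override
        public int totalInCents(int unitPriceInCents, int quantity) {
            return unitPriceInCents * quantity;
        }
    }

    /** Single entry point: callers never name the implementation class. */
    static InvoiceService invoiceService() {
        return new DefaultInvoiceService();
    }

    private Billing() { }
}
```

Because callers depend only on InvoiceService, the implementation can be changed or moved without touching them, which is what keeps coupling low.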

Unfortunately, the best way to assess the modularity of the code is to have it reviewed by a human software architect.

So how can I work on my code-base structure?

Usually, the methods and classes are OK. But there is almost no logical organisation of classes into higher level modules (= packages in Java). Packages are too often used more like a filesystem and not as an embodiment of module hierarchy.

To get a good code-base structure, you need a clear understanding of the following aspects:

  • package composition and dependencies between packages
  • the flow of dependencies
  • the application's business domain

Once you understand all of this correctly, you can define and achieve your architectural target.

Restructuring strategies

There are a lot of strategies you can use. Here is a selection.

Merge parallel structures

If you have parallel structures (one for presentation, one for services, one for persistence, one for extranet, one for intranet, etc.), you had better merge them to minimize the dependencies between packages.

Bust very large class tangles early

You often find yourself with one or a few large class tangles spanning many packages. Fixing these will improve your code-base rapidly.

Do as much as you can by only moving classes and packages first

This is the least invasive restructuring you can do to reduce complexity and improve modularity. Moreover, it requires little effort.
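To see why moving a class can be enough, the sketch below derives package-level dependencies from class-level dependencies and a class-to-package assignment. The class MoveSimulation, the method packageEdges and the sample class names in the usage note are hypothetical, and the sketch assumes that every class mentioned in the dependencies has a package assigned.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: shows how a class move changes package-level dependencies. */
final class MoveSimulation {

    /**
     * Derives package-level edges, written "from->to", from class-level dependencies.
     * packageOf maps each class name to its package; classDeps maps each class
     * to the classes it uses. Dependencies inside one package are ignored.
     */
    static Set<String> packageEdges(Map<String, String> packageOf,
                                    Map<String, Set<String>> classDeps) {
        Set<String> edges = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : classDeps.entrySet()) {
            String from = packageOf.get(entry.getKey());
            for (String target : entry.getValue()) {
                String to = packageOf.get(target);
                if (!from.equals(to)) {
                    edges.add(from + "->" + to);
                }
            }
        }
        return edges;
    }

    private MoveSimulation() { }
}
```

Suppose OrderService (in package orders) uses InvoiceCalculator (in billing), which in turn uses OrderLine (in orders). The derived edges are billing->orders and orders->billing: a tangle. Reassigning OrderLine to billing, without editing any class body, leaves only orders->billing.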

Bottom-up or top-down approach?

Both are valid but have different impacts.

The top-down approach keeps as much of the existing package hierarchy as possible. This means that the “psychological” impact on the application team will be minimized.

The bottom-up approach tends to end up far from the current structure but is often the easiest to achieve.

Tackle complexity before modularity

A structure without tangles is way easier to manipulate.

Other strategies

  • Split packages that lack cohesion
  • Split fat packages and fat classes
  • Move tangles together
  • Make the restructuring a milestone

Conclusion

It is common for a code-base to be a mess where modularity is concerned.

Lack of structure costs money.

That lack can be remedied.

Restructuring your code-base is not easy, but the returns can be huge.

Examples

Here the “theoretical part” of the presentation ends and the examples begin. They are illustrated with the ReStructure101 software, which helps the architect visualize the current structure of the code-base and simulate structure changes and their impact.

The tool's philosophy is to create a task list reflecting the changes made rather than changing the code-base directly. After a restructuring session, the architect ends up with a task list that can be carried out personally or planned for execution by other developers.

Plugins make it easy to use that task list inside IDEs such as Eclipse or IntelliJ.

I’d say I love this philosophy because it gives you the feeling that you are always in control, and that you are not merely running a drag’n’drop session in a GUI but really modifying your code-base in depth.

Thanks BruJUG for this enlightening conference.

See you next time.

Written on May 3, 2012