Clean Code, is it really worth it?

As developers, we have always put a lot of effort into making our software richer in features and more stable. Are there any other aspects of our software that should concern us?

Definitely.

Clean Code, defined as code that is Reliable, Maintainable, Secure and Portable, is a discipline that helps produce code that saves money.

Just by looking at how the number of defects varies with the health of the code, we can see that healthier code means cheaper code.

But that is not all. According to the CPSQ report for the USA, the real cost of bad code is extremely high. Finding and fixing bugs alone accounts for $607 billion.

But the software produced is not the only thing that matters here.

So the focus should not be only on the validations that check the software produced; personal development through constant learning should also be taken seriously. And yes, obviously the latter will affect the former.

Some principles

I call it the Nirvana for both the software and the person.

We can start with some common principles:

Consider Security
DRY (Don’t Repeat Yourself)
YAGNI (You Ain’t Gonna Need It)
Readable & explanatory naming
Reduce Cognitive Complexity
Boy Scout
S.O.L.I.D
Test your code and test it well
KISS & don’t overengineer

But even though these principles are pretty simple and obvious, reality shows that they are often not followed.

Current status of Clean Code in overall projects

SonarLint, the free IDE linter from SonarSource, collects metrics on the number of hits for each of its rules (more than 600 for Java alone). Using that data, we can get a list of the most detected issues.

These are the most detected issues in public repositories:

Cognitive Complexity of functions should not be too high
Unused assignments should be removed
Unused local variables should be removed
Track uses of “TODO” tags
Unused function parameters should be removed
Raw types should not be used
Unused “private” fields should be removed
Generic exceptions should never be thrown
Local variables should not be declared and then immediately returned
Unused “private” methods should be removed
A reference to `null` should never be dereferenced/accessed
Methods should not be empty
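
To make a few of these rules concrete, here is a minimal sketch. The class and method names are hypothetical, and the "before" comment only illustrates the typical shape of flagged code, not the exact output of any analyzer.

```java
import java.util.List;

// Hypothetical example class; names are illustrative only.
final class OrderTotals {

    // Before (typically flagged): a raw type, an unused local variable,
    // a generic exception, and a variable declared and immediately returned.
    //
    //   static int total(List prices) throws Exception {
    //       int unused = 0;
    //       if (prices == null) throw new Exception("no prices");
    //       int result = sum(prices);
    //       return result;
    //   }

    // After: parameterized type, no dead variables, a specific exception.
    static int total(List<Integer> prices) {
        if (prices == null) {
            throw new IllegalArgumentException("prices must not be null");
        }
        int sum = 0;
        for (int price : prices) {
            sum += price;
        }
        return sum;
    }
}
```

None of these changes alters behavior for valid input; they only remove noise that a reader would otherwise have to reason about.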

And the most detected security issues, by number of issues:

A secure password should be used when connecting to a database
XML parsers should not be vulnerable to XXE attacks
Endpoints should not be vulnerable to reflected cross-site scripting (XSS) attacks
Credentials should not be hard-coded
I/O function calls should not be vulnerable to path injection attacks

We could reduce all these issues into 6 groups:

In terms of vulnerabilities we can highlight 3 issues:

Hardcoded credentials
Database operations injection
HTTP request redirections
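
As an illustration of the first item, a minimal sketch of keeping a database password out of the source code. The class name, the `DB_PASSWORD` variable name, and the choice of environment variables as the secret source are all assumptions made for this example; a secrets manager or a protected configuration file would serve the same purpose.

```java
import java.util.Map;

// Hypothetical helper; the variable name DB_PASSWORD is an assumption.
final class DbCredentials {
    static final String PASSWORD_VAR = "DB_PASSWORD";

    // Flagged pattern (do not do this):
    //   String password = "s3cret";  // credential hard-coded in source

    // Reads the password from the supplied environment and fails fast
    // when it is missing, instead of falling back to a default value.
    static String passwordFrom(Map<String, String> env) {
        String password = env.get(PASSWORD_VAR);
        if (password == null || password.isEmpty()) {
            throw new IllegalStateException(PASSWORD_VAR + " is not set");
        }
        return password;
    }
}
```

In application code this would be called as `DbCredentials.passwordFrom(System.getenv())`. Passing the map in as a parameter keeps the method easy to test.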

Here are a few interesting bugs and code smells that I consider tricky or worth keeping in mind, and that are not always easy to spot manually:

Spring beans annotations not in default package
Use wait instead of Thread.sleep when a lock is held
Hash-based collections with known capacity in Java 19
Tests must contain complete assertions
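
The hash-based collections item deserves a short illustration. `new HashMap<>(n)` treats `n` as the initial capacity, not as the number of entries, so with the default load factor of 0.75 the map can resize before `n` entries are stored. Java 19 added `HashMap.newHashMap(int numMappings)` to size the map for a given number of entries. The sketch below uses a hypothetical helper class that does the same calculation by hand so it also compiles on older Java versions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; on Java 19+ prefer HashMap.newHashMap(expectedEntries).
final class SizedMaps {
    private static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Initial capacity needed so expectedEntries fit without a resize.
    static int capacityFor(int expectedEntries) {
        return (int) Math.ceil(expectedEntries / DEFAULT_LOAD_FACTOR);
    }

    // Creates an empty map sized for the expected number of entries.
    static <K, V> Map<K, V> newMapFor(int expectedEntries) {
        return new HashMap<>(capacityFor(expectedEntries));
    }
}
```

For 100 expected entries the helper requests a capacity of 134 rather than 100.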

It’s also important to highlight a few design issues. Avoiding them keeps complexity down; high complexity makes code hard to read and therefore hard to maintain, debug, and extend:

Methods should not perform too many tasks (brain method)
Class depends on an excessive number of classes
Cognitive complexity is too high
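
A small sketch of what reducing cognitive complexity can look like in practice. The discount rules and names are invented for the example; the point is only that guard clauses remove nesting without changing behavior.

```java
// Hypothetical example; the business rules are made up for illustration.
final class Discounts {

    // Nested version: each extra level of nesting adds to the reader's load.
    static int discountNested(boolean active, int years, int orderTotal) {
        int discount = 0;
        if (active) {
            if (years > 2) {
                if (orderTotal > 100) {
                    discount = 15;
                } else {
                    discount = 10;
                }
            } else {
                discount = 5;
            }
        }
        return discount;
    }

    // Flattened version with guard clauses: same results, less nesting.
    static int discountFlat(boolean active, int years, int orderTotal) {
        if (!active) {
            return 0;
        }
        if (years <= 2) {
            return 5;
        }
        return orderTotal > 100 ? 15 : 10;
    }
}
```

Both methods return the same value for every input, but the second one can be read top to bottom without keeping track of which branch you are in.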

Tools

But this is only a subset of things to consider. For Java alone, one of the code analyzers, SonarLint, has more than 600 rules.

It’s important to note that a developer should keep all of these rules in mind every time they commit code or review someone else’s code.

This is where code analyzers that help in this process become a key differentiator. We can find several of these tools just for Java:

SonarLint, SonarQube
CodeQL
PMD
Semgrep
…

These tools help us by showing warnings about the bad code we are introducing. Bugs, style problems, and vulnerabilities can be avoided by using them.

IDE Integration

Some of these tools can be integrated directly into our IDE (Integrated Development Environment) showing warnings directly on the code together with information about the detected issue and the best practices to solve them.

SonarLint example on IntelliJ

It’s also important to have a tool that can effectively prevent bad code from being merged into our repository, based on the company’s definition of the accepted thresholds for code coverage, number of non-blocker issues, etc.

Quality Gates

This brings us to the concept of Quality Gates. The tools set the standard definitions of what bad code is, but companies can configure the accepted code complexity, the number of issues, and the code coverage, and can even set different values for new code and for the overall project.
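
To illustrate the idea, here is a minimal sketch of a quality gate as a set of thresholds. This is a hypothetical model written for this article, not the API or configuration format of any real tool, and the threshold values are assumptions.

```java
// Hypothetical model of a quality gate; not the API of any real analyzer.
final class QualityGate {
    private final double minCoverageOnNewCode; // percent, stricter
    private final double minOverallCoverage;   // percent, usually more lenient
    private final int maxNewIssues;

    QualityGate(double minCoverageOnNewCode, double minOverallCoverage, int maxNewIssues) {
        this.minCoverageOnNewCode = minCoverageOnNewCode;
        this.minOverallCoverage = minOverallCoverage;
        this.maxNewIssues = maxNewIssues;
    }

    // The gate passes only when every configured condition is met.
    boolean passes(double newCodeCoverage, double overallCoverage, int newIssues) {
        return newCodeCoverage >= minCoverageOnNewCode
                && overallCoverage >= minOverallCoverage
                && newIssues <= maxNewIssues;
    }
}
```

For example, `new QualityGate(80.0, 50.0, 0)` would demand 80% coverage on new code and no new issues, while tolerating 50% coverage on the project as a whole.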

False Positives

But even when using tools that analyze the code, there is one important factor to keep in mind when choosing an analyzer: the false positive ratio.

Sometimes the analyzer will report issues that, after review, turn out not to be real issues. This can happen for different reasons, including the analyzer’s accuracy, our code distribution, or particular code cases that make the issue a false positive in a specific part of the code.

As an example, consider a false positive about string concatenation in a logging call, where the message is built by concatenating a constant named CONST_VALUE. If CONST_VALUE were a variable, this would be a genuine issue, but in this case CONST_VALUE is a constant evaluated at compile time, so in the end no string concatenation happens at runtime.

After seeing this in our analyzer, we can apply our own knowledge and help the tool by telling it: “This is not really an issue, don’t count it in the Quality Gates.”
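
The snippet under discussion is not reproduced in the text, so the following is a reconstruction under assumptions: it uses `java.util.logging`, and the class name, messages, and the value of `CONST_VALUE` are hypothetical.

```java
import java.util.logging.Logger;

// Hypothetical class reconstructing the kind of code described above.
final class ImportJob {
    private static final Logger LOGGER = Logger.getLogger(ImportJob.class.getName());

    // Compile-time constant: static final and initialized with a literal.
    static final String CONST_VALUE = "import-job";

    // Both operands are constants, so the compiler folds the expression
    // into a single string literal; nothing is concatenated at runtime.
    static final String START_MESSAGE = "Starting " + CONST_VALUE;

    static void logStart() {
        // An analyzer may flag this line as concatenation inside a logging
        // call, but the argument is the same folded constant as above.
        LOGGER.info("Starting " + CONST_VALUE);
    }

    static void logItem(String itemName) {
        // Here the operand is a runtime variable, so the issue would be
        // genuine. A supplier defers the concatenation until it is needed.
        LOGGER.fine(() -> "Processing " + itemName);
    }
}
```

The contrast between `logStart` and `logItem` is the whole point: the same warning is noise in the first method and useful advice in the second.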

https://www.sonarsource.com/blog/false-positives-our-enemies-but-maybe-your-friends/

Where to put the focus: Clean as You Code

Finally, another important concept, one that can lead to wide adoption in your company and reduce the frustration that leads to low adoption, is Clean as You Code. Basically, the goal is to focus on the new code being produced and pay very little attention to the rest of the code.

But why is it not that important to focus on the overall project code? Here we can connect to a story about a ship: the Ship of Theseus. In this story, a ship gets one part replaced every now and then, so that after a long time it consists entirely of new parts. The big question is: is it still the same ship?

With this story I want to show that old code eventually disappears from projects, so there’s no point in putting too much effort into something that we know will, statistically, disappear.

Here you can see the evolution of the code in some interesting repos on GitHub, from the Git of Theseus project.

And taking just one specific project, we can see that the code added 10 years ago has almost vanished from the project.

Our goal is to focus on not introducing bad new code. That alone will eventually leave our project consisting almost entirely of good code.

That said, it does no harm to do some boy scouting from time to time if we see issues that are easy to fix.

Well, this has been a brief summary of one point of view on Clean Code and how it can help reduce the cost of your software development.
