Code Quality (part 2)

Here is our second post in the series about software Code Quality and a practical method to measure it.  See our earlier post about code quality for an overview.  The third post in the series talks about Supportability and Performance.

1) Defectiveness (aka Measuring the impact of defects)

We know that all software has bugs, but some products are worse than others.  The simplest way to measure relative quality of code is to look at the defects produced/caught.

Remember, you are supposed to have a reasonably good QA team that actually catches most defects before the client does; if not, get one!

Not all defects are the same: they differ in their impact on the project, and that impact reflects the quality of the code.  Simply put, good code does not result in many high-impact defects or in a plethora of low-impact defects.  I propose a simple formula for rating a defect based on its impact.

The simple formula for a defect's rating is:

defect rating  =  type of defect  ×  severity of defect  ×  time to fix

Type of defect is determined by the tester and is a numeric scale of 1 to 5, with 1 being a very simple defect.  It differs from severity: severity is the impact on the user, whereas type is an estimate of the complexity of the defect.  If the tester has to go through five specific steps to reproduce the issue, or if it is intermittent and hard to reproduce, it gets a higher rating.  If during analysis or resolution it is determined that the original estimate was incorrect, it may be updated.

Severity of a defect is a numeric scale of 1 to 5, with 1 being the least severe.  Severity is most often determined by the effect of the defect on the client’s ability to use the system.  If you do not already have this, then immediately hire a QA manager and visit this blog later.

Time to fix a defect is also on a scale of 1 to 5: 1 for a defect that takes less than one hour to fix, 5 for one that takes more than a week.  Absolute hours/days could be used, but they would skew the result.

When put together, software that produces 10 annoying, simple defects would rate similarly to code that produces one high-severity, time-consuming defect:

10 defects × (1 × 1 × 2) = 20
1 defect × (1 × 4 × 5) = 20

Note that this formula takes into account the impact on the customer, on testing, and on development.  While it is overly simplistic, it captures the essence of a defect's impact across most of the organization.  If you feel the need for a perfect formula, I would refer you to ISO 25010, as you will be happier with the results.  If approximate is good enough for you, then by all means, read on, my friend…
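To make the arithmetic concrete, here is a minimal sketch in Python (purely illustrative; the 1-to-5 scales follow the description above, while the function and variable names are my own assumptions):

```python
# Minimal sketch of the defect-rating formula: type x severity x time to fix,
# each on a 1-5 scale. Field names and validation are illustrative choices.

def defect_rating(defect_type: int, severity: int, time_to_fix: int) -> int:
    """Rating = type of defect x severity x time to fix."""
    for factor in (defect_type, severity, time_to_fix):
        if not 1 <= factor <= 5:
            raise ValueError("each factor must be on a 1-5 scale")
    return defect_type * severity * time_to_fix

# Ten simple, annoying defects...
simple_defects = [defect_rating(1, 1, 2) for _ in range(10)]
# ...rate the same as one high-severity, time-consuming defect.
big_defect = defect_rating(1, 4, 5)

print(sum(simple_defects))  # 20
print(big_defect)           # 20
```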

To compare two projects you might normalize defects against estimated effort (story points, hours, or some other metric), SLOC, or function points.  No metric is perfect, so choose one (or more) and apply it consistently.  The majority of existing analysis tools use KLOCs/KSLOCs as the denominator, but modern languages and frameworks tend to make lines of code less reliable as a base metric.  Consistency is most important; there is no absolute scale that will work for everyone.  As with audits, the implementation is left up to the user; it is the framework and its consistent application that are vital.  I will discuss how to gather these stats and their potential uses in a future post.
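As an illustration of that normalization, here is a sketch that divides total defect rating by story points; the project names, ratings, and point totals are made-up example data, and story points are only one of the denominators mentioned above:

```python
# Sketch: compare two projects by total defect rating per story point.
# All names and numbers below are invented example data.

projects = {
    "Project A": {"defect_ratings": [2, 2, 4, 20, 6], "story_points": 120},
    "Project B": {"defect_ratings": [2, 2, 2, 2, 2, 2], "story_points": 45},
}

for name, data in projects.items():
    density = sum(data["defect_ratings"]) / data["story_points"]
    print(f"{name}: {density:.2f} rating points per story point")
```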

This metric can also tell you other interesting things.  A defect rating is a more complete view of the defect's impact on all parties than a simple count of defects.  Two projects can have similar defect counts but be of completely different quality because of the types of defects involved.  If graphed, the ratings let you quickly visualize what kinds of defects are being raised, allowing you to train/mentor/berate your team to help alleviate the problem.
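For the graphing, even a crude text histogram works; in this sketch the severity values are invented, and severity is just one of the factors you could bucket by:

```python
from collections import Counter

# Sketch: bucket defects by severity so the distribution can be graphed
# (or eyeballed). The severity list is invented example data.
severities = [1, 1, 2, 1, 3, 1, 2, 5, 1, 2]

distribution = Counter(severities)
for severity in sorted(distribution):
    print(f"severity {severity}: {'#' * distribution[severity]}")
```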

On one project we decided to analyze defects using a similar system.  We found that the vast majority of bugs had a low rating.  The good news was that the impact was minimal and the time to fix was relatively short, but the volume was very high.  By spending a week reviewing thousands of defects (across thousands of pages), we determined that approximately 70% of the bugs fell into about 10 patterns.  We then used those patterns to train (okay, harass) the development team, and the number of defects fell drastically, with minimal extra time spent during development and significantly less time spent on bug fixing.
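A sketch of that kind of pattern analysis follows; the pattern labels and counts are invented, and in practice the labels come from manually reviewing the defects:

```python
from collections import Counter

# Sketch: how much of the defect volume do the most common patterns cover?
# Pattern labels and counts are invented example data.
defect_patterns = Counter({
    "missing null check": 310,
    "unlocalized string": 220,
    "inconsistent date format": 150,
    "off-by-one paging": 90,
    "other": 230,
})

total = sum(defect_patterns.values())
top = defect_patterns.most_common(3)
covered = sum(count for _, count in top)
print(f"top {len(top)} patterns cover {covered / total:.0%} of all defects")
```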

2) Maintainability

Code maintainability is something that is rarely measured, but often discussed and strived for.  Architectural choices, design patterns, and code style/structure can all affect the maintainability of a project.  Maintainability is the ability to easily update a project for new requirements, such as new features, better scalability, or that new cool animation that will peg your CPU at 100% for no good reason.  Highly maintainable code is generally easy to refactor, is not fragile, and is backed by automated tests (ideally written test-first).  While there are many proposals on how to measure code maintainability, I posit that with today's explosion of languages, frameworks, and design patterns, only direct measurement will be reasonably accurate.

Note: The cool thing about writing a blog vs having an editor is the freedom to use words like posit, just ‘cause you can.

Some would argue that maintainable code comes from implementing Abstract Factory, Façade, Command… or some other design pattern.  But therein lies the problem: there are myriad choices, many can be right, some might be less right, but at the end of the day… your client does not care.  The client cares that the software works, is updated without breaking anything, and performs acceptably or better.  Rather than measure the code structure, let us measure what clients actually care about.

By direct measurement, I mean a sub-project or sprint designed specifically to measure maintainability.  It can be performed throughout a project to make sure the project stays maintainable, early in a project, or, most stressfully, at the end of a project.  When planning a project, I propose that you keep some smaller features or modifications separate from the main project.  These become the maintenance projects: perhaps implementing AJAX on the dropdown list boxes across multiple screens, refactoring the code to implement a new type of validation, or scaling from 100 simultaneous users to 5,000.  The keys are:

  1. Keep the project relatively small
  2. Estimate the effort that it should take to do the maintenance project
  3. Do not include it in the main project plan or coordinate it with the original team
    Optimally, other resources do the maintenance.  If the team/organization is too small then use the same team, but try to assign tasks to developers who have not worked on that part of the system previously.  An alternative is to use a 3rd party contractor.
  4. A different team member or team implements the maintenance project
  5. Measure the actual effort vs estimated effort

The team will be implementing functionality that is part of the project, so very little additional effort is wasted.  To keep it as natural as possible, it is best to keep the project details out of the standard project plan; that way it is more like true maintenance of unplanned functionality.  In addition, direct measurement is by definition fairly accurate, and it avoids the problem of coding for the sake of the analysis (you know, like when a developer writes a unit test with no real assertions just so the code coverage shows 100%, as if anybody would do that!).  This is not to say you cannot or should not include other code analysis, but relying purely on code analysis to determine the maintainability of code provides only a partial answer to a complex question.

A quick meeting to discuss the results is required.  The maintenance team should have notes indicating whether anything made their task harder or easier, along with any recommendations.  You should try to avoid direct architectural or design recommendations unless the original team asks for them.  The meeting might feel confrontational, but it should be emphasized that it is a learning experience for all involved.  The basis for comparison is actual vs estimated effort; any statistically significant deviation indicates a likely problem.  If necessary, adjustments to the estimate may be made by team consensus, but that should be rare, as the projects should be small in scope.  The main advantage of measuring maintainability earlier in a project is that, like most things in software projects, the earlier you find the problems, the better it is for everyone (well, except for the high-priced consultant you would have had to pay at the end of the project to tell you that your project is screwed).
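To make the actual-vs-estimated comparison concrete, here is a minimal sketch; the maintenance projects, hours, and the 25% threshold are all invented placeholders rather than recommendations:

```python
# Sketch: flag maintenance projects whose actual effort deviates noticeably
# from the estimate. All data and the threshold are illustrative assumptions.

def deviation(estimated_hours: float, actual_hours: float) -> float:
    """Relative deviation of actual effort from the estimate."""
    return (actual_hours - estimated_hours) / estimated_hours

maintenance_projects = {
    "AJAX dropdowns": (40, 44),          # (estimated, actual) hours
    "New validation rule": (32, 70),
    "Scale to 5,000 users": (80, 95),
}

THRESHOLD = 0.25  # arbitrary cut-off standing in for "significant deviation"

for name, (estimated, actual) in maintenance_projects.items():
    d = deviation(estimated, actual)
    flag = "investigate" if abs(d) > THRESHOLD else "ok"
    print(f"{name}: {d:+.0%} vs estimate -> {flag}")
```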