How to Optimize Software Production: Why Static Code Analysis Is Not Enough

Dr. Johannes Bohnet
Jul 31, 2023 3:01:25 PM

Software production is the systematic process of developing software systems, and source code is the material from which those systems are built. Working with code is therefore at the heart of software production. In this article, we discuss the limitations of static code analysis as a means of optimizing software production and advocate a more holistic approach that places CODING, rather than CODE, at the center of consideration.

The Software Factory Metaphor

Nowadays, software production is often compared with hardware production, and there are attempts to take concepts for measuring and optimizing hardware factories and transfer them to "software factories". This is generally a good approach, because it helps software development evolve from a craft into a highly systematic production process that is measured and continuously optimized.

However, one must be aware that the hardware factory metaphor has its limitations. There are fundamental differences between hardware production and software production.

The intellectual part of hardware production does not happen in the factory. It happens in the so-called preproduction or prototyping process. The output of preproduction is a fine-grained process description of how the hardware factory shall convert raw material as input into assembled systems as output. At its core, a hardware factory is therefore about a mechanical process.

Software production also has a counterpart to the mechanical assembly in a hardware factory: the build process that converts source code into executable software. Please note that we do not use the software factory metaphor to describe the software build process. Doing so would shift the focus to a phase of software production that is already well understood and optimized. Build automation has been state of the art since the 1990s, and the challenges of setting up CI/CD (continuous integration and continuous deployment) properly are minor compared to the real challenge: ensuring optimal use of the costliest and scarcest resource in software production, the time of the developers.

We use the factory metaphor to focus on the challenging part of software development: the intellectual work of software developers who convert requirements/specifications into code. 

[Image: A factory producing software, in the style of Studio Ghibli (generated with DALL·E)]

The Role of Static Code Analysis in Software Production

Static code analysis refers to methods that analyze code with respect to structural aspects. Its purpose is not to check whether a software system behaves as expected at runtime; that is the job of dynamic analysis or testing. Just as we have excluded "build automation" from the scope of the software factory metaphor, we also exclude everything related to runtime behavior. We do this to keep the focus on the real challenge in a software factory: the optimal use of developer time.

To better understand the aim of static code analysis, let's look at its counterpart in hardware production. Imagine a hardware factory that produces radios. Static code analysis would then correspond to checking the structural quality of the assembled radio: are resistors, capacitors and wires soldered well?

In short: Static code analysis is about the structural quality of the delivered output of the software factory. It does not provide insight into the production process itself and cannot prevent inefficient use of developer time. Likewise, no one in hardware production would try to optimize the production process just by looking at the radios that leave the factory.

Above, we have discussed conceptual differences between hardware and software factories. There is one more difference: The hardware production process is a linear process that always starts from scratch. Input is converted to output. Simple. Software production, however, is an iterative process where every production cycle works on the same material. New requirements as input are built into an evolving code base.

Unfortunately, the hardware factory metaphor breaks down when the iterative character of software production is taken into account. The equivalent hardware factory would produce only one single radio and re-solder and re-wire that radio in each production cycle. But let's imagine such an unusual radio factory. Static code analysis would then be about checking how well capacitors, resistors, and wires are soldered. But it would also be about another aspect of structural quality: the understandability of the inner structure of the radio.

An engineer asked to add Bluetooth capabilities to the radio would work very slowly and cautiously when confronted with a spaghetti-like chaos of wires inside the radio. This is exactly what software developers experience when they "open" a code base that lacks an understandable inner structure: overly large, monolithic code units (circuit boards) with widespread dependencies (wires) and long methods with undocumented, complex if-else structures (electronic components whose purpose and function engineers must reverse-engineer).

Why is a code base not immediately refactored and cleaned up if it is so obvious that a complex inner structure has such a negative impact on factory performance? The reasoning is:

  • The purpose of the factory is to implement requirements, not to produce a code base with the shiniest inner(!) structure. Users and therefore product managers want features. They do not care about the inner structure of the code base. They can't even see the inner structure of a code base. Why should they deprioritize features against refactorings that would just steal precious developer time and won’t add user-value to the software? 

Hence, the problem is a communication problem between the business side and the development side. The term "technical debt" was born from this dilemma. The development side tries to explain in the language of the business side (money) that future problems are being accumulated. Unfortunately, the technical debt concept has shortcomings: 

  1. As of today, there is no practical approach to quantify the amount of technical debt in terms of money. All proposed concepts are unreliable and produce somewhat arbitrary results.
  2. It is not about the debt itself. It is about the interest rates that are paid on the debt. Same as in the financial world. Why should you repay a debt if the interest rate is zero percent? Why clean up a complex code area if nobody must change it?
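
The interest analogy in the second point can be illustrated with a small sketch. It assumes two hypothetical per-file inputs that this article does not define: a `debt_score` (for example, the number of rule violations reported by a static analysis tool) and `changes` (how often the file was modified in some observation period). Their product serves only as an illustrative proxy for "interest", not as an established formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeUnit:
    path: str
    debt_score: float  # hypothetical: e.g. number of rule violations
    changes: int       # hypothetical: modifications in the observed period


def interest_proxy(unit: CodeUnit) -> float:
    """Illustrative proxy: debt only costs something when the code is touched."""
    return unit.debt_score * unit.changes


def rank_by_interest(units: list[CodeUnit]) -> list[CodeUnit]:
    """Order units so that those paying the most 'interest' come first."""
    return sorted(units, key=interest_proxy, reverse=True)


# Made-up example data for illustration only.
units = [
    CodeUnit("legacy/report.py", debt_score=40, changes=0),
    CodeUnit("billing/invoice.py", debt_score=15, changes=12),
    CodeUnit("ui/forms.py", debt_score=5, changes=3),
]
ranked = rank_by_interest(units)
for unit in ranked:
    print(unit.path, interest_proxy(unit))
```

In this made-up data, the file with the highest debt score ranks last because nobody changes it, which is the zero-percent interest case described above.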

Consequently, using static code analysis to identify code areas that are difficult to understand is not enough. The challenge is to present far more compelling arguments to the business side, so that it agrees to postpone some of its feature wishes and allow for refactorings. Showing the number of rule violations found by a static analysis tool is usually a very unpersuasive argument.

Because of the conflicting objectives between business and developers, static code analysis is usually brought into software development at a part of the process that is entirely within the realm of the development side: the definition of coding rules. Here, however, static code analysis can only help to maintain the status quo, i.e., to ensure that no new coding rule violations are introduced. Any investment in the removal of existing rule violations would require a “negotiation” with the business side.

Changing Perspective: From Code to Coding

As the previous section has shown, static code analysis is conceptually not a suitable way to gain insight into the software production process itself. Hence, it cannot help to ensure optimal use of the time of the developers, the scarcest resource in the production process. What now? The answer is straightforward: change perspective. Shift the focus from code to coding and measure directly how software developer time is used. The approach sounds easy but is difficult in practice, for several reasons:

  • In agile methodology, time tracking is perceived as overhead for developers. Hence, no or only limited time tracking information is available.
  • Even if time tracking information were available, it would only tell how many developer hours were spent per work item (story, task, change request). It would not reveal whether existing code with technical debt was encountered while working on a work item.

Hence, a solution that measures efficient use of developer time cannot rely on time tracking information. What other data could be taken as an approximation of time? 

  • In agile methodology, teams use “story points” for upfront effort estimation of work items.
  • The problem with story points is, however, that they are not meant to be meaningful across multiple teams. Using them outside the scope of a single team would trigger hyperinflation: each team would drastically increase its number of story points to avoid being perceived as “low-performers”.

Luckily, an analytics-based approach has recently been developed that overcomes the above-mentioned obstacles. With it, the flow of developer time can be reconstructed from the technical data traces that software development infrastructure tools already collect automatically, such as code versioning systems (Git, Subversion, MKS, Mercurial, ClearCase, …).

Interestingly, although the analytics approach reconstructs the flow of developer time from the activities of software developers, it does not need to know who each developer is. That is, no personal data is processed. It is, in a sense, the proverbial "squaring the circle": the ability to see developer time flowing without the ability to track individual developer behavior. We emphasize this because an approach with which the performance of individual developers could be tracked would be a no-go for workers' councils in Europe.
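
To give a feel for the kind of data traces involved, here is a deliberately naive sketch. It is not the analytics approach described above, whose details this article does not spell out. It merely sums changed lines per file from text shaped like the output of `git log --numstat` (tab-separated added lines, deleted lines, and path) and uses the resulting share as a crude stand-in for where activity goes. The sample log, the commit header lines, and the function names are assumptions made for illustration. Note that no author information is read.

```python
from collections import defaultdict

# Made-up sample in numstat shape: "<added>\t<deleted>\t<path>".
# Lines without two tabs (commit headers, blank lines) are skipped.
SAMPLE_LOG = """\
c1
10\t2\tbilling/invoice.py
3\t1\tui/forms.py

c2
5\t5\tbilling/invoice.py
-\t-\tassets/logo.png
"""


def churn_per_file(log_text: str) -> dict[str, int]:
    """Sum added + deleted lines per path from numstat-style lines."""
    churn: dict[str, int] = defaultdict(int)
    for line in log_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit header or blank line
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files carry no line counts
        churn[path] += int(added) + int(deleted)
    return dict(churn)


def activity_share(churn: dict[str, int]) -> dict[str, float]:
    """Convert absolute churn into each file's share of total churn."""
    total = sum(churn.values())
    if total == 0:
        return {}
    return {path: lines / total for path, lines in churn.items()}


churn = churn_per_file(SAMPLE_LOG)
shares = activity_share(churn)
for path, share in sorted(shares.items()):
    print(f"{path}: {share:.0%}")
```

Changed lines are only a rough proxy for time, so a real reconstruction would need considerably more than this.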

Optimizing Software Production by Focusing on Developer Time 

With the information about the flow of developer time, we can add more analytics methods to reveal inefficiencies in the software production process. The following graph shows how developer time is consumed by various inefficiencies and only a fraction of the developer time remains for business value creation. Inefficiencies include:

  • Interest Payments on Technical Debt: Developer time is spent in code that contains technical debt. Quantifying this time addresses the communication problem between the business and development sides described in the previous section, because refactorings in these code areas have a direct return on investment.
  • Defect Fixing: Defects are not only bad from a quality assurance perspective; they typically also consume a significant fraction of developer time that cannot be used for value creation.
  • Unsteered Work: Usually unplanned fire-fighting situations where developers are forced to work without tickets and clear specifications (Story, Task, Change Request).
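
The categories above can be sketched as a simple classification over work records. Everything in the sketch is an assumption made for illustration: the `WorkRecord` fields, the ticket type `"bug"`, the precedence of the rules, and the premise that an `effort` value per record has already been reconstructed.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkRecord:
    effort: float               # hypothetical: reconstructed developer time
    ticket_type: Optional[str]  # None means no ticket was linked
    in_debt_area: bool          # hypothetical: touched code flagged as debt


def categorize(record: WorkRecord) -> str:
    """Assign one category per record; the rule order is an assumption."""
    if record.ticket_type is None:
        return "unsteered work"
    if record.ticket_type == "bug":
        return "defect fixing"
    if record.in_debt_area:
        return "interest on technical debt"
    return "value creation"


def effort_shares(records: list[WorkRecord]) -> dict[str, float]:
    """Return each category's share of the total effort."""
    totals: Counter = Counter()
    for record in records:
        totals[categorize(record)] += record.effort
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: value / grand_total for name, value in totals.items()}


# Made-up example data for illustration only.
records = [
    WorkRecord(effort=10.0, ticket_type=None, in_debt_area=False),
    WorkRecord(effort=20.0, ticket_type="bug", in_debt_area=True),
    WorkRecord(effort=30.0, ticket_type="story", in_debt_area=True),
    WorkRecord(effort=40.0, ticket_type="story", in_debt_area=False),
]
shares = effort_shares(records)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Each record lands in exactly one category here, so the shares add up to the total effort. A real analysis might split a single record across several categories.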

[Figure: waterfall chart of developer time]

Being able to quantify the efficiency of a software factory with such KPIs enables management as well as the contributing teams to locate inefficiencies and drive the organization towards less loss. In practice, software factories often operate at a lower level of efficiency than hardware factories and have huge optimization potential. The reason is that, until now, there was no analytical approach to measure efficiency in software production.

A further advantage of this data-driven approach is that the insights are derived from technical raw data. Hence, we can trace inefficiencies back to their root causes by navigating into the technical details. For example, we can identify where defect fixing activities take place in the code architecture. This gives software architects and team leads a precise x-ray picture of where problematic hotspots exist in the code. A powerful and striking way to reveal hotspots in the code architecture is to visualize code as a city. Buildings are code units, and the city structure mirrors the technical module structure. Height and color of the code buildings are used to depict inefficiency KPIs, for example the amount of developer time spent on defect fixing in the code building.

The advantage of such a visualization is that technical information can be interpreted both by the technical code experts and by managers. It helps to bridge the communication gap between business and development sides (see section above).
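
The mapping from KPIs to a code city can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of any particular tool: the `Building` type, the linear scaling, and the green-to-red color ramp are all hypothetical, and a single KPI (developer time spent on defect fixing per code unit) drives both height and color.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Building:
    path: str                    # the code unit
    district: str                # module (directory) the unit belongs to
    height: float                # scaled KPI value
    color: tuple[int, int, int]  # RGB, green (low) to red (high)


def build_city(defect_effort: dict[str, float],
               max_height: float = 100.0) -> list[Building]:
    """Map each code unit to a building whose height and color encode the KPI."""
    peak = max(defect_effort.values(), default=0.0)
    city = []
    for path, effort in sorted(defect_effort.items()):
        level = effort / peak if peak > 0 else 0.0  # normalize to [0, 1]
        red = round(255 * level)
        city.append(Building(
            path=path,
            district=str(PurePosixPath(path).parent),
            height=max_height * level,
            color=(red, 255 - red, 0),
        ))
    return city


# Made-up KPI values for illustration only.
city = build_city({
    "billing/invoice.py": 30.0,
    "billing/tax.py": 15.0,
    "ui/forms.py": 0.0,
})
for building in city:
    print(building)
```

Grouping buildings by `district` mirrors the module structure, so a cluster of tall red buildings in one district points to a hotspot in that module.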


The analytics approach works on two levels: 

  • Strategic level: Continuously measuring efficiency and steering the software factory towards fewer and fewer inefficiencies. A typical software factory has a large potential to tap; and freeing up formerly blocked developer time allows the delivery of more business value, faster, and in higher quality with the same teams.
  • Operational level: The teams drill down into root causes and define small and focused improvements that are integrated into the normal development cycles. The continuous measurement of efficiency KPIs thereby makes sure that the improvement activities have the desired impact.

As a result, the analytics approach enables software factories for the first time to establish a data-driven continuous improvement cycle, in which "becoming better" is seamlessly embedded into the daily work of the software factory, and the analytics-based insights and KPIs ensure that the factory steers towards ever higher production excellence.