Artificial Intelligence and Machine Learning for Software Engineering

Speaker: Prof. Dr. Jürgen Döllner

Advancing machine intelligence also means expanding our ability to reach beyond our human limits. Jürgen Döllner, professor for Computer Graphics Systems at the Hasso-Plattner Institute in Potsdam, Germany, recently spoke about the potential of Artificial Intelligence (AI) to transform the structure of software development, its code analysis and perhaps, eventually, code production itself. Some fifty guests attended the presentation, including representatives of a number of companies at the forefront of today’s technology market.

Since Joseph Weizenbaum introduced the dialogue between man and machine with his program ELIZA in 1966, the notion of AI has created initial excitement, gone through a phase of disappointment, and now gained tremendous momentum. As might be expected, researchers and software engineering pioneers have begun to harvest the potential of Artificial Intelligence to fundamentally change how software development will work in the future. Its ability to gather and analyse huge amounts of data automatically, to detect patterns, and to process and generate natural language could free developers from many time-consuming, non-creative but important tasks.

Beyond absorbing mundane but important tasks such as writing source code documentation, and thereby freeing people for work that calls for more creativity and intuition, AI also shows promise for revolutionising software design and for finally aligning business architecture with software architecture. This would ultimately shift human software engineers into a role in which they understand and oversee the business requirements, the definition of the underlying functional and non-functional technical requirements, and finally the transformation of these into a technical model.

Advancement naturally brings debate: the prospect of a machine replacing a developer at a lower cost in time and money has caused murmurs of apprehension in the community. However, Prof. Döllner stressed that the idea of AI replacing entire teams of software developers remains futuristic, pointing to how difficult it is to give AI the ability to synthesise a program from user intent. Much progress remains to be made before we reach the scenario of Inductive Program Synthesis, in which the developer tells the machine what they want and the machine produces code from the examples and instructions given. Many past developments increased the efficiency of software development and carried the threat of making software developers unnecessary, yet today’s market shows that the need for developers is greater than ever.

Rather than revolving around this abstract notion of AI taking over the active role of the software developer, the present excitement among software developers centres on three questions:

How can AI help to simplify the application of neural networks and other machine learning approaches?
Will AI revolutionise code analysis, support Software Engineering processes, and become a technical cornerstone of the software team that makes software development more efficient?
Will AI ultimately come to the state of being able to produce working code that carries business logic?

The answer to the third question points towards Inductive Program Synthesis and therefore depends on the scientific progress that AI researchers make between now and some indefinite point in the future. Current developments in machine learning, by contrast, offer significant promise for the first two.

Before going into the details of each of the three questions, it is worth explaining some fundamental concepts. Within the last few years, AI has been successfully trained on natural language corpora to solve tasks such as text translation, text classification and summarisation. Two developments have made this possible: the availability of vast amounts of data, and new algorithms and computer hardware that can process this data efficiently. Prof. Döllner pointed out that cloud-hosted software repositories such as GitHub have recently made vast amounts of code available, and that the ‘language’ of this code consists of programming languages. If we treat the source code in these repositories as ‘corpora’, in the same way as the natural language text used to train AI, then AI could be trained on these codebases to solve similarly complex tasks, such as summarising source code to generate comments.
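To make the idea of source code as a ‘corpus’ concrete, the following minimal sketch turns a code snippet into a flat sequence of tokens, the form in which a language model would typically consume it. It was not part of the talk: the function name code_tokens and the sample snippet are assumptions made for illustration, and the sketch relies only on Python's standard tokenize module.

```python
import io
import tokenize
from collections import Counter

# Token types that describe layout rather than content; dropped here for brevity.
LAYOUT_TOKENS = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def code_tokens(source: str) -> list[str]:
    """Split Python source into a flat token sequence (hypothetical helper)."""
    reader = io.StringIO(source).readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(reader)
        if tok.type not in LAYOUT_TOKENS
    ]


# Illustrative snippet standing in for one file of a large repository.
snippet = "def add(a, b):\n    return a + b\n"
tokens = code_tokens(snippet)

# Token frequencies are the simplest possible 'vocabulary' of such a corpus.
vocabulary = Counter(tokens)
print(tokens)
```

A real training pipeline would run such a step over many repositories and usually apply sub-word tokenisation afterwards; this sketch only shows the first step.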

Simplifying the Application of Machine Learning

For Machine Learning to play an active, widespread role in software development, scientists also face the challenge of simplifying the process of developing an AI-based application. Developing such an application currently requires enormous expertise, even though much progress has been made with the introduction of AI platforms and toolkits such as TensorFlow. Designing the machine learning model, including its topology, characteristics and parameters, requires a lot of experience and manual work, which makes it difficult for an ordinary development team without specialist AI expertise to use AI efficiently. However, there is significant promise in the fact that AI algorithms can themselves be applied to the process of developing AI-based applications, helping to automate that process and ease this challenge.
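One very simple form of automating model design is to let a program, rather than a person, choose a model parameter by trying candidates against held-out data. The sketch below does this for the neighbourhood size k of a toy nearest-neighbour classifier. It is an illustration only, not the approach presented in the talk: the data, the candidate values and the helper names knn_predict, accuracy and select_k are all assumptions, and real automated machine learning tools search far larger design spaces.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points that are classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)


def select_k(train, validation, candidates):
    """Try every candidate k and return the best one together with all scores."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores


# Hypothetical one-dimensional data: (feature, label); (1.8, "b") is deliberate noise.
train = [(1.0, "a"), (1.5, "a"), (1.8, "b"), (2.0, "a"),
         (3.0, "b"), (3.5, "b"), (4.0, "b")]
validation = [(1.2, "a"), (1.7, "a"), (3.2, "b"), (3.8, "b")]

best_k, scores = select_k(train, validation, candidates=[1, 3, 5])
print(best_k, scores)
```

Odd candidate values are used so that a vote between two labels cannot end in a tie.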

Machine Learning: Conducting Code Analysis and Supporting Software Engineering Processes

Artificial Intelligence has the potential to innovate the process of code analysis, which could prove instrumental for large organisations trying to manage and maintain large, complex codebases that have accumulated over many years of development. In large companies where millions of lines of code have been written over the years, it can be difficult for developers to find and understand code that a team wrote ten years ago as they try to maintain and modernise the codebase to suit the company’s current needs. Through source code summarisation, i.e., generating short, descriptive, function-name-like summaries for code snippets, AI can transform the way developers search for and understand code within an organisation. This would help ensure that developers’ time is used as effectively as possible.

Another way in which AI could improve the precision of software development is Comment Generation, which has recently been accomplished with a Recurrent Neural Network (RNN) model, CODE-NN. By building a language model for natural language comments and aligning the words in comments directly with individual code tokens, CODE-NN bridges the gap between the verbal, semantic thinking of developers and the numeric logic of code. Because AI operates with the rigour of a machine and is not subject to the lapses of human developers, it could generate comments with greater accuracy, and it has the potential to reduce the frequent occurrence of mismatched, missing or outdated code comments that result from tight project schedules or other causes.

Given that more than 60% of developer effort is spent on program comprehension, this could free a significant share of human energy for areas that are otherwise overlooked, and help ensure that errors in the code are found and addressed.

In a world where pervasive technology leads to a constant and fluid exchange of information, a primary concern of such organisations is keeping the codebase as secure as possible. Code clones arise either through intentional reuse of the same lines of code or unknowingly, when an engineer happens to write the same code again. They can pose a significant threat to a company in the event of a security breach, because the weakness is then present not only in the original code but also in every cloned copy throughout the codebase. Since fixing an issue depends on finding all of the clones that share the defect, code clone detection has been a field of active research for many years. AI is expected to boost the accuracy of the methods currently used to tackle the problem of code clones.
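As a point of reference for what clone detection involves, the sketch below shows a classic, non-AI baseline: code fragments are reduced to token sequences in which identifiers and literals are replaced by placeholders, and fragments with identical sequences are reported as clones. This catches copies that differ only in naming. It is a minimal illustration under stated assumptions, not a method from the talk; the helper names normalised_tokens and find_clones and the sample fragments are hypothetical.

```python
import io
import keyword
import tokenize
from collections import defaultdict

LAYOUT_TOKENS = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT,
    tokenize.DEDENT, tokenize.COMMENT, tokenize.ENDMARKER,
}


def normalised_tokens(source: str) -> tuple[str, ...]:
    """Token sequence with identifiers and literals replaced by placeholders."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in LAYOUT_TOKENS:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append("ID")    # any variable or function name
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            result.append("LIT")   # any numeric or string literal
        else:
            result.append(tok.string)  # keywords and operators are kept as-is
    return tuple(result)


def find_clones(fragments: dict[str, str]) -> list[list[str]]:
    """Group fragment names whose normalised token sequences are identical."""
    groups = defaultdict(list)
    for name, source in fragments.items():
        groups[normalised_tokens(source)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical fragments: the first two differ only in the names they use.
fragments = {
    "billing.total": "def total(prices):\n    s = 0\n    for p in prices:\n        s += p\n    return s\n",
    "report.sum_all": "def sum_all(values):\n    acc = 0\n    for v in values:\n        acc += v\n    return acc\n",
    "util.largest": "def largest(values):\n    return max(values)\n",
}

clone_groups = find_clones(fragments)
print(clone_groups)
```

Exact matching of this kind misses clones that were edited after copying, which is the kind of case where learned similarity models are expected to help.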

Generating Code

To illustrate the nature of Inductive Program Synthesis, Prof. Döllner presented the audience with the ultimate goal: AI produces a program from a mere description of what it should do. This goes beyond a simpler form of Program Synthesis, in which the AI is given input-output examples and produces a program whose behaviour is consistent with them. In working towards this goal, scientists teach AI how to induce programs, for example by using a corpus of program induction problems to learn strategies that generalise across problems.

The approaches to this goal vary widely, ranging from Programming by Example, where the AI relies on search techniques and carefully crafted heuristics, to Programming by Description, where the AI synthesises programs from natural language descriptions. Program Synthesis of this kind takes a twofold approach: the machine first learns to induce programs, that is, it uses a corpus of program induction problems to learn strategies that generalise across problems, and neural network architectures are then integrated with search-based techniques rather than replacing them. The process is highly complicated. It would require a somewhat different skillset from engineers used to traditional programming techniques, and it could also bring a rise in the general standard of the code produced in software development.
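The search-based side of Programming by Example can be sketched in a few lines. The example below enumerates short programs in a tiny, made-up language of integer operations and returns the first one that is consistent with all input-output examples. It is a brute-force illustration only: the primitive set, the names run and synthesise, and the examples are assumptions, and practical systems add the carefully crafted heuristics or learned guidance mentioned above to avoid trying every candidate.

```python
from itertools import product

# Hypothetical mini-language: a program is a sequence of these unary operations.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "negate": lambda x: -x,
}


def run(program, value):
    """Apply each operation of the program to the value, from left to right."""
    for name in program:
        value = PRIMITIVES[name](value)
    return value


def synthesise(examples, max_length=3):
    """Return the shortest program consistent with all examples, or None."""
    for length in range(1, max_length + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# Input-output examples describing the intended behaviour f(x) = 2 * (x + 1).
examples = [(1, 4), (2, 6), (5, 12)]
program = synthesise(examples)
print(program)
```

The number of candidates grows exponentially with program length, which is one reason why combining search with neural network architectures, rather than replacing one with the other, is attractive.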

Amid all this, it is important not to leap to the conclusion that AI will replace all human developers in some futuristic scenario. After all, the aforementioned uses of AI for summarisation, comment generation and clone detection still require a software engineer to manage and interpret the results that the AI provides. Using AI for Program Synthesis is still very much a goal rather than a reality, and it is so intrinsically difficult that, even if it came to fruition, it would still require careful intervention from skilled engineers.

Although perhaps discomforting, it is worth remembering that destabilisation always precedes change, and that such pervasive change would only mark the natural evolution of scientific discovery. It is in the very nature of science to embrace the radical as evidence of change and progress, and software engineering is likely to experience a dramatic shift in the type of programs being produced as machines come to absorb this role. Equally, it is worth recalling the example of the nuclear car “Ford Nucleon”, where the promise shown by a new technology failed to fulfil expectations and the idea was dropped. The future always evolves quite differently from the ways we imagine, and it is wise to leave room to be surprised by how events unfold.

