Writing code that is free of defects and needless complexity is an ideal that requires continuous effort and attention. Failing to follow coding best practices and standards can increase the probability of defects in code fragments. Coding styles that tend to induce such defects include large code bases, functions with many arguments, deeply nested and branched conditionals, and sparse comments. Homplexity is a Haskell package that functions as a code quality checking tool for Haskell code. It evaluates code quality and complexity by measuring the relative length and depth of declarations, together with code, comment and type metrics and the code-to-comment ratio, in order to warn about code fragments that may be inefficient or defective.
Homplexity helps to maintain high code quality by inspecting the source code. It computes a number of metrics over source code fragments and compares each value against a threshold assigned to that metric. When a value crosses its threshold, a message indicating the severity of the issue is displayed. In this way Homplexity finds candidate trouble spots that affect code quality, and it also points towards possible remedies, such as refactoring the fragments that were flagged.
Software complexity increases for many reasons: more lines of code, unnecessary operators and expressions, nested conditionals, too many function arguments, poorly timed or ordered function execution, dead code fragments, lack of documentation, and many others. As the interactions between code elements increase, a point is eventually reached where changes become impractical. The relationship between software complexity and maintainability has been a prominent topic of research, and one suggested solution is the use of deterministic complexity models.
There are two types of complexity. Accidental complexity arises when the programmer lacks, or the domain cannot provide, the software engineering tools needed to devise a simpler solution; it is not inherent in the problem and can in principle be reduced. Essential complexity is inherent in the problem being solved and cannot be removed.
High complexity can harm project outcomes. It increases the effort needed to release features, increases bugs through reduced maintainability and changeability, and may limit the amount of functionality that could otherwise be implemented in less complex software. Eventually the rising complexity may also start to dictate the sequence and schedule of tasks. For this reason, advocates of complexity metrics such as cyclomatic complexity advise checking the measures periodically from the very beginning of a project. This allows programmers to identify complexity problems at an early stage and to refactor and readjust as needed.
Automated code quality metrics
Automated code quality metrics allow for the qualitative and quantitative measurement of code quality, which in turn is a decisive factor in software quality. Some metrics assess qualitative properties of the code, namely maintainability, documentation, readability and understandability. These are evaluated through methods such as SLOC counts, code review, code-to-comment ratio analysis, functional testing and unit testing. Other metrics are quantitative, such as cyclomatic complexity, weighted micro function points and Halstead complexity measures. These software metrics are intended to measure complexity as well as to identify measurable parameters in the software and the relations between them.
Cyclomatic complexity is a software metric, developed by McCabe, for measuring the complexity of a complete program or of individual components such as functions, modules, methods or classes. It is the number of linearly independent control flow paths through the source code. For source code with no control flow statements there is only one path, the main program itself, and hence the cyclomatic complexity is 1. Control flow statements split the flow into multiple paths, and the cyclomatic complexity increases accordingly. The metric is computed from the control flow graph of a program, in which nodes are connected by directed edges. A node represents a basic block of code (the smallest group of commands that always execute together), and an edge connects two nodes if control can flow from the first to the second. Cyclomatic complexity is computed as:
Cyclomatic complexity = E - N + 2*P
- E = number of edges in the control flow graph
- N = number of nodes in the control flow graph
- P = number of connected components in the control flow graph (1 for a single program or function)
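As a rough illustration of the formula, the following Haskell sketch evaluates E - N + 2*P on a hand-built control flow graph. The `CFG` type, its field names and the two example graphs are assumptions made for this example only; they are not part of Homplexity. P is taken here as the number of connected components of the graph, which is 1 for a single routine.

```haskell
-- Toy control flow graph: nodes are numbered basic blocks,
-- edges are (from, to) pairs. All names are illustrative.
data CFG = CFG
  { cfgNodes      :: [Int]
  , cfgEdges      :: [(Int, Int)]
  , cfgComponents :: Int  -- P: connected components (1 for one routine)
  }

-- McCabe's formula: E - N + 2*P
cyclomatic :: CFG -> Int
cyclomatic g = e - n + 2 * p
  where
    e = length (cfgEdges g)
    n = length (cfgNodes g)
    p = cfgComponents g

-- Straight-line code: a single basic block and no edges.
straightLine :: CFG
straightLine = CFG [0] [] 1

-- if/then/else: condition (0), then branch (1), else branch (2), join (3).
ifThenElse :: CFG
ifThenElse = CFG [0, 1, 2, 3] [(0, 1), (0, 2), (1, 3), (2, 3)] 1
```

Evaluating `cyclomatic straightLine` gives 0 - 1 + 2 = 1, matching the statement that code without control flow has complexity 1, while `cyclomatic ifThenElse` gives 4 - 4 + 2 = 2, one for each independent path.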
Homplexity measures both code and declaration complexity. All measurements are fully automatic and follow the tool's configuration. The metrics are read, analyzed and evaluated against their thresholds, and output reports are generated. Homplexity considers several particularly useful groups of metrics: code metrics, type metrics and comment metrics. Code metrics relate directly to defect likelihood, complexity and understandability. Comments need to give fair coverage of the code and improve understandability, and Homplexity looks for this coverage. It is particularly important to document what each function argument does and what the purposes of classes, variables and functions are.
Homplexity will check the following metrics and their properties during code quality check routine:
- Lines of code
- Code to comments ratio
- Type tree nodes
- Number of function arguments
Code metrics have been examined since the early days of software engineering. They were used to estimate software development effort and the difficulty of changing software further. Various estimation tools estimated schedule as well as cost, using input factors such as KLOC, SLOC, function points, use cases and cost drivers. Homplexity likewise uses code metrics as a significant parameter in measuring software complexity, and it calculates code metric complexity as a function of total lines of code.
Lines of code
Lines of code (LOC) is a critical factor in measuring software development effort and correlates directly with the number of defects in the code. Refactoring and abstraction applied to existing code can themselves increase LOC, and the additional code can bring additional bugs. It is therefore recommended to keep the code base lean, using chained, short-form statements wherever possible. Homplexity relies on this correlation between LOC and bugs and uses code size as an automatic indicator of defect risk. Large code fragments are detected and highlighted as candidates for refactoring.
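A line count of this kind can be sketched in a few lines of Haskell. The version below is a deliberately simplified assumption, not Homplexity's implementation: it works on raw text rather than a parsed syntax tree, ignores block comments and trailing comments, and the names `isCodeLine`, `countLoc` and `tooLong` as well as the idea of a caller-supplied limit are invented for this example.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

-- A line counts as code if it is neither blank nor a "--" line comment.
-- Simplification: block comments ({- ... -}) and trailing comments
-- are not recognised.
isCodeLine :: String -> Bool
isCodeLine l = not (null t) && not ("--" `isPrefixOf` t)
  where
    t = dropWhile isSpace l

-- Number of code lines in a piece of source text.
countLoc :: String -> Int
countLoc = length . filter isCodeLine . lines

-- Flag a fragment whose code line count exceeds a caller-supplied limit.
-- The limit is hypothetical, not a Homplexity default.
tooLong :: Int -> String -> Bool
tooLong limit src = countLoc src > limit
```

Applied to the text of a single declaration, `tooLong` marks the declaration as a refactoring candidate once it grows past the chosen limit.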
Branching arises from conditionals and has long been used as a criterion for the complexity of both software and its logic. Homplexity evaluates each branch, since every branch opens a new path of execution inside the main program. As depth and the number of branches increase, the line of execution splits accordingly, which adds complexity both to the code base and to its execution. Each branch also increases LOC and introduces new logic at the cost of additional code. Hence branching depth is a prominent factor in how Homplexity measures the complexity of code, with regard to both its static structure and its execution paths.
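To make the notion of branching depth concrete, here is a minimal sketch over a toy expression tree. The `Expr` type and `branchDepth` function are assumptions for illustration; a real tool would walk the syntax tree produced by a Haskell parser instead.

```haskell
-- Toy expression tree, standing in for a parsed Haskell expression.
data Expr
  = Lit Int
  | Var String
  | App Expr Expr
  | If Expr Expr Expr   -- if c then t else e
  | Case Expr [Expr]    -- scrutinee and the bodies of the alternatives

-- Maximum nesting depth of conditionals (If and Case) in an expression.
branchDepth :: Expr -> Int
branchDepth (Lit _)       = 0
branchDepth (Var _)       = 0
branchDepth (App f x)     = max (branchDepth f) (branchDepth x)
branchDepth (If c t e)    = 1 + maximum (map branchDepth [c, t, e])
branchDepth (Case s alts) = 1 + maximum (map branchDepth (s : alts))

-- An if whose else branch holds a case that itself contains an if.
nested :: Expr
nested =
  If (Var "a")
     (If (Var "b") (Lit 1) (Lit 2))
     (Case (Var "c") [Lit 3, If (Var "d") (Lit 4) (Lit 5)])
```

Here `branchDepth nested` is 3: the outer `If`, the `Case` in its else branch, and the `If` inside one of the alternatives.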
Functional languages rely heavily on types, so it is logical to estimate the complexity of a function's interface by estimating how hard its type is to read and use. The interface of a function is well defined by its input and output parameters and their types, and from this definition Homplexity can read the type information and evaluate the associated complexity. Type declarations act as the API of the code, and the complexity of these interfaces is evaluated as well.
A qualitative rather than quantitative approach is prescribed for evaluating the effectiveness of comments in source code. Studies indicate that the essential qualities of good commenting are coherence, usefulness, completeness and consistency. Every comment should be useful and significant for understanding the related source code. Homplexity looks at the overall code-to-comment ratio to determine the value of the comment metric. Comment readability is another important property to measure.
Code to comments ratio
The code metric and the comment metric are combined into the code-to-comments ratio, which serves as a measure of understandability. Good comments help considerably in understanding source code and in documentation tasks. A reasonably good comments ratio is thus taken as an indication of quality code.
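The ratio itself is simple arithmetic once lines have been classified. The sketch below assumes a line-based classification in which only `--` line comments count as comments; the `LineKind` type and the function names are hypothetical and are not taken from Homplexity.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf)

data LineKind = Blank | Comment | Code
  deriving (Eq, Show)

-- Classify a line by its first non-space characters.
-- Simplification: block comments ({- ... -}) are treated as code.
classify :: String -> LineKind
classify l
  | null t              = Blank
  | "--" `isPrefixOf` t = Comment
  | otherwise           = Code
  where
    t = dropWhile isSpace l

-- Code lines per comment line; Nothing when there are no comments,
-- since the ratio is undefined in that case.
codeToCommentRatio :: String -> Maybe Double
codeToCommentRatio src
  | comments == 0 = Nothing
  | otherwise     = Just (fromIntegral code / fromIntegral comments)
  where
    kinds    = map classify (lines src)
    code     = length (filter (== Code) kinds)
    comments = length (filter (== Comment) kinds)
```

Returning `Nothing` for uncommented code keeps the "no comments at all" case distinct from a merely high ratio, so a report can treat it separately.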
Applying the Flesch-Kincaid readability test to comments as part of evaluating code quality is another planned feature. Readability tests of this kind estimate how difficult a passage is to read and understand.
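For reference, the two standard Flesch-Kincaid formulas can be written down directly. The sketch below only shows the arithmetic on counts that are assumed to be already available; how words, sentences and syllables would be counted in comments is left open, and the function names are invented for this example rather than taken from Homplexity.

```haskell
-- Flesch-Kincaid grade level from word, sentence and syllable counts.
-- Nothing when a count of zero would make the formula undefined.
fleschKincaidGrade :: Int -> Int -> Int -> Maybe Double
fleschKincaidGrade nWords nSentences nSyllables
  | nWords <= 0 || nSentences <= 0 = Nothing
  | otherwise = Just (0.39 * (w / s) + 11.8 * (y / w) - 15.59)
  where
    w = fromIntegral nWords
    s = fromIntegral nSentences
    y = fromIntegral nSyllables

-- Flesch reading ease from the same counts; higher means easier to read.
fleschReadingEase :: Int -> Int -> Int -> Maybe Double
fleschReadingEase nWords nSentences nSyllables
  | nWords <= 0 || nSentences <= 0 = Nothing
  | otherwise = Just (206.835 - 1.015 * (w / s) - 84.6 * (y / w))
  where
    w = fromIntegral nWords
    s = fromIntegral nSentences
    y = fromIntegral nSyllables
```

For a comment block of 100 words in 5 sentences with 150 syllables, the grade level works out to 0.39 * 20 + 11.8 * 1.5 - 15.59, or about 9.91.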
Type tree nodes
This is a basic metric for measuring the size of a type, analogous to LOC for general code. Experienced programmers are known to use type synonyms to simplify types and make them easier to read. Type expressions form the nodes and branches of a type tree, and evaluating them requires traversing the nodes of that tree. More type expressions in a code fragment mean more branches in the type tree and a greater depth, which leads to deeper recursive traversal and hence higher complexity. Homplexity uses tree traversal to collect the type expressions and evaluate their complexity.
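The effect of type synonyms on this metric can be seen with a small sketch. The `Type` data type below is a toy stand-in for a parsed Haskell type, and `typeNodes` is one plausible way to count tree nodes; both are assumptions for illustration and not Homplexity's own definitions.

```haskell
-- Toy representation of Haskell types; constructor names are illustrative.
data Type
  = TCon String      -- Int, Bool, or the name of a type synonym
  | TVar String      -- a, b
  | TApp Type Type   -- Maybe a
  | TFun Type Type   -- a -> b
  | TList Type       -- [a]
  | TTuple [Type]    -- (a, b)

-- Number of nodes in the type tree, found by recursive traversal.
typeNodes :: Type -> Int
typeNodes (TCon _)    = 1
typeNodes (TVar _)    = 1
typeNodes (TApp f x)  = 1 + typeNodes f + typeNodes x
typeNodes (TFun a b)  = 1 + typeNodes a + typeNodes b
typeNodes (TList t)   = 1 + typeNodes t
typeNodes (TTuple ts) = 1 + sum (map typeNodes ts)

-- [(String, Int)] -> Maybe Int
verbose :: Type
verbose =
  TFun (TList (TTuple [TCon "String", TCon "Int"]))
       (TApp (TCon "Maybe") (TCon "Int"))

-- Table -> Maybe Int, assuming: type Table = [(String, Int)]
withSynonym :: Type
withSynonym = TFun (TCon "Table") (TApp (TCon "Maybe") (TCon "Int"))
```

Under this counting, `typeNodes verbose` is 8 and `typeNodes withSynonym` is 5, which illustrates how a synonym shrinks the type a reader has to parse at the use site.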
Number of function arguments
The number of function arguments correlates directly with how easy APIs and function calls are to use. The higher the number of arguments, the greater the interaction and the associated complexity. The order in which arguments are passed and referenced can easily lead to bugs if not handled carefully. It is commonly held that the number of function arguments should be kept as low as possible. Homplexity checks function arguments and highlights function declarations that have too many, indicating that the function should be refactored to reduce its argument count.
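One simple way to approximate the argument count is to read it off the function's type. The sketch below does this for a toy type representation; the `Type` constructors, `arity`, `tooManyArgs` and the limit passed to it are all hypothetical, and counting arrows in the type is only one possible definition (counting patterns in the declaration is another).

```haskell
-- Toy function type: either a named type or an arrow between two types.
data Type
  = TCon String
  | TFun Type Type   -- argument -> result

-- Number of arguments, counted as top-level arrows along the result side.
-- A function passed as an argument counts as a single argument.
arity :: Type -> Int
arity (TFun _ result) = 1 + arity result
arity _               = 0

-- Flag a signature whose argument count exceeds a caller-supplied limit.
-- The limit is hypothetical, not a Homplexity default.
tooManyArgs :: Int -> Type -> Bool
tooManyArgs limit t = arity t > limit

-- Int -> (Int -> Int) -> Int -> Bool
example :: Type
example =
  TFun (TCon "Int")
       (TFun (TFun (TCon "Int") (TCon "Int"))
             (TFun (TCon "Int") (TCon "Bool")))
```

`arity example` is 3, so `tooManyArgs 2 example` flags it. Grouping related arguments into a single record is a common way to bring such a count down.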
Automated complexity measurement and quality checks
Homplexity also evaluates code complexity and quality through the quantitative measurement of properties such as:
- Functions per module: the density of functions in each module.
- Interactions between functions: argument passing, API interface definition and communication.
- Number of function arguments: the number of arguments passed to functions.
- Types: type definitions, function types and expression types.
- Lines of code: relates to code size and correlates with defect rate.
- Operator complexity: the quantity of operators involved in the expressions of a code fragment.
Understandability is an indirect metric that is evaluated by measuring a number of direct factors:
- Comments per function/class/type/module: every unit of the code, namely function, class, type and module, needs to be well commented and documented.
- Lines of code: lines of code need to be distributed between functions, classes and modules so that no single code unit is overburdened with too many lines.
- Flesch-Kincaid (readability tests): readability tests run over a passage or code fragment to measure how well it can be understood.
Once code quality has been measured and the factors contributing to code complexity have been traced, Homplexity provides results that:
- Highlight the complex code
- Provide complexity metrics (like SLOCCount)
- Guide refactoring (like HLint)
The aim of Homplexity is to provide automatic feedback on code complexity and quality. The measures are applied at multiple levels of granularity, from individual functions to whole programs. Homplexity applies validation criteria to metrics grouped as Code Fragment, Code Metric, Type Metric and Comment Metric. Whenever the value of a metric crosses its threshold, a message indicating the severity of the issue is displayed. A detailed report, including code highlighting and refactoring guidance, is also provided. The overall aim of pointing out quality and complexity issues in Haskell code is that appropriate refactoring and corrections are applied and good code quality is maintained.
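The threshold mechanism described above can be sketched as a small classification function. Everything in the sketch is an assumption made for illustration: the `Severity` levels, the two-level `Threshold` record, the numeric limits and the message format are not taken from Homplexity's source or configuration.

```haskell
-- Severity levels, ordered from least to most serious (illustrative).
data Severity = Info | Warning | Critical
  deriving (Eq, Ord, Show)

-- Two hypothetical limits per metric.
data Threshold = Threshold
  { warnAbove     :: Int
  , criticalAbove :: Int
  }

-- Map a measured value to a severity by comparing it with the limits.
assess :: Threshold -> Int -> Severity
assess th v
  | v > criticalAbove th = Critical
  | v > warnAbove th     = Warning
  | otherwise            = Info

-- Render a one-line message for a named metric.
report :: String -> Threshold -> Int -> String
report name th v = show (assess th v) ++ ": " ++ name ++ " = " ++ show v
```

With `Threshold 20 50`, a function of 30 lines would be reported as `Warning: lines of code = 30`, while one of 60 lines would be `Critical`. Keeping limits in a record per metric makes it straightforward to tune them for a given project.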
Some tools similar to Homplexity are used for related tasks. HLint is a Haskell tool that, on a small scale, suggests good programming style to improve source code. Its suggestions may concern functions, code simplification and the removal of redundancies. SLOCCount is another set of tools of interest: it counts the physical source lines of code (SLOC) contained in the files and subdirectories of a specified set of directories. It automatically identifies source code files, detects their file types, and counts SLOC in a number of languages.
Code quality is related to software quality, and this in turn to the productivity of teams. Using qualitative and quantitative metrics to evaluate code quality is therefore an attempt to boost productivity and collaboration. Manual attempts to trace complexity and to refactor may considerably enlarge the code base and, as a consequence, raise the defect rate. Refactoring can also introduce complexity of its own, in the form of excessive interaction at the API, function and branching levels. Considering all these aspects, it is advisable to use an automated code quality checking tool such as Homplexity, which not only provides automated checks but also assists in refactoring. Making effective use of such tools can raise the efficiency of project execution and let teams reap the benefits of automated quality analysis. Homplexity is expected to contribute further to overall code quality and to the success of software projects.