Collection of Useful Metrics by Software Testing Managers for Effective Test Management
Software testing managers often face a vital and tricky question: “What metrics should be collected?” There is no single answer to this question, because metrics depend on the varying needs of each development and testing organization. In fact, software testing is itself a measurement activity: it collects various metrics about the quality of a software application that is usually developed by a different group of people.
The Software Engineering Institute (SEI) has prescribed four basic metrics areas:
Although it is difficult to prescribe a particular set of metrics for every organization, it is advisable to work out at least one metric for each of the four areas prescribed by the SEI.
Collecting and using metrics is a complex process, so this article describes tips and strategies that software testing managers adopt when deploying metrics across their testing effort. The discussion should help testing managers use accurate metrics for decision making, planning time estimates, tracking progress, and improving their current processes.
The metrics tabulated below are good examples for both development and testing projects. Many more working examples can be created for each of the five metrics in the table.
|Sr.|Metric|For Software Testing|For Software Development|
|---|---|---|---|
|1|Size|Number of modules, lines of code, or test cases.|Number of modules or lines of code.|
|2|Schedule|Number of test cases written or executed versus the timeline.|Number of modules completed versus the timeline.|
|3|Resources|Money spent, hours of work.|Money spent, hours of work.|
|4|Quality|Defect Removal Efficiency (DRE), coverage.|Number of defects per line of code.|
|5|Rework|Number of test cycles to test bug fixes.|Lines of code written to fix bugs.|
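Defect Removal Efficiency, listed in the Quality row, is commonly computed as the share of all known defects that were caught before release. The following is a minimal sketch assuming that common definition. The function name and the sample counts are illustrative and do not come from any particular project.

```python
def defect_removal_efficiency(found_before_release: int, found_after_release: int) -> float:
    """Return DRE as the percentage of all known defects caught before release."""
    total = found_before_release + found_after_release
    if total == 0:
        raise ValueError("no defects recorded, so DRE is undefined")
    return 100.0 * found_before_release / total


# Hypothetical counts: 90 defects found during testing, 10 reported after release.
print(defect_removal_efficiency(90, 10))  # 90.0
```

Because the count of defects found after release keeps growing for some time, a DRE figure computed this way is only a snapshot and tends to drop as field reports arrive.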
How Do Test Managers Use Metrics?
Test managers, as well as software testing engineers, developers, development managers, and other personnel in the development team, use metrics for the following activities.
1) Identification of risky areas needing additional testing:
Experts note that areas of a system that have been the source of many defects in the past are very likely to be a good place to look for defects now and in the future. By collecting and analyzing defect density by module, the tester can identify potentially risky areas that warrant additional testing. Pareto analysis serves the same purpose by highlighting the few modules that account for most of the defects. Likewise, using a tool to analyze the complexity of the code can help identify potentially risky areas of the system that require a greater testing focus.
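Defect density by module and a simple Pareto cut can be sketched as follows. This is only an illustration: the module names, defect counts, sizes, and the 80 percent cut-off are all assumptions chosen for the example.

```python
def defect_density(defects: dict[str, int], kloc: dict[str, float]) -> dict[str, float]:
    """Defects per thousand lines of code (KLOC) for each module."""
    return {module: defects[module] / kloc[module] for module in defects}


def pareto_modules(defects: dict[str, int], share_percent: int = 80) -> list[str]:
    """Modules, taken worst first, until they account for share_percent of all defects."""
    total = sum(defects.values())
    selected, running = [], 0
    for module, count in sorted(defects.items(), key=lambda item: item[1], reverse=True):
        if running * 100 >= share_percent * total:
            break
        selected.append(module)
        running += count
    return selected


# Hypothetical defect counts and module sizes in KLOC.
defects = {"billing": 45, "login": 5, "reports": 35, "search": 15}
kloc = {"billing": 9.0, "login": 2.5, "reports": 3.5, "search": 5.0}

print(defect_density(defects, kloc))
print(pareto_modules(defects))  # ['billing', 'reports']
```

In this made-up data the module with the most defects (billing) is not the one with the highest density (reports), which is one reason to look at both views before deciding where to add tests.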
2) Identification of additional training needs:
Metrics that measure the type and distribution of defects in the software, testware, or process can help identify training needs. For instance, if a certain type of defect, such as a memory leak, is encountered on a regular basis, it may indicate that training is required on how to prevent this type of bug. Or, if a large number of “testing” defects are discovered (e.g. incorrectly prepared test cases), it is a strong indication that more training in test case design is needed.
3) Identification of process improvement opportunities:
Similar analysis can be used to find opportunities for process improvement. Rather than providing training, maybe the process can be improved or simplified, or maybe a combination of the two can be used. As another example, if the test manager found that a large number of the defects discovered were requirements-related, the manager might conclude that the organization needed to implement requirements reviews or preventive testing techniques.
4) Providing a basis for estimating:
Without some kind of metrics, managers and practitioners are helpless when it comes to estimation. Estimates of how long the testing will take, how many defects are to be expected, how many testers are needed, and other variables have to be based on previous experience. That previous experience is itself a form of metrics, whether it is formally recorded or simply remains in the mind of the software engineer.
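As a rough illustration of estimating from recorded experience, the sketch below derives an average execution rate from past releases and applies it to a new test set. The history records, the function name, and the assumption that effort scales linearly with the number of testers are all hypothetical simplifications.

```python
def estimate_execution_days(planned_tests: int, history: list[tuple[int, float]], testers: int) -> float:
    """Estimate elapsed days from past (tests executed, tester-days spent) records."""
    total_tests = sum(tests for tests, _ in history)
    total_tester_days = sum(days for _, days in history)
    tests_per_tester_day = total_tests / total_tester_days
    return planned_tests / tests_per_tester_day / testers


# Hypothetical history from three past releases: (tests executed, tester-days spent).
history = [(400, 50.0), (300, 30.0), (500, 70.0)]

print(estimate_execution_days(planned_tests=480, history=history, testers=4))  # 15.0
```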
5) Providing metrics to trigger actions:
Metrics can be used as a trigger or threshold signaling that an action needs to be taken. Examples are exit criteria, smoke tests, and suspension criteria. These are regarded as mature metrics because, to be effective, they must be planned in advance and based on criteria established earlier in the project or on a previous project. Some exceptions always remain, however. Some organizations ship the product on a specified day, irrespective of the consequences. This is a real example of a metric, the calendar date, triggering an action, the shipment of the product.
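A threshold check of this kind might look like the sketch below. The three criteria, their default thresholds, and the `ExecutionStatus` fields are assumptions made for illustration. Real exit criteria would come from the test plan.

```python
from dataclasses import dataclass


@dataclass
class ExecutionStatus:
    executed: int
    passed: int
    open_critical_defects: int
    requirement_coverage: float  # fraction of requirements exercised, 0.0 to 1.0


def unmet_exit_criteria(status: ExecutionStatus,
                        min_pass_rate: float = 0.95,
                        min_coverage: float = 0.90) -> list[str]:
    """Return the exit criteria not yet satisfied; an empty list means all are met."""
    unmet = []
    pass_rate = status.passed / status.executed if status.executed else 0.0
    if pass_rate < min_pass_rate:
        unmet.append("pass rate")
    if status.open_critical_defects > 0:
        unmet.append("open critical defects")
    if status.requirement_coverage < min_coverage:
        unmet.append("requirement coverage")
    return unmet


status = ExecutionStatus(executed=200, passed=196, open_critical_defects=1, requirement_coverage=0.92)
print(unmet_exit_criteria(status))  # ['open critical defects']
```

Returning the list of unmet criteria, rather than a single yes or no, tells the team which action the metric is actually triggering.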
6) Justification of budget, infrastructure, or training:
Test managers commonly feel that they are understaffed and need more people, a bigger budget, or more training. Without good metrics to support that feeling, however, their requests are unlikely to succeed. Test managers need sound metrics to justify their requests for additional staff, budget, and training.
7) Controlling & tracking status:
Test managers (and testers, developers, development managers, and others) need to use metrics to control the testing effort and track progress. For example, most test managers use some kind of measurement of the number, severity, and distribution of defects, and number of test cases executed, as a way of marking the progress of test execution.
Popular testing-related metrics, grouped by the kind of control they provide, are listed below.
A) Groups of Metrics based upon the Project progress
1) Related to test planning and monitoring:
# Number of tasks started
# Number of tasks completed
2) Related to test development:
# Number of specified and approved test procedures
# Relevant coverage achieved in the specification, for example, for code structures, requirements, risks, business processes
# Number of other tasks completed
3) Related to test execution and reporting:
# Number of executed or initiated test procedures
# Number of passed test procedures
# Number of passed confirmation tests
# Number of test procedures run as regression testing
# Number of other tasks completed
4) Related to test closure:
# Number of tasks completed
For each of these groups described above we can collect metrics for:
# Time spent on specific tasks both in actual working hours and elapsed time
# Cost both from time spent and from direct cost, such as license fees
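The execution counts and the cost figures in this group can be turned into simple status numbers. The sketch below assumes hypothetical function names and sample values; it is one possible way to combine the counts, not a prescribed formula.

```python
def execution_progress(planned: int, executed: int, passed: int) -> dict[str, float]:
    """Percentage of planned test procedures executed, and of executed procedures passed."""
    return {
        "executed_pct": 100.0 * executed / planned if planned else 0.0,
        "passed_pct": 100.0 * passed / executed if executed else 0.0,
    }


def task_cost(hours_worked: float, hourly_rate: float, direct_costs: float = 0.0) -> float:
    """Cost from time spent plus direct costs such as license fees."""
    return hours_worked * hourly_rate + direct_costs


# Hypothetical status for one test cycle.
print(execution_progress(planned=250, executed=200, passed=180))
# {'executed_pct': 80.0, 'passed_pct': 90.0}
print(task_cost(hours_worked=120, hourly_rate=50.0, direct_costs=1500.0))  # 7500.0
```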
B) Groups of Metrics based upon the coverage:
# Number of coverage elements covered by the executed test procedures
# Code structures covered by the test
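A coverage figure of this kind can be derived from a mapping between executed test procedures and the elements they exercise. In the sketch below the requirement IDs, the procedure names, and the mapping itself are hypothetical.

```python
def coverage_percent(all_elements: set[str], covered_by_procedure: dict[str, set[str]]) -> float:
    """Percentage of coverage elements exercised by at least one executed test procedure."""
    if not all_elements:
        return 0.0
    covered = set().union(*covered_by_procedure.values()) & all_elements
    return 100.0 * len(covered) / len(all_elements)


# Hypothetical coverage elements (requirements) and executed test procedures.
requirements = {"R1", "R2", "R3", "R4"}
executed = {"TP-01": {"R1", "R2"}, "TP-02": {"R2", "R4"}}

print(coverage_percent(requirements, executed))  # 75.0
```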
C) Groups of Metrics based upon the incidents:
# Number of reported incidents
# Number of incidents of different classes, for example, faults, misunderstandings, and enhancement requests
# Number of defects reported to have been corrected
# Number of closed incident reports
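The incident counts above are straightforward to produce from an incident log. The sketch below uses a small made-up log; the IDs, class names, and status values are assumptions.

```python
from collections import Counter

# Hypothetical incident log: (incident id, class, status).
incidents = [
    ("INC-1", "fault", "closed"),
    ("INC-2", "fault", "open"),
    ("INC-3", "enhancement request", "open"),
    ("INC-4", "misunderstanding", "closed"),
    ("INC-5", "fault", "corrected"),
]

by_class = Counter(kind for _, kind, _ in incidents)
by_status = Counter(status for _, _, status in incidents)

print(len(incidents))       # 5 reported incidents
print(by_class["fault"])    # 3 incidents classified as faults
print(by_status["closed"])  # 2 closed incident reports
```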
D) Groups of Metrics based upon the confidence:
# Subjective statements about confidence from different stakeholders
Nine Rules of Thumb for Collecting & Using Metrics:
Experts prescribe the following rules of thumb for collecting and using metrics.
1) Question the developers & software testing engineers:
Developers and software testing engineers are the best first-hand source of information about which metrics would help them do their jobs better. If these people do not believe in the metrics, or feel that management is thrusting another useless metric on them, they may resist by avoiding collection of the metric or by falsifying it and writing down whatever comes to hand.
2) Use One Metric to Validate Another:
Rarely do we have enough confidence in any one metric to base major decisions on it alone. In almost every instance, managers are well advised to validate one metric with another. For greater test effectiveness, it is good practice to assess each key measure with more than one metric (e.g., a measure of coverage together with Defect Removal Efficiency (DRE), or another metric such as defect age). Likewise, a test manager would not recommend the release of a product based on a single measurement, but would rather base such a decision on information about defects encountered and remaining, the coverage and results of test cases, and so on.
3) Normalize the Values of the Metric:
Since every project, system, release, person, etc. is unique, all metrics will need to be normalized. It is a good idea to reduce the amount of normalization required by comparing similar objects rather than dissimilar objects (e.g., it would be better to compare two projects that are similar in size, scope, complexity, etc. to each other than to compare two dissimilar projects and have to attempt to quantify the impact of the differences).
As a rule of thumb, a metric should stay as close to the truth as possible. For example, if you don’t have a reservoir of data, you could compare your project to industry data. This may be better than nothing, but you would have to try to account for differences in company cultures, methodologies, etc. in addition to the differences in the projects. A better choice would be to compare a project to another project within the same company. Even better would be to compare your project to a previous release of the same project.
4) Measure the Value of Collecting the Metric:
Collecting and analyzing metrics can be extremely time consuming. Seasoned test managers weigh the value of every metric collected vis-à-vis the effort required to collect and analyze the data. Software inspections are a good example. A normal part of the inspection process is to measure the number of defects found per man-hour. This data needs careful handling, because a successful inspection program should reduce the number of defects found in future efforts as the trends and patterns of the defects are fed back into process improvement (i.e., the number of defects found per man-hour should go down). Another example is the collection of data that no one uses: there might be a field on the incident report that the author must complete but that no one ever reads.
5) Revalidate the Need for Each Metric Regularly:
Good test managers routinely re-evaluate the value of collecting each metric to see whether there is a continuing need for it. Metrics that are useful for one project may not be as valuable for another (e.g., the amount of time spent writing test cases may be useful for systematic testing approaches, but has no meaning if exploratory testing techniques are used). A metric that is quite useful at one point in time may eventually outlive its usefulness.
6) Simplify the Process of Collection & Analysis of Metric:
It is ideal to collect metrics automatically. For example, the number of lines of code can be counted automatically by the compiler. Collecting metrics as a by-product of some other activity or data collection activity is almost as good as collecting them automatically. For example, defect information must be collected in order to isolate and correct defects, but this same information can be used to provide testing status, identify training and process improvement needs, and identify risky areas of the system that require additional testing.
Keep in mind that some metrics will have to be collected manually. For example, test managers may ask their testers to record the amount of time they spend on various activities such as test planning.
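Automatic collection of a size metric can be very small. The sketch below counts lines of code in a piece of source text under an assumed convention: blank lines and whole-line comments are excluded. Other counting conventions exist and give different numbers.

```python
def count_code_lines(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(comment_prefix)
    )


# Hypothetical source snippet used only to exercise the counter.
sample = """# compute a total
total = 0

for value in (1, 2, 3):
    total += value  # running sum
"""

print(count_code_lines(sample))  # 3
```

Whatever convention is chosen, applying the same one across releases matters more than the convention itself, since the counts are mainly used for comparison.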
7) Maintain the Confidentiality of Data:
It is extremely important for the test managers to understand that certain data may be sensitive to other groups or individuals, and act accordingly. Test managers could benefit by understanding which programmers have a tendency to create more defects or defects of a certain type. While this information may be useful, the potential to alienate the developers should cause test managers to carefully weigh the benefit of collecting this information. Other metrics can be organizationally sensitive. For example, in some classified systems, the information about defects itself can be considered to be classified.
8) Look for Alternate Interpretations:
Generally there can be more than one way to interpret the same data. If, for example, you decide to collect information on the distribution of defects by programmers, you could easily assume that the programmers with the most defects are bad programmers. Upon further analysis, though, you may find out that they just write a lot more code, are always given the hardest programs, or have received particularly poor specifications.
9) Customize the Format of the Data as per the Audience:
Test managers are usually given the opportunity to brief developers, users, upper management, marketing, and many other interested parties. When presenting the data, it is important for the test manager to consider the background of the audience and their needs. For example, data that shows testing progress might be presented to the users in a different format than to the developers. The users may want to know how much of the functionality has passed the test, while developers might want to see the data presented with respect to the amount of code that was tested.
Advantages of Deploying Metrics: Metrics help software testing managers in the following ways.
1) They help identify risky areas that may need more testing, training needs, and process improvement opportunities.
2) They help control and track project status, and they provide a basis for estimating how long the testing will take.
Important Lessons Learnt:
1) Developers and software testing engineers should be involved in deciding what metrics will be used and how they will be collected.
2) By collecting and analyzing defect density by module, the software-testing engineer can identify potentially risky areas that need additional testing.
3) When you can measure a thing and express it in numbers, you know something about it; when you cannot measure it and cannot express it in numbers, your knowledge of it is limited. In other words, a thing that cannot be measured cannot be controlled.
4) Compelling people to use metrics without proper explanation and implementation is counterproductive.
5) Collecting metrics as a by-product of some data collection activity is as good as collecting them automatically.
6) The farther a metric is from the truth, the less reliable it becomes.
7) Every measurement must be linked to a need; otherwise it is a sheer waste of time.
8) Use of an incorrect metric can lead to serious consequences, such as dissatisfaction among employees or other unpleasant situations in the organization.