Checklist for Software Reliability and Availability Acceptance Criteria
Software reliability is defined as the probability that the software executes without failure for a specified amount of time in a specified environment. The longer a system runs without failure, the more reliable it is.
Many reliability models are available for predicting the reliability of software. A software reliability model provides a family of growth curves that describe the decline of the failure rate as defects are reported and fixed during the system-testing phase.
The failure rate is often calculated in terms of the mean time between failures (MTBF).
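As a minimal sketch of the relationship just described, the MTBF can be estimated as the average gap between successive failure timestamps, and the failure rate is its reciprocal (the timestamps below are illustrative):

```python
# Hypothetical sketch: estimating MTBF and failure rate from observed
# failure times. MTBF is the mean gap between successive failures.

def mtbf(failure_times):
    """Mean time between failures from a sorted list of failure timestamps (hours)."""
    gaps = [b - a for a, b in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps)

times = [0.0, 120.0, 260.0, 420.0, 600.0]  # hours at which failures were observed
m = mtbf(times)       # mean time between failures, in hours
rate = 1.0 / m        # failure rate, in failures per hour
print(m, rate)
```

A falling failure rate across successive test intervals is the "growth" that a reliability growth model fits a curve to.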
A growth model can answer the following questions, which can be part of the reliability acceptance criteria:
1. What is the current failure rate of the software?
2. What will be the failure rate if the customer continues acceptance testing for a long time?
3. How many defects are likely to be in the software?
4. How much testing has to be performed to reach a particular failure rate?
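The questions above can be sketched against one common growth model, the Goel-Okumoto model, in which the expected cumulative defect count is mu(t) = a(1 - e^(-bt)) and the failure intensity is lambda(t) = ab·e^(-bt). The parameter values below are assumptions standing in for parameters fitted to real defect data:

```python
import math

# Hypothetical sketch using the Goel-Okumoto reliability growth model.
# a (expected total defects) and b (detection rate) are assumed to have
# been fitted to the defect-discovery data, e.g. by maximum likelihood.

a, b = 120.0, 0.01           # assumed fitted parameters
t_now = 200.0                # hours of testing completed so far

def intensity(t):
    """Failure rate lambda(t) = a*b*exp(-b*t), failures per hour."""
    return a * b * math.exp(-b * t)

current_rate = intensity(t_now)          # Q1: current failure rate
residual = a * math.exp(-b * t_now)      # Q3: defects likely remaining
target = 0.05                            # desired failures/hour
t_needed = math.log(a * b / target) / b  # Q4: total test time to hit target
print(current_rate, residual, t_needed)
```

Question 2 is answered by the model's limit: as testing time grows without bound, lambda(t) tends to zero and the cumulative defect count approaches a.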
An acceptable failure-rate goal must be set separately for each level of problem severity, from critical to low.
A customer may be willing to tolerate tens of low-severity issues per day but not more than one critical problem in a year.
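The per-severity tolerances just mentioned can be encoded as explicit failure-rate limits and checked against observed rates. The severity names, thresholds, and observed values below are illustrative assumptions:

```python
# Hypothetical sketch: per-severity failure-rate goals, in failures per
# hour, compared against observed rates. All numbers are illustrative.

HOURS_PER_YEAR = 24 * 365

goals = {                             # maximum acceptable failures per hour
    "critical": 1 / HOURS_PER_YEAR,   # at most one critical problem a year
    "low": 10 / 24,                   # tolerate tens of low-severity issues a day
}

observed = {"critical": 0.00005, "low": 0.3}  # assumed measured rates

for severity, limit in goals.items():
    ok = observed[severity] <= limit
    print(severity, "meets goal" if ok else "exceeds goal")
```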
System availability consists of proactive methods for maximizing service uptime, minimizing downtime, and minimizing the time needed to recover from an outage. Downtime is measured in terms of the mean time to repair (MTTR).
Gathering an operational profile from the customer facilitates re-creating the customer's environment in the test laboratory. An operational profile describes the ways the system is expected to be used in the field. While tuning the system's parameters to match that profile, one can uncover several deficiencies in the system, and such tuning improves the system's availability level.
Customers must be willing to share the operational profile of their computing environment, which may be proprietary information, in order to improve the target availability level.
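An operational profile is commonly represented as a probability distribution over the system's operations, from which test cases are drawn in proportion to expected field usage. The operation names and weights below are illustrative assumptions, not taken from the text:

```python
import random

# Hypothetical sketch: an operational profile as a probability
# distribution over operations; tests are sampled to mirror field usage.

profile = {                  # illustrative operations and usage weights
    "query_account": 0.60,
    "transfer_funds": 0.25,
    "update_profile": 0.10,
    "close_account": 0.05,
}

def sample_operations(profile, n, seed=42):
    """Draw n operations to test, weighted by the operational profile."""
    rng = random.Random(seed)
    ops, weights = zip(*profile.items())
    return rng.choices(ops, weights=weights, k=n)

tests = sample_operations(profile, 1000)
print(tests[:5])
```

Testing according to such a profile concentrates effort on the operations the customer actually exercises, which is what makes the resulting reliability and availability estimates meaningful for acceptance.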