Notes from yesterday Sept 24

Measuring and Improving Your Test process - Robin Goldsmith

Where we (testing) are as an industry. Different people, different industries, similar responses.
All trying to learn. How does it affect us all? Predictability helps us measure if/how we improve.

Slide from Course materials:

Warm up exercise

An orderly testing process is commonly known & followed
Defect data is used to improve later development & tests
We know in detail how we spend our work time
Management understands and supports needs of testing
Test effectiveness is routinely measured and rewarded
High-quality systems are delivered with few defects

The value of improving our testing measurement
To my organization...
To me...
My objectives for this course are...

My objectives: common standards, common processes and definitions, to produce better services for our customers.

One reason we don't get rewarded for test effectiveness is that we don't know how to measure it.

Note: Resources:
Capers Jones, Applied Software Measurement, McGraw-Hill, 1991
Robin F. Goldsmith, Discovering REAL Business Requirements for Software Project Success,
Artech House, http://www.artechhouse.com/Default.asp?Frame=Book.asp&Book=1-58053-770-7

Section 1 of the course - Measurement and Process

Slide from the course:

The term "metric" is out of favor, not used in ISO/IEC 15939. A base (primitive or raw) measure quantifies a single attribute.
A derived measure (often used to be called a metric) combines two or more base measures using a mathematical function.
An indicator is one or more measures with which decision criteria are associated.
(David Card, "Making Measurement Understandable", IEEE Software, January-February 2000)
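
My own sketch, not from the course, of how the three terms fit together. The names (RunCounts, pass_rate, release_indicator) and the 95% threshold are made up for illustration; only the base/derived/indicator distinction comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class RunCounts:
    # Base measures: each one quantifies a single attribute.
    tests_executed: int
    tests_passed: int


def pass_rate(run: RunCounts) -> float:
    """Derived measure: combines two base measures with a function."""
    if run.tests_executed == 0:
        raise ValueError("no tests executed")
    return run.tests_passed / run.tests_executed


def release_indicator(run: RunCounts, threshold: float = 0.95) -> str:
    """Indicator: a measure with a decision criterion attached.

    The 0.95 threshold is a hypothetical value, not a recommendation.
    """
    return "go" if pass_rate(run) >= threshold else "no-go"


run = RunCounts(tests_executed=200, tests_passed=188)
print(pass_rate(run))          # 0.94
print(release_indicator(run))  # no-go
```

The raw counts alone don't support a decision; the indicator does, because the criterion is stated up front.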

I liked this slide from the course:

A guy is crawling around on his hands and knees. A cop comes by and asks him what he's doing. The guy replies:

"I'm hunting for my keys."
"Did you lose them here?" the cop asks.
"No" says the guy - "But the light is better here!"

We tend to do what is easier. We may measure something because it's something we are doing already. We tend not to measure what might make us look bad. No one wants to look bad, and
there's often a perception that the metrics will be used to "punish". So how do we measure, how do we improve? The question is what we should be measuring vs. what we can measure or prefer to measure.

DEFINITION: Defect density: Number of defects relative to size (Size= Requirements, Design, Code, Tests)
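
A quick sketch of the arithmetic, assuming the simplest reading of the definition (defects divided by size). The function name and the counts are hypothetical; the point is only that the size unit has to be stated and kept consistent.

```python
def defect_density(defects: int, size: float) -> float:
    """Defects relative to size; size must use one consistent unit."""
    if size <= 0:
        raise ValueError("size must be positive")
    return defects / size


# Hypothetical counts; size here is thousands of lines of code (KLOC).
code_density = defect_density(defects=30, size=12.0)
print(f"{code_density:.2f} defects per KLOC")  # 2.50 defects per KLOC

# Same calculation with a different size unit: number of requirements.
req_density = defect_density(defects=8, size=40)
print(f"{req_density:.2f} defects per requirement")  # 0.20 defects per requirement
```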

DEFINITION: Blocked test: a test that can't be run, usually because of some other defect (code won't install, etc.)

From course slide:
Main Reasons (Most) Measurement Programs Fail:

Fear/perception/actuality measures are misused
Lots of extra time/delay/effort to collect measures without apparent meaningful value to collector
Measuring the wrong things
- Measuring (everything!) for the sake of measuring
- Failing to turn data into meaningful information, e.g., over-generalizing from single/wrong data points
- Not taking relevant actions based on measurements (often because they're not measuring right things)

NOTE: Interesting point: use performance testing to produce input to SLAs.

NOTE: Tool to keep/track test cases "QA Center"

Noticed that there was a lot of commonality: different terminology, maybe different ways to tie testing activities together to create processes.
But still, people in class seem to have similar ideas and processes.

I think one big area for improvement for us (CIT) as an organization would be process measurement.

From course slide:

Determine benefits and value - How would measures to show results differ from measures to guide improvement?

From course slide:

Real process vs. Presumed process:

A result is the inevitable outcome of the process followed, regardless of whether it is intended, desired, or even recognized.
Knowing your process enables predicting your results

If you want to change, you must change the process. But which process are you changing? The presumed process or the real process?
Changing the presumed process, when it is in fact not the real process, will not affect the results.

Measuring must be done to the full end result. For example, tracking defects before a product ships but not the defects that occur after shipping (go live) leaves out part of the result.

The Hawthorne Effect - make sure you're not interfering with the measurement by how you're measuring.

Measure everyone against the same criterion: the end product.

Not pay per line of code
Not pay by defect discovered

Section 2 of the course - Defining the Testing Process.

Determine process capability - what/how much a process can produce when stable (in statistical control).

Example: the # of cars a specific stretch of highway (miles) can run at 70 mph.

What happens when you add another 100 cars?
Special causes, rush hour, snow, cop car, accident, etc.

Once stable, then you can take steps to improve.
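
A rough sketch of one way to check for stability. This is my assumption, not the course's method: a simplified control-chart style check that takes mean +/- 3 standard deviations from a baseline period and flags later points outside those limits as possible special causes. The data and function names are invented.

```python
import statistics


def control_limits(baseline):
    """Return (lower, center, upper) as mean -/+ 3 sample standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma


def special_cause_points(baseline, observations):
    """Return the observations that fall outside the baseline's limits."""
    lower, _, upper = control_limits(baseline)
    return [x for x in observations if x < lower or x > upper]


# Hypothetical data: defects found per week of testing.
baseline = [12, 14, 11, 13, 12, 15, 13, 12]
new_weeks = [13, 14, 31, 12]
print(special_cause_points(baseline, new_weeks))  # [31]
```

A flagged week is the "rush hour or accident" case: look for the special cause before drawing conclusions about the process itself.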

Model Fitting

Maturity model. One of the best, MOST common INDIRECT ways to measure.

One of the reasons for institutionalizing: when people change, processes don't. From course slide:

"Organization institutionalizes process via policies, standards, and organizational structures so they endure after those who originally defined them have gone".

Sometimes seen as a high overhead process. It may be a while before you see the benefit.

Robin's example of kids playing football in the last two minutes vs. pro football

Stick to the process.

NOTE: The ultimate quality is in the product. The quality is in the car, not the processes that built the car.

Robin would contend that traditional testing methods are reactive.

www.ddj.com/dept/debug/184414873
www.ddj.com/dept/debug/184414883
www.ddj.com/dept/debug/184414897
www.ddj.com/dept/debug/184414911
www.gopromanagement.com

NOTE: IEEE 829 standard was revised for 2008

Section 3 of the course - Testing Activity Measures

Top down estimating - estimate of total cost, then broken out to major areas.

Maintenance is 66-90% of system cost, mainly completion / correction of development.

Do your organization's routine measures show these effects?
*If you aren't aware, you can't correct.

9/24/2008 m 141

Fixing a requirements error will cost

10× during programming
75× - 1000× after installation (maintenance)
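
To make the multipliers concrete, a small worked example. I'm assuming the multipliers are relative to the cost of fixing the error during requirements, and the baseline of 100 is a made-up number.

```python
# Multipliers from the notes, assumed relative to fixing at requirements time.
PROGRAMMING = 10
MAINTENANCE_LOW, MAINTENANCE_HIGH = 75, 1000


def fix_cost_by_phase(requirements_cost: float) -> dict:
    """Scale a requirements-time fix cost by each phase's multiplier."""
    return {
        "requirements": requirements_cost,
        "programming": requirements_cost * PROGRAMMING,
        "maintenance_low": requirements_cost * MAINTENANCE_LOW,
        "maintenance_high": requirements_cost * MAINTENANCE_HIGH,
    }


costs = fix_cost_by_phase(100.0)  # hypothetical baseline of 100 currency units
for phase, cost in costs.items():
    print(f"{phase}: {cost:,.0f}")
# requirements: 100
# programming: 1,000
# maintenance_low: 7,500
# maintenance_high: 100,000
```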

Number of test cases does not imply good coverage.

Better to have the right 10 than the wrong 50.

Note: check STP site for Robin's V model article.

Section 4 of the course - Outcomes, Analysis, and Reporting

DEFINITION: Defect Age - the time between when a defect is injected and when it is detected. Good measure of testing process.
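
A minimal sketch of the calculation, assuming age is counted in calendar days and that the injection date is known (in practice it usually has to be estimated). The defect IDs and dates are invented.

```python
from datetime import date


def defect_age_days(injected: date, detected: date) -> int:
    """Days between when a defect was injected and when it was detected."""
    if detected < injected:
        raise ValueError("detected date is before injected date")
    return (detected - injected).days


# Hypothetical defects: (id, injected, detected)
defects = [
    ("DEF-1", date(2008, 9, 1), date(2008, 9, 3)),
    ("DEF-2", date(2008, 8, 15), date(2008, 9, 24)),
]
ages = {name: defect_age_days(inj, det) for name, inj, det in defects}
print(ages)                            # {'DEF-1': 2, 'DEF-2': 40}
print(sum(ages.values()) / len(ages))  # 21.0
```

A falling average age would suggest defects are being caught closer to where they are introduced.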

NOTE: Title your defects with the business impact.

It's important to discuss and define what is a defect and how test effectiveness will be measured.
