
Paving the Way


Seven data collection strategies enhance your quality analyses. Data and information are at the heart of good investigations and decision making, but are all kinds of data the same? What are the major categories and types of questions to ask to collect and analyze data?

 

Understanding what roles data can play in quality analyses and how to tailor what you collect to better answer these questions can accelerate learning about your products and processes, resulting in better quality answers to the important questions of your study.

There are seven types of data collection and analysis methods, each with their own strengths, weaknesses and limitations. Each strategy can be connected to the stages of the define, measure, analyze, improve and control (DMAIC) method and the problem-solving process.

Granted, each strategy can be examined in much greater detail, and each works in tandem with an extensive set of associated tools. In this article, a basic overview of each strategy highlights some of its distinguishing features and key associated terms and objectives. An extensive list of references is provided if you wish to explore further.

Be aware: There is some overlap among the different categories. An observational study can be used to monitor a process, for example, and many measurement assessment studies feature a designed experiment. With that said, the goal of these data collection summaries is to highlight the essence of each type and what might be accomplished through their use.

Table 1 shows the main elements for each category, with more details provided in the individual sections.


1. Observational studies
Common in many applications and often readily and cheaply available, observational data can provide insight into process and product characteristics. There are many types of observational data, ranging from continuous measurements to categorical (ordinal and nominal) data to text.

In the DMAIC process, this form of data can appear at any stage. Because there is no active manipulation of inputs, however, its role in determining solutions should be more limited than for data from a designed experiment.

A key role for this data is in the define stage in which observational data can provide a good overview of current status and the magnitude of an issue. It may suggest potential driving factors of a good or bad outcome to explore further. Observational data have an important role in pointing the way forward, but they should not be a primary ingredient of making final decisions.

Depending on the types of data available, exploratory and graphical tools, regression and contingency tables can be helpful for highlighting relationships in data. Patterns in the data that suggest the existence and form of the relationship between inputs and outputs can be good indicators of the general pattern.

The biggest dangers in using exploratory data are:

- Assuming a pattern observed in a convenience sample will be present in the larger population.
- Assuming causation between a set of inputs and its effect on the responses.
- Missing a lurking variable that wasn’t measured and is driving change in both the input values and the response.
- Sampling from a portion of the total population, which might give a biased view of the relationship.

The data resulting from observational studies can be helpful, informative and revealing, but it’s important to be cautious about drawing formal conclusions or making final decisions based strictly on the gleaned information.

One important exception to these warnings results from having complete population data instead of just a sample. In these cases, you’re often dealing with large data sets, and the field of data mining1 can be useful for extracting patterns in the data. This situation differs from those involving only a portion of the population because you are looking at the complete set of population data, so any observed patterns actually exist.

An example of this is grocery loyalty cards, which track purchases. If there was interest in determining what products are commonly bought along with sliced cheese at a particular store, looking at all of the historical records for purchases involving the product of interest would establish the most commonly grouped products. This pattern is known to exist, and decisions can be based on this, despite the observational nature of the data.
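As a small illustration of the idea (not a full data mining workflow), the sketch below counts which items most often appear in the same basket as a target product. The transaction data and item names are hypothetical.

```python
# Minimal sketch: count which items most often appear in the same
# transaction as a target product, using hypothetical loyalty-card data.
from collections import Counter

# Each transaction is the set of items on one receipt (illustrative data).
transactions = [
    {"sliced cheese", "bread", "ham", "mustard"},
    {"sliced cheese", "bread", "butter"},
    {"milk", "eggs", "bread"},
    {"sliced cheese", "crackers", "ham"},
]

target = "sliced cheese"
co_counts = Counter()
for basket in transactions:
    if target in basket:
        co_counts.update(basket - {target})

# Items most frequently bought together with the target product.
print(co_counts.most_common(3))
```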

Data mining provides a suite of tools to efficiently summarize population data with the goal of identifying and extracting relationships from large complex databases. Hence, three key issues to consider when looking at observational data include:

- The relevance of the sample to the current study. Are there key differences in time, product or process that may change patterns?
- The quality of the data. How were the units selected? Have all potentially relevant inputs been gathered?
- How the data can be used to advance the study without drawing conclusions that are not justifiable, given the lack of established causality.

2. Monitoring techniques
Variability exists in all processes, regardless of how well maintained they are. There are generally two types:

- Common-cause variability, which is inherent in the process or variability that is due to chance.
- Assignable-cause variability, which is caused by some disruption or outside sources that should be identified and removed.

Assignable-cause variability is generally the most problematic, resulting in unstable and out-of-control systems. Control charts are the most commonly used tools to quickly detect the assignable cause with as few false alarms as possible.

Although control charts were first developed more than 80 years ago for manufacturing processes, the types of charts available and their uses now include applications in healthcare and service industries, biosurveillance and public health monitoring. In addition, control of a single process characteristic—such as the process mean or fraction nonconforming (univariate process)—has been expanded to monitor multiple quality characteristics simultaneously (multivariate process).

In the DMAIC process, charts for control and monitoring of processes are commonly used in the analyze stage, to assess current conditions, and in the control stage, to maintain gains.

Quality characteristic of interest

Following are an overview and some guidelines for choosing among options. Before any control chart can be applied to a process or system, you must identify the type of variables to be measured and quality characteristics to monitor. What are the key quality characteristics to focus on? Are the variables of interest continuous, discrete or categorical?

If you manufacture latex gloves for medical use, for example, glove thickness may be the variable of interest and it is continuous. As a result, average glove thickness could be the quality characteristic monitored using a control chart.

In public health surveillance, for example, you may be interested in the number of people who visit a local emergency room each day with flu-like symptoms, and this variable is discrete. For a loan servicing company where applications are assessed by loan officers to determine whether the application was completed correctly, the variable of interest is correctness of the loan application and is a categorical variable—either correct or incorrect. For a random sample of applications assessed every week, the proportion of incorrect applications—the fraction nonconforming—would be the quality characteristic of interest.
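For the loan-application example, a p chart would plot the weekly fraction nonconforming against 3-sigma limits. The sketch below shows one way this calculation might look; the sample size and weekly counts are hypothetical.

```python
# Minimal sketch: 3-sigma limits for a p chart (fraction nonconforming),
# using hypothetical weekly samples of loan applications.
import math

n = 200                                            # applications sampled each week (assumed)
nonconforming = [14, 9, 17, 11, 13, 8, 16, 12]     # incorrect applications per week

p_hat = [d / n for d in nonconforming]
p_bar = sum(nonconforming) / (n * len(nonconforming))   # overall fraction nonconforming

sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
ucl = p_bar + 3 * sigma_p
lcl = max(0.0, p_bar - 3 * sigma_p)

print(f"center = {p_bar:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
print("out-of-control weeks:", [i for i, p in enumerate(p_hat) if p > ucl or p < lcl])
```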

Control charts monitoring quality characteristics fall into one of two groups:

- Variables control charts for continuous quality characteristics.
- Attributes control charts for discrete or categorical variables.

In the latex glove example, a single quality characteristic is monitored over time, so a univariate control chart is appropriate. When two or more quality characteristics are investigated, it is possible there is a relationship—or correlation—between characteristics. In these situations, multivariate control charts are appropriate. See Figure 1 to compare individual control charts—the first two—versus a combined multivariate chart that’s better able to detect changes.

 

A good example of this is in public health monitoring when it is more appropriate to simultaneously monitor several hospitals for the number of people with flu-like symptoms. If an influenza outbreak is near, you would likely see a positive correlation between ERs, with an increase in the number of patients in all of the hospitals. In this situation, a univariate chart applied to each individual hospital may not signal a change in the process as quickly as a multivariate control chart of all hospitals at once.

Next, consider the purpose for implementing a control chart for the quality characteristic of interest. In particular, are you working with a process or system currently unstable and out of control—namely, a system that has not been analyzed to determine whether only common-cause variation exists? This is common for new processes or processes believed to be unstable. For these purposes, historical data often are collected and appropriate control charts applied to this data. This is called a Phase I analysis.

Phase I is considered retrospective because it involves assessing historical data to determine whether the process was stable or in control over that time period. Assignable causes resulting in large shifts in the process often are present, so it is important to use control charts that are able to quickly detect large shifts. In Phase I, the objectives are to bring an unstable process under control and determine appropriate control limits for long-term monitoring, which is done in Phase II.

In a Phase II analysis, you assume the process or system is relatively stable, with the goal of identifying small to medium shifts. Control charts that detect moderate shifts quickly and with few false alarms are helpful here. Phase II control charts are often used in the control stage of the DMAIC process.

After the variable of interest and quality characteristics are identified, focus should turn to determining the sample size, sampling frequency and time period during which the samples are taken. The allocation of resources should be guided by several factors.

Costs associated with missed process shifts, sampling costs and even production rate, for example, should significantly affect how large a sample will be selected and how often. There are trade-offs between selecting small samples more frequently and selecting larger samples less frequently.

Control and monitoring techniques are important quality tools for collecting and displaying data in a way that helps the practitioner identify out-of-control processes. It is important to make thoughtful choices about which quality characteristic should be monitored and how the data should be sampled. The following describes commonly used control charts (Shewhart, time-weighted and multivariate) and monitoring techniques and how they should be applied.

Shewhart control charts: The oldest and most commonly encountered control charts are the Shewhart control charts, first proposed by Walter A. Shewhart. In these charts, a summary of the quality characteristic is calculated for each subgroup and plotted on the control chart, with limits usually set at ±3 standard errors around the centerline for the process average or in-control target value.

The Shewhart variables control charts include the X-bar and R chart, the individuals chart, and the X-bar and s chart. The Shewhart attribute control charts include the fraction nonconforming (p chart), number nonconforming (np chart), number of nonconformities (c chart) and average number of nonconformities (u chart).

Shewhart control charts have been shown to detect large shifts or changes in a process soon after the shift has occurred, but they are slower to signal when small process shifts indicate the process is out of control. In addition, Shewhart charts are sensitive to the normality assumption.

Based on these properties, Shewhart control charts are logical choices for Phase I analysis for either variables or attributes data, but they are less attractive for Phase II analysis in which the goal is to detect small to medium shifts.

An important property of Shewhart control charts is that the statistic being plotted contains information from only the current subgroup. It contains no memory of the previous values.
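As a rough illustration of how Shewhart limits are computed, the sketch below builds an X-bar chart from subgroup averages and ranges using the standard A2 constant for subgroups of five. The data are simulated for illustration only.

```python
# Minimal sketch: an X-bar chart with limits based on the average range,
# using simulated subgroups of size 5 (A2 = 0.577 for n = 5).
import numpy as np

rng = np.random.default_rng(1)
subgroups = rng.normal(loc=10.0, scale=0.2, size=(25, 5))   # 25 subgroups of size 5

xbar = subgroups.mean(axis=1)                               # subgroup averages
ranges = subgroups.max(axis=1) - subgroups.min(axis=1)      # subgroup ranges

xbarbar, rbar = xbar.mean(), ranges.mean()
A2 = 0.577                                                  # control chart constant for n = 5
ucl, lcl = xbarbar + A2 * rbar, xbarbar - A2 * rbar

print(f"center = {xbarbar:.3f}, LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print("signals at subgroups:", np.where((xbar > ucl) | (xbar < lcl))[0])
```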

Time-weighted control charts: Time-weighted control charts differ from Shewhart charts in that the statistic is a function of the current observation and all previous observations. This recursive nature of the statistic results in the chart being able to reveal small to medium shifts in the process more quickly than control charts that lack memory.

Two of the most common time-weighted control charts are the exponentially weighted moving average (EWMA) control chart introduced by S.W. Roberts13 and the cumulative-sum (CUSUM) control chart unveiled by E.S. Page.

Because these charts quickly detect small to medium shifts in a process, they are attractive as monitoring tools in Phase II analysis. In addition, the EWMA and CUSUM are robust to the assumption of normality. These charts have been used to monitor variables and attributes data with very positive results.
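The EWMA recursion and its time-varying limits can be sketched in a few lines. The smoothing constant, control limit multiplier and data below are illustrative choices, not recommendations.

```python
# Minimal sketch: EWMA chart statistic and time-varying limits
# (lambda = 0.2, L = 3), applied to simulated individual observations.
import numpy as np

lam, L = 0.2, 3.0
mu0, sigma = 10.0, 0.2          # in-control mean and standard deviation (assumed known)

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(mu0, sigma, 30),
                    rng.normal(mu0 + 0.5 * sigma, sigma, 20)])  # small shift after obs 30

z = np.empty_like(x)
z_prev = mu0
for i, xi in enumerate(x):
    z_prev = lam * xi + (1 - lam) * z_prev      # EWMA recursion
    z[i] = z_prev

idx = np.arange(1, len(x) + 1)
width = L * sigma * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * idx)))
signals = np.where((z > mu0 + width) | (z < mu0 - width))[0]
print("first signal at observation:", signals[0] if signals.size else None)
```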

Multivariate control charts: When monitoring more than one quality characteristic at a time with evidence of association or correlation between the quality characteristics, multivariate control charts are appropriate. Harold Hotelling’s T2 control chart is the multivariate extension of the univariate Shewhart control chart.

Similarly, there are multivariate extensions of the EWMA (MEWMA) and CUSUM (MCUSUM) control charts. A drawback to these types of charts has been the difficulty in diagnosing which quality characteristic has caused the out-of-control signal. For a breakdown of the uses of the control charts, see Online Table 1. 
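To make the multivariate idea concrete, the sketch below computes Hotelling’s T2 statistic for two correlated characteristics, assuming the in-control mean vector and covariance matrix are already known from earlier Phase I work. The numbers are hypothetical and the chi-square limit is an approximation used when parameters are treated as known.

```python
# Minimal sketch: Hotelling T^2 monitoring of two correlated quality
# characteristics with assumed (known) in-control mean and covariance.
import numpy as np
from scipy.stats import chi2

mu0 = np.array([10.0, 25.0])
cov0 = np.array([[0.04, 0.03],
                 [0.03, 0.09]])
cov_inv = np.linalg.inv(cov0)

rng = np.random.default_rng(3)
x = rng.multivariate_normal(mu0, cov0, size=50)     # new observations to monitor

d = x - mu0
t2 = np.einsum("ij,jk,ik->i", d, cov_inv, d)        # (x - mu0)' Sigma^-1 (x - mu0)

ucl = chi2.ppf(0.9973, df=len(mu0))                 # roughly a 3-sigma-equivalent limit
print("signals at observations:", np.where(t2 > ucl)[0])
```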

 

Planning and data collection are key to the successful implementation of monitoring techniques. When properly designed, these tools can be cost effective and extremely beneficial to the enterprise and its customers.24

3. Process capability studies
After a process is found to be stable through monitoring tools such as control charts, it is important to determine whether the process is also capable. A capable process is one that performs according to its stated requirements.

Although not necessary, requirements are often written as specification limits, which are tied to customer requirements and differ from control limits. Control limits are based on the process distribution and are determined statistically. Specification limits are requirements connected to product performance.

Process stability compares process variation to control limits, while process capability compares process variation to specification limits. Lower and upper specification limits are usually written as LSL and USL if two-sided limits are warranted.

Process capability studies are commonly implemented in the DMAIC process, including the analyze and improve stages, and later in the control stage to determine whether implemented changes have improved process capability.

Process capability metrics
Process capability often is evaluated through various metrics, including capability indexes (for example, Cp, Cpk, Cpm, Pp and Ppk), defects per unit (DPU), defects per million opportunities (DPMO) and percentage nonconforming. Graphics such as histograms can identify off-center processes and outliers that may need investigation and possible removal. Metrics such as DPU and DPMO are useful capability measures for attribute data.

The most common metrics used are capability indexes, which are ratios of the specification limits to the process variation. Because it is common to assume the characteristic of interest is normally distributed, the process variation in the denominator is usually six process standard deviations, for example Cp = (USL – LSL)/6σ.
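A minimal sketch of these calculations, assuming hypothetical specification limits and a normally distributed characteristic, might look like this:

```python
# Minimal sketch: Cp and Cpk for a normally distributed characteristic,
# with assumed specification limits and a sigma estimated from data.
import numpy as np

lsl, usl = 9.4, 10.6                      # specification limits (assumed)
rng = np.random.default_rng(4)
x = rng.normal(10.05, 0.15, 200)          # measurements from a stable process

mu, sigma = x.mean(), x.std(ddof=1)
cp = (usl - lsl) / (6 * sigma)
cpk = min(usl - mu, mu - lsl) / (3 * sigma)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```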

A large value of the capability index indicates a process is capable of producing product that consistently meets or exceeds the customer’s requirements. Capability indexes are popular because complex information can be summarized into a single number.

A single metric, however, can sometimes oversimplify and fail to capture all capability information for a complex system. As a result, it is recommended to not only report the estimated capability indexes, but also to construct confidence intervals on the true process capability index. In addition, providing graphics of the data using histograms and control charts aids understanding.

Short and long-term variation
There’s much debate about short and long-term variation and their connection to capability analyses. With respect to capability indexes, the Cp metrics are calculated using short-term variation from a stable process, which involves the variability within each subgroup during a given timeframe. Short-term variation is used for constructing the limits on control charts.

Long-term variation is the standard deviation calculated using all values in the timeframe of interest. It captures the variability due to common and assignable causes. Figure 2 illustrates the difference between short and long-term variation for unstable and stable processes.

Long-term variation should be considered when calculating the indexes Pp and Ppk. Some are emphatic and say indexes such as Pp and Ppk should never be used because they provide no useful information. Others argue Pp metrics are extremely valuable and should be examined before the general Cp indexes.36-40
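One way to see the distinction is to estimate both sigmas from the same subgrouped data: a pooled within-subgroup standard deviation for the short-term indexes and the overall standard deviation for the long-term indexes. The sketch below uses simulated subgroups whose means drift over time; the specification limits are assumed.

```python
# Minimal sketch: short-term (within-subgroup) sigma versus long-term
# (overall) sigma, the distinction behind Cp/Cpk versus Pp/Ppk.
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical subgroups whose means drift over time (assignable-cause variation).
means = 10.0 + rng.normal(0, 0.10, 30)
data = np.array([rng.normal(m, 0.15, 5) for m in means])    # 30 subgroups of 5

sigma_short = np.sqrt(data.var(axis=1, ddof=1).mean())      # pooled within-subgroup
sigma_long = data.ravel().std(ddof=1)                       # all values together

lsl, usl = 9.4, 10.6
print(f"Cp (short-term) = {(usl - lsl) / (6 * sigma_short):.2f}")
print(f"Pp (long-term)  = {(usl - lsl) / (6 * sigma_long):.2f}")
```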

To carry out a meaningful process capability study, carefully consider the following:

- Timeframe. Should cover a period of time that allows for evaluation of the process performance while controlling costs of sampling. The timeframe for collecting data may be dictated by customers, who often look for monthly or quarterly process capability reports.

- Sampling method. The data selected must be representative of the process performance. Important steps include determining the appropriate size, frequency and number of the subgroups to be selected. The concept of rational subgrouping plays a key role in process capability analysis.41
- Presentation of results. Process capability reports should include graphics of the data, such as histograms and control charts, and capability indexes accompanied by confidence intervals when possible.

Process capability studies are important for determining process performance relative to customer requirements during a particular timeframe. For results to be meaningful, it is imperative the process be stable before attempting to interpret process capability results.

4. Measurement systems capability studies
Conducting a capability analysis on a process tells you about its characteristics, but it is important to also determine whether the measurement system in place to obtain summaries of the process is also capable.

Critical-to-quality characteristics chosen to describe the system’s most important attributes are estimated using collected data. Variability in observed response values could be due to the process or the measurement system used to acquire them.

Because measurements are subject to variability, a process cannot be analyzed accurately unless results from the measuring device are repeatable and reproducible. If variation due to the measurement system is small compared to the process variation, the measurement system is deemed capable. Measurement systems capability studies are important tools in the measure stage of the DMAIC process.

Measurement system capability studies involve the analysis of the measurement system to:

- Determine the amount of variability attributable to the measurement system.
- Identify and isolate the sources of variability in the measurement system.
- Assess whether the measurement system is suitable for its intended application.

Identifying and isolating the sources of variability due to the measurement system is often achieved through a gage repeatability and reproducibility (GR&R) study.

GR&R studies
A measurement system may consist of multiple interdependent components: the measuring device (for example, the gage), personnel (such as operators), time periods or setups. To assess the characteristics of a measurement system, replicate measurements are usually obtained under several conditions.

The purpose of a GR&R study is to determine if the size of measurement system variability is small relative to process variability. Repeatability is the variability due to the gage when multiple measurements are made by the same operator with the same setup. Reproducibility represents the variability due to different operators.

A traditional GR&R study consists of a crossed two-factor designed experiment with operators and parts as factors, in which each operator measures each part multiple times. Measurement system variation (including repeatability and reproducibility), process variation and total variation are estimated as accurately as possible in these studies.

There are two common methods for estimating these variance components:

- The tabular method, which uses ranges and control charts, has several disadvantages. The range is an inefficient estimate of gage variability, particularly for moderately large sample sizes. In addition, the method is generally difficult or impossible to use for constructing appropriate confidence intervals on characteristics of interest.
- Analysis of variance (ANOVA), which is based on having run a designed experiment involving operators, parts, replicates and other factors and estimating appropriate variance components. The ANOVA approach is widely available to practitioners, and can be adapted to deal with complex experiments and used to construct confidence intervals for components of variability.
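As a rough sketch of the ANOVA approach for a crossed study, the code below estimates the usual variance components from the mean squares and reports a %R&R value. The data, factor sizes and effect magnitudes are simulated for illustration and are not a recommendation for how large a study should be.

```python
# Minimal sketch: ANOVA-based variance components for a crossed GR&R study
# with simulated data (p parts x o operators x r replicate measurements).
import numpy as np

rng = np.random.default_rng(6)
p, o, r = 10, 3, 2
part_eff = rng.normal(0, 1.0, p)[:, None, None]      # part-to-part variation
oper_eff = rng.normal(0, 0.2, o)[None, :, None]      # operator effect (reproducibility)
y = 50 + part_eff + oper_eff + rng.normal(0, 0.3, (p, o, r))   # + repeatability noise

grand = y.mean()
ybar_p = y.mean(axis=(1, 2))          # part means
ybar_o = y.mean(axis=(0, 2))          # operator means
ybar_po = y.mean(axis=2)              # part-by-operator cell means

ms_p = o * r * np.sum((ybar_p - grand) ** 2) / (p - 1)
ms_o = p * r * np.sum((ybar_o - grand) ** 2) / (o - 1)
ms_po = r * np.sum((ybar_po - ybar_p[:, None] - ybar_o[None, :] + grand) ** 2) / ((p - 1) * (o - 1))
ms_e = np.sum((y - ybar_po[:, :, None]) ** 2) / (p * o * (r - 1))

# Expected-mean-square estimators for the two-factor crossed random model.
var_repeat = ms_e
var_po = max(0.0, (ms_po - ms_e) / r)
var_oper = max(0.0, (ms_o - ms_po) / (p * r))
var_part = max(0.0, (ms_p - ms_po) / (o * r))

var_grr = var_repeat + var_oper + var_po              # repeatability + reproducibility
pct_rr = 100 * np.sqrt(var_grr / (var_grr + var_part))
print(f"repeatability = {var_repeat:.3f}, reproducibility = {var_oper + var_po:.3f}")
print(f"%R&R = {pct_rr:.1f}%")
```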

Common metrics
Several metrics can be calculated and interpreted to determine whether the measurement system is capable after measurement variation and process variation have been estimated. The two categories of measurement system capability metrics include:

- Measurement variation compared to total variation (or part variation), such as total GR&R criterion (%R&R)47-51 and number of categories criteria, including signal-to-noise ratio (SNR), discrimination ratio (DR) and number of distinct categories (ndc).
- Measurement variation compared to tolerance or specification width, including precision-to-tolerance ratio (PTR) and misclassification rates.

General rules exist for each metric but should be used with caution because they may not be realistic for all measurement systems.

For example, one rule suggests that if the R&R percentage is less than 10%, the measurement system is considered acceptable, while if the R&R percentage is greater than 30%, the measurement system is ruled unacceptable. But the R&R percentage is a ratio of standard deviations that are not additive, and R&R percentage is not a true percentage of the total variation. As a result, an R&R percentage less than 10% may be acceptable for manufactured parts, for example, but too restrictive for chemical processes. The user should consider these other guidelines for a specific application or process:

A highly capable process often can tolerate a measurement system with a higher PTR than a less capable process. For example, a rule for PTR is if PTR is less than 10%, the measurement system is acceptable, and if PTR is greater than 30%, it is unacceptable. However, studies have shown PTR less than 10% does not necessarily offer a good indication of how well a measurement system performs.

The guidelines for the number of categories criteria are inconsistent. These metrics indicate how well the measurement system can discriminate between different parts. For example, if 10 parts are being measured, but SNR is found to be two, the measurement system cannot distinguish some of the parts from each other. The rules for SNR, DR and ndc are inconsistent for a given capability level of a system.

Point estimates should be accompanied by confidence intervals to quantify the associated uncertainty. The standard GR&R study recommends including 10 parts measured by three operators two times each. To adequately estimate many of these metrics, a standard GR&R study should include at least six operators with at least 20 parts.

Attribute GR&R studies should be used when results are categorical. Most of the metrics described are not appropriate for GR&R studies for attribute data. Agreement analysis, applicable in the service industries, assesses the agreement between appraisers and relative to known standards.

Remaining MSA components
A complete measurement system analysis (MSA) should involve more than a capability study. Three additional but lesser-known aspects of an MSA that should be considered are the stability, bias and linearity of a measurement system.

Stability means the ability of the measurement system to produce the same values over time when measuring the same part or sample. If the measurement system is stable—as with control charts in process monitoring—only common cause variation is present.

Bias quantifies the difference between the master (true) value and the average value of the measurements taken on it.

Linearity measures the consistency of the bias over the range of the measurement device. It addresses the question of whether the bias is the same across the expected range of measurements.

Measurement systems capability studies reveal whether the responses obtained for a system can be trusted as representative of the actual system. If the current measurement system is not capable, assignable cause variability exists somewhere in the system. As a result, other analyses may not yield results on which decisions should be based.

For guidance on where improvement efforts should be focused if the measurement system is not capable, see this month’s Statistics Roundtable column, "Getting Graphical."

5. Sampling
When the study’s goal is to understand the characteristics of a population of units, sampling may be more cost-effective or practical than performing a census, which looks at all units in the population. For destructive testing, sampling is essential to not consume all of the products. The key idea of sampling is that with careful planning, you can get an accurate and precise estimate of population characteristics at a fraction of the price.

You’re all familiar with political polls, which can estimate election results based on a relatively small number of people. Common examples of sampling studies involve evaluating populations of parts for adequate performance or collecting data from customers to quantify their opinions. In the DMAIC process, sampling has a role to play in the analyze stage to evaluate the current status of a process, and in the control stage when improvements can be quantified for the adjusted process.

Five key considerations for sampling studies include:

1. Defining the population frame. This is a precise statement and itemization of which units are included for consideration in the population of interest. In manufacturing, for example, you may want to restrict your study to units from a single problematic machine or units from the entire process.

Depending on what you select, the questions you’re able to answer and how they may generalize to broader conclusions will be affected. For customer-based studies, a precise statement of which people to include will define whether you’re able to comment on all customers or just a subset—that is, those who submitted a complaint.

2. Representative sample. The key to being able to generalize the observed sample characteristics to the overall population is having a representative sample. An example of a nonrepresentative sample might occur when wanting to characterize customer satisfaction with a particular product. If the sample was drawn from only those customers who called the complaint hotline, you wouldn’t expect the results to be indicative of the overall population.

Similarly, in a production environment, if units were sampled from only one manufacturing line, it would be unlikely the results would be a good summary across all lines. The statistical mechanism for obtaining a representative sample often is based on ensuring all units have an equal probability of being sampled, and different subpopulations have appropriate representation.

3. Sampling units and sample size. Determining whether to sample individual units or groups of units is also an important consideration. There are usually pros and cons to collecting information from groups of units. Coupled with this is the decision of how many units to sample.

As with most data collection endeavors, more data give more precise estimates, but with diminishing returns. The appropriate sample size should be based on available resources, the study’s goals and the desired precision of estimation.

4. Using supplementary information. If additional relevant demographic information is available on the sampled units and for the overall population, it’s possible this information can be used to improve estimation and verify how representative the sample appears to be.

5. Nonresponse or missing observations. Typical when sampling people, but also possible for inanimate objects, there are often problems with getting responses for all of the units. If there are systematic reasons for missing values in the response—such as calling home phones during the day and missing the 8 a.m. to 5 p.m. working population, or being unable to measure the characteristic of interest for units that have been scrapped—the summary of the sample characteristic may not necessarily reflect the matching population characteristic.

There are ways to adapt the sampling design to increase the response rate, and appropriate estimation techniques can quantify the effect of nonresponse on the estimate and mitigate the possible bias introduced by systematic missing values.
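Returning to the sample-size consideration (point 3 above), a common planning calculation estimates how many units are needed to estimate a proportion within a chosen margin of error. The sketch below shows the standard formula with a conservative planning value; the margin, confidence level and population size are illustrative assumptions.

```python
# Minimal sketch: sample size needed to estimate a population proportion
# within a desired margin of error at 95% confidence (worst case p = 0.5).
import math

z = 1.96            # 95% confidence
p = 0.5             # conservative planning value for the proportion
margin = 0.03       # desired half-width of the confidence interval

n = math.ceil(z ** 2 * p * (1 - p) / margin ** 2)
print("required sample size:", n)        # 1068 for a 3-point margin

# Optional finite population correction for a population of N units (assumed N).
N = 5000
n_fpc = math.ceil(n / (1 + (n - 1) / N))
print("with finite population correction:", n_fpc)
```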

Some of the more common sampling techniques are illustrated in Figure 3.

 

1. Simple random sampling: The sample is randomly drawn from all units in the population of interest in which the probability of each unit being selected is the same.

2. Stratified sampling: The population is grouped into subpopulations in which units in the same subpopulation are thought to be similar to each other or to one or more concomitant variables. The sample is drawn by randomly selecting units for the various subpopulations in numbers proportionate to the relative sizes of the subpopulations.

3. Cluster sampling: This approach is helpful when units are naturally organized in small collections (clusters). For example, voters live in households, or units may be organized into lots. Time or cost efficiency can be gained by sampling the clusters instead of individual units.

4. Multistage sampling: The first stage is similar to cluster sampling because it selects groups of units. The subsequent stages look at the selected clusters and then select a subset of those units for measurement.

5. Systematic: If the units in the population are naturally ordered—such as units on a production line—it is possible to just sample every jth unit. This simplifies the rules for sampling, but has a small risk if there is some natural periodicity in the data.
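To illustrate the first two techniques, the sketch below draws a simple random sample and a proportionate stratified sample from a hypothetical population of units made on three production lines.

```python
# Minimal sketch: simple random sampling versus proportionate stratified
# sampling from a hypothetical population of units made on three lines.
import random

random.seed(7)
population = [(f"unit{i}", f"line{i % 3 + 1}") for i in range(900)]

# 1. Simple random sample of 90 units from the whole population.
srs = random.sample(population, 90)
print(len(srs), "units in the simple random sample")

# 2. Stratified sample: 10% of each line, drawn at random within the line.
strata = {}
for unit, line in population:
    strata.setdefault(line, []).append(unit)

stratified = {line: random.sample(units, len(units) // 10)
              for line, units in strata.items()}
print({line: len(s) for line, s in stratified.items()})
```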

In general, sampling can be an advantageous strategy for answering questions about the population without doing 100% inspection. Sampling can sometimes be coupled with designed experiments to select collections of units that seek to match design structures of known experimental designs.

6. Design of experiments
Strategic data collection through a designed experiment provides a gateway to answer questions about what input factors are driving changes in the response of interest, and to establish causality. It was championed by Ronald Fisher and George Box. Now design of experiments (DoE) is a major subdiscipline in statistics, with many tailored design strategies to provide methods for a diverse set of scenarios.

The key advantage of using designed experiments is that the experimenter controls the combinations of inputs being explored, which allows control over the ranges to explore, as well as the establishment of a causal connection between inputs and outputs. It is the direct opposite of observational data over which the user has no direct control and there’s no active manipulation of inputs.

In the DMAIC process, a designed experiment is an important tool in the improve stage because different inputs can be actively manipulated to determine which combination will yield the desired improvement.

The basic setup of many experiments is to apply a set of factor (input) levels manipulated by the experimenter to experimental units and observe the response based on these factor settings.

If you’re interested in determining what causes differences in the size of four-week-old seedlings, for example, you might begin with seeds (experimental units), expose them to different amounts of sunlight, water and fertilizer (factors, in which zero, two and four hours of sunshine per day would be the factor levels for sunlight) and measure the height of the seedlings after four weeks (the response).

Decision making using DoE often involves a sequential process in which learning from earlier stages is used to guide subsequent experiments:

The first stage may involve a pilot study to establish what data are possible, what ranges of the factor levels are sensible to explore, whether the response will be effectively measured (see the measurement assessment section), and what kind of natural variability exists between similar experimental units.

The second stage—called a screening design—considers a potentially large number of factors to determine which ones most influence the response. Factorial, fractional factorial or definitive screening designs, or constructed designs optimized using D-optimality (good estimation of model parameters), are common (see Figure 4). These typically consider only a small number of factor-level values because the focus of the designs is to investigate the gross connection between inputs and outputs through linear and two-factor interaction effects.
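As a small illustration of a screening design, the sketch below lays out a replicated two-level full factorial in three factors (borrowing the factor names from the seedling example) and randomizes the run order. In practice, the levels, replication and design type would be chosen to fit the study; the names and sizes here are assumptions.

```python
# Minimal sketch: a two-level full factorial screening design in three
# factors (coded -1/+1), replicated twice, with a randomized run order.
from itertools import product
import random

factors = ["sunlight", "water", "fertilizer"]    # names borrowed from the seedling example
design = [dict(zip(factors, levels)) for levels in product([-1, 1], repeat=len(factors))]

runs = design * 2                                # two replicates of the 2^3 design
random.seed(8)
random.shuffle(runs)                             # randomize the run order

for i, run in enumerate(runs, start=1):
    print(i, run)
```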

 

After the important factors have been identified, the third stage involves describing the relationship between input factors and the response through response surface methods.

In this stage, designs with a larger number of factor levels are used to estimate a potentially curved surface that connects inputs to response. This allows for better characterization of the relationship and facilitates optimization—minimizing, maximizing or hitting a target value—of the response. Common designs for this stage include the central composite (Figure 4) and Box-Behnken designs, or constructed designs optimized using I or G-optimality—good prediction in the input experimental region.

The final stage is used to confirm that the previous experiments have found an ideal combination of input factor levels and that the estimated response range at this location is attained.

The core principles guiding DoE methods are:

Randomization. By randomly determining the assignment of treatments to experimental units and the order in which they are run, the analysis can provide a probabilistic framework for determining whether the size of factor effects is large or small relative to the natural variation. Randomization also provides protection against systematic bias from unknown influences on the response—such as run order or missing factors.

Replication. To be able to understand the natural variability in the process or product, it is beneficial to study what range of response values is possible, given the same experimental setup.

Blocking. In some cases, the characteristics of all the experimental units are not thought to be homogeneous. When this happens, it is advantageous to group similar units into blocks and apply the different treatments within these blocks to be able to discern smaller differences in the response.

Because the specific goals of experimentation vary, and there are frequently restrictions on how the inputs can be manipulated, there are many different types of designed experiments. A few of the more common variations include:

- Robust parameter design. In these applications, two types of factors exist: control factors, which are controllable in the experiment and in production, and noise factors, which are controllable during the experiment but not during regular production. The experiment’s goal is to determine a combination of the control factors that provides desirable, low-variation results across the range of anticipated fluctuation of the noise variables during production.
- Split-plot designs. In these applications, there are difficult and easy-to-change factors. To reduce the cost and time of data collection and potentially improve the quality of estimation, these designs do not reset the levels of all of the factors for each experimental run. The difficult-to-change factors are reset less often, and this difference in the setup and randomization structure requires different analysis approaches.
- Computer-generated experiments. These designs are generated in statistical software to solve a tailored problem, for instance one with a flexible run size, a custom region of input combinations or a particular optimization objective. Because of improvements in software, this option allows the experimenter to focus on the problem of interest and satisfy logistical constraints. Key to the success of selecting a good design is to optimize based on objectives that match the study goals.

Designed experiments are an essential tool for the quality professional. They allow guided exploration of the relationship between input factors and responses and provide a formal method for establishing causality. Key questions to ask when deciding what designed experiment to run include:

- Is the experiment focused on screening, modeling the surface or confirmation?
- What factors should be included?
- What ranges of factor levels are of interest?
- What size of experiment should be chosen to appropriately balance cost and precision of results?

7. Complementary data and information
In addition to primarily quantitative data, there are often supplementary collections of data and information you may wish to include in your study of a process or product. These categories of data serve important roles in the different stages of the DMAIC process, including problem formulation in the define stage and understanding the process in the analyze stage.

Fortunately, there are some useful overviews of the key tools associated with this type of data: histograms, control charts, Pareto analysis, cause and effect diagrams, check sheets, scatter plots and stratification.

These tools can help glean understanding of the basics of a current process, including the interrelationship of inputs and outputs, candidate drivers of change in the system and importance of different failure mechanisms.

An additional set of complementary quality tools includes affinity diagrams, arrow diagrams, matrix data analysis, matrix diagrams, process decision program charts, relations diagrams and tree diagrams. When combined, this collection of summaries provides a suite of tools to organize the connections, structures and importance of various attributes of a process.

Several additional tools and methods provide valuable insight to a process that is not necessarily identifiable using the methods discussed so far. Methods and tools that map voice of the customer data to outcomes and deliverables and identify possible failures or opportunities for improvement, for example, can be helpful to any enterprise.

Some of these additional tools and methods include: failure mode and effects analysis, defect concentration diagrams, flowcharts, quality function deployment matrixes (houses of quality) and prioritization matrixes.

In the early stages of understanding a process, having techniques for compiling and organizing knowledge, information and data can go a long way toward avoiding costly replications. Being able to extract what is already known can form a solid foundation on which to perform more formal data collection exercises.

 Reference: ASQ
