ASQ CQA – 5. Quality Tools and Techniques Part 10

  1. 5C3 Qualitative and Quantitative Analysis

Here we need to understand what type of data we are dealing with. Data can be broadly classified into two types, qualitative or quantitative. As the names suggest, quantitative relates to quantity and qualitative relates to quality. Qualitative data is a measure of type. Let’s say we have a number of cars and we note down the color of those cars: blue, red, white, blue, white, red and so on. This will be qualitative data.

Here we are looking at the quality or the type. On the other hand, when it comes to quantitative data, there we are talking about the quantity. Quantity could be in terms of the measurement or quantity could be in terms of the count. So let’s say if we have some data where we have listed down the height of students or height of a piece, 230 centimeters, 240, so on, that is quantitative data, because that’s the quantity. This could be height, this could be weight, volume, length, time, those will be the quantitative data.

Quantitative data also includes the count. The number of defects will be quantitative data. So with this understanding of qualitative versus quantitative data, let’s look at this diagram here. Here the data is classified as qualitative and quantitative. Within quantitative data we also have two types, continuous or discrete. As we said earlier, quantitative data is the measurement or the count. If the measurement is continuous, then this will be continuous data. If it is in steps, then this will be discrete data.

So let’s understand. Let’s say you are measuring the height of students. The height of a student could be 120, it could be 121, or it could be anything in between, such as 120.1 or 120.1356. Here, in between two values, a lot of other values can come. Discrete data, on the other hand, is in the form of steps. Let’s say we are counting the number of items in a basket. This will be discrete data. In that basket we can have, let’s say, either 50 items or 51 items.

We cannot have 50.53 items in that basket. Similarly, if you are counting the number of defects, the number of defects on a particular piece could be two defects or three defects, but there cannot be 2.356 defects on that piece. So that data will be discrete data, which is in the form of steps. Now, coming back to the measures of central tendency which we learned earlier, there we talked about three measures of central tendency: the mean, the mode and the median. When it comes to qualitative data, and once again qualitative data here is, let’s say, color (red, blue, another car was again red, the next car was black and so on), if we have to take a measure of central tendency, you really cannot have the average of the red, blue, and green colors.

There is no average for that. What you can have is the mode. Mode is a good measure of central tendency when your data is qualitative. So here what you can do is count the number of cars of each color. Okay? There are five cars with the red color, ten cars with the blue color, and just one car with the black color. So here the maximum number of cars are blue. So blue is the mode. This is how you can find the measure of central tendency in case of qualitative data. When it comes to quantitative data, you can find either the mean or the median. Both of these will work, depending on the situation. Earlier we also discussed that the mean is affected by extreme values, whereas the median is not affected by extreme values. So these could be the measures of central tendency in case of quantitative data. So understanding this is important.
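As a rough illustration of the point above, here is a minimal Python sketch. The color counts (5 red, 10 blue, 1 black) come from the example; the height values and the extreme value of 190 are hypothetical, added only to show how the mean moves while the median barely does.

```python
from collections import Counter
from statistics import mean, median

# Qualitative data: car colors, using the counts quoted in the example
# (5 red, 10 blue, 1 black). The order of the list is made up.
colors = ["red"] * 5 + ["blue"] * 10 + ["black"] * 1
color_counts = Counter(colors)
mode_color, mode_count = color_counts.most_common(1)[0]
print(f"mode: {mode_color} ({mode_count} cars)")

# Quantitative data: hypothetical heights in cm (not from the talk),
# with one extreme value added to show its effect on the mean.
heights = [118, 119, 120, 121, 122]
heights_with_outlier = heights + [190]

print(f"mean: {mean(heights)}, median: {median(heights)}")
print(f"with outlier -> mean: {mean(heights_with_outlier):.1f}, "
      f"median: {median(heights_with_outlier)}")
```

With the extreme value added, the mean jumps by more than ten centimeters while the median moves by only half a centimeter, which is why the median is often preferred when the data has outliers.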

You need to know what sort of data you are looking at, because as an auditor you need to look at the data, at trends, and at graphs. So in addition to qualitative and quantitative data, in this video we also need to talk about trends. For seeing trends, there are a number of visual tools. Some visual tools we have already talked about earlier in section 5A, where we covered quality tools such as the scatter diagram, the Pareto chart and the histogram. Those are tools we have already discussed for visualizing data. Here we will talk about a few more visualization tools: the pie chart, the bar chart and the line chart.

So whatever data you have, you need to visualize it to see what the trends are, where things are going, and what is expected in the future. So let’s quickly talk about these three types of data visualization tool. A pie chart looks something like this. In this particular example, I’m looking at the number of defects in a water bottle manufacturing plant. The water bottle might have a scratch, might have a loose cap, might have a problem with the label, or might have a problem with the volume. I recorded those numbers and, based on them, I used Excel to plot a pie chart. Looking at this pie chart, you can see that scratch is the biggest problem.

46% of problems are because of scratch. Loose cap contributes 16% of problems, label causes 23% of problems and volume causes 15% of problems. So this is one way to visualize data to see where the big problems are. The other tool here is the bar chart. We have already talked about the difference between the bar chart and the histogram: the histogram was for continuous data, the bar chart is for discrete data. If you look at this particular bar chart, it uses exactly the same data with which I prepared the pie chart. Here scratch has six defects: six bottles have a scratch, two bottles have a loose cap, three bottles have label problems, and two bottles have volume problems. So this data could be displayed in the form of a bar chart as well. Both of these charts, the pie chart and the bar chart below, show the same thing. A bar chart could also be used for showing a time trend.
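The shares shown on the pie chart can be reproduced from the four counts quoted above. The following is only a sketch in plain Python (the session itself used Excel), printing a text version of the bar chart next to each category's share. The exact shares for loose cap and volume both come out at about 15.4%; the 16% and 15% quoted above are probably the charting tool's rounding.

```python
# Defect counts for the water bottle example, as quoted in the talk.
defects = {"scratch": 6, "loose cap": 2, "label": 3, "volume": 2}

total = sum(defects.values())
shares = {name: 100 * count / total for name, count in defects.items()}

# Text version of the bar chart, largest category first.
for name, count in sorted(defects.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {'#' * count:<6} {count:>2}  {shares[name]:5.1f}%")
```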

So, if you see here, what I have recorded is the total number of defects on day number two, day number three, day number four, that is, the second day of a particular month, the third day of that month, and so on. And I have plotted the total number of defects. Just by looking at this, you can see that there is a rising trend here. So as an auditor, you might be concerned that there is something wrong here which the organization needs to attend to. The third chart is the line chart. What I have done here is take the data from the pie chart which I prepared for the water bottle manufacturing plant, where I have four types of defects, and plot the same thing using a line chart: six problems because of scratch, two problems because of loose cap, and so on. As you can see, this doesn’t make any sense. A line chart is normally used where you have a time series, because it gives a feeling of time.

Here it really doesn’t make any sense to use a line chart for this particular data. So let’s look at a good example of a line chart. Here is the line chart for the same data which we talked about earlier, the total number of defects on day number two, three, four and so on. Instead of a bar chart, I could plot a line chart here. The line chart also shows that this is a rising trend, which shows that there is a need to take some action to reduce the total number of defects. So these are some of the tools which you could use for seeing the trend in types of defects or the problems which you observe as an auditor.
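A rising trend like the one described can also be checked with a number rather than by eye. The sketch below fits a least-squares slope to daily defect totals. The daily values are hypothetical, because the actual figures behind the chart are not given in the session.

```python
# Hypothetical daily defect totals (the talk does not give the numbers).
days = [2, 3, 4, 5, 6, 7]
defect_totals = [4, 5, 7, 8, 10, 12]

n = len(days)
mean_x = sum(days) / n
mean_y = sum(defect_totals) / n

# Least-squares slope: average change in defects per day.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, defect_totals))
sxx = sum((x - mean_x) ** 2 for x in days)
slope = sxy / sxx

trend = "rising" if slope > 0 else "falling" if slope < 0 else "flat"
print(f"slope: {slope:.2f} defects per day ({trend})")
```

A positive slope only says the totals are going up over these days; whether that calls for action is still a judgment for the auditor and the organization.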

  1. 5D1 Common and Special Cause

And when we talk of variation, many times we talk about a normal distribution. A normal distribution looks something like this. We are not going into the details of normal distribution formulas and other related calculations in this course. However, we need to understand that most of the variation which we observe follows a normal distribution. What this means is that the normal distribution has a peak here in the center. That means most of the items are centered around the mean, and then there are some items on the left and some items on the right. Let’s take the example of the height of students in a class. We take all these measurements, which could be 120 and so on, and draw a histogram for them. The histogram will look something like this. And if we put a smooth curve like this on top of the histogram (let me put this in red), this will be a normal distribution. The normal distribution will tell us that on average the student height is, let’s say, 120.

Then this is spread somewhere between, let’s say, 90 and 150. When you are making something, let’s say with a size of 100 cm, some of the items will be made as 100 cm, some will be 99, some will be 101 and so on. Now, your problem is knowing when you need to take action. Let’s say you find one item which, instead of 100 cm, is 120 cm. This is too far from 100. That means something has gone wrong and you need to take action. On the other hand, if you get an item which is 101 or 102 cm, you might say that this is just a minor fluctuation, so I can live with that, and for this I don’t need to adjust the machine. So understanding common and special causes helps you understand when to take action and when not to take action.

Here you want to find out the trend, and you don’t want to be too early and you don’t want to be too late. You don’t want to be too early because you don’t want to make an adjustment for every small variation. If you keep on adjusting, this will lead to another problem, which is over-adjustment. But then you don’t want to be too late either.

Let’s say, in that case where we were making a 100 cm item, you suddenly find one item which is 105. Now, should you take action or not? Probably you don’t know. So this is where you need to understand whether this change from 100 to 105 was because of a common cause or because of a special cause. As we said earlier, understanding common cause and special cause helps us understand when and when not to take action. Here we have a diagram which shows the difference between common cause and special cause. Let’s take a simple example of me going from home to office and the time it takes. Let’s say on average it takes ten minutes. Some days it is eight minutes, some days twelve minutes, some days nine minutes, some days ten minutes.

But one day it takes me 60 minutes, because there was a problem with my car or there was a traffic jam or something. That means there was something special. This is a special cause. The day I reached the office in 60 minutes, that was because of a special cause, and the variation which I normally have, between eight minutes and twelve or 13 minutes, is because of common causes. When it comes to common cause, there are many causes. Some days a lot of signals might be red when I reach them. Some days there might be a fluctuation in the traffic. Some days there are some road condition problems. Those are small problems which happen day to day. So there are many causes for common cause variation. But for special causes there are only a few causes which you need to attend to, and each special cause has a significant impact. In the example which I took, when my car broke down or when I had a traffic jam, this was a special cause because it was a single thing which had a significant impact on my trip from home to office. You can eliminate common causes as well.

But eliminating common causes will be uneconomical, whereas special causes are economically feasible to remove. There are different terms for common causes and special causes as well, so let’s look at those. A common cause is also called a random cause, because these causes are random. These are also called chance causes or non-assignable causes: non-assignable because you cannot assign them to a specific problem. On the other hand, special causes are also called the signal, systematic causes or assignable causes. You can assign that particular problem to a specific reason or a specific cause.

So these are the differences between common causes and special causes. Whenever you take measurements and see fluctuation or variation in them, you need to understand whether this variation is because of a common cause or a special cause. Control charts are the tool which you can use to find out whether a particular measurement is because of a special cause or a common cause. There are limits set. If something goes above or below those limits, then you can say that this is because of a special cause.
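To show how such limits separate the two kinds of variation, here is a minimal sketch using the home-to-office example. It assumes an individuals control chart, where the limits are the average plus or minus 2.66 times the average moving range; the session does not say which chart type or formula it has in mind, so treat this as one common choice. The first five baseline values echo the ones mentioned above and the remaining ones are hypothetical.

```python
# Commute times in minutes. The first values echo the talk (10, 8, 12, 9, 10);
# the rest of the baseline is hypothetical.
baseline = [10, 8, 12, 9, 10, 11, 9, 10]

center = sum(baseline) / len(baseline)

# Average moving range between consecutive days.
moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Individuals-chart limits: center +/- 2.66 * average moving range
# (2.66 = 3 / 1.128, the usual constant for moving ranges of size 2).
ucl = center + 2.66 * mr_bar
lcl = center - 2.66 * mr_bar


def classify(minutes):
    """Label a new observation against the baseline limits."""
    return "common cause" if lcl <= minutes <= ucl else "special cause"


print(f"limits: {lcl:.2f} to {ucl:.2f} minutes")
for minutes in (12, 13, 60):
    print(f"{minutes} min -> {classify(minutes)}")
```

A point outside the limits is a signal to go and look for an assignable reason, such as the car breakdown; it is not proof on its own.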

  1. 5D2 Process Performance Metrics

We have four of these measurements. CP, which is process capability, will tell whether the process is capable or not. CPK, which is the process capability index, will give us a value which tells how this process is performing. PP is process performance and PPK is the process performance index. These two sets, CP and CPK on one side and PP and PPK on the other, are similar. The only difference is the way we calculate the standard deviation. We will not go into much detail here, but the standard deviation used in CP and CPK is the short-term standard deviation, and in PP and PPK it is the long-term standard deviation. Other than that, both are the same. For this reason I will be focusing on CP and CPK, and later on I will summarize how this applies to PP and PPK as well. Let’s start with CP, which is process capability. Process capability is the ratio of the spread between the process specification limits to the spread of the process values, which is six process standard deviations. I’m sure this would not make any sense yet.

Let’s go to the basics. As we said earlier, there is variation, and the variation is represented by a normal distribution curve. The plastic film which I’m making is of 50 microns. So if I take a number of thickness measurements of the film and draw a histogram, the histogram would look something like this. Most of the values are in the center and then there are some values at the extremes. This is the center; let’s say this is 50 microns, this is, let’s say, 47 microns, and this is, let’s say, 53 microns. So most of my film thicknesses are between 47 and 53. If I plot a smooth line on top of that, this will give me a normal distribution. This is how my process is behaving; this is my process measurement. But then on top of that there is a specification, which is what the customer wants. The specification in this case is that the customer wants this thickness to be between 47 and 53 microns. Anything between these two values is good, anything outside these two values is bad.

Now this is what we are using to measure how capable our process is. The voice of the customer is the specification, which is this one, and the voice of the process is the chart which we plotted here. If this ratio is greater than one, and preferably 1.33, then we can say that the process is good. If this value is less than one, then we can say that this process is not capable. This is what process capability is about. So here we have four values when we want to calculate the process capability, which is CP. The four values are the lower specification limit and upper specification limit, and the lower control limit and upper control limit. The specification limit is something which we want. We want our film to be between 47 and 53 microns. This is what the design department wants: the factory should produce films between these two values.

So there will be some machines which are good, which can produce film between these two values. Some machines might have too much variation, and that might lead to defects. This is what is shown here as the lower control limit and upper control limit. These are based on the actual measurements which we took and from which we plotted the histogram. So these are the four values you need to have: lower specification limit, upper specification limit, lower control limit and upper control limit. Now, to understand this, let’s take the same example, because with the help of an example we can understand this better. Here we want to make a film of 50 microns plus or minus three microns, which means anything between 47 and 53 microns. This is my specification. The first value is my lower specification limit and the second is my upper specification limit. Now, what I do is take 100 films and measure the thickness of each.

Based on that, I find out that the average is 50 microns. That’s good because that’s what I wanted. So the average or the mean is 50 microns. But then the standard deviation, the sigma value, is 0.4 microns. Now, let me plot a normal distribution curve here. This is my mean, which in this case is 50 microns. I have three standard deviations on the left and three standard deviations on the right: one, two and three standard deviations on the left, and one, two and three standard deviations on the right. This is plus three sigma. This is minus three sigma. In between these two values I have 99.73% of the values. Once again, we are going fast in this topic and not really going into too much detail, but let’s understand that most of the values will be within plus or minus three sigma, and sigma here is 0.4 microns. So now let’s look at the formula for CP.

The formula for CP is upper specification limit minus lower specification limit, which are 53 and 47, the range within which our film thickness should be, divided by six times sigma within. Sigma within and sigma overall are two different calculations for the standard deviation. Sigma within is used for CP and CPK, and sigma overall is used for PP and PPK. We are not going into the details of how we calculate these two sigmas, but let’s understand that this is the standard deviation. So if I put in these values, 53 minus 47 divided by six times 0.4, this gives me 2.5. And if I plot this to have a better understanding, this is how my process looks. I have the lower specification limit and the upper specification limit, which are 47 and 53. My actual process is centered at 50, and this is the curve for that, with plus three sigma and minus three sigma from 50.

Three sigma will be 1.2, so this will be 51.2 and this will be 48.8. So this is minus three sigma and plus three sigma on this side. This is how everything could be shown in the form of a picture: most of the films which I’m producing are between 48.8 and 51.2, whereas my specification limits are 47 and 53. So everything is well within these two numbers, 47 and 53, and this is the reason I’m getting a CP value of 2.5. A CP of 2.5 is good; a CP of less than one is really bad. One means you are just meeting the requirement. Normally you are looking for at least 1.33, and anything above that is good.
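The CP arithmetic for the film example can be written out as a short Python sketch. All four input values come from the example; the variable names are only illustrative, and sigma is taken as given rather than estimated from data.

```python
# Values from the film-thickness example in the talk (microns).
usl = 53.0      # upper specification limit
lsl = 47.0      # lower specification limit
mean = 50.0     # process mean
sigma = 0.4     # standard deviation (sigma within)

# Cp compares the specification width with the 6-sigma process spread.
cp = (usl - lsl) / (6 * sigma)

# Natural spread of the process: mean +/- 3 sigma.
low = mean - 3 * sigma
high = mean + 3 * sigma

print(f"Cp = {cp:.2f}")
print(f"process spread: {low:.1f} to {high:.1f} microns")
```

Note that the mean appears only in the spread calculation and not in the CP formula, so moving the mean would leave CP unchanged.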

So here this process is running well. This is how you calculate the CP value. Now let’s say that in the 100 samples which we took, instead of getting 50 as the mean, the mean is now 52 microns. Will that make any change to the CP value? Unfortunately, no. Even if the mean is 52 microns, nowhere in this formula are we putting the mean value; we are just putting sigma. Sigma will be 0.4, and the upper and lower specification limits will be 53 and 47. Even if my mean has changed from 50 to 52, the value of CP will not change. That is the problem with CP, and for that we have another measurement, which is CPK. If we measure CPK instead of CP here, that will show that there is some problem. Let’s look at that and calculate CPK. For CPK you need to calculate two values: CPL, the lower one, and CPU, the upper one. Let me plot that, and that will make things clearer. So here is my 50 microns, which was the average in the first example.

This is my 47 microns, which is the lower specification limit, and 53 is my upper specification limit. In this particular example, my mean is coming out to be 52. So my curve is not here, where it was in the first example. Now my curve is here, with an average of 52 and a standard deviation of 0.4. A standard deviation of 0.4 means that if I go three sigma on this side, this will be 52 minus 1.2, which is 50.8, and on the other side 53.2. Now 53.2 is more than 53. That means there will be some defectives here: most of the items I am producing are between 50.8 and 53.2, and 53.2 is more than the upper specification limit, which is 53. That means some of the items will be rejected. This will be shown by the value of CPK. Now let’s calculate CPK. To calculate CPK, we calculate the two values CPL and CPU, and whichever is the minimum will be the CPK. Let’s calculate CPL, which is the lower one. The lower one is on the left side, and it is the process mean, which here is 52, minus the lower specification limit, which is 47, divided by three times sigma, which is 0.4. So this gives me five divided by 1.2. Similarly, CPU, the upper one, is the upper specification limit (USL), which is 53, minus the process mean, which is 52, divided by three times 0.4, which gives me one divided by 1.2. Now, which one of these is the minimum? Obviously this one, and if I calculate its value, it comes out to be 0.83. So my CPK here is 0.83, which is less than one. For the same example, CP was coming out to be 2.5, which was not showing that there is some problem. And this problem is because of the mean getting shifted from 50 to 52.
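The same CPK calculation can be sketched in Python. The function name `cpk` and its arguments are illustrative; the input values are the ones used above. The last step, estimating the share of film above the upper limit, is an addition that assumes the thickness is normally distributed; the session does not compute it.

```python
from statistics import NormalDist


def cpk(mean, sigma, lsl, usl):
    """Return (CPL, CPU, Cpk) for a process with the given mean and sigma."""
    cpl = (mean - lsl) / (3 * sigma)
    cpu = (usl - mean) / (3 * sigma)
    return cpl, cpu, min(cpl, cpu)


# Shifted process from the talk: mean 52, sigma 0.4, limits 47 and 53.
cpl, cpu, cpk_value = cpk(mean=52.0, sigma=0.4, lsl=47.0, usl=53.0)
print(f"CPL = {cpl:.2f}, CPU = {cpu:.2f}, Cpk = {cpk_value:.2f}")

# Assumption (not computed in the talk): thickness is normally distributed,
# so the share of film above the upper limit is the upper tail beyond 53.
above_usl = 1 - NormalDist(mu=52.0, sigma=0.4).cdf(53.0)
print(f"share above USL: {above_usl:.2%}")
```

Calling `cpk(mean=50.0, sigma=0.4, lsl=47.0, usl=53.0)` gives a CPK of about 2.5, matching CP for the centered process.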

So CPK is 0.83. That means there is some problem here and something needs to be done. This is how you calculate CP and CPK. Now let’s quickly look at the two types of sigma which we use for CP and CPK and for PP and PPK. In CP and CPK we use sigma within, which is the short-term standard deviation. The long-term standard deviation, also called sigma overall, is what we use in PP and PPK. All the formulas which we used for CP and CPK are the same. You can use the same formulas to calculate PP and PPK; the only thing is that you need to calculate the standard deviation in a different way. Now, what is a good value of CPK? A good value of CPK is 1.33 or more, which is equivalent to four sigma in six sigma terms. We are not talking about six sigma projects here, and we are not going into the details of six sigma concepts.

But in terms of six sigma, a CPK of 1.33 means we are at the four sigma level. If you want to go to a higher level, then you need to take the CPK to two, which is equivalent to six sigma. So if your CPK comes out to be two, that is excellent. This was the case in our earlier calculation: when the process was centered, the value came out to be 2.5, which was excellent. When you have a CPK value of less than one, that means there is a problem and defective items are being produced. A CPK of one is just capable: any small disturbance will affect the process and it will start making rejectable items. Now, let’s quickly look at some cases where we have CP and CPK values, and with this we will end our session. In case number one, we have a CP of 1.33 and a CPK of 2.0. If you get these two values and you are asked to interpret them, you should immediately say that CPK cannot be greater than CP. CPK will always be less than or equal to CP. So there is something wrong with the calculation, and these values are wrong. Now let’s look at the second example, where the CP is 1.33 and the CPK is 0.8. How is this possible? This is possible when your mean has shifted.

So these are the lower specification limit and the upper specification limit, and this is the mean. Let’s say your mean has shifted to the right. This is your USL, this is your LSL, this is your mean. If your mean has shifted to this place here, you might still have a CP of 1.33, because the width of the process is less than the width of the specification. So you get a good CP, but your CPK is bad because the mean has shifted. If you get a CPK which is less than CP, then your immediate reaction should be that this process could be improved by centering it. If you center it back, your CPK will also increase. By centering, you can take your CPK at the maximum up to the CP level. Another example here is when you have a CP of 0.8 and the CPK is also 0.8. When both of these are equal, that means the process is centered. This is one thing which you can conclude. The second thing which you can conclude is that the process is not capable, because a CP or CPK of less than one means your process has too much variation. So the shape of this process will look something like this: the lower specification limit (LSL), the upper specification limit, and your curve will be something like this.

Your process is something like this, which is very wide and has a lot of fluctuation or variation. What you need to do in this particular case is reduce the variation in the process. That is one option. Or you might have to revise your specification limits, if the customer is willing to accept that. So, this was a brief discussion on CPK.
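The three cases discussed above can be collected into a small helper. This is only a sketch of the reasoning in this session: the function name `interpret`, the wording of the messages and the tolerance used for comparing the two values are all illustrative choices, not a standard.

```python
def interpret(cp, cpk, tol=1e-9):
    """Return a short reading of a Cp/Cpk pair, following the cases in the talk."""
    if cpk > cp + tol:
        return "invalid: Cpk cannot exceed Cp, recheck the calculation"
    notes = []
    if abs(cp - cpk) <= tol:
        notes.append("centered")
    else:
        notes.append("off-center, centering would raise Cpk toward Cp")
    if cp < 1:
        notes.append("not capable, reduce variation or revisit the specification")
    elif cpk < 1:
        notes.append("producing defectives until re-centered")
    else:
        notes.append("capable")
    return "; ".join(notes)


for cp, cpk in [(1.33, 2.0), (1.33, 0.8), (0.8, 0.8)]:
    print(f"Cp={cp}, Cpk={cpk}: {interpret(cp, cpk)}")
```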
