December 22, 2005 at 7:16 am #41802
Can anyone explain the difference between standard deviation (between), standard deviation (within) and standard deviation (overall)? What are the formulas, and how are they related?
In Minitab, when I use the “Capability Sixpack” to calculate Cp, Cpk, Pp and Ppk, the standard deviation it reports is different from the one in the graphical summary. Why?
CG, it refers to the variation within and between subgroups, which gives a better understanding of the special causes and lets you compare short-term and long-term effects.
The formula is SStotal = SSwithin + SSbetween
where SS : Sum of square of variation
For your second problem, I need more details. In the meantime, you may want to look at the way you have handled the subgroups.
Hope this helps.
A.
December 22, 2005 at 9:26 am #131493
Thanks Anirvan Sen. I am studying standard deviation. I will post again if I have any problems. Thanks a lot.
December 22, 2005 at 10:17 am #131496
Anirvan,
maybe due to my poor English I don’t correctly understand what you mean by “Sum of square of variation”. Are you saying that the ‘total standard deviation’ is the sum of the ‘within standard deviation’ and the ‘between standard deviation’?
Rgs, Peppe
December 22, 2005 at 10:25 am #131497
Cg,
Overall SD is the long-term standard deviation. SD (within) is the short-term standard deviation.
“Within” indicates the variation of the individual points in a subgroup from that subgroup’s mean, in other words the short-term variation.
“Between” indicates the variation of the individual subgroup means from the overall mean of the sample.
“Overall” indicates the variation of the individual data points from the overall mean. This also indicates the long-term variation. As mentioned by Anirvan, SSoverall = SSwithin + SSbetween. If you calculate the standard deviation in Excel with “=STDEV(data points)”, you will get the overall standard deviation.
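To make the three definitions concrete, here is a minimal sketch in Python. The data values and variable names are made up for illustration and are not from anyone's process. It computes the three sums of squares directly from the definitions and checks that they add up.

```python
# Hypothetical data: 3 subgroups of 4 measurements each (illustrative values only).
subgroups = [
    [10.1, 9.8, 10.3, 10.0],
    [10.6, 10.4, 10.9, 10.5],
    [9.7, 9.9, 9.5, 9.6],
]

def mean(values):
    return sum(values) / len(values)

all_points = [x for group in subgroups for x in group]
grand_mean = mean(all_points)

# Within: each point compared with its own subgroup mean.
ss_within = sum((x - mean(group)) ** 2 for group in subgroups for x in group)

# Between: each subgroup mean compared with the overall mean,
# weighted by the number of points in the subgroup.
ss_between = sum(len(group) * (mean(group) - grand_mean) ** 2 for group in subgroups)

# Overall (total): each point compared with the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_points)

# Overall standard deviation: the sample standard deviation of all points,
# which is what Excel's STDEV computes.
sd_overall = (ss_total / (len(all_points) - 1)) ** 0.5

print("SS within :", ss_within)
print("SS between:", ss_between)
print("SS total  :", ss_total)
print("SD overall:", sd_overall)
```

Note that it is the sums of squares that add up, not the standard deviations themselves.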
For your second problem, the graphical summary SD value not matching the value in the capability analysis: the reason is that the “unbiasing constants” option is selected in the Estimate tab of Minitab. If you uncheck it, the overall value will be the same as in the graphical summary. I have no clue about the unbiasing constants. Maybe someone in the forum can tell us.
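On the unbiasing constants: the constant usually meant here is c4, which corrects the small downward bias of the sample standard deviation s as an estimate of sigma. The sketch below assumes that the capability analysis reports s / c4(N) when the option is checked. That is my understanding and should be verified against the Minitab help. The data are hypothetical.

```python
import math
import statistics

def c4(n):
    """Unbiasing constant c4 for a sample of n points (n >= 2)."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)

# Hypothetical measurements, for illustration only.
data = [10.1, 9.8, 10.3, 10.0, 10.6, 10.4, 10.9, 10.5]

s = statistics.stdev(data)       # plain sample standard deviation
s_unbiased = s / c4(len(data))   # assumed form of the "unbiased" overall estimate

print("c4(N)       :", c4(len(data)))
print("s           :", s)
print("s / c4(N)   :", s_unbiased)
```

Because c4 is slightly below 1 and approaches 1 as N grows, the corrected value is a little larger than s, and the gap shrinks for large data sets.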
Thanks … Siva
December 22, 2005 at 10:35 am #131498
Peppe,
To make it simple, if you have 5 data points:
1. Find the mean.
2. Find the difference between each data point and the mean. This is called the deviation.
3. Square each deviation and sum them. This is the sum of squares.
4. Variance is the sum of squares divided by the degrees of freedom (n-1), in this case 5-1 = 4.
5. The square root of the variance is your standard deviation.
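The five steps can be sketched in a few lines of Python. The five data points are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import statistics

data = [4.0, 7.0, 5.0, 9.0, 5.0]  # hypothetical data points

# Step 1: mean = 30 / 5 = 6
mean = sum(data) / len(data)

# Step 2: deviation of each point from the mean: -2, 1, -1, 3, -1
deviations = [x - mean for x in data]

# Step 3: sum of squares = 4 + 1 + 1 + 9 + 1 = 16
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 4: variance = sum of squares / (n - 1) = 16 / 4 = 4
variance = sum_of_squares / (len(data) - 1)

# Step 5: standard deviation = square root of variance = 2
std_dev = variance ** 0.5

print(std_dev)                 # 2.0
print(statistics.stdev(data))  # same value from the standard library
```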
If you are at a GB level, you need not bother about the formulas involved in calculating the sums of squares (overall, within and between). Simply use Minitab to find your standard deviation.
Thanks … Siva
December 22, 2005 at 10:39 am #131499
Siva, thanks for the explanation, but my question was whether it is correct to talk of a sum of standard deviations, because I’ve always heard about a sum of variances.
Sum of Squares refers to the sum of squared deviations. For a subgroup, it is the summation of the squared deviations of the individual points from the mean of the subgroup.
SSTotal: assesses the deviation of each individual point from the overall mean
SSBetween: assesses the deviation of each subgroup mean from the overall mean
SSWithin: assesses the deviation of each individual point from its corresponding group mean
December 22, 2005 at 10:50 am #131501
peppe,
There is nothing called “sum of standard deviation” or “sum of variance”. You can only find out the sum of squares SS .
Thanks. Siva
December 22, 2005 at 11:17 am #131502
Siva, thanks.
Rgs, Peppe
December 22, 2005 at 1:37 pm #131503
CG,
The standard deviations are different for Ppk and Cpk because one (Ppk) represents the variability based on all the data and the other (Cpk) only considers the within-subgroup variation. The one based on all the data is self-explanatory.
Estimating sigma from within-subgroup variation is based on the average range of all the subgroups, (Range1 + Range2 + Range3 + … + Range n) / n, where n is the number of subgroups. (There is more to the formula.)
If the subgroup means happen to change due to “temporal” variation, this will not affect the estimate of the standard deviation when using the within-subgroup method.
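A rough sketch of the average-range method, in Python. The classical form divides the average range by the control-chart constant d2, which depends on the subgroup size; I am assuming that is the "more to the formula" part, and Minitab may use a different within-subgroup estimator depending on its settings. The data and shifts are hypothetical. The second data set is the first one with the subgroup means moved, to show that the within estimate does not react while the overall standard deviation does.

```python
import statistics

# Standard d2 constants for subgroup sizes 2 to 5.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def sigma_within(subgroups):
    """Average range divided by d2 (assumes equal subgroup sizes of 2 to 5)."""
    size = len(subgroups[0])
    average_range = sum(max(g) - min(g) for g in subgroups) / len(subgroups)
    return average_range / D2[size]

def sigma_overall(subgroups):
    """Sample standard deviation of all points pooled together."""
    return statistics.stdev([x for g in subgroups for x in g])

# Hypothetical stable process: subgroup means are all close to 10.
stable = [
    [10.1, 9.8, 10.3, 10.0],
    [10.2, 9.9, 10.0, 10.3],
    [9.9, 10.1, 10.2, 9.8],
]

# Same subgroups, but the second and third have drifted by +1 and -1.
shifts = [0.0, 1.0, -1.0]
shifted = [[x + shift for x in g] for g, shift in zip(stable, shifts)]

print("within  (stable, shifted):", sigma_within(stable), sigma_within(shifted))
print("overall (stable, shifted):", sigma_overall(stable), sigma_overall(shifted))
```

The within estimate is essentially identical for both data sets, while the overall estimate grows once the subgroup means move apart. That gap is what separates Cpk from Ppk.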
You will have to check the history books, but I would say that the method of using the average range originated before calculators were invented. One could easily determine the range of many subgroups and then average them in the process of estimating sigma.
The question about Std within, between, and overall is suggestive of ANOVA. Variability is decomposed into parts, and the ratio of between to within (the F-ratio) is used for determining significance. Within represents the error term (replicating under the same conditions).
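As a small illustration of the ANOVA view, the sketch below computes the F-ratio for a one-way layout as the between mean square divided by the within mean square. The function name and data are hypothetical, and judging significance would still need the F distribution with (k-1, N-k) degrees of freedom, which is not shown here.

```python
def f_ratio(subgroups):
    """One-way ANOVA F-ratio: mean square between / mean square within."""
    k = len(subgroups)                              # number of subgroups
    n_total = sum(len(g) for g in subgroups)        # total number of points
    grand_mean = sum(x for g in subgroups for x in g) / n_total
    means = [sum(g) / len(g) for g in subgroups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(subgroups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(subgroups, means) for x in g)

    ms_between = ss_between / (k - 1)        # between degrees of freedom: k - 1
    ms_within = ss_within / (n_total - k)    # within (error) degrees of freedom: N - k
    return ms_between / ms_within

# Hypothetical data: three subgroups with means 2, 3 and 4.
example = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(f_ratio(example))  # SS between = 6, SS within = 6, so F = (6/2) / (6/6) = 3.0
```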
