Chi-Square Distribution
A lot of people have been asking lately, “Bruck, please tell me about the Chi-Square distribution.” Applied statistics being near and dear to my heart, I’m all too happy to oblige. But first, as they say, a picture is worth a thousand words, so here’s one of the Chi-Square distribution with ten degrees of freedom (Chi being a Greek letter that looks like a cursive capital X):
The next question, of course, is what can you do with it? Some big-brained genius long ago discovered that sample variances (squares of standard deviations) follow the Chi-Square distribution: for samples of size n drawn from a normally distributed population with variance σ², the quantity (n - 1)s²/σ² is Chi-Square distributed with n - 1 degrees of freedom. Actually proving this is way beyond the scope of your typical VOB missive, but it does make intuitive sense: for a given sample size, which fixes the number of degrees of freedom, the probability of a variance being less than zero is zero (squares of real numbers can never be negative), and the probability density rises rapidly toward a mode value. It then gradually tails off, since really large variances become less and less likely.
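If you'd rather see that shape than take it on faith, here is a minimal simulation sketch. It assumes a normally distributed population, and the sample size of 11 (ten degrees of freedom, to match the picture), the standard deviation of 3.0, the trial count, and the function names are all my own choices for illustration.

```python
import random

def scaled_sample_variance(rng, n, sigma):
    """Draw n normal values and return (n - 1) * s^2 / sigma^2."""
    xs = [rng.gauss(0.0, sigma) for _ in range(n)]
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    return (n - 1) * s2 / sigma ** 2

def simulate(n=11, sigma=3.0, trials=10000, seed=1):
    """Repeat the experiment many times with a fixed seed."""
    rng = random.Random(seed)
    return [scaled_sample_variance(rng, n, sigma) for _ in range(trials)]

values = simulate()
average = sum(values) / len(values)
print("smallest value:", min(values))   # never negative
print("average value:", average)        # should land near 10, the degrees of freedom
```

The average of a Chi-Square variable equals its degrees of freedom, so an average near 10 is a quick sanity check that the scaled variances behave as advertised.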
Ms. Palfrey was indicted last week on charges related to her purportedly running a prostitution business employing 132 women in DC. Palfrey insists that her business, Pamela Martin and Associates, is a perfectly legal escort service that employed “contractors” and clientele from “the higher walks of life.” The “contractors” were required to be college-educated women, and charged the clients up to $300 for an evening of “fantasy.”
Okay, fine, Bruck, but what has the Chi-Square distribution done for me lately?
The simplest thing you can do with it is compare the variance of a sample to that of a population whose variance is assumed to be known. Let’s say you were producing multiple copies of a particular item, and those items showed a certain known amount of variability from one item to the next. Then you change something in the process to try to reduce that variability. You could use the Chi-Square test to determine whether or not the process change made a difference.
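Here is a rough sketch of that variance test in plain Python. It assumes the measurements are roughly normal, and the item widths, the "known" variance of 0.04, and the function names are made up for illustration. The Chi-Square probability is computed with a standard series for the incomplete gamma function, so no extra libraries are needed.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a Chi-Square variable, via the incomplete gamma series."""
    if x <= 0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0
    total = 1.0
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a + 1.0))

def variance_test_lower(sample, known_variance):
    """Return the statistic (n - 1) * s^2 / sigma0^2 and its lower-tail probability."""
    n = len(sample)
    mean = sum(sample) / n
    s2 = sum((v - mean) ** 2 for v in sample) / (n - 1)
    stat = (n - 1) * s2 / known_variance
    return stat, chi2_cdf(stat, n - 1)

# Hypothetical item widths measured after the process change.
widths = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9, 10.0, 10.0]
stat, p_value = variance_test_lower(widths, known_variance=0.04)
print("statistic:", stat)
print("lower-tail probability:", p_value)
```

A small lower-tail probability says a sample variance this low would be unusual if the process still had the old variability, which is evidence that the change helped.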
For several weeks, Palfrey has been threatening to release her client records to the public, claiming that she needs the money for her legal defense, and also claiming that there are so many “high profile” clients on her lists that the uproar would be huge.
Of course, you might ask: what if you don’t know the variance in advance, or what if you want to compare the variance of one sample of items to that of another sample? Then what? Why then, of course, you would use the F test, named in honor of Sir Ronald Fisher. The F statistic follows the distribution of the ratio of two independent Chi-Square variables, each divided by its degrees of freedom.
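To make the "ratio of two Chi-Squares" idea concrete, here is a small sketch that computes the F statistic for two samples and then estimates its tail probability by simulating that ratio directly, rather than by looking it up in an F table. The two batches of numbers, the trial count, and the function names are hypothetical, and the simulation gives only an approximate probability.

```python
import random

def sample_variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def f_statistic(a, b):
    """Ratio of the two sample variances."""
    return sample_variance(a) / sample_variance(b)

def f_upper_tail_mc(f_obs, d1, d2, trials=20000, seed=1):
    """Estimate P(F >= f_obs) by simulating (chi2_d1 / d1) / (chi2_d2 / d2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # A Chi-Square variable with d degrees of freedom is Gamma(d / 2, scale 2).
        num = rng.gammavariate(d1 / 2.0, 2.0) / d1
        den = rng.gammavariate(d2 / 2.0, 2.0) / d2
        if num / den >= f_obs:
            hits += 1
    return hits / trials

# Hypothetical measurements before and after a process change.
old_batch = [10.4, 9.5, 10.1, 9.7, 10.6, 9.6, 10.3, 9.8]
new_batch = [10.1, 9.9, 10.0, 10.2, 9.9, 10.0, 10.1, 9.8]
f_obs = f_statistic(old_batch, new_batch)
p_est = f_upper_tail_mc(f_obs, len(old_batch) - 1, len(new_batch) - 1)
print("F statistic:", f_obs)
print("estimated upper-tail probability:", p_est)
```

In practice you would use a statistics package or a table for the exact F probability; the simulation is here only to show where the F distribution comes from.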
Of course the tacit implication is that if the prosecution drops the case against her, the records will remain sealed. Presumably this would also allow her to continue to operate her escort service, and of course it would also allow her to continue to apply other forms of extortion as she sees fit.
Another, related use of the Chi-Square distribution is the test for association, or more directly, the test for proportions of attribute data. Let’s say you want to know if one teacher is better than another, based on which one’s students pass a test more frequently. You could use the Chi-Square test for association to make this determination.
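For the two-teacher example, the test boils down to a 2x2 table of pass and fail counts. Here is a minimal sketch, with made-up counts and function names of my own. It uses the plain Pearson statistic with no continuity correction, and it leans on the fact that with one degree of freedom the Chi-Square tail probability can be written with the complementary error function.

```python
import math

def chi2_association_2x2(table):
    """Pearson Chi-Square statistic and p-value for a 2x2 table of counts."""
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    grand_total = a + b + c + d
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if teacher and outcome were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X >= stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical results: rows are teachers, columns are (passed, failed).
results = [[30, 10],
           [20, 20]]
stat, p_value = chi2_association_2x2(results)
print("statistic:", stat)
print("p-value:", p_value)
```

A small p-value suggests the pass rate really does depend on the teacher, though it says nothing about why.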
But… is she bluffing? Is she holding two pair and raising? Or does she really have a bomb to drop on numerous prominent businessmen and politicians? Bruck says, “There’s only one way to find out!”
A third use of our friend Mr. Chi-Square is the “goodness of fit” test, which is used to determine whether or not a sample of data plausibly comes from a particular family of distributions - Normal, Binomial, Poisson, Uniform, or even the Chi-Square distribution itself. The way this is done is by cutting the data into “slices” (bins) and applying the test for proportions to them: the count observed in each slice is compared with the count the candidate distribution would predict.
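As a sketch of the slicing idea, suppose you want to check whether 100 values look Uniform by cutting the range into five equal slices. The counts below are invented, and so are the function names. To keep the example free of extra libraries, the tail probability uses a closed form that only works for an even number of degrees of freedom, which is why I picked five slices (four degrees of freedom, assuming no parameters were estimated from the data).

```python
import math

def chi2_sf_even_df(x, df):
    """P(X >= x) for a Chi-Square variable; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive, even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def goodness_of_fit(observed, expected):
    """Compare observed slice counts with the counts a distribution predicts."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # assumes no distribution parameters were estimated
    return stat, df, chi2_sf_even_df(stat, df)

# Hypothetical counts of 100 values falling into five equal-width slices.
observed = [24, 18, 21, 17, 20]
expected = [20.0] * 5  # what a Uniform distribution would predict
stat, df, p_value = goodness_of_fit(observed, expected)
print("statistic:", stat, "df:", df)
print("p-value:", p_value)
```

A large p-value means the slices look about the way the candidate distribution says they should; a tiny one means the data probably came from somewhere else.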
Of course you might well ask, “Bruck, what gives you the right to cast the first stone?” Well, I’m not really in the stone-throwing business - that’s the prosecutor’s job. I’m just telling you what I think. It seems pretty obvious to me that the charges are true, and I’m actually surprised that it’s taken this long for them to make a case against her. Rumors have been going around for years about this “escort service,” and she advertises for employees (or at least used to) pretty explicitly right in the local “alternative” papers. Whether she’s bluffing or not, we should follow the laws under which we live, and make her play her trump card in the process. If some dirty secrets get exposed, then her clients will eventually thank her and the prosecutors for giving them a good hard shove down the road to redemption.
Isn’t applied statistics fascinating? If you’re really interested, next time I’ll discuss Student’s t-test, another veritable goldmine of statistical functionality.