Thursday, October 25, 2012

Graphs, correlations, and deviations!

Statistics Unit: Week 1

This week in Math 252 we went through a plethora of new information, which means we have a lot to cover today, so be prepared for a very long post! I will be breaking it up into sections for easy reading.

Part 1: Box & Whiskers Plot

A Box & Whiskers plot is a handy way to look at a set of data in a broader sense by highlighting only five key values: the lower extreme, the lower quartile, the median, the upper quartile, and the upper extreme. This leaves out the extra numbers that may not have any prominent significance. The lower extreme is the minimum, or lowest number in the data set, while the upper extreme is the maximum, or highest number.

Ex.) If you had a data set of the number of cookies each person has, with 1 being
the lowest and 100 being the highest, 1 = Lower Extreme and 100 = Upper Extreme.

Then we have the quartiles and the median, which a Box & Whiskers Plot relies on heavily to represent its data. The best way to describe them is to show you:

Ex.) Take this set of numbers: 1  2  3  4  5  6  7  8  9

First you would find the median of the set, which is the middle number.
 1  2  3  4  5  6  7  8  9
In this case, the median would be the number 5.

Then you find the lower quartile by looking at the numbers to the left of
the median (the lower half of the set) and finding the median of that half.
 1  2  3  4  5  6  7  8  9

Because the lower half (1 2 3 4) has an even number of values, you have to find the average of
the two middle numbers. In this case, those numbers are 2 and 3.
2+3 = 5 --> 5/2 = 2.5

The lower quartile is 2.5

Finally, you find the upper quartile by looking at the numbers to the right
of the median (6 7 8 9) and finding the median of that half.
 1  2  3  4  5  6  7  8  9

Again, you find the average of the two middle numbers, 7 and 8.
7+8 = 15 --> 15/2 = 7.5

The upper quartile is 7.5
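
If you would rather let a computer do the steps above, here is a minimal Python sketch of the same method (split the sorted data at the median, then take the median of each half). The function name five_number_summary is just something I made up for this post, and be aware that spreadsheets and calculators sometimes use a slightly different quartile rule, so their answers may not always match.

```python
from statistics import median

def five_number_summary(values):
    """Return the five values a Box & Whiskers plot needs.

    Uses the method from this post: the quartiles are the medians of the
    lower and upper halves, leaving the middle number out of both halves
    when the data set has an odd count. Needs at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # Skip the middle number when the count is odd.
    upper_half = data[half + 1:] if n % 2 else data[half:]
    return {
        "lower extreme": data[0],
        "lower quartile": median(lower_half),
        "median": median(data),
        "upper quartile": median(upper_half),
        "upper extreme": data[-1],
    }

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
for name, value in summary.items():
    print(f"{name}: {value}")
```

For the set 1 through 9, this gives the same answers as the hand calculation: a median of 5, a lower quartile of 2.5, and an upper quartile of 7.5.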

Now, it's time to construct your graph! 


Ultimately, a Box & Whiskers plot, when drawn out, is supposed to look like the image above. The lower quartile, upper quartile, and median should each be marked with a long, vertical line, and the lines are then connected to make a "box" shape. The purpose of this is to show the viewer that the middle half of the numeric values falls somewhere within that box.

**IMPORTANT: When creating a Box & Whiskers Plot, always keep the scale of your graph in mind. For example, if your lower quartile is 4 but it is drawn so that it looks closer to your upper extreme of 30 than to your lower extreme, then your scale is completely unbalanced. Always remember that a unit = 1 UNIT, and draw your points where you would expect to see them on a regular number line.
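
One way to convince yourself about the scale rule is to let a program place the marks. This is only a rough text sketch, not a real graphing tool: ascii_box_plot and chars_per_unit are names I invented, and the idea is simply that every unit on the number line gets the same number of characters.

```python
def ascii_box_plot(low, q1, median, q3, high, chars_per_unit=4):
    """Draw a one-line box plot where every unit has the same width."""
    def column(value):
        # Same rule for every value, so the spacing stays to scale.
        return round((value - low) * chars_per_unit)

    width = column(high) + 1
    line = ["-"] * width                      # whiskers by default
    for c in range(column(q1), column(q3) + 1):
        line[c] = "="                         # the box
    for value in (low, q1, median, q3, high):
        line[column(value)] = "|"             # the five marked values
    return "".join(line)

# The five values from the 1-through-9 example earlier in this post.
plot = ascii_box_plot(1, 2.5, 5, 7.5, 9)
print(plot)
# |-----|=========|=========|-----|
```

Because 2.5 is 1.5 units from the lower extreme and 2.5 units from the median, the left whisker comes out shorter than each half of the box, just as it would on a number line.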

Part 2: Causation v. Correlation

We also learned the differences between causation and correlation in class in reference to scatter plots and graphs in general. 
  • Causation means that one thing directly brings about another: when the cause happens, the effect follows.
    • A non-mathematical example of this would be if you fail to feed your pet - ultimately, your pet will die. 
  • Correlation, on the other hand, means that while there is a relationship between two things, it is not always correct to assume that relationship will cause something to happen. 
    • A non-mathematical example of correlation would be the relationship between hair length and how tall you are (as seen in the activity we did in class). Students found no direct correlation between height and hair length, even though the two are related in that both are a part of your body.
That being said, there are three types of correlation:
  • Positive Correlation: When one set of data increases, the other set of data also increases. Correspondingly, if one set of data decreases, so does the other set of data. An example of positive correlation in the form of a graph would look like this:
  • Negative Correlation: When one set of data increases, the other set of data decreases. An example of a negative correlation in the form of a graph would look like this:


  • Zero Correlation: This means there is no discernible relationship between the two sets of data. An example of zero correlation in the form of a graph would look like this:
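
The three types of correlation can also be checked with numbers instead of by eye. We did not cover this in class, so treat it as a side note: the sketch below computes the Pearson correlation coefficient, a standard measure that runs from -1 to 1. The data sets are made up for illustration, and the 0.3 cutoff in describe is an arbitrary choice of mine, not an official rule.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two sets move together around their means.
    together = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return together / (spread_x * spread_y)

def describe(r, cutoff=0.3):
    """Label a coefficient; the cutoff is an arbitrary assumption."""
    if r > cutoff:
        return "positive"
    if r < -cutoff:
        return "negative"
    return "roughly zero"

# Made-up data sets, one for each type of correlation.
xs = [1, 2, 3, 4, 5]
going_up = [2, 4, 6, 8, 10]
going_down = [10, 8, 6, 4, 2]
no_pattern = [3, 1, 4, 1, 3]

for label, ys in [("up", going_up), ("down", going_down), ("none", no_pattern)]:
    print(label, describe(pearson_r(xs, ys)))
```

A coefficient near 1 matches the positive graph, near -1 the negative graph, and near 0 the zero-correlation graph. None of this says anything about causation.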

Part 3: Standard Deviation

Finally, we learned about standard deviation. While we didn't go too in-depth, we did learn a couple of basics. For example, when data follows a bell-shaped pattern, about 68% of it falls within one standard deviation of the mean, about 95% within two, and about 99.7% within three. When drawn out, these percentages of data create a curved line. This is called the Bell Curve and it looks like this:
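
To see where the 68%, 95%, and 99.7% come from, here is a small Python sketch using NormalDist from the standard library's statistics module (Python 3.8 or newer). It assumes a perfect bell curve with mean 0 and standard deviation 1; share_within is a helper name I made up.

```python
from statistics import NormalDist

# A perfect bell curve with mean 0 and standard deviation 1.
bell = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of the data within k standard deviations of the mean."""
    return bell.cdf(k) - bell.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
# within 1 SD of the mean: 68.3%
# within 2 SD of the mean: 95.4%
# within 3 SD of the mean: 99.7%
```

The numbers we learned in class are these values rounded off. Real data is rarely a perfect bell, so the percentages are a rule of thumb.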

While it is not necessary to memorize the calculation for standard deviation (it's quite a long and tedious amount of work), I find that it's good to know what it is, if only for future reference. There are two formulas to calculate SD, but mathematicians tend to use the simple equation. An example of using this equation, which we did in class during our activity, looks like this:


I apologize for the poor quality and if it looks confusing, but I personally have trouble doing this equation so I'm definitely going to look into practicing it many more times!
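
For practice, here is a step-by-step Python sketch of a standard deviation calculation. I am not certain it is the same equation we used in class, so take it as an assumption: by default it divides by the number of values (the "population" version), and sample=True switches to dividing by one less than that (the "sample" version). The scores are made up for illustration.

```python
from math import sqrt

def standard_deviation(values, sample=False):
    """Standard deviation, worked out one step at a time."""
    n = len(values)
    mean = sum(values) / n                       # Step 1: find the mean
    squared = [(v - mean) ** 2 for v in values]  # Step 2: square each distance from the mean
    divisor = n - 1 if sample else n             # Step 3: pick the divisor
    variance = sum(squared) / divisor            # Step 4: average the squares
    return sqrt(variance)                        # Step 5: take the square root

# Made-up scores, not the numbers from our class activity.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(scores))               # population version
print(standard_deviation(scores, sample=True))  # sample version
```

For these scores the mean is 5, the squared distances add up to 32, and 32 / 8 = 4, so the population version comes out to exactly 2. The sample version is a little larger because it divides by 7 instead of 8.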

Closing
Whelp, that's all for now! Hope I helped in reiterating some of the lessons we learned in class this week. Until next week!