Wednesday, October 18, 2006

Statistics -- Chapter 2

organizing and summarizing data

Frequency distribution -- lists each category of data and the number of occurrences for each category of data
relative frequency -- the proportion (or percent) of observations in a category, found using the formula: relative frequency = frequency / sum of all frequencies
relative frequency distribution -- lists each category of data together with the relative frequency
bar graph -- constructed by labeling each category of data on a horizontal axis and the frequency or relative frequency of the category on the vertical axis. Rectangles of equal width are drawn for each category. The height of each rectangle is equal to the category's frequency or relative frequency
pareto chart -- a bar graph whose bars are drawn in decreasing order of frequency or relative frequency
pie chart -- a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category
histogram -- constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other
classes -- categories of data created by using intervals of numbers
lower class limit -- smallest value within the class
upper class limit -- largest value within the class
class width -- difference between consecutive lower class limits
open ended -- the first class has no lower class limit or the last class does not have an upper class limit
stem and leaf plot -- another way to represent quantitative data graphically
dot plot -- drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed
class midpoint -- found by adding consecutive lower class limits and dividing the result by two
frequency polygon -- drawn by plotting a point above each class midpoint on a horizontal axis at a height equal to the frequency of the class. After the points for each class are plotted, straight lines are drawn between consecutive points
cumulative frequency distribution -- displays the aggregate frequency of the category. For discrete data, it displays the total number of observations less than or equal to the category. For continuous data, it displays the total number of observations less than or equal to the upper class limit of a class
cumulative relative frequency distribution -- displays the proportion (or percentage) of observations less than or equal to the category for discrete data and the proportion of observations less than or equal to the upper class limit for continuous data
ogive -- a graph that represents the cumulative frequency or cumulative relative frequency for the class. It is constructed by plotting points whose x-coordinates are the upper class limits and whose y-coordinates are the cumulative frequencies or cumulative relative frequencies. After the points for each class are plotted, straight lines are drawn between consecutive points
time-series data -- the value of the variable is measured at different points in time
time-series plot -- obtained by plotting the time at which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis. Lines are then drawn connecting the points
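
To make the frequency, relative frequency, and cumulative definitions above concrete, here is a minimal Python sketch. It is not from the textbook: the function names and the sibling counts are made up for illustration, and it assumes the categories can be sorted in increasing order.

```python
from collections import Counter

def frequency_table(observations):
    """Return one row per category:
    (category, frequency, relative frequency,
     cumulative frequency, cumulative relative frequency)."""
    counts = Counter(observations)
    total = sum(counts.values())      # sum of all frequencies
    rows = []
    running = 0                       # observations <= current category
    for category in sorted(counts):   # categories in increasing order
        freq = counts[category]
        running += freq
        rows.append((category, freq, freq / total, running, running / total))
    return rows

def pareto_order(observations):
    """Return (category, frequency) pairs in decreasing order of frequency,
    the order in which a Pareto chart draws its bars."""
    return Counter(observations).most_common()

# Hypothetical discrete data: number of siblings reported by 10 students.
siblings = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]

for row in frequency_table(siblings):
    print(row)
# (0, 2, 0.2, 2, 0.2)
# (1, 3, 0.3, 5, 0.5)
# (2, 4, 0.4, 9, 0.9)
# (3, 1, 0.1, 10, 1.0)

print(pareto_order(siblings))
# [(2, 4), (1, 3), (0, 2), (3, 1)]
```

The relative frequencies add up to 1 and the last cumulative frequency equals the number of observations, which is a quick way to check a table built by hand.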

Summary
Raw data are first organized into tables. Data are organized by creating classes into which they fall. Qualitative data and discrete data have values that provide clear-cut categories of data. However, with continuous data the categories, called classes, must be created. Typically, the first table created is a frequency distribution, which lists the frequency with which each class of data occurs. Other types of distributions include the relative frequency distribution and the cumulative frequency distribution.
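
Creating classes for continuous data can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, its parameters, and the sample values are hypothetical, and each class is taken to hold values from its lower class limit up to (but not including) the next lower class limit.

```python
def class_frequencies(values, first_lower_limit, class_width, num_classes):
    """Return one row per class:
    (lower class limit, next lower class limit, class midpoint, frequency)."""
    freqs = [0] * num_classes
    for v in values:
        # Index of the class that contains v.
        index = int((v - first_lower_limit) // class_width)
        if not 0 <= index < num_classes:
            raise ValueError(f"{v} does not fall in any class")
        freqs[index] += 1

    rows = []
    for i, freq in enumerate(freqs):
        lower = first_lower_limit + i * class_width
        next_lower = lower + class_width      # class width = difference of
                                              # consecutive lower class limits
        midpoint = (lower + next_lower) / 2   # add consecutive lower limits, divide by two
        rows.append((lower, next_lower, midpoint, freq))
    return rows

# Hypothetical continuous data, grouped into 3 classes of width 5 starting at 0.
values = [2.3, 7.8, 4.1, 11.0, 5.0, 9.9, 14.2, 3.3]
for row in class_frequencies(values, first_lower_limit=0, class_width=5, num_classes=3):
    print(row)
# (0, 5, 2.5, 3)
# (5, 10, 7.5, 3)
# (10, 15, 12.5, 2)
```

The frequencies give the heights of the histogram rectangles, and the midpoints give the x-coordinates for a frequency polygon.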

Once data are organized into a table, graphs are created. For data that are qualitative, we can create bar graphs and pie charts. For data that are quantitative, we can create histograms, stem and leaf plots, frequency polygons, and ogives.
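
As one example of a graph for quantitative data, a stem and leaf plot can be built as plain text. The sketch below is an illustration, not the textbook's procedure: it assumes non-negative whole numbers, uses the tens as the stem and the ones digit as the leaf, and the exam scores are made up.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a stem and leaf plot for non-negative whole numbers."""
    if not values:
        return []
    stems = defaultdict(list)
    for v in sorted(values):          # sorting puts the leaves in increasing order
        stem, leaf = divmod(v, 10)    # e.g. 78 -> stem 7, leaf 8
        stems[stem].append(leaf)

    lines = []
    # Include every stem between the smallest and largest, even if it has no leaves.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# Hypothetical exam scores.
scores = [67, 72, 85, 91, 78, 88, 73, 95, 81, 79]
print("\n".join(stem_and_leaf(scores)))
# 6 | 7
# 7 | 2389
# 8 | 158
# 9 | 15
```

Unlike a histogram, the plot keeps every original value readable while still showing the shape of the distribution.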