COURSE INSTRUCTOR: TEHSEEN IMRAAN
We continue our study of descriptive statistics with further tools for describing dispersion: dot plots, stem-and-leaf displays, quartiles, percentiles, and box plots. Dot plots, stem-and-leaf displays, and box plots give additional insight into where the values are concentrated, how they are dispersed, and the general shape of the data. Finally, we consider bivariate data, where we observe two variables for each individual or observation selected.
Dot plot: A graph for displaying a set of data. Each numerical value is represented by a dot placed above a horizontal number line. To develop a dot plot, we display a dot for each observation along a horizontal number line, indicating the value of each piece of data. For multiple observations of the same value, we pile the dots on top of each other.
STEPS TO CONSTRUCT DOT PLOT
Step 1: Sort the data from smallest to largest.
Step 2: Draw a number line and label it.
Step 3: Place a dot for each observation.
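The three steps above can be sketched in a few lines of Python. This is only a rough text rendering, assuming the observations are small non-negative whole numbers; the function name `dot_plot` and the sample data are invented for illustration.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one '.' per observation, piled above a number line."""
    counts = Counter(values)                       # how often each value occurs
    lo, hi = min(values), max(values)              # step 1: smallest and largest value
    axis = list(range(lo, hi + 1))                 # step 2: the number line
    width = max(len(str(lo)), len(str(hi))) + 1    # column width for each axis label
    rows = []
    # Step 3: place the dots, drawing the highest pile first.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("." if counts[v] >= level else " ").rjust(width) for v in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in axis))
    return "\n".join(rows)

# Hypothetical data, invented for illustration only.
print(dot_plot([3, 5, 5, 6, 6, 6, 8]))
```

Repeated values (here 5 and 6) show up as taller piles, which is how the plot reveals where the data are concentrated.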
STEM AND LEAF DISPLAYS
A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digit(s) become the stem, and the trailing digit the leaf. The stems are located along the main vertical axis, and the leaf for each observation along the horizontal axis.
To develop a stem-and-leaf chart, the first step is to locate the largest value and the smallest value. This provides the range of the stem values. The stem is the leading digit or digits of the number, and the leaf is the trailing digit. For example, the number 15 has a stem value of 1 and a leaf value of 5. Similarly, the number 231 has a stem value of 23 and a leaf value of 1.
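As a minimal sketch of the stem and leaf split described above, the following Python assumes non-negative whole numbers and takes the last digit as the leaf. The function names and the sample values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted leaves (trailing digit)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)    # e.g. 231 -> stem 23, leaf 1
        display[stem].append(leaf)
    # List every stem between the smallest and largest so gaps stay visible.
    return {s: display.get(s, []) for s in range(min(display), max(display) + 1)}

def format_display(display):
    """Stems down the vertical axis, leaves along the horizontal axis."""
    return "\n".join(
        f"{stem:>3} | {' '.join(map(str, leaves))}" for stem, leaves in display.items()
    )

# Hypothetical data, invented for illustration only.
print(format_display(stem_and_leaf([15, 12, 27, 31, 38, 38, 50])))
```

Keeping empty stems in the output preserves the shape of the distribution, since a gap between stems is itself informative.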
OTHER MEASURES OF DISPERSION
QUARTILES
First Quartile: The point below which one-fourth or 25% of the ranked data values lie. (It is designated Q1.)
Third Quartile: The point below which three-fourths or 75% of the ranked data values lie. (It is designated Q3.)
Logically, the median is the Second Quartile (designated Q2). The values corresponding to Q1, Q2, and Q3 divide a set of data into four equal parts.
DECILES AND PERCENTILES
Just as quartiles divide a distribution into 4 equal parts, deciles divide a distribution into 10 equal parts, and percentiles divide a distribution into 100 equal parts.
The procedure for finding a quartile, decile, or percentile for ungrouped data is to order the data from smallest to largest, then use text formula [4-1].
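Location of a Percentile: $L_p = (n + 1)\frac{P}{100}$, where n is the number of observations and P is the desired percentile.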
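A short sketch may help show how a percentile location is turned into a value. It assumes the location is computed as (n + 1) * P / 100 on the ordered data, and that a fractional location is resolved by moving that fraction of the way between the two neighbouring observations. The function name and sample data are invented for illustration.

```python
def percentile(values, p):
    """Value at percentile p (0 < p < 100), assuming location = (n + 1) * p / 100."""
    data = sorted(values)                  # order the data from smallest to largest
    n = len(data)
    location = (n + 1) * p / 100           # 1-based position in the ordered data
    lower = int(location)                  # whole part of the position
    fraction = location - lower            # how far to move toward the next value
    if lower < 1:                          # location falls before the first value
        return data[0]
    if lower >= n:                         # location falls at or after the last value
        return data[-1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])

# Hypothetical data, invented for illustration only.
scores = [10, 20, 30, 40, 50, 60, 70, 80]
print(percentile(scores, 25), percentile(scores, 50), percentile(scores, 75))
```

Quartiles and deciles are special cases: Q1, Q2, and Q3 are the 25th, 50th, and 75th percentiles, and the first decile is the 10th percentile.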
Box plot: A graphical display based on five statistics: the minimum value, Q1 (the first quartile), Q2 (the median), Q3 (the third quartile), and the maximum value. These five pieces of information are all we need to construct a box plot.
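The five statistics behind a box plot can be gathered with Python's standard library. This is a sketch rather than a plotting routine: it uses `statistics.quantiles` with `method="exclusive"`, which places the quartiles at (n + 1)-based positions and may differ slightly from other quartile conventions. The function name and sample data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Minimum, Q1, median, Q3, and maximum: the inputs to a box plot."""
    # n=4 asks for the three cut points that split the data into four parts.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return {"min": min(values), "Q1": q1, "median": q2, "Q3": q3, "max": max(values)}

# Hypothetical data, invented for illustration only.
print(five_number_summary([7, 1, 3, 5, 2, 6, 4]))
```

The box is drawn from Q1 to Q3 with a line at the median, and the lines extending from the box reach the minimum and maximum values.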
Coefficient of variation: The ratio of the standard deviation to the arithmetic mean, expressed as a percent.
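FORMULA FOR CV
$CV = \frac{s}{\bar{x}} \times 100$, where s is the standard deviation and $\bar{x}$ is the arithmetic mean.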
COEFFICIENT OF VARIATION
Characteristics of the coefficient of variation are:
It reports the variation relative to the mean.
It is useful for comparing distributions with different units.
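To illustrate why the coefficient of variation suits comparisons across different units, here is a minimal sketch. It assumes the sample standard deviation is the one intended; the function name and both data sets are invented for illustration.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percent of the arithmetic mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Hypothetical data in two different units, invented for illustration only.
incomes_dollars = [2000, 4000, 4000, 4000, 5000, 5000, 7000, 9000]
years_of_service = [2, 4, 4, 4, 5, 5, 7, 9]

print(round(coefficient_of_variation(incomes_dollars), 2))
print(round(coefficient_of_variation(years_of_service), 2))
```

Because the standard deviation is divided by the mean, the units cancel, so the two results can be compared directly even though one series is in dollars and the other in years.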
Four shapes of distribution
Coefficient of skewness: A measure that describes the degree of skewness, that is, in which direction and how strongly the distribution is skewed.
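Text Formula [4–3] is for Pearson’s Coefficient of Skewness: $sk = \frac{3(\bar{x} - \text{Median})}{s}$, where $\bar{x}$ is the arithmetic mean and s is the standard deviation.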
Characteristics of the coefficient of skewness are:
The coefficient of skewness, designated sk, measures the amount of skewness and may range from -3.0 to +3.0.
A value near -3, such as -2.57, indicates considerable negative skewness.
A value such as 1.63 indicates moderate positive skewness.
A value of 0, which will occur when the mean and median are equal, indicates the distribution is symmetrical and that there is no skewness.
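The behaviour described above can be checked with a short sketch. It assumes Pearson's coefficient is computed as three times the difference between the mean and the median, divided by the sample standard deviation; the function name and data sets are invented for illustration.

```python
import statistics

def pearson_skewness(values):
    """Pearson's coefficient of skewness, assumed here as 3 * (mean - median) / s."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = statistics.stdev(values)           # sample standard deviation
    return 3 * (mean - median) / s

# Hypothetical data, invented for illustration only.
symmetric = [1, 2, 3, 4, 5]                # mean equals median
right_tailed = [1, 2, 3, 4, 15]            # one large value pulls the mean up

print(pearson_skewness(symmetric))
print(round(pearson_skewness(right_tailed), 2))
```

The symmetric set gives a coefficient of 0 because its mean and median are equal, while the large value in the second set pulls the mean above the median and produces a positive coefficient.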
SUMMARY OF CHARTS
RELATIONSHIP BETWEEN TWO VARIABLES
Bivariate data: A collection of paired data values.
Scatter diagram: A graph in which paired data values are plotted on X and Y axes.
The steps to follow in developing a scatter diagram are:
Step 1: Select two variables.
Step 2: Scale one variable (x) along the horizontal axis (X-axis) of a graph and the corresponding variable (y) along the vertical axis (Y-axis).
Step 3: Place a dot for each (x, y) pair of observations.
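The scatter diagram steps can be sketched as a rough character grid. This is not a substitute for a real plotting tool: it assumes small whole-number values, marks each point with `*`, and labels only the vertical axis. The function names and the sample pairs are invented for illustration.

```python
def pair_observations(x_values, y_values):
    """Pair the two variables, one (x, y) tuple per observation."""
    if len(x_values) != len(y_values):
        raise ValueError("each observation needs both an x and a y value")
    return list(zip(x_values, y_values))

def text_scatter(pairs):
    """x runs along the horizontal axis, y along the vertical axis."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    points = set(pairs)
    rows = []
    for y in range(max(ys), min(ys) - 1, -1):        # largest y on the top row
        cells = "".join(
            "*" if (x, y) in points else " " for x in range(min(xs), max(xs) + 1)
        )
        rows.append(f"{y:>3} |{cells}".rstrip())
    return "\n".join(rows)

# Hypothetical paired data, invented for illustration only.
pairs = pair_observations([1, 2, 3, 4, 5], [2, 3, 3, 5, 6])
print(text_scatter(pairs))
```

Points that climb from lower left to upper right, as in this invented example, suggest that the two variables tend to increase together.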
Contingency table: A table used to classify sample observations according to two or more identifiable characteristics. When we study the relationship between two or more variables and one or both are nominal or ordinal scale, we tally the results into a two-way table. This two-way table is referred to as a contingency table.
A contingency table is a cross tabulation that simultaneously summarizes two variables of interest and their relationship. For example, a survey of 60 school children classified each as to gender and the number of times lunch was purchased at school during a four-week period. Each respondent is classified according to two criteria: the number of times lunch was purchased and gender.
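Tallying observations into a two-way table can be sketched as follows. The records below are invented and far fewer than the 60 children in the survey, whose actual counts are not given here; the function name and the category labels are also hypothetical.

```python
from collections import Counter

def contingency_table(observations):
    """Tally (row_category, column_category) pairs into a two-way table."""
    counts = Counter(observations)                      # count each pair
    rows = sorted({r for r, _ in observations})
    cols = sorted({c for _, c in observations})
    # Pairs that never occur are reported as 0 rather than left out.
    return {r: {c: counts[(r, c)] for c in cols} for r in rows}

# Hypothetical records (gender, lunches purchased), invented for illustration only.
records = [
    ("boy", "0-4"), ("boy", "5-9"), ("boy", "5-9"),
    ("girl", "0-4"), ("girl", "0-4"), ("girl", "10+"),
]
for gender, tallies in contingency_table(records).items():
    print(gender, tallies)
```

Each cell holds the number of respondents who fall into both categories at once, which is what lets the table summarize the two variables and their relationship together.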