
© Peter Broadfoot 2008
Histograms
Appendix B –  Variable Class Width
You have seen that if you use frequency on the y-axis of a histogram, there is a problem
with different class widths.  This isn’t just a problem when comparing two histograms that
use different class widths.  Different widths are commonly used within the same histogram.
For example, in the first example of a histogram, distances travelled to work were grouped
into equal 2-mile widths.  You don’t have to use equal widths.  You could create classes
from 1 to 4 miles, then 4 to 6 miles, then 6 to 7 miles, then 7 to 8 miles, then 8 to 10 miles and
finally 10 to 13 miles.  The class widths would be 3, 2, 1, 1, 2 and 3 miles.  The total range
is still from 1 to 13 miles.
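The class widths above can be checked with a few lines of Python. This is only a minimal sketch: the variable names are chosen here for illustration, and the boundaries are the ones listed in the text.

```python
# Class boundaries from the example above, in miles.
boundaries = [1, 4, 6, 7, 8, 10, 13]

# Each class width is the upper boundary minus the lower boundary.
widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]

# The widths add up to the total range covered, from 1 to 13 miles.
total_range = boundaries[-1] - boundaries[0]

print(widths)       # [3, 2, 1, 1, 2, 3]
print(total_range)  # 12
```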
There are a number of reasons why a histogram is drawn with different class widths.
The outer groups typically contain relatively few data, whereas the central part of a
histogram contains most of the data.  If you use narrow groups, the height of the
outer bars will fluctuate randomly so that a pattern cannot be seen.  If the groups are
too wide you can miss some of the finer detail of the pattern.  To make the shape of a
histogram clearer, you choose the class widths carefully – not too narrow and not too
wide.  For example, you may use narrow groups in the central region and wider
groups for the outer regions.
The data may already be grouped and you do not have the original raw data.
Sometimes the class widths are pre-determined.  This is similar to the previous
reason.  If the data are about students, you may group according to age, with 1-year
class widths.  However, the data may be available in natural groupings that are
generally not of equal width, such as pre-school, early primary, late primary, 11-16 years,
16-18 years and adult students.
The definition of a histogram, in which the area, not the height, represents frequency, is
better, because then the overall height and shape of the histogram do not depend on the
class widths you use.  To understand this point you have to think of a histogram as a picture
that is characteristic of the data.  Both its shape and its height are important.  By using
frequency for the area of each bar, and frequency density for the height, the histogram
retains its recognisable shape and its correct height when the class widths are adjusted.
With different class widths on a histogram, the use of frequency for the height of the bars
would be like viewing a histogram through a distorting mirror.  The heights of the wider
bars grow out of proportion and the shape is not recognisable.  The use of frequency density
scales the height of each bar according to its class width.
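The scaling can be sketched in Python as follows. The class widths are the ones from the distances example earlier in this appendix, but the frequencies are made up purely for illustration and are not taken from any real data; the function name is also just a label chosen for this sketch.

```python
def frequency_density(frequencies, widths):
    """Return the bar heights: each frequency divided by its class width."""
    return [f / w for f, w in zip(frequencies, widths)]


widths = [3, 2, 1, 1, 2, 3]        # class widths in miles, from the text
frequencies = [6, 10, 8, 7, 8, 3]  # hypothetical counts, for illustration only

densities = frequency_density(frequencies, widths)

# The area of each bar (height times width) gives back the frequency.
areas = [d * w for d, w in zip(densities, widths)]

print(densities)  # [2.0, 5.0, 8.0, 7.0, 4.0, 1.0]
print(areas)      # [6.0, 10.0, 8.0, 7.0, 8.0, 3.0]
```

With these made-up counts, the two outer classes have heights of 2 and 1 rather than 6 and 3, so the wide bars no longer look taller than they should.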
We’ll illustrate this with an example similar to Appendix A, where we compared the bars
on a histogram to boxes for holding data.
In this histogram we’ll start with just the first two bars.  They
are the same height and width: each bar has height h = 4 and
width w = 1.  Because the width is one, the frequency
equals the frequency density: fd = 4/1 = 4.  You can therefore
use either frequency or frequency density on the y-axis.  Here we
have used frequency because the intention is to show the
effect on the shape.
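The arithmetic for these two bars can be sketched in Python. The last part of the sketch is an assumption added here for illustration, not a step taken in the text: it supposes the two classes are merged into a single class of width 2, to show how the two measures of height would then differ.

```python
# The two bars described above, as (frequency, width) pairs.
bars = [(4, 1), (4, 1)]

# With a width of 1, frequency density equals frequency.
densities = [f / w for f, w in bars]
print(densities)  # [4.0, 4.0]

# Assumption for illustration: merge the two classes into one class of width 2.
merged_frequency = sum(f for f, _ in bars)
merged_width = sum(w for _, w in bars)
merged_density = merged_frequency / merged_width

print(merged_frequency)  # 8, the height if frequency were used
print(merged_density)    # 4.0, the height if frequency density is used
```

Under that assumption, a frequency axis would double the height of the merged bar, while a frequency density axis would leave it at 4.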
[Figure: histogram showing the first two bars, with frequency on the y-axis.]