Ratings
are a central component of the television industry, and almost a
household word. They are important in television because they indicate
the size of an audience for specific programs. Networks and stations
then set their advertising rates based on the number of viewers
of their programs. Network revenue is thus directly related to the
ratings. The word "ratings," however, is actually rather confusing
because it has both a specific and a general meaning. Specifically,
a rating is the percentage of all the people (or households) in
a particular location tuned to a particular program. In a general
sense, the term is used to describe a process (also referred to
as "audience measurement") that endeavors to determine the number
and types of viewers watching TV.
One
common rating (in the specific sense) is the rating of a national
television show. This calculation measures the number of households--out
of all the households in the United States that have TV sets--watching
a particular show. There are approximately 92.4 million households
in the United States and most of them have sets. In order to simplify
the example, assume that there are 100,000,000 households. If 20,000,000
of them are watching NBC at 8:00 P.M., NBC's rating would be 20 (20,000,000/100,000,000 = 0.20, or 20 percent).
Another way to describe the process is to say that one rating point
is worth 1,000,000 households.
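The national example can be written out as a short calculation. What follows is a minimal sketch in Python; the function name `rating` and the variable names are assumptions made for illustration, not terms used by any ratings company.

```python
def rating(viewing_households: int, tv_households: int) -> float:
    """Return the rating: the percentage of all TV households tuned in."""
    if tv_households <= 0:
        raise ValueError("tv_households must be positive")
    return 100.0 * viewing_households / tv_households


# Simplified figures from the example above.
TV_HOUSEHOLDS = 100_000_000
nbc_rating = rating(20_000_000, TV_HOUSEHOLDS)

# One rating point is one percent of all TV households.
households_per_point = TV_HOUSEHOLDS // 100

print(nbc_rating)            # 20.0
print(households_per_point)  # 1000000
```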
Ratings
are also taken for areas smaller than the entire nation. For example,
if a particular city (Yourtown) has 100,000 households, and 15,000
of them are watching the local news on station KAAA, that station
would have a rating of 15. If Yourtown has a population of 300,000
and 30,000 people are watching KAAA, the station's rating would
be 10. And because television viewing is becoming less and less
of a group activity with the entire family gathered around the living
room TV set, some ratings are expressed in terms of people rather
than households.
Many
calculations are related to the rating. Sometimes people, even professionals
in the television business, confuse them. One of these calculations
is the share. This figure reports the percentage of households (or
people) watching a show out of all the households (or people)
who have the TV set on. So if Yourtown has 100,000 households
but only 50,000 of them have the TV set on and 15,000 of those are
watching KAAA, the share is 30 (15,000/50,000 = 0.30, or 30 percent). Shares are always
higher than ratings unless, of course, every household has its set
on, in which case the share equals the rating.
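A rating and a share differ only in the denominator. The sketch below works through the Yourtown figures; the helper name `percentage` and the variable names are assumptions made for illustration.

```python
def percentage(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


tv_households = 100_000       # all households in Yourtown
households_using_tv = 50_000  # households with the set on
watching_kaaa = 15_000        # households tuned to KAAA

kaaa_rating = percentage(watching_kaaa, tv_households)       # base: all households
kaaa_share = percentage(watching_kaaa, households_using_tv)  # base: sets in use

print(kaaa_rating, kaaa_share)  # 15.0 30.0
```

Because the number of households with the set on can never exceed the total number of households, a share computed this way can never be lower than the corresponding rating.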
Another
calculation is the cume, which reflects the number of different
persons who tune in a particular station or network over a period
of time. This number is used to show advertisers how many different
people hear their message if it is aired at different times such
as 7:00 P.M., 8:00 P.M., and 9:00 P.M. Suppose the total number of people
available is 100. If 5 of them view at 7:00, those 5 still view at 8:00
along with 3 new people, and at 9:00 2 people turn the TV off but 4 new
ones join the audience, the cume would be 12 (5+3+4=12). Cumes are
particularly important to cable networks
because their ratings are very low. Two networks with ratings of
1.2 and 1.3 cannot really be differentiated, but if the measurement
is taken over a wider time span, a greater difference will probably
surface.
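The cume is a count of distinct people, so it can be modeled as the size of a set union. The following sketch reproduces the example above; the numbered viewer IDs are hypothetical stand-ins for individual people.

```python
# Hypothetical viewer IDs, numbered 1 to 12, for the three air times.
audience_by_time = {
    "7:00": set(range(1, 6)),   # viewers 1-5
    "8:00": set(range(1, 9)),   # the same 5, plus new viewers 6-8
    "9:00": set(range(3, 13)),  # viewers 1 and 2 turn off; viewers 9-12 join
}

# The cume counts each person once, however many times he or she tunes in.
cume = len(set().union(*audience_by_time.values()))

# Adding the three audiences together would count repeat viewers again.
total_viewings = sum(len(viewers) for viewers in audience_by_time.values())

print(cume)            # 12
print(total_viewings)  # 23
```

The gap between the two figures is the point of the measurement: the cume reports how many different people were reached, not how many times the message was seen.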
Average
quarter hours (AQH) are another measurement. This calculation is
based on the average number of people viewing a particular station
(network, program) for at least five minutes during a fifteen-minute
period. For example, if, out of 100 people, 10 view for at least
five minutes between 7:00 and 7:15, 7 view between 7:15 and 7:30,
11 view between 7:30 and 7:45, and 4 view between 7:45 and 8:00,
the AQH rating would be 8 ((10+7+11+4)/4 = 32/4 = 8).
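The AQH arithmetic is an ordinary average over the quarter hours. A minimal sketch follows; the function name `aqh_persons` is an assumption, and the counts are taken to include only people who viewed for at least five minutes.

```python
from typing import Sequence


def aqh_persons(viewers_per_quarter_hour: Sequence[int]) -> float:
    """Average the qualifying viewers across the quarter hours measured."""
    if not viewers_per_quarter_hour:
        raise ValueError("at least one quarter hour is required")
    return sum(viewers_per_quarter_hour) / len(viewers_per_quarter_hour)


population = 100
# People viewing at least five minutes in each quarter hour from 7:00 to 8:00.
qualifying_viewers = [10, 7, 11, 4]

average_viewers = aqh_persons(qualifying_viewers)
aqh_rating = 100.0 * average_viewers / population

print(average_viewers, aqh_rating)  # 8.0 8.0
```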
Many
other calculations are possible. For example, if the proper data
has been collected, it is easy to calculate the percentage of women
between the ages of 18 and 34 or of men in urban areas who watch
particular programs. Networks and stations gather as much information
as is economically possible. They then try to use the numbers that
present their programming strategies in the best light.
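A demographic breakdown of this kind amounts to filtering the sample before computing the percentage. The sketch below assumes a hypothetical record layout (`Respondent`) and eight made-up respondents; real diary or meter data would carry many more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Respondent:
    sex: str       # "F" or "M"
    age: int
    area: str      # "urban" or "rural"
    watched: bool  # whether this person watched the program in question


def demographic_rating(sample, in_group) -> float:
    """Percentage of the respondents in a demographic group who watched."""
    group = [r for r in sample if in_group(r)]
    if not group:
        raise ValueError("no respondents in this demographic group")
    return 100.0 * sum(r.watched for r in group) / len(group)


# Made-up sample, for illustration only.
sample = [
    Respondent("F", 22, "urban", True),
    Respondent("F", 30, "rural", False),
    Respondent("F", 34, "urban", True),
    Respondent("F", 41, "urban", True),
    Respondent("M", 19, "urban", False),
    Respondent("M", 52, "urban", True),
    Respondent("M", 27, "rural", True),
    Respondent("F", 18, "rural", False),
]

women_18_34 = demographic_rating(sample, lambda r: r.sex == "F" and 18 <= r.age <= 34)
urban_men = demographic_rating(sample, lambda r: r.sex == "M" and r.area == "urban")

print(women_18_34, urban_men)  # 50.0 50.0
```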
The
general ratings (audience measurement) process has varied greatly
over the years. Audience measurement started in the early 1930s
with radio. A group of advertising interests joined together as
a non-profit entity to support ratings known as Crossleys, named
after Archibald Crossley, the man who conducted them. Crossley used
random numbers from telephone directories and called people in about
thirty cities to ask them what radio programs they had listened
to the day before his call. This method became known as the recall
method because people were remembering what they had listened to
the previous day. Crossleys existed for about fifteen years but
ended in 1946 because several for-profit commercial companies began
offering similar services that were considered better.
One
of these, the Hooper ratings, was begun by C. E. Hooper. Hooper's
methodology was similar to Crossley's except that respondents were
asked what programs they were listening to at the time of the call--a
method known as the coincidental telephone technique. Another service,
The Pulse, used face-to-face interviewing. Interviewees selected
by random sampling were asked to name the radio stations they had
listened to over the past twenty-four hours, the past week, and
the past five midweek days. If they could not remember, they were
shown a roster containing station call letters to aid their memory.
This was referred to as the roster-recall method.
Today
the main radio audience measurement company is Arbitron. The Arbitron
method requires people to keep diaries in which they write down
the stations they listen to at various times of the day. In these
diaries, they also indicate demographic features--their age, sex,
marital status, etc.--so that ratings can be broken down by sub-audiences.
The main television audience measurement company is the A.C. Nielsen
Company. For many years Nielsen used a combination of diaries and
a meter device called the Audimeter. The Audimeter recorded the
times when a set was on and the channel to which it was tuned. The
diaries were used to collect demographic data and list which family
members were watching each program. Nielsen research in some markets
still uses diaries, but for most of its data collection, Nielsen
now attaches Peoplemeters to TV sets in selected homes. Peoplemeters
collect both demographic and channel information because they are
equipped with remote control devices. These devices accommodate
a number of buttons--one for each person in the household and one
for guests. Each person watching TV presses his or her button, which
has been programmed with demographic data, to indicate viewing choices
and activities.
There
are also companies that gather and supply specialized ratings. For
example, one company specializes in data concerning news programs
and another tracks Latino viewing.
All audience measurement is based on samples. As yet there is no
economical way of finding out what every person in the entire country
is watching. Diaries, meters, and phone calls are all expensive,
so samples are sometimes small, in some cases no more than .004
percent of the population being surveyed. However, the rating companies
try to make their samples as representative of the larger population
as possible. They consider a wide variety of demographic features--size
of family, sex and age of head of household, access to cable TV,
income, education--and try to construct a sample comprising the
same percentage of the various demographic traits as in the general
population.
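One way to picture a sample that mirrors the population is proportional allocation: each demographic group receives the same fraction of the sample that it holds in the population. The sketch below is an illustration only. The group names and counts are hypothetical, and the largest-remainder rounding is a choice made here, not a documented practice of any ratings company.

```python
def proportional_allocation(population_counts: dict, sample_size: int) -> dict:
    """Split sample_size across groups in proportion to their population counts."""
    total = sum(population_counts.values())
    exact = {g: sample_size * n / total for g, n in population_counts.items()}

    # Start from the whole-number part of each group's exact allocation.
    allocation = {g: int(amount) for g, amount in exact.items()}

    # Hand any leftover places to the groups with the largest remainders.
    leftover = sample_size - sum(allocation.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - allocation[g], reverse=True)
    for g in by_remainder[:leftover]:
        allocation[g] += 1
    return allocation


# Hypothetical viewing area: 62 percent of households have cable TV.
population = {"cable": 62_000, "no cable": 38_000}
sample = proportional_allocation(population, 400)

print(sample)  # {'cable': 248, 'no cable': 152}
```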
In order to select a representative sample, the companies attempt
to locate every housing unit in the country (or city or viewing
area), mainly by using readily available government census data.
Once all the housing units are accounted for, a computer program
is used to randomly select the sample group in such a way that each
location has an equal chance of being selected. Company representatives
then write or phone people in the households that have been selected,
trying to secure their cooperation. About 50% of those selected
agree to participate. People are slightly more likely to allow meters
in their house and to answer questions over the phone than they
are to keep diaries. Very little face-to-face interviewing is now
conducted because people are reluctant to allow strangers into their
houses. When people refuse to cooperate, the computer program selects
more households until the number needed for the sample has agreed
to participate.
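The selection procedure described above can be sketched as a random draw that keeps going until enough households have agreed. This is a simplified illustration under stated assumptions: the function name `recruit_sample`, the numbered housing units, and the rule that every second unit refuses (standing in for the roughly 50% cooperation rate) are all hypothetical.

```python
import random


def recruit_sample(housing_units, needed, agrees_to_participate, seed=None):
    """Draw units in random order until `needed` of them agree to take part."""
    rng = random.Random(seed)
    candidates = list(housing_units)
    rng.shuffle(candidates)  # every unit is equally likely to land in any position

    recruited = []
    for unit in candidates:
        if len(recruited) == needed:
            break
        if agrees_to_participate(unit):
            recruited.append(unit)  # a refusal is simply passed over
    return recruited


# Hypothetical viewing area of 1,000 housing units; odd-numbered units refuse.
units = range(1000)
sample = recruit_sample(units, needed=50,
                        agrees_to_participate=lambda unit: unit % 2 == 0, seed=1)

print(len(sample))  # 50
```

Passing over refusals fills the quota, but it does nothing to make the households that cooperate resemble the ones that declined.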
Once sample members have agreed to participate, they are often contacted
in person. In the case of a diary, someone may show them how to
fill it out. In other cases the diary and instructions may simply
be sent in the mail. For a meter, a field representative goes to
the home (apartment, dorm room, vacation home, etc.) and attaches
the meter to the television set. This person must take into account
the entire video configuration of the home--multiple TV sets, VCRs,
satellite dishes, cable TV, and anything else that might be attached
to the receiver set. The field representative also trains family
members in the use of the meter.
People
participating in audience measurement are usually paid, but only
a small amount, such as fifty cents. Ratings companies have found
that paying people something makes them feel obligated, but paying
them a large amount does not make them more reliable.
Ratings
companies try to see that no one remains in the sample very long.
Participants become weary of filling out diaries or pushing buttons
and cease to take the activities seriously. Soliciting and changing
sample members is expensive, however, so companies do keep an eye
on the budget when determining how to update the sample.
Once
the sample is in order, the data must be collected from the participants.
For phone or face-to-face interviews, the interviewer fills in a
questionnaire and the data is later entered into a computer. For
meters, the data collected is sent over phone lines to a central
computer. People keeping diaries mail them back to the company and
employees then enter the data into a computer. Usually only about
50% of diaries are usable; the rest are never mailed back or are
so incorrectly filled out that they cannot be used.
From
the data collected and calculated by the computer, ratings companies
publish reports. These vary according to what was surveyed. Nielsen
covers commercial networks, cable networks, syndicated programming,
public broadcasting, and local stations. Other companies cover more
limited aspects of television. Reports on each night's primetime
national commercial network programming, based on Nielsen Peoplemeters,
are usually ready about twelve hours after the data is collected.
It takes considerably longer to generate a report based on diaries.
The reports dealing with stations are published less frequently
than those for primetime network TV. Generally station ratings are
undertaken four times a year--November, February, May, and July--periods
that are often referred to as "Sweeps." The weeks of the Sweeps
are very important to local stations because the numbers produced
then determine advertising rates for the following three months.
Most reports give not only the total ratings and shares but also
information broken down into various demographic categories--age,
sex, education, income. The various reports are purchased by networks,
stations, advertisers, and any other companies with a need to know
audience statistics. The cost is lower for small entities, such
as TV stations, than for larger entities, such as commercial networks.
The latter usually pay several million dollars a year to receive
a ratings service.
While
current ratings methods may be the best yet devised for calculating
audience size and characteristics, audience measurement is far from
perfect. Many of the flaws of ratings should be recognized, particularly
by those employed in the industry who make significant decisions
based on ratings.
Sample
size is one aspect of ratings that is frequently questioned in relation
to rating accuracy. Statisticians know that the smaller the sample
size the more chance there is for error. Ratings companies admit
to this and do not claim that their figures are totally accurate.
Most of them are only accurate within two or three percent. This
was of little concern during the times when ratings primarily centered
around three networks, each of which was likely to have a rating
of 20 or better. Even if CBS's 20 rating at 8:00 P.M. on Monday
was really only 18, this was not likely to disturb the network balance.
In all likelihood CBS's 20 rating at 8:00 Tuesday evening was really
a 22, so numbers evened out. Now that there are many sources of
programming, however, and ratings for each are much lower, statistical
inaccuracies are more significant. A cable network with a 2 rating
might actually be a 4, an increase that might double its income.
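The effect of sample size on small ratings can be illustrated with the textbook margin of error for a proportion. This is a sketch under assumptions that do not come from the ratings companies: it treats the sample as a simple random sample, uses a 95 percent confidence level (z = 1.96), and picks a hypothetical sample of 1,000 households.

```python
import math


def margin_of_error(rating_pct: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate margin of error, in rating points, for a simple random sample."""
    p = rating_pct / 100.0
    return 100.0 * z * math.sqrt(p * (1.0 - p) / sample_size)


SAMPLE_SIZE = 1_000  # hypothetical

for rating_pct in (20.0, 2.0):
    margin = margin_of_error(rating_pct, SAMPLE_SIZE)
    relative = 100.0 * margin / rating_pct
    print(f"rating {rating_pct:>4}: +/- {margin:.2f} points "
          f"({relative:.0f}% of the rating)")
```

Under these assumptions the margin for a 20 rating is about 2.5 points, roughly an eighth of the rating, while the margin for a 2 rating is about 0.9 points, more than 40 percent of the rating. The absolute error shrinks for small ratings, but the error relative to the rating, which is what affects income, grows.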
Audience
measurement companies are willing to increase sample size, but doing
so would greatly increase their costs, and customers for ratings
do not seem willing to pay. In fact, Arbitron, which had previously
undertaken TV ratings, dropped them in 1994 because they were unprofitable.
As
access to interactive communication increases, it may be easier
to obtain larger samples. Wires from consumer homes back to cable
systems could be used to send information about what each cable
TV household is viewing. Many of these wires are already in place.
Consumers wishing to order pay-per-view programming, for example,
can push a button on the remote control that tells the cable system
to unscramble the channel for that particular household. Using this
technology to determine what is showing on the TV set at all times,
however, smacks of a "Big Brother" type of surveillance. Similarly,
by the 1970s a technology existed that enabled trucks to drive along
streets and record what was showing on each TV set in the neighborhood.
This practice, perceived as an invasion of privacy, was quickly
ended.
Sample
composition, as well as sample size, is also seen as a weakness
in ratings procedures. When telephone numbers are used to draw a
sample, households without telephones are excluded and households
with more than one phone have a better chance of being included.
For many of the rating samples, people who do not speak either English
or Spanish are eliminated. Perhaps one of the greatest difficulties
for ratings companies is caused by those who eliminate themselves
from the sample by refusing to cooperate. Although rating services
make every attempt to replace these people with others who are similar
in demographic characteristics, the sample's integrity is somewhat
downgraded. Even if everyone originally selected agreed to serve,
the sample cannot be totally representative of a larger population.
No two people are alike, and even households with the same income
and education level and the same number of children of the same
ages do not watch exactly the same television. Moreover, people
within the sample, aware that their viewing or listening habits
are being monitored, may act differently than they ordinarily do.
Other
problems arise from the fact that each rating technique has specific
drawbacks. Households with Peoplemeters may suffer from "button
pushing fatigue," thereby artificially lowering ratings. Additionally,
some groups of people are simply more likely to push buttons than
others. When the Peoplemeter was first introduced, sports viewing
soared and children's program viewing decreased significantly. One
explanation held that men, who were watching sports intently, were
very reliable about the button pushing, perhaps, in some cases,
out of fear that the TV would shut off if they didn't push that
button. Children, on the other hand, were confused or apathetic
about the button, therefore underreporting the viewing of children's
programming. Another theory held that the women of the household
had previously kept the diaries and, though not always aware of what
their husbands were actually viewing, were much more conscious of
what their children were watching. Under the diary system, in this
explanation, sports programming was underrated.
But
diaries have their own problems. The return rate is low, intensifying
the problem of the number of uncooperative people in the sample.
Even the diaries that are returned often have missing data. Many
people do not fill out the diaries as they watch TV. They wait until
the last minute and try to remember details--perhaps aided by a
copy of TV Guide. Some people are simply not honest about
what they watch. Perhaps they do not want to admit to watching a
particular type of television or a particular program.
With
interviews, people can be influenced by the tone or attitude of
the interviewer or, again, they can be less than truthful about
what they watched out of embarrassment or in an attempt to project
themselves in a favorable light. People are also hesitant to give
information over the phone because they fear the person calling
is really a sales person.
Beyond sampling and methodological problems, ratings can be subject
to technical problems--computers that go down, meters that function
improperly, cable TV systems that shift the channel numbers of their
program services without notice, station antennas struck by lightning.
Additionally,
rating methodologies are often complicated and challenged by technological
and sociological change. Videocassette recorders, for example, have
presented difficulties for the ratings companies. Generally, programs
are counted as being watched if they are recorded. However, many
programs that are recorded are never watched, and some are watched
several times. In addition, people replaying tape often zip through
commercials, destroying the whole purpose of ratings. And ratings
companies have yet to decide what to do with sets that show four
pictures at once.
Another major deterrent to the accuracy of ratings is the fact
that electronic media programmers often try to manipulate the ratings
system. Local television stations program their most sensational
material during ratings periods. Networks preempt regular series
and present star-loaded specials so that their affiliates will fare
well in ratings and can therefore adjust their advertising rates
upward. Cable networks show new programs as opposed to reruns. All
of this, of course, negates the real purpose of determining which
electronic media entities have the largest regular audience. It
simply indicates which can design the best programming strategy
for Sweeps week.
Because
of the possibility for all these sampling, methodological, technological,
and sociological errors, ratings have been subjected to numerous
tests and investigations. In fact, in 1963, the House of Representatives
became so skeptical of ratings methodologies that it held hearings
to investigate the procedures. Most of the skepticism had arisen
because of a cease and desist order from the Federal Trade Commission
(FTC) requiring several audience measurement companies to stop misrepresenting
the accuracy and reliability of their reports. The FTC charged the
rating companies with relying on hearsay information, making false
claims about the nature of their sample populations, improperly
combining and reporting data, failing to account for non-responding
sample members, and making arbitrary changes in the rating figures.
The main result of the hearings was that broadcasters themselves
established the Electronic Media Rating Council (EMRC) to accredit
rating companies. This group periodically checks rating companies
to make sure their sample design and implementation meet preset
standards that electronic media practitioners have agreed upon,
to determine whether or not interviewers are properly trained, to
oversee the procedures for handling diaries, and in other ways to
ensure that the ratings companies are compiling their reports as accurately
as possible. All the major rating companies have EMRC accreditation.
The
EMRC and other research institutions have continued various studies
to determine the accuracy of ratings. Some of the findings are:
People who cooperate with rating services watch more TV, have larger
families, and are younger and better educated than those who will
not cooperate; telephone interviewing gets a 13% higher cooperation
rate than diaries; Hispanics included in the ratings samples watch
less TV and have smaller families than Hispanics in general.
Both
electronic media practitioners and audience measurement companies
want their ratings to be accurate, so both groups undertake testing
to the extent they can afford it. In 1989, for example, broadcasters
initiated a study to conduct a thorough review of the Peoplemeter.
The result was a list of recommendations to Nielsen that included
changing the amount of time people participate from two years to
one to eliminate button pushing fatigue, metering all sets including
those on boats and in vacation homes, and simplifying the procedures
by which visitors log into the meter.
Still,
the weakest link in the system, at present, seems to be how the
ratings are used. Networks tout rating superiorities that show .1
percent differences, differences that certainly are not statistically
significant. Programs are canceled because their ratings fall one
point. Sweeps weeks tend to become more and more sensationalized.
At stake, of course, are advertising fees that can translate into
millions of dollars. Advertisers and their agencies need to remain
vigilant so that they are not paying rates based on artificially
stimulated ratings that bear no resemblance to the programs in which
the sponsor is actually investing. At this time all parties in the
system seem invested in some form of audience measurement. So long
as the failures and inadequacies of these systems are accepted by
these major participants, the numbers will remain a valid type of
"currency" in the system of television.
-Lynne Schafer Gross