by Richard Harker

Make no mistake, Arbitron's Personal People Meter (PPM) will dramatically improve the accuracy and timeliness of audience ratings. For media buyers, PPM will provide a higher level of confidence in the numbers. It will also ultimately provide Program Directors with a wealth of new information. Unfortunately, in its zeal to convince broadcasters of the benefits of PPM, Arbitron may have over-sold PPM's impact on programming and marketing decisions. As we get closer to PPM's roll-outs in Los Angeles, Chicago, and beyond, it is a good time to look beyond the hype and understand what PPM can (and cannot) tell programmers about their listeners.

Over the years Arbitron has talked about many different advantages of PPM. They've talked about the advantages of passive measurement over a written diary. They've talked about the advantages of monthly ratings over quarterly ratings. They've talked about the advantages of using an ongoing panel of participants rather than a constantly changing set of diary keepers. However, the one advantage Arbitron has talked about most is PPM's benefit to programmers. Arbitron has aggressively marketed PPM as a device that can tell us precisely when a person starts listening and when he or she stops. To reinforce this selling point, Arbitron has joined with outside companies to publish analyses of PPM's minute-by-minute data.

This has led to the release of a number of PPM studies that conveyed the impression that with PPM programmers would have unprecedented insights into listener behavior. One analysis purported to show that listeners continue to listen through spot sets of four, five, and six minutes. Another study claimed to show that with PPM one could analyze tune-in and tune-out patterns for specific songs. A third analysis claimed that PPM proved that more people listen to Rush Limbaugh’s commercials than Limbaugh himself.

In the aggregate, these studies convey the impression to programmers that with PPM they will be able to see minute-by-minute audience reactions and learn which songs, spots, and jocks people like and dislike. Programmers have historically tried to extract every possible bit of information from the numbers. Quarterly ratings weren't good enough for programmers, so we started extrapolating monthly ratings. Months were too long, so we tried to look at weeks, then days, and then individual hours. It shouldn't be surprising that programmers salivate over the chance to look at minute-by-minute ratings. Yes, PPM can produce minute-by-minute audience levels, but despite the hype, those numbers are not accurate enough to support any meaningful conclusions.

As long as radio has had ratings, programmers have tried to read far more into the ratings tea leaves than the method was ever capable of supporting. Arbitron has been more than willing to tacitly participate in this ever finer slicing and dicing of ratings because programmers want it. The problem is that as we look at smaller and smaller slices of time, the accuracy of the ratings declines until the numbers are meaningless. The bottom line: just because we can pull minute-by-minute data from the software doesn't mean the data are either useful or accurate.

The focus of PPM discussions has been on its new features and capabilities. As we read about the changes and innovations PPM brings to radio ratings, we should not lose sight of the fact that even with PPM, a great many limitations and challenges of audience measurement remain. Both diary-based and PPM ratings are estimates with built-in inaccuracies – inaccuracies that increase as we slice the numbers thinner and thinner. Both PPM and the diary method are based on sampling: drawing what we hope is a representative sample of listeners from the population, and then determining the listening patterns of those few people. While the way PPM draws a sample is different, the fact remains that we are still surveying a very small percentage of the population. Arbitron makes every effort to sample a cross section of listeners that mirrors the market, but there is no guarantee it will succeed. That is as true with PPM ratings as it is with diary-based ratings.

Every Arbitron report includes the following admission: “Due to (the) limitations inherent in Arbitron’s methodology, the accuracy of Arbitron audience estimates cannot be determined to any precise mathematical value or definition.” Yet MRC accreditation requires that “Each rating report shall contain standard error data.” We still don’t have error data for PPM, though Arbitron has promised to provide it at some future date. Regardless of what Arbitron ultimately produces, we know that PPM ratings will be estimates, just as diary ratings are estimates. There will be a degree of uncertainty in every rating number, whether it is share, cume, or TSL. That means a station with a 5 share in the book may actually be a 3 share or a 7 share. There’s no way of knowing for sure, and that does not change with PPM.

The important point to remember is that uncertainty goes up as the size of the slice of listening goes down. Monthlies are less accurate than quarterly reports. An hourly is less accurate than a daypart report. Ratings for a narrow demographic are less accurate than ratings for a broader one. However we slice the ratings, the thinner the slice, the greater the uncertainty.
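Arbitron has not published standard errors for PPM, but the underlying statistics are not in dispute. Here is a minimal Python sketch of the effect, assuming simple random sampling (a best case; real panel designs add further error) and effective sample sizes invented purely for illustration:

```python
import math

def share_margin_of_error(share_pct, n, z=1.96):
    """Approximate 95% margin of error (in share points) for a share
    estimate, treating the panel as a simple random sample of size n."""
    p = share_pct / 100.0
    se = math.sqrt(p * (1.0 - p) / n)
    return z * se * 100.0

# Effective sample sizes below are hypothetical, for illustration only.
slices = [
    ("full book",   2000),
    ("one daypart",  400),
    ("one hour",      80),
    ("one minute",    20),
]
for label, n in slices:
    moe = share_margin_of_error(5.0, n)
    print(f"{label:>11}: 5.0 share, margin of error +/- {moe:.1f} points")
```

The margin of error grows as the square root of the sample shrinks: with an effective sample of a few hundred, the interval around a 5 share already spans roughly 3 to 7, which is exactly the uncertainty described above, and at the minute level the interval swamps the estimate entirely.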

In this regard, PPM is no different from the diary method. But PPM adds an additional technological twist to the uncertainty, one to which the industry has paid little attention. Unlike the diary method, where listeners record their listening in a paper diary, PPM is a passive electronic measurement. We always knew that most listeners did not carry their diaries with them and record stations as they listened; Arbitron’s early competitors conducted studies showing that the majority of diary keepers filled out their diaries hours or even days afterward. In that regard, PPM is potentially much more accurate than the diary method. A code is added to each radio station’s signal, and when an Arbitron panelist carrying a PPM comes in contact with an encoded signal, the PPM records the time and the station. While it sounds very high-tech, PPM is not fool-proof.

First, there is the issue of signal detection. The meter has to “hear” the station to record the listening. Arbitron writes, “If the program source is audible, tests have shown that the PPM reliably detects the code. If the program source is not audible – either due to distance, low volume, or excessive background noise – the PPM is designed to not credit such instances as media exposure.”

Arbitron has not shared with the industry the level at which the PPM can hear a station. However, in independent tests in Europe, PPM’s detectability proved to be an issue. A PPM tucked away in a purse or inside a coat or jacket may not detect a station that is clearly audible to the wearer. And if the sound level is close to the threshold of detectability, the meter may register listening as starting and stopping, as if the wearer were switching stations. This is where Arbitron’s editing rules kick in.

Program Directors who have examined diaries at Arbitron’s offices have seen editing rules at work. Diary keepers make many mistakes while writing down their listening, and it is the job of editors to try to correct these errors. A diary keeper might misidentify a station, record the frequency of one station and the call letters of another, or note listening without identifying the station at all. Editors follow specific rules and, whenever possible, assign the quarter-hours to the station the listener probably meant. One might assume that with electronic measurement there would be no need for editing. That is not the case. With PPM, editing has become more complicated, not less.

The PPM editing rules are quite lengthy, but every Program Director should take the time to read them very carefully. Quoting Arbitron:

Because the PPM works by detecting embedded audio codes in media programs, there is always a slight lag time between when the media exposure actually began and when the meter detected and recorded the code. The minimum lag is five seconds, as this is the time required to transmit or read a code, but the actual lag time varies depending on a number of factors, such as volume level of the program, program content and the presence of background noises that might interfere with code detection (e.g., a fire siren, vacuum cleaner or barking dog). To account for this code-detection lag time, as well as possible interruptions in code detection during continuous media exposure events, lead-in edits are applied whenever a media code is preceded by a blank time segment… For radio, the maximum lead-in edit is 60 seconds.

This means that when Arbitron’s computers run into blank detections, they can credit an identified station with up to an additional minute of listening. Consider a listener channel surfing in a noisy environment such as a car with the windows down. With multiple marginally detectable signals, this editing rule could conceivably both scramble the proportion of listening credited to each station and time-shift the apparent listening by a considerable amount. Another editing rule can impact reported listening even more. The rule states:

A code that is not an exact match to an encoding outlet is only considered usable if it can be attributed to a media outlet. A nonmatching code is considered usable if it is detected within 15 minutes of a code that it matches on two out of three characters and on media type, or it is within five minutes of a code that it matches on media type.

So an unidentifiable code may be credited to a station if the PPM wearer was listening to an identifiably coded station within 5 or 15 minutes, depending on the type of code the PPM recorded. Both rules are intended to err on the side of crediting an encoded station with listening the PPM might have picked up.
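Arbitron has not published the code that implements these rules, so the following Python sketch is a hypothetical reconstruction: the time constants come directly from the quoted rules, while the data structures, function names, and the one-second detection simplification are invented for illustration.

```python
from dataclasses import dataclass

# Constants taken from the quoted rules; everything else here is
# an illustrative reconstruction, not Arbitron's actual software.
RADIO_MAX_LEAD_IN_S = 60          # maximum lead-in edit for radio
PARTIAL_MATCH_WINDOW_S = 15 * 60  # 2-of-3 characters plus media type
TYPE_MATCH_WINDOW_S = 5 * 60      # media-type match only

@dataclass
class Detection:
    t: int       # seconds since midnight
    code: str    # hypothetical 3-character station code, e.g. "KH1"
    media: str   # e.g. "radio"

def apply_lead_in_edit(detections):
    """Back-date each detection into the blank segment preceding it,
    by up to 60 seconds, returning (start, end, code) credit spans.
    Simplifications: one-second detections; the first detection of
    the day receives no lead-in."""
    spans = []
    prev_end = None
    for d in sorted(detections, key=lambda d: d.t):
        gap = d.t - prev_end if prev_end is not None else 0
        start = d.t - min(max(gap, 0), RADIO_MAX_LEAD_IN_S)
        spans.append((start, d.t, d.code))
        prev_end = d.t
    return spans

def attribute_nonmatching(code, t, media, matched):
    """Decide whether a code with no exact station match is usable,
    per the quoted rule, and if so which station is credited."""
    for m in matched:
        if m.media != media:
            continue
        same_chars = sum(a == b for a, b in zip(code, m.code))
        if same_chars >= 2 and abs(t - m.t) <= PARTIAL_MATCH_WINDOW_S:
            return m.code
        if abs(t - m.t) <= TYPE_MATCH_WINDOW_S:
            return m.code
    return None  # discarded as unusable
```

The point of the sketch is not fidelity to Arbitron’s software but visibility: every back-dated span from the lead-in edit includes listening no meter ever detected, and every successfully attributed non-matching code credits a station the wearer may never have heard.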

Arbitron argues that these editing rules are rarely invoked, stating that the vast majority of PPM data are accurate and credited correctly without any need for editing. While that may be true, we have no way of knowing. When Arbitron edited a diary, we could see the edits right in the diary and estimate the impact of any errors on a station’s ratings. With PPM, the editing is handled invisibly by computer, so we never know to what degree it affected the numbers. In a market like Houston, with a couple of thousand PPMs in use across a wide range of noisy environments, it is reasonable to assume that some editing is going on. Even so, editing will probably have only a minor impact at the monthly report level.

But the claim is that we can look at a single minute of a single day and use PPM to see how listeners felt about what we were doing. In Houston, a reasonably popular station might have no more than five or so PPM wearers hearing it at any given time. A single time-shifted occasion, one mis-credited station, or the drop-out of a single PPM could create the illusion of a significant listening shift when the cause is nothing more than a technical artifact.
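The arithmetic behind that concern is easy to demonstrate. In this minimal simulation, the panel size and listening probability are invented; only the “about five wearers” scale comes from the paragraph above, and the station’s true audience is held perfectly constant:

```python
import random

random.seed(1)  # fixed seed so the illustration is repeatable

PANEL = 2000       # meters in-tab; roughly Houston-scale, hypothetical
P_LISTEN = 0.0025  # per-minute chance a panelist hears the station,
                   # calibrated so about five wearers listen at once

for minute in range(8):
    wearers = sum(random.random() < P_LISTEN for _ in range(PANEL))
    rating = 100.0 * wearers / PANEL
    print(f"minute {minute}: {wearers} wearers -> {rating:.2f} rating")
```

Nothing about the station’s true audience changes from minute to minute in this simulation, yet the printed counts routinely swing by 40 percent or more on pure sampling noise. Add the editing artifacts described above and a single dropped or mis-credited meter, and a perfectly flat audience can look like a dramatic tune-out.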

As PPM rolls out in additional markets and more Program Directors start dissecting their minute-by-minute ratings, there is a real potential for PDs to overreact, making changes based on excessive confidence in the reliability of the numbers. We’ve all developed a healthy skepticism about ratings and an innate understanding that thin slices are less reliable than thick ones. When you make the transition to PPM, make sure you don’t retire that skepticism.

Richard Harker is President of Harker Research, a company providing a wide range of research services to radio stations in North America and Europe. Twenty years of research experience, combined with Richard’s 15 years as a programmer and general manager, help Harker Research provide practical, actionable solutions to ratings problems. Visit www.harkerresearch.com or contact Richard at (919) 954-8300.