Radio is in the business of “picking the hits,” but of course this concept is far more complex than the average person might think. In his latest Programming To Win column, Richard Harker looks back at the history of how programmers chose their playlist, while also analyzing how PPM has affected this practice in the present.
By Richard Harker
We know that for broadcast radio to survive in this digital world we have to offer unique compelling content.
For a music station it begins with playing the right songs, the songs that your listeners want to hear.
“Everybody knows that,” you’re thinking. It’s obvious.
But if it is so obvious, why are so many stations pretending they don’t have to play the right songs? Why do some stations play the wrong songs or the right songs at the wrong time?
Music testing dates back to the 1970s. Before then, Program and Music Directors simply guessed at which songs to play.
They might use sales figures, they might tally requests, but for the most part a song ended up on air because somebody at the station liked it.
Then a few of us decided that we ought to let listeners pick the songs we play.
Stations started doing Call-Out, and we found that what we thought were hits were often stiffs that listeners didn’t want to hear.
Over the years we discovered something else. People grow tired of songs.
Of course we already sensed that. What we hadn’t realized was that some songs can last many weeks without fatiguing listeners while others burn out very quickly.
In the coming years we learned a great deal about the life-cycles of songs.
We realized that we couldn’t tell the difference between hits and stiffs without testing. We found out that even hits burn out, and there is an optimum time to get on a song, an optimum duration to play it, and an optimum time to get off it.
And most importantly, we learned that stations that test their music beat stations that don’t.
Music testing became essential, and the more competitive the battle, the more essential it becomes.
Fast forward four decades. We’re entering a post-music-testing period where some Program Directors are being told that there’s no money for music testing.
“Wing it. Just play what other stations are playing,” the PD is told.
We find that even good experienced Program Directors using their best judgment manage to get only 30-50% of the playlist right.
In other words, without research, up to 70% of the songs on a station are either stiffs, too unfamiliar, or burned out. You might as well throw darts at a Billboard chart pinned to the wall.
Not the kind of playlist to keep listeners happy.
Good Program Directors understand the need to test music even when there’s little money in the budget for it, so some are experimenting with using PPM data to pick the hits.
Stations already pay big bucks to Nielsen for PPM ratings, so if you can get a little more mileage out of PPM by using it to test music, why not?
It’s better than nothing, isn’t it? Maybe not.
Here’s the theory: Using minute-by-minute PPM data synced with the station’s music log, you can watch how your meter count changes as different songs come on.
The belief is that if the number of meters goes up, you’ve got a hit on your hands.
If the number of meters goes down, then you’ve got a stiff.
And if you see a pattern repeated over time, like a decline in meters almost every time you play a song, you can be certain the trend is real.
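To make the idea concrete, here’s a minimal sketch of that analysis in Python. It assumes you could export the minute-by-minute meter counts and the music log as simple minute-indexed tables; every name and number in it is hypothetical.

```python
# A minimal sketch of the "PPM as music test" idea: line up minute-by-minute
# meter counts with the music log and see how the count moves while each song
# plays. All data, field names, and song titles here are hypothetical.

from collections import defaultdict

# Hypothetical minute-by-minute meter counts (minute index -> meters detected)
meter_counts = {0: 11, 1: 11, 2: 12, 3: 10, 4: 10, 5: 11, 6: 9, 7: 9, 8: 10}

# Hypothetical music log: (start_minute, end_minute_exclusive, title)
music_log = [
    (0, 3, "Song A"),
    (3, 6, "Song B"),
    (6, 9, "Song C"),
]

# For each spin, record how the meter count changed from the minute the song
# started to the last minute it played.
deltas = defaultdict(list)
for start, end, title in music_log:
    if start in meter_counts and (end - 1) in meter_counts:
        deltas[title].append(meter_counts[end - 1] - meter_counts[start])

# Average change per song across all spins -- the number the theory says
# separates "hits" (positive) from "stiffs" (negative).
for title, changes in deltas.items():
    avg = sum(changes) / len(changes)
    print(f"{title}: average meter change {avg:+.1f} over {len(changes)} spin(s)")
```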
Sounds pretty powerful, doesn’t it? After all, these are actual PPM panelists essentially participating in a real-time music test.
What could go wrong? As it turns out, a lot.
PPM does not measure listenership. It measures exposure. There’s a big difference.
A panelist need only be exposed to a station for the meter to register the station. The panelist may be listening by choice, but there’s an equal or greater chance that the panelist doesn’t like the station, or isn’t even aware that the station is playing in the background.
Compare that to diary keepers.
Diary keepers write down the stations they listen to. They rarely write down stations they dislike or stations they merely hear playing in the background.
That’s why diary keepers write down just the two or three radio stations they listen to regularly. Our research confirms that people have a P1 station, a P2 station, and sometimes a P3 station.
That’s it. Three stations.
Yet PPM data suggest that people are exposed to five to seven stations.
If people like only two or three stations and PPM typically detects five or more, then the majority of stations recorded by PPM are probably not the panelist’s favorites.
Relying on PPM data is like testing your music with people who don’t like your format. Does that make sense?
But that’s just one of several dangers of using PPM meter flow to choose songs.
Even if the meter captured only a panelist’s favorite stations, there’s still the problem of the small number of meters Nielsen places.
Music testing is most accurate when we use highly targeted narrow demo cells. The more focused the screening criteria, the more accurate the results.
You probably have no more than a dozen meters in use in the demographic you would choose for a music test. And the minute-by-minute data typically show a change of just a couple of meters.
Relying on a dozen meters makes no more sense than a music test with just a dozen people. Would you trust a music test with only a dozen listeners?
The workaround is to watch meter counts over time, in the belief that reliability increases as the number of responses grows.
That might be true if PPM panelists were rotated out every week like diary participants. Unfortunately, today’s panelists can continue to participate for years.
You have the same panelists showing up in your minute-by-minute data week after week. It may look like your sample size is increasing, but it isn’t.
Think of it this way: Let’s say you have five people rate your music. Then, the following week, you have the same five people rate it again.
Have you really doubled the sample?
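Here’s a rough simulation of that point, under the simplifying assumption that each listener has a fixed true opinion of a song and every weekly measurement adds a little noise. Reusing one small panel for a year averages away the noise, but the estimate converges to those five people, not to the market.

```python
# A rough simulation of the "same five panelists" point. Assumption: each
# listener in the market has a fixed true opinion of a song, and every weekly
# measurement is that opinion plus a little noise.

import random
random.seed(1)

market = [random.gauss(3.0, 1.0) for _ in range(10_000)]   # true opinions on a 1-to-5-ish scale
market_mean = sum(market) / len(market)

panel = random.sample(market, 5)                           # the same five people every week
panel_mean = sum(panel) / len(panel)
weeks = 52

# 52 weekly "tests" of the same five panelists, each reading with measurement noise
repeated = [opinion + random.gauss(0, 0.3)
            for _ in range(weeks) for opinion in panel]
same_panel_estimate = sum(repeated) / len(repeated)

# 52 weekly tests drawing five fresh panelists each week
fresh = [opinion + random.gauss(0, 0.3)
         for _ in range(weeks) for opinion in random.sample(market, 5)]
fresh_panel_estimate = sum(fresh) / len(fresh)

print(f"market's true average opinion:        {market_mean:.2f}")
print(f"the five panelists' true average:     {panel_mean:.2f}")
print(f"same five, 52 weeks of measurements:  {same_panel_estimate:.2f}  (converges to the five)")
print(f"fresh five each week, 52 weeks:       {fresh_panel_estimate:.2f}  (converges to the market)")
```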
If these problems don’t bother you, consider the technological limitations of PPM.
The 1980s technology behind the meter relies on an analog system to separate the encoded signal from the broadcast audio, ambient noise, and interference.
Research has shown that a PPM meter misses at least 30% of the listening that it ought to recognize.
So there’s drop-out.
Because of the drop-out, Nielsen computers use a series of editing rules to fill in the gaps. For example, there are times when you can get up to three minutes of credit for a period when the meter can’t identify the station.
This three-minute window, along with other editing rules, fills in the gaps created when the meters get confused.
The editing means the minute-by-minute data is approximate. After editing, it is only sort of minute by minute.
For the purpose PPM was designed for, estimating audience sizes over long time spans, that may not be a big deal. Most likely the edits average out over time.
However, the PPM technology was never designed to accurately measure minute-by-minute flows of listeners.
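To see why edited data can only be approximately minute by minute, here is a toy gap-fill rule written in Python. It is not Nielsen’s actual editing logic, just an illustration of the general idea: when the meter loses the code, credit is carried forward for up to three minutes, so the exact minute a panelist left is no longer visible.

```python
# A toy gap-fill rule, NOT Nielsen's actual editing logic: if the meter stops
# decoding a station, keep crediting that station for up to three more minutes.
# The point is only that edited data can no longer tell you the exact minute a
# panelist left.

GAP_FILL_MINUTES = 3

def apply_gap_fill(raw):
    """raw: list of station codes per minute, None where the meter decoded nothing."""
    edited = []
    last_station, gap = None, 0
    for minute in raw:
        if minute is not None:
            edited.append(minute)
            last_station, gap = minute, 0
        elif last_station is not None and gap < GAP_FILL_MINUTES:
            edited.append(last_station)   # credit carried forward
            gap += 1
        else:
            edited.append(None)
    return edited

# Panelist truly leaves "WXYZ" after the fourth minute; radio is off afterward.
raw = ["WXYZ", "WXYZ", "WXYZ", "WXYZ", None, None, None, None]
print(apply_gap_fill(raw))
# ['WXYZ', 'WXYZ', 'WXYZ', 'WXYZ', 'WXYZ', 'WXYZ', 'WXYZ', None]
# The edited record credits the station three minutes longer than the panelist stayed.
```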
Another clue that we ought to be suspicious of minute-by-minute data is that repeated Arbitron analyses have shown that when panelists leave one station, they rarely tune to another.
Panelists simply disappear.
If the majority of panelists left one station to go to another, one might reasonably conclude that something drove the listeners away.
However, since the majority of panelists do not reappear on another station, it is hard to argue that they were driven away by a bad song.
A more likely explanation is that the meter dropped the signal or that the panelist simply turned off the radio.
The disappearance of a panelist says nothing about the song that was playing.
So it is speculation of the highest order to argue that a drop in meters means listeners don’t like the song you’re playing, or that an increase in meters means they do.
And if these reasons weren’t enough to question the value of using PPM to test your music, there’s the problem of bias.
There’s something called Confirmation Bias. It is the tendency to search for, interpret, focus on, and recall information in a way that confirms what you already believe.
Every Program and Music Director with the job of picking the hits falls prey to confirmation bias. We see what we want to see.
Marshall McLuhan said, “I wouldn’t have seen it if I hadn’t believed it.”
There is enough randomness in the meter count changes to see anything you want to see. Think a song is a stiff? You’ll see it in the meters. Really believe in a song? You’ll see enough in the meters to justify staying on the song.
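Here’s a small sketch of how much room for interpretation a handful of meters leaves, assuming meter changes during a spin are pure noise that has nothing to do with the music. Even then, a fair number of songs will typically look like hits or stiffs.

```python
# An illustration of how easily random meter movement can "confirm" a hunch.
# Assume the meter change during each spin is pure noise between -2 and +2
# meters, completely unrelated to the song being played.

import random
random.seed(7)

SONGS, SPINS = 30, 20
looks_like_hit = looks_like_stiff = 0

for _ in range(SONGS):
    changes = [random.randint(-2, 2) for _ in range(SPINS)]
    avg = sum(changes) / SPINS
    if avg >= 0.3:
        looks_like_hit += 1
    elif avg <= -0.3:
        looks_like_stiff += 1

print(f"Out of {SONGS} songs with purely random meter movement:")
print(f"  {looks_like_hit} look like 'hits' and {looks_like_stiff} look like 'stiffs'")
```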
Confirmation bias also influences music decisions even with Call-Out and other methodologically sound ways to test music, but reliable research leaves less room for interpretation and subjective calls.
The bottom line is that growing and retaining audience requires that you play the songs that listeners want to hear.
We’ve got forty years of experience that proves that the best way to be sure you’re playing the right songs is through methodologically sound research testing your music with your target listeners.
That’s always been the case, but now with services like Pandora and Spotify in addition to your competitor across the street, accurate music research is more important than ever.
There are no shortcuts.
Richard Harker is President of Harker Research, a company providing a wide range of research services to radio stations in North America and Europe. Twenty years of research experience combined with Richard’s 15 years as a programmer and general manager helps Harker Research provide practical, actionable solutions to ratings problems. Visit www.harkerresearch or contact Richard at (919) 954-8300.