Sunday, December 12, 2021

Random Duplication gives Random Results

“Random Duplication” is a term heard very often in media agency corridors, and even more so these days in online Zoom / Teams meetings. The better-read also sprinkle conversations about ‘random duplication’ with “stochastic”, the more popular “Sainsbury Formula”, and “Normally Distributed”.

What’s the Context?

In media agencies, these words come up when discussing “multi-media planning”. Say there is a TV campaign that reaches 60% of an audience and a digital campaign that reaches 40% of the same audience. How does one estimate the net reach of the TV+Digital campaign for that audience? It is in this context that all the above terms are oft heard.
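To make the question concrete, here is a minimal sketch of what the random duplication assumption (the Sainsbury Formula) computes for the 60% and 40% example above. It treats exposure to each medium as statistically independent, which is exactly the assumption this article questions. The function name `random_duplication_reach` is illustrative and not taken from any planning tool.

```python
from math import prod


def random_duplication_reach(reaches):
    """Net reach under random duplication, with each reach given as a fraction (0 to 1)."""
    if not all(0.0 <= r <= 1.0 for r in reaches):
        raise ValueError("each reach must be a fraction between 0 and 1")
    # Share of the audience missed by every medium, assuming independent exposure
    not_reached = prod(1.0 - r for r in reaches)
    return 1.0 - not_reached


tv, digital = 0.60, 0.40
net = random_duplication_reach([tv, digital])
print(f"Net reach under random duplication: {net:.0%}")  # prints 76%
```

The formula only ever sees the two reach numbers. It knows nothing about who was actually reached by both media, so it returns 76% whatever the real overlap was.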

Of course! Multi-media reach is one of the most important elements in this increasingly multi-media universe that we are getting deeper and deeper into.

I don’t claim to be a pure-blooded statistician, nor am I going to try to explain these statistical terms to you, but this article will help put the issue in perspective from a media planning point of view. Later, with the help of more statistics-inclined colleagues, I will come back with a more technical answer on the issues with, and alternatives to, random duplication 😃.

Why is multi-media reach a challenge? 

Those of you who are in the media domain are aware of the limitations of the measurement systems available.
  • Syndicated research and reports such as IRS, TGI, GWI, i-Cube and some others provide survey-based estimates for all media, but are often challenged for their lack of vehicle-level granularity/accuracy and for dated reporting in a fast-moving digital age. These systems, however, only provide “Max-Reach” estimates for various media/platforms.
  • For campaign measurement of non-digital media, no system other than BARC (the TV viewership measurement system) provides any kind of campaign measurement capability. The IRS does offer campaign planning in the IRS software, but no one in the industry uses it.
  • Then there is a plethora of digital platforms that have their own dashboards/server reports to map the Max-Reach of the platform, or MAUs, which is the more widely used term in digital conversations. These platforms provide estimates of MAUs as well as of campaign reach. There are limitations: some platforms share only impressions and not unique impressions; some don’t provide frequency-based estimates of impressions; the reporting parameters differ across platforms, across formats within a platform, and so on.
  • Then there are the panel-based syndicated digital measurement sources such as Comscore, Similarweb, etc., and many other app-download and assorted measurement platforms. While recency of data is not an issue on these platforms, the numbers are so different across platforms that there is much to be done yet in decoding them.
Of course, there are those who just don’t understand the concept of scientific survey research and discard the syndicated reports because of low sample sizes (but continue to spend millions on their hunches and beliefs). I find these reports extremely powerful: a good reference, best used layered with assumptions about changes in the ever-changing real world.

And don’t be so naïve as to even try to corroborate the digital universe estimates across any of the digital platforms. Even the claims/estimates of the most reputed platforms continue to confound me. I have often seen claims of more females in a certain geo-demographic than are estimated to be present in that geography. And this applies across different demographics, not only females. I wonder where the world is hiding these people, as the Census could not find them either 😄.

Now again, there are those who would say that the estimates I have are wrong and that the Census is too old, as they don’t understand the concept of statistical forecasting. I am happy to have a discussion on the universe estimation and forecasting in use for the syndicated databases in the industry.

So, what could be the problems in measurement?

When one executes a multimedia campaign, say across TV, YouTube TrueView, Facebook Video, and Disney+ Hotstar pre-roll & in-stream options, one has to look at multiple sources of data, each with its own idiosyncrasies.

While TV is a broadcast medium with certain applicable rules of reach build-up and OTS, the patterns of reach and frequency build-up seen on these digital platforms are unique to each:

  • each having their own universe estimate
  • own definition of an impression
  • own definition of a view
  • in fact, even the targeting parameters will be different
  • different levels of reporting by period
  • different parameters reported
  • and so on

Don’t expect an easy answer

So, when Clients ask the question – “what is the net reach of the multi-media campaign?” do you think the answer would be simple?

Let me clarify that I am not advising that reach is the right measure for every campaign, or that in every campaign every medium/platform should have reach as a primary metric. There are various reasons for including a medium or platform in a campaign, and the metric for measuring that medium/platform should be based on its role in the campaign.

However, if it is decided that reach is the campaign measure, then all the complexities stated above need to be managed by the multi-media reach estimation methodology. And it is not going to be an easy answer. I haven’t even added the problem of reach @3+ yet 😝, nor performance media, which is a different ball game altogether.

Let’s also discuss the elephant in the room: NCCS. Most client briefs even today have NCCS as an important audience descriptor, but most digital platforms have no direct design to deliver NCCS-based audiences. I don’t even want to discuss what gets delivered using 'surrogates of NCCS'. Again, I am not saying that NCCS is crucial or the right way to define audiences. Personally, I believe we should be able to define audiences with far more direct descriptors than identifiers such as NCCS. I am sure things will change, but today most campaigns have NCCS, since TV research too is built around NCCS.

Things get even more complicated if we now want to optimize the campaign and define budget allocations across platforms at the pre-planning stage, and also to report similarly during and after the campaign.

Maximize from Wavemaker

And here is a commercial break with a plug for my company 😇. Jokes apart, the tool Maximize at Wavemaker is the best that I have ever come across in the media industry. It is conceptually light-years ahead of the competing tools in other agencies, and statistically so robust that even I don’t try to understand the finer details of the agent-based modeling techniques that it uses. And it is never only about having a tool; what matters is the people who pilot the tool, and Wavemaker has a very ‘qualified’ team on Maximize. I can connect any of you to the “Maximize_Desk@Wavemaker” for a deeper interaction.

In Conclusion

So, if anyone answers the above question about multi-media campaigns using just Random Duplication or the Sainsbury Formula, which are stochastic approaches said to be applicable to Normally Distributed variables, do take the inferences with a handful of salt.
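To see why a single random-duplication number deserves that salt, the sketch below sets it beside the arithmetic limits on net reach for two media: full overlap and zero overlap. These limits follow from simple set arithmetic, not from any measurement. The function name `net_reach_range` and the 60%/40% inputs are illustrative assumptions.

```python
def net_reach_range(reach_a, reach_b):
    """Return (full_overlap, random_duplication, no_overlap) net reach as fractions."""
    # Full overlap: the smaller audience sits entirely inside the larger one
    full_overlap = max(reach_a, reach_b)
    # Random duplication: exposure to the two media is assumed independent
    random_dup = 1.0 - (1.0 - reach_a) * (1.0 - reach_b)
    # No overlap: the audiences are disjoint, capped at 100% of the universe
    no_overlap = min(1.0, reach_a + reach_b)
    return full_overlap, random_dup, no_overlap


low, rnd, high = net_reach_range(0.60, 0.40)
print(f"Full overlap: {low:.0%} | random duplication: {rnd:.0%} | no overlap: {high:.0%}")
```

For a 60% TV campaign and a 40% digital campaign, the true net reach can sit anywhere between 60% and 100%. Random duplication simply picks 76% without looking at the actual audiences.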

As I say, “Random Duplication gives Random Results”. When you invest millions, you deserve better than a random result.
