Tamara Zubatiy is the CEO and co-founder of Barometer. This interview has been lightly edited for style and readability.
TZ: Barometer is an AI company that has thrust itself into media to provide brand suitability and contextual targeting for user-generated content. Our first user-generated content set that we're really passionate about is podcasting. We think that podcasting represents the future of media in a way, because it's the opposite of Twitter and short-form exchanges. And as we're seeing now, with political conversations that used to be held behind closed doors happening on podcasts, it's quite a powerful free speech tool. What we wanted to do was enable the biggest advertisers in the world to confidently advertise in the conversations that are changing the world. Currently we feel that they're sitting on the sidelines, because the tools they're used to having in other media types don't exist in podcasting, so they can't do the things they would like to do with their purpose-driven marketing.
JC: What’s your definition of brand suitability and why is it important?
TZ: Brand safety is the founding father of brand suitability. There’s certain content that objectively nobody would agree should be monetized by an advertiser. This could include child pornography or selling illegal drugs, things like that.
The reason this became such a mainstream concept is that a L'Oréal ad ran on an ISIS video on YouTube. Not a video about ISIS, but a real first-person one. So Joe Barone from GroupM and some of the other folks at the time came together and created GARM, which stands for the Global Alliance for Responsible Media. It's a working group within the World Federation of Advertisers, and they define categories of things that could be in content that advertisers may seek to avoid. Every large social media company reports to them annually on the number of posts they actually take down that are considered not brand-safe.
Then they wondered - what happens when NPR covers the ISIS video? That's a different context than the first-person video itself. So they created the concept of brand suitability, and an accompanying framework that defines the different risk levels of each of the categories they had defined in the brand safety framework, from the point of view of suitability.
Brand safety is a binary and it’s an inherent property of content - the content either is or is not brand safe. But brand suitability is a spectrum, and it’s defined in relationship to a brand’s standards. So defining whether or not something is suitable is only possible in the context of a specific campaign. It might even be too broad for a whole brand that might have a number of sub products - for one product, a given show might be perfectly suitable, and then for another product, the same show might not be appropriate at all.
So from that point of view, I almost see brand suitability as a type of targeting - it’s more contextual and it allows you to kind of better refine what context or what conversations your ad is going to appear on.
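To make that campaign-level idea concrete, here is a minimal Python sketch of suitability as configuration rather than a property of content. The category names, risk levels, and thresholds are hypothetical, not Barometer's or GARM's actual schema:

```python
# A toy model of brand suitability: the same episode scores can pass one
# campaign's standards and fail another's. All names/levels are made up.

RISK = {"low": 0, "medium": 1, "high": 2}

campaigns = {
    # Maximum tolerated risk per content category, set per campaign.
    "family_product_launch": {"profanity": "low", "violence": "low"},
    "action_movie_promo":    {"profanity": "high", "violence": "medium"},
}

# Episode-level analysis output (hypothetical).
episode_scores = {"profanity": "medium", "violence": "low"}

def suitable(campaign: str) -> bool:
    """An episode is suitable if every scored category falls within the
    campaign's tolerated risk level."""
    limits = campaigns[campaign]
    return all(RISK[episode_scores[cat]] <= RISK[level]
               for cat, level in limits.items())

print(suitable("family_product_launch"))  # False: profanity exceeds "low"
print(suitable("action_movie_promo"))     # True: same episode, different standards
```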
JC: I used to work occasionally in a commercial trafficking department at a radio station. And when there was a big plane crash, all of the airlines wanted to pull off the air for a couple of days just to be away from any talk of plane crashes. Is this the same sort of thing but at scale?
TZ: Yeah, pretty much, it's not a new idea. But a lot of the historical approaches leverage keywords and keyword density. Part of what we bring to the table is an AI approach, which adds additional context: what is the topic, as opposed to the keyword, what's the tone, what else is going on besides just the keyword. That allows us to make distinctions between shooting a gun, shooting a movie, or shooting a basketball, right?
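As an illustration of that distinction, here is a toy Python sketch contrasting naive keyword blocking with a crude contextual reading of the same transcript. The word lists and rules are hypothetical stand-ins, not Barometer's model:

```python
# Why keyword density alone fails: "shooting" means different things
# depending on context. Toy sense lists, for demonstration only.

KEYWORD_BLOCKLIST = {"shooting"}

SENSES = {
    "firearms":   {"gun", "bullet", "victim", "police"},
    "film":       {"movie", "scene", "director", "camera"},
    "basketball": {"hoop", "free throw", "court", "dribble"},
}

def keyword_flag(transcript: str) -> bool:
    """Naive approach: flag any transcript containing a blocked keyword."""
    words = {w.strip(".,!?") for w in transcript.lower().split()}
    return bool(words & KEYWORD_BLOCKLIST)

def contextual_sense(transcript: str) -> str:
    """Crude contextual approach: pick the sense whose cue words
    co-occur most often with the keyword."""
    text = transcript.lower()
    scores = {sense: sum(text.count(cue) for cue in cues)
              for sense, cues in SENSES.items()}
    return max(scores, key=scores.get)

episode = "They spent all day shooting the final scene with the director."
print(keyword_flag(episode))      # True -> a keyword filter would block this
print(contextual_sense(episode))  # "film" -> context says it's harmless
```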
In the plane crash - maybe the context is that it’s a podcast about aviation safety. So, they talk about plane crashes, but then maybe the airline is a hero - there’s so much nuance.
We’re still living that nightmare in terms of people not wanting to buy news podcasts. That’s one of the reasons we’re making it a lot more nuanced, so you can actually see what topics are being talked about. Maybe you don’t run on the one episode where they were talking about abortion and that’s something that’s really polarizing in your space - but run on all of the content about climate change that supports your purpose.
JC: I think content creators think of brand suitability as being a reason why a brand should NOT advertise. But actually what you’re suggesting here is that this is a reason why brands CAN advertise on shows that historically they wouldn’t have gone near.
TZ: Totally. I think that what's been missing is the resolution needed to understand just how much volatility there is in the content. Prior to these solutions, the state of the art was somebody listening to a couple of minutes of an episode and trying to make a judgment about the whole show, or even worse, a whole genre. Then what happens is the genre gets cut, or the show gets cut.
Our thesis is never cut anything, never cut an entire show, never cut an entire genre. Use the scalpel rather than the axe, as people say. We’re trying to bring that level of nuance so that people aren’t axing entire shows or genres. When you do, you narrow your reach so much that even across a variety of publishers, it’s hard to fill inventory that perfectly matches your narrow constraints. So part of what we’re trying to do is show people visually in the platform the impact of a decision to cut - how much of your reach would you give up.
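A minimal sketch of that "show the cost of a cut" idea, with made-up numbers and data shapes rather than anything from Barometer's actual platform: given per-episode reach, report how much reach a proposed exclusion gives up.

```python
# Toy reach-impact calculator: compare cutting a whole genre (the axe)
# against excluding individual episodes (the scalpel). Data is invented.

episodes = [
    # (show, topic, weekly_reach)
    ("Daily News Hour",  "news",    120_000),
    ("Daily News Hour",  "climate",  95_000),
    ("True Crime Files", "crime",   210_000),
    ("Science Weekly",   "climate",  60_000),
]

def reach_after_cut(excluded_topics: set) -> tuple:
    """Return (reach kept, total reach, fraction of reach given up)."""
    total = sum(reach for _, _, reach in episodes)
    kept = sum(reach for _, topic, reach in episodes
               if topic not in excluded_topics)
    return kept, total, 1 - kept / total

kept, total, lost = reach_after_cut({"crime", "news"})
print(f"Kept {kept:,} of {total:,} weekly reach ({lost:.0%} given up)")
# Cutting at the genre level quickly erodes the reach a campaign can fill.
```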
We're also offering advertisers a way to iterate instead of just setting it and forgetting it. We're offering a checkpoint early in the campaign, so that you don't get to the end of the campaign and say, oh, it didn't work. You can have that intermediate check-in and say: "OK, this is working well. We thought we couldn't do any profanity, but look at how well we're doing on the show that says one bad word".
JC: Does the AI understand sarcasm? We've seen Roseanne Barr recently reading all kinds of very unpleasant stuff, but from all of the reporting, as I understand it, she was actually being sarcastic. What happens there?
TZ: There's a lot of nuance in people's tone. Some people naturally sound more excited than other people. What we look at is a variety of different emotions, and so sarcasm can be read as a fingerprint consisting of a certain ratio of emotions. For every person's voice, the exact frequency of what sarcasm sounds like is going to be a little bit different. It's not 100% accurate at figuring out whether something is sarcasm; there's a lot of nuance there. We still rely on external contextual cues, like whether this is comedy or something like that.
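Here is a toy sketch of that fingerprint idea: a segment's emotion scores compared against a per-speaker sarcasm baseline, with an external contextual cue (a comedy tag) adjusting the threshold. All numbers, names, and the comparison method are hypothetical:

```python
# Sarcasm as a per-speaker "fingerprint": a ratio of emotion scores.
# Invented numbers; a real system would learn these per voice.

import math

def cosine(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Emotion scores per segment: (joy, anger, surprise, contempt).
speaker_sarcasm_fingerprint = (0.6, 0.1, 0.1, 0.7)  # learned per speaker
segment_scores = (0.5, 0.2, 0.1, 0.6)

similarity = cosine(segment_scores, speaker_sarcasm_fingerprint)

# External contextual cue: a comedy show lowers the evidence bar.
is_comedy = True
threshold = 0.85 if is_comedy else 0.95
print(f"similarity={similarity:.2f}, sarcastic={similarity >= threshold}")
```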
JC: This person is a comedy actor, therefore look at this slightly differently than if it's a broadcast journalist?
TZ: Exactly. And the GARM definitions allow for that. The medium risk definition includes a note about content "for the purposes of entertainment", whereas the high risk definition does not.
JC: The other question I hear talked about a lot in brand suitability conversations is the G in GARM. In Thailand, for example, they revere the king. The king is a very important person, and it is not just illegal to be rude about the king, it's also deeply, deeply frowned upon. Some people would argue that GARM is a cultural invasion - US culture being applied everywhere else. Or have we got a bit more understanding of different cultures, like the way the Thais treat their king?
TZ: When I first heard that some group got together and decided the things that nobody should ever monetize, I literally freaked out because I was born in a former Soviet country, and I know what it’s like to not have freedom of speech. I was like: “What do you mean, these people got together and decided that we can’t say this?”
From the point of view of Barometer, we're currently only active in US English and British English, and we're starting to become active in Canadian English. For us to start being active in British English, for example, it meant we had to go read a bunch of stuff about the nuances of British debates and social issues, and we referenced the UK Home Office and a bunch of different online resources to understand what the current problems are that British people are thinking about today. Then we also had to learn the unique British vernacular, and the differences in swearing and other idioms. It's not perfect, but the approach is not just about the language; it's also about the culture, and what is and isn't normal.
For example, one of the GARM categories is hate speech. In Germany, hate speech is illegal. In America, hate speech is not illegal. So in Germany, if there's hate speech, it's not brand safe - it fails at the binary level. That's why we're not rushing to get into other markets yet, and we are being very diligent about how we're approaching it.
So yes, we could today translate content from German into English, analyze it in English and then report the results. But it wouldn't be calibrated to the expectations of the German reality. So what we expect in the future is that when people buy shows, whatever the show's country of origin, the norms will be applied and interpreted based on that country.
If I'm an American company but I'm operating in Germany, I should abide by the German norms, and it could be our job to help with that.
JC: How does GARM address it?
TZ: They do not. They do not provide a lot of guidance. The GARM framework is vague at best, dangerous at worst, with some of the definitions. We are on a lot of working groups trying to change that, and to get them to be a little more transparent about what it is that they mean. We've made some progress there; I think that is a positive step in the right direction. They're just people like anybody else, trying to do the right thing. A lot of the time, when we ask them a question like "what's the difference between hate speech that's below the floor and hate speech that is high risk?" or "what happens to misinformation analysis when the vaccine does one thing a year ago and a different thing today, and masks did one thing a year ago and a different thing today?", their answer is often to make a working group and decide what to do.
JC: You're working with NewsGuard now, which does misinformation detection; you're also working with ArtsAI, helping their clients with brand safety and suitability. Which are the partnerships that you're most excited about right now?
TZ: I think each of the different steps that we’re taking represents a different part of our comprehensive offering.
The partnership with Audiology and Katz really spoke to our scale and our ability to handle that volume. I think that was an important signal to the market that we don't just do a handful of shows; we're on our path to covering the top 50,000 monetized shows with Audiohook. I think they represent a frontier tech DSP for audio. So yes, obviously we're working on becoming a rail in The Trade Desk and all of the operational leverage that we need to create to facilitate that. But at the same time, Jordan from Audiohook has been really hard at work engaging smaller publishers, and we see our partnership as more than just a channel partnership - as a leap forward in the self-service ecosystem for buying across these smaller publishers.
There are 12 GARM components, but we only had 11 covered. The 12th was misinformation, which was added in June of last year. We went back and forth internally about whether we should build it or buy it, and we talked to a lot of different providers. One thing that we were really opposed to is giving a score to a single show. We really don't grade shows; we just provide episode-level analysis and then some basic averages. So what we loved about NewsGuard's misinformation fingerprint product is that it's not labeling any sources. All it's doing is identifying narratives that are verifiably false. Even within the set that they've curated, we curate a smaller subset that we can actually confirm are verifiably false.
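A minimal sketch of that narrative-matching idea, using naive word overlap purely for illustration; the fingerprints and the matching logic are invented, not NewsGuard's actual product or API:

```python
# Match transcript segments against a curated catalog of narratives
# already verified as false - flag episodes, never grade whole shows.

# Hypothetical fingerprints: short canonical statements of false narratives.
FINGERPRINTS = {
    "FP-001": "the moon landing was staged in a film studio",
    "FP-002": "drinking bleach cures viral infections",
}

def match_fingerprints(segment: str, min_overlap: int = 4) -> list:
    """Return IDs of fingerprints whose wording overlaps the segment.
    A real system would use embeddings, not raw word overlap."""
    seg_words = {w.strip(".,!?") for w in segment.lower().split()}
    hits = []
    for fp_id, narrative in FINGERPRINTS.items():
        if len(seg_words & set(narrative.split())) >= min_overlap:
            hits.append(fp_id)
    return hits

segment = "He claimed the moon landing was staged on a studio set."
print(match_fingerprints(segment))  # ['FP-001'] -> episode-level flag only
```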
Our partnership with NewsGuard is focused on state-sponsored disinformation narratives that they have verifiably traced to a state-sponsored source. When somebody cites one of those in a podcast, we can say that they are referencing a known state disinformation provider. Especially in the US, people can be ignorant about the reality of state-sponsored media in other countries, and how a source that seems perfectly legitimate could actually be state-sponsored. That partnership allows us to get expertise, and it's like buying data. If you asked them about it, they would say the same thing - it's like a data seed for our models to train off of. They're also providing the labor force of people who can speak all those different languages and understand them.
I think maybe the coolest partnership we have right now is what ArtsAI is doing with enabling this real-time monitoring. If you look at the offerings of companies like us in other media areas, they have pre-bid targeting and then post-campaign monitoring and reporting. The pre-bid targeting piece is a little bit out of our control - we're working on a big collaboration to make that happen - but the monitoring stuff, we can do that already. So what's cool about the ArtsAI partnership is: sure, we could go make our own pixel, but why, when we could just turn this on? The power of the partnership is that publishers can just decide they want it, and they don't have to do any coding. We can bill them monthly by CPM instead of charging them a subscription, which is also a huge change in their ability to do it. And the other cool thing is that it makes literally everybody look good. Part of the fear of the publishers was "oh my gosh, we're going to expose stuff", and our whole argument was: "you know what you're doing, we're literally just proving that you know what you're doing". Fortunately, so far, those have been our results - we've gotten literally 99%, sometimes 99.9%, alignment to set standards once we do the monitoring. We've had budgets committed from test to full because of the ability to have this auditing. We hope that confirms our thesis that people are able to invest more once they have this transparency.
JC: I came into this thinking - brand suitability: it’s a way for me to lose money. But from everything that you’ve been saying, brand suitability is actually a way for me to MAKE money, because otherwise buyers may not go anywhere near any of my stuff. So that’s really interesting.
TZ: Yeah, it’s really cool. And I think the targeting opportunity is also really huge. If you try to ask your network to not target your ads on terrorism content, they might laugh. “Of course we’re not going to target you on terrorism content!” But if you say that you’re going to pay 50 cents per CPM so that they do not target you on terrorism content… of course. It’s like insurance. Even if nothing happens, having that peace of mind for them is good. We think that in any targeting scenario the publisher should be incentivized via direct revenue share, because they have to feel like they’re winning. We’re not fighting the publisher - that would not make any sense. We have been investing a lot into our publisher relationships.
JC: What’s on your roadmap? What are you working on in the future?
TZ: We're working on going beyond high risk, medium risk or low risk for brand suitability. We think that's too coarse; we want to be more nuanced than "high, medium, low". That's a really big push for us. The other big push is those real-time workflows - so you will very soon be able to do real-time targeting, which is super cool. I think it's been really important that we waited this long, because we've been calibrating and understanding what the real expectation is. Then, finally, towards the end of the year, we will be announcing our foreign language offerings, starting with German, Spanish and then Arabic.
JC: Finally, you work with AI, but how are you approaching Generative AI in podcasting?
TZ: My PhD is in generative AI and large language models, which is probably handy! Literally all I'm doing this year is reading about LLMs. Basically the TL;DR of it is that right now they are making stuff up - they're little hallucinating monsters, not to be trusted, when using public data.
But recently there have been a lot of infrastructure developments making open-source models available, and infrastructure platforms like AWS let you keep your own data in silos and then train models on your own information. That's promising. We're doing that.
It's not really about asking factual questions where the outcome is significant - that is for expert systems, for now. It's about asking "why is the score high risk?", having access to all of our data and an understanding of the methodology, and being able to produce a human response that says it's high risk and adult content because it talks about this, this, and that, in this type of context, with this tone.
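A minimal sketch of that grounding pattern: look up the system's own stored evidence for a score, then constrain the model's prompt to that evidence so the explanation can't wander into made-up facts. Function names and data shapes are hypothetical, not Barometer's actual pipeline:

```python
# Retrieve-then-explain: the LLM only summarizes evidence we already hold.

def retrieve_evidence(episode_id: str) -> dict:
    """Stand-in for a lookup against the scoring system's own records;
    episode_id would key the lookup in a real system."""
    return {
        "score": "high risk",
        "category": "adult content",
        "signals": ["explicit topic at 12:40", "graphic tone throughout"],
        "methodology": "GARM-aligned category definitions",
    }

def build_prompt(evidence: dict) -> str:
    # Constrain the model to the retrieved evidence to limit hallucination.
    return (
        "Using ONLY the evidence below, explain in one paragraph why this "
        f"episode was rated {evidence['score']} for {evidence['category']}.\n"
        f"Methodology: {evidence['methodology']}\n"
        f"Signals: {'; '.join(evidence['signals'])}"
    )

prompt = build_prompt(retrieve_evidence("ep-123"))
print(prompt)  # This prompt would go to a privately hosted, fine-tuned model.
```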
JC: Thank you so much for your time.