Modulate gets $30M to detox game voice chat with AI


Modulate has raised $30 million to build out its AI product, ToxMod, which scans voice chat using machine learning to find toxic players in online games.

ToxMod uses artificial intelligence to highlight problems that human moderators should pay attention to as players chat with each other in online games. It's a problem that will only get worse with the metaverse, the universe of interconnected virtual worlds depicted in novels such as Snow Crash and Ready Player One. The company raised the round on the strength of large customers such as Rec Room and Poker Stars VR, which rely on it to help their community managers find the biggest toxicity problems.

“This is a problem that everyone in the industry has desperately needed to solve,” said Mike Pappas, CEO of Modulate, in an interview with GamesBeat. “This is such a large-scale market need, and we were waiting to prove that we’ve actually built the product to satisfy this.”

Lakestar led the round with participation from existing investors Everblue Management, Hyperplane Ventures, and others. In addition, Mika Salmi, managing partner of Lakestar, will join Modulate’s board.

Modulate’s ToxMod is a proactive voice moderation system designed to capture not just overt toxicity (hate speech, adult language) but also more insidious harms like child grooming, violent radicalization, and self-harm. The system’s AI has been trained on more than 10 million hours of audio.

Cambridge, Massachusetts-based Modulate wants to change the way that game developers undertake the unending fight against online toxicity, said Pappas. He said the funding is a validation of the importance of the company’s mission.

“The core business is proactive voice moderation,” Pappas said. “Rather than just relying on player reports, this is saying you can actually fulfill that duty of care and identify all of the bad behavior across your platform and really do something about it in a more comprehensive way.”

ToxMod uses sophisticated machine learning models to go beyond transcription and understand not just what each player is saying but how they are saying it – including their emotion, volume, prosody, and more. This is crucial, as what’s harmful in one context may be friendly trash talk or genuinely supportive in another.
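Modulate hasn't published its model internals, but the kind of signal Pappas describes can be illustrated with a minimal sketch. The function below is purely hypothetical: it computes two coarse prosodic features from raw audio, RMS energy (a loudness proxy, useful for detecting shouting) and zero-crossing rate (a rough correlate of pitch), which are exactly the cues a plain text transcript discards.

```python
import numpy as np

def prosody_features(samples: np.ndarray, sample_rate: int) -> dict:
    """Compute coarse prosodic features from a mono audio signal.

    RMS energy approximates loudness (is the speaker shouting?),
    and zero-crossing rate loosely tracks the pitch of the signal.
    """
    rms = float(np.sqrt(np.mean(samples ** 2)))
    crossings = np.count_nonzero(np.diff(np.signbit(samples)))
    zcr = crossings / (len(samples) / sample_rate)  # crossings per second
    return {"rms_energy": rms, "zero_crossing_rate": zcr}

# Example: a 440 Hz tone at two volumes transcribes identically ("a tone"),
# but the energy feature separates a whisper from a shout.
sr = 16_000
t = np.linspace(0, 1, sr, endpoint=False)
quiet = 0.1 * np.sin(2 * np.pi * 440 * t)
loud = 0.9 * np.sin(2 * np.pi * 440 * t)
print(prosody_features(quiet, sr)["rms_energy"]
      < prosody_features(loud, sr)["rms_energy"])  # True
```

A production system would feed features like these, alongside the words themselves, into a classifier so that the same sentence can be scored differently when whispered between friends versus screamed at a stranger.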

Modulate said ToxMod uses its nuanced understanding of voice to differentiate between these types of situations, identifying the worst actors while leaving everyone else free to enjoy their own approach to each game. Thanks to this sophistication, ToxMod can detect offenses with greater than 98% accuracy (which further improves over time), and enable moderation teams to respond to incidents over 25 times faster.

“I first saw the company probably about a year and a half ago. We saw it as a team with best-in-class technology. And that’s what we invest in,” Salmi said in an interview. “When I saw it, I couldn’t believe what I saw.”

The big question was whether the team could commercialize that technology, and it has, Salmi said. Pappas added that the company has a number of unannounced large customers using it.

“Clearly nobody else out there has it. We looked at this kind of technology for a long time, and nothing came close,” Salmi added.

Many companies face huge volumes of reports about toxicity. Dennis Fong, CEO of GGWP, which uses AI to scan text chat, reported that human moderators at those companies can only go through a tiny percentage of those reports. GGWP focuses on different problems than Modulate, and GGWP also looks at building reputation scores for players that can help assess their behavior over a long period.

Using that kind of long-term approach, companies can deal in different ways with players who are only occasionally toxic versus those who engage in it much more frequently. These so-called reputation scores can travel with players.
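Neither GGWP nor Modulate has published how such scores are computed, but the idea of separating occasional lapses from habitual toxicity can be sketched with a simple exponentially decaying score. Everything here is illustrative: the class name, the decay weight, and the 0-to-1 scale are assumptions, not either company's actual scheme.

```python
from dataclasses import dataclass

@dataclass
class Reputation:
    """Hypothetical per-player reputation with exponential decay:
    one bad moment barely dents a long clean record, while repeat
    offenses sink the score quickly."""
    score: float = 1.0   # 1.0 = clean record, 0.0 = worst
    decay: float = 0.9   # how much history persists per event

    def record(self, toxic: bool) -> float:
        # Blend the new observation into the running score.
        observation = 0.0 if toxic else 1.0
        self.score = self.decay * self.score + (1 - self.decay) * observation
        return self.score

occasional = Reputation()
repeat = Reputation()
for i in range(10):
    occasional.record(toxic=(i == 5))  # one bad moment in ten sessions
    repeat.record(toxic=True)          # toxic every session
print(occasional.score > repeat.score)  # True
```

The occasional offender's score recovers toward 1.0 after the single incident, while the repeat offender's collapses, which is what lets a studio choose a warning for one player and a suspension for the other.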

“For us, the immediate problem we’re trying to really hone in on is how do we shine a light on what’s going on in the first place,” Pappas said. “We start with understanding the landscape and how toxicity emerges, where it is happening, how players are acting, and how do we work with our customers closely in designing education campaigns.”

If players are punished, they need to understand why. If toxicity occurs amid allegations of cheating, that’s important to know. Modulate is also thinking about how to preserve the mental health of moderators who have to deal with all of the abuse.

As for the metaverse, it makes sense for game companies to try to solve these problems in the smaller context of their own games before they try to go and connect with everybody else’s applications.

Modulate’s team in Cambridge, Massachusetts.

Where existing voice moderation tools focus only on the 8% of players that submit reports, ToxMod offers proactive moderation that empowers platform and game moderators to make informed decisions to protect players from harassment, toxic behavior, and even more insidious harms. Modulate has helped customers address thousands of instances of online toxicity.

Pappas said the company is making sure ToxMod doesn’t misclassify trash talk, which can be acceptable in mature-rated games like Call of Duty, as something like a racial slur. The idea is to make moderators more effective across the platform. Pappas said the model for spotting problems is trained over time and keeps getting better.

Human moderators can sift through the results and identify false positives, and the system can learn from that.

“They can start taking immediate action,” Pappas said. “Occasionally, it can misunderstand the conversation as human language is complicated. And each game has different standards.”
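One simple way to picture that feedback loop is a flagging threshold that moderator reviews nudge up or down. This is a hypothetical sketch, not Modulate's method: the function name, the step size, and the review format are all invented for illustration.

```python
def update_threshold(threshold: float, reviews: list, step: float = 0.01) -> float:
    """Adjust a toxicity-flagging threshold from moderator feedback.

    reviews: True = moderator confirmed the flag was real toxicity,
             False = moderator marked it a false positive.
    """
    for confirmed in reviews:
        if confirmed:
            # Confirmed hits: lower the bar to surface more like this.
            threshold = max(0.0, threshold - step)
        else:
            # False positives: raise the bar so similar clips stop surfacing.
            threshold = min(1.0, threshold + step)
    return threshold

# Mostly false positives push the threshold up.
print(update_threshold(0.5, [False, False, True, False]))  # approximately 0.52
```

A real system would retrain the underlying model on this feedback rather than just moving a threshold, and would keep separate settings per game, since, as Pappas notes, each game has different standards.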

Words like “p0wned” have to be considered in context to determine whether they are being used aggressively. Pappas said that voice moderation cannot rely on commodity transcription, which converts spoken words to text, because it doesn’t capture things like tone or whether someone is shouting.

“No company in the world has built out this kind of data set specifically designed to focus on real social voice chats online,” Pappas said. “That has allowed us to build accuracy in our models that beats any of the public big company transcription models out there by a pretty handy percentage.”

Since Modulate has focused on the fundamentals of running a strong business, Pappas said it had good relationships with VCs and had an easier time raising money at a time when it’s tough to do so, even for game companies. Salmi said it’s true that VCs are getting more discerning and investments are taking longer, and that’s why he is happy to find a company like Modulate.

The company has hit its milestones with just 27 people, and that speaks to the power of AI.
