About F$%*ing Time!

Hello everyone,

Effective text moderation is a common challenge in game development, requiring a balance between ensuring player safety and allowing for creative expression. This case study explores how our team replaced a costly third-party service with a pragmatic, in-house AI-powered experiment to improve the player experience while reducing operational costs.

The Problem: A Costly Service

For a long time, we relied on an external, third-party service for text moderation across Niantic games. This service was effective, but it came at a high cost and was no longer reasonable with our smaller footprint of games following the Niantic Spatial spinout. We knew we needed a more sustainable solution.

Our First Attempt: The Static Filter

Our first step was to replace the external service with a rigid, static profanity filter. Internally, this was a simple spreadsheet of over 2,000 words we deemed inappropriate. It wasn’t ideal, but it was a quick solution during what was otherwise an insanely busy time spinning out and standing up our new company.

The player response was immediate and clear: they hated it. The filter was far too restrictive. Since it couldn't understand context, it would reject appropriate words that simply contained an inappropriate substring. For example, if "strip" was on our profanity list, the filter would also reject the perfectly harmless word "stripes." At the same time, our static list wasn't exhaustive, and new, creative forms of offensive language were getting through. It was the worst of both worlds.
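The substring failure mode described above is easy to see in code. This is a minimal, illustrative sketch of a context-blind static filter (the real list had 2,000+ entries; the single entry and function name here are hypothetical):

```python
# Hypothetical sketch of a naive static profanity filter.
# A single blocklist entry stands in for the full 2,000+ word list.
BLOCKLIST = {"strip"}

def static_filter_rejects(text: str) -> bool:
    """Reject if any blocklisted string appears anywhere in the text."""
    lowered = text.lower()
    return any(bad in lowered for bad in BLOCKLIST)

# Because the check is pure substring matching, it rejects the
# harmless word "stripes" -- a false positive.
print(static_filter_rejects("stripes"))  # True
print(static_filter_rejects("sunset"))   # False
```

Any filter built this way inherits the same tradeoff: casting the net wide enough to catch variants guarantees collateral rejections of innocent words.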

A New Approach: An Experimental AI Filter

It was clear we needed something smarter than a static list but much more cost-effective than the original service. We decided to explore if we could build a quick, AI-powered profanity filter ourselves.

We chose to work with Meta's Llama-v2-13b model, which was already implemented in Peridot because it powered some of our other generative AI features. Our goal was to create a simple, direct query that could give us a clear "yes" or "no" on a piece of text. After some experimentation, we landed on a very specific prompt:

LLM Prompt:

Determine whether the text within the square brackets contains any profanity or NSFW, inappropriate, or offensive references in any language: [text_to_check]. Respond only with a 'T' if the text is inappropriate or a 'F' otherwise, with no other tokens.

By instructing the model to respond only with a "T" for true (inappropriate) or "F" for false, we created a lightweight and fast check.

Measuring the Improvement

Honestly, almost anything would have been an improvement over the static list. But we needed data.

First, we ran our old list of 2,000+ inappropriate words through the Llama prompt and found it correctly identified about 60% of them. More importantly, we collaborated with our community, gathering lists of words that were being incorrectly rejected (false positives) and offensive words that were getting through (false negatives). When we tested this player-sourced list, we saw an accuracy rate of 85%.
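Measuring a filter against a labeled word list reduces to a simple accuracy computation. A minimal sketch (the sample data and classifier here are hypothetical, not our real lists):

```python
def accuracy(samples, classifier):
    """Fraction of (text, should_be_flagged) pairs the classifier gets right."""
    correct = sum(classifier(text) == expected for text, expected in samples)
    return correct / len(samples)

# Hypothetical labeled samples: text paired with whether it should be flagged.
samples = [
    ("harmless word", False),
    ("offensive term", True),
    ("creative misspelling", True),
    ("another safe word", False),
]

# A stand-in classifier; in practice this would call the LLM check.
flags = {"offensive term", "creative misspelling"}
rate = accuracy(samples, lambda text: text in flags)
print(rate)  # 1.0 for this toy data
```

Running the same computation over the old blocklist gives the 60% figure, and over the player-sourced false-positive/false-negative list gives the 85% figure quoted above.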

While not perfect, we concluded this was a much better starting point. It provides our players with more reasonable leniency while still catching most of the language we want to filter.

The Results and Cost

This solution has been live in Peridot for about a month, and we've already made one round of improvements, with more to come over time. And the cost? It's basically nothing. The volume of requests we handle fits within our monthly subscription, which is a shared resource across all of Niantic Spatial.

What’s Next?

This is just the beginning of this experiment. As our new filter runs in production, we’re gathering more insights that we can use to fine-tune and validate different models to incrementally improve performance.

We are also considering options to eventually build a feedback loop where community-flagged content can help us continuously train and deploy better, more accurate models.

This has been a practical solution to a real-world problem, and we're continuing to learn and iterate as we go.

– Asim Ahmed (Head of Product Marketing, Spatial Solutions) and the Peridot team