Brand Safety ‘Under the Hood’: How Machine Learning experts taught computers to block mature content

With Brand Safety buzzing loudly in everyone’s ears, we often find ourselves talking about Luminous’ unprecedented ability to automatically recognize and block mature content.

This AI engine has established itself as an excellent gatekeeper, but many still wonder how the keeping is actually done. We headed down to our Machine Learning (ML) labs to interview the ML team working on Luminous and get some answers.

The ML team’s office is everything you would expect it to be. Dimly lit, freezing cold, and full of the whispers of high-power computers processing away.

The interview begins, and we are thrown into a world of algorithms, models, and ML. Lots of ML.     

Your task was to teach computers to be the responsible adult. Where do you even begin?

“You may be surprised to learn that there already is a machine that excels at recognizing inappropriate content: The human brain.

Our challenge, therefore, was to understand how humans learn to easily differentiate between safe content and unsafe mature content, and then to teach a computer to do the same.”

How do you teach a computer to think like a human?

“We created a content profiling method for the different types of problematic content, which helped us find common denominators such as objects, locations, and themes that can be used as warning signals for various unsafe content categories.

For each category, we created libraries of images that reflected the different brand safety categories, such as violence, war and terror, drug and alcohol abuse, as well as nudity and both implicit and explicit sexual content.

Let’s focus on sexual content as an example, as this is a big concern for many family-friendly advertisers and publishers. The libraries we created included images such as people in horizontal positions, people wearing minimal clothing, and scenes taking place in a room with a bed.”
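
To make the idea concrete, here is a minimal sketch of what such a content profile might look like in code. The category names and warning-signal labels are illustrative assumptions on our part, not Luminous’ actual taxonomy.

```python
# Hypothetical sketch: map each brand safety category to the "warning signal"
# labels used to curate and tag its training image library. Names are illustrative.
CONTENT_PROFILES = {
    "violence_war_terror": ["weapon", "explosion", "combat_scene", "blood"],
    "drug_alcohol_abuse": ["syringe", "pills", "liquor_bottle", "bar_scene"],
    "sexual_content": ["horizontal_pose", "minimal_clothing", "bedroom_scene"],
}

def warning_signals(category: str) -> list[str]:
    """Return the warning-signal labels annotators look for in this category's library."""
    return CONTENT_PROFILES.get(category, [])
```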

So now you have the syllabus ready, but how do you teach a computer to see?

“It is extremely difficult to build an algorithm that recognizes every possible item in the world. There are billions of items and billions of possible scenarios.

If we were to build something that looks at every single item in a frame and analyzes each one in depth, it would look at a sex scene and analyze every single detail, like the labels on items of clothing, the pictures hanging on the wall, and maybe even what can be seen out the window.

As impressive as this may sound, it’s inevitable that such a system would compromise its accuracy per item in order to cover as many items as possible. At this point in time, you simply cannot be an expert at recognizing everything.”

When it comes to Brand Safety, accuracy is key. How did you improve it?

“We decided to tackle the problem using a different approach.

Existing computer vision technology provides one solution for many needs, whereas we decided to create a customized solution for each of our customers’ pain points.

We used the libraries to create a specialized model for each brand safety category, training each model to look only for what it should be looking for.

By focusing on particular items, we were able to reach unprecedented accuracy.”
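
As a rough illustration of this per-category approach, the sketch below trains one small binary classifier per brand safety category on pre-computed image embeddings. The choice of scikit-learn, logistic regression, and embedding inputs is ours for illustration; the team did not describe Luminous’ actual architecture.

```python
# Hypothetical sketch: one specialized binary classifier per brand safety category,
# each trained only on its own curated library of image embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_category_models(libraries: dict[str, tuple[np.ndarray, np.ndarray]]) -> dict:
    """libraries maps a category name to (embeddings, binary labels) from its image library."""
    models = {}
    for category, (X, y) in libraries.items():
        clf = LogisticRegression(max_iter=1000)
        clf.fit(X, y)  # each model learns only its own category's signals
        models[category] = clf
    return models

def score_frame(models: dict, embedding: np.ndarray) -> dict[str, float]:
    """Score a single frame embedding against every specialized model."""
    return {category: float(model.predict_proba(embedding.reshape(1, -1))[0, 1])
            for category, model in models.items()}
```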

What types of models did you develop to address Brand Safety concerns?

“One interesting example is nudity, as algorithms are often thrown off by a lot of skin, which doesn’t necessarily indicate nudity.

To help them understand the difference between revealing-yet-acceptable apparel and nudity, we defined a percentage of skin that is acceptable.

This enables our model to measure the percentage of revealed skin per body and determine whether it is above the appropriate threshold. When the threshold is surpassed, the body is recognized as nude, or close enough to nude to be flagged as such.”
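
A minimal sketch of that threshold check, assuming upstream models already produce a pixel mask for each detected body and a mask of exposed-skin pixels (the 0.6 default is a placeholder, not a real product value):

```python
# Hypothetical sketch of the per-body skin-ratio check. The body and skin masks
# are assumed to come from upstream detection models and are boolean arrays.
import numpy as np

def is_flagged_as_nude(body_mask: np.ndarray,
                       skin_mask: np.ndarray,
                       threshold: float = 0.6) -> bool:
    """Return True if the fraction of the body showing exposed skin exceeds the threshold."""
    body_pixels = body_mask.sum()
    if body_pixels == 0:
        return False
    exposed_ratio = (body_mask & skin_mask).sum() / body_pixels
    return exposed_ratio > threshold
```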

But the definition of nudity changes from culture to culture! What’s considered inappropriate in some cultures might be fine in others.

“That’s a valid point, and it’s exactly why we have created several different models that can meet the needs of our different customers.

In some cultures, footage of women in bikinis is considered highly inappropriate. However, there are plenty of cultures where bikinis are completely acceptable and often included in family-friendly content.

Our specialized models allow a high degree of flexibility when it comes to defining the threshold, so we are able to support a wide variety of customer standards.”    
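
Building on the previous sketch, that flexibility could come down to nothing more than a per-market threshold applied to the exposed-skin ratio of each body. The market names and values below are placeholders, not actual customer settings.

```python
# Hypothetical per-market thresholds (placeholder values only), applied to the
# exposed-skin ratio computed per body as in the previous sketch.
CUSTOMER_THRESHOLDS = {
    "conservative_market": 0.30,  # swimwear may already exceed this
    "default": 0.60,
    "permissive_market": 0.80,
}

def flag_for_customer(exposed_skin_ratio: float, customer: str) -> bool:
    """Flag a body as nude according to the customer's own standard."""
    threshold = CUSTOMER_THRESHOLDS.get(customer, CUSTOMER_THRESHOLDS["default"])
    return exposed_skin_ratio > threshold
```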

How about implicit sexual content? Plenty of TV shows include sex scenes with no actual nudity. How could an algorithm cope with this?

“This is a good question, and in fact it is extremely difficult to detect implicit sexual content.

We tackled this challenge by creating a large collection of these non-obvious sex scenes and using it to train another algorithm.

To increase accuracy, we distinguished such situations using our unique approach: we looked for combinations that are common in sex scenes, such as people lying horizontally and in close proximity to each other.”
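
One hedged way to picture this combination step is a weighted score over co-occurring cues, as sketched below; the cue names and weights are our own illustrative assumptions, not the team’s actual features.

```python
# Hypothetical sketch: combine co-occurring cues into an implicit-content score.
from dataclasses import dataclass

@dataclass
class FrameCues:
    horizontal_pose: bool   # at least one person detected lying horizontally
    close_proximity: bool   # two or more people detected very close together
    bedroom_setting: bool   # scene classifier says the location is a bedroom

def implicit_content_score(cues: FrameCues) -> float:
    """Weighted combination of cues; higher means more likely implicit sexual content."""
    weights = {"horizontal_pose": 0.4, "close_proximity": 0.4, "bedroom_setting": 0.2}
    score = 0.0
    if cues.horizontal_pose:
        score += weights["horizontal_pose"]
    if cues.close_proximity:
        score += weights["close_proximity"]
    if cues.bedroom_setting:
        score += weights["bedroom_setting"]
    return score
```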

Sounds complicated.

“Actually, it gets even more complex. Most models look only at images and are not equipped to understand movement and interpret action. We developed the ability to look at the video as a whole, not as a series of individual images.

This means that we are able to take dynamic features into account, such as correlations between moments in time and the way the bodies move within the scene. All of these factors are used to calculate whether or not the scene includes sexual activity, explicit or implicit.”
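
To illustrate the idea of looking at the video as a whole rather than frame by frame, here is a simple sketch that smooths per-frame scores over a sliding time window and flags only a sustained signal; the window size and threshold are assumptions for illustration.

```python
# Hypothetical sketch: temporal smoothing of per-frame scores so that a brief
# spike in one frame is ignored but a sustained signal across time is flagged.
import numpy as np

def scene_is_flagged(frame_scores: list[float],
                     window: int = 25,        # roughly one second at 25 fps
                     threshold: float = 0.5) -> bool:
    scores = np.asarray(frame_scores, dtype=float)
    if len(scores) < window:
        return bool(scores.mean() > threshold)
    kernel = np.ones(window) / window
    smoothed = np.convolve(scores, kernel, mode="valid")  # moving average over the window
    return bool(smoothed.max() > threshold)
```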

That sounds a lot like how the brain works.

“The processes are indeed similar.

As a team, we always ask ourselves what we are able to do as humans, and then try and teach our machines to apply the same logic.

Something we worked on for a while is the ability to recognize nudity in different lighting conditions. The challenge with skin is not only that it comes in many different colors, but that its color is highly influenced by lighting.

For example, algorithms have a hard time recognizing nude people in red lighting. The red light creates an unnatural skin color which can be very confusing for a computer. In fact, we have seen many cases where shadow over skin in red lighting is mistaken for clothing.

With this in mind, we developed a model that detects the color of the light in the video and takes it into account when looking for nudity, just like the brain would.”
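
As a stand-in illustration of lighting compensation (not Luminous’ actual model), a simple gray-world white balance can neutralize a colored light cast before the nudity model runs:

```python
# Hypothetical sketch: gray-world white balance. Rescale each color channel so the
# average color of the frame is neutral gray, reducing the effect of colored lighting.
import numpy as np

def gray_world_balance(frame: np.ndarray) -> np.ndarray:
    """frame: H x W x 3 RGB array. Returns a white-balanced copy."""
    frame = frame.astype(np.float64)
    channel_means = frame.reshape(-1, 3).mean(axis=0)
    gain = channel_means.mean() / np.maximum(channel_means, 1e-6)
    balanced = np.clip(frame * gain, 0, 255)
    return balanced.astype(np.uint8)

# A red-lit frame has an inflated red channel mean; after balancing, skin tones fall
# back into the color range a skin detector would expect under neutral light.
```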

 

To learn more about the business implications of mature content blocking, read this article written by Gil Becker, CEO at AnyClip, published by Marketing Tech.