19.05.2022

AI-Based Profanity Editor: Way to Make Online Communication Safer

MTS AI and Skoltech Develop AI-Based Profanity Editor

MTS AI and Skoltech developers have come up with a language detoxifier, an AI-based solution that detects and replaces “toxic” words and expressions.

How the Profanity Editor Works

How can we make online communication safe and non-offensive? The answer came from NLP experts at MTS AI and Skoltech. They created a profanity editor that lowers the aggression of a message while preserving its meaning. Beyond social networks, the language detoxifier may soon be used in building voice assistants, chatbots and voice bots, the NLP experts explain.

“Online content is generated at a rate that makes its adequate manual filtering impossible. Social networks will often simply block offensive language. Our solution doesn’t just delete messages or ban users, it offers replacement phrases to make the language more neutral while preserving the intent of the message,” says Irina Krotova, senior developer at MTS AI’s NLP unit.

The language detoxifier from MTS AI and Skoltech is a unique product on the Russian market, since most such solutions are built for English. There are almost no comparable services for Russian-speaking users, and previous solutions have proven ineffective.

MTS AI and Skoltech proposed two types of models for building bots and applications that remove obscene language. The first uses the BERT language model, which is based on the Transformer neural network architecture. This model makes targeted, word-level edits: it finds obscene words and expressions in the text and either replaces them with neutral synonyms or deletes them altogether.
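The word-level editing described above can be illustrated with a minimal sketch. This is not the MTS AI and Skoltech model: the small English lexicon below is a hypothetical stand-in for what BERT does in the real system (deciding which words are toxic and proposing neutral substitutes), and the function name `detoxify_pointwise` is made up for illustration. The sketch only shows the editing pattern: flagged words are replaced or deleted, and the rest of the message is left untouched.

```python
import re

# Hypothetical lexicon standing in for the model's decisions.
# A string is a neutral substitute; None means "delete the word".
TOXIC_LEXICON = {
    "idiot": "author",
    "damn": None,
}

# Split text into alternating runs of word and non-word characters,
# so that joining the tokens reproduces the original text exactly.
TOKEN_RE = re.compile(r"\w+|\W+")


def detoxify_pointwise(text, lexicon=TOXIC_LEXICON):
    """Replace or delete flagged words and keep everything else as written."""
    out = []
    for token in TOKEN_RE.findall(text):
        key = token.lower()
        if key not in lexicon:
            out.append(token)  # neutral word, space or punctuation
            continue
        replacement = lexicon[key]
        if replacement is None:
            continue  # delete the flagged word
        if token[0].isupper():
            replacement = replacement.capitalize()  # keep sentence casing
        out.append(replacement)
    result = "".join(out)
    result = re.sub(r" {2,}", " ", result)          # close gaps left by deletions
    result = re.sub(r"\s+([.,!?])", r"\1", result)  # no space before punctuation
    return result.strip()


if __name__ == "__main__":
    print(detoxify_pointwise("You're crazy, idiot!"))
    print(detoxify_pointwise("Shut off this damn service."))
```

A fixed word list cannot handle context or inflected forms, which is presumably why the real system relies on a language model for this step.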

The second approach is also based on the Transformer architecture, but it works differently: instead of editing individual words, it generates new text conditioned on the input. In other words, the language model writes a neutral version of the toxic phrase. A detoxification model based on the ruT5 language model was designed for an academic contest held as part of the Dialogue conference.

“The proposed methods and models can be used to mitigate reputational risks of a company (a chatbot that learned from online texts may produce a toxic response). Other applications are also possible. For example, the system may suggest a less toxic wording for the user’s message before sending the comment. This usage scenario does not impact the freedom of expression, but it can minimize the number of emotionally written negative comments,” says Alexander Panchenko, PhD, senior lecturer at Skoltech and head of the MTS-Skoltech joint laboratory.

Examples of language replacements offered by the detoxifier

You’re crazy, a**hole! → You’re crazy, author!
(Да ты обалдел, м****! → Да ты обалдел, автор!)

I’m f***ing tired of your markups. → I’m sick and tired of your markups.
(З****** со своим повышением цен. → Надоели со своим повышением цен.)

Why the f*** did you write that? → Why the hell did you write that?
(Какого х*** ты это написал? → Зачем ты это написал?)

Shut off this f***ing service. → Shut off this service.
(Отключите этот п******** сервис. → Отключите этот сервис.)

You can easily test the language detoxifier yourself: simply swear at our Telegram chatbot. To learn more about the methods and models behind this approach, read the article “Text detoxification methods for the Russian language” prepared by the MTS AI and Skoltech teams, or visit the website of the MTS-Skoltech joint AI lab.
