Unmasking hidden online hate: A new tool helps catch nasty comments—even when they’re disguised

November 28, 2024




People determined to spread toxic messages online have taken to masking their words to bypass automated moderation filters.

A user might replace letters with numbers or symbols, for example, writing "Y0u're st00pid" instead of "You're stupid."

Another tactic involves combining words, such as "IdiotFace." Doing this masks the harmful intent from systems that look for individual toxic words.

Similarly, harmful terms can be altered with spaces or additional characters, such as "h a t e " or "h@te," effectively slipping through keyword-based filters.

While the intent remains harmful, traditional moderation tools often overlook such messages. This leaves users—particularly vulnerable groups—exposed to their negative impact.
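To make the gap concrete, here is a minimal sketch of a keyword-based filter of the kind described above. The blocklist and the function name are hypothetical, chosen only for illustration; real moderation systems are considerably more elaborate.

```python
import re

# Hypothetical blocklist, for illustration only.
BLOCKLIST = {"stupid", "idiot", "hate"}

def naive_keyword_filter(message: str) -> bool:
    """Return True if any whole word in the message is on the blocklist."""
    words = re.findall(r"[a-z']+", message.lower())
    return any(word in BLOCKLIST for word in words)

examples = ["You're stupid", "Y0u're st00pid", "IdiotFace", "h a t e", "h@te"]
for text in examples:
    print(f"{text!r}: flagged={naive_keyword_filter(text)}")
```

Only the first, undisguised message is flagged. The other four carry the same intent but never present the filter with a word it recognizes.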

To address this, we have developed a novel pre-processing technique designed to help moderation tools more effectively handle the subtle complexities of hidden toxicity.

An intelligent assistant

Our tool works in conjunction with existing moderation. It acts as an intelligent assistant, preparing content for deeper and more accurate evaluation by restructuring and refining input text.
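One way to picture this arrangement is as a thin wrapper around a moderator that already exists. The sketch below is an assumption about the general shape, not the tool's actual interface: `with_preprocessing`, `existing_moderator` and `toy_preprocess` are hypothetical names, and both stand-ins are deliberately trivial.

```python
from typing import Callable

Preprocessor = Callable[[str], str]
Moderator = Callable[[str], bool]

def with_preprocessing(preprocess: Preprocessor, moderate: Moderator) -> Moderator:
    """Wrap an existing moderator so it sees cleaned text; the moderator is unchanged."""
    def wrapped(text: str) -> bool:
        return moderate(preprocess(text))
    return wrapped

# Toy stand-ins (hypothetical): a one-word moderator and a one-rule pre-processor.
def existing_moderator(text: str) -> bool:
    return "hate" in text.lower().split()

def toy_preprocess(text: str) -> str:
    return text.replace("@", "a")

assisted = with_preprocessing(toy_preprocess, existing_moderator)
print(existing_moderator("I h@te you"))  # False: the disguise slips through
print(assisted("I h@te you"))            # True: the same moderator, given cleaned text
```

The point of the design is that the moderator is left untouched: the pre-processing stage can be improved or replaced without retraining or rewriting whatever sits behind it.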

By addressing common tricks users employ to disguise harmful intent, it ensures moderation systems are more effective. The tool performs three key functions.

It first simplifies the text. Irrelevant elements, such as excessive punctuation or extraneous characters, are removed to make text straightforward and ready for evaluation.
It then standardizes what is written. Variations in spelling, phrasing and grammar are resolved. This includes interpreting deliberate misspellings ("h8te" for "hate").
Finally, it looks for patterns. Recurring strategies such as breaking up toxic words ("I d i o t"), or embedding them within benign phrases, are identified and normalized to reveal the underlying intent.
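The three functions above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the implementation behind the tool: the substitution table `CHAR_MAP`, the misspelling lookup `VARIANTS` and all function names are hypothetical, and a real system would need far larger tables and more careful rules.

```python
import re

# Hypothetical lookup tables, far smaller than a real system would need.
CHAR_MAP = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"})
VARIANTS = {"stoopid": "stupid", "h8te": "hate"}

def simplify(text: str) -> str:
    """Step 1: drop extraneous characters and collapse runs of whitespace."""
    kept = re.sub(r"[^A-Za-z0-9@$'\s]", "", text)
    return re.sub(r"\s+", " ", kept).strip()

def standardize(text: str) -> str:
    """Step 2: undo character substitutions and known misspellings, word by word."""
    words = []
    for word in text.lower().split(" "):
        if any(ch.isalpha() for ch in word):  # leave plain numbers such as "2024" alone
            word = word.translate(CHAR_MAP)
        words.append(VARIANTS.get(word, word))
    return " ".join(words)

def join_spaced_letters(text: str) -> str:
    """Step 3: rejoin words that were broken into single letters."""
    return re.sub(r"\b(?:[a-z] ){2,}[a-z]\b", lambda m: m.group(0).replace(" ", ""), text)

def preprocess(text: str) -> str:
    return join_spaced_letters(standardize(simplify(text)))

print(preprocess("Y0u're st00pid!!!"))  # you're stupid
print(preprocess("I d i o t"))          # idiot
print(preprocess("h@te"))               # hate
```

The order matters in this sketch: symbols such as "@" are kept during simplification because the standardization step still needs them. A rule this blunt will also make mistakes, for instance by joining a run of genuine single-letter words, which is one reason the output is handed to a moderation system rather than acted on directly.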

These steps can break apart compound words like "IdiotFace" or normalize modified phrases like "Y0u're st00pid." This makes harmful content visible to traditional filters.
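Breaking apart compound words can be approached in more than one way. The sketch below shows two common options, offered as assumptions rather than as the method used here: splitting at lower-to-upper case boundaries, and a dictionary-based split for all-lowercase compounds. The vocabulary `VOCAB` and the function names are hypothetical.

```python
import re
from typing import List, Optional, Set

# Hypothetical vocabulary; a real splitter would use a full word list.
VOCAB = {"idiot", "face", "you", "are"}

def split_camel_case(token: str) -> str:
    """Insert a space wherever a lowercase letter is followed by an uppercase one."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token)

def segment(token: str, vocab: Set[str]) -> Optional[List[str]]:
    """Split an all-lowercase token into vocabulary words, or return None."""
    best: List[Optional[List[str]]] = [None] * (len(token) + 1)
    best[0] = []  # the empty prefix is trivially segmentable
    for end in range(1, len(token) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and token[start:end] in vocab:
                best[end] = prefix + [token[start:end]]
                break
    return best[len(token)]

print(split_camel_case("IdiotFace"))  # Idiot Face
print(segment("idiotface", VOCAB))    # ['idiot', 'face']
print(segment("hello", VOCAB))        # None
```

Once the compound is split, each part is an ordinary word that a keyword-based filter can match.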

Importantly, our work is not about reinventing the wheel but ensuring the existing wheel functions as effectively as it should, even when faced with disguised toxic messages.

Catching subtle forms of toxicity

The applications of this tool extend across a wide range of online environments. For social media platforms, it enhances the ability to detect harmful messages, creating a safer space for users. This is particularly important for protecting younger audiences, who may be more vulnerable to online abuse.

By catching subtle forms of toxicity, the tool helps to prevent harmful behaviors like bullying from persisting unchecked.

Businesses can also use this technology to safeguard their online presence. Negative campaigns or covert attacks on brands often employ subtle and disguised messaging to avoid detection. By processing such content before it is moderated, the tool ensures that businesses can respond swiftly to any reputational threats.

Additionally, policymakers and organizations that monitor public discourse can benefit from this system. Hidden toxicity, particularly in polarized discussions, can undermine efforts to maintain constructive dialogue.

The tool provides a more robust way of identifying problematic content, helping to keep debates respectful and productive.

Better moderation

Our tool marks an important advance in content moderation. By addressing the limitations of traditional keyword-based filters, it offers a practical solution to the persistent issue of hidden toxicity.

Importantly, it demonstrates how small but focused improvements can make a big difference in creating safer and more inclusive online environments. As digital communication continues to evolve, tools like ours will play an increasingly vital role in protecting users and fostering positive interactions.

While this research addresses the challenges of detecting hidden toxicity within text, the journey is far from over.

Future advances will likely delve deeper into the complexities of context—analyzing how meaning shifts depending on conversational dynamics, cultural nuances and intent.

By building on this foundation, the next generation of content moderation systems could uncover not just what is being said but also the circumstances in which it is said, paving the way for safer and more inclusive online spaces.

Provided by The Conversation

This article is republished from The Conversation under a Creative Commons license.

Citation: Unmasking hidden online hate: A new tool helps catch nasty comments—even when they're disguised (2024, November 28) retrieved 28 November 2024 from https://techxplore.com/news/2024-11-unmasking-hidden-online-tool-nasty.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.


By cryptoadmin
