CRYPTOREPORTCLUB
Friday, July 4, 2025

Innovative detection method makes AI smarter by cleaning up bad data before it learns

June 13, 2025

June 12, 2025

Lisa Lock, scientific editor

Andrew Zinin, lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, trusted source, proofread.


In the world of machine learning and artificial intelligence, clean data is everything. Even a small number of mislabeled examples, known as label noise, can derail a model's performance, especially for models such as support vector machines (SVMs) that rely on a few key data points to make decisions.

SVMs are a widely used type of machine learning algorithm, applied in everything from image and speech recognition to medical diagnostics and text classification. These models operate by finding a boundary that best separates different categories of data. They rely on a small but crucial subset of the training data, known as support vectors, to determine this boundary. If these few examples are incorrectly labeled, the resulting decision boundaries can be flawed, leading to poor performance on real-world data.
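To make the role of support vectors concrete, here is a minimal sketch in Python that is not taken from the paper. It assumes a one-dimensional, perfectly separable toy dataset, where the maximum-margin boundary is simply the midpoint between the two closest points of opposite classes. The function name `max_margin_boundary_1d` and all data values are hypothetical.

```python
def max_margin_boundary_1d(points, labels):
    """Maximum-margin threshold for 1-D data with labels -1 / +1.

    Assumes every -1 point lies to the left of every +1 point.
    Returns the boundary and the two support vectors that define it.
    """
    neg = [x for x, y in zip(points, labels) if y == -1]
    pos = [x for x, y in zip(points, labels) if y == +1]
    left_sv, right_sv = max(neg), min(pos)
    if left_sv >= right_sv:
        raise ValueError("classes are not separable by a single threshold")
    return (left_sv + right_sv) / 2, (left_sv, right_sv)


points = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
clean_labels = [-1, -1, -1, +1, +1, +1]
noisy_labels = [-1, -1, -1, -1, +1, +1]  # the point 7.0 is mislabeled

clean_boundary, clean_svs = max_margin_boundary_1d(points, clean_labels)
noisy_boundary, noisy_svs = max_margin_boundary_1d(points, noisy_labels)

# Dropping a point far from the boundary changes nothing.
trimmed_boundary, _ = max_margin_boundary_1d(points[1:], clean_labels[1:])

print(clean_boundary, clean_svs)   # boundary set by 3.0 and 7.0
print(noisy_boundary, noisy_svs)   # one wrong label moves the boundary
print(trimmed_boundary)
```

In this toy case only the two points nearest the gap matter. Mislabeling one of them shifts the boundary from 5.0 to 7.5, while removing a distant point leaves it where it was.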

Now, a team of researchers from the Center for Connected Autonomy and Artificial Intelligence (CA-AI) within the College of Engineering and Computer Science at Florida Atlantic University and collaborators have developed an innovative method to automatically detect and remove faulty labels before a model is ever trained—making AI smarter, faster and more reliable.

Before the AI starts learning, the researchers clean the data using a mathematical technique that looks for unusual examples that do not fit with the rest. These "outliers" are removed or flagged, so the AI gets high-quality information from the start. The paper is published in IEEE Transactions on Neural Networks and Learning Systems.

"SVMs are among the most powerful and widely used classifiers in machine learning, with applications ranging from cancer detection to spam filtering," said Dimitris Pados, Ph.D., Schmidt Eminent Scholar Professor of Engineering and Computer Science in the FAU Department of Electrical Engineering and Computer Science, director of CA-AI and an FAU Sensing Institute (I-SENSE) faculty fellow.

"What makes them especially effective—but also uniquely vulnerable—is that they rely on just a small number of key data points, called support vectors, to draw the line between different classes. If even one of those points is mislabeled—for example, if a malignant tumor is incorrectly marked as benign—it can distort the model's entire understanding of the problem.

"The consequences of that could be serious, whether it's a missed cancer diagnosis or a security system that fails to flag a threat. Our work is about protecting models, any machine learning and AI model including SVMs, from these hidden dangers by identifying and removing those mislabeled cases before they can do harm."

The data-driven method that "cleans" the training dataset uses a mathematical approach called L1-norm principal component analysis. Unlike conventional methods, which often require manual parameter tuning or assumptions about the type of noise present, this technique identifies and removes suspicious data points within each class purely based on how well they fit with the rest of the group.

"Data points that appear to deviate significantly from the rest—often due to label errors—are flagged and removed," said Pados. "Unlike many existing techniques, this process requires no manual tuning or user intervention and can be applied to any AI model, making it both scalable and practical."

The process is robust, efficient and entirely touch-free—even handling the notoriously tricky task of rank selection (which determines how many dimensions to keep during analysis) without user input.
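The article does not give the algorithm's details, so the following Python sketch only illustrates the general idea of scoring the points within one class by how well they fit an L1-norm principal component. It relies on a known property of L1-norm PCA: for a data matrix X with N columns, the first L1 principal component is X b / ||X b||, where b is the vector of N signs (+1 or -1) that maximizes ||X b||. Several parts are assumptions made for illustration and are not the authors' method: the exhaustive search over signs (practical only for a handful of points), the rank fixed at one, data already centered at the origin, the hypothetical cutoff of three times the median residual, and all function names and data values.

```python
import itertools
import math
import statistics


def l1_principal_component(points):
    """First L1-norm principal component by exhaustive sign search (toy sizes only)."""
    dim = len(points[0])
    best_norm, best_vec = -1.0, None
    for signs in itertools.product((-1, 1), repeat=len(points)):
        # Signed sum of the points, i.e. X b for this sign vector b.
        combo = [sum(s * p[d] for s, p in zip(signs, points)) for d in range(dim)]
        norm = math.sqrt(sum(c * c for c in combo))
        if norm > best_norm:
            best_norm, best_vec = norm, combo
    return [c / best_norm for c in best_vec]


def residual(point, q):
    """Distance from a point to the line spanned by the unit vector q."""
    proj = sum(a * b for a, b in zip(point, q))
    return math.sqrt(sum((a - proj * b) ** 2 for a, b in zip(point, q)))


def flag_outliers(points, factor=3.0):
    """Flag points whose residual exceeds factor * median residual (hypothetical rule)."""
    q = l1_principal_component(points)
    scores = [residual(p, q) for p in points]
    cutoff = factor * statistics.median(scores)
    return [i for i, s in enumerate(scores) if s > cutoff], scores


# One class of 2-D points lying close to the line y = x, plus one point
# (index 7) that does not fit, as a mislabeled example might not.
class_points = [
    (-3.0, -3.1), (-2.0, -1.9), (-1.0, -1.0), (0.0, 0.1),
    (1.0, 0.9), (2.0, 2.1), (3.0, 3.0), (2.0, -2.0),
]

flagged, scores = flag_outliers(class_points)
print(flagged)
```

In a full pipeline this scoring would be repeated for each class separately, and the flagged points would be dropped before the classifier is trained. The method described in the article is reported to need no user-chosen threshold or rank, which this sketch does not reproduce.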

Researchers extensively tested their technique on real and synthetic datasets with various levels of label contamination. Across the board, it produced consistent and notable improvements in classification accuracy, demonstrating its potential as a standard pre-processing step in the development of high-performance machine learning systems.
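The article does not say how the synthetic contamination was generated. One common way to simulate label noise, shown here as an assumption rather than the researchers' protocol, is to flip a fixed fraction of binary labels at random. The function name `flip_labels`, the seed, and the contamination rates below are hypothetical.

```python
import random


def flip_labels(labels, rate, seed=0):
    """Return a copy of -1/+1 labels with a fraction `rate` of them flipped."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    n_flip = round(rate * len(labels))
    flipped = set(rng.sample(range(len(labels)), n_flip))
    noisy = [-y if i in flipped else y for i, y in enumerate(labels)]
    return noisy, sorted(flipped)


clean = [-1] * 10 + [+1] * 10

for rate in (0.0, 0.1, 0.2):
    noisy, flipped_idx = flip_labels(clean, rate)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(f"rate={rate:.1f}: {changed} labels flipped at indices {flipped_idx}")
```

Training the same classifier on each noisy copy, with and without a cleaning step, gives a simple way to compare accuracy at different contamination levels.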

"What makes our approach particularly compelling is its flexibility," said Pados. "It can be used as a plug-and-play preprocessing step for any AI system, regardless of the task or dataset. And it's not just theoretical—extensive testing on both noisy and clean datasets, including well-known benchmarks like the Wisconsin Breast Cancer dataset, showed consistent improvements in classification accuracy.

"Even in cases where the original training data appeared flawless, our new method still enhanced performance, suggesting that subtle, hidden label noise may be more common than previously thought."

Looking ahead, the research opens the door to broader applications. The team is interested in exploring how this mathematical framework might be extended to tackle deeper issues in data science, such as reducing data bias and improving the completeness of datasets.

"As machine learning becomes deeply integrated into high-stakes domains like health care, finance and the justice system, the integrity of the data driving these models has never been more important," said Stella Batalama, Ph.D., dean of the FAU College of Engineering and Computer Science.

"We're asking algorithms to make decisions that impact real lives—diagnosing diseases, evaluating loan applications, even informing legal judgments. If the training data is flawed, the consequences can be devastating. That's why innovations like this are so critical.

"By improving data quality at the source—before the model is even trained—we're not just making AI more accurate; we're making it more responsible. This work represents a meaningful step toward building AI systems we can trust to perform fairly, reliably and ethically in the real world."

More information: Shruti Shukla et al, Training Dataset Curation by L1-Norm Principal-Component Analysis for Support Vector Machines, IEEE Transactions on Neural Networks and Learning Systems (2025). DOI: 10.1109/TNNLS.2025.3568694

Provided by Florida Atlantic University

Citation: Innovative detection method makes AI smarter by cleaning up bad data before it learns (2025, June 12) retrieved 12 June 2025 from https://techxplore.com/news/2025-06-method-ai-smarter-bad.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
