CRYPTOREPORTCLUB
Tuesday, October 14, 2025

Multimodal AI learns to weigh text and images more evenly

October 14, 2025
Lisa Lock, scientific editor

Robert Egan, associate editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, preprint, trusted source, proofread.

Multimodal AI that understands text and images the way humans do
MIDAS trains a multimodal model on both aligned and misaligned samples with conflicting semantics simultaneously. Credit: arXiv (2025). DOI: 10.48550/arxiv.2509.25831

Just as human eyes tend to focus on pictures before reading the accompanying text, multimodal artificial intelligence (AI), which processes several types of data at once, tends to depend more heavily on some of those data types than on others. KAIST researchers have now developed a training technique that pushes multimodal models to draw on text and images more evenly, which the team reports leads to more accurate predictions.

A research team led by Professor Steven Euijong Whang from the School of Electrical Engineering has developed a data augmentation method that enables multimodal AI systems to make balanced use of all their input data. The findings are posted to the arXiv preprint server.

Multimodal AI combines various forms of information, such as text and video, to make judgments. However, such models often rely too heavily on one particular type of data, which degrades prediction performance.

To address this problem, the research team deliberately trained AI models on mismatched data pairs, in which the modalities carry conflicting semantics, alongside ordinary aligned pairs. By doing so, the model learned to rely on all modalities (text, images, and even audio) in a balanced way, regardless of context.
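The idea of training on mismatched pairs can be illustrated with a small sketch. It is not the authors' implementation: the function name `make_misaligned_pairs`, the tuple layout, and the rule of borrowing an image from a sample with a different label are all assumptions made here for illustration. The article does not say how such pairs are constructed or labeled.

```python
import random


def make_misaligned_pairs(samples, seed=0):
    """Pair each sample's text with an image taken from a different class.

    `samples` is a list of (text, image, label) tuples (hypothetical layout).
    Returns (text, image, text_label, image_label) tuples whose two labels
    disagree, so the two modalities carry conflicting semantics.
    """
    rng = random.Random(seed)
    misaligned = []
    for text, _image, label in samples:
        # Candidate donors: samples whose label conflicts with this text's label.
        donors = [s for s in samples if s[2] != label]
        if not donors:
            continue  # only one class present, so no conflicting pair exists
        _donor_text, donor_image, donor_label = rng.choice(donors)
        misaligned.append((text, donor_image, label, donor_label))
    return misaligned


# Toy aligned data: text and image describe the same class.
aligned = [
    ("a photo of a cat", "cat.jpg", "cat"),
    ("a photo of a dog", "dog.jpg", "dog"),
    ("a photo of a bird", "bird.jpg", "bird"),
]

misaligned = make_misaligned_pairs(aligned)

# Train on both kinds of sample together: aligned pairs keep one shared label,
# misaligned pairs keep a separate label per modality.
training_set = [(t, i, y, y) for t, i, y in aligned] + misaligned
```

Keeping a separate label for each modality in the mismatched pairs is one simple design choice: a model that ignores the image (or the text) can no longer score well on those samples.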

The team further improved performance stability by adding a training strategy that compensates for low-quality data while emphasizing more challenging examples. Because the method is not tied to any specific model architecture, it can be applied to a variety of models and data types.
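One way to picture a strategy that compensates for a weaker input while emphasizing harder examples is a per-sample weighted loss. The sketch below is an assumption-laden illustration, not the formulation in the paper: the function `weighted_sample_loss`, the confidence-based modality weights, and the focal-style `gamma` exponent are all hypothetical choices.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class, clipped for stability."""
    return -math.log(max(probs[target], 1e-12))


def weighted_sample_loss(text_probs, image_probs, target, gamma=2.0):
    """Combine two unimodal losses with illustrative (assumed) weights.

    - Modality weights: the less confident modality gets the larger share,
      so the weaker input is not drowned out by the stronger one.
    - Hardness factor: a focal-style term that shrinks the loss of samples
      both modalities already classify confidently.
    """
    p_text = text_probs[target]
    p_image = image_probs[target]

    total = (1.0 - p_text) + (1.0 - p_image)
    if total == 0.0:
        w_text = w_image = 0.5  # both modalities are fully confident
    else:
        w_text = (1.0 - p_text) / total
        w_image = (1.0 - p_image) / total

    hardness = (1.0 - 0.5 * (p_text + p_image)) ** gamma
    loss = w_text * cross_entropy(text_probs, target) \
        + w_image * cross_entropy(image_probs, target)
    return {"loss": hardness * loss, "w_text": w_text, "w_image": w_image}


# An easy sample (both modalities confident) and a hard one (both unsure).
easy = weighted_sample_loss([0.9, 0.1], [0.9, 0.1], target=0)
hard = weighted_sample_loss([0.3, 0.7], [0.4, 0.6], target=0)

# A sample where the image branch is the weaker modality.
uneven = weighted_sample_loss([0.9, 0.1], [0.4, 0.6], target=0)
```

In this sketch the hard sample contributes a much larger loss than the easy one, and in the uneven sample the image branch receives the larger weight.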

Professor Whang explained, "Improving AI performance is not just about changing model architectures or algorithms—it's much more important how we design and use the data for training. This research demonstrates that designing and refining the data itself can be an effective approach to help multimodal AI utilize information more evenly, without becoming biased toward a specific modality such as images or text."

The study was co-led by doctoral student Seong-Hyeon Hwang and master's student Soyoung Choi, with Professor Steven Euijong Whang serving as the corresponding author. The results will be presented at the Conference on Neural Information Processing Systems (NeurIPS 2025), which will be held this December in San Diego, U.S., and Mexico City, Mexico.

More information: Seong-Hyeon Hwang et al, MIDAS: Misalignment-based Data Augmentation Strategy for Imbalanced Multimodal Learning, arXiv (2025). DOI: 10.48550/arxiv.2509.25831

Journal information: arXiv

Provided by The Korea Advanced Institute of Science and Technology (KAIST)

Citation: Multimodal AI learns to weigh text and images more evenly (2025, October 14) retrieved 14 October 2025 from https://techxplore.com/news/2025-10-multimodal-ai-text-images-evenly.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
