AI vision language models provide video descriptions for blind users

June 30, 2025

Lisa Lock, scientific editor
Andrew Zinin, lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, trusted source, proofread.

Blind and low-vision people request descriptions of videos on YouDescribe, but only 7% are completed. AI is speeding up the process. Credit: Matthew Modoono/Northeastern University

For people who are blind or have low vision, the audio descriptions of action in movies and TV shows are essential to understanding what is happening. Networks and streaming services hire professionals to create audio descriptions, but that's not the case for billions of YouTube and TikTok videos.

That doesn't mean people don't want access to the content.

Using AI vision language models (VLMs), researchers at Northeastern University are making audio descriptions available for user-generated videos as part of a crowdsourced platform called YouDescribe. The platform works like a library: blind and low-vision users can request descriptions for videos, then rate and contribute to them.

"It's understandable that a 20-second video on TikTok of somebody dancing may not get a professional description," says Lana Do, who received her master's in computer science from Northeastern's Silicon Valley campus in May. "But blind and low-vision people might like to see that dancing video too."

In fact, a 2020 video of the South Korean boy band BTS's song "Dynamite" is at the top of YouDescribe's wishlist, waiting to be described. The platform has 3,000 volunteer describers, but the wishlist is so long that they can't keep up. Only 7% of requested videos on the wishlist have audio descriptions, Do says.

Do works in the lab of Ilmi Yoon, teaching professor of computer science on the Silicon Valley campus. Yoon joined YouDescribe's team in 2018 to develop the platform's machine learning elements.

This year, Do added new features to speed up YouDescribe's human-in-the-loop workflow. New VLM technology provides higher-quality descriptions, and a new infobot tool will allow users to ask for more information about a specific video frame. Low-vision users can even correct mistakes in the descriptions with a collaborative editing interface, Do says.

The result will be better video descriptions that become available more quickly. AI-generated drafts ease the burden on human describers, and users can easily engage in the process through ratings and comments, she says.

"They could say that they were watching a documentary set in a forest and they heard a flapping sound that wasn't described," Do says, "and they wondered what it was."

Do and her colleagues recently presented a paper at the Symposium on Human-Computer Interaction for Work in Amsterdam about the potential for AI to accelerate the development of audio descriptions. AI does a surprisingly good job, says Yoon, at describing human expressions and movements. In one example video, an AI agent describes the steps that a chef takes while making cheese rolls.

But there are some consistent weaknesses, she says. AI isn't as good at reading facial expressions in cartoons. And overall, humans are better at picking up on the most important details in a scene—a key skill in creating a helpful description.

"It's very labor-intensive," Yoon says.

Graduate students in her lab compare the AI first drafts to what human describers create.

"Then we measure the gaps so we can train the AI to do a better job," she says. "Blind users don't want to get distracted with too much verbal description. It's an editorial art to verbalize the most important information in a concise way."
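The article does not say how the lab measures those gaps. As a rough illustration only, the sketch below compares an AI draft with a human description at the word level using Python's standard `difflib`. The function name `description_gap` and the choice of word overlap as the metric are assumptions made for this example, not the researchers' method.

```python
from difflib import SequenceMatcher


def description_gap(ai_draft: str, human_text: str) -> dict:
    """Compare an AI draft with a human description, word by word.

    Hypothetical helper for illustration; not YouDescribe's actual metric.
    """
    ai_words = ai_draft.lower().split()
    human_words = human_text.lower().split()
    matcher = SequenceMatcher(a=ai_words, b=human_words, autojunk=False)

    missing = []  # words the human describer used that the draft lacks
    extra = []    # words in the draft that the human describer left out
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            extra.extend(ai_words[i1:i2])
        if tag in ("replace", "insert"):
            missing.extend(human_words[j1:j2])

    return {
        "similarity": matcher.ratio(),  # 1.0 means identical word sequences
        "missing": missing,
        "extra": extra,
        # Values above 1.0 mean the draft is wordier than the human version.
        "length_ratio": len(ai_words) / max(len(human_words), 1),
    }


if __name__ == "__main__":
    gap = description_gap(
        "a bird flaps its wings in the forest",
        "a bird flaps its wings",
    )
    print(gap["extra"], gap["length_ratio"])
```

A length ratio well above 1.0 would flag a draft that risks the "too much verbal description" problem Yoon mentions, while the `missing` list hints at details a human considered important that the draft skipped.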

YouDescribe was launched in 2013 by the San Francisco-based Smith-Kettlewell Eye Research Institute to train sighted volunteers in the creation of audio descriptions. With a focus on YouTube and TikTok videos, the platform offers tutorials for recording and timing narration that make user-generated video content accessible.

Provided by Northeastern University

This story is republished courtesy of Northeastern Global News (news.northeastern.edu).

Citation: AI vision language models provide video descriptions for blind users (2025, June 30) retrieved 30 June 2025 from https://techxplore.com/news/2025-06-ai-vision-language-video-descriptions.html This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
