CRYPTOREPORTCLUB
Saturday, September 13, 2025

Using AI to turn sound recordings into accurate street images

November 27, 2024

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility:

fact-checked

trusted source

proofread

Researchers use AI to turn sound recordings into accurate street images
Credit: University of Texas at Austin

Using generative artificial intelligence, a team of researchers at The University of Texas at Austin has converted sounds from audio recordings into street-view images. The visual accuracy of these generated images demonstrates that machines can replicate the human connection between audio and visual perception of environments.

In a paper published in Computers, Environment and Urban Systems, the research team describes training a soundscape-to-image AI model using audio and visual data gathered from a variety of urban and rural streetscapes and then using that model to generate images from audio recordings.

"Our study found that acoustic environments contain enough visual cues to generate highly recognizable streetscape images that accurately depict different places," said Yuhao Kang, assistant professor of geography and the environment at UT and co-author of the study. "This means we can convert the acoustic environments into vivid visual representations, effectively translating sounds into sights."

Using YouTube video and audio from cities in North America, Asia and Europe, the team created pairs of 10-second audio clips and image stills from the various locations and used them to train an AI model that could produce high-resolution images from audio input. They then compared AI sound-to-image creations made from 100 audio clips to their respective real-world photos, using both human and computer evaluations.
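
The pairing step described above can be pictured with a short Python sketch. It is an illustration only, not the researchers' pipeline: the article states the 10-second clip length, but the back-to-back windows, the choice of the midpoint frame as the still, and the names `AudioImagePair` and `make_pairs` are all assumptions made for the example.

```python
from dataclasses import dataclass

CLIP_SECONDS = 10.0  # clip length stated in the article


@dataclass(frozen=True)
class AudioImagePair:
    """One training example: an audio window plus the time of its paired still."""
    audio_start: float  # seconds from the start of the video
    audio_end: float
    still_time: float   # timestamp of the frame used as the image (assumed: midpoint)


def make_pairs(video_seconds: float,
               clip_seconds: float = CLIP_SECONDS) -> list[AudioImagePair]:
    """Split a video into back-to-back clips and pair each with its midpoint frame.

    Any leftover audio shorter than one full clip is discarded.
    """
    pairs = []
    start = 0.0
    while start + clip_seconds <= video_seconds:
        end = start + clip_seconds
        pairs.append(AudioImagePair(start, end, (start + end) / 2))
        start = end
    return pairs


if __name__ == "__main__":
    for pair in make_pairs(35.0):
        print(pair)
```

For a 35-second video this yields three pairs and drops the final five seconds, since they do not fill a whole clip.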

Computer evaluations compared the relative proportions of greenery, building and sky between source and generated images, whereas human judges were asked to correctly match one of three generated images to an audio sample.
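
As a rough sketch of what such a computer evaluation might involve, the Python below computes the share of sky, greenery and building pixels in each image and then correlates those shares between real and generated images. The article does not say how pixels were labeled or which correlation measure was used, so the per-pixel label lists, the Pearson coefficient, and the function names are assumptions for illustration.

```python
from collections import Counter
from math import sqrt

CLASSES = ("sky", "greenery", "building")  # the three classes named in the article


def class_proportions(labels):
    """Return the fraction of pixels assigned to each class of interest.

    `labels` is a non-empty flat list of per-pixel class names, for example
    the output of a semantic-segmentation model (hypothetical input format).
    """
    counts = Counter(labels)
    total = len(labels)
    return {c: counts[c] / total for c in CLASSES}


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def proportion_correlations(real_images, generated_images):
    """Correlate per-class pixel shares across matched real/generated images."""
    real = [class_proportions(img) for img in real_images]
    gen = [class_proportions(img) for img in generated_images]
    return {c: pearson([r[c] for r in real], [g[c] for g in gen])
            for c in CLASSES}
```

A correlation near 1 for a class would mean that images generated from audio contain roughly as much of that class as the real scenes they correspond to.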

The results showed strong correlations in the proportions of sky and greenery between generated and real-world images and a slightly weaker correlation in building proportions. Human participants averaged 80% accuracy in selecting the generated images that corresponded to source audio samples.
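
To put the 80% figure in context, the sketch below scores a three-way matching task and compares it with the guessing baseline. It assumes exactly one of the three candidate images is correct in each trial, which would make chance performance one in three; the trial data and the function name `matching_accuracy` are invented for the example and are not taken from the study.

```python
N_CANDIDATES = 3  # each audio sample was shown with three generated images


def matching_accuracy(choices, answers):
    """Fraction of trials in which the chosen image index equals the correct one."""
    if not choices or len(choices) != len(answers):
        raise ValueError("choices and answers must be non-empty and equal in length")
    correct = sum(c == a for c, a in zip(choices, answers))
    return correct / len(choices)


# Accuracy expected from random guessing, assuming one correct image per trial.
chance_level = 1 / N_CANDIDATES

if __name__ == "__main__":
    # Made-up responses from one hypothetical participant over five trials.
    choices = [0, 1, 2, 1, 0]
    answers = [0, 1, 2, 2, 0]
    accuracy = matching_accuracy(choices, answers)
    print(f"accuracy: {accuracy:.0%}, chance: {chance_level:.0%}")
```

Under that assumption, an average of 80% is well above the roughly 33% that guessing alone would produce.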

"Traditionally, the ability to envision a scene from sounds is a uniquely human capability, reflecting our deep sensory connection with the environment. Our use of advanced AI techniques supported by large language models (LLMs) demonstrates that machines have the potential to approximate this human sensory experience," Kang said.

"This suggests that AI can extend beyond mere recognition of physical surroundings to potentially enrich our understanding of human subjective experiences at different places."

In addition to approximating the proportions of sky, greenery and buildings, the generated images often maintained the architectural styles and distances between objects of their real-world counterparts, and they accurately reflected whether soundscapes were recorded in sunny, cloudy or nighttime lighting conditions.

The authors note that lighting information might come from variations in activity in the soundscapes. For example, traffic sounds or the chirping of nocturnal insects could reveal time of day. Such observations further the understanding of how multisensory factors contribute to our experience of a place.

"When you close your eyes and listen, the sounds around you paint pictures in your mind," Kang said. "For instance, the distant hum of traffic becomes a bustling cityscape, while the gentle rustle of leaves ushers you into a serene forest. Each sound weaves a vivid tapestry of scenes, as if by magic, in the theater of your imagination."

Kang's work focuses on using geospatial AI to study the interaction of humans with their environments. In another recent paper published in Humanities and Social Sciences Communications, he and his co-authors examined the potential of AI to capture the characteristics that give cities their unique identities.

More information: Yonggai Zhuang et al, From hearing to seeing: Linking auditory and visual place perceptions with soundscape-to-image generative artificial intelligence, Computers, Environment and Urban Systems (2024). DOI: 10.1016/j.compenvurbsys.2024.102122

Kee Moon Jang et al, Place identity: a generative AI's perspective, Humanities and Social Sciences Communications (2024). DOI: 10.1057/s41599-024-03645-7

Provided by University of Texas at Austin

Citation: Using AI to turn sound recordings into accurate street images (2024, November 27) retrieved 27 November 2024 from https://techxplore.com/news/2024-11-ai-accurate-street-images.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
