Beating the AI bottleneck: Communications innovation could markedly improve AI training process

July 11, 2025
Gaby Clark, scientific editor
Andrew Zinin, lead editor

Editors' notes: This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, trusted source, proofread.

ZEN System Overview. Credit: Zhuang Wang et al.

Artificial intelligence (AI) is infamous for its resource-heavy training, but a new study may have found a solution in a novel communications system, called ZEN, that markedly improves the way large language models (LLMs) train.

The research team at Rice University was led by doctoral graduate Zhuang Wang and computer science professor T.S. Eugene Ng, with contributions from two other computer science faculty members, assistant professor Yuke Wang and professor Anshumali Shrivastava. Zhaozhuo Xu of the Stevens Institute of Technology and Jingyi Xi of Zhejiang University also contributed to the project.

Distributed training, sparsity and communication

Wang said there are two phases where LLMs can bottleneck during the distributed training process: computation and communication.

The first occurs when the model needs to crunch through a large amount of data, which can bog down the system and consume time and computing power. Splitting the data among hundreds, sometimes thousands, of graphics processing units (GPUs) helps manage that problem: each GPU processes its own share of the data samples separately, and the results are then fed back into the model.

The second bottleneck happens when all those GPUs need to sync up so they can "talk" to the model and convey what they've learned. They need to efficiently communicate with one another to complete each training run smoothly and can slow down if the model gradients they have to sync are very large, which they often are.
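
The two phases Wang describes can be sketched in a few lines. The worker count, tensor sizes, and the toy local_gradient function below are illustrative assumptions for this sketch, not details from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(params, batch):
    """Stand-in for backprop: each worker computes a gradient on its own shard.
    Toy quadratic loss, so the gradient is just params - mean(batch)."""
    return params - batch.mean(axis=0)

# Hypothetical setup: 4 workers, each holding its own shard of the batch
params = rng.normal(size=1000)
shards = [rng.normal(size=(64, 1000)) for _ in range(4)]

# Computation phase: every worker crunches its shard independently
grads = [local_gradient(params, shard) for shard in shards]

# Communication phase: all gradients must be synchronized (here, averaged,
# as in an all-reduce) before any worker can take the next optimizer step
avg_grad = np.mean(grads, axis=0)
params -= 0.1 * avg_grad
```

The communication step is the choke point the article describes: no worker can proceed until every gradient has been exchanged, so its cost grows with the size of the gradients being synchronized.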

"The previous solution was to send all the data out. But in practice, we observe that the data has a lot of zero values in the 'talk,'" Wang said. "We need a data structure to represent the communication information correctly."

Removing those zero or near-zero values and leaving only the relevant ones to be synchronized during communication is called "sparsification." The values that are left are aptly named "sparse tensors." It's common practice in LLM training and can save the system the effort of communicating billions of extra gradients. But it still leaves the communication bottleneck, which is where the team focused its research.
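
Sparsification as described here amounts to sending (index, value) pairs instead of the full dense tensor. A minimal sketch, with the tensor size and zero fraction chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# A toy gradient tensor: most entries are zero, as Wang observed in practice
grad = rng.normal(size=10_000)
grad[rng.random(10_000) < 0.99] = 0.0  # zero out ~99% of entries

# Sparsification: keep only the nonzero values plus their positions
indices = np.flatnonzero(grad)
values = grad[indices]

# The pair (indices, values) is what gets communicated
dense_cost = grad.size          # entries sent without sparsification
sparse_cost = 2 * indices.size  # one index + one value per survivor
print(f"compression: {dense_cost / sparse_cost:.1f}x")

# The receiver reconstructs the dense gradient exactly
rebuilt = np.zeros_like(grad)
rebuilt[indices] = values
assert np.array_equal(rebuilt, grad)
```

Since each surviving entry costs an index as well as a value, this pays off only when the tensor is sufficiently sparse, which is why the structure of the sparsity matters so much.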

"There's actually not a lot of fundamental understanding of how to support these sparse tensors inside of distributed training," Ng said. "People propose the idea, but they don't understand what the optimal way of handling them is. One of the contributions of our work is to analyze these sparse tensors to understand how they behave."

Mapping the system, finding the structure

There were essentially three parts to this research: Part one was figuring out the characteristics of sparse tensors in popular models. The nonzero gradients left after sparsification aren't uniformly distributed; their location and tensor density depend on factors like the training model and dataset used.

That scattering of nonzero gradients leads to an imbalance during the communication phase that slows down synchronization and, by extension, slows down the training process. This new understanding sheds light on how to design better communication schemes to use with sparse tensors.
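
One way to see the imbalance: if synchronization splits the tensor into equal index ranges, one per worker, clustered nonzeros overload some ranges while leaving others nearly idle. The beta-distributed positions and equal-range partitioning below are illustrative assumptions for this sketch, not the paper's actual scheme:

```python
import numpy as np

rng = np.random.default_rng(2)

# Nonzero gradients cluster in some regions of the tensor rather than
# spreading uniformly; model that with a skewed position distribution
positions = (rng.beta(2, 8, size=5_000) * 1_000_000).astype(int)

# Suppose synchronization partitions the tensor into equal index ranges,
# one per worker; count the nonzeros landing in each worker's range
workers = 8
counts, _ = np.histogram(positions, bins=workers, range=(0, 1_000_000))

# The step finishes only when the busiest worker does, so the ratio of the
# heaviest load to the average is a rough proxy for time lost to imbalance
imbalance = counts.max() / counts.mean()
print(counts, f"imbalance: {imbalance:.2f}x")
```

With a perfectly uniform spread the ratio would be close to 1; the skewed distribution pushes it well above that, which is the slowdown a sparsity-aware communication scheme must avoid.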

Once they knew how to approach their design, part two was figuring out the optimal communication schemes to use. Wang and Ng analyzed several options to determine what those were.

Because there was no optimal solution before this research, the third and final step was building a real-world system based on their analysis and applying it to practical LLM training to see if it worked. ZEN was that system, and it markedly improved training speed when used on real-world LLMs.

"What we basically show is that we can accelerate the time to completion of the training because the communication is more efficient. … The time it takes to perform one step in the training is much faster," Ng said.

Since sparse tensors are used often and the field of LLM training is so broad, this discovery can be applied to just about any model with, as Ng phrased it, "the characteristics of sparsity." Be it text or image generation, ZEN can speed up model training if sparse tensors are present.

Wang isn't new to this area of research. He and Ng previously collaborated on a project to minimize the failure recovery overhead of LLMs after a hardware or software failure during training, which they named GEMINI—unveiled at the ACM Symposium on Operating Systems Principles in 2023.

Wang recently presented his paper on this newer research, entitled "ZEN: Empowering Distributed Training with Sparsity-driven Data Synchronization," at the 19th USENIX Symposium on Operating Systems Design and Implementation (OSDI) held in Boston.

More information: ZEN: Empowering Distributed Training with Sparsity-driven Data Synchronization, www.usenix.org/conference/osdi … entation/wang-zhuang

Provided by Rice University

Citation: Beating the AI bottleneck: Communications innovation could markedly improve AI training process (2025, July 11) retrieved 11 July 2025 from https://techxplore.com/news/2025-07-ai-bottleneck-communications.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.




Disclaimer: Information found on cryptoreportclub.com reflects the views of the writers quoted. It does not represent the opinion of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use the provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin, bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
