Scalable transformer accelerator enables on-device execution of large language models


July 21, 2025


Differences in processes with and without hardware accelerators. Credit: Electronics (2024). DOI: 10.3390/electronics13234683

Large language models (LLMs) like BERT and GPT are driving major advances in artificial intelligence, but their size and complexity typically require powerful servers and cloud infrastructure. Running these models directly on devices—without relying on external computation—has remained a difficult technical challenge.

A research team at Sejong University has developed a new hardware solution that may help change that. The work is published in the journal Electronics.

Their Scalable Transformer Accelerator Unit (STAU) is designed to execute various transformer-based language models efficiently on embedded systems. It adapts dynamically to different input sizes and model structures, making it especially well-suited for real-time on-device AI.

At the heart of the STAU is a Variable Systolic Array (VSA) architecture, which performs matrix operations—the core workload in transformer models—in a way that scales with the input sequence length. By feeding input data row by row and loading weights in parallel, the system reduces memory stalls and improves throughput. This is particularly important for LLMs, where sentence lengths and token sequences vary widely between tasks.
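
To make the row-streaming idea concrete, here is a minimal Python sketch of a matrix multiply whose work scales with however many input rows arrive, with the weights loaded once up front. The function names, shapes, and scheduling are illustrative assumptions, not the actual STAU microarchitecture.

# Illustrative sketch of row-by-row streaming through a pre-loaded weight matrix.
# Names, shapes, and the scheduling model are assumptions, not the STAU design.
import numpy as np

def streamed_matmul(x, w):
    """Multiply x (seq_len x d_in) by w (d_in x d_out), feeding x one row at a time.

    Weights are loaded once; each input row is processed as it arrives, so the
    amount of work scales with the sequence length rather than a fixed size.
    """
    seq_len, _ = x.shape
    out = np.empty((seq_len, w.shape[1]), dtype=x.dtype)
    for t in range(seq_len):      # one row of activations per step
        out[t] = x[t] @ w         # all output columns produced in parallel
    return out

# The same routine handles short and long token sequences unchanged.
w = np.random.rand(64, 64).astype(np.float32)
short_out = streamed_matmul(np.random.rand(8, 64).astype(np.float32), w)
long_out = streamed_matmul(np.random.rand(512, 64).astype(np.float32), w)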

In benchmark tests published in Electronics, the accelerator demonstrated a 3.45× speedup over CPU-only execution while maintaining over 97% numerical accuracy. It also reduced total computation time by more than 68% when processing longer sequences.

Since then, continued optimizations have further improved the system's performance: according to the team, recent internal tests achieved a speedup of up to 5.18×, highlighting the architecture's long-term scalability.

  • Top module architecture. Credit: Electronics (2024). DOI: 10.3390/electronics13234683
  • Processing Element (PE) and Variable Systolic Array (VSA) architecture. Credit: Electronics (2024). DOI: 10.3390/electronics13234683

The researchers also re-engineered a critical part of the transformer pipeline: the softmax function. Typically a bottleneck due to its reliance on exponentiation and normalization, it was redesigned using a lightweight Radix-2 approach that relies on shift-and-add operations. This reduces the hardware complexity without compromising output quality.
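
One way such a radix-2 scheme can be pictured is sketched below: each exponential is replaced by a power of two on an integer-rounded exponent, so every term reduces to a shifted fixed-point constant and the normalization to a sum of those shifts. The scaling and rounding choices here are assumptions for illustration, not the published hardware design.

# Illustrative sketch of a radix-2 softmax: e**x approximated by 2**k on
# integer-rounded exponents, so each term is a shift of a fixed-point "1".
# Scaling and rounding choices are assumptions, not the published design.
import numpy as np

LOG2E = np.log2(np.e)   # rescale so that 2**(x * log2(e)) approximates e**x

def radix2_softmax(x, frac_bits=8):
    k = np.round((x - np.max(x)) * LOG2E).astype(int)   # integer exponents, all <= 0
    one = 1 << frac_bits                                # fixed-point representation of 1.0
    terms = [(one >> -e) if -e <= frac_bits else 0 for e in k.tolist()]
    terms = np.array(terms, dtype=np.int64)
    return terms / terms.sum()                          # final divide kept in float here

scores = np.array([2.1, 0.3, -1.7, 4.0])
ref = np.exp(scores - scores.max()); ref /= ref.sum()
print(radix2_softmax(scores))   # roughly tracks the reference softmax below
print(ref)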

To further simplify computation, the system uses a custom 16-bit floating-point format specifically tailored for transformer workloads. This format eliminates the need for layer normalization—another common performance bottleneck—and contributes to a more efficient, streamlined datapath.
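
As a rough picture of what a custom 16-bit floating-point format involves, the sketch below packs and unpacks values with a parameterized sign/exponent/mantissa split. The 1/6/9 layout and the omission of subnormals and special values are assumptions made for brevity; the paper's actual format is not reproduced here.

# Illustrative 16-bit float encode/decode with a parameterized bit split.
# The 1/6/9 layout is an assumption; subnormals, inf and NaN are ignored.
import math

def encode_custom16(value, exp_bits=6, man_bits=9):
    assert 1 + exp_bits + man_bits == 16
    bias = (1 << (exp_bits - 1)) - 1
    sign = 1 if value < 0 else 0
    v = abs(float(value))
    if v == 0.0:
        return sign << 15
    m, e = math.frexp(v)                      # v = m * 2**e, with m in [0.5, 1)
    exp = e - 1 + bias                        # exponent for the normalized 1.m form
    frac = int(round((m * 2 - 1) * (1 << man_bits)))
    if frac == (1 << man_bits):               # mantissa rounded up to 2.0: carry
        frac = 0
        exp += 1
    exp = max(0, min((1 << exp_bits) - 1, exp))
    return (sign << 15) | (exp << man_bits) | frac

def decode_custom16(bits, exp_bits=6, man_bits=9):
    bias = (1 << (exp_bits - 1)) - 1
    sign = -1.0 if (bits >> 15) & 1 else 1.0
    exp = (bits >> man_bits) & ((1 << exp_bits) - 1)
    frac = bits & ((1 << man_bits) - 1)
    return sign * (1.0 + frac / (1 << man_bits)) * 2.0 ** (exp - bias)

print(decode_custom16(encode_custom16(3.14159)))   # about 3.1406, ~0.03% relative error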

STAU was implemented on a Xilinx FPGA (VMK180) and controlled by an embedded Arm Cortex-R5 processor. This hybrid design allows developers to support a range of transformer models—including those used in LLMs—by simply updating software running on the processor, with no hardware modifications required.
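
A minimal sketch of that software-defined flexibility, assuming a hypothetical driver interface, is shown below: the accelerator stays fixed while a per-model configuration chosen in host code decides how layers are scheduled. None of the field names or calls reflect the actual Cortex-R5 firmware.

# Illustrative host-side driver: one fixed accelerator, many model configs.
# ModelConfig fields and run_layer are hypothetical stand-ins, not the real firmware API.
from dataclasses import dataclass

@dataclass
class ModelConfig:
    num_layers: int
    d_model: int
    num_heads: int

def run_layer(state, layer_idx, cfg):
    # Placeholder for driver work: set up buffers, start the array, wait for completion.
    return state

def run_model(cfg, tokens):
    state = tokens
    for layer_idx in range(cfg.num_layers):
        state = run_layer(state, layer_idx, cfg)   # only the parameters vary per model
    return state

# Switching models is a software change (a new config), not a hardware change.
bert_like = ModelConfig(num_layers=12, d_model=768, num_heads=12)
out = run_model(bert_like, tokens=[101, 2023, 2003, 102])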

The team sees their work as a step toward making advanced language models more accessible and deployable across a broader range of platforms—including mobile devices, wearables, and edge computing systems—where real-time AI execution, privacy, and low-latency response are essential.

"The STAU architecture shows that transformer models, even large ones, can be made practical for on-device applications," said lead author Seok-Woo Chang. "It provides a foundation for building intelligent systems that are both scalable and efficient."

More information: Seok-Woo Chang et al, Scalable Transformer Accelerator with Variable Systolic Array for Multiple Models in Voice Assistant Applications, Electronics (2024). DOI: 10.3390/electronics13234683

Provided by Sejong University

Citation: Scalable transformer accelerator enables on-device execution of large language models (2025, July 21), retrieved 21 July 2025 from https://techxplore.com/news/2025-07-scalable-enables-device-large-language.html
