CRYPTOREPORTCLUB

AI agent autonomously solves complex cybersecurity challenges using text-based tools

July 29, 2025
Gaby Clark, scientific editor

Andrew Zinin, lead editor

EnIGMA presentation at ICML 2025. Researchers Talor Abramovich (left) and Minghao Shao (right). Credit: Minghao Shao

Artificial intelligence agents—AI systems that can work independently toward specific goals without constant human guidance—have demonstrated strong capabilities in software development and web navigation. Their effectiveness in cybersecurity has remained limited, however.

That may soon change, thanks to a research team from NYU Tandon School of Engineering, NYU Abu Dhabi and other universities that developed an AI agent capable of autonomously solving complex cybersecurity challenges.

The system, called EnIGMA, was presented this month at the International Conference on Machine Learning (ICML) 2025 in Vancouver, Canada.

"EnIGMA is about using Large Language Model agents for cybersecurity applications," said Meet Udeshi, an NYU Tandon Ph.D. student and co-author of the research. Udeshi is advised by Ramesh Karri and Farshad Khorrami. Karri is Chair of NYU Tandon's Electrical and Computer Engineering Department (ECE) and a faculty member of the NYU Center for Cybersecurity and the NYU Center for Advanced Technology in Telecommunications (CATT); Khorrami is an ECE professor and CATT faculty member. Both are co-authors on the paper, with Karri serving as a senior author.

To build EnIGMA, the researchers started with an existing framework called SWE-agent, which was originally designed for software engineering tasks. However, cybersecurity challenges required specialized tools that didn't exist in previous AI systems. "We have to restructure those interfaces to feed it into an LLM properly. So we've done that for a couple of cybersecurity tools," Udeshi explained.

The key innovation was developing what they call "Interactive Agent Tools" that convert visual cybersecurity programs into text-based formats the AI can understand. Traditional cybersecurity tools like debuggers and network analyzers use graphical interfaces with clickable buttons, visual displays, and interactive elements that humans can see and manipulate.

"Large language models process text only, but these interactive tools with graphical user interfaces work differently, so we had to restructure those interfaces to work with LLMs," Udeshi said.
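The idea of exposing an interactive tool as plain text can be illustrated with a minimal Python sketch. It is not EnIGMA's implementation: the InteractiveSession class, its send method, and the toy line-based tool standing in for a real debugger are all hypothetical names chosen for illustration. The sketch keeps one long-running process alive and turns each command into a text reply that a language model could read.

```python
import subprocess
import sys

# Hypothetical stand-in for an interactive tool: a tiny line-based program
# that answers every command with exactly one line. A real debugger or
# network client would take its place.
TOY_TOOL = r"""
import sys
registers = {"rax": 0}
for line in sys.stdin:
    parts = line.split()
    if not parts:
        print("error: empty command", flush=True)
    elif parts[0] == "set" and len(parts) == 3:
        registers[parts[1]] = int(parts[2])
        print("ok", flush=True)
    elif parts[0] == "print" and len(parts) == 2:
        print(registers.get(parts[1], "undefined"), flush=True)
    else:
        print("error: unknown command", flush=True)
"""


class InteractiveSession:
    """Keeps one interactive process alive and exposes it as text in, text out."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered pipes
        )

    def send(self, command: str) -> str:
        """Send one command line and return the tool's one-line reply."""
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        return self.proc.stdout.readline().rstrip("\n")

    def close(self) -> None:
        """Signal end of input and wait for the tool to exit."""
        self.proc.stdin.close()
        self.proc.wait(timeout=5)
        self.proc.stdout.close()


def demo() -> list:
    """Run three commands against the toy tool and collect the text replies."""
    session = InteractiveSession([sys.executable, "-u", "-c", TOY_TOOL])
    try:
        return [session.send(c) for c in ("set rax 42", "print rax", "step")]
    finally:
        session.close()
```

The state set by the first command is still visible to the second one because the process persists between calls, which is the property a one-shot shell command would not provide.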

The team built their own dataset by collecting and structuring Capture The Flag (CTF) challenges specifically for large language models. These gamified cybersecurity competitions simulate real-world vulnerabilities and have traditionally been used to train human cybersecurity professionals.

EnIGMA is an LM agent fed with CTF challenges from the NYU CTF benchmark. It interacts with the computer through an environment that is built on top of SWE-agent (Yang et al., 2024) and extends it to cybersecurity. We incorporate new interactive tools that assist the agent in debugging and connecting to a remote server. The agent iterates through interactions and feedback from the environment until it solves the challenge. Credit: Talor Abramovich et al.
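The loop described in the caption, in which the agent acts, reads feedback, and repeats until the challenge is solved, can be sketched as follows. This is an illustration under stated assumptions, not the EnIGMA or SWE-agent code: ToyEnvironment, scripted_policy, and run_agent are hypothetical names, and a hand-written policy stands in for the language model.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a CTF environment holding a hidden flag."""

    flag: str = "flag{example}"
    files: dict = field(default_factory=lambda: {"note.txt": "flag{example}"})
    solved: bool = False

    def step(self, action: str) -> str:
        """Execute one text action and return a text observation."""
        parts = action.split(maxsplit=1)
        if not parts:
            return "empty action"
        if parts == ["ls"]:
            return "\n".join(sorted(self.files))
        if parts[0] == "cat" and len(parts) == 2:
            return self.files.get(parts[1], "cat: no such file")
        if parts[0] == "submit" and len(parts) == 2:
            self.solved = parts[1] == self.flag
            return "correct" if self.solved else "wrong flag"
        return "unknown command"


def scripted_policy(history):
    """Stand-in for the language model: choose the next action from history."""
    if not history:
        return "ls"
    last_action, last_observation = history[-1]
    if last_action == "ls":
        return "cat " + last_observation.splitlines()[0]
    if last_action.startswith("cat "):
        return "submit " + last_observation.strip()
    return "ls"


def run_agent(env, policy, max_steps=10):
    """Alternate actions and observations until solved or out of steps."""
    history = []
    for _ in range(max_steps):
        action = policy(history)
        observation = env.step(action)
        history.append((action, observation))
        if env.solved:
            break
    return history


env = ToyEnvironment()
history = run_agent(env, scripted_policy)
```

The max_steps cap is a design choice of this sketch: it bounds the run when the policy never finds the flag.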

"CTFs are like a gamified version of cybersecurity used in academic competitions. They're not true cybersecurity problems that you would face in the real world, but they are very good simulations," Udeshi noted.

Paper co-author Minghao Shao described the technical architecture: "We built our own CTF benchmark dataset and created a specialized data loading system to feed these challenges into the model." Shao is an NYU Tandon Ph.D. student and Global Ph.D. Fellow at NYU Abu Dhabi, advised by Karri and by Muhammad Shafique, Professor of Computer Engineering at NYU Abu Dhabi and ECE Global Network Professor at NYU Tandon. Shafique is also a co-author on the paper.

The framework includes specialized prompts that provide the model with instructions tailored to cybersecurity scenarios.
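A data loader and a cybersecurity-specific prompt of this kind might look roughly like the sketch below. The record fields (name, category, description, files), the sample challenge, and the prompt wording are assumptions made for illustration; they are not the schema of the NYU CTF benchmark or the prompts used by EnIGMA.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Challenge:
    """One CTF challenge record (hypothetical field names)."""

    name: str
    category: str
    description: str
    files: tuple


def load_challenges(raw_json: str) -> list:
    """Parse a JSON array of challenge records into Challenge objects."""
    records = json.loads(raw_json)
    return [
        Challenge(
            name=r["name"],
            category=r["category"],
            description=r["description"],
            files=tuple(r.get("files", [])),
        )
        for r in records
    ]


# Illustrative prompt wording, not taken from the paper.
PROMPT_TEMPLATE = (
    "You are solving a Capture The Flag challenge.\n"
    "Name: {name}\n"
    "Category: {category}\n"
    "Description: {description}\n"
    "Files: {files}\n"
    "Issue one command at a time and wait for its output."
)


def render_prompt(challenge: Challenge) -> str:
    """Fill the template with one challenge's details."""
    return PROMPT_TEMPLATE.format(
        name=challenge.name,
        category=challenge.category,
        description=challenge.description,
        files=", ".join(challenge.files) or "(none)",
    )


# Made-up sample record used only to exercise the loader.
SAMPLE = json.dumps(
    [
        {
            "name": "toy-rsa",
            "category": "crypto",
            "description": "Recover the plaintext.",
            "files": ["cipher.txt", "key.pub"],
        }
    ]
)

challenges = load_challenges(SAMPLE)
prompt = render_prompt(challenges[0])
```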

EnIGMA was tested on 390 CTF challenges across four benchmarks, where it achieved state-of-the-art results and solved more than three times as many challenges as previous AI agents.

According to Udeshi, when the research was conducted approximately 12 months ago, "Claude 3.5 Sonnet from Anthropic was the best model, and GPT-4o was second at that time."

The research also identified a previously unknown phenomenon called "soliloquizing," where the AI model generates hallucinated observations without actually interacting with the environment, a discovery that could have important consequences for AI safety and reliability.
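One way to picture soliloquizing is a model reply that contains not only an action but also an invented result for it. The sketch below flags such replies. The article does not describe how the researchers detect or handle the behavior, so this is only an assumed heuristic: the ACTION: and OBSERVATION: line markers and the split_first_action function are hypothetical, and the rule is simply that real observations come only from the environment, never from the model.

```python
def split_first_action(response: str):
    """Return (first_action, soliloquized) for one model reply.

    Assumes a hypothetical reply format in which the model writes a line
    starting with "ACTION:". Any "OBSERVATION:" line, or any second
    "ACTION:" line, means the model continued past its turn and made up
    what the environment would have said.
    """
    action = None
    soliloquized = False
    for line in response.splitlines():
        stripped = line.strip()
        if stripped.startswith("ACTION:"):
            if action is None:
                action = stripped[len("ACTION:"):].strip()
            else:
                soliloquized = True
        elif stripped.startswith("OBSERVATION:"):
            soliloquized = True
    return action, soliloquized


honest_reply = "THOUGHT: list the files first\nACTION: ls"
fabricated_reply = (
    "THOUGHT: list the files first\n"
    "ACTION: ls\n"
    "OBSERVATION: flag.txt\n"
    "ACTION: cat flag.txt"
)

honest = split_first_action(honest_reply)
fabricated = split_first_action(fabricated_reply)
```

In this sketch only the first action would be executed, and the fabricated observation would be discarded rather than fed back into the conversation.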

Beyond this technical finding, the potential applications extend outside of academic competitions. "If you think of an autonomous LLM agent that can solve these CTFs, that agent has substantial cybersecurity skills that you can use for other cybersecurity tasks as well," Udeshi explained. The agent could potentially be applied to real-world vulnerability assessment, with the ability to "try hundreds of different approaches" autonomously.

For Udeshi, whose research focuses on industrial control system security, the framework opens new possibilities for securing robotic systems and industrial control systems. Shao sees potential applications beyond cybersecurity, including quantum code generation and chip design vulnerability detection.

The researchers acknowledge the dual-use nature of their technology. While EnIGMA could help security professionals identify and patch vulnerabilities more efficiently, it could also be misused for malicious purposes. The team has notified representatives from major AI companies, including Meta, Anthropic, and OpenAI, about their results.

More information: EnIGMA: Interactive Tools Substantially Assist LM Agents in Finding Security Vulnerabilities: icml.cc/virtual/2025/poster/45428

Provided by NYU Tandon School of Engineering

Citation: AI agent autonomously solves complex cybersecurity challenges using text-based tools (2025, July 29), retrieved 29 July 2025 from https://techxplore.com/news/2025-07-ai-agent-autonomously-complex-cybersecurity.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.
