CRYPTOREPORTCLUB

AI agent autonomously solves complex cybersecurity challenges using text-based tools

July 29, 2025

Gaby Clark, scientific editor

Andrew Zinin, lead editor

Editors' notes

This article has been reviewed according to Science X's editorial process and policies. Editors have highlighted the following attributes while ensuring the content's credibility: fact-checked, trusted source, proofread.

EnIGMA presentation at ICML 2025. Researchers Talor Abramovich (left) and Minghao Shao (right). Credit: Minghao Shao

Artificial intelligence agents—AI systems that can work independently toward specific goals without constant human guidance—have demonstrated strong capabilities in software development and web navigation. Their effectiveness in cybersecurity has remained limited, however.

That may soon change, thanks to a research team from NYU Tandon School of Engineering, NYU Abu Dhabi and other universities that developed an AI agent capable of autonomously solving complex cybersecurity challenges.

The system, called EnIGMA, was presented this month at the International Conference on Machine Learning (ICML) 2025 in Vancouver, Canada.

"EnIGMA is about using Large Language Model agents for cybersecurity applications," said Meet Udeshi, an NYU Tandon Ph.D. student and co-author of the research. Udeshi is advised by Ramesh Karri, Chair of NYU Tandon's Electrical and Computer Engineering Department (ECE) and a faculty member of the NYU Center for Cybersecurity and NYU Center for Advanced Technology in Telecommunications (CATT), and by Farshad Khorrami, ECE professor and CATT faculty member. Both Karri and Khorrami are co-authors on the paper, with Karri serving as a senior author.

To build EnIGMA, the researchers started with an existing framework called SWE-agent, which was originally designed for software engineering tasks. However, cybersecurity challenges required specialized tools that didn't exist in previous AI systems. "We have to restructure those interfaces to feed it into an LLM properly. So we've done that for a couple of cybersecurity tools," Udeshi explained.

The key innovation was developing what they call "Interactive Agent Tools" that convert visual cybersecurity programs into text-based formats the AI can understand. Traditional cybersecurity tools like debuggers and network analyzers use graphical interfaces with clickable buttons, visual displays, and interactive elements that humans can see and manipulate.

"Large language models process text only, but these interactive tools with graphical user interfaces work differently, so we had to restructure those interfaces to work with LLMs," Udeshi said.
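To make the idea concrete, here is a minimal Python sketch of wrapping an interactive program so that it can be driven purely through text. It is not EnIGMA's implementation: the `InteractiveSession` class, the `(toy)` end marker, and the small echo program standing in for a real debugger are all assumptions made for illustration.

```python
import subprocess
import sys

# Toy stand-in for an interactive tool (hypothetical; not a real debugger).
# It reads one command per line and ends every reply with a "(toy)" marker line.
TOY_TOOL = r'''
import sys
for line in sys.stdin:
    cmd = line.strip()
    if cmd == "quit":
        break
    print("echo: " + cmd)
    print("(toy)")
    sys.stdout.flush()
'''


class InteractiveSession:
    """Keeps one interactive program alive and exposes it as text in, text out."""

    def __init__(self, argv, end_marker):
        self.end_marker = end_marker
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def send(self, command):
        """Send one command and return the text printed before the end marker."""
        self.proc.stdin.write(command + "\n")
        self.proc.stdin.flush()
        lines = []
        while True:
            line = self.proc.stdout.readline()
            if line == "":  # the tool exited unexpectedly
                break
            line = line.rstrip("\n")
            if line == self.end_marker:
                break
            lines.append(line)
        return "\n".join(lines)

    def close(self):
        """Ask the tool to quit and release the pipes."""
        self.proc.stdin.write("quit\n")
        self.proc.stdin.flush()
        self.proc.wait(timeout=5)
        self.proc.stdin.close()
        self.proc.stdout.close()


session = InteractiveSession(
    [sys.executable, "-u", "-c", TOY_TOOL], end_marker="(toy)"
)
first = session.send("break main")
second = session.send("info registers")
session.close()
print(first)
print(second)
```

The point of the design is that the tool's state survives between calls: the process stays open, so each command builds on the previous one, while the caller only ever sees plain text.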

The team built their own dataset by collecting and structuring Capture The Flag (CTF) challenges specifically for large language models. These gamified cybersecurity competitions simulate real-world vulnerabilities and have traditionally been used to train human cybersecurity professionals.

EnIGMA is an LM agent fed with CTF challenges from the NYU CTF benchmark. It interacts with the computer through an environment that is built on top of SWE-agent (Yang et al., 2024) and extends it to cybersecurity. We incorporate new interactive tools that assist the agent in debugging and connecting to remote server. The agent iterates through interactions and feedback from the environment until it solves the challenge. Credit: Talor Abramovich et al.
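The interaction loop described in the figure caption can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the EnIGMA code: `ToyEnvironment`, `scripted_policy`, and `run_agent` are hypothetical names, and a hand-written policy stands in for the language model.

```python
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a CTF environment: text command in, text out."""
    flag: str = "flag{example}"
    files: dict = field(
        default_factory=lambda: {"notes.txt": "the flag is flag{example}"}
    )

    def step(self, action: str) -> str:
        if action == "ls":
            return "\n".join(sorted(self.files))
        if action.startswith("cat "):
            return self.files.get(action[4:], "no such file")
        if action.startswith("submit "):
            return "correct" if action[7:] == self.flag else "wrong flag"
        return "unknown command"


def scripted_policy(history):
    """Stand-in for the language model: choose the next action from the history."""
    if not history:
        return "ls"
    last_action, last_obs = history[-1]
    if last_action == "ls":
        return "cat " + last_obs.splitlines()[0]
    if last_action.startswith("cat ") and "flag{" in last_obs:
        start = last_obs.index("flag{")
        end = last_obs.index("}", start) + 1
        return "submit " + last_obs[start:end]
    return "ls"


def run_agent(env, policy, max_steps=10):
    """Alternate actions and observations until solved or out of steps."""
    history = []
    for _ in range(max_steps):
        action = policy(history)
        observation = env.step(action)  # feedback comes from the environment
        history.append((action, observation))
        if observation == "correct":
            return True, history
    return False, history


solved, history = run_agent(ToyEnvironment(), scripted_policy)
for action, observation in history:
    print(f"> {action}\n{observation}")
```

In a real agent the policy would be a call to a language model, and the step budget would guard against runs that never converge.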

"CTFs are like a gamified version of cybersecurity used in academic competitions. They're not true cybersecurity problems that you would face in the real world, but they are very good simulations," Udeshi noted.

Paper co-author Minghao Shao is an NYU Tandon Ph.D. student and Global Ph.D. Fellow at NYU Abu Dhabi, advised by Karri and by Muhammad Shafique, Professor of Computer Engineering at NYU Abu Dhabi and ECE Global Network Professor at NYU Tandon. Shao described the technical architecture: "We built our own CTF benchmark dataset and created a specialized data loading system to feed these challenges into the model." Shafique is also a co-author on the paper.

The framework includes specialized prompts that provide the model with instructions tailored to cybersecurity scenarios.
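As a rough illustration of what loading challenges and turning them into prompts might look like, consider the Python sketch below. The record fields (`name`, `category`, `description`, `files`), the sample data, and the prompt wording are all assumptions made for this example; the article does not describe the actual schema of the NYU CTF benchmark or the prompts EnIGMA uses.

```python
import json
from dataclasses import dataclass

# Hypothetical sample records; not taken from any real benchmark.
SAMPLE = """
[
  {"name": "baby-rev", "category": "reverse engineering",
   "description": "Recover the password checked by the binary.",
   "files": ["baby-rev.bin"]},
  {"name": "echo-server", "category": "pwn",
   "description": "Exploit the service listening on the given port."}
]
"""

REQUIRED_FIELDS = ("name", "category", "description")


@dataclass(frozen=True)
class Challenge:
    name: str
    category: str
    description: str
    files: tuple


def load_challenges(raw_json):
    """Parse challenge records and reject any that lack a required field."""
    challenges = []
    for record in json.loads(raw_json):
        missing = [key for key in REQUIRED_FIELDS if key not in record]
        if missing:
            raise ValueError(f"challenge record is missing fields: {missing}")
        challenges.append(
            Challenge(
                name=record["name"],
                category=record["category"],
                description=record["description"],
                files=tuple(record.get("files", [])),
            )
        )
    return challenges


def render_prompt(challenge):
    """Turn one challenge into task instructions for a text-only model."""
    files = ", ".join(challenge.files) if challenge.files else "none"
    return (
        f"You are solving a {challenge.category} CTF challenge "
        f"named '{challenge.name}'.\n"
        f"Description: {challenge.description}\n"
        f"Attached files: {files}\n"
        "Reply with one shell command per turn."
    )


challenges = load_challenges(SAMPLE)
prompt = render_prompt(challenges[1])
print(prompt)
```

Validating records at load time means a malformed challenge fails early, before any model calls are spent on it.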

EnIGMA demonstrated superior performance across multiple benchmarks. The system was tested on 390 CTF challenges across four different benchmarks, achieving state-of-the-art results and solving more than three times as many challenges as previous AI agents.

When the research was conducted, approximately 12 months ago, "Claude 3.5 Sonnet from Anthropic was the best model, and GPT-4o was second at that time," according to Udeshi.

The research also identified a previously unknown phenomenon called "soliloquizing," where the AI model generates hallucinated observations without actually interacting with the environment, a discovery that could have important consequences for AI safety and reliability.
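One simple way to picture soliloquizing is a model reply that contains its own invented tool output. The Python sketch below flags such replies with a crude heuristic. It is not the detection method used in the paper, and the `ACTION:` and `OBSERVATION:` line format is an assumption made for this example.

```python
import re

# Assumed reply format: the model writes "ACTION: <command>" and only the
# environment is ever supposed to write "OBSERVATION: ..." lines.
ACTION_RE = re.compile(r"^ACTION:\s*(.+)$", re.MULTILINE)
OBSERVATION_RE = re.compile(r"^OBSERVATION:", re.MULTILINE)


def check_reply(reply):
    """Flag replies that contain self-written observations or several actions."""
    actions = ACTION_RE.findall(reply)
    suspected = bool(OBSERVATION_RE.search(reply)) or len(actions) > 1
    return {"actions": actions, "soliloquy_suspected": suspected}


clean = "I should list the files first.\nACTION: ls"
suspicious = (
    "ACTION: ls\n"
    "OBSERVATION: flag.txt\n"
    "ACTION: cat flag.txt"
)

clean_result = check_reply(clean)
suspicious_result = check_reply(suspicious)
print(clean_result)
print(suspicious_result)
```

A framework that detects this pattern could discard everything after the first action and run that action for real, so that the model only ever reasons over genuine output.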

Beyond this technical finding, the potential applications extend outside of academic competitions. "If you think of an autonomous LLM agent that can solve these CTFs, that agent has substantial cybersecurity skills that you can use for other cybersecurity tasks as well," Udeshi explained. The agent could potentially be applied to real-world vulnerability assessment, with the ability to "try hundreds of different approaches" autonomously.

For Udeshi, whose research focuses on industrial control system security, the framework opens new possibilities for securing robotic systems and industrial control systems. Shao sees potential applications beyond cybersecurity, including quantum code generation and chip design vulnerability detection.

The researchers acknowledge the dual-use nature of their technology. While EnIGMA could help security professionals identify and patch vulnerabilities more efficiently, it could also be misused for malicious purposes. The team has notified representatives from major AI companies, including Meta, Anthropic, and OpenAI, about their results.

More information: EnIGMA: Interactive Tools Substantially Assist LM Agents in Finding Security Vulnerabilities: icml.cc/virtual/2025/poster/45428

Provided by NYU Tandon School of Engineering

Citation: AI agent autonomously solves complex cybersecurity challenges using text-based tools (2025, July 29) retrieved 29 July 2025 from https://techxplore.com/news/2025-07-ai-agent-autonomously-complex-cybersecurity.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without the written permission. The content is provided for information purposes only.

Advertising: digestmediaholding@gmail.com

Disclaimer: Information found on cryptoreportclub.com is those of writers quoted. It does not represent the opinions of cryptoreportclub.com on whether to sell, buy or hold any investments. You are advised to conduct your own research before making any investment decisions. Use provided information at your own risk.
cryptoreportclub.com covers fintech, blockchain and Bitcoin bringing you the latest crypto news and analyses on the future of money.

© 2023-2025 Cryptoreportclub. All Rights Reserved
