Cloudflare highlights emerging risks and realities of frontier AI models in cybersecurity research

Dubai, UAE: Cloudflare today shared new insights into the evolving role of frontier AI models in cybersecurity research, outlining both the promise and the operational challenges these systems present for defenders. The findings were published in the company’s latest blog post, which examines the capabilities of advanced cyber-focused AI systems under Project Glasswing.

As cyber threats continue to accelerate in speed and sophistication, Cloudflare emphasized that organizations must rethink how they approach resilience and security engineering. The company stated:

“Attacker timelines are shortening, but defenders need more than speed. We must harden systems to make exploitation difficult by design. That way, we can ensure that a vulnerability's existence doesn’t dictate the speed of our defeat.”

As part of Project Glasswing, Cloudflare tested the Mythos model against live code across its runtime, edge data path, protocol stack, control plane, and the open-source projects it depends on. The company said one of the most significant findings was the model’s ability to connect multiple low-severity vulnerabilities into a more dangerous exploit chain.

“The most important distinction: In our experience, other models found some of the same underlying bugs/issues, but what they didn't do was build the chains. They would surface bugs and stop there – which is the easy part. Mythos can take low-severity bugs (which would traditionally be invisible) and chain them into a single, more severe exploit.”

Cloudflare also highlighted concerns around inconsistent model safeguards and refusals during security research tasks.

“Model refusals aren't a reliable safety boundary: Mythos sometimes pushes back, and the reasons don't follow any policy we can see from the outside. In one case, the model refused to do vulnerability research, then agreed to do the same research on the same code once we deleted the hidden .git folder. Nothing about the code being analyzed had changed.”

In addition, the company noted that human oversight remains essential due to high volumes of speculative findings and false positives generated during testing.

“Non-actionable findings: Findings are requiring significant human effort to filter false positives from a subset of actual vulnerabilities. The noise is driven by programming language context, where memory-unsafe languages like C/C++ trigger more speculative flags. Mythos suffers from an inherent bias toward over-reporting potential issues, turning a helpful exploratory tool into a costly triage burden for human reviewers.”

The research underscores a broader industry challenge as frontier AI models become increasingly capable in cybersecurity-related tasks, requiring organizations to balance innovation, governance, and operational resilience.

To learn more, read the full blog post, “Project Glasswing: what Mythos showed us.”

About Cloudflare

Cloudflare, Inc. (www.cloudflare.com / @cloudflare) is on a mission to help build a better Internet. Cloudflare’s suite of products protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare have all web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was named to Entrepreneur Magazine’s Top Company Cultures 2018 list and ranked among the World’s Most Innovative Companies by Fast Company in 2019. Headquartered in San Francisco, CA, Cloudflare has offices in Austin, TX, Champaign, IL, New York, NY, San Jose, CA, Seattle, WA, Washington, D.C., Toronto, Lisbon, London, Munich, Paris, Beijing, Singapore, Sydney, and Tokyo.
