In April 2024, researchers at the University of Illinois demonstrated that large language models like GPT-4 could autonomously exploit 87% of a predefined set of one-day vulnerabilities when provided with CVE descriptions. Without those details, the success rate dropped to just 7%. For years, this "margin of safety" reassured security teams that while AI could leverage existing flaws, it couldn’t uncover new ones.
That margin vanished on April 7, 2026, when Anthropic unveiled Claude Mythos Preview. The model didn’t just reproduce known vulnerabilities; it discovered thousands of zero-days across major operating systems and browsers. In a controlled campaign targeting OpenBSD over 1,000 scaffold runs, Mythos achieved an 83.1% reproduction rate on the CyberGym benchmark at a total compute cost under $20,000.
The implications are stark. Exploitation timelines are collapsing. Langflow’s CVE-2026-33017 (CVSS 9.8) saw active exploitation within 20 hours of disclosure, despite no public proof-of-concept. Marimo’s CVE-2026-39987 (CVSS 9.3) was weaponized in under 10 hours. Google’s M-Trends 2026 report confirms this trend: exploits are now happening before vendors release patches. The Langflow advisory arrived at 9 AM; by 5 AM the next day, attackers had breached systems. For Marimo, the window was even tighter—exploitation began before most teams even finished their morning coffee.
The old assumption—that there’s always time to patch—no longer holds. Security teams must act before attackers do.
Adopt a three-layer filter to prioritize patches
Many organizations still rely solely on CVSS scores to prioritize vulnerabilities, but this approach fails to account for real-world risk. A CVSS 8.8 flaw with active exploits in the wild (like Docker’s CVE-2026-34040) often gets deprioritized compared to a theoretical CVSS 9.8 vulnerability that may never be exploited.
A 2025 study published on arXiv analyzed 28,377 real-world vulnerabilities and validated a three-layer decision tree that combines CISA’s Known Exploited Vulnerabilities (KEV) catalog, EPSS scores from FIRST.org, and CVSS ratings. Here’s how it works:
- Layer 1: Active exploitation – If a vulnerability appears in the CISA KEV catalog, patch it immediately. SLA: Within hours.
- Layer 2: Predicted exploitation – If the EPSS score (from FIRST.org) is 0.088 or higher, escalate to Tier 0 remediation. SLA: Within 24 hours.
- Layer 3: Severity baseline – If the CVSS score is 7.0 or higher, follow your standard remediation policy.
The results speak for themselves: an 18x efficiency gain, 85.6% coverage of exploited vulnerabilities, and a 95% reduction in urgent remediation workload. All three data sources are open and free, making integration straightforward.
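The three layers above can be read as a short decision function. The sketch below is a minimal illustration, not the study's implementation: the `Finding` type, the tier labels, and the "backlog" fallback for anything below all three thresholds are assumptions made for the example. Only the thresholds (KEV membership, EPSS 0.088, CVSS 7.0) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

EPSS_THRESHOLD = 0.088  # Layer 2 cutoff quoted in the article
CVSS_THRESHOLD = 7.0    # Layer 3 cutoff quoted in the article


@dataclass(frozen=True)
class Finding:
    """Hypothetical record holding the three signals for one CVE."""
    cve_id: str
    in_kev: bool            # listed in the CISA KEV catalog
    epss: Optional[float]   # EPSS probability, None if unknown
    cvss: Optional[float]   # CVSS base score, None if unknown


def triage(finding: Finding) -> str:
    """Return the first layer that matches, checked in priority order."""
    if finding.in_kev:
        return "layer1-patch-within-hours"
    if finding.epss is not None and finding.epss >= EPSS_THRESHOLD:
        return "layer2-patch-within-24h"
    if finding.cvss is not None and finding.cvss >= CVSS_THRESHOLD:
        return "layer3-standard-policy"
    # Assumption: the article does not say what happens below all thresholds.
    return "backlog"
```

Because the layers are checked in order, a KEV-listed flaw with a modest CVSS score still lands in the most urgent tier, which is the behavior CVSS-only sorting misses.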
Automation is key. Security teams can write a script to query the CISA KEV API, the EPSS API, and the NVD database, then cross-reference these against their asset inventory for every new CVE. The human role shifts from trigger to approver, ensuring speed without sacrificing oversight.
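A rough sketch of the lookup step might look like the following. It assumes the public CISA KEV JSON feed, the FIRST.org EPSS endpoint, and the NVD CVE API 2.0, and the response field names used here (`cveID`, `data[].epss`, `metrics.cvssMetricV31[].cvssData.baseScore`) should be checked against each service's current schema before relying on them. The `lookup` helper and the User-Agent string are hypothetical, and the asset-inventory cross-reference is left out because it depends on your own tooling.

```python
import json
import urllib.parse
import urllib.request

KEV_URL = "https://www.cisa.gov/sites/default/files/feeds/known_exploited_vulnerabilities.json"
EPSS_URL = "https://api.first.org/data/v1/epss"
NVD_URL = "https://services.nvd.nist.gov/rest/json/cves/2.0"


def fetch_json(url, params=None, timeout=30):
    """GET a URL and decode the JSON body."""
    if params:
        url = f"{url}?{urllib.parse.urlencode(params)}"
    req = urllib.request.Request(url, headers={"User-Agent": "patch-triage-sketch"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def kev_ids(feed):
    """Collect the CVE IDs listed in a KEV feed document."""
    return {entry["cveID"] for entry in feed.get("vulnerabilities", [])}


def epss_score(payload, cve_id):
    """Pull the EPSS probability for one CVE, or None if absent."""
    for row in payload.get("data", []):
        if row.get("cve") == cve_id:
            return float(row["epss"])
    return None


def cvss_base_score(payload):
    """Return the first CVSS base score found, preferring newer versions."""
    for vuln in payload.get("vulnerabilities", []):
        metrics = vuln.get("cve", {}).get("metrics", {})
        for key in ("cvssMetricV40", "cvssMetricV31", "cvssMetricV30", "cvssMetricV2"):
            for metric in metrics.get(key, []):
                return float(metric["cvssData"]["baseScore"])
    return None


def lookup(cve_id):
    """Gather all three signals for one CVE (performs network calls)."""
    kev = kev_ids(fetch_json(KEV_URL))
    epss = epss_score(fetch_json(EPSS_URL, {"cve": cve_id}), cve_id)
    cvss = cvss_base_score(fetch_json(NVD_URL, {"cveId": cve_id}))
    return {"cve_id": cve_id, "in_kev": cve_id in kev, "epss": epss, "cvss": cvss}
```

In practice the KEV feed would be downloaded once per run and cached rather than fetched per CVE, and the NVD API is rate-limited, so a batch job would need to pace its requests.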
Secure agent-driven systems against authorization gaps
AI agents aren’t just accelerating exploit discovery—they’re also exposing weaknesses in authorization controls. Traditional policies weren’t designed with autonomous agents in mind, and many authorization frameworks are now vulnerable to bypasses that agents can exploit.
CVE-2026-34040 demonstrated this risk when researchers found that Docker’s authorization plugin architecture silently bypasses all plugins when a request body exceeds 1 MB. Common solutions like OPA, Casbin, and Prisma Cloud miss this flaw because the bypass occurs in Docker’s middleware, before the request ever reaches the plugin layer.
Cyera’s team showed how an AI agent debugging infrastructure could infer this bypass path while performing a legitimate task—without any explicit instruction to exploit anything. The agent’s behavior revealed the vulnerability purely through its interactions with the system.
The IETF is developing new standards to address this gap. Draft-klrc-aiagent-auth-01, published in March by contributors from AWS, Zscaler, Ping Identity, and OpenAI, proposes using SPIFFE and OAuth 2.0 to issue short-lived, dynamically provisioned credentials to AI agents. Separately, draft-prakash-aip-00 found that none of the approximately 2,000 surveyed MCP servers had implemented authentication mechanisms.
These standards are still months or years from widespread adoption. Until then, security teams must proactively test authorization boundaries for agent-driven systems. Focus on scenarios like oversized requests, burst frequency, and multi-step privilege escalation—all tactics agents might use to probe for weaknesses.
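One way to test the oversized-request scenario is to replay a request that policy should deny at body sizes just below, at, and just above a suspected limit, and flag any size where the denial disappears. The sketch below is only an illustration: `send` is a hypothetical callable you would supply to submit the body to your own test environment and return an HTTP status code, the `_pad` field is an invented padding key, and treating 403 as "denied" is an assumption about how your gateway reports a policy rejection.

```python
import json
from typing import Callable, Dict, List

ONE_MIB = 1024 * 1024


def padded_body(payload: Dict, target_size: int) -> bytes:
    """Serialize payload as JSON, padded to exactly target_size bytes."""
    body = dict(payload)
    body["_pad"] = ""  # hypothetical filler field
    overhead = len(json.dumps(body).encode())
    if target_size < overhead:
        raise ValueError("target_size is smaller than the unpadded body")
    body["_pad"] = "A" * (target_size - overhead)
    return json.dumps(body).encode()


def boundary_sizes(limit: int, delta: int = 1) -> List[int]:
    """Sizes just below, at, and just above the suspected limit."""
    return [limit - delta, limit, limit + delta]


def probe(send: Callable[[bytes], int], payload: Dict,
          limit: int = ONE_MIB, denied_status: int = 403) -> List[int]:
    """Return the body sizes at which a should-be-denied request got through."""
    leaks = []
    for size in boundary_sizes(limit):
        status = send(padded_body(payload, size))
        if status != denied_status:
            leaks.append(size)
    return leaks
```

An empty result means the policy held at every tested size. Any returned size marks a boundary worth investigating. Run this only against systems you are authorized to test.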
Reduce credential blast radius in agent environments
AI agents often operate with broad permissions, increasing the risk of scope violations. According to a 2026 study by the Cloud Security Alliance and Zenity, over half of organizations have experienced unauthorized access or privilege escalation by AI agents in cloud environments.
The challenge isn’t just granting credentials—it’s ensuring those credentials can’t be abused. Start by implementing short-lived, role-based access tokens with strict time-to-live (TTL) policies. Monitor agent behavior for anomalous patterns, such as sudden bursts of activity or requests to unrelated systems.
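To make the TTL idea concrete, here is a minimal sketch of a signed token that carries a role and an expiry time, built only on the Python standard library. The token layout, the claim names, and the `issue_token` and `verify_token` helpers are all assumptions for illustration. A production system would more likely rely on an established identity provider or token standard than on a hand-rolled format.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def issue_token(key: bytes, agent_id: str, role: str,
                ttl_seconds: int, now: Optional[float] = None) -> str:
    """Issue a token bound to one role that expires after ttl_seconds."""
    now = time.time() if now is None else now
    claims = {"sub": agent_id, "role": role, "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode())
    sig = hmac.new(key, body, hashlib.sha256).hexdigest()
    return f"{body.decode()}.{sig}"


def verify_token(key: bytes, token: str, required_role: str,
                 now: Optional[float] = None) -> bool:
    """Accept only an untampered, unexpired token for the required role."""
    now = time.time() if now is None else now
    try:
        body, sig = token.rsplit(".", 1)
    except ValueError:
        return False
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["role"] == required_role and now < claims["exp"]
```

The signature is checked before the claims are parsed, so a forged body is rejected without being decoded. A short TTL (minutes rather than days) limits how long a leaked token stays useful.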
Consider adopting the principle of least privilege for agent permissions. Instead of granting broad access, define granular roles for specific tasks. For example, an agent debugging a database shouldn’t also have permissions to modify network configurations.
Regular audits are essential. Review agent permissions monthly, and revoke any credentials tied to inactive or decommissioned agents. Automate as much of this process as possible to avoid manual oversight gaps.
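The audit step can be automated with a small check that flags credentials belonging to agents no longer in service or unused for longer than a chosen window. The sketch below is one possible shape: the `AgentCredential` record, the `credentials_to_revoke` helper, and the 30-day idle window are assumptions for the example, and a real audit would read this data from your identity provider or secrets manager.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class AgentCredential:
    """Hypothetical inventory record for one agent credential."""
    credential_id: str
    agent_id: str
    last_used: Optional[datetime]  # None if the credential was never used


def credentials_to_revoke(creds: Iterable[AgentCredential],
                          active_agents: Set[str],
                          now: datetime,
                          max_idle: timedelta = timedelta(days=30)) -> List[str]:
    """List credential IDs tied to decommissioned agents or idle too long."""
    stale = []
    for cred in creds:
        decommissioned = cred.agent_id not in active_agents
        idle = cred.last_used is None or now - cred.last_used > max_idle
        if decommissioned or idle:
            stale.append(cred.credential_id)
    return stale
```

The function only reports candidates. Keeping the actual revocation as a separate, approved step preserves the human-as-approver model described earlier.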
Patch management has to become smarter as well as faster. Teams that prioritize by real-world exploitation, enforce granular authorization policies, and minimize credential exposure will be better placed to respond. The question isn’t whether your processes will change, but whether you can adapt before the next zero-day becomes tomorrow’s breach.
As AI continues to evolve, security strategies must keep pace. The tools to close these gaps exist today; the only missing piece is action.
