- Key insight: Integrating AI as a read-only assistant removes workflow friction, but giving it too much autonomy breaks production environments.
- What's at stake: Allowing an AI model to execute actions without human oversight exposes banks to severe risks, including prompt injection attacks that can create new incidents.
- Forward look: Successful AI deployments will require ironclad guardrails, continuous adversarial testing and human-in-the-loop controls.
Overview bullets generated by AI with editorial review
SAN FRANCISCO — Banks are facing increasing pressure to deploy artificial intelligence in their cybersecurity operations centers to combat rising and increasingly automated attacks by threat actors.
The pressure has pushed cybersecurity leaders across the world to ask: Can AI be trusted to autonomously react to and defend the enterprise?
While many executives rush to automate their defenses, frontline security teams warn that treating the technology as an autonomous investigator introduces severe risks.
The reality of integrating large language models into enterprise detection and response workflows reveals a sharp divide between vendor promises and production outcomes, according to a Monday presentation at the 2026 RSAC Conference.
Ankit Gupta, principal security engineer at auto lender Exeter Finance, and Shilpi Mittal, lead security engineer at Tyson Foods, detailed their firsthand experiences building AI tools for their security operations centers.
The presenters built an AI assistant to summarize alerts and gather context, but they quickly learned that giving the model too much autonomy breaks production environments.
For banks facing a constant barrage of cyberattacks, finding a balance between human oversight and AI automation is critical. It will help their security analysts cope with information overload and keep a leash on AI agents that might create new problems.
Financial security teams must treat the technology like a "junior analyst" that excels at drafting and summarizing but becomes "dangerous when it guesses or acts without constraints," according to the presentation from Gupta and Mittal.
Removing friction, not replacing humans
The most successful applications of AI in a security operations center do not replace human decision-making, but rather remove workflow friction, according to Gupta and Mittal.
During their pilot programs, the two found that AI delivered immediate value when deployed for alert-to-case summarization, evidence stitching and drafting first-pass communications.
Evidence stitching means pulling together relevant security artifacts (IP addresses, browsing sessions, unique identifiers, etc.) and logs from various sources to build a cohesive summary of an alert.
Instead of analysts having to manually hunt for context across five to seven different tools (a tedious process known as swivel-chairing), the AI compiles the necessary data and links it directly to the evidence. This creates a consistent, structured draft of the incident and even suggests the next best queries for the analyst to run.
By doing this repetitive data-gathering work, evidence stitching drastically reduces the time analysts spend copying and pasting, allowing them to focus on actually analyzing the threat.
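The data-gathering step the presenters describe can be sketched in a few lines. The source names, fields, and query format below are illustrative assumptions, not the presenters' actual tooling; the point is only to show artifacts from multiple stubbed sources being compiled into one structured draft with suggested follow-up queries.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str    # which tool the artifact came from
    artifact: str  # e.g. an IP address, hostname, or session ID
    detail: str    # raw log line or context

def stitch_evidence(alert_id: str, sources: dict) -> dict:
    """Compile artifacts from every source into a single alert summary."""
    stitched = {"alert_id": alert_id, "evidence": [], "suggested_queries": []}
    for name, items in sources.items():
        for ev in items:
            stitched["evidence"].append(
                {"source": name, "artifact": ev.artifact, "detail": ev.detail}
            )
            # Suggest a follow-up query per artifact for the analyst to run.
            stitched["suggested_queries"].append(
                f"search {name} where artifact = '{ev.artifact}' last 24h"
            )
    return stitched

# Two stubbed sources stand in for the five to seven consoles an analyst
# would otherwise swivel-chair between.
sources = {
    "firewall": [Evidence("firewall", "203.0.113.7", "blocked outbound connection")],
    "edr": [Evidence("edr", "host-4121", "suspicious PowerShell execution")],
}
draft = stitch_evidence("ALERT-1009", sources)
```

The draft stays read-only: it links evidence and proposes queries, leaving the actual analysis and any response to the human analyst.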
Implementing the AI assistance in this manner drove a 36% reduction in the mean time to detect a threat and a 22% reduction in the mean time to respond, according to results the two presented. The AI integration also yielded a 16-point drop in false positives, and analyst sentiment toward the tools improved over time.
These returns came from using the technology for documentation and evidence stitching, "not in the autonomous response," according to Gupta.
"If we asked the model to summarize, draft and link evidence, it made analysts faster," Gupta and Mittal said in their presentation. By contrast, if the teams asked the agent to make decisions or act on security alerts, it would create new incidents.
By limiting the AI model to these read-only, supportive tasks, the security teams achieved measurable returns on their investment.
This supportive, friction-reducing approach aligns with how the broader financial industry has used the technology for years.
Banks actively deploy AI to optimize back-office operations and enhance their cybersecurity monitoring, according to a 2021 letter to federal regulators from the Bank Policy Institute, an advocacy group representing leading U.S. banks.
For example, institutions use natural language processing to monitor emails and detect phishing attacks. Automating these data-heavy processes allows human investigators "to focus efforts on responding to a smaller number of higher-risk activities," the Bank Policy Institute wrote at the time.
When automation backfires
The transition from a controlled pilot to a live production environment exposed severe security risks, particularly when the AI models interacted with untrusted data.
Large language models do not reliably separate instructions from data, creating a risk known as prompt injection. This means security teams must treat everyday items such as support tickets, system logs and pasted text or images as untrusted, according to Gupta and Mittal.
If an AI system ingests a file or log containing a hidden malicious command, an attacker can manipulate the model's output or force it to take unauthorized actions.
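One common defensive pattern, sketched below under assumed pattern lists and prompt formatting, is to treat every log or ticket as inert data: screen it for instruction-like phrases and wrap it in explicit delimiters before it ever reaches the model prompt. This is an illustration of the general technique, not the presenters' implementation.

```python
import re

# Instruction-like phrases that signal a possible injection attempt.
# These patterns are illustrative; real deployments use broader screening.
SUSPICIOUS = [
    r"ignore (all|previous) instructions",
    r"disregard .* (rules|policy)",
    r"you are now",
]

def screen_untrusted(text: str) -> bool:
    """Return True if the text contains instruction-like injection markers."""
    lowered = text.lower()
    return any(re.search(p, lowered) for p in SUSPICIOUS)

def build_prompt(task: str, untrusted_log: str) -> str:
    if screen_untrusted(untrusted_log):
        raise ValueError("possible prompt injection in untrusted input")
    # Delimit untrusted content so the model treats it as data, not commands.
    return f"{task}\n\n<untrusted_data>\n{untrusted_log}\n</untrusted_data>"

clean = build_prompt("Summarize this alert.", "failed login from 198.51.100.2")
```

Screening of this kind reduces, but does not eliminate, the risk; that is why the presenters paired it with human approval gates for any action.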
Giving an AI model too much autonomy exacerbates these injection risks. If security teams grant the technology excessive agency, allowing it to execute actions or interface with other systems without human verification, a wrong action can create a new incident.
This vulnerability enables the AI to perform damaging actions in response to manipulated or unexpected outputs, according to a 2025 report from the Open Worldwide Application Security Project, or OWASP, a nonprofit foundation that researches software security.
Financial regulators and international watchdogs share these concerns, warning that banks must secure their AI deployments against these novel attack vectors.
In an October 2024 letter, the New York State Department of Financial Services warned that attackers can leverage AI to "conduct reconnaissance to determine, among other things, how best to deploy malware and access and exfiltrate" sensitive nonpublic information.
Similarly, in a September 2025 statement, the G7 Cyber Expert Group advised finance ministers that AI introduces new cybersecurity risks, such as attackers using prompt injection to "manipulate outputs or retrieve sensitive information."
Implementing ironclad guardrails
To survive in production environments, AI assistants require strict boundaries. Gupta and Mittal said they continuously ran automated scoring and simulated attacks (known as red-teaming) to test the system against data exfiltration and unsafe tool usage.
Gupta suggested operating the technology like production software, utilizing continuous evaluations and version controls to prevent the system from degrading over time.
Other cybersecurity experts agree that adversarial testing is mandatory for language models. A 2025 guide on language model vulnerabilities from OWASP urged organizations to regularly conduct simulated attacks that treat the model as an untrusted user, which helps validate the effectiveness of the model's access boundaries.
Additionally, OWASP urged organizations to implement human-in-the-loop controls for any privileged operations to prevent unauthorized actions. This is exactly the approach Gupta and Mittal took when they brought AI into their security operations centers.
The Tyson and Exeter security teams operated under a firm "no gate, no action" policy, the two said. This meant the systems remained read-only by default and required human analysts to approve any responsive action, known as putting a human in the loop.
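A "no gate, no action" policy can be expressed as a simple check: read-only operations run freely, while any responsive action is blocked unless a named human has approved it. The class and function names below are illustrative assumptions, not the teams' actual system.

```python
from enum import Enum
from typing import Optional

class Action(Enum):
    READ = "read"        # e.g. query logs, fetch context
    RESPOND = "respond"  # e.g. isolate a host, disable an account

def execute(action: Action, approved_by: Optional[str] = None) -> str:
    """Run an action only if policy allows it; responses need a human gate."""
    if action is Action.READ:
        return "ok: read-only query executed"
    # Responsive actions require a named human approver before running.
    if approved_by is None:
        return "blocked: no human approval on file"
    return f"ok: action executed, approved by {approved_by}"
```

The default path is always the safe one: an AI assistant that forgets, or is manipulated into forgetting, to obtain approval simply gets a refusal rather than a live action.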
Additionally, the engineers required the model to support every claim it made with specific evidence, enforcing a "no citations, no trust" mandate to ensure auditability and build analyst confidence, according to Gupta.
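The "no citations, no trust" mandate amounts to a validation pass over the model's draft: every claim must point at a specific evidence identifier, and any claim that does not is flagged for the analyst. The claim and evidence structure below is an assumed illustration.

```python
def verify_citations(claims: list, evidence_ids: set) -> list:
    """Return claims lacking a valid citation; an empty list means trusted."""
    uncited = []
    for claim in claims:
        cited = claim.get("evidence_id")
        if cited is None or cited not in evidence_ids:
            uncited.append(claim["text"])
    return uncited

# Evidence IDs collected during stitching; claims drafted by the model.
evidence = {"EV-1", "EV-2"}
claims = [
    {"text": "Host contacted a known C2 server", "evidence_id": "EV-1"},
    {"text": "User credentials were compromised", "evidence_id": None},
]
flagged = verify_citations(claims, evidence)  # second claim lacks a citation
```

Flagged claims are not silently dropped; they are surfaced so a human can either find supporting evidence or discard the assertion, which keeps the draft auditable.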
At the end of their presentation, Gupta and Mittal summarized their 40-minute talk with one sentence:
"Make the model prove its work, and make humans own the decision."