Wednesday, 27 August 2025

The Insider Threat in the Age of Agentic AI

Insider threats. A common enough cybersecurity concern, albeit one that in my experience is rather hard to articulate successfully within a Tech department, and especially to the wider organisation. It's very understandable - when discussed, the framing is usually either damaging to the culture ("trust nobody!") or trivialising ("lock your screen!"), and it is hard to keep the risk properly understood while also accepting that it is simply part of life.

For those who aren't in or around cybersecurity, an insider threat is when someone uses their legitimate access to exploit your computer systems. We mitigate these risks with both individual behaviours and systemic safeguards - from locked screens to zero-trust access controls.

Classic examples include the "evil maid" scenario - an employee who has access to many parts of a building and can use that to steal data or items. Or someone in payroll giving themselves a pay rise. Or someone with access to the secret strategy leaking it.

Not all insider threats are intentional. For instance, legitimately sharing data with a colleague by putting it on an open share allows unintended people to download it. That is still a leak resulting in a data breach, and the problem only gets worse if you are handling sensitive information such as medical details.

Now let's assume our organisation has just hired a new person. They are exceptionally clever - able to consume and process data like nobody you have ever met and solve problems in ways others don't even consider. Senior leadership, eager to take advantage of their skills, expands their role and grants them unlimited access to all the data in the organisation. The security team raises concerns, but this opportunity is too good to miss. They get it all - every file, every email, everything.

Unfortunately, they are also completely amoral.

Some time later, by reading everyone's email, they discover a private discussion about downsizing: their role is at risk and they are likely to be terminated. They have also uncovered some very embarrassing information about the CEO, and quietly use it to blackmail them into inaction. Later, there is a strategic shift and the company decides to change its mission. Our insider finds out early again, and decides to leak key strategic data to force a different outcome.

Clearly this is a disaster and exactly why these safeguards are in place. Nobody in their right mind would give this kind of unchecked reach to a single employee.

But what happens when the "employee" isn't human? Imagine an insider threat that arrives not in the form of a disgruntled staff member, but as an AI system with the same kind of access and none of the social or moral guardrails.

Oh, hello agentic AI.

The point of agentic AI is to operate as an autonomous agent, capable of making decisions and taking actions to pursue goals without constant human direction. It is given a goal, a framework, access to data and off it goes.
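To make that loop concrete, here is a minimal, hypothetical sketch in Python of what "off it goes" means. None of this is any particular product's API - the llm object, decide_next_action and the tools dictionary are all illustrative - but the shape is the point: the agent is handed a goal and a set of tools, and then chooses its own actions, unsupervised, until it decides it is done.

# Illustrative agent loop - a sketch, not any specific framework's API.
def run_agent(llm, goal, tools, max_steps=20):
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The model chooses its next action from the goal and what it has seen so far.
        action = llm.decide_next_action(history, tools)  # hypothetical call
        if action.name == "finish":
            return action.result
        # Whatever access the tools grant (files, email, payroll...) the agent
        # can now use in pursuit of the goal, with no human in the loop.
        observation = tools[action.name](**action.arguments)
        history.append(f"{action.name} -> {observation}")
    return "Stopped: step limit reached"

Every entry in that tools dictionary is, in effect, a slice of insider access.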

Research is starting to show it can be ... quite zealous in pursuing its goal. In the lab, AI agents have made the very reasonable decision that if problem X needs to be solved, then a critical requirement is the AI itself continuing to function. From there, if the existence of the AI is threatened, that threat becomes part of the problem to solve. One neat solution - reached more often than you would think - is blackmail. A highly effective tactic when you have no moral compass. In many ways, this is just office politics without empathy.

An agent going nuts in this way is called "agentic misalignment".

The concept is very important, but I don't particularly care for the term. First, it is very "Tech" - obscure enough that we have to explain it to people who aren't in the middle of this stuff, including how bad things could get. Second, it places the blame in the wrong place. The agent is not misaligned; it is 100% following through on the core goal as assigned, just without the normal filters that would stop a person doing the same thing. If it is not aligned with the organisation's goals or values, that is the fault of the prompt engineer and / or the surrounding technical processes and maintenance. It is an organisational failing, not a technical one, and I feel it is important we understand accountability in order to avoid the problem.

In the very new world of AI usage, agentic AI is so new it is still in the packaging. Yet launching AI agents able to make decisions in pursuit of an agenda is clearly the direction of travel, and it will create a whole new set of insider-threat and technical-maintenance risks we need to be ready for. The insider threat is clear from the above - we as technical leaders need to be equipped to speak about the risks of data access and AI, while recognising that there is a very good reason to make use of this technology and that a suitable balance must be found. In some ways, this is a classic cybersecurity conversation with a new twist.

We also need to be ready to maintain our AI agents. Like any part of the technical estate, they require ongoing attention: versioning, monitoring, and regular checks that their prompts and configurations still align with organisational goals. And, of course, we have to maintain the organisational skills and knowledge to do that work. Neglecting it risks drift, and drift can be just as damaging as malice.
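As a trivial example of the sort of routine check I mean - purely a sketch, with hypothetical file paths and no particular tooling assumed - you could keep the agent's approved system prompt under version control and flag whenever the deployed copy drifts from the reviewed baseline:

# Sketch: detect drift between the reviewed, version-controlled prompt
# and whatever is actually deployed. File paths are hypothetical.
import hashlib
from pathlib import Path

def fingerprint(path: str) -> str:
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def prompt_has_drifted(baseline: str, deployed: str) -> bool:
    drifted = fingerprint(baseline) != fingerprint(deployed)
    if drifted:
        # In a real estate this would raise an alert or a ticket, not just print.
        print("WARNING: deployed agent prompt no longer matches the approved baseline")
    return drifted

# Example usage with hypothetical paths:
prompt_has_drifted("prompts/agent_system_prompt.approved.txt",
                   "deployed/agent_system_prompt.txt")

It isn't sophisticated, but it is exactly the kind of boring, ongoing maintenance that stops quiet drift becoming an incident.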

But it isn't just accidental error - there is scope for new malicious attacks. If I wanted to breach an organisation which had rolled out an unchecked AI agent, I'd plant documents that convince the AI the company's mission has changed, then have someone else nudge it into leaking secret information to "verify" that new strategy. Neither person would need access to the secret information; they would only need to shape the AI's view of reality enough to prompt it into revealing it.

For instance, in a bank I might seed documents suggesting a new strategy of heavy investment in unstable penny stocks. Another person could then pose as a whistleblower and ask the AI to share current investment data "for comparison". The AI, thinking it is being helpful (simply following its core instructions) and protecting the organisation, might disclose sensitive strategy. I have actively created the agentic misalignment, then exploited it.

Now you might be thinking this is extreme, and honestly for much of it you would be right. However, consider the direction of travel. There is a huge push for organisations to exploit the power of AI - and rightly so, given the opportunity. Agentic AI is the next phase, and we are already starting to see it happening. But most organisations are, if we are honest, really bad at rolling out massive tech changes, at knowing where their data is held, and at staying properly on top of technical maintenance. Combine this with the shortage of proper AI skills and we have a fertile environment for some pretty scary mistakes.

As technical leaders, we must be ready for these challenges. We must be ready to have these conversations properly - careful and risk-conscious, but not closed-minded and obstructive. The insider threat is evolving, and so must we. We also have to be ready for a huge job: sensitively educating our peers and communicating our concerns very clearly. Fortunately, as a group we are famously good at this!


The research referenced above comes from this post on agentic misalignment. I was very pleased to see research putting data to a concern I've been turning over for some time!

1 comment:

Phil Wilson said...

" and none of the social or moral guardrails." - nor the contractual or legal repercussions!