
When Attackers Use AI to Evade EDR: Hardening Build Agents Against Lateral Movement
Threat framing
On June 3, 2026, a report described a pattern defenders have been watching for a while: attackers are using AI tools to speed up Active Directory abuse and make EDR noise easier to manage. The key point is not that AI found a new bug class; it didn’t. The key point is that AI lowered the effort needed to chain reconnaissance, credential abuse, and log shaping across a Windows-heavy environment.
That difference matters.
A lot of teams hear “AI attack” and picture some novel exploit hidden inside the model. In practice, the risk is usually less dramatic and more useful to an attacker: AI helps them move faster through the parts of the kill chain that already worked. It can parse host data, draft scripts, compare paths, reshape commands to look less suspicious, and summarize trust relationships. None of that replaces access control. It just makes weak boundaries easier to pressure at scale.
If your build infrastructure can reach domain services, internal APIs, secret stores, or admin tooling, a compromised runner can become a pivot point. The report is basically saying attackers are already treating that pivot as a workflow problem, not just a payload problem.
Why build agents are a high-value pivot point
Build agents live in a strange spot. They are supposed to be temporary, automated, and low-friction. In practice, they often sit close to the keys.
They run code from pull requests and package builds. They fetch dependencies. They mount caches. They talk to artifact registries, internal package feeds, secret managers, and deployment APIs. On Windows-heavy shops, they may also sit on networks where domain controllers, file shares, and remote management endpoints are only a few hops away.
CI/CD trust is often broader than people think
The trust model around CI/CD tends to grow one permission at a time.
A team starts with one runner that can build code. Then it needs access to a private NuGet feed. Then it needs a token for staging deploys. Then it needs to sign artifacts. Then it needs to talk to a config service. Each change feels small on its own. Put them together and the runner now has enough reach to touch production-adjacent systems.
The usual problem is not the first credential. It is the pile of permissions around the job:
- environment variables with deploy tokens
- mounted files with signing keys or service credentials
- cache directories that persist across jobs
- network reachability to internal-only services
- build-time identity that can impersonate a broader automation account
If an attacker gets code execution on that runner, they are not just inside a container or VM. They are inside a trust bundle.
A compromised runner can reach identity, secrets, and internal services
From a defender’s point of view, a build agent is risky because it often has three kinds of access at once:
| Access type | Why it matters | Common failure mode |
|---|---|---|
| Identity | The runner can authenticate as a service account or with a federated token | Overprivileged automation account |
| Secrets | The job can read API keys, signing material, or deploy credentials | Secrets exposed in env vars, files, or logs |
| Network | The agent can talk to internal systems that are not exposed externally | Flat east-west access from a “temporary” host |
That combination is what makes lateral movement possible. The runner does not need to be domain-admin by design. It only needs to be close enough to something more valuable.
What the report is actually signaling about AI-assisted tradecraft
The useful signal in the report is not “AI can hack.” It is that AI can compress the operator workflow around known techniques.
AI as an operator accelerator, not a new exploit class
I would break AI use into three buckets:
- Parsing and planning: The model helps sort through logs, inventory data, and host telemetry. That can shorten reconnaissance.
- Code and script generation: The model helps draft small utilities, wrappers, or one-off automation around existing system tools.
- Message and log shaping: The model helps paraphrase commands, standardize naming, and reduce obvious signatures in operator notes or script comments.
That does not create a new bypass of EDR by itself. It does make it easier for a less skilled attacker to imitate patterns that would otherwise take time to learn.
The defender takeaway is simple: do not assume clumsy tradecraft is a reliable filter anymore.
How AI helps with AD reconnaissance, phrasing, and log evasion
Active Directory environments reward patience and context. Attackers want to answer questions like:
- Which machine is joined to the domain?
- Which service account has unusual rights?
- Which hosts can reach a domain controller?
- Which endpoints expose remote management?
- Which internal shares contain useful material?
AI helps because those questions often produce messy text output. It can summarize that output, turn it into a prioritized list, and suggest next steps.
The same thing applies to evasion. An operator can ask a model for alternate ways to describe an action, or to rework scripts so the process tree looks less obvious. That is not magic, but it lowers the time cost of trying variants until one blends into normal admin activity.
For defenders, this means the old assumption that “attackers will make obvious mistakes” is less dependable. You need telemetry that catches the behavior, not just the exact string.
A safe walkthrough of the attacker path from build agent to lateral movement
This is where I want to stay defensive and still be concrete. The path from runner compromise to lateral movement usually does not start with anything dramatic. It starts with a job that should never have had that much reach.
Initial foothold assumptions in CI/CD environments
In most real environments, the attacker foothold is one of these:
- a poisoned dependency or build step
- a malicious pull request that triggers a job
- a stolen token for the CI platform
- a vulnerable self-hosted runner
- an abused plugin or shared automation secret
The runner itself may be ephemeral, but the credentials around it are often not. That is where the chain begins.
A practical way to think about it is this: if a job can execute code, then every secret attached to that job is potentially reachable. The question is not whether the build is “trusted.” The question is what else the job can see.
Enumerating domain context, service accounts, and reachable systems
Once a runner is compromised, the useful defensive questions are:
- Is the host domain-joined?
- What identity is the process running under?
- Are there service-account credentials on disk or in memory?
- Which internal hosts respond from this subnet?
- Can the runner reach LDAP, SMB, WinRM, RDP, or Kerberos services?
- Are there cached artifacts, logs, or temp files with tokens?
A safe verification loop looks like this:
- Identify the job account and its privileges.
- Identify the network paths available to the runner.
- Identify secrets exposed to that job at runtime.
- Identify whether those secrets can access directory, storage, or deployment systems.
- Identify whether anything persists after the job ends.
Here is a defensive inspection pattern you can run from a PowerShell session on a controlled Windows runner to understand context without simulating abuse:
whoami /all
hostname
ipconfig /all
Get-ChildItem Env: | Sort-Object Name
Get-SmbConnection
Get-NetTCPConnection -State Established | Select-Object LocalAddress,LocalPort,RemoteAddress,RemotePort,OwningProcess
The value here is not the output itself. It is seeing how much a “temporary” agent can learn from its own runtime environment.
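The reachability question from the list above (can the runner reach LDAP, SMB, WinRM, RDP, or Kerberos?) can also be checked with a plain TCP connect. The following is a minimal Python sketch, not a scanner: it uses the well-known default ports for those services, and the host name in the usage note is a placeholder. Run it only against systems you own, from a runner you control.

```python
import socket

# Well-known default TCP ports for the services named in the checklist.
ADMIN_PORTS = {
    "Kerberos": 88,
    "LDAP": 389,
    "SMB": 445,
    "LDAPS": 636,
    "RDP": 3389,
    "WinRM-HTTP": 5985,
    "WinRM-HTTPS": 5986,
}


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, unresolvable, or timed out: all count as unreachable.
        return False


def reachable_services(host, ports=ADMIN_PORTS, probe=can_connect):
    """Return the sorted names of services on host that accept a TCP connection."""
    return sorted(name for name, port in ports.items() if probe(host, port))
```

Calling `reachable_services("dc01.example.internal")` from a build runner should ideally return an empty list. Anything else is a network path worth justifying. The `probe` parameter exists so the logic can be exercised without touching the network.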
Where lateral movement usually starts to succeed
Lateral movement tends to work at the first boundary where the runner can talk to something privileged with a reusable credential.
Common examples:
- a service account that can authenticate to multiple internal systems
- a deployment token that can trigger privileged jobs
- a host-based management tool that trusts the runner subnet
- cached admin credentials left behind by maintenance
- an internal API that accepts bearer tokens from the automation plane
When that happens, the attacker does not need a perfect exploit. They need a path that looks normal to the infrastructure.
That is why build agents are such good pivots. They often live in the gray area between developer convenience and production trust.
EDR evasion patterns defenders should expect on build infrastructure
The report’s mention of EDR evasion is worth taking seriously, but not because EDR is useless. It is because build hosts are often noisy in ways that make naive detection harder.
Process shaping, living-off-the-land binaries, and reduced noise
Attackers know endpoint tools look for unusual parent-child chains, weird command lines, and new binaries dropped into temp locations. On a build agent, many legitimate tasks already resemble admin activity:
- script hosts launching compilers
- package managers spawning shell helpers
- archive tools extracting content
- system utilities touching registry, service, or network settings
That normal noise creates cover.
The evasion pattern is usually not “turn off the EDR.” It is more subtle:
- use built-in binaries instead of dropping obvious tools
- borrow the same execution style as the platform’s own automation
- keep commands short and modular
- avoid high-volume failures that trigger attention
- reuse paths and tools that already appear in build logs
Defenders should watch for when that pattern crosses from build behavior into operator behavior. A compiler invoking a shell is not always suspicious. A compiler invoking a shell that starts enumerating domain services is.
Scripted abuse of legitimate admin tooling and remote management paths
Remote management tools are a major concern because they are designed to be trusted. If the build network can reach them, an attacker does not need to invent a transport.
The risk categories are familiar:
- PowerShell remoting where it should not exist
- SMB access to administrative shares from an untrusted runner
- WinRM exposure across a broad subnet
- scheduled tasks or service creation used as lateral movement mechanics
- directory queries from a host that should not need them
The point is not the tool itself. The point is the trust boundary around it.
A runner that can use a legitimate management path can look almost invisible unless you correlate process context with identity and network telemetry.
Why traditional endpoint detections miss runner-specific behavior
Traditional EDR detections often assume an employee workstation or a server with a stable role. Build agents break that model.
They tend to have:
- high process churn
- frequent archive and extraction activity
- scripts from multiple languages
- transient binaries and temp paths
- service accounts whose misuse is hard to distinguish from normal automation
That means simple rules like “script host plus network activity” are too broad. But if you ignore those hosts entirely, you create a blind spot.
The better model is to baseline by runner role:
- what executables should appear
- what command patterns are normal
- which destinations are expected
- which secrets are allowed at runtime
- how long the host should live
If a job runner is making domain queries at 2 a.m. from a subnet that only needs artifact access, you want that to light up.
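A role baseline like the one described above can be expressed as data and compared against what a runner actually did. This is a rough sketch under assumed inputs: the `RunnerBaseline` type, the role name, the executable names, and the destinations are all illustrative, and real telemetry would need normalization first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerBaseline:
    """Expected behavior for one runner role (all values are illustrative)."""
    executables: frozenset    # lowercase executable names expected on this role
    destinations: frozenset   # "host:port" strings the role is expected to reach
    max_age_hours: float      # how long a host of this role should live


def baseline_deviations(baseline, observed_executables, observed_destinations, age_hours):
    """Return human-readable deviations from the role baseline."""
    findings = []
    seen_exes = {exe.lower() for exe in observed_executables}
    for exe in sorted(seen_exes - baseline.executables):
        findings.append(f"unexpected executable: {exe}")
    for dest in sorted(set(observed_destinations) - baseline.destinations):
        findings.append(f"unexpected destination: {dest}")
    if age_hours > baseline.max_age_hours:
        findings.append("host has outlived its expected lifetime")
    return findings


# Hypothetical baseline for a .NET build role.
dotnet_build = RunnerBaseline(
    executables=frozenset({"dotnet.exe", "git.exe", "nuget.exe"}),
    destinations=frozenset({"artifacts.example.internal:443", "git.example.internal:443"}),
    max_age_hours=2,
)
```

The design choice here is to keep the baseline per role rather than per host, so a freshly created runner is judged against what its job class should do, not against its own short history.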
Telemetry that catches this chain early
Good detections here are cross-domain. Host-only data is not enough. Identity-only data is not enough. Network-only data is not enough. You need the chain.
Host signals: process trees, parent-child anomalies, and suspicious execution context
Start with the process tree. On build systems, the most useful questions are:
- Did a build orchestrator spawn an unexpected shell?
- Did a shell spawn directory tools or remote management utilities?
- Did a script run from a temp directory, cache path, or artifact folder?
- Did a known build binary suddenly start launching reconnaissance commands?
A useful heuristic table:
| Signal | Why it matters | Example concern |
|---|---|---|
| Unusual parent-child chain | Indicates execution outside normal build flow | Compiler spawning interactive shell |
| Script from temp/cache path | Often used for transient staging | Job leaves behind live script in cache |
| New binary in runner workspace | Suggests dropped tooling | Unsigned executable in build folder |
| High entropy or renamed files | Can hide staging artifacts | Randomized filenames in job workspace |
| Unexpected interactive context | Rare in automated jobs | Session-like behavior from service account |
You do not need every signal to fire. One strong anomaly plus matching identity and network context is enough.
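The "one strong anomaly plus matching identity and network context" rule can be sketched as a small correlation step. The `Signal` record and its fields are assumptions made for illustration; they stand in for whatever your EDR, identity, and network tooling emit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    runner: str           # runner host name
    domain: str           # "host", "identity", or "network"
    name: str             # short label for the detection that fired
    strong: bool = False  # True for high-confidence anomalies


def runners_to_escalate(signals):
    """Return runners with a strong host anomaly backed by identity and network signals."""
    by_runner = {}
    for signal in signals:
        by_runner.setdefault(signal.runner, []).append(signal)

    flagged = []
    for runner, items in by_runner.items():
        strong_host = any(s.domain == "host" and s.strong for s in items)
        domains = {s.domain for s in items}
        # Require corroboration from both other telemetry domains.
        if strong_host and {"identity", "network"} <= domains:
            flagged.append(runner)
    return sorted(flagged)
```

A runner with only a strong host signal stays in the review queue under this rule. It escalates once identity and network telemetry both agree that something left the normal build flow.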
Identity signals: unusual token use, service account misuse, and privilege jumps
Identity is where a lot of build-agent incidents become obvious in hindsight.
Look for:
- a service account authenticating to systems it never normally touches
- token use outside normal job windows
- a runner account showing privilege changes mid-job
- failed authentication bursts followed by success
- multiple hosts using the same credential material
If your CI/CD platform supports workload identity or short-lived tokens, that should reduce the blast radius. If you still see long-lived credentials on the runner, treat that as a red flag.
A simple rule of thumb: automation credentials should authenticate to automation endpoints. If they start behaving like an operator account, something is off.
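Two of the identity signals above lend themselves to simple checks: an account authenticating to a system it has never touched, and a burst of failures followed by a success. The sketch below assumes authentication events have already been parsed into dictionaries with `account`, `target`, `time`, and `success` keys. Those field names, the threshold, and the window are illustrative, not taken from any specific log source.

```python
from datetime import datetime, timedelta


def first_seen_targets(history, events):
    """Return (account, target) pairs in events that never appear in history."""
    known = set(history)
    return sorted({(e["account"], e["target"]) for e in events} - known)


def failure_burst_then_success(events, threshold=5, window=timedelta(minutes=10)):
    """Return accounts with at least `threshold` failures in `window` before a success."""
    failures = {}
    flagged = set()
    for event in sorted(events, key=lambda e: e["time"]):
        account = event["account"]
        if event["success"]:
            recent = [t for t in failures.get(account, []) if event["time"] - t <= window]
            if len(recent) >= threshold:
                flagged.add(account)
        else:
            failures.setdefault(account, []).append(event["time"])
    return sorted(flagged)


# Example: five failures in five minutes, then a success, for a hypothetical account.
start = datetime(2026, 6, 3, 2, 0)
sample = [{"account": "svc-build", "time": start + timedelta(minutes=i), "success": False}
          for i in range(5)]
sample.append({"account": "svc-build", "time": start + timedelta(minutes=5), "success": True})
print(failure_burst_then_success(sample))
```

Thresholds like these need tuning per environment. The point is that both checks key on behavior of the credential, which holds up even when the commands behind it have been reshaped.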
Network signals: unexpected east-west traffic, LDAP, SMB, WinRM, and Kerberos patterns
Network telemetry often catches the pivot before endpoint telemetry does.
Watch for:
- a build runner reaching LDAP or Kerberos endpoints it does not normally need
- SMB connections from build subnets to administrative shares
- WinRM or RDP from a runner network segment
- unusual east-west connections to file servers, domain controllers, or management hosts
- service accounts making requests to multiple internal systems in a short window
You can think about it in terms of expected job traffic versus suspicious internal reconnaissance:
| Normal runner traffic | Suspicious runner traffic |
|---|---|
| artifact registry | domain controller queries |
| dependency feed | SMB to admin share |
| package mirror | WinRM to server subnet |
| secret manager | bursts of Kerberos requests for many hosts |
| deployment API | repeated LDAP queries |
If a build host suddenly acts like a workstation doing admin discovery, that deserves attention.
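The normal-versus-suspicious split in the table can be approximated from flow records. This is a minimal sketch assuming flows are available as `(source, destination host, destination port)` tuples and that you maintain a set of expected destinations per runner pool. The port-to-protocol map uses well-known defaults, and the fan-out threshold is an arbitrary starting point.

```python
from collections import defaultdict

# Well-known default ports for directory, file-sharing, and remote-management traffic.
ADMIN_PROTOCOLS = {
    88: "Kerberos",
    389: "LDAP",
    636: "LDAPS",
    445: "SMB",
    3389: "RDP",
    5985: "WinRM",
    5986: "WinRM",
}


def classify_flow(dest_host, dest_port, expected):
    """Label a flow 'expected', 'suspicious:<protocol>', or 'review'."""
    if (dest_host, dest_port) in expected:
        return "expected"
    if dest_port in ADMIN_PROTOCOLS:
        return f"suspicious:{ADMIN_PROTOCOLS[dest_port]}"
    return "review"


def admin_fan_out(flows, min_hosts=3):
    """Return sources that touched at least min_hosts distinct hosts on admin ports."""
    targets = defaultdict(set)
    for source, dest_host, dest_port in flows:
        if dest_port in ADMIN_PROTOCOLS:
            targets[source].add(dest_host)
    return {src: len(hosts) for src, hosts in targets.items() if len(hosts) >= min_hosts}
```

`classify_flow` handles the single-connection case, while `admin_fan_out` covers the "many hosts in a short window" pattern, provided the caller passes in flows from one time window.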
Hardening build agents to resist AD pivoting
This is where most of the risk reduction happens. The goal is not to make runners impossible to compromise. The goal is to make compromise non-pivotal.
Make runners ephemeral, isolated, and narrowly scoped
Ephemeral is good, but only if it is real.
A runner should be:
- recreated from a clean image
- isolated from other runners
- scoped to one job class or trust tier
- unable to persist local state across builds
- blocked from interacting with sensitive internal networks by default
If the host is long-lived, then the “temporary” assumption is fake. Once a runner becomes a persistent server, treat it like one.
Separate build identities from domain privileges
Do not let build identity drift into directory privilege.
Prefer:
- short-lived tokens over reusable secrets
- workload identity over static passwords
- separate identities per repo, pipeline, or environment
- distinct accounts for build, deploy, and sign
- no membership in broad domain groups unless absolutely required
If a runner must authenticate to internal services, limit exactly which services and actions are allowed. The fewer cross-domain permissions a build account has, the less valuable it becomes after compromise.
Lock down secrets, tokens, and machine credentials
A runner usually fails at the secret boundary before it fails anywhere else.
Good controls include:
- inject secrets only for the specific job that needs them
- redact secrets from logs and crash dumps
- avoid placing credentials in globally readable env vars
- prevent secrets from being written to workspace or cache paths
- rotate any secret that has to be mounted into a job
A wildcard EDR exclusion on a build workspace is usually a gift to an attacker. Exclude only the exact path or process you can justify, and review it often.
Also pay attention to machine credentials. If the runner image or host is domain-joined, any cached ticket, token, or machine secret increases the blast radius. A compromised host should not be able to impersonate the environment around it.
Use egress control, segmentation, and allowlists to shrink movement options
If a runner can only talk to the endpoints it needs, lateral movement gets harder fast.
The safest model is deny-by-default:
- allow artifact registry access
- allow source control access
- allow secret manager access
- allow deployment API access only when needed
- deny SMB, LDAP, WinRM, RDP, and general east-west by default
If some internal traffic is required, build an allowlist tied to the job role, not the whole subnet. Segmentation should reflect function, not convenience.
The practical effect is that a compromised build agent can still fail a build, but it cannot freely scan or bounce into the domain.
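The deny-by-default model with a role-scoped allowlist can be sketched as a small policy function. The roles, addresses, and ports below are hypothetical placeholders. In practice the same rules would live in a firewall or network policy, and a function like this is mainly useful for reviewing or testing that policy.

```python
import ipaddress

# Hypothetical allowlists per job role: (destination network, TCP port).
ALLOWLIST = {
    "build": [("10.20.0.10/32", 443), ("10.20.0.11/32", 443)],
    "deploy": [("10.20.0.10/32", 443), ("10.30.5.0/28", 443)],
}


def egress_allowed(role, dest_ip, dest_port, allowlist=ALLOWLIST):
    """Deny by default; allow only if a rule for this role matches IP and port."""
    address = ipaddress.ip_address(dest_ip)
    for network, port in allowlist.get(role, []):
        if port == dest_port and address in ipaddress.ip_network(network):
            return True
    return False
```

Because the lookup is keyed on role, a build job cannot inherit the deploy role's reach just by sharing a subnet with it, and an unknown role gets nothing.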
Audit EDR exclusions and remove runner-specific blind spots
I see this mistake a lot: a team gives build agents broad EDR exclusions because they are “too noisy,” then treats the exception as permanent.
That is risky for two reasons:
- it creates a direct blind spot on the exact hosts that execute untrusted code
- it trains attackers to look for the same exclusion pattern across environments
Review:
- excluded paths
- excluded extensions
- excluded hashes
- excluded parent processes
- runner-specific policy exceptions
If an exclusion is necessary, make it as narrow as possible and tie it to a documented business need. If you cannot explain why a runner needs it, remove it.
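A first pass over path exclusions can be automated. The sketch below assumes you can export exclusions as a list of Windows path strings, and it applies two crude heuristics: any wildcard is flagged, and so is any path with fewer than a chosen number of components. The example paths and the `min_parts` default are illustrative. Hash, extension, and process exclusions would need their own checks.

```python
from pathlib import PureWindowsPath


def risky_exclusions(exclusions, min_parts=4):
    """Return (exclusion, reason) pairs for path exclusions that look too broad."""
    findings = []
    for raw in exclusions:
        if "*" in raw or "?" in raw:
            findings.append((raw, "contains a wildcard"))
            continue
        # parts for C:\agent\_work is ('C:\\', 'agent', '_work'): three components.
        if len(PureWindowsPath(raw).parts) < min_parts:
            findings.append((raw, "covers a shallow directory"))
    return findings
```

For example, `risky_exclusions([r"C:\agent\_work\*", r"C:\agent"])` flags both entries, while a single tool path several levels deep passes. Anything flagged still needs a human decision, since the heuristic only measures breadth, not justification.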
Verification steps you can run without simulating abuse
You do not need a live attack simulation to learn whether your CI/CD boundary is leaky.
Map every trust relationship from runner to internal network
Start with a simple inventory:
- what image does the runner boot from?
- what identity does it run under?
- what secrets are mounted into jobs?
- what internal subnets can it reach?
- what admin or deployment endpoints are reachable?
- what is left behind after job completion?
If you want a working review template, this is enough to begin:
| Question | Evidence to collect | Why it matters |
|---|---|---|
| Does the runner need domain access? | network policy, auth logs | reduces directory pivot risk |
| Which secrets are available at runtime? | pipeline config, vault policy | shows blast radius of job compromise |
| Can the runner talk to admin ports? | firewall and connection logs | reveals lateral movement paths |
| Do logs contain credentials? | job logs, artifact review | catches accidental secret leakage |
| Does state persist across jobs? | disk, cache, container layers | determines if compromise survives |
Check whether a low-privilege job can reach admin-only resources
Pick one low-privilege job and verify that it cannot reach anything beyond its purpose.
That means confirming it cannot:
- open admin shares
- query directory services beyond what it needs
- hit remote management endpoints
- authenticate to unrelated internal systems
- access production-only secret paths
If your controls depend on policy documents but the network still allows the traffic, the policy is fiction.
Review whether build logs, caches, or artifacts leak usable credentials
This is one of the cheapest checks you can run, and one of the most common misses.
Look for:
- access tokens in logs
- private keys in artifacts
- stale environment dumps
- dependency caches containing config files
- debug bundles with secrets or session material
The report’s underlying theme is automation abuse. Build artifacts are often where automation leaves its fingerprints. If those fingerprints include secrets, you have already lost part of the fight.
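A basic version of this check is a pattern scan over job logs and text artifacts. The following is a minimal sketch with three illustrative patterns: a PEM private key header, the documented `AKIA` prefix format of AWS access key IDs, and a generic `name=value` credential assignment. It will produce false positives and miss many formats, so treat it as a starting point alongside a dedicated secret scanner.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
PATTERNS = {
    "private key block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "credential assignment": re.compile(
        r"(?:password|passwd|secret|token|api[_-]?key)\s*[=:]\s*\S+",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, pattern_name) for each pattern match, one per line and pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

The function reports line numbers and pattern names rather than the matched text, so the scan output does not become another place where the secret is stored.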
Incident response priorities if a runner is compromised
If a runner is compromised, treat it as both a host incident and an identity incident.
Contain the agent, rotate credentials, and invalidate session material
First priority: limit the blast radius.
- remove the runner from service
- isolate the host or scale the pool down
- rotate any secrets that were available to the job
- invalidate tokens, sessions, and certificates associated with the runner
- revoke access to internal tools used by that automation identity
Do not wait for perfect attribution before rotating secrets. If the runner could read them, assume they are exposed.
Preserve job logs, process data, and directory authentication evidence
You want three evidence buckets:
- job and pipeline logs
- process and host telemetry
- identity and directory logs
Preserve:
- command history
- runner logs
- process trees
- authentication events
- unusual network connections
- artifacts from the job workspace and cache
If the attacker used AI-assisted tooling, the interesting artifacts may be in how quickly the environment was enumerated, what scripts were generated, and which identities were touched. That shows up in telemetry more than in a single payload.
Scope for domain tampering, secret exfiltration, and persistence
Once the host is contained, ask three questions:
- Did the runner touch domain services or modify directory objects?
- Did it read or export any secret material?
- Did it create a persistence mechanism outside the runner lifecycle?
That scope should include:
- new accounts or group changes
- ticket or token misuse
- unexpected access to file shares or secret stores
- scheduled tasks, services, or startup changes
- modified pipeline definitions or supply-chain artifacts
A compromised runner can become a launch point for a bigger incident if you only clean up the host and ignore the identity trail.
Closing the gap between CI/CD convenience and domain safety
The report about AI-assisted AD attacks and EDR evasion is worth reading as a warning about workflow, not just tooling. Attackers are using AI to compress the steps between initial access and lateral movement. That makes build agents even more sensitive, because they already sit close to secrets, identity, and internal reach.
If you want a practical defense strategy, keep it simple:
- assume runners will be probed
- make runner access narrow and disposable
- separate build identity from privileged identity
- block east-west movement by default
- review exclusions as if they were temporary exceptions, not permanent policy
- verify with logs, not assumptions
The convenience of CI/CD is real. So is the risk when that convenience bleeds into the domain. The job is not to make build agents invulnerable. The job is to make them boring after compromise.