Testing for vm2 Sandbox Bypasses: What to Look for in User-Run Script Features

Why vm2 keeps showing up in user-run script features

I keep seeing the same pattern in product reviews: a SaaS app wants flexible user logic, and someone reaches for vm2 because it feels like the safest way to run it.

That choice shows up in:

custom formulas in business apps
workflow builders
plugin runtimes
chatbot tools
AI agent actions
online code runners
“safe” automation hooks

The problem is not the idea. The problem is that an in-process JavaScript sandbox is a thin boundary, and teams often treat it like a hard security wall. Recent vm2 disclosures, including CVE-2026-26956 and other critical flaws reported around May 7, 2026, are a reminder that this boundary can fail in ways that matter.

When that happens, the attacker stops being “a user running a script” and starts executing code inside the host process.

What a sandbox escape means in a SaaS app

A sandbox escape is not just a strange exception or a broken script. In a SaaS context, it can mean the attacker crosses from untrusted code into the process hosting your app logic.

That is where the impact starts.

If the host process can reach:

environment variables
API keys
filesystem paths
internal service credentials
cloud metadata endpoints
private network routes
queue workers or job tokens

then a sandbox escape can become a full application compromise, not just a feature bug.

⚠️ If the sandbox process can read production secrets or call internal services, treat any plausible escape as high severity until proven otherwise.

Where JavaScript sandboxing gets fragile

Exceptions and error objects crossing the boundary

A lot of sandbox bugs begin with “safe” data that is not really safe. Error objects are awkward because they cross trust boundaries and can carry surprising behavior.

If the host inspects thrown values, stringifies them, or logs them with assumptions about their shape, it may end up calling methods or reading properties that the attacker controls. A strong review asks one question first: what exactly leaves the sandbox, and who touches it next?
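
One conservative way to handle that question is to have the host record only what it can learn about a thrown value without reading its properties. The sketch below is an illustration under that assumption, not a vm2 feature; `describeThrown` and the 200-character cap are hypothetical choices.

```javascript
// Hypothetical helper: reduce a value thrown by untrusted code to inert data
// WITHOUT reading any of its properties or calling any of its methods.
function describeThrown(thrown) {
  const kind = thrown === null ? 'null' : typeof thrown;
  // Objects and functions may be proxies or carry getters, so the host
  // records only that something was thrown and never inspects it further.
  if (kind === 'object' || kind === 'function') {
    return { kind, detail: null };
  }
  // Symbols cannot be concatenated or templated safely; record presence only.
  if (kind === 'symbol') {
    return { kind, detail: null };
  }
  // Remaining primitives carry no attacker-defined behavior of their own;
  // cap the length before the text reaches a log line.
  return { kind, detail: String(thrown).slice(0, 200) };
}

// Usage: a thrown object with a hostile getter is never touched.
const hostile = { get message() { throw new Error('getter ran in host'); } };
console.log(describeThrown(hostile));       // { kind: 'object', detail: null }
console.log(describeThrown('bad formula')); // { kind: 'string', detail: 'bad formula' }
```

The trade-off is that the host loses the error message for object errors. If that message matters, one option is to convert it to a string on the untrusted side, so that only a primitive crosses the boundary.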

Prototypes, constructors, and host object leaks

JavaScript has a deep object model, and sandbox code will probe it.

The usual pressure points are:

prototype chains
constructors
constructor.constructor
leaked references to host objects
objects that look inert but still inherit powerful behavior

I usually test these paths by checking whether the sandbox can observe host-created objects, then verifying whether it can climb from a plain object to something that should have stayed unreachable. The exact trick changes by library and runtime version, but the review goal stays the same: find out whether object identity is enforced or merely assumed.
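
The identity check described above can be sketched with Node's built-in `node:vm` module. This is deliberately not vm2: `node:vm` is documented by Node.js as not being a security mechanism, and vm2 wraps host objects differently, so results against vm2 will not match. The sketch only shows the shape of the question, and `leaksHostFunction` is a hypothetical name. It compares identities and never calls what it reaches.

```javascript
const vm = require('node:vm');

// Expression evaluated inside the context: try to climb from an exposed
// value to the constructor of its constructor.
const CLIMB = 'probe?.constructor?.constructor';

// Returns true when code inside the context can reach the HOST realm's
// Function constructor starting from the exposed value.
function leaksHostFunction(exposedValue) {
  const reached = vm.runInNewContext(CLIMB, { probe: exposedValue });
  return reached === Function;
}

// A plain host object carries the host's prototype chain with it.
console.log(leaksHostFunction({}));                  // true
// A null-prototype object has no constructor to climb from.
console.log(leaksHostFunction(Object.create(null))); // false
```

A `false` here only means this one path is closed for this one value. It says nothing about other properties or other exposed objects.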

Engine features and non-obvious execution paths

Sandbox code does not only run through obvious eval-style paths. It can also hit runtime features that are easy to miss in a product review:

WebAssembly
dynamic import behavior
asynchronous callbacks
exception serialization
proxy traps
getter/setter side effects

The mistake is thinking “we only allow JavaScript expressions” means the attack surface is small. In practice, any feature that executes attacker-controlled code or exposes host state can become a boundary crossing.
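
To make the proxy-trap and getter items concrete, here is a minimal sketch in plain JavaScript with no sandbox library involved. It assumes the host serializes a value returned by untrusted code; the `touched` array and `looksInert` name are illustrative only.

```javascript
// Property names the host touched on a value it believed was plain data.
const touched = [];

// Stand-in for a value handed back by untrusted code. Every property read
// runs the `get` trap, which would be attacker-written code in a real case.
const looksInert = new Proxy({}, {
  get(target, key) {
    touched.push(String(key));
    return undefined;
  },
});

// A routine host operation: serialize the "result" for storage or logging.
const serialized = JSON.stringify(looksInert);

console.log(serialized);                 // {}
console.log(touched.includes('toJSON')); // true
```

The serialized output looks empty and harmless, yet the trap ran inside the host's call to `JSON.stringify`. The same applies to logging, string interpolation, and equality helpers that read properties.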

What to check if your product uses vm2 or a similar sandbox

Inventory every user-controlled execution surface

Start with a boring inventory. It usually finds the real risk.

Look for:

custom scripts
formulas
workflow steps
plugin manifests
agent tool code
“advanced” automation rules
templating engines that allow expressions
admin-only script runners that later get exposed to tenants

If the feature accepts user input and executes it as code, put it on the list.

Verify which host capabilities are reachable

Then ask what the script can reach if the sandbox fails in a small way.

Check for access to:

process
require
filesystem modules
outbound network calls
timers and background jobs
host globals
serialization helpers
app-specific helper functions that wrap privileged actions

A good test is to map the boundary in layers:

Layer	Question	Why it matters
Sandbox API	What objects are exposed?	Leaked helpers often become the first pivot
Host process	Can code touch Node internals?	This is where RCE becomes real
Runtime env	Can secrets be read?	Env vars are often the fastest win
Network	Can the code call internal services?	Internal reach can be worse than disk access
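
A first pass over the "Sandbox API" and "Host process" layers can be automated by asking, from inside the execution context, which names resolve at all. The sketch below uses Node's built-in `node:vm` as a stand-in for whatever sandbox the product uses; `probeReachable` and the `NAMES` list are assumptions for illustration, and the list should be extended with app-specific helpers.

```javascript
const vm = require('node:vm');

// Names worth checking; add any helpers your sandbox deliberately exposes.
const NAMES = ['process', 'require', 'fetch', 'setTimeout', 'WebAssembly', 'Proxy'];

// Reports what `typeof` sees for each name from INSIDE a context built from
// `exposed`. Anything other than 'undefined' is reachable surface to review.
function probeReachable(exposed) {
  const context = vm.createContext({ ...exposed });
  const report = {};
  for (const name of NAMES) {
    report[name] = vm.runInContext(`typeof ${name}`, context);
  }
  return report;
}

// Nothing exposed: only language built-ins should resolve.
console.log(probeReachable({}));
// Exposing a host object makes it reachable by name.
console.log(probeReachable({ process }).process); // object
```

A name that reports `'undefined'` is not proof that the capability is unreachable, since it may still be reachable indirectly through another exposed object.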

Test resource limits and process boundaries

Even if escape does not happen, resource abuse still matters.

Test whether one script can:

consume CPU indefinitely
allocate memory until the worker dies
keep the event loop busy with pending work
trigger retries that multiply load
block a shared execution pool

If all user scripts run in the same process as the app, denial of service is not a side issue. It is part of the trust boundary.
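
The CPU case is the easiest to test. As a rough sketch, Node's built-in `node:vm` accepts a `timeout` option that interrupts synchronous execution; the example uses it as a stand-in for the product's sandbox. `runWithTimeLimit` and the 50 ms limit are illustrative assumptions, not recommendations.

```javascript
const vm = require('node:vm');

// Illustrative limit only; pick a value that fits the feature.
const TIME_LIMIT_MS = 50;

// Runs `code` in a fresh context and reports whether it finished.
function runWithTimeLimit(code) {
  const started = Date.now();
  try {
    vm.runInNewContext(code, {}, { timeout: TIME_LIMIT_MS });
    return { completed: true, elapsedMs: Date.now() - started };
  } catch {
    // The thrown value is deliberately not inspected: it may come from the
    // untrusted script rather than from the timeout itself.
    return { completed: false, elapsedMs: Date.now() - started };
  }
}

console.log(runWithTimeLimit('1 + 1').completed);           // true
console.log(runWithTimeLimit('while (true) {}').completed); // false
```

This bounds only the synchronous part of one call. It does not cap memory, and the Node.js documentation notes that promise callbacks scheduled by the script can run after the call returns, so the other items in the list still need a process-level limit.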

Safer isolation patterns for untrusted code

Separate processes, containers, and microVMs

If the user can run code, prefer true isolation over in-process sandboxing.

Better patterns include:

separate worker processes
locked-down containers
microVMs for stronger workload separation
strict syscall filtering where applicable

The goal is to move from “library boundary” to “operating system boundary.”
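
As a minimal sketch of the first item, the snippet below runs code in a separate Node process with an empty environment and a kill timeout, using `spawnSync` from `node:child_process`. The name `runInChild`, the limits, and `FAKE_SECRET` are assumptions for illustration. On its own this is not a sandbox: the child still runs as the same OS user with filesystem and network access, which is what the container, microVM, and syscall-filter layers are for.

```javascript
const { spawnSync } = require('node:child_process');

// Illustrative limits, not recommendations.
const CHILD_TIMEOUT_MS = 5000;
const MAX_OUTPUT_BYTES = 64 * 1024;

// Runs `code` in a separate Node process that inherits no environment.
function runInChild(code) {
  const result = spawnSync(process.execPath, ['-e', code], {
    env: {},                   // none of the parent's variables are passed on
    timeout: CHILD_TIMEOUT_MS, // the child is killed if it runs too long
    maxBuffer: MAX_OUTPUT_BYTES,
    encoding: 'utf8',
  });
  return { exitedCleanly: result.status === 0, stdout: result.stdout };
}

// A variable set in the parent is not visible to the child.
process.env.FAKE_SECRET = 'parent-only';
const out = runInChild('console.log(typeof process.env.FAKE_SECRET)');
console.log(out.stdout.trim()); // undefined
```

The design point is that the parent decides what the child receives, instead of relying on the child's code being unable to look.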

Egress controls, short-lived credentials, and secret hygiene

Do not put production secrets in the execution process if you can avoid it. If the sandbox needs credentials, use short-lived, narrowly scoped tokens.

Also:

block unnecessary egress
deny access to metadata services
isolate per-tenant credentials
avoid shared high-value API keys in worker env vars
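
The last item can be checked mechanically. The sketch below flags environment variable names that look sensitive; the regular expression encodes an assumed naming convention and the variable names in the sample are made up, so adjust both to your own environment.

```javascript
// Assumed naming convention for sensitive variables; adjust to your own.
const SECRET_LIKE = /(SECRET|TOKEN|PASSWORD|PASSWD|API_?KEY|PRIVATE_?KEY|CREDENTIAL)/i;

// Returns the names (never the values) of variables that look sensitive.
function secretLikeEnvNames(env) {
  return Object.keys(env).filter((name) => SECRET_LIKE.test(name)).sort();
}

// Made-up sample environment for illustration.
const sample = {
  PATH: '/usr/bin',
  BILLING_API_KEY: 'x',
  DB_PASSWORD: 'y',
  NODE_ENV: 'production',
};
console.log(secretLikeEnvNames(sample)); // [ 'BILLING_API_KEY', 'DB_PASSWORD' ]
```

One possible use is to call it with `process.env` when an execution worker starts and refuse to accept jobs if the result is not empty.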

Audit logs and dedicated execution workers

I like dedicated workers because they make incident response much simpler.

You get:

a narrower blast radius
cleaner logs
easier patching
simpler rotation if compromise is suspected

If the execution worker is shared with the main app, your logs and containment story are both weaker.

Practical incident response if a sandbox escape is plausible

If you think vm2 or a similar sandbox was exploitable in your environment:

Patch or disable the feature.
Inventory every place the library is used.
Check whether the sandbox process had access to secrets.
Rotate any exposed credentials.
Review logs for unusual script behavior, outbound connections, or host-level file access.
Re-test whether user-controlled code can reach privileged host APIs.
Assume any shared execution worker may need to be rebuilt.
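
For the inventory step, direct dependencies are only part of the picture, because the library may be installed as a nested dependency of something else. The sketch below scans a parsed `package-lock.json` that uses the `packages` map (lockfileVersion 2 or 3). `findInstalledCopies` is a hypothetical helper, and the sample lockfile, package names other than vm2, and version numbers are invented.

```javascript
// Finds every installed copy of `name` in a parsed package-lock.json that
// uses the `packages` map, including copies nested under other packages.
function findInstalledCopies(lock, name) {
  const suffix = `node_modules/${name}`;
  return Object.entries(lock.packages ?? {})
    .filter(([path]) => path === suffix || path.endsWith(`/${suffix}`))
    .map(([path, meta]) => ({ path, version: meta.version }));
}

// Invented sample data; real input would come from JSON.parse on the lockfile.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    '': { name: 'example-app' },
    'node_modules/vm2': { version: '1.2.3' },
    'node_modules/example-plugin-host/node_modules/vm2': { version: '1.0.0' },
    'node_modules/vm2-helper': { version: '0.1.0' },
  },
};

console.log(findInstalledCopies(sampleLock, 'vm2'));
```

On the sample data this reports the top-level copy and the nested copy, and skips the similarly named `vm2-helper`. Run it against every service's lockfile, not only the one that owns the scripting feature.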

Do not wait for proof of full compromise before rotating secrets. If the escape path is plausible and the process had sensitive reach, the safe assumption is exposure.

How to report this safely in bug bounty or internal testing

For authorized testing, keep the report focused on boundary failure and business impact.

A good report usually includes:

the exact user-script feature tested
the sandbox library and version if visible
the smallest safe proof that shows boundary crossing
what host capability became reachable
whether secrets, filesystem, or internal network access were in scope
evidence without weaponized payloads

Avoid publishing destructive payloads or a turnkey escape chain. You do not need that to prove impact. Show that a user-controlled script can violate the intended isolation boundary, then stop.

Conclusion

vm2 and similar libraries are useful, but they are not a complete security story.

If your product lets users run JavaScript, treat that feature like a real execution environment, not a convenience API. The security question is not “did the script stay inside the sandbox wrapper?” It is “what happens if it gets out, and what was reachable if it did?”

That is the review that matters.