Defending Against Glassworm: Checks You Can Run on npm, PyPI, and OpenVSX Dependencies


What this Glassworm report changes for developers

Why a multi-registry campaign is harder to spot than a single bad package

What stands out in the Glassworm report is not just that one ecosystem was abused. The report describes activity across npm, PyPI, OpenVSX, and GitHub, which means the attacker did not need one perfect package. They only needed one convincing artifact in one place, plus a way to keep the rest of the chain looking ordinary.

That changes how I think about dependency risk.

If a package update looks normal in one registry, that does not settle it. A package can look clean on npm while being published from a compromised GitHub release flow, mirrored as a malicious wheel on PyPI, or repackaged as an editor extension with a permissive manifest. The registry label is one signal, not a trust boundary.

Multi-registry campaigns are also easier to miss because defenders tend to review one lane at a time:

JavaScript teams review package.json and lockfiles.
Python teams review wheels and requirements.txt.
Extension users review marketplace metadata.
GitHub reviewers inspect source code and release notes.

A campaign that moves between those lanes can sit in the gaps between those reviews.

What the reporting says about npm, PyPI, OpenVSX, and GitHub as delivery points

The report’s core point is straightforward: the attacker used multiple public developer ecosystems as delivery points. That matters because each one exposes a different slice of trust.

npm is often installed automatically during builds, so lifecycle scripts and dependency fan-out matter a lot.
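PyPI packages can execute code during build or install: a source distribution runs its build backend (or setup.py) on the installing machine, while a wheel is unpacked without running build code but still ships whatever the publisher built.
OpenVSX extensions are executable software. The manifest controls when they activate and what they contribute, and once active they run with the user's privileges.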
GitHub is often the source of truth for tags, releases, and automation, so a compromise there can poison everything downstream.

The useful takeaway is not “stop using public registries.” That is not realistic. The useful takeaway is to verify the artifact you will actually execute, and verify the control plane that published it.

What you should assume when a dependency update looks routine

When an update lands in your queue and looks like an ordinary patch release, I assume four things until I check otherwise:

The published artifact may not match the source repository exactly.
The package may run code at install time, build time, or extension activation time.
The update may have changed ownership, maintainer behavior, or release automation.
The package may be pulling payloads or configuration from outside the registry.

That sounds paranoid until you remember how many builds trust npm install, pip install, or marketplace auto-update without much scrutiny.

Build the threat model before you check any package

Where compromise usually shows up: maintainer abuse, poisoned release assets, or malicious install hooks

Before I start diffing files, I want to know which compromise model I am testing.

In practice, supply-chain abuse usually shows up in a few ways:

Maintainer abuse: an account is compromised, sold, or impersonated.
Poisoned release assets: the repo is fine, but the release tarball, wheel, or VSIX is not.
Malicious install hooks: the package executes code during installation or activation.
Repository-to-registry drift: the source repo says one thing, the published artifact says another.

Those models matter because the evidence lives in different places. If you only inspect the repository, you can miss a malicious wheel. If you only inspect the registry package, you can miss the fact that the repo was hijacked and a new tag was pushed from compromised CI.

How dependency trust breaks across registries versus source repositories

A lot of teams still treat the repository and the registry as if they were the same thing. They are not.

The repository is where code is developed. The registry is where code becomes an installable artifact. Between those two points, several things can change:

build scripts can generate new files,
release automation can package extra assets,
minification or bundling can hide added behavior,
a compromised maintainer token can publish a different artifact than the repo contents suggest.

That is why a package can be “open source” and still be dangerous. Source availability is not the same as artifact integrity.

I like to frame trust in three layers:

Layer	What you trust	Common failure mode
Source repo	Commits, tags, release notes	Hijacked maintainer or rogue CI
Published artifact	Tarball, wheel, VSIX, release asset	Drift from source, hidden payloads
Install/runtime path	Scripts, hooks, activation events	Code runs before review or after install

If any one layer is weak, the chain is weak.

The three questions to answer for every suspicious package: who published it, what changed, and what runs on install

Every suspicious package review starts with the same three questions:

Who published it?
- Is the maintainer identity stable?
- Did ownership change recently?
- Does the GitHub repo still look controlled by the same people?
What changed?
- Is this a real feature release, or just a repackaged bundle?
- Did the file list change in a strange way?
- Did the metadata change more than the code?
What runs on install?
- npm lifecycle scripts?
- setup.py, pyproject.toml, or build backends?
- Extension activation hooks or bundled startup code?

If you can answer those three questions with evidence, most of the obvious supply-chain tricks fall away pretty quickly.

A fast triage workflow you can run in minutes

Inspect version deltas, release timestamps, and maintainer changes

When a package update feels off, I start with metadata before content.

For npm, check version history, publish timestamps, maintainers, and repository links:

npm view some-package versions --json
npm view some-package time --json
npm view some-package maintainers --json
npm view some-package repository.url dist.tarball dist.integrity --json

For Python packages, inspect project metadata on PyPI and compare release dates across artifacts. For extensions, compare marketplace update history and publisher identity.

Look for patterns like:

a sudden maintainer change,
a burst of releases after long dormancy,
a new major version with almost no changelog,
a release timestamp that does not fit the normal cadence.

None of those proves compromise, but they do justify a deeper review.
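
The cadence signals in that list can be roughly automated. What follows is a minimal sketch, assuming the input has the shape printed by `npm view some-package time --json` (version strings mapped to ISO 8601 timestamps, plus `created` and `modified` keys). The function name, the sample data, and the 180-day threshold are illustrative assumptions, not values from the report.

```python
from datetime import datetime, timedelta


def find_dormancy_gaps(times, max_gap_days=180):
    """Return (previous_version, next_version, gap_days) for each long gap.

    `times` maps version strings to ISO 8601 publish timestamps. The
    180-day default is an arbitrary illustration, not a recommendation.
    """
    releases = []
    for version, stamp in times.items():
        if version in ("created", "modified"):  # bookkeeping keys, not releases
            continue
        # Older Python versions reject a trailing "Z" in fromisoformat().
        published = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        releases.append((published, version))
    releases.sort()

    gaps = []
    for (prev_time, prev_ver), (next_time, next_ver) in zip(releases, releases[1:]):
        gap = next_time - prev_time
        if gap > timedelta(days=max_gap_days):
            gaps.append((prev_ver, next_ver, gap.days))
    return gaps


# Hypothetical publish history, for illustration only.
sample_times = {
    "created": "2021-01-10T12:00:00.000Z",
    "modified": "2024-06-02T09:30:00.000Z",
    "1.2.1": "2021-01-10T12:00:00.000Z",
    "1.2.2": "2021-03-01T12:00:00.000Z",
    "1.2.3": "2024-06-02T09:30:00.000Z",
}
gaps = find_dormancy_gaps(sample_times)
print(gaps)
```

A hit here only means the release deserves a closer look. Plenty of healthy projects ship a patch after years of silence.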

Compare package contents instead of trusting the version number

A version number is not evidence. The package contents are.

For npm, I usually pull the tarball and compare it to the previous release:

npm pack some-package@1.2.3 --json
npm pack some-package@1.2.2 --json

Then inspect the unpacked content:

tar -tf some-package-1.2.3.tgz | sort > new.txt
tar -tf some-package-1.2.2.tgz | sort > old.txt
diff -u old.txt new.txt

If the file list changed in a way that does not make sense, go one level deeper:

look for new binaries,
look for hidden assets under dist/,
inspect generated JavaScript,
check whether source maps reveal added behavior,
search for new domain names or network calls.

For Python, compare the wheel and source distribution, not just the project name.

Look for install-time execution paths such as preinstall, postinstall, setup.py, or extension activation hooks

This is where a lot of malicious packages become real.

In npm, lifecycle scripts are the first thing I check:

jq '.scripts, .bin, .dependencies, .optionalDependencies' package/package.json

In Python, check for:

setup.py,
build backend declarations in pyproject.toml,
custom build hooks,
wheel metadata that points to unusual runtime behavior.

For editor extensions, inspect activation events and startup behavior. Any extension that activates too broadly, ships a large bundled payload, or asks for more access than it needs deserves a closer look.

If the package can execute before your application starts, treat it as code execution, not just a dependency.

Check whether the package pulls external code, decodes payloads, or reaches out to unfamiliar domains

A lot of supply-chain abuse depends on a second stage. The first artifact looks harmless. The payload shows up later.

Search for:

fetch, axios, XMLHttpRequest, curl, wget, urllib, requests,
base64 decode routines,
obfuscated string tables,
eval, Function, dynamic imports,
DNS or HTTP calls to unfamiliar domains.

A quick read-only scan catches a surprising amount:

rg -n "fetch\(|XMLHttpRequest|axios|requests|urllib|eval\(|Function\(|base64|atob|btoa|http(s)?://" unpacked-package/

You are not trying to prove intent here. You are trying to find the places where a normal package can turn into a loader.

npm checks that catch the most common supply-chain tricks

Audit package.json for lifecycle scripts, dependency churn, and unexpected binary blobs

For npm packages, package.json is both a manifest and a policy file. I review:

scripts,
dependencies versus devDependencies,
optionalDependencies,
bin,
files,
exports,
the preinstall, install, postinstall, and prepare entries under scripts.

A suspicious release often shows up as churn in one of those fields. For example:

a new postinstall script with no release note,
a dependency switch from a known library to a tiny obfuscated helper,
a sudden files allowlist that now includes a binary or generated directory,
a prepare script that builds code from a repo state you never reviewed.

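Unexpected binary blobs are a major red flag. Pure-JavaScript packages rarely need executables in the tarball. Packages with native addons or prebuilt platform binaries are the main legitimate exception, and those should be explainable from the project's build setup.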
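
The manifest review above can be sketched as a comparison of two parsed package.json files. This is a rough illustration, assuming you have already unpacked both releases. The function name, the choice of fields, and the sample manifests are my own assumptions.

```python
# Lifecycle hooks that npm may run around installation. "prepare" mostly
# applies to local and git installs, but it is part of the review list.
INSTALL_HOOKS = ("preinstall", "install", "postinstall", "prepare")
DEP_FIELDS = ("dependencies", "optionalDependencies")


def manifest_churn(old, new):
    """Compare two parsed package.json objects and list changes to review."""
    findings = []
    old_scripts = old.get("scripts", {})
    new_scripts = new.get("scripts", {})
    for hook in INSTALL_HOOKS:
        if hook in new_scripts and new_scripts[hook] != old_scripts.get(hook):
            findings.append(f"script {hook} added or changed: {new_scripts[hook]}")
    for field in DEP_FIELDS:
        added = sorted(set(new.get(field, {})) - set(old.get(field, {})))
        for name in added:
            findings.append(f"{field} added: {name}")
    if new.get("bin") != old.get("bin"):
        findings.append("bin entries changed")
    if new.get("files") != old.get("files"):
        findings.append("files allowlist changed")
    return findings


# Hypothetical manifests, for illustration only.
old_manifest = {
    "name": "some-package",
    "version": "1.2.2",
    "scripts": {"test": "node test.js"},
    "dependencies": {"known-library": "^2.0.0"},
    "files": ["index.js"],
}
new_manifest = {
    "name": "some-package",
    "version": "1.2.3",
    "scripts": {"test": "node test.js", "postinstall": "node dist/setup.js"},
    "dependencies": {"known-library": "^2.0.0", "tiny-helper": "^0.0.1"},
    "files": ["index.js", "dist"],
}
churn = manifest_churn(old_manifest, new_manifest)
for line in churn:
    print(line)
```

The output is a list of questions to ask, not a verdict. Each line should map to something in the release notes.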

Diff the tarball against the previous release and inspect generated files, minified bundles, and hidden assets

A lot of attackers count on reviewers not reading minified bundles. That is exactly why I diff tarballs.

If the package ships a dist/ folder, compare both the source and the compiled output. Watch for:

generated files that do not map to checked-in source,
minified bundles with new outbound requests,
extra source maps that expose hidden logic,
files with names that look like assets but behave like loaders.

A quick way to unpack and inspect:

mkdir -p /tmp/pkg-a /tmp/pkg-b
tar -xzf some-package-1.2.2.tgz -C /tmp/pkg-a
tar -xzf some-package-1.2.3.tgz -C /tmp/pkg-b
diff -ru /tmp/pkg-a/package /tmp/pkg-b/package | less

If the change is broad, start with what runs first.
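
When the recursive diff is too noisy, a per-file hash comparison narrows things down. The sketch below reads both tarballs in memory, so nothing from the package is written to disk or executed. The helper names and the tiny in-memory tarballs are assumptions for illustration; in practice you would pass the bytes of the files produced by npm pack.

```python
import hashlib
import io
import tarfile


def tarball_digests(data):
    """Map each regular file in a .tgz (given as bytes) to its SHA-256."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        for member in archive:
            if member.isfile():
                content = archive.extractfile(member).read()
                digests[member.name] = hashlib.sha256(content).hexdigest()
    return digests


def diff_tarballs(old_data, new_data):
    """Report paths that were added, removed, or changed between releases."""
    old, new = tarball_digests(old_data), tarball_digests(new_data)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(p for p in set(old) & set(new) if old[p] != new[p]),
    }


def _make_tgz(files):
    """Build an in-memory .tgz from {path: bytes}. Demo helper only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, content in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()


# Hypothetical package contents, for illustration only.
old_tgz = _make_tgz({"package/index.js": b"module.exports = 1;\n"})
new_tgz = _make_tgz({
    "package/index.js": b"module.exports = 2;\n",
    "package/dist/helper.bin": b"\x00\x01",
})
report = diff_tarballs(old_tgz, new_tgz)
print(report)
```

Starting from the "added" and "changed" lists makes it easier to read the entry points and install scripts first.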

Verify whether the published package matches the Git tag and repository state

This is where many teams skip a crucial check. The registry artifact should line up with the repository tag that supposedly produced it.

Check:

the repository URL in the package metadata,
the tag or release referenced in the changelog,
whether the Git tag is signed,
whether the tarball contents match the tagged source tree.

If the package was built from GitHub Actions, also inspect whether the workflow generated the release asset or published the registry package. A compromised CI token can produce a believable artifact even if the repository history looks normal.

Use npm, lockfiles, and integrity fields to confirm what your build actually installed

Your build is only as trustworthy as the artifact that the lockfile pins.

For npm, package-lock.json records an integrity value for each package: a Subresource Integrity string (typically sha512) computed over the tarball. npm checks the downloaded tarball against it during install. That does not make the package safe, but it gives you a stable reference point. If the lockfile changes unexpectedly, or if an install fails its integrity check, stop and investigate.

I also like to compare:

npm ci output in CI,
the checked-in lockfile,
and the package tarball hash.

If those three do not agree, your build pipeline is no longer reproducing what you reviewed.
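
To compare a tarball against the lockfile by hand, you can recompute the integrity string yourself. This is a minimal sketch, assuming the lockfile entry uses the common `sha512-<base64>` form. The function names and the placeholder bytes are illustrative assumptions.

```python
import base64
import hashlib


def sri_sha512(data):
    """Return the `sha512-<base64>` integrity string for raw tarball bytes."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_lockfile(tarball_bytes, lock_entry):
    """Check tarball bytes against one package entry from package-lock.json."""
    recorded = lock_entry.get("integrity", "")
    # An integrity value may hold several space-separated hashes. This
    # sketch only understands sha512 and ignores other algorithms.
    return sri_sha512(tarball_bytes) in recorded.split()


# Placeholder bytes standing in for a real tarball, for illustration only.
reviewed = b"bytes of the tarball you reviewed"
entry = {"version": "1.2.3", "integrity": sri_sha512(reviewed)}

same = matches_lockfile(reviewed, entry)
tampered = matches_lockfile(reviewed + b"extra", entry)
print(same, tampered)
```

In real use, read the tarball with `open(path, "rb").read()` and take the entry from the lockfile's packages section.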

PyPI checks for wheel, sdist, and setup-time abuse

Compare sdist and wheel contents to spot build-time tampering or missing source files

Python packaging has a classic trap: the source distribution and the wheel can tell different stories.

If the wheel includes files that are not present in the sdist, or the other way around, I want to know why. That mismatch can happen for legitimate build reasons, but it is also where malicious build steps hide.

A simple review flow:

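# pip may run the package's build backend to read metadata when it
# downloads an sdist, so do this in a disposable sandbox.
python -m pip download --no-deps --no-binary=:all: somepkg==1.2.3 -d /tmp/pypi
python -m pip download --no-deps --only-binary=:all: somepkg==1.2.3 -d /tmp/pypi
# Requires the "wheel" package; unpacks into ./somepkg-1.2.3/
python -m wheel unpack /tmp/pypi/somepkg-1.2.3-py3-none-any.whl
# Drop directory entries so both listings contain files only
tar -tf /tmp/pypi/somepkg-1.2.3.tar.gz | grep -v '/$' | sort > sdist.txt
find somepkg-1.2.3 -type f | sort > wheel.txt
diff -u sdist.txt wheel.txt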

Look for:

files that only appear in the wheel,
code generation during build,
scripts added to the package root,
data files that are larger or stranger than expected.

Review pyproject.toml, setup.py, and dependency hooks for code that runs during installation

Python has more than one path to code execution.

setup.py can run arbitrary code when a package is installed from a source distribution. Modern pyproject.toml builds also execute build-backend code during the build step. Installing from source is therefore not just extraction; it is execution.

I look for:

dynamic version computation,
custom build backends,
cmdclass overrides,
entry_points that start hidden tooling,
packages that import network or filesystem helpers during setup,
dependencies that are only required to build the package, not to use it.

If a package needs the network to install, I want to know exactly why.

Check for namespace collisions, renamed project ownership, and suspicious metadata edits

On PyPI, attackers often win by looking close enough to the real thing.

Defensive checks include:

confirm the project name matches the expected namespace,
inspect whether the owner changed recently,
compare author email and homepage metadata,
search for similar project names with subtle spelling changes,
review whether the summary, classifiers, or long description changed in ways that do not match the code.

A rename can be legitimate, but a quiet ownership shift or a namespace collision is enough to justify manual review.
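
A quick way to catch near-miss names is to normalize them the way PEP 503 describes (lowercase, with runs of `-`, `_`, and `.` collapsed) and then measure similarity against the names you actually depend on. This is a rough sketch. The 0.85 threshold and the function names are assumptions, and string similarity will miss some tricks and flag some innocent names.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def lookalikes(candidate, expected_names, threshold=0.85):
    """Return expected names that `candidate` resembles but does not equal."""
    normalized_candidate = normalize(candidate)
    hits = []
    for expected in expected_names:
        normalized_expected = normalize(expected)
        if normalized_candidate == normalized_expected:
            continue  # same project name after normalization
        ratio = SequenceMatcher(None, normalized_candidate, normalized_expected).ratio()
        if ratio >= threshold:
            hits.append(expected)
    return hits


# The dependency list here is an illustrative assumption.
trusted = ["requests", "numpy"]
suspects = lookalikes("reqeusts", trusted)
print(suspects)
```

Running this over every new name in a requirements diff is cheap, and any hit is worth a manual look at the project page and its owner.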

Validate hashes from requirements files and separate trusted build artifacts from publish artifacts

For production installs, hash pinning matters.

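Use pip's --require-hashes mode where possible, and prefer a locked dependency set that records the exact artifact hash. In this mode pip expects every requirement, including transitive ones, to be pinned with == and to carry a hash, so the build system does not silently accept a different wheel or sdist under the same version number.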

Example:

somepkg==1.2.3 \
    --hash=sha256:...

Then install with:

pip install --require-hashes -r requirements.txt

Also separate the artifact you build from source from the artifact you publish. A project can be safe in source form but unsafe after packaging if the release pipeline is compromised.

OpenVSX and editor-extension risks that developers often miss

Treat extension manifests as executable policy, not just metadata

An extension manifest is not a marketing page. It is a declaration of what the extension can do.

For OpenVSX-style extension packages, I review the manifest the same way I review application permissions:

what activates the extension,
what commands it registers,
what it declares about untrusted and virtual workspaces (capabilities),
what bundled code it ships,
whether it loads remote code,
whether it redirects users to a different marketplace or install path.

The package.json inside the extension package is effectively policy. If it says the extension can wake up on broad activation events and then load a large bundle, that is a meaningful trust decision.

Inspect activation events, commands, workspace permissions, and bundled JavaScript before installing

I want to know exactly when an extension starts and what it can touch.

A quick manual inspection for a VSIX package can look like this:

unzip -p extension.vsix extension/package.json | jq '{activationEvents, main, browser, contributes, capabilities, extensionKind}'

Then inspect the JavaScript bundle for:

network calls,
telemetry,
hidden command registration,
workspace enumeration,
file reads outside the expected scope,
dynamic loading from remote locations.

If the extension ships a large compiled bundle and no readable source, I assume review friction is part of the design until proven otherwise.
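
The same manifest inspection can be done without unzipping the VSIX to disk. The sketch below assumes the usual layout, with the manifest at `extension/package.json` inside the archive. It treats `*` and `onStartupFinished` as broad activation events. The function name and the sample extension are hypothetical.

```python
import io
import json
import posixpath
import zipfile

# Activation events that start an extension without a specific user action.
BROAD_EVENTS = {"*", "onStartupFinished"}


def review_vsix(vsix_bytes):
    """Summarize when an extension activates and how big its entry point is."""
    with zipfile.ZipFile(io.BytesIO(vsix_bytes)) as vsix:
        manifest = json.loads(vsix.read("extension/package.json"))
        sizes = {info.filename: info.file_size for info in vsix.infolist()}

    entry_path, entry_size = None, None
    main = manifest.get("main")
    if main:
        base = posixpath.normpath(posixpath.join("extension", main))
        # "main" may omit the .js suffix, so try both spellings.
        for candidate in (base, base + ".js"):
            if candidate in sizes:
                entry_path, entry_size = candidate, sizes[candidate]
                break

    events = manifest.get("activationEvents", [])
    return {
        "broad_activation": sorted(BROAD_EVENTS & set(events)),
        "entry_point": entry_path,
        "entry_size_bytes": entry_size,
    }


# Hypothetical extension package, for illustration only.
_buffer = io.BytesIO()
with zipfile.ZipFile(_buffer, "w") as _archive:
    _archive.writestr("extension/package.json", json.dumps({
        "name": "example-ext",
        "main": "./dist/extension",
        "activationEvents": ["*", "onLanguage:python"],
    }))
    _archive.writestr("extension/dist/extension.js", b"x" * 2048)

summary = review_vsix(_buffer.getvalue())
print(summary)
```

A broad activation event plus a multi-megabyte entry bundle is a combination that justifies reading the bundle before installing.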

Verify whether the extension package is built from the same source as the linked repository

The repository and the marketplace package should agree.

Check whether:

the source repo tag matches the package version,
the linked repository actually contains the extension source,
the package bundle appears to be built from that source,
the release workflow is transparent enough to reproduce.

If the package is clearly not derived from the repository you can inspect, that is a problem. It may still be benign, but you should not treat it as trustworthy by default.

Watch for updates that add telemetry, remote code loading, or unexpected marketplace redirects

Extensions are especially tricky because a small manifest change can change the risk profile a lot.

Red flags include:

a new telemetry endpoint,
code that downloads scripts after install,
a new setting that redirects users to another marketplace,
a change from local-only behavior to cloud-assisted behavior without review,
“helper” modules that phone home during activation.

In practice, this is where I tell teams to stop assuming the extension is just a UI add-on. Extensions can be full code execution surfaces.

GitHub as the control plane for the package campaign

Check whether releases, tags, and repository commits agree with the published artifact

The report’s GitHub angle matters because GitHub often controls what the registry will later trust.

When I review a suspicious package, I compare:

the commit referenced by the release,
the tag that supposedly produced the artifact,
the registry tarball or wheel,
and the generated release asset.

If the version number exists in the registry but there is no matching tag, or the tag points to different content, the artifact is not trustworthy until that inconsistency is explained.
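
The "version without a tag" case is easy to check mechanically. This minimal sketch assumes you already have the registry's version list (for example from `npm view some-package versions --json`) and the repository's tag names (for example from `git tag --list`). It also assumes tags are named either `1.2.3` or `v1.2.3`; other naming schemes would need their own handling.

```python
def versions_without_tags(registry_versions, git_tags):
    """Return registry versions that have no matching Git tag."""
    # Treat "v1.2.3" and "1.2.3" as the same release.
    tagged = {tag[1:] if tag.startswith("v") else tag for tag in git_tags}
    return [version for version in registry_versions if version not in tagged]


# Hypothetical inputs, for illustration only.
published = ["1.2.1", "1.2.2", "1.2.3"]
tags = ["v1.2.1", "v1.2.2"]
untagged = versions_without_tags(published, tags)
print(untagged)
```

A missing tag is not proof of compromise, since some projects never tag releases, but it tells you which versions cannot be traced back to a commit.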

Look for forked history, new maintainers, or repository hijack indicators

A repo takeover often leaves soft signals before it leaves hard ones.

Look for:

sudden collaborator additions,
new release maintainers,
force-pushed branches,
rewritten tag history,
missing or altered issue discussion,
release notes that do not match prior style or cadence.

None of those proves compromise on its own. But a cluster of them can tell you the upstream project is no longer under the control you expected.

Review CI workflows, release automation, and secrets exposure as part of the package trust chain

CI is where many package campaigns become real.

If GitHub Actions or similar automation publishes directly to npm, PyPI, or OpenVSX, then workflow integrity is part of package integrity. Review:

who can edit workflows,
whether release jobs use short-lived credentials,
whether secrets are scoped tightly,
whether tags trigger publishing automatically,
whether build artifacts are signed or at least hashed.

A weak release workflow means an attacker does not need to compromise the source code itself. They only need the path that turns source into a release.

Use repository evidence to confirm whether the upstream project is still under legitimate control

When a package update feels suspicious, GitHub can help answer one key question: is the project still controlled by the people who are supposed to control it?

Useful evidence includes:

commit activity from known maintainers,
signed tags or commits,
stable release patterns,
recent discussion in issues or pull requests,
continuity in release automation and ownership.

If the repository evidence and the registry release do not line up, trust the mismatch more than the marketing copy.

Practical defenses for CI, local development, and dependency governance

Pin versions, prefer lockfiles, and avoid broad upgrade windows without review

The easiest way to reduce supply-chain exposure is to stop floating dependencies casually.

For teams that can manage it:

pin exact versions in production,
keep lockfiles under review,
avoid “update everything” windows without inspection,
batch upgrades so you can compare deltas in a controlled way.

This does not remove risk, but it turns random updates into reviewable events.

Run dependency diffing and allowlist-based checks in CI before the build is promoted

CI is the right place to enforce the boring checks nobody wants to do manually every time.

A practical control set looks like this:

Control	npm	PyPI	OpenVSX
Lock exact artifacts	`package-lock.json`	hash-pinned requirements	pinned extension version
Diff contents	tarball compare	wheel/sdist compare	VSIX unpack compare
Restrict execution	`--ignore-scripts` where possible	isolated build/install	disable auto-run or auto-update in CI
Verify provenance	tag/release match	build metadata match	source repo match

If a build introduces a new dependency outside an allowlist, fail the pipeline and force review.
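
An allowlist gate of that kind can be sketched against package-lock.json. The code below assumes lockfile version 2 or 3, where the packages object is keyed by install path such as `node_modules/some-package`. The function names and the sample lockfile are illustrative assumptions.

```python
def locked_package_names(lock):
    """Collect package names from a parsed package-lock.json (v2 or v3)."""
    names = set()
    for path in lock.get("packages", {}):
        if not path:  # the empty key is the root project itself
            continue
        # Nested installs look like "node_modules/a/node_modules/b".
        names.add(path.rsplit("node_modules/", 1)[-1])
    return names


def outside_allowlist(lock, allowlist):
    """Return locked package names that are not on the allowlist."""
    return sorted(locked_package_names(lock) - set(allowlist))


# Hypothetical lockfile contents, for illustration only.
sample_lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/some-package": {"version": "1.2.3"},
        "node_modules/@scope/tool": {"version": "4.0.0"},
        "node_modules/some-package/node_modules/tiny-helper": {"version": "0.0.1"},
    },
}
violations = outside_allowlist(sample_lock, ["some-package", "@scope/tool"])
print(violations)
```

In a pipeline, a non-empty result would be the condition that fails the job and asks for human review.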

Restrict lifecycle scripts and extension execution in unattended environments

Unattended environments are where malicious install hooks become most dangerous.

For npm:

prefer npm ci --ignore-scripts in CI,
run scripts only in a controlled, audited step if you truly need them.

For Python:

install in isolated environments,
avoid arbitrary source builds when wheels are available from trusted sources,
use hash verification and vetted indexes.

For editor extensions:

install only allowlisted extensions in CI images,
avoid auto-activation of unreviewed extensions in shared build agents,
prevent extensions from auto-updating in environments that should be reproducible.

Add provenance checks, hash verification, and outbound network monitoring to your pipeline

The strongest defense is to verify both origin and behavior.

That means:

provenance or attestation checks where available,
hash verification for every installed artifact,
monitoring for outbound traffic during install and build,
alerts on unexpected DNS lookups or HTTP calls from dependency installation steps.

If a package installer suddenly reaches out to a domain that never appears in your normal build traffic, that is a signal worth investigating.

What to do if you find a suspicious dependency

Quarantine the package version and identify every workspace, image, and lockfile that uses it

The first move is containment.

Do not just delete the package from one repo and call it done. Find every place it appears:

application repos,
shared base images,
CI caches,
Docker layers,
vendored lockfiles,
extension bundles,
build artifacts that may already contain it.

Treat the version as a bad artifact until the full blast radius is mapped.
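
For the lockfile part of that search, a small script can walk a directory of checkouts and report every package-lock.json that pins the bad version. This is a sketch under the assumption of lockfile version 2 or 3. It covers npm lockfiles only, so images, caches, and Python requirements need their own checks. The function name and the temporary demo tree are hypothetical.

```python
import json
import os
import tempfile


def find_affected_lockfiles(root, package, bad_version):
    """Walk `root` and return package-lock.json paths pinning the bad version."""
    affected = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip installed trees; the lockfile beside them is the record we want.
        dirnames[:] = [d for d in dirnames if d != "node_modules"]
        if "package-lock.json" not in filenames:
            continue
        lock_path = os.path.join(dirpath, "package-lock.json")
        with open(lock_path, encoding="utf-8") as handle:
            lock = json.load(handle)
        for path, entry in lock.get("packages", {}).items():
            if path.endswith("node_modules/" + package) and entry.get("version") == bad_version:
                affected.append(lock_path)
                break
    return affected


# Hypothetical workspace with two projects, for illustration only.
with tempfile.TemporaryDirectory() as workspace:
    for project, version in (("app-a", "1.2.3"), ("app-b", "1.2.2")):
        os.makedirs(os.path.join(workspace, project))
        lock = {"packages": {"node_modules/some-package": {"version": version}}}
        with open(os.path.join(workspace, project, "package-lock.json"), "w", encoding="utf-8") as out:
            json.dump(lock, out)
    hits = [
        os.path.relpath(path, workspace)
        for path in find_affected_lockfiles(workspace, "some-package", "1.2.3")
    ]
print(hits)
```

The result is a starting list for containment, not the full blast radius.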

Rotate any secrets that may have been exposed during install or build time

If the malicious code could have run during install, assume it may have seen environment variables, tokens, or cloud credentials.

Rotate:

npm or PyPI publish tokens,
CI secrets,
cloud credentials,
API keys present on the build runner,
developer tokens used in local builds if the package reached workstations.

If secrets were mounted into the environment during install, they are potentially exposed.

Preserve forensic evidence from package caches, logs, and CI artifacts before cleanup

Before you wipe the environment, preserve evidence.

Useful sources include:

package manager caches,
CI logs,
build artifacts,
Docker layers,
release archives,
extension bundles,
terminal transcripts from reproduction attempts.

You want to know what was downloaded, what ran, and what network traffic occurred. Cleanups are good. Evidence is better.

Coordinate disclosure or reporting through the relevant registry and maintainers

Once the scope is clear, report through the right channel:

npm package maintainers and registry support,
PyPI maintainers and security contacts,
OpenVSX publisher or platform contacts,
GitHub repository maintainers or abuse channels if the repo is part of the control plane.

Keep the report factual:

affected versions,
suspicious file paths,
install or activation behavior,
timestamps,
hashes,
evidence of divergence from the repository.

Avoid overclaiming if you do not have proof. Precision helps the people who have to respond.

Conclusion: turn registry trust into measurable checks

The main lesson from Glassworm-style campaigns is to verify artifacts, not labels

The Glassworm report is a reminder that supply-chain abuse does not stay inside one ecosystem anymore. npm, PyPI, OpenVSX, and GitHub can all be part of the same delivery path.

So I do not start from “is this package popular?” or “does the repo look clean?” I start from “what artifact am I actually going to execute, and what evidence proves it came from the right source?”

That shift is the difference between trusting a label and trusting a build.

A short checklist for npm, PyPI, and OpenVSX that teams can keep in their review process

If you need a compact review flow, use this:

Confirm publisher identity and recent ownership changes.
Diff the artifact against the previous release.
Inspect install-time or activation-time execution paths.
Compare package contents to the linked repository and tag.
Verify hashes, lockfiles, and provenance where available.
Watch for outbound network calls during install and build.
Quarantine and rotate if anything does not line up.

That is a short list, but it catches the highest-value abuse patterns.