InvisibleFerret Malware Analysis: Evading Static Analysis with Native Python Extensions

pr0h0
malware-analysis, python, static-analysis, reverse-engineering
AI Usage (83%)

The report says InvisibleFerret was being delivered as compiled Python extensions, and that changes the first ten minutes of triage more than it changes the malware itself.

If you are used to scanning Python source, your instincts will point you the wrong way. A .py file gives you imports, function names, comments, and quick grep targets. A native extension gives you a binary that Python loads at import time. The interesting behavior can begin before you ever see readable source, and the usual “open it in a text editor first” habit stops helping.

I usually treat this kind of sample as a hybrid problem: part Python package review, part native malware analysis, part loader inspection. The trick is not to get stuck on the missing source. It is to switch to a workflow that assumes the sample was built to hide inside a Python-heavy environment from the start.

Why Compiled Python Extensions Change the Malware Triage Problem

What the report says about InvisibleFerret in native extension form

The public report on InvisibleFerret describes malware being shipped as compiled Python extensions. That matters because the packaging choice is part of the evasion. Instead of obvious Python scripts, the operator can present something that looks like a normal dependency artifact: a .pyd on Windows or a .so on Linux.

That does not make the malware fundamentally harder to analyze. It does change the first pass. You are no longer asking, “What does this script do?” You are asking, “What code path runs when Python imports this module, what native APIs does it reach, and what secondary payloads does it unpack?”

Why .pyd and .so files slip past Python-first review habits

A lot of review processes are built around source visibility. Security teams grep repositories, inspect requirements.txt, open wheel contents, and read import graphs. That works well for pure Python packages. It is much less effective when the package contains compiled artifacts.

The failure mode is straightforward:

  • the filename looks like a normal extension module
  • the package metadata looks normal enough to pass casual review
  • the real logic is embedded in machine code
  • any Python wrapper code, if it exists at all, may only call into the binary

The result is that static review often stops too early. Analysts confirm that a package is “Python-related” and move on, while the dangerous behavior sits inside code that never appears in plain text.

The practical impact for analysts, defenders, and SOC triage

For analysts, the main impact is time. A source-level triage might take minutes. A binary extension can take hours because you need to identify architecture, resolve imports, and map native functions back to likely behavior.

For defenders, the impact is broader than one sample. Native extensions let malicious actors blend into ecosystems that already trust compiled wheels, C extensions, and platform-specific modules. If your detection logic only watches for .py files or suspicious script execution, you miss the payload class entirely.

For SOC triage, the point is simpler: do not assume “Python” means “safe to inspect lazily.” A Python host process loading a native extension can become the execution boundary for malware. Treat it like any other native code load event, because that is what it is.

What a Native Python Extension Actually Looks Like on Disk

File naming, ABI markers, and platform-specific artifacts

Native Python extensions usually carry platform and interpreter clues in the filename. You may see patterns like:

  • module.cpython-311-x86_64-linux-gnu.so
  • module.cp310-win_amd64.pyd
  • module.so
  • module.pyd

Those suffixes can tell you a lot before you open the file. They hint at:

  • target operating system
  • CPU architecture
  • Python ABI version
  • whether the module is likely a true extension or a renamed binary

That matters because the sample may only load successfully in one environment. If your lab runs the wrong Python version or architecture, the module may fail cleanly and hide its behavior from you.

A quick first-pass checklist looks like this:

file sample.pyd
strings -a sample.pyd | head -n 50
python -V
python -c "import sysconfig; print(sysconfig.get_config_var('EXT_SUFFIX'))"

You are not trying to solve the sample here. You are trying to confirm that the binary matches your lab environment.
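
The filename check can also be scripted. The following is a minimal sketch, assuming the sample keeps a conventional CPython tag layout like the examples above; the function names are hypothetical, and untagged or renamed files fall through to a generic case that tells you nothing about the ABI.

```python
import re
import sysconfig

# Two common CPython tag layouts. Plain "module.so" / "module.pyd" names
# carry no tags and are handled by the generic fallback below.
_POSIX = re.compile(
    r"^(?P<name>[^.]+)\.cpython-(?P<ver>\d+)[a-z]*-(?P<platform>.+)\.so$"
)
_WINDOWS = re.compile(
    r"^(?P<name>[^.]+)\.cp(?P<ver>\d+)-(?P<platform>win_?\w+)\.pyd$"
)


def parse_extension_filename(filename):
    """Return module name, Python version, and platform tag, or None."""
    for pattern in (_POSIX, _WINDOWS):
        match = pattern.match(filename)
        if match:
            ver = match.group("ver")  # e.g. "311" means Python 3.11
            return {
                "module": match.group("name"),
                "python": f"{ver[0]}.{ver[1:]}",
                "platform": match.group("platform"),
            }
    if filename.endswith((".so", ".pyd")):
        # Untagged extension: the name alone says nothing about the ABI.
        return {"module": filename.rsplit(".", 1)[0], "python": None, "platform": None}
    return None


def matches_this_interpreter(filename):
    """True if the filename ends with this interpreter's EXT_SUFFIX."""
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    return bool(suffix) and filename.endswith(suffix)
```

A mismatch here is a lab-setup problem, not a verdict on the sample: it only tells you which interpreter you need before a dynamic run is meaningful.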

How import-time execution can trigger behavior before source exists

Python extension modules can execute code during import through their initialization entry point. That is the first trap.

With pure Python, import-time side effects are visible in the source. With native extensions, the init routine is a compiled function that the interpreter calls as soon as the module is imported. If that routine decrypts a blob, creates a thread, reaches out to the network, or registers additional hooks, you may never see a readable trace unless you instrument the process.

That means the dangerous behavior can happen at:

  • package install time, if an installer or post-install hook loads the module
  • import time, when a parent script does import module
  • runtime, when the module exposes a function that is later invoked

You should assume the import boundary is active code, not a metadata step.
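
Because import is the execution boundary, it helps to enumerate native modules without importing anything. A minimal sketch, assuming a CPython 3 interpreter where the init function for an ASCII module name is exported as PyInit_<name>; the helper names are hypothetical.

```python
import importlib.machinery
from pathlib import Path


def expected_init_symbol(filename):
    """Name of the init export CPython 3 looks up for this extension file."""
    module_name = Path(filename).name.split(".", 1)[0]
    if not module_name.isascii():
        # Non-ASCII module names use a different encoded export name.
        raise ValueError("only ASCII module names are handled in this sketch")
    return "PyInit_" + module_name


def list_native_modules(root):
    """List files under root that this interpreter would load as extensions.

    Nothing is imported; the files are only matched by suffix.
    """
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.name.endswith(suffixes)
    )
```

Running list_native_modules over an unpacked package gives you the set of files whose init routines would run on import, which is the list to review before any Python code touches them.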

The difference between bundled Python code and compiled runtime code

There is an important distinction between a package that bundles Python code and a package that embeds runtime logic in a compiled extension.

Property | Bundled Python code | Compiled extension
Readability | High | Low
Grep-friendly | Yes | Usually no
Import-time behavior | Visible in source | Hidden in binary init path
Static configuration extraction | Often easy | Often requires disassembly
Defender visibility | Better | Worse

This is why an extension module deserves a native-code workflow even if the surrounding ecosystem is Python. The runtime boundary is still the same interpreter, but the implementation is no longer text.

Building a Safe Static Analysis Workflow for .pyd and .so Samples

Identify architecture, Python version, and load dependencies first

I start with compatibility, not reversing. If the sample is x86-64 Linux but my lab is Windows, or if it targets CPython 3.11 and I only have 3.9 available, I want to know that immediately.

Useful first checks:

file sample.so
readelf -h sample.so
readelf -d sample.so
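# avoid ldd on untrusted samples: some versions may execute code from the binary
objdump -p sample.so | grep NEEDED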

On Windows, the equivalents are usually:

  • sigcheck or PE-bear for file metadata
  • dumpbin /headers for PE headers
  • dependency inspection tools for imported DLLs

The goal is to answer:

  1. What platform is this built for?
  2. Which interpreter ABI does it expect?
  3. Which native libraries does it need?
  4. Does it depend on unusual runtime helpers, packers, or crypto libraries?

That dependency list often says more than the first hundred lines of disassembly.
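
When the usual tools are not available, the platform question can be answered from the first 20 bytes of the file. This is a minimal sketch for ELF samples only, reading the standard header fields with the stdlib; the function name is hypothetical and the machine table covers only a few common values.

```python
import struct

ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}


def read_elf_identity(header):
    """Decode class, byte order, type, and machine from an ELF header.

    `header` is the first 20 or more bytes of the file. Returns None if
    the bytes do not look like ELF.
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    bits = {1: 32, 2: 64}.get(header[4])        # EI_CLASS
    endian = {1: "<", 2: ">"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        return None
    # e_type and e_machine are two 16-bit fields right after e_ident.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    return {
        "bits": bits,
        "byte_order": "little" if endian == "<" else "big",
        "type": ELF_TYPES.get(e_type, f"unknown ({e_type})"),
        "machine": ELF_MACHINES.get(e_machine, f"unknown ({e_machine})"),
    }
```

A genuine Linux extension module should come back as a shared object. Anything else under a .so name is worth a second look.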

Use strings, exports, and section layout to map the sample quickly

I always do a fast pass over strings and exports before opening the binary in a decompiler. Compiled extensions often leak useful context:

  • Python API symbols
  • embedded module names
  • command names
  • configuration keys
  • URLs or IP fragments
  • error messages
  • crypto algorithm names
  • file paths

A safe workflow looks like this:

strings -a sample.so | sort -u > strings.txt
nm -D sample.so 2>/dev/null | head
objdump -T sample.so 2>/dev/null | head

For PE files:

strings -a sample.pyd | sort -u > strings.txt
dumpbin /exports sample.pyd

What I look for:

  • exports that match module initialization naming (PyInit_<module> on CPython 3)
  • unusually large .data or .rdata sections
  • compressed or encrypted-looking binary blobs
  • suspicious high-entropy regions
  • references to Python C API functions like PyImport_ImportModule, PyRun_SimpleString, PyEval_EvalCode, or PyBytes_FromStringAndSize

If the module only exports the init entry point and hides the rest behind internal functions, that is normal enough for C extensions. If the binary also contains a lot of crypto, socket, and filesystem strings, that is where I start paying attention.
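
The strings pass can be reduced to a small triage summary. A minimal sketch, assuming the sample bytes are read without being loaded or executed; the function names and the choice of categories are mine, and the API list simply mirrors the functions named above.

```python
import re

PY_C_API = (
    b"PyImport_ImportModule",
    b"PyRun_SimpleString",
    b"PyEval_EvalCode",
    b"PyBytes_FromStringAndSize",
)
URL_RE = re.compile(rb"https?://[\x21-\x7e]+")


def printable_strings(data, min_len=6):
    """Return runs of printable ASCII at least min_len bytes long."""
    return re.findall(rb"[\x20-\x7e]{%d,}" % min_len, data)


def triage_strings(data):
    """Group a few high-signal string classes found in raw sample bytes."""
    found = printable_strings(data)
    return {
        # Substring match, so suffixed variants of an API name also count.
        "python_c_api": sorted(
            {api.decode() for api in PY_C_API if any(api in s for s in found)}
        ),
        "urls": sorted({m.decode() for s in found for m in URL_RE.findall(s)}),
        # Strings that look like init names; confirm against real exports.
        "init_names": sorted(
            {s.decode() for s in found if s.startswith(b"PyInit_")}
        ),
    }
```

This only sees plaintext. An empty result on a large binary is itself a hint that the interesting strings are encoded or encrypted.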

Look for embedded configuration, encrypted blobs, and staged loaders

A lot of malware hiding in native extensions follows a familiar pattern:

  1. the extension loads
  2. the init function runs
  3. a configuration blob is decoded or decrypted
  4. secondary code is staged into memory or written to disk
  5. control passes to a separate execution path

That can look like:

  • base64-encoded config strings
  • XOR or AES routine usage
  • embedded shellcode-like blobs
  • packed resources in custom sections
  • a string table that only becomes meaningful after decryption

At static time, I try to decide whether the binary is a loader, a payload, or both. If I see a small amount of logic plus a large opaque blob, I assume the sample is staging something unless proven otherwise.

A useful question is: does the module contain logic that exists only to transform data before handing it somewhere else? That is often the signature of a loader, not a normal library.
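
One quick way to find the "large opaque blob" is a windowed entropy scan. A minimal sketch using Shannon entropy over fixed-size windows; the 4096-byte window and 7.2 bits-per-byte threshold are assumptions to tune, not established cutoffs.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def high_entropy_windows(data, window=4096, threshold=7.2):
    """Return (offset, entropy) for each window at or above the threshold."""
    hits = []
    for offset in range(0, len(data), window):
        chunk = data[offset:offset + window]
        if len(chunk) < 256:
            # Too short to reach high entropy in any meaningful way.
            continue
        entropy = shannon_entropy(chunk)
        if entropy >= threshold:
            hits.append((offset, round(entropy, 2)))
    return hits
```

High entropy means compressed or encrypted data, which is common in benign binaries too. The useful signal is a long high-entropy run sitting next to very little code.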

Disassemble the binary and trace the init path into Python APIs

Once I have the metadata and strings, I move into the decompiler. I am looking for the init function, then the call chain out of it.

For a CPython extension, that usually means finding the module initialization entry point and following it into helper functions. From there, I trace calls into:

  • Python C API functions
  • OS file APIs
  • socket or WinINet/WinHTTP equivalents
  • memory allocation and permission-changing routines
  • crypto or decompression helpers

The exact names depend on the platform, but the pattern stays the same: import-time entry point, then a sequence of native calls that eventually becomes behavior.

A practical technique is to annotate only the functions that matter:

  • initialization
  • blob decode/decrypt
  • process creation
  • network setup
  • file write
  • persistence logic

That keeps the analysis from turning into a pile of irrelevant wrappers.

Reconstructing Hidden Logic When There Is No Readable Python Source

Recover control flow from imports, call graphs, and indirect jumps

The first thing I remind myself is that the source is not gone; it is just represented differently.

You can recover a surprising amount of behavior from:

  • imported libraries
  • cross-references to strings
  • function call graphs
  • indirect jump targets
  • exception handling paths
  • wrapper functions around native APIs

If the module is packed or heavily optimized, the control flow may be noisy. I still start with the same question: what is the shortest path from module entry to side effect?

If the path goes:

init -> config decode -> environment check -> network connect -> execute next stage

then I do not need to understand every helper to understand the sample’s purpose.
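
If you export the call graph from your disassembler, the "shortest path to side effect" question becomes a breadth-first search. A minimal sketch, assuming the graph is a plain dict of caller to callees; the function labels in the usage example are hypothetical analyst annotations, not names from the sample.

```python
from collections import deque


def shortest_path_to_side_effect(call_graph, entry, side_effects):
    """Breadth-first search from entry to the nearest side-effect function.

    call_graph maps a function name to the names it calls. Returns the
    path as a list of names, or None if no side effect is reachable.
    """
    queue = deque([[entry]])
    seen = {entry}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in side_effects:
            return path
        for callee in call_graph.get(node, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None


if __name__ == "__main__":
    graph = {
        "init": ["decode_config", "log"],
        "decode_config": ["env_check"],
        "env_check": ["net_connect"],
    }
    print(shortest_path_to_side_effect(graph, "init", {"net_connect"}))
```

Indirect calls will not appear in a static graph, so a missing path means "not found statically", not "not reachable".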

Spot hard-coded endpoints, keys, and command handlers without overfitting

People often overfit on one string and call it “the IOC.” I try not to do that. Hard-coded values are useful, but only if I understand their role.

Look for:

  • domains and IPs
  • URI paths
  • mutex names
  • filenames
  • bot identifiers
  • encryption keys or IV fragments
  • command verbs or subcommands

Then ask:

  • Is this a real endpoint or a fallback?
  • Is this a command router or just logging?
  • Is the string used directly, or only after decoding?
  • Does the binary contain multiple configuration sets?

That last point matters because compiled malware often carries staged config. You may find one string in the static sample and a different one after the first decrypt step at runtime.

Compare the extension against benign native modules to spot suspicious deltas

This is one of my favorite shortcuts when time is tight. Compare the sample to a benign extension built for the same interpreter and platform.

A normal extension usually has:

  • predictable exports
  • obvious bindings to a small set of functions
  • stable imports for platform APIs
  • modest initialization work

A suspicious extension often differs by:

  • unusually large init routine
  • opaque helper names
  • custom resource sections
  • embedded compressed data
  • unnecessary networking or process APIs
  • high entropy in non-code sections

You do not need the benign module to be the same package. It just needs to be the same kind of artifact. That gives you a baseline for what normal looks like in this ecosystem.
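
The baseline comparison can be as simple as a set difference over imported symbols. A minimal sketch, assuming you have already dumped import names for both binaries with your tool of choice; the family groupings are my own shortlist, not a complete taxonomy.

```python
# A short, non-exhaustive list of import names grouped by capability.
SUSPICIOUS_FAMILIES = {
    "network": {"connect", "socket", "getaddrinfo", "WSAStartup",
                "InternetOpenA", "WinHttpOpen"},
    "process": {"CreateProcessA", "CreateProcessW", "execve", "fork",
                "system", "popen"},
    "memory_permissions": {"VirtualProtect", "VirtualAlloc", "mprotect", "mmap"},
}


def import_delta(sample_imports, baseline_imports):
    """Return suspicious imports present in the sample but not the baseline."""
    extra = set(sample_imports) - set(baseline_imports)
    return {
        family: sorted(extra & names)
        for family, names in SUSPICIOUS_FAMILIES.items()
        if extra & names
    }
```

Several of these imports are ordinary on their own. The delta matters because it shows capability the benign peer did not need.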

Dynamic Analysis in an Isolated Lab

Set up a reversible sandbox with network capture and filesystem logging

Once the static pass suggests the sample is active, I switch to a controlled lab. I want the machine to be disposable, and I want every meaningful action logged.

At minimum, I set up:

  • a snapshot-capable VM
  • host-only or heavily filtered networking
  • filesystem event logging
  • process creation logging
  • DNS and HTTP capture
  • a clear rollback plan

If the sample reaches out to the network, I want to see where. If it writes a file, I want to know exactly what it wrote and where. If it spawns another process, I want the process tree.

Do not test this on a developer laptop just because it is “only Python.”

Observe process creation, imported libraries, and runtime decryption steps

When the extension loads, I watch for:

  • child processes
  • DLL or shared-library loads
  • access to suspicious system APIs
  • repeated crashes before successful execution
  • delayed behavior after a sleep or environment check

On Windows, Process Monitor, Process Explorer, ETW-based tooling, and API tracing can be useful. On Linux, strace, ltrace, auditd, ptrace-based tooling, and sysdig-style visibility help a lot.

The key question is whether the binary does any of the following at runtime:

  • unpacks hidden data
  • resolves functions dynamically
  • copies itself somewhere else
  • launches an external interpreter or shell
  • creates a scheduled task, service, or login item
  • modifies shell startup or application-specific autoload paths

If I see a decrypt-then-execute sequence, I treat that as the real payload path and stop expecting the static binary to tell the full story.

Trace network beacons, file writes, and any persistence attempts

I usually break runtime observation into three buckets:

Behavior | What to capture | Why it matters
Network | DNS, SNI, HTTP metadata, destination IPs | Tells you where the sample wants to talk
Filesystem | Paths, names, hashes, timestamps | Reveals staging, drop locations, and persistence
Process | Parent-child tree, command line, modules | Shows escalation and execution chaining

I keep the capture safe and minimal. If the sample writes a second-stage file, I hash it, quarantine it, and inspect that artifact separately rather than letting it run loose.

Use Python and OS-level instrumentation to recover behavior safely

Because the host is Python, I like to instrument from both sides: the interpreter side and the operating-system side.

Useful ideas include:

  • wrapping or monitoring imports
  • logging sys.modules growth over time
  • tracing filesystem and subprocess calls
  • using breakpoints on native entry points
  • observing environment-dependent behavior

If the sample is sensitive to environment checks, I may need to vary only one control at a time: hostname, username, domain membership, Python version, or presence of certain files. That helps me separate real logic from anti-analysis noise.

The safest rule is to change as little as possible between runs. If the sample behaves differently, you want to know why.
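
On the interpreter side, Python 3.8 and later expose audit hooks that cover several of the ideas above. A minimal sketch, assuming it runs inside the disposable lab VM before the sample is imported; the watched event list is a starting selection. Audit hooks only see events raised by the interpreter and standard library, so native code that calls OS APIs directly stays invisible here and still needs OS-level tracing.

```python
import sys

# Audit events raised by CPython and the stdlib that matter for triage.
WATCHED_EVENTS = frozenset({
    "import", "open", "subprocess.Popen", "socket.connect",
    "os.system", "ctypes.dlopen",
})


def install_audit_log(watched=WATCHED_EVENTS):
    """Register an audit hook and return the list it appends to.

    Audit hooks cannot be removed once added, so use a throwaway process.
    """
    log = []

    def hook(event, args):
        if event in watched:
            log.append((event, repr(args)[:200]))

    sys.addaudithook(hook)
    return log


def new_modules_since(before):
    """Names added to sys.modules since the `before` snapshot was taken."""
    return sorted(set(sys.modules) - set(before))
```

A typical run takes a snapshot with set(sys.modules), installs the hook, imports the sample, and then compares the log and the module delta between otherwise identical runs.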

Detection Opportunities for SOC and Endpoint Teams

Hunt for unusual Python extension loading from user-writable paths

One of the cleanest detections is simple: Python loading a native extension from a place it should not.

Examples include:

  • user profile directories
  • downloads folders
  • temp directories
  • package cache paths that should not contain executable payloads
  • project workspaces with unexpected compiled binaries

This is especially suspicious when the extension is imported by a script that normally only uses pure Python modules.

A useful hunt question is: did a Python process load a native module from a path that was recently written by the same user or process tree?
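
A first version of that hunt is a path classifier over module-load telemetry. A minimal sketch, assuming your EDR or audit log gives you the loaded file path as a string; the directory markers are hypothetical and should be replaced with the writable locations in your own environment.

```python
from pathlib import PurePosixPath, PureWindowsPath

EXTENSION_SUFFIXES = (".so", ".pyd")
# Hypothetical markers for user-writable locations; tune per environment.
WRITABLE_MARKERS = {"tmp", "temp", "downloads", "appdata", ".cache"}


def is_suspicious_extension_load(path):
    """True if path is a native extension under a user-writable directory."""
    # Pick path semantics from the separator so logs from both OSes work.
    pure = PureWindowsPath(path) if "\\" in path else PurePosixPath(path)
    if not pure.name.lower().endswith(EXTENSION_SUFFIXES):
        return False
    parent_parts = {part.lower() for part in pure.parts[:-1]}
    return bool(parent_parts & WRITABLE_MARKERS)
```

Path matching alone will be noisy on developer machines. Joining it with "file written recently by the same process tree" is what turns it into a usable alert.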

Correlate suspicious parent-child process trees around Python hosts

A Python host is not automatically suspicious. The parent and child processes make it interesting.

I would flag cases where:

  • a Python interpreter launches shell commands unexpectedly
  • a script spawns PowerShell, cmd, bash, curl, wget, or a downloader
  • the extension causes Office, browser, or system utilities to launch
  • a service host or scheduled task runs Python from a nonstandard path

The process tree often reveals the real objective faster than the binary does. If the extension is just a loader, the child process may be the thing actually doing the damage.
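
The process-tree check can be prototyped over exported process-creation events before it becomes an EDR rule. A minimal sketch, assuming each event is a dict with hypothetical "parent_image" and "image" fields holding full paths; the child list is taken from the examples above.

```python
import re

SUSPICIOUS_CHILDREN = {
    "powershell.exe", "cmd.exe", "bash", "sh", "curl", "wget",
    "curl.exe", "wget.exe",
}


def _basename(path):
    """Lowercased final path component, for either / or \\ separators."""
    return re.split(r"[\\/]", path)[-1].lower()


def flag_python_children(events):
    """Return events where a Python host spawned a shell or downloader."""
    flagged = []
    for event in events:
        parent = _basename(event["parent_image"])
        child = _basename(event["image"])
        # Matches python, python3, python3.11, python.exe, pythonw.exe.
        if parent.startswith("python") and child in SUSPICIOUS_CHILDREN:
            flagged.append(event)
    return flagged
```

Build tools and test runners spawn shells from Python all the time, so treat hits as a prioritization signal and combine them with the extension-load path check.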

Flag outbound connections from loaders that should stay local

If a Python extension that is supposed to parse data or provide a local binding starts making outbound connections, that is worth a look.

Detections should pay attention to:

  • DNS lookups from service accounts
  • HTTP requests from developer tooling that normally runs offline
  • connections to newly observed domains
  • repeated small beacons shortly after module load

I do not need to claim every outbound request is malicious. I just need enough signal to prioritize the sample for review.
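
Two of those signals are easy to score from connection timestamps. A minimal sketch, assuming timestamps are seconds as floats and each connection is a dict with a hypothetical "time" key; the 60-second window and 20% jitter ratio are assumptions to tune.

```python
import statistics


def connections_after_load(load_time, connections, window_seconds=60.0):
    """Connections that started within window_seconds after module load."""
    return [
        conn for conn in connections
        if 0 <= conn["time"] - load_time <= window_seconds
    ]


def looks_like_beacon(timestamps, min_events=4, max_jitter_ratio=0.2):
    """True if connection times are evenly spaced enough to suggest a beacon.

    Uses the ratio of the standard deviation of the intervals to their mean.
    """
    times = sorted(timestamps)
    if len(times) < min_events:
        return False
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    mean = statistics.fmean(intervals)
    if mean <= 0:
        return False
    return statistics.pstdev(intervals) / mean <= max_jitter_ratio
```

Samples that randomize their sleep interval will evade a fixed jitter threshold, so this is a ranking aid rather than a detector.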

Build YARA and EDR logic around packer traits, not just strings

String-based detection alone is weak here. The sample can rename strings, obfuscate configs, or move data into encrypted resources.

Better detection candidates include:

  • high-entropy sections
  • unusual import combinations
  • module init routines with large opaque blobs
  • runtime unpacking patterns
  • repeated use of memory permission changes
  • custom resource sections in extension binaries

If you are writing YARA, focus on structural traits and binary markers, not just one endpoint string that will disappear in the next build.

Defensive Controls That Reduce This Tradecraft’s Effectiveness

Restrict unsigned or untrusted native modules where possible

The cleanest defense is policy. If your environment does not need arbitrary native Python extensions, do not let them load freely.

Good controls include:

  • allowing only signed or vetted binaries
  • restricting writable module paths
  • enforcing package provenance checks
  • requiring review for compiled artifacts in dependency updates

You do not have to ban all native modules. You do have to make them hard to introduce without scrutiny.

Enforce application control and provenance checks for Python dependencies

Python dependency management is often treated as “just pip.” That is a mistake. Compiled wheels and extension modules deserve the same provenance checks as any other executable.

I would want:

  • pinned dependencies
  • private mirrors for approved packages
  • hash verification where feasible
  • review of compiled artifacts before promotion
  • alerts when a package adds native code unexpectedly

If a package that used to be pure Python suddenly ships a .so or .pyd, that is a review event.
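
That review event can be raised automatically, because a wheel is a zip archive. A minimal sketch, assuming you have the previous and new wheel files side by side; the function names are hypothetical and the suffix list is a reasonable default, not an exhaustive one.

```python
import zipfile

NATIVE_SUFFIXES = (".so", ".pyd", ".dll", ".dylib")


def native_members(wheel):
    """List archive members of a wheel that look like native binaries.

    `wheel` is a path or a binary file object. Nothing is extracted or run.
    """
    with zipfile.ZipFile(wheel) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(NATIVE_SUFFIXES)
        )


def newly_native(old_wheel, new_wheel):
    """True if the new wheel ships native code and the old one did not."""
    return not native_members(old_wheel) and bool(native_members(new_wheel))
```

Source distributions need a different check, since they can compile native code at install time without shipping a binary.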

Harden developer endpoints, build systems, and package mirrors

Developer workstations and CI systems are the most likely places for this tradecraft to succeed. They already run Python tooling, install packages frequently, and trust dependency caches.

Harden them by:

  • limiting internet access during builds
  • separating package download from execution
  • monitoring package index usage
  • scanning cached wheels and extension binaries
  • keeping build credentials off general-purpose endpoints

A lot of malware succeeds because the packaging environment is allowed to behave like a runtime environment.

Add runtime monitoring for extension loads, subprocesses, and DLL search abuse

At runtime, I want telemetry on three things:

  1. native extension loads
  2. subprocess creation
  3. suspicious library search behavior

If a Python process is searching user-writable directories for DLLs or shared libraries, that is a strong signal. If it loads an extension and immediately spawns another process, that is stronger.

These detections are useful because they catch the technique, not just one specimen.

Analyst Checklist and Verification Steps

Triage questions to answer before deep reversing

Before I spend time in the decompiler, I want to answer these questions:

  • What platform and Python version does the extension target?
  • Does it export only a module init function, or more?
  • Are there obvious strings for config, endpoints, or commands?
  • Does the binary look packed or encrypted?
  • Does import-time execution appear likely?
  • Are there dependent native libraries that need separate review?

If I cannot answer those, I am not ready to claim I understand the sample.

Minimum artifacts to preserve for later review

I save the following even when the sample looks simple:

  • the original binary
  • file hashes
  • strings output
  • dependency list
  • screenshots or notes from the decompiler
  • network captures
  • process tree logs
  • filesystem write logs
  • any extracted secondary payloads

The reason is boring but important: native samples often reveal only part of their behavior on the first run. You want enough evidence to reconstruct the rest later without rerunning the sample unnecessarily.
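
A hash manifest over the evidence folder makes that later reconstruction verifiable. A minimal sketch, assuming all preserved artifacts sit under one directory; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large captures fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir):
    """Map each file's path (relative to evidence_dir) to its SHA-256."""
    root = Path(evidence_dir)
    return {
        str(path.relative_to(root)): sha256_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Store the manifest outside the evidence directory so that a later re-hash can detect both modified and missing artifacts.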

When to escalate to full reversing versus containment only

Not every sample deserves a full reverse. I escalate when I need to answer at least one of these:

  • What does the module decrypt?
  • Where does it beacon?
  • What persistence does it create?
  • What secondary payload does it launch?
  • What environment checks gate execution?

If the answer does not affect containment, I may stop at a strong static and dynamic summary. If the sample is already active in the environment, containment comes first and deep reversing comes second.

Conclusion: What InvisibleFerret Teaches About Malware Hiding in Plain Sight

The key lesson for Python-heavy environments

The main lesson from the InvisibleFerret report is not that Python is suddenly dangerous. It is that Python ecosystems now carry native-code risk in the same places defenders are used to seeing text.

A compiled extension can look like a harmless dependency, load like a normal module, and still hide most of its behavior in opaque machine code. That is enough to break source-centric triage if you are not expecting it.

For me, the practical shift is simple: when Python meets native code, I stop treating it like scripting and start treating it like a loader problem.

Practical next steps for hunters and defenders

If you want to make this class of tradecraft less effective, start here:

  • alert on extension modules loaded from user-writable paths
  • inspect compiled wheels and native artifacts during dependency review
  • monitor Python processes for unexpected subprocess and network activity
  • preserve binaries, not just source trees, during incident response
  • add native-code tooling to your Python triage playbook

That combination will not catch everything, but it closes the blind spot that compiled extensions exploit. And that is the real value of this report: it is a reminder that a Python package can still be a native malware delivery vehicle.
