CyberBlue Academy — Blue Team & SOC Training

What You'll Learn

Explain what static analysis is and why it is the first step in any malware investigation
Identify the key components of the PE (Portable Executable) file format: DOS header, PE header, section table, and entry point
Describe the purpose of common PE sections (.text, .data, .rsrc, .reloc) and what anomalies to look for in each
Extract strings from a binary using FLOSS and the strings command on both Windows and Linux
Identify suspicious string categories: URLs, IP addresses, file paths, API calls, registry keys, and encoded data
Apply a string analysis workflow to perform initial triage on an unknown binary
Connect static analysis findings to YARA rules (Module 10) and CyberChef for deeper investigation
Interpret compilation timestamps and linker metadata to assess binary origin and age

Why Static Analysis Comes First

When a suspicious file lands on your desk — pulled from a quarantine folder, extracted from a phishing email, or flagged by Wazuh — you face a critical decision: do you run it, or do you read it?

Static analysis means examining a binary without executing it. You inspect its structure, read its strings, examine its imports, check its metadata — all without letting it touch a running system. This is always the first step because it is safe, repeatable, and often reveals enough to classify a sample before you ever need a sandbox.

Analysis Type	What You Do	Risk Level	Speed
Static	Examine file structure, strings, imports, metadata	Zero — file never executes	Minutes
Dynamic	Execute in a sandbox and observe behavior	Contained — isolated environment	10–30 minutes
Manual reverse engineering	Disassemble and read code logic	Zero — file never executes	Hours to days

ℹ

Static analysis is not a replacement for dynamic analysis — it is a prerequisite. The goal is to extract as much intelligence as possible before execution. A 10-minute static pass might reveal the C2 server, the malware family, and the persistence mechanism — all without booting a sandbox. In Lab 11.1, you will perform a complete static analysis workflow on a real PE binary and extract actionable IOCs before any execution.

The PE File Format: Windows Executables Under the Microscope

Every .exe, .dll, .sys, and .scr file on Windows follows the Portable Executable (PE) format. Understanding PE structure is fundamental because malware authors must work within this format — and every shortcut they take leaves artifacts you can detect.

[Figure: PE file structure, from DOS header through PE header, section table, and section data]

DOS Header and DOS Stub

Every PE file begins with the DOS header, a legacy artifact from MS-DOS compatibility. The first two bytes are always 4D 5A (the ASCII characters "MZ" — named after Mark Zbikowski, a DOS architect). This magic number is how the operating system and analysis tools recognize a file as a PE executable.

The DOS header contains one critical field for analysts: e_lfanew — a 4-byte offset at position 0x3C that points to the PE header's location. Malware authors occasionally manipulate this value to confuse basic parsers.

Following the DOS header is the DOS stub — a small program that prints "This program cannot be run in DOS mode" if someone tries to run the executable in a DOS environment. Some malware replaces this stub with custom messages or junk data.

00000000  4D 5A 90 00 03 00 00 00  04 00 00 00 FF FF 00 00  |MZ..............|
00000010  B8 00 00 00 00 00 00 00  40 00 00 00 00 00 00 00  |........@.......|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 E0 00 00 00  |................|
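
To make the dump above concrete: the four bytes at offset 0x3C are E0 00 00 00, which read as a little-endian integer give e_lfanew = 0xE0. The following is a minimal Python sketch of that lookup using only the standard library. The function name `read_e_lfanew` is illustrative and not part of any tool used in this course.

```python
import struct

# The 64 bytes shown in the hex dump above.
DOS_HEADER = bytes.fromhex(
    "4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 "
    "B8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 E0 00 00 00"
)

def read_e_lfanew(data: bytes) -> int:
    """Return the PE header offset stored at 0x3C, after checking the MZ magic."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ magic or truncated DOS header")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)  # little-endian uint32
    return e_lfanew

print(hex(read_e_lfanew(DOS_HEADER)))  # 0xe0
```

On a real sample you would pass the first 64 bytes of the file instead of the constant. An e_lfanew that points past the end of the file, or back into the DOS header itself, is worth a closer look.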

PE Header (IMAGE_NT_HEADERS)

The PE header starts with the signature 50 45 00 00 ("PE\0\0") and contains two sub-structures:

File Header (COFF Header) — 20 bytes of critical metadata:

Field	What It Tells You
Machine	Target architecture: `0x14C` = x86, `0x8664` = x64
NumberOfSections	How many sections the binary contains
TimeDateStamp	Compilation timestamp (Unix epoch format)
Characteristics	Flags: executable, DLL, large address aware, etc.
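
As a rough illustration of how these four fields are laid out, the sketch below unpacks the 20-byte File Header that follows the PE signature. It assumes you already know the PE header offset (e_lfanew). The function name, the returned dictionary keys, and the synthetic sample bytes are all made up for this example.

```python
import struct

MACHINE_NAMES = {0x14C: "x86", 0x8664: "x64"}
FILE_FLAGS = {0x0002: "EXECUTABLE_IMAGE", 0x0020: "LARGE_ADDRESS_AWARE", 0x2000: "DLL"}

def parse_file_header(data: bytes, pe_offset: int) -> dict:
    """Parse the PE signature and the 20-byte File Header that follows it."""
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found at the given offset")
    # Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
    # NumberOfSymbols, SizeOfOptionalHeader, Characteristics
    (machine, num_sections, timestamp, _sym_ptr, _sym_count,
     opt_size, characteristics) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "number_of_sections": num_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_size,
        "flags": [name for bit, name in FILE_FLAGS.items() if characteristics & bit],
    }

# Synthetic header for illustration: x64, 5 sections, executable + large address aware.
sample = b"PE\x00\x00" + struct.pack("<HHIIIHH", 0x8664, 5, 1577836800, 0, 0, 0xF0, 0x0022)
print(parse_file_header(sample, 0))
```

Only three Characteristics bits are decoded here. The full flag list is longer, so treat the output as a triage summary rather than a complete parse.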

Optional Header — despite the name, it is mandatory for executables:

Field	What It Tells You
AddressOfEntryPoint	RVA where execution begins — malware may point this to an unusual section
ImageBase	Preferred load address (typically `0x00400000` for 32-bit EXEs and `0x10000000` for 32-bit DLLs; the 64-bit linker defaults are `0x140000000` and `0x180000000`)
SectionAlignment / FileAlignment	Memory and disk alignment values
SizeOfImage	Total size when loaded in memory
Subsystem	GUI (`0x02`) vs Console (`0x03`) — a "GUI" app with no window is suspicious
DataDirectory	Array of 16 entries pointing to imports, exports, resources, relocations, etc.
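
The Optional Header comes in two layouts, PE32 (magic 0x10B) and PE32+ (magic 0x20B), and ImageBase moves between them. The sketch below reads a few of the fields from the table under that assumption. The function name and the synthetic header bytes are illustrative only, and the input is expected to start at the first byte of the Optional Header (24 bytes after the PE signature).

```python
import struct

SUBSYSTEMS = {2: "GUI", 3: "Console"}

def parse_optional_header(opt: bytes) -> dict:
    """Read a few triage fields from an Optional Header (PE32 or PE32+)."""
    (magic,) = struct.unpack_from("<H", opt, 0)
    if magic == 0x10B:        # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", opt, 28)
    elif magic == 0x20B:      # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", opt, 24)
    else:
        raise ValueError(f"unexpected Optional Header magic {magic:#x}")
    (entry_rva,) = struct.unpack_from("<I", opt, 16)      # AddressOfEntryPoint
    (size_of_image,) = struct.unpack_from("<I", opt, 56)  # SizeOfImage
    (subsystem,) = struct.unpack_from("<H", opt, 68)      # Subsystem
    return {
        "format": "PE32" if magic == 0x10B else "PE32+",
        "entry_point_rva": entry_rva,
        "image_base": image_base,
        "size_of_image": size_of_image,
        "subsystem": SUBSYSTEMS.get(subsystem, str(subsystem)),
    }

# Synthetic PE32 Optional Header for illustration (all other fields left as zero).
opt = bytearray(96)
struct.pack_into("<H", opt, 0, 0x10B)
struct.pack_into("<I", opt, 16, 0x1000)
struct.pack_into("<I", opt, 28, 0x00400000)
struct.pack_into("<I", opt, 56, 0x8000)
struct.pack_into("<H", opt, 68, 2)
fields = parse_optional_header(bytes(opt))
print(fields)
```

The entry point is an RVA, so it only becomes meaningful once you map it to a section.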

⚠

Compilation timestamps are trivially spoofed. Malware authors routinely set fake timestamps to mislead investigators. A timestamp of January 1, 1970 (epoch zero) or a date far in the future cannot be a real build time; it usually means the field was zeroed or forged, although reproducible-build toolchains can also write non-date values into it. A timestamp that exactly matches a known-good binary suggests the value was copied from that file. Use timestamps as one data point, never as conclusive evidence, and cross-reference them with other metadata such as the linker version and Rich header hash.
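
The checks described in this warning can be sketched in a few lines of Python. The function name `assess_timestamp` and the verdict wording are illustrative. The sketch only flags the impossible cases (epoch zero, future dates) and deliberately returns "plausible" rather than "genuine" for everything else.

```python
from datetime import datetime, timezone

def assess_timestamp(time_date_stamp, now=None):
    """Convert a PE TimeDateStamp to UTC and attach a cautious verdict."""
    now = now or datetime.now(timezone.utc)
    compiled = datetime.fromtimestamp(time_date_stamp, tz=timezone.utc)
    if time_date_stamp == 0:
        verdict = "zeroed (epoch zero): stripped or forged"
    elif compiled > now:
        verdict = "in the future: forged or not a real date"
    else:
        verdict = "plausible: corroborate with other metadata"
    return compiled, verdict

compiled, verdict = assess_timestamp(1577836800)
print(compiled.isoformat(), "->", verdict)
```

A "plausible" verdict says nothing about authenticity. It only means the value does not rule itself out.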

Section Table and Common Sections

After the PE header comes the section table — an array of headers describing each section in the binary. Every section has a name, virtual address, virtual size, raw size, and characteristics flags.

Section	Purpose	What to Watch For
.text	Executable code	Unusually small .text + large unknown section = packed binary
.data	Initialized global and static variables	Strings, configuration data, embedded payloads
.rdata	Read-only data, import/export tables	Import table analysis reveals API usage
.rsrc	Resources: icons, dialogs, version info, embedded files	Embedded executables, encrypted payloads hidden as resources
.reloc	Relocation table for ASLR	Missing .reloc with ASLR enabled = anomaly
UPX0, UPX1	UPX packer sections	Clear indicator of UPX packing
.themida	Themida protector	Commercial packer/protector, common in crimeware

💡

Section names are cosmetic: the OS ignores them. Malware can name sections anything, such as .code, .xyz, or even an empty string. What matters is the characteristics flags. A section marked as both writable and executable (for example 0xE0000020, which means contains code, readable, writable, and executable) is a red flag, because legitimate software rarely needs self-modifying code outside of packers and JIT compilers.
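
To see how a value like 0xE0000020 breaks down, the following sketch decodes the four most relevant characteristics bits and tests for the writable-plus-executable combination. The constant values come from the PE specification. The helper names are illustrative.

```python
IMAGE_SCN_CNT_CODE = 0x00000020
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000

FLAG_NAMES = {
    IMAGE_SCN_CNT_CODE: "CNT_CODE",
    IMAGE_SCN_MEM_EXECUTE: "MEM_EXECUTE",
    IMAGE_SCN_MEM_READ: "MEM_READ",
    IMAGE_SCN_MEM_WRITE: "MEM_WRITE",
}

def decode_characteristics(value: int) -> list:
    """List the names of the decoded flag bits that are set in value."""
    return [name for bit, name in FLAG_NAMES.items() if value & bit]

def is_writable_and_executable(value: int) -> bool:
    """True when a section is mapped both writable and executable."""
    wx = IMAGE_SCN_MEM_WRITE | IMAGE_SCN_MEM_EXECUTE
    return value & wx == wx

print(decode_characteristics(0xE0000020), is_writable_and_executable(0xE0000020))
```

A typical compiler-generated .text section carries 0x60000020 (code, executable, readable), so the write bit is the one that stands out.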

Entry Point Analysis

The AddressOfEntryPoint field tells the OS where to start executing code. In legitimate software, this points into the .text section. Anomalies to watch for:

Entry point in a non-standard section (not .text) — suggests packing or injection
Entry point at the very end of a section — common in appended shellcode
Entry point at offset 0 of a section with high entropy — likely packed or encrypted
Entry point in a section with a suspicious name (UPX1, .packed, random characters)
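
Two of the checks above are easy to automate: finding which section holds the entry point, and measuring that section's entropy. The sketch below shows one way to do both. The section dictionaries, their keys, and the sample values are hypothetical. Entropy thresholds are a rule of thumb: values close to 8 bits per byte usually indicate compressed or encrypted data.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for uniform filler, 8.0 for random data)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())

def section_containing(sections: list, rva: int):
    """Return the section whose virtual range covers rva, or None."""
    for s in sections:
        size = max(s["virtual_size"], s["raw_size"])
        if s["virtual_address"] <= rva < s["virtual_address"] + size:
            return s
    return None

# Hypothetical section list for illustration.
sections = [
    {"name": ".text", "virtual_address": 0x1000, "virtual_size": 0x2000, "raw_size": 0x2000},
    {"name": "UPX1", "virtual_address": 0x9000, "virtual_size": 0x4000, "raw_size": 0x3A00},
]
entry_section = section_containing(sections, 0xA000)
print("entry point lies in:", entry_section["name"] if entry_section else "no section")
```

In practice you would feed `shannon_entropy` the raw bytes of the entry point's section and compare the result against the other sections in the same file.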

Extracting and Analyzing Strings

String extraction is the single most productive static analysis technique for initial triage. Embedded text in a binary reveals what the malware communicates with, what it modifies, and what tools or techniques it uses.

The strings Command

On Linux, the strings command extracts printable ASCII sequences of a minimum length (default 4 characters):

strings suspicious.exe | head -50

strings -n 8 suspicious.exe     # minimum 8 characters (reduces noise)

strings -e l suspicious.exe     # extract UTF-16LE strings (common in Windows binaries)

On Windows, Sysinternals strings.exe provides equivalent functionality:

strings64.exe -n 8 suspicious.exe

strings64.exe -accepteula suspicious.exe | Select-String -Pattern "http"
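
It helps to know what these tools are doing under the hood. The sketch below approximates the two extraction modes shown above with regular expressions: runs of printable ASCII, and runs of printable characters each followed by a null byte (UTF-16LE). It is a simplified stand-in, not a reimplementation of either tool, and the function names and sample blob are made up for this example.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4) -> list:
    """Runs of at least min_len printable ASCII bytes (0x20 to 0x7E)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def utf16le_strings(data: bytes, min_len: int = 4) -> list:
    """Runs of at least min_len printable characters stored as UTF-16LE."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    return [m.group().decode("utf-16-le") for m in pattern.finditer(data)]

# Hypothetical blob: one ASCII URL and one UTF-16LE path surrounded by binary noise.
blob = (b"\x00\x01\x02http://example.test/a\x00\xff"
        + "C:\\Users\\Public".encode("utf-16-le")
        + b"\x00\x00\x03ab\x00")
print(ascii_strings(blob))
print(utf16le_strings(blob))
```

The ASCII pass misses the path entirely because every character is separated by a null byte, which is why running both modes matters on Windows binaries.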

FLOSS: Beyond Basic Strings

The FLARE Obfuscated String Solver (FLOSS) from Mandiant goes far beyond strings. It identifies likely decoding routines in the binary and emulates them to recover strings that the author obfuscated at build time, including strings assembled on the stack:

floss suspicious.exe

floss --no stack -- suspicious.exe            # skip stack strings for faster results (FLOSS 2.x syntax)

floss -j suspicious.exe > floss_output.json   # JSON output for scripting

Tool	Finds Static Strings	Finds Stack Strings	Deobfuscates Encoded Strings
`strings`	Yes	No	No
FLOSS	Yes	Yes	Yes
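
To see why the last column matters, consider a string protected with a single-byte XOR, one of the simplest encodings malware uses. A plain strings pass finds nothing, yet the content is trivially recoverable. The sketch below brute-forces the key by looking for a known marker. This is not how FLOSS works internally (FLOSS emulates the sample's own decoding routines); it only illustrates the gap between the two rows of the table. The URL and function names are made up.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)

def brute_force_single_byte_xor(data: bytes, marker: bytes = b"http://") -> list:
    """Try keys 1..255 and return (key, decoded) pairs that contain marker."""
    hits = []
    for key in range(1, 256):
        decoded = xor_bytes(data, key)
        if marker in decoded:
            hits.append((key, decoded))
    return hits

# Hypothetical C2 URL, hidden with key 0x5A as a sample might store it.
hidden = xor_bytes(b"http://c2.example/gate.php", 0x5A)
for key, decoded in brute_force_single_byte_xor(hidden):
    print(hex(key), decoded.decode("ascii"))
```

Real samples often use multi-byte keys, custom ciphers, or strings built one character at a time on the stack, which is where an emulation-based tool earns its place.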

🚨

Never run FLOSS on a suspected malicious file on your analysis workstation without isolation. FLOSS emulates parts of the sample's code to decode strings. The sample is not executed natively, but you are still handling live malware and feeding attacker-controlled input to complex parsers. Always run string extraction tools inside your analysis VM or container, never on your host system.

Suspicious String Categories

When reviewing extracted strings, categorize them systematically:

Network Indicators:

URLs: http://, https://, ftp://
IP addresses: 192.168., 10.0., or public IPs
Domain names: especially DGA-looking names and dynamic DNS hosts (xkjr2.duckdns.org)
User-Agent strings: Mozilla/5.0, custom agents

File System Indicators:

Windows paths: C:\Users\, C:\Windows\Temp\, %APPDATA%
Linux paths: /tmp/, /etc/cron.d/, /var/log/
File extensions: .bat, .ps1, .vbs, .dll
Known malware drop locations: C:\ProgramData\, C:\Users\Public\

Windows API Calls:

Process manipulation: CreateRemoteThread, VirtualAllocEx, WriteProcessMemory
Execution: WinExec, ShellExecute, CreateProcess
Network: InternetOpen, URLDownloadToFile, HttpSendRequest
Registry: RegSetValueEx, RegCreateKey
Crypto: CryptEncrypt, CryptDecrypt, BCryptEncrypt

Persistence Indicators:

Registry keys: SOFTWARE\Microsoft\Windows\CurrentVersion\Run
Service creation: CreateService, sc create
Scheduled tasks: schtasks, at.exe

Encoded / Obfuscated Data:

Base64 strings: long runs of A-Z, a-z, 0-9, + and /, often (but not always) ending in = or ==
Hex-encoded data: continuous hex characters
XOR keys: short repeated byte sequences
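
The categories above translate naturally into a set of regular expressions. The sketch below is one possible starting point, not a complete classifier: the pattern set, category names, and sample strings are illustrative, and the Base64 pattern in particular will produce false positives on long identifiers.

```python
import re

CATEGORY_PATTERNS = {
    "url": re.compile(r"(?:https?|ftp)://\S+", re.I),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "windows_path": re.compile(r"[A-Za-z]:\\|%APPDATA%", re.I),
    "api_call": re.compile(
        r"\b(?:CreateRemoteThread|VirtualAllocEx|WriteProcessMemory|WinExec|"
        r"ShellExecute|CreateProcess|URLDownloadToFile|RegSetValueEx)"),
    "persistence": re.compile(r"CurrentVersion\\Run|schtasks|sc create", re.I),
    "base64": re.compile(r"^[A-Za-z0-9+/]{24,}={0,2}$"),
}

def categorize(strings: list) -> dict:
    """Group extracted strings by every category whose pattern they match."""
    findings = {name: [] for name in CATEGORY_PATTERNS}
    for s in strings:
        for name, pattern in CATEGORY_PATTERNS.items():
            if pattern.search(s):
                findings[name].append(s)
    return findings

# Hypothetical strings output. "TVqQ" is the Base64 encoding of the bytes 4D 5A 90.
sample = [
    "http://203.0.113.7/gate.php",
    "C:\\Users\\Public\\svc.exe",
    "CreateRemoteThread",
    "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run",
    "TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAA=",
    "This program cannot be run in DOS mode",
]
for category, hits in categorize(sample).items():
    print(category, hits)
```

A string can land in more than one category (the URL above is also an IPv4 hit), which is useful when you pivot on individual indicators later.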

String Analysis Workflow

Efficient string analysis follows a structured workflow that moves from broad extraction to targeted investigation:

Step 1: Extract — Run strings (ASCII and UTF-16) and FLOSS on the binary. Pipe output to a file for reference.

strings -n 6 sample.exe > strings_ascii.txt
strings -n 6 -e l sample.exe > strings_utf16.txt
floss sample.exe > strings_floss.txt

Step 2: Filter noise — Remove common library strings, compiler artifacts, and Windows API boilerplate. Focus on unique, unusual, or contextually suspicious strings.

# network indicators, searched across all three output files from Step 1
grep -iE '(http|ftp|\.[a-z]{2,4}/|[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+)' strings_*.txt

# process injection and download APIs
grep -iE '(CreateRemoteThread|VirtualAlloc|WriteProcessMemory|URLDownload)' strings_*.txt

# persistence mechanisms
grep -iE '(CurrentVersion\\Run|schtasks|cron)' strings_*.txt

Step 3: Categorize — Group findings into network IOCs, file system IOCs, behavioral indicators, and encoded data.

Step 4: Pivot — Take discovered IOCs and search for them in threat intelligence platforms. A URL found in strings can be checked in VirusTotal, MISP, or URLhaus. An API call pattern can be matched against known malware family profiles.

Step 5: Document — Record every finding with the offset where the string was found, the category, and its significance.
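
For Step 5, the offset is the piece analysts most often forget to capture. On the command line, GNU strings prints it when you add -t x. The sketch below does the same in Python and emits one record per finding. The record layout, pattern set, and sample blob are illustrative, and the significance field is left for the analyst to fill in.

```python
import json
import re

# Hypothetical byte patterns for two of the categories from Step 3.
PATTERNS = {
    "network": rb"https?://[\x21-\x7e]+",
    "persistence": rb"schtasks[\x20-\x7e]*",
}

def document_findings(data: bytes, patterns: dict) -> list:
    """Return findings sorted by file offset, ready to paste into a report."""
    records = []
    for category, pattern in patterns.items():
        for m in re.finditer(pattern, data):
            records.append({
                "offset": hex(m.start()),
                "category": category,
                "string": m.group().decode("ascii", errors="replace"),
                "significance": "",  # filled in by the analyst
            })
    return sorted(records, key=lambda r: int(r["offset"], 16))

# Hypothetical file contents for illustration.
blob = b"\x00" * 16 + b"http://198.51.100.9/x" + b"\x00" * 11 + b"schtasks /create"
records = document_findings(blob, PATTERNS)
print(json.dumps(records, indent=2))
```

Keeping the offset lets a colleague jump straight to the same bytes in a hex editor and confirm the context around the string.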

Connecting Static Analysis to Your Toolkit

Static analysis does not exist in isolation. Every finding connects to tools you already know:

Finding	Next Step	Tool
Suspicious string pattern	Write a detection rule for it	YARA (Module 10)
Base64-encoded payload	Decode and analyze the payload	CyberChef
C2 domain or IP	Search threat intelligence feeds	MISP (Module 5)
Compilation timestamp	Correlate with campaign timelines	MISP timeline / ATT&CK
API call pattern	Create endpoint detection	Velociraptor (Module 6)
File hash (MD5/SHA256)	Check reputation databases	VirusTotal / MalwareBazaar

💡

YARA and static analysis are natural partners. In Module 10, you wrote YARA rules that match on strings and hex patterns. Every suspicious string you extract during static analysis is a candidate for a YARA rule. In Lab 11.1, you will practice the full loop: extract strings → write a YARA rule → scan a directory for additional samples matching the same patterns.
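
As a rough sketch of the first half of that loop, the function below turns a list of extracted strings into YARA rule text. The rule layout (ascii wide modifiers, an MZ check, "N of them") is one reasonable choice and may differ from the conventions you used in Module 10. The rule name and sample strings are hypothetical.

```python
def build_yara_rule(rule_name: str, strings: list, min_matches: int = 2) -> str:
    """Emit YARA rule text that matches PE files containing min_matches of the strings."""
    lines = [f"rule {rule_name}", "{", "    strings:"]
    for i, s in enumerate(strings):
        # YARA text strings need backslashes and double quotes escaped.
        escaped = s.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}" ascii wide')
    lines += [
        "    condition:",
        f"        uint16(0) == 0x5A4D and {min_matches} of them",
        "}",
    ]
    return "\n".join(lines)

rule_text = build_yara_rule(
    "Suspicious_Sample",
    ["http://c2.example/gate.php", "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run"],
)
print(rule_text)
```

Review any generated rule by hand before you deploy it. Generic strings such as a bare Run key path will match plenty of legitimate software on their own, which is why the condition requires more than one hit.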

Linux ELF Binaries: The Other Side

While PE is the dominant format on Windows, Linux malware uses the ELF (Executable and Linkable Format). The same static analysis principles apply:

file suspicious_binary
# suspicious_binary: ELF 64-bit LSB executable, x86-64, dynamically linked

readelf -h suspicious_binary     # ELF header (entry point, architecture, type)

readelf -S suspicious_binary     # section headers (similar to PE sections)

readelf -d suspicious_binary     # dynamic section (shared library dependencies)

strings -n 8 suspicious_binary | grep -iE "(http|/tmp/|/bin/|socket|connect)"

PE Concept	ELF Equivalent
.text section	.text section
.data section	.data / .bss sections
.rsrc section	No direct equivalent (resources handled differently)
Import Address Table	.dynsym / .plt (dynamic symbols and procedure linkage table)
PE header	ELF header (`readelf -h`)
DLL dependencies	Shared library dependencies (`ldd` or `readelf -d`)
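
The ELF header is as easy to read by hand as the PE headers. The sketch below pulls out the same fields that readelf -h reports first: class, byte order, type, machine, and entry point. Only a handful of type and machine codes are mapped. The function name and the synthetic header are illustrative.

```python
import struct

ELF_TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}
ELF_MACHINES = {0x03: "x86", 0x28: "ARM", 0x3E: "x86-64", 0xB7: "AArch64"}

def parse_elf_header(data: bytes) -> dict:
    """Read class, byte order, type, machine, and entry point from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file: bad magic")
    ei_class, ei_data = data[4], data[5]          # 1 = 32-bit / LSB, 2 = 64-bit / MSB
    endian = "<" if ei_data == 1 else ">"
    e_type, e_machine, _version = struct.unpack_from(endian + "HHI", data, 16)
    entry_format = "Q" if ei_class == 2 else "I"  # e_entry is 8 bytes on ELF64
    (e_entry,) = struct.unpack_from(endian + entry_format, data, 24)
    return {
        "class": "ELF64" if ei_class == 2 else "ELF32",
        "byte_order": "LSB" if ei_data == 1 else "MSB",
        "type": ELF_TYPES.get(e_type, str(e_type)),
        "machine": ELF_MACHINES.get(e_machine, hex(e_machine)),
        "entry_point": e_entry,
    }

# Synthetic ELF64 little-endian header for illustration.
header = (b"\x7fELF" + bytes([2, 1, 1, 0]) + b"\x00" * 8
          + struct.pack("<HHIQ", 2, 0x3E, 1, 0x401000))
print(parse_elf_header(header))
```

As with PE files, an entry point that falls outside the expected code section is a prompt to look more closely at the section headers.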

Key Takeaways

Static analysis examines a binary without executing it — it is always the first step because it is safe, fast, and often reveals enough to classify a sample
The PE format has a predictable structure: DOS header (MZ magic), PE header (compilation timestamp, entry point, characteristics), section table, and section data
Section anomalies reveal packing and tampering: writable+executable sections, entry points outside .text, unusual section names, or entropy mismatches
Compilation timestamps provide timeline intelligence but are trivially spoofed — always cross-reference with other metadata
String extraction using strings and FLOSS is the highest-value static technique: URLs, IPs, API calls, registry keys, and encoded data all reveal malware intent
Follow a structured string analysis workflow: extract → filter → categorize → pivot → document
Every static finding connects to your existing toolkit: strings feed YARA rules, encoded data feeds CyberChef, network IOCs feed MISP, API patterns feed Velociraptor hunts
ELF binaries on Linux follow the same analysis principles — use readelf, file, and strings instead of PE-specific tools

What's Next

You now know how to examine a binary's structure and extract strings — the "what is this file made of?" question. But two critical questions remain: "Have we seen this file before?" and "Is this file hiding something?" In Lesson 11.2, you will learn to hash files for reputation lookups, detect packers that compress and encrypt code, and analyze the Import Address Table to understand what Windows APIs a binary calls — the next layer of static analysis that separates commodity malware from sophisticated threats.

Knowledge Check: PE Structure & String Analysis

What is the primary advantage of static analysis over dynamic analysis as the first step in malware investigation?

What are the first two bytes (magic number) of every valid PE file?

Which PE section typically contains the executable code of a binary?

In Lab 11.1, you extract strings from a PE binary and find the string 'CreateRemoteThread'. What category of suspicious activity does this API call indicate?

What advantage does FLOSS provide over the standard 'strings' command?

You find a PE section with both the writable and executable characteristics flags set. Why is this a red flag?

During string analysis of a suspicious binary in Lab 11.1, you find 'SOFTWARE\Microsoft\Windows\CurrentVersion\Run'. What does this indicate?

Why should compilation timestamps in PE headers be treated with caution during analysis?

What is the correct order of steps in a string analysis workflow for malware triage?

Which command extracts UTF-16 Little Endian strings from a binary on Linux — a critical step since Windows binaries often store strings in this encoding?
