What You'll Learn
- Construct YARA conditions using Boolean operators (and, or, not) to combine string matches
- Use string counting operators (#, any of, all of, N of) for flexible detection logic
- Apply file property checks (filesize, uint16, uint32) to restrict matches to specific file types
- Use positional operators ($string at N, $string in range) for precise byte-offset matching
- Compare weak rules with strong rules and identify techniques that reduce false positives
- Connect condition-building skills to the webshell detection challenge in Lab 7.3
The Condition Section: Where Precision Lives
You have built a toolkit for writing YARA strings — text with modifiers, hex with wildcards and jumps, and regex for variable patterns. But strings alone do not make a good rule. The condition section determines when your rule fires, and the difference between a useful rule and a noisy one is almost always in the condition.
A rule with great strings and a weak condition (any of them) can match thousands of legitimate files. A rule with the same strings and a precise condition is far more likely to match only the target. The condition is where you control the signal-to-noise ratio.
Boolean Operators
YARA conditions use three Boolean operators: and, or, and not.
and — Both Must Be True
condition:
$download and $webclient
The rule fires only if both $download and $webclient are found in the file. Adding more and clauses makes the rule more restrictive (fewer matches, fewer false positives).
or — Either Can Be True
condition:
$eval or $system or $exec
The rule fires if any one of the three strings is found. Using or makes the rule more permissive (more matches, potentially more false positives). Use or when different strings indicate the same behavior — a web shell might use eval, system, or exec to execute commands, but they all mean "command execution."
not — Must NOT Be Present
condition:
$suspicious_string and not $known_good_string
The not operator excludes files that contain a specific string. This is powerful for eliminating known false positives. For example, if your web shell rule keeps matching a legitimate PHP framework file that happens to contain eval(, you can add a not clause for a string unique to that framework:
strings:
$eval = "eval(" nocase
$laravel = "Illuminate\\Foundation\\Application"
condition:
$eval and not $laravel
Operator Precedence and Grouping
YARA follows standard operator precedence: not binds tightest, then and, then or. Use parentheses to make complex conditions readable:
condition:
($eval and $b64) or ($system and $post) or ($exec and $get)
Without parentheses, $eval and $b64 or $system would be parsed as ($eval and $b64) or $system — which fires if $system alone is present. Always use parentheses when mixing and and or.
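To see why the grouping matters, here is a minimal sketch in Python, whose and/or/not operators follow the same precedence order. It assumes a simplified model in which each YARA string is reduced to a True/False "matched" flag; the function names are illustrative and not part of YARA.

```python
# Simplified model: each YARA string is reduced to a boolean "matched" flag.
def without_parens(eval_hit: bool, b64_hit: bool, system_hit: bool) -> bool:
    # Same shape as the YARA condition: $eval and $b64 or $system
    return eval_hit and b64_hit or system_hit


def explicit_grouping(eval_hit: bool, b64_hit: bool, system_hit: bool) -> bool:
    # What the parser evaluates: ($eval and $b64) or $system
    return (eval_hit and b64_hit) or system_hit


def intended_grouping(eval_hit: bool, b64_hit: bool, system_hit: bool) -> bool:
    # What an author may have meant: $eval and ($b64 or $system)
    return eval_hit and (b64_hit or system_hit)


# A file that contains only the $system string.
only_system = dict(eval_hit=False, b64_hit=False, system_hit=True)
print(without_parens(**only_system))     # True
print(explicit_grouping(**only_system))  # True
print(intended_grouping(**only_system))  # False
```

The first two functions always agree, which is the point: leaving the parentheses out does not change what is evaluated, it only hides it from the reader.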
String Counting Operators
Counting operators are the most powerful tools for flexible detection. Instead of specifying exact Boolean combinations of named strings, you can count how many strings matched.
any of them / all of them
condition:
any of them // At least 1 string matches
all of them // Every defined string matches
any of them is the loosest possible condition (highest recall, lowest precision). all of them is the tightest (lowest recall, highest precision — but fails if the target is missing even one string).
N of them
condition:
3 of them // At least 3 of all defined strings
This is the sweet spot for most rules. If you define 6 strings that characterize a malware family, requiring 3 matches means the rule catches variants where some strings have been changed while keeping precision.
N of ($pattern*)
strings:
$web_eval = "eval(" nocase
$web_system = "system(" nocase
$web_exec = "exec(" nocase
$web_passthru = "passthru(" nocase
$input_post = "$_POST"
$input_get = "$_GET"
$input_request = "$_REQUEST"
condition:
2 of ($web_*) and 1 of ($input_*)
The $web_* wildcard matches all string names starting with $web_. This condition requires at least 2 execution functions and at least 1 user input source — a pattern that strongly indicates a web shell.
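The counting logic in that condition can be modelled outside YARA. The following is a rough Python sketch, assuming plain substring search stands in for YARA's string matching; `count_matched` and `looks_like_webshell` are hypothetical helper names, and the sample inputs are made up for illustration.

```python
from typing import Sequence

WEB_STRINGS = [b"eval(", b"system(", b"exec(", b"passthru("]  # $web_* (nocase)
INPUT_STRINGS = [b"$_POST", b"$_GET", b"$_REQUEST"]           # $input_* (case-sensitive)


def count_matched(data: bytes, patterns: Sequence[bytes], nocase: bool = False) -> int:
    """Return how many distinct patterns occur at least once in data."""
    haystack = data.lower() if nocase else data
    return sum(
        1 for p in patterns
        if (p.lower() if nocase else p) in haystack
    )


def looks_like_webshell(data: bytes) -> bool:
    # Mirrors: 2 of ($web_*) and 1 of ($input_*)
    return (count_matched(data, WEB_STRINGS, nocase=True) >= 2
            and count_matched(data, INPUT_STRINGS) >= 1)


shell = b"<?php if(isset($_POST['c'])){ system($_POST['c']); EVAL($_POST['x']); } ?>"
benign = b"<?php echo htmlspecialchars($_GET['q']); ?>"
print(looks_like_webshell(shell))   # True
print(looks_like_webshell(benign))  # False
```

Note that each group is counted by distinct string, not by number of occurrences: a file that calls system( ten times still contributes only 1 toward the `2 of ($web_*)` requirement.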
String Occurrence Count (#)
The # operator counts how many times a string appears in the file:
condition:
#eval > 3 // "eval" appears more than 3 times
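Multiple occurrences of a suspicious function call can be more indicative of malicious intent than a single occurrence. A legitimate PHP file might use eval once, while some web shells call it repeatedly. Treat the threshold as something to tune against your own samples rather than a fixed rule.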
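As a rough illustration of occurrence counting, the Python sketch below counts how often a pattern appears in a buffer. It assumes the `$eval` string is defined as "eval(" with nocase, and it uses `bytes.count`, which counts non-overlapping occurrences; `occurrence_count` is a hypothetical helper, not a YARA function.

```python
def occurrence_count(data: bytes, pattern: bytes, nocase: bool = True) -> int:
    """Count non-overlapping occurrences of pattern in data."""
    if nocase:
        data, pattern = data.lower(), pattern.lower()
    return data.count(pattern)


sample = b"<?php eval($a); EVAL($b); eval($c); eval($d); ?>"
print(occurrence_count(sample, b"eval("))      # 4
print(occurrence_count(sample, b"eval(") > 3)  # True, mirrors: #eval > 3
```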
Avoid any of them in production rules. It is useful for rapid triage and testing, but a production rule with any of them will almost certainly generate false positives. Every string you define could appear in legitimate software. The power of YARA comes from requiring combinations of indicators — 3 of them, 2 of ($exec_*) and $post, or explicit Boolean logic. Single-string matching is antivirus; multi-indicator matching is threat hunting.
File Property Checks
File properties let you restrict your rule to specific file types and sizes without relying solely on string matching.
filesize
condition:
filesize < 50KB // Less than 50 kilobytes
filesize > 100 and filesize < 2MB // Between 100 bytes and 2 megabytes
filesize < 10MB // Less than 10 megabytes
The filesize check is arguably the single most effective false-positive reducer in YARA. Common ranges by target type:
| Target | Typical Size | filesize Check |
|---|---|---|
| Web shell | 50 bytes - 50KB | filesize < 50KB |
| Malware dropper/stager | 10KB - 500KB | filesize < 500KB |
| RAT / backdoor | 50KB - 5MB | filesize < 5MB |
| Ransomware | 100KB - 2MB | filesize < 2MB |
| Legitimate enterprise app | 10MB - 500MB | (excluded by above ranges) |
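When translating these thresholds into byte counts, it helps to remember that YARA's KB and MB postfixes are binary multiples (1024 and 1024 × 1024). The short Python sketch below mirrors two of the checks above; the function names are illustrative only.

```python
KB = 1024        # YARA's KB postfix multiplies the number by 1024
MB = 1024 * KB   # YARA's MB postfix multiplies the number by 2**20


def small_enough_for_webshell(size: int) -> bool:
    # Mirrors: filesize < 50KB
    return size < 50 * KB


def within_range(size: int) -> bool:
    # Mirrors: filesize > 100 and filesize < 2MB
    return 100 < size < 2 * MB


print(50 * KB)                            # 51200
print(small_enough_for_webshell(51_200))  # False (the comparison is strict)
print(within_range(1_500_000))            # True
```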
uint16 and uint32 — Magic Byte Checks
The uint16(offset) and uint32(offset) functions read 2 or 4 bytes at a specific file offset and return them as an integer. This is how you check file format magic bytes:
condition:
uint16(0) == 0x5A4D // PE executable (MZ header)
uint32(0) == 0x464C457F // ELF binary (\x7FELF)
uint32(0) == 0x04034B50 // ZIP archive (PK header)
uint16(0) == 0x8B1F // GZIP compressed data
The uint16(0) == 0x5A4D check is the standard way to ensure you only match PE (Windows executable) files. Combined with filesize and string checks, this creates highly precise rules:
condition:
uint16(0) == 0x5A4D and
filesize < 1MB and
3 of them
This means: the file must be a PE executable, under 1MB, with at least 3 matching strings.
Note the byte order. YARA reads uint16 and uint32 in little-endian format (least significant byte first), which matches how x86 processors store integers. The MZ header bytes are 4D 5A in the file, but as a uint16 value they are 0x5A4D (bytes reversed). The ELF header bytes are 7F 45 4C 46 in the file, but as a uint32 they are 0x464C457F. This catches many beginners off guard.
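The byte reversal is easy to verify outside YARA. This small Python sketch uses the standard `struct` module to read little-endian integers from the first bytes of a few file headers; `uint16_le` and `uint32_le` are illustrative names chosen to echo YARA's functions.

```python
import struct


def uint16_le(data: bytes, offset: int = 0) -> int:
    """Read 2 bytes at offset as a little-endian unsigned integer."""
    return struct.unpack_from("<H", data, offset)[0]


def uint32_le(data: bytes, offset: int = 0) -> int:
    """Read 4 bytes at offset as a little-endian unsigned integer."""
    return struct.unpack_from("<I", data, offset)[0]


print(hex(uint16_le(b"MZ\x90\x00")))  # 0x5a4d
print(hex(uint32_le(b"\x7fELF")))     # 0x464c457f
print(hex(uint32_le(b"PK\x03\x04")))  # 0x4034b50
print(hex(uint16_le(b"\x1f\x8b")))    # 0x8b1f
```

Python's `hex` drops the leading zero, so the ZIP value prints as 0x4034b50; it is the same number as 0x04034B50.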
Positional Operators
Sometimes you need a string to appear at a specific location in the file, not just anywhere.
at — Exact Offset
condition:
$mz_header at 0 // MZ must be at the very start
$pe_sig at 128 // PE signature at offset 128
in — Offset Range
condition:
$mz_header at 0 and
$pe_sig in (60..1024) // PE signature within first 1KB
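The in (start..end) operator requires the match to begin within the given byte range. This is useful for file structure validation: the PE signature does not sit at one fixed offset (its position is stored in the 4-byte field at offset 0x3C of the MZ header), but it usually falls close to the start of the file, so a range check is more robust than a hard-coded at offset.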
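The two positional operators can be modelled in a few lines of Python. This is a sketch under stated assumptions: `string_at`, `string_in`, and `pe_signature_offset` are hypothetical helpers, the header bytes are a hand-built stub rather than a real executable, and the range is treated as inclusive of both ends.

```python
import struct
from typing import Optional

PE_SIG = b"PE\x00\x00"


def string_at(data: bytes, pattern: bytes, offset: int) -> bool:
    # Mirrors: $pattern at offset
    return data[offset:offset + len(pattern)] == pattern


def string_in(data: bytes, pattern: bytes, start: int, end: int) -> bool:
    # Mirrors: $pattern in (start..end); the match must begin inside the range
    idx = data.find(pattern, start)
    return idx != -1 and idx <= end


def pe_signature_offset(data: bytes) -> Optional[int]:
    """Return the offset stored at 0x3C of an MZ header, or None if not MZ."""
    if len(data) < 0x40 or not string_at(data, b"MZ", 0):
        return None
    return struct.unpack_from("<I", data, 0x3C)[0]


# Hand-built stub: MZ at 0, pointer at 0x3C, PE signature at 0x80 (128).
stub = bytearray(0x84)
stub[0:2] = b"MZ"
struct.pack_into("<I", stub, 0x3C, 0x80)
stub[0x80:0x84] = PE_SIG
header = bytes(stub)

print(string_at(header, b"MZ", 0))           # True
print(pe_signature_offset(header))           # 128
print(string_in(header, PE_SIG, 60, 1024))   # True
print(string_in(header, PE_SIG, 0, 100))     # False
```

In this stub the signature happens to sit at offset 128, but the pointer at 0x3C is what decides that, which is why a range check tolerates more real-world files than a fixed offset does.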
entrypoint — PE/ELF Entry Point
condition:
$shellcode at entrypoint // Shellcode starts at the entry point
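The entrypoint variable holds the file offset of the PE or ELF entry point when scanning a file. It has been deprecated since YARA 3.0 in favor of pe.entry_point from the pe module, so newer rules generally use the module form and the bare variable triggers a warning. If your shellcode pattern appears exactly at the entry point, the file deserves close scrutiny: legitimate programs rarely start with raw shellcode.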
Weak Rules vs. Strong Rules
The difference between a noisy rule and a production-quality rule is almost always in the condition. Here is a concrete comparison:
Weak Rule
rule Weak_WebShell
{
    strings:
        $a = "eval("
    condition:
        $a
}
Problems: Matches ANY file containing eval( — including legitimate PHP frameworks (Laravel, WordPress, Drupal), JavaScript build tools, Python scripts, and configuration generators. This rule would fire thousands of times on a typical web server with zero malicious files.
Strong Rule
rule Strong_WebShell
{
    strings:
        $eval = "eval(" nocase
        $b64 = "base64_decode" nocase
        $system = "system(" nocase
        $exec = "exec(" nocase
        $passthru = "passthru(" nocase
        $post = "$_POST" nocase
        $get = "$_GET" nocase
        $request = "$_REQUEST" nocase
        $php = "<?php"
        $safe_wordpress = "WordPress"
        $safe_laravel = "Illuminate"
        $safe_drupal = "Drupal"
    condition:
        filesize < 50KB and
        $php and
        ($eval or $system or $exec or $passthru) and
        ($post or $get or $request) and
        not ($safe_wordpress or $safe_laravel or $safe_drupal)
}
Why this works:
- filesize < 50KB: web shells are usually small, so large files are skipped before any string logic matters
- $php: only match PHP files (not JavaScript or Python)
- Execution function required: the file must contain at least one command execution function
- User input required: the file must read from user-supplied HTTP parameters
- Framework exclusions: explicitly exclude files that carry markers of known legitimate projects
This rule requires the confluence of PHP code, execution capability, user input handling, and small size, plus the absence of known framework markers. Far fewer legitimate files meet all of these criteria than contain eval( alone. The exclusion strings are a trade-off, though: a web shell that embeds the word "WordPress" would also be skipped, so keep exclusion markers as specific as you can.
The FP Prevention Checklist
Before deploying any rule to production:
- File format check: add uint16(0) or $php_tag to restrict to the target file type
- File size limit: add filesize < N based on the expected size range
- Multiple indicators: require 2+ strings with different indicator types
- Flexible counting: use N of them instead of requiring all or any
- Exclusion strings: add not clauses for known false-positive sources
- Clean corpus test: run against a set of known-good files before deployment
Build conditions incrementally. Start with a loose condition (any of them) to verify your strings match the target. Then add filesize. Then add a format check. Then increase the required count. After each change, re-test against both your malware corpus and your clean corpus. Stop when you have 100% detection of targets and 0% false positives on clean files.
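The re-test loop described above can be sketched as a tiny harness. This Python example assumes each condition is modelled as a function over the file's bytes and that the corpora are small in-memory dictionaries; every name and sample here is hypothetical, and in practice you would run the real rules with YARA against real files.

```python
from typing import Callable, Dict, List, Tuple

KB = 1024


def evaluate(rule: Callable[[bytes], bool],
             malicious: Dict[str, bytes],
             clean: Dict[str, bytes]) -> Tuple[float, List[str]]:
    """Return (detection rate on malicious corpus, false-positive names)."""
    detected = [name for name, data in malicious.items() if rule(data)]
    false_pos = [name for name, data in clean.items() if rule(data)]
    rate = len(detected) / len(malicious) if malicious else 0.0
    return rate, false_pos


def loose(data: bytes) -> bool:
    # Step 1: a single string, equivalent to "any of them"
    return b"eval(" in data.lower()


def tighter(data: bytes) -> bool:
    # Later step: size limit, format marker, and a user-input requirement
    return (len(data) < 50 * KB
            and b"<?php" in data
            and b"eval(" in data.lower()
            and (b"$_POST" in data or b"$_GET" in data))


malicious = {"shell.php": b"<?php eval($_POST['c']); ?>"}
clean = {"cache.php": b"<?php // template cache\n eval($compiled); ?>"}

print(evaluate(loose, malicious, clean))    # (1.0, ['cache.php'])
print(evaluate(tighter, malicious, clean))  # (1.0, [])
```

Keeping the detection rate and the false-positive list side by side after each change makes it obvious when a tightening step has started to cost you detections.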
Combining Everything: A Complete Detection Rule
Here is a production-style rule that combines several of the condition techniques covered above:
rule Ransomware_LockBit3_Indicator
{
    meta:
        author = "CyberBlue Academy"
        description = "Detects LockBit 3.0 ransomware indicators"
        date = "2026-02-17"
        severity = "critical"
        mitre_att_ck = "T1486"
        tlp = "TLP:GREEN"
    strings:
        // Ransom note strings
        $note1 = "your data are stolen and encrypted" nocase wide ascii
        $note2 = ".onion" nocase
        $note3 = "restore-my-files" nocase wide ascii
        // Technical indicators
        $mutex = "Global\\lockbit" nocase wide ascii
        $ext = ".lockbit" nocase
        $shadow = "vssadmin delete shadows" nocase
        $bcdedit = "bcdedit /set {default} recoveryenabled no" nocase
        $wmic = "wmic shadowcopy delete" nocase
        // Hex pattern: the ASCII bytes of "LockBit 3.0"
        $lockbit_header = { 4C 6F 63 6B 42 69 74 20 33 2E 30 }
    condition:
        // The PE check uses uint16(0). Every string defined above is referenced
        // below, because YARA refuses to compile a rule with unreferenced strings.
        uint16(0) == 0x5A4D and
        filesize < 2MB and
        (
            (2 of ($note*)) or
            ($mutex and 1 of ($shadow, $bcdedit, $wmic)) or
            ($lockbit_header and $ext) or
            (3 of ($note*, $mutex, $ext, $shadow, $bcdedit, $wmic))
        )
}
This rule uses:
- uint16(0) == 0x5A4D: only PE executables
- filesize < 2MB: ransomware is compact
- Multiple detection paths connected by or: catches different variants where some strings may be missing
- Wildcard counting (2 of ($note*)): flexible matching within string groups
- Named string combinations: specific pairs that together are highly indicative
Key Takeaways
- Boolean operators (and, or, not) combine string matches; use parentheses when mixing operators to ensure correct evaluation
- String counting (N of them, N of ($pattern*), #string > N) provides flexible detection that catches malware variants
- filesize checks are the single most effective false-positive reducer; always include one based on the expected target size range
- uint16(0) and uint32(0) magic byte checks restrict rules to specific file formats (PE, ELF, ZIP, etc.)
- Positional operators (at, in, entrypoint) match strings at specific file offsets for structural validation
- Strong rules combine file format checks, size limits, multiple string types, flexible counting, and exclusion strings
- Weak rules use single strings with any of them; they match thousands of legitimate files
- In Lab 7.3, you will write 3 YARA rules to detect 5 webshells hidden among 500 files with zero false positives; condition precision is the key
What's Next
You now have the full YARA rule-writing toolkit: strings with modifiers (Lesson 7.2) and precise conditions (this lesson). In Lesson 7.4, you will learn how to deploy your rules at scale — scanning files, directories, disk images, and memory dumps efficiently. You will also learn performance optimization techniques that matter when scanning thousands of files with hundreds of rules.
Knowledge Check: Conditions & Logic
What is the difference between 'any of them' and '3 of them' in a YARA condition?
Why is 'uint16(0) == 0x5A4D' one of the most commonly used YARA condition checks?
Which condition correctly requires at least 2 execution functions AND at least 1 user input source using wildcard counting?
What does the 'not' operator do in a YARA condition like '$eval and not $laravel'?
What is the single most effective technique for reducing false positives in YARA rules?
What does the # operator do in a YARA condition like '#eval > 3'?
In the condition '$shellcode at entrypoint', what does the 'entrypoint' variable represent?
In Lab 7.3, you must find 5 webshells hidden among 500 files. A rule that uses 'any of them' matches 5 webshells but also 47 clean PHP files. How should you fix this?
When YARA reads uint16(0) as 0x5A4D, the actual bytes in the file at offset 0 are 4D 5A. Why is the value reversed?
Which YARA condition technique allows you to explicitly exclude known-good files from matching?