- Differentiate between short-term containment, long-term containment, eradication, and recovery, and explain why the order matters
- Apply network isolation techniques on both Windows and Linux systems to contain compromised endpoints
- Make containment decisions using a decision matrix that weighs business impact against threat severity
- Explain why evidence preservation must happen before or during containment, never after
- Describe eradication procedures: malware removal, backdoor elimination, credential resets, and vulnerability patching
- Execute recovery procedures: backup restoration, system rebuilding, validation testing, and re-infection monitoring
- Define communication protocols for active incidents: who communicates what, to whom, and how often
- Build a recovery validation checklist that confirms a system is clean before returning it to production
- Connect containment and recovery concepts to the hands-on simulation in Lab 14.3

## The Most Intense Phase

Detection tells you something is wrong. Analysis tells you what is wrong. But **containment, eradication, and recovery** is where you actually stop the attacker, remove their presence, and restore your environment. This is the phase where decisions have immediate, irreversible consequences: isolate the wrong server and you take down production; wait too long to isolate and the attacker pivots to more systems.

NIST groups these three activities into a single phase because they overlap in practice. You do not finish containment, then start eradication, then start recovery in neat sequential steps. In a real incident, you are containing one system while eradicating malware from another while recovering a third that was already cleaned. The skill is knowing when to transition between activities and how to run them in parallel without losing control.

## Short-Term Containment

Short-term containment is the emergency response: stop the immediate threat within minutes. The goal is to **limit damage** without destroying evidence.

### Network Isolation

The most common short-term containment action. Isolate the compromised system from the network so the attacker cannot use it to pivot, exfiltrate data, or communicate with C2 infrastructure.


**On Linux:**

```bash
# Block all network traffic except local loopback and SSH from the forensic subnet (for remote forensics)
iptables -I INPUT 1 -i lo -j ACCEPT
iptables -I INPUT 2 -p tcp --dport 22 -s 10.0.1.0/24 -j ACCEPT
iptables -I INPUT 3 -j DROP
iptables -I OUTPUT 1 -o lo -j ACCEPT
iptables -I OUTPUT 2 -p tcp --sport 22 -d 10.0.1.0/24 -j ACCEPT
iptables -I OUTPUT 3 -j DROP
```

**On Windows (PowerShell as Administrator):**

```powershell
# Disable all network adapters except the management interface
Get-NetAdapter | Where-Object { $_.Name -ne "Management" } | Disable-NetAdapter -Confirm:$false

# Or use Windows Firewall to block everything except forensic access.
# Windows Firewall has no rule priorities and block rules always override allow rules,
# so allow the forensic subnet, disable every other allow rule, then default to Block.
New-NetFirewallRule -DisplayName "IR-Allow-Forensic" -Direction Inbound -Action Allow -RemoteAddress 10.0.1.0/24 -Protocol TCP -LocalPort 5985
New-NetFirewallRule -DisplayName "IR-Allow-Forensic-Out" -Direction Outbound -Action Allow -RemoteAddress 10.0.1.0/24 -Protocol TCP -RemotePort 5985
Get-NetFirewallRule -Enabled True -Action Allow | Where-Object { $_.DisplayName -notlike "IR-*" } | Disable-NetFirewallRule
Set-NetFirewallProfile -All -Enabled True -DefaultInboundAction Block -DefaultOutboundAction Block
```
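Typing six rules by hand under pressure invites mistakes in the subnet. As a minimal sketch, assuming a hypothetical helper named `isolation_rules` (it is not part of iptables), the Linux ruleset above can be generated for whatever forensic CIDR you pass in. The helper only prints the commands, so they can be reviewed before anything changes on the host.

```bash
# isolation_rules is a hypothetical helper: it only PRINTS the iptables
# commands from the Linux example, parameterized by the forensic subnet.
isolation_rules() {
    mgmt_cidr=$1
    if [ -z "$mgmt_cidr" ]; then
        echo "usage: isolation_rules <forensic-cidr>" >&2
        return 1
    fi
    cat <<EOF
iptables -I INPUT 1 -i lo -j ACCEPT
iptables -I INPUT 2 -p tcp --dport 22 -s $mgmt_cidr -j ACCEPT
iptables -I INPUT 3 -j DROP
iptables -I OUTPUT 1 -o lo -j ACCEPT
iptables -I OUTPUT 2 -p tcp --sport 22 -d $mgmt_cidr -j ACCEPT
iptables -I OUTPUT 3 -j DROP
EOF
}
```

One possible workflow: snapshot the current ruleset with `iptables-save > pre-isolation.rules` so the change can be rolled back with `iptables-restore`, review the output of `isolation_rules 10.0.1.0/24`, then pipe it to `sh` as root.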

**Never unplug the network cable as your first action.** Pulling the cable throws away volatile network evidence (active connections and live C2 sessions) and cuts off the remote access you need to capture memory, which is the only place memory-resident malware exists. Use firewall rules or VLAN reassignment instead. The exception: if ransomware is actively encrypting and spreading laterally, speed trumps evidence preservation. Pull the cable, image the disk later.

### Account Disabling

If compromised credentials are identified, disable the accounts immediately:

**Active Directory (PowerShell):**

```powershell
# Disable the compromised account
Disable-ADAccount -Identity "compromised.user"

# Reset the password so the stolen credential stops working.
# Prompt for it rather than hard-coding a temporary password in a script.
Set-ADAccountPassword -Identity "compromised.user" -Reset -NewPassword (Read-Host -AsSecureString "New temporary password")

# Force logoff of all active sessions (requires PSExec or similar)
# Check for active sessions first (LogonType 10 = RemoteInteractive, i.e. RDP)
Get-CimInstance -ClassName Win32_LogonSession | Where-Object { $_.LogonType -eq 10 }
```

**Linux (disable user and kill sessions):**

```bash
# Lock the account password
usermod -L compromised_user

# Expire the account immediately (also blocks key-based logins)
chage -E 0 compromised_user

# Kill all active sessions for this user (SIGKILL cannot be trapped or ignored)
pkill -KILL -u compromised_user

# Revoke SSH keys: move the file to evidence storage instead of deleting it
mv /home/compromised_user/.ssh/authorized_keys /mnt/evidence/compromised_user.authorized_keys
```

### Blocking Malicious Infrastructure

Block known C2 IPs and domains at the network perimeter:

| Blocking Point | Method | Scope |
|---|---|---|
| Perimeter firewall | Add deny rule for C2 IP/CIDR | All outbound traffic |
| DNS resolver | Add sinkhole entry for C2 domain | All DNS queries |
| Web proxy | Block C2 URL pattern | All HTTP/HTTPS traffic |
| EDR/Wazuh | Active response rule to kill connections | Per-endpoint |
| Email gateway | Block sender domain/IP | All inbound email |

**Block at multiple layers.** An attacker with a compromised endpoint may have multiple C2 channels — one over HTTPS to an IP, another over DNS tunneling to a domain, and a fallback over email. Blocking only the IP lets the DNS tunnel continue operating. Defense in depth applies to containment too.
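To make the multi-layer point concrete, here is a minimal sketch that turns one indicator list into entries for two of the layers in the table: firewall deny rules for IPs and sinkhole lines for domains. The helper name `ioc_to_blocks`, the hosts-file style sinkhole format, and the choice of the `FORWARD` chain (which assumes a Linux-based perimeter firewall) are illustrative assumptions; your resolver, proxy, and firewall will have their own syntax. It handles IPv4 only.

```bash
# ioc_to_blocks (hypothetical helper) reads one indicator per line on stdin
# and prints a block entry for the matching layer. It changes nothing itself.
ioc_to_blocks() {
    while IFS= read -r ioc; do
        case $ioc in
            ''|'#'*)
                continue ;;   # skip blank lines and comments
            *[!0-9./]*)
                # contains something other than digits, dots, slash: treat as a domain
                printf 'dns-sinkhole: 0.0.0.0 %s\n' "$ioc" ;;
            *)
                # digits, dots, and slash only: IPv4 address or CIDR
                printf 'firewall: iptables -I FORWARD -d %s -j DROP\n' "$ioc" ;;
        esac
    done
}
```

Feeding it the incident's IOC file (for example `ioc_to_blocks < iocs.txt`) gives a reviewable list in which every indicator is covered at the layer where it applies, instead of only the first IP someone remembered to block.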

## Long-Term Containment

Short-term containment stops the immediate bleeding. Long-term containment allows continued operations while you prepare for eradication. This phase exists because **eradication takes time**: you cannot always wipe and rebuild every system immediately, especially if the attacker has compromised critical infrastructure.

Long-term containment strategies:

| Strategy | Purpose | Example |
|---|---|---|
| VLAN segmentation | Isolate compromised subnet while allowing limited business operations | Move affected servers to quarantine VLAN with restricted egress |
| Enhanced monitoring | Detect any new attacker activity in the "contained" environment | Deploy additional Suricata rules, increase Wazuh log verbosity |
| Temporary hardening | Close the vulnerability used for initial access | Apply emergency patch, disable exploited service, add WAF rules |
| Credential rotation | Invalidate potentially compromised credentials beyond the known accounts | Reset service accounts, rotate API keys, invalidate sessions |
| Backup verification | Confirm backups exist and are not compromised before you need them for recovery | Restore backup to isolated environment, scan for malware, verify data integrity |
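For the backup verification row, the integrity check can be sketched as follows, assuming a SHA-256 manifest was written when the backup was taken. The manifest location and the helper name `verify_backup` are hypothetical. Note what this does and does not show: matching hashes confirm the files are unchanged since the manifest was written, not that the backup predates the compromise, so the malware scan is still required.

```bash
# verify_backup (hypothetical helper): check every file listed in a
# sha256sum-format manifest. Paths inside the manifest are resolved
# relative to the directory that holds the manifest.
verify_backup() {
    manifest=$1
    if [ ! -r "$manifest" ]; then
        echo "manifest not readable: $manifest" >&2
        return 2
    fi
    if (cd "$(dirname "$manifest")" && sha256sum -c --quiet "$(basename "$manifest")"); then
        echo "BACKUP OK: all hashes match"
    else
        echo "BACKUP FAILED: do not restore from this backup" >&2
        return 1
    fi
}
```

Run it inside the isolated restore environment, for example `verify_backup /mnt/restore/MANIFEST.sha256`, and record the result in the incident log.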

**The attacker knows you know.** Once you start containment actions, assume the attacker may detect your response. They may accelerate their objectives (rush data exfiltration), deploy additional backdoors, or destroy evidence. This is why short-term containment must be fast and decisive — you are racing the attacker. If possible, coordinate containment actions to execute simultaneously across all affected systems rather than one at a time.
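One way to approximate simultaneous execution from a single responder workstation is to launch the per-host action as background jobs and wait for all of them. The sketch below assumes a hypothetical `contain_host` function (here an SSH call to an isolation script staged in advance; the account name and script path are placeholders) and reports which hosts failed, so nobody assumes a system is contained when it is not.

```bash
# contain_host is a placeholder for the real per-host action. The SSH user
# and script path are assumptions; swap in your own tooling.
contain_host() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "ir-admin@$1" 'sudo /usr/local/sbin/ir-isolate.sh'
}

# contain_all <log_dir> <host>...
# Runs contain_host on every host at once, then prints one OK/FAILED line
# per host in the order given. Returns 1 if any host failed.
contain_all() {
    log_dir=$1
    shift
    for host in "$@"; do
        (
            rc=0
            contain_host "$host" >"$log_dir/$host.log" 2>&1 || rc=$?
            echo "$rc" >"$log_dir/$host.rc"
        ) &
    done
    wait    # block until every background job has finished
    failed=0
    for host in "$@"; do
        if [ "$(cat "$log_dir/$host.rc" 2>/dev/null)" = "0" ]; then
            echo "OK: $host"
        else
            echo "FAILED: $host (see $log_dir/$host.log)"
            failed=1
        fi
    done
    return "$failed"
}
```

A call such as `contain_all /mnt/evidence/containment-logs linux-web-01 linux-web-02` keeps a per-host log for the incident record. Any `FAILED` line needs immediate manual follow-up, because a partially contained environment is exactly the situation the attacker can exploit.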

## The Containment Decision Matrix

Not every containment decision is straightforward. Isolating a developer laptop is easy. Isolating the primary database server during business hours requires balancing security against business continuity.

![Containment decision matrix: mapping threat severity against business impact to determine appropriate containment actions](https://cyberblue-academy-content.s3.us-east-2.amazonaws.com/courses/cyberbluesoc-academy/module-ir/lesson-ir-3/containment-decision-matrix.png)

| | Low Business Impact | Medium Business Impact | High Business Impact |
|---|---|---|---|
| **High Threat Severity** | Immediate full isolation | Immediate isolation with failover | Isolation with IC + business owner approval, failover mandatory |
| **Medium Threat Severity** | Full isolation within 1 hour | Partial isolation (restrict, don't disconnect) | Partial isolation + enhanced monitoring, escalate to IC |
| **Low Threat Severity** | Isolate at next maintenance window | Enhanced monitoring, schedule isolation | Enhanced monitoring only, document risk acceptance |

**Key principles:**

- **High threat + any impact** = containment wins. An attacker with domain admin access justifies taking down production systems.
- **Low threat + high impact** = monitoring may be acceptable temporarily, but document the risk acceptance decision with IC sign-off.
- **Never delay containment solely because of business impact without documented approval from the Incident Commander.** This decision must be in the incident log.

## Evidence Preservation During Containment

Evidence preservation and containment happen in parallel, not sequentially. You cannot wait to finish forensic imaging before isolating a system that is actively exfiltrating data. But you also cannot destroy critical evidence through careless containment actions.

### What to Preserve Before Remediation

| Evidence | Volatility | Collection Method | Priority |
|---|---|---|---|
| Memory (RAM) | Highest (lost on reboot) | `avml` (Linux), `winpmem` (Windows), `Velociraptor.Collectors.MemoryDump` | Collect first |
| Running processes + network connections | High (changes constantly) | `ps auxf`, `netstat -tulpn` (Linux); `Get-Process`, `Get-NetTCPConnection` (Windows) | Collect with memory |
| Disk image | Medium (persists but can be modified) | `dd` or `dc3dd` (Linux), FTK Imager (Windows) | Collect before eradication |
| Log files | Medium (may rotate or be cleared by attacker) | Export from Wazuh, copy /var/log/ (Linux), export Windows Event Logs | Export during containment |
| Network captures | Low (only exists if capture was running) | `tcpdump`, Suricata full packet capture, Arkime | Start capture during containment |

### The Golden Rule

**Image before you eradicate.** Once you delete malware, remove a web shell, or reset an account, you have destroyed the evidence of how the attacker operated. The forensic image preserves the compromised state for post-incident analysis, legal proceedings, and threat intelligence extraction.

```bash
# Linux: Create forensic disk image with verification
# (/mnt/evidence should be external evidence storage, not the disk being imaged)
dc3dd if=/dev/sda of=/mnt/evidence/EVD-001-linux-web-01.dd hash=sha256 log=/mnt/evidence/EVD-001.log

# Verify the hash matches the value recorded in the dc3dd log
sha256sum /mnt/evidence/EVD-001-linux-web-01.dd
```

```powershell
# Write evidence to external media (E: is assumed here), never to the disk being imaged

# Windows: Memory capture with winpmem
.\winpmem_mini_x64.exe E:\Evidence\EVD-002-memory.raw

# Disk image with FTK Imager CLI
ftkimager.exe \\.\PhysicalDrive0 E:\Evidence\EVD-003-disk --e01 --verify
```

## Eradication

Eradication removes the attacker's presence from your environment. This is not "delete the malware file and call it done." Eradication must be comprehensive: if you miss a single backdoor, the attacker returns.

### Eradication Checklist

| Action | Details | Verification |
|---|---|---|
| **Remove malware** | Delete all identified malicious files, scheduled tasks, services, registry keys | YARA scan confirms no remaining matches |
| **Close backdoors** | Remove web shells, reverse shell configurations, unauthorized SSH keys, rogue accounts | Velociraptor artifact collection confirms removal |
| **Reset credentials** | All accounts that may have been compromised, not just confirmed ones. Include service accounts, API keys, database credentials | Verify no active sessions using old credentials |
| **Patch vulnerabilities** | Apply patches for the exploited vulnerability (and any others discovered during investigation) | Vulnerability scan confirms remediation |
| **Update detection rules** | Add SIEM rules for the TTPs observed during the incident | Test rules against preserved evidence to confirm they would have caught the attack |
| **Review access controls** | Verify firewall rules, ACLs, and group policies have not been modified by the attacker | Compare current configs against known-good baselines |

**Credential resets must be broader than you think.** If an attacker had domain admin access, assume they extracted the NTDS.dit database (all Active Directory password hashes). Every account in the domain is potentially compromised. At minimum, reset all privileged accounts, service accounts, and the krbtgt account (twice, 12 hours apart to invalidate Golden Tickets). For Linux environments, rotate all SSH keys, service account passwords, and API tokens that the compromised system could access.

### Eradication on Linux

```bash
# Remove identified web shell
rm -f /var/www/html/.hidden/cmd.php

# Remove malicious cron job
crontab -l -u www-data | grep -v "evil-c2.example.com" | crontab -u www-data -

# Find authorized_keys files that contain the attacker's key
find /home /root -name "authorized_keys" -exec grep -l "attacker-key" {} \;
# Then remove the offending keys from each file

# Remove rogue systemd service
systemctl stop backdoor.service
systemctl disable backdoor.service
rm /etc/systemd/system/backdoor.service
systemctl daemon-reload

# Verify: Run YARA scan against the filesystem
# (yara takes a single target per invocation, so loop over the directories)
for dir in /var/www /tmp /home; do
    yara -r /opt/yara-rules/incident-ir2026-047.yar "$dir"
done
```

### Eradication on Windows

```powershell
# Remove malicious scheduled task
Unregister-ScheduledTask -TaskName "SystemUpdate" -Confirm:$false

# Remove rogue service
Stop-Service -Name "WindowsHelperSvc" -Force
sc.exe delete "WindowsHelperSvc"

# Remove malicious registry persistence
Remove-ItemProperty -Path "HKLM:\Software\Microsoft\Windows\CurrentVersion\Run" -Name "SystemHelper"

# Remove unauthorized local admin account
Remove-LocalUser -Name "helpdesk_backup"

# Verify: Check for remaining persistence mechanisms
Get-ScheduledTask | Where-Object { $_.State -ne "Disabled" } | Select-Object TaskName, TaskPath
Get-Service | Where-Object { $_.StartType -eq "Automatic" -and $_.Status -eq "Stopped" }
Get-ItemProperty -Path "HKLM:\Software\Microsoft\Windows\CurrentVersion\Run"
```

## Recovery

Recovery restores compromised systems to normal operation. The goal is not just "the system is running"; it is "the system is running, verified clean, and monitored for attacker return."

### Recovery Strategies

| Strategy | When to Use | Pros | Cons |
|---|---|---|---|
| **Rebuild from scratch** | System was deeply compromised, root-level access confirmed | Highest confidence in clean state | Time-consuming, requires configuration documentation |
| **Restore from backup** | Clean backup exists from before compromise, data loss is acceptable | Fast recovery, known-good state | Backup must be verified clean, data since backup is lost |
| **Clean and patch in place** | Low-severity compromise, limited attacker access, evidence shows no persistence | Minimal downtime | Risk of missed artifacts, requires thorough verification |

**Rebuild is almost always the right answer for P1/P2 incidents.** When an attacker has had root or domain admin access, you cannot be 100% certain that in-place cleaning removed everything. Rootkits, firmware implants, and deeply embedded backdoors may survive standard eradication. Rebuilding from a known-good image and restoring data from verified backups eliminates this uncertainty. Reserve "clean in place" for P3/P4 incidents where the attacker's access was limited and well-documented.

### Recovery Validation Checklist

Before returning a recovered system to production, verify every item:

![Recovery validation checklist: systematic verification that a recovered system is clean, patched, monitored, and approved for production](https://cyberblue-academy-content.s3.us-east-2.amazonaws.com/courses/cyberbluesoc-academy/module-ir/lesson-ir-3/recovery-validation-checklist.png)

| # | Check | Method | Pass Criteria |
|---|---|---|---|
| 1 | Operating system is clean install or verified backup | Image hash matches known-good, or fresh install from trusted media | Hash verified, documented |
| 2 | All patches applied (including the exploited vulnerability) | Vulnerability scan (Nessus, OpenVAS) | Zero critical/high findings |
| 3 | No unauthorized accounts or SSH keys | `cat /etc/passwd`, `Get-LocalUser`, check authorized_keys | All accounts match baseline |
| 4 | No unauthorized services, scheduled tasks, or cron jobs | `systemctl list-unit-files`, `Get-ScheduledTask`, `crontab -l` | All entries match baseline |
| 5 | No unauthorized startup entries or registry persistence | `Get-ItemProperty HKLM:\...\Run`, `ls /etc/init.d/` | All entries match baseline |
| 6 | Firewall rules match approved baseline | `iptables -L -n`, `Get-NetFirewallRule` | Rules reviewed and approved |
| 7 | YARA scan clean | Run incident-specific YARA rules against full filesystem | Zero matches |
| 8 | Wazuh agent reporting | Verify agent is connected and sending events to manager | Events visible in Wazuh dashboard |
| 9 | Enhanced monitoring configured | Additional Suricata rules, increased log verbosity for 30 days | Rules deployed and tested |
| 10 | Incident Commander sign-off | IC reviews checklist and approves return to production | Signed and documented |

## Communication During Active Incidents

Communication failures during active incidents cause as much damage as technical failures. Stakeholders left in the dark make bad decisions. Analysts working in silos duplicate effort or miss critical connections.

### Communication Cadence by Severity

| Severity | Internal Updates | Executive Updates | External Updates |
|---|---|---|---|
| **P1** | Every 30 minutes to IR channel | Every 1 hour to CISO/CEO | As required by Legal (regulatory, customer) |
| **P2** | Every 2 hours to IR channel | Every 4 hours to CISO | As determined by Communications Lead |
| **P3** | Daily summary to IR channel | Weekly summary to management | None unless escalated |
| **P4** | Case notes in TheHive | None | None |

### Status Update Template

Every status update during a P1/P2 should follow a consistent structure:

```
INCIDENT STATUS UPDATE: IR-2026-0047
Time: 2026-02-23 06:00 UTC | Update #4 | Severity: P2

CURRENT STATUS: Containment in progress

WHAT HAPPENED SINCE LAST UPDATE:
- Forensic image of linux-web-01 completed (EVD-001, SHA-256 verified)
- Web shell identified and removed: /var/www/html/.hidden/cmd.php
- Velociraptor hunt across all Linux servers: no additional compromises found

WHAT IS HAPPENING NOW:
- Credential reset for www-data and all web application service accounts
- Reviewing 48 hours of proxy logs for additional C2 communication
- Patching CVE-2026-1234 on all internet-facing web servers

NEXT UPDATE: 2026-02-23 08:00 UTC

DECISIONS NEEDED: None at this time
```
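The cadence table and the template can be tied together so the next-update time is computed rather than guessed. The following is a minimal sketch: `update_interval_minutes` and `status_header` are hypothetical helper names, the intervals are the internal-update values from the cadence table, and the time handling assumes a `date` command that accepts `-d` (GNU coreutils or BusyBox).

```bash
# Internal update interval in minutes, taken from the cadence table.
update_interval_minutes() {
    case $1 in
        P1) echo 30 ;;
        P2) echo 120 ;;
        P3) echo 1440 ;;   # daily summary
        *)  echo "no scheduled updates for severity '$1'" >&2
            return 1 ;;
    esac
}

# status_header <incident-id> <severity> <update-number> <"YYYY-MM-DD HH:MM" in UTC>
# Prints the header lines of the template plus the computed next-update time.
status_header() {
    minutes=$(update_interval_minutes "$2") || return 1
    start=$(date -u -d "$4" +%s) || return 1            # parse the timestamp as UTC
    next=$(date -u -d "@$((start + minutes * 60))" '+%Y-%m-%d %H:%M')
    printf 'INCIDENT STATUS UPDATE: %s\n' "$1"
    printf 'Time: %s UTC | Update #%s | Severity: %s\n' "$4" "$3" "$2"
    printf 'NEXT UPDATE: %s UTC\n' "$next"
}
```

Calling `status_header IR-2026-0047 P2 4 "2026-02-23 06:00"` yields a next-update time two hours later, matching the P2 row. The free-text sections still have to be written by the analyst.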

**Use a dedicated incident channel, not your general SOC channel.** Create a channel named after the incident ID (e.g., `#ir-2026-0047`) at incident declaration. Only people working the incident or needing updates should be in the channel. This prevents noise from routine SOC chatter and creates a searchable record of all communications.

## Monitoring for Attacker Return

Recovery is not the end. Attackers who are evicted often try to return, especially if the initial access vector was not fully remediated or if they established persistence that survived eradication.

**Post-recovery monitoring (minimum 30 days):**

- Increase Wazuh alerting sensitivity on recovered systems (lower thresholds)
- Deploy Suricata rules specifically targeting IOCs and TTPs from the incident
- Run Velociraptor hunts weekly on recovered systems checking for persistence artifacts
- Monitor for the same C2 infrastructure across the entire environment (not just the recovered system)
- Watch for new accounts, services, scheduled tasks, or SSH keys on recovered systems
- Monitor DNS logs for queries to domains associated with the attacker
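Watching for new accounts, services, or SSH keys is easiest when it is a mechanical comparison against the baseline captured at recovery sign-off. A minimal sketch, assuming the baseline is a plain text file with one entry per line; the helper name `new_since_baseline` and the file locations in the usage note are hypothetical:

```bash
# new_since_baseline <baseline-file> <current-file>
# Prints every line of the current file that is absent from the baseline,
# prefixed with "NEW: ". Returns 1 if drift was found, 0 if the files agree.
new_since_baseline() {
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file
    drift=$(grep -Fxv -f "$1" "$2" || true)
    if [ -z "$drift" ]; then
        return 0
    fi
    printf '%s\n' "$drift" | sed 's/^/NEW: /'
    return 1
}
```

For example, a daily job could run `cut -d: -f1 /etc/passwd > /tmp/users.now` followed by `new_since_baseline /var/lib/ir/users.baseline /tmp/users.now` and raise an alert on a non-zero exit. The same comparison works for lists of enabled services, cron entries, or key fingerprints.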

- Short-term containment stops the immediate threat (network isolation, account disabling, IP/domain blocking); execute within minutes
- Long-term containment allows continued operations during eradication (VLAN segmentation, enhanced monitoring, temporary hardening)
- The containment decision matrix balances threat severity against business impact: high threat always wins, low threat with high impact requires documented IC approval
- Evidence preservation happens in parallel with containment: capture memory first (highest volatility), then processes, disk, logs, and network captures
- Image before you eradicate: forensic images preserve the compromised state for analysis, legal, and intel
- Eradication must be comprehensive: remove malware, close backdoors, reset credentials (broader than you think), patch vulnerabilities, and update detection rules
- Credential resets after domain admin compromise must include krbtgt (twice), all privileged accounts, all service accounts, and SSH keys
- Recovery strategy selection depends on compromise severity: rebuild from scratch for P1/P2, restore from backup when verified clean, clean-in-place only for limited P3/P4 compromises
- Every recovered system must pass a validation checklist before returning to production, including YARA scan, baseline comparison, and IC sign-off
- Post-recovery monitoring (minimum 30 days) watches for attacker return: enhanced alerting, incident-specific Suricata rules, weekly Velociraptor hunts

## What's Next

You now know how to contain, eradicate, and recover from an active incident. In **Lab 14.3 — Incident Containment Simulation**, you'll put these skills into practice by working through a simulated incident where you must isolate a compromised system, preserve evidence, execute eradication steps, and validate recovery, all under time pressure.