- Explain why DNS is a goldmine for security analysts and attackers alike
- Identify DNS tunneling by recognizing unusually long subdomain queries
- Detect DGA (Domain Generation Algorithm) domains using entropy and length heuristics
- Understand DNS over HTTPS (DoH) and the visibility challenge it creates
- Use DNS query analysis in EveBox to identify suspicious domain activity

## DNS: The Protocol Every Attack Touches

If there's one protocol that touches nearly every cyberattack, it's DNS. Before malware can phone home to a C2 server, it needs to resolve the domain. Before stolen data can leave the network, it often gets encoded into DNS queries. Before a phishing page can load, the victim's browser queries DNS for the domain. DNS is the universal protocol: it's almost always allowed through firewalls, rarely inspected, and essential for virtually all network communication.

This makes DNS simultaneously the attacker's best friend and the defender's goldmine. Attackers abuse DNS because it's trusted and ubiquitous. Defenders monitor DNS because it reveals intent: the domains a host resolves tell you what it's trying to communicate with, even before a single byte of application data is exchanged.

In previous lessons, you learned how Suricata monitors network traffic and fires alerts based on rules (Lesson 3.2), and how protocol analysis reveals what's happening beneath the surface (Lesson 3.3). DNS analysis is a specific discipline within protocol analysis, and it deserves its own lesson because of how frequently DNS appears in attack chains and how rich the detection opportunities are.

---

## DNS Tunneling: Hiding Data in Queries

DNS tunneling is one of the most effective data exfiltration techniques because it abuses a protocol that's almost never blocked. The client encodes arbitrary data into DNS query labels (the subdomain portion) and sends those queries toward an attacker-controlled authoritative DNS server. That server decodes the data from the queries and can optionally return data in its DNS responses.

### How It Works

A normal DNS query looks like this:

```
Query: www.google.com → A record → 142.250.80.4
```

A DNS tunneling query looks like this:

```
Query: dXNlcm5hbWU9YWRtaW4mcGFzc3dvcmQ9.exfil.badactor.com → TXT record
```

The subdomain `dXNlcm5hbWU9YWRtaW4mcGFzc3dvcmQ9` is Base64-encoded data: `username=admin&password=`.
The attacker's DNS server for `badactor.com` receives this query, decodes the subdomain, and extracts the stolen credential. The response can also carry encoded data back to the client, creating a bidirectional channel.

![DNS Tunneling: Data hidden in subdomain queries](https://cyberblue-academy-content.s3.us-east-2.amazonaws.com/courses/cyberbluesoc-academy/module-03/lesson-3-4/dns-tunneling.png)

### Why It's Effective

- **DNS is almost never blocked**: blocking DNS breaks all internet connectivity
- **DNS is often unmonitored**: many organizations don't log or inspect DNS queries
- **DNS traverses firewalls**: even heavily restricted networks allow outbound DNS (UDP/53)
- **DNS responses can carry data back**: TXT records can hold up to 255 characters per string, multiple strings per response

### Detection Indicators

| Indicator | Normal DNS | DNS Tunneling |
|-----------|-----------|---------------|
| **Query length** | < 30 characters | > 60 characters (encoded data in subdomain) |
| **Subdomain entropy** | Low (readable words) | High (random-looking Base64/hex characters) |
| **Query volume** | Dozens of unique domains per hour | Hundreds to thousands of queries to ONE domain |
| **Record types** | Mostly A and AAAA | Heavy TXT, NULL, or CNAME usage |
| **Subdomain uniqueness** | Repeated queries to same subdomains | Every query has a unique subdomain (each carries different data) |
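Two of the indicators in the table, label length and the number of unique subdomains under one parent domain, are easy to sketch in code. The example below is a minimal illustration, not a production detector: the function names, the 100-name cut-off for "hundreds" of subdomains, and the naive two-label parent-domain logic are all assumptions made for this sketch.

```python
from collections import defaultdict

LONG_LABEL = 60          # label length treated as suspicious (from the table)
MANY_SUBDOMAINS = 100    # assumed cut-off for "hundreds" of unique names


def parent_domain(qname, levels=2):
    """Last `levels` labels of a name: 'a.b.example.com' -> 'example.com'.

    Deliberately naive: multi-part suffixes such as 'co.uk' are not handled.
    """
    labels = qname.rstrip(".").split(".")
    return ".".join(labels[-levels:]).lower()


def longest_label(qname):
    """Length of the longest dot-separated label in a query name."""
    return max(len(label) for label in qname.rstrip(".").split("."))


def tunneling_suspects(qnames):
    """Map each suspicious parent domain to its count of unique query names."""
    unique_names = defaultdict(set)
    has_long_label = set()
    for qname in qnames:
        parent = parent_domain(qname)
        unique_names[parent].add(qname)
        if longest_label(qname) > LONG_LABEL:
            has_long_label.add(parent)
    return {
        parent: len(names)
        for parent, names in unique_names.items()
        if parent in has_long_label or len(names) >= MANY_SUBDOMAINS
    }
```

Feeding it twenty repeats of `www.google.com` plus 150 distinct names under `exfil.badactor.com` would report only `badactor.com`. Real deployments would likely add entropy and record-type checks and use a public-suffix list instead of the two-label shortcut.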

**DNS tunneling is one of the most effective exfiltration methods** because DNS is almost never blocked. A skilled attacker can exfiltrate data at 10-50 KB/s through DNS alone — enough to steal a database of credentials overnight. Even at a conservative 10 KB/s, that's 864 MB per day flowing out through a protocol most organizations don't monitor. Tools like `iodine`, `dnscat2`, and `dns2tcp` make DNS tunneling trivially easy to set up.
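The daily-volume figure is simple arithmetic, and it can be useful to rerun it for other rates or time windows. A small sketch follows; the helper name is hypothetical, and it uses 1 MB = 1000 KB to match the figure quoted above.

```python
def exfil_volume_mb(rate_kb_per_s, hours=24.0):
    """Data moved at a steady rate, in MB (1 MB = 1000 KB, as in the text)."""
    return rate_kb_per_s * hours * 3600 / 1000


print(exfil_volume_mb(10))      # 864.0  (one full day at 10 KB/s)
print(exfil_volume_mb(50))      # 4320.0 (one full day at 50 KB/s)
print(exfil_volume_mb(10, 8))   # 288.0  (an eight-hour night at 10 KB/s)
```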

---

## DGA Domains: Machine-Generated Throwaway Names

Domain Generation Algorithms (DGAs) are used by malware to dynamically generate domain names for C2 communication. Instead of hardcoding a single C2 domain (which defenders can block), the malware generates hundreds or thousands of domains using a seed value (often based on the current date). The attacker registers just one or a few of these domains, and the malware cycles through the list until it finds one that resolves.

### How DGA Works

1. **Malware runs its algorithm**: input: current date + seed → output: list of domain names
2. **Malware queries each domain**: most return NXDOMAIN (not registered)
3. **One domain resolves**: the attacker pre-registered it → C2 connection established
4. **Defenders block that domain**: malware generates a new list tomorrow, attacker registers a different one

### Characteristics of DGA Domains

DGA domains have distinctive properties that make them detectable. The entropy values below are character-frequency Shannon entropy computed over the full domain string:

| Domain | Entropy | TLD | Length | Verdict |
|--------|---------|-----|--------|---------|
| `google.com` | 2.6 bits/char | .com | 10 | Legitimate: low entropy, recognizable |
| `amazon.co.uk` | 3.1 bits/char | .co.uk | 12 | Legitimate: low entropy, known brand |
| `xk7f9p2m.tk` | 3.3 bits/char | .tk | 11 | Likely DGA: close to the maximum entropy for its length, free TLD |
| `qwz8m4nt5v.pw` | 3.5 bits/char | .pw | 13 | Likely DGA: high entropy, abused TLD |
| `a8x2kd9fnw3.top` | 3.9 bits/char | .top | 15 | Likely DGA: high entropy, cheap TLD |
| `microsoft-update.com` | 3.8 bits/char | .com | 20 | Typosquat: entropy is inflated by length, but the name is readable and mimics a brand |

### Entropy as a Detection Tool

Shannon entropy measures the randomness of a string. Human-readable domain names tend to have low entropy because they reuse a small set of letters. Machine-generated names have high entropy because their characters are spread almost uniformly.

**Entropy calculation (simplified):** For each character, calculate how "surprising" it is given the character distribution.
A more uniform distribution means higher entropy.

- `google` → entropy ~1.9 bits/char (only four distinct characters, with 'g' and 'o' repeated)
- `xk7f9p` → entropy ~2.6 bits/char (six distinct characters mixing letters and digits, none repeated), which is the maximum possible for a six-character string
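The calculation can be made concrete with a few lines of Python. This is a minimal sketch of character-frequency Shannon entropy; the function name is an assumption, and scores depend on exactly what is measured (a single label or the full domain), so they will not necessarily match hand-estimated figures.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Character-frequency Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # Sum p * log2(1/p) over each distinct character's probability p.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


print(round(shannon_entropy("google"), 2))   # 1.92
print(round(shannon_entropy("xk7f9p"), 2))   # 2.58
```

Because the score for a string of n characters can never exceed log2(n), it is often more informative to compare a label's entropy with the maximum for its length than with a single fixed number.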

**A simple heuristic: if the domain looks like someone mashed the keyboard, it's probably DGA.** More formally, calculate Shannon entropy: anything above 3.5 bits/char warrants investigation. Because a string of n characters can score at most log2(n) bits/char, short labels can never reach that threshold, so judge them against the maximum for their length. Combine entropy with other signals, such as NXDOMAIN response rate (DGA malware generates many non-resolving queries), unusual TLDs, and first-seen timing, for higher-confidence detection.
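To make the generation steps under "How DGA Works" concrete, here is a toy generator. It is purely illustrative and does not reproduce any real malware family: the function name, the seed string, and the choice of SHA-256 are assumptions for this sketch. It shows the key property defenders care about, namely that the same date and seed always yield the same list while a new date yields a different one.

```python
import hashlib
from datetime import date


def toy_dga(day, seed, count=5, tld=".top", length=12):
    """Derive `count` pseudo-random domain names from a date and a seed.

    Labels are hex characters taken from a SHA-256 digest, so they only use
    0-9 and a-f; real DGAs use many different alphabets and schemes.
    """
    domains = []
    for index in range(count):
        material = f"{seed}-{day.isoformat()}-{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        domains.append(digest[:length] + tld)
    return domains


today_list = toy_dga(date(2024, 1, 1), "example-seed")
tomorrow_list = toy_dga(date(2024, 1, 2), "example-seed")
print(today_list)
print(today_list == toy_dga(date(2024, 1, 1), "example-seed"))  # True
print(today_list == tomorrow_list)                              # False
```

Both the malware and its operator can run the same function independently, which is why blocking one resolved domain does little once the date rolls over.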

### NXDOMAIN Spikes

One of the strongest DGA indicators is a sudden spike in NXDOMAIN responses (DNS response code indicating the domain doesn't exist). Normal hosts occasionally hit NXDOMAIN from typos or stale cache entries. A host generating 50+ NXDOMAIN responses per minute is likely running a DGA: it's cycling through generated domains, most of which the attacker hasn't registered.

---

## Suspicious TLDs and Newly Registered Domains

Not all top-level domains are created equal. Some TLDs have earned a reputation for being disproportionately used in malicious activity, primarily because they offer free or extremely cheap registration with minimal verification.

### High-Risk TLDs

| TLD | Registration Cost | Abuse Rate | Common Use |
|-----|------------------|------------|------------|
| `.tk` (Tokelau) | Free | Very High | Phishing, DGA, C2 |
| `.pw` (Palau) | ~$1 | High | Malware distribution, DGA |
| `.top` | ~$1 | High | Spam, phishing, DGA |
| `.buzz` | ~$2 | High | Spam, phishing campaigns |
| `.surf` | ~$3 | Moderate-High | C2 infrastructure |
| `.xyz` | ~$1 | Moderate | Mixed: some legitimate, high abuse |
| `.com` | ~$12 | Low | Dominant legitimate TLD (but still used in attacks) |
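The table translates directly into a lookup that can tag each queried name. The sketch below copies the ratings from the table; the function names and the choice to treat "Moderate-High" and above as high risk are assumptions for illustration, and a TLD match on its own is only a weak signal.

```python
# Abuse ratings copied from the High-Risk TLDs table in this lesson.
TLD_ABUSE_RATING = {
    "tk": "Very High",
    "pw": "High",
    "top": "High",
    "buzz": "High",
    "surf": "Moderate-High",
    "xyz": "Moderate",
    "com": "Low",
}
# Assumed cut-off: which ratings count as "high risk" for alerting purposes.
HIGH_RISK_RATINGS = {"Very High", "High", "Moderate-High"}


def tld_of(qname):
    """Return the last label of a query name, lowercased, without the dot."""
    return qname.rstrip(".").rsplit(".", 1)[-1].lower()


def abuse_rating(qname):
    """Look up the table rating for a name's TLD, or 'Unknown' if unlisted."""
    return TLD_ABUSE_RATING.get(tld_of(qname), "Unknown")


def is_high_risk(qname):
    """True when the name's TLD falls in the assumed high-risk ratings."""
    return abuse_rating(qname) in HIGH_RISK_RATINGS


print(abuse_rating("xk7f9p2m.tk"), is_high_risk("xk7f9p2m.tk"))        # Very High True
print(abuse_rating("www.google.com"), is_high_risk("www.google.com"))  # Low False
```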

**In the Operation Wire Tap scenario**, you will see CYBERBLUE DNS rules flagging queries to .tk, .pw, .top, .buzz, and .surf TLDs — each one connected to a different stage of the attack. The attacker uses disposable domains on cheap TLDs for C2 fallback, exfiltration endpoints, and tool staging. This mirrors real-world attack patterns where threat actors register domains across multiple abused TLDs for resilience.

### Newly Registered Domains (NRDs)

Domains registered less than 30 days ago are statistically more likely to be malicious. Legitimate organizations typically register domains well in advance of use. Attackers register domains immediately before campaigns and discard them after.

Detection approaches:

- **WHOIS lookups**: Check domain creation date
- **Threat intelligence feeds**: Many feeds include NRD lists updated daily
- **Suricata rules**: Can flag queries to domains matching known-bad patterns or TLDs
- **DNS response monitoring**: New domains often have minimal DNS infrastructure (single A record, no MX, no SPF)

---

## DNS over HTTPS (DoH): The Visibility Problem

Traditional DNS operates over plaintext UDP port 53 (falling back to TCP port 53 for large responses). Every DNS query and response is fully visible to any network device in the path: firewalls, NIDS sensors, DNS servers, even the ISP. This visibility is what makes DNS monitoring so powerful for defenders.

DNS over HTTPS (DoH) changes this equation fundamentally.

### How DoH Works

Instead of sending DNS queries as plaintext UDP packets to a DNS server on port 53, DoH wraps DNS queries inside HTTPS connections to web-based resolvers:

| Aspect | Traditional DNS | DNS over HTTPS |
|--------|----------------|---------------|
| **Transport** | UDP port 53 (plaintext) | HTTPS port 443 (encrypted) |
| **Visibility** | Fully visible to NIDS | Invisible: looks like any HTTPS traffic |
| **Resolver** | Network-configured (local DNS server) | Application-configured (1.1.1.1, 8.8.8.8) |
| **Inspection** | Full query/response visible | Only destination IP visible |
| **Blocking** | Easy: filter on port 53 | Difficult: same port as all web traffic |

### The Impact on Security Monitoring

When a host uses DoH, the NIDS loses all DNS visibility for that host. No query names, no response codes, no TXT record abuse, no DGA detection, no tunneling detection.
The DNS queries are encrypted inside HTTPS, so to the NIDS sensor they look identical to any other web traffic. This means:

- **DNS tunneling detection fails**: Tunneled queries go to the DoH resolver, not the monitored DNS server
- **DGA detection fails**: NXDOMAIN responses are inside the encrypted HTTPS stream
- **Suspicious TLD monitoring fails**: Query domains are not visible
- **DNS logging is incomplete**: Organizational DNS servers never see the queries

**DoH is a double-edged sword.** It protects user privacy by preventing ISPs, coffee shop WiFi operators, and authoritarian governments from monitoring browsing habits (good). But it also blinds corporate security monitoring by encrypting the DNS queries that SOC teams rely on for threat detection (bad for defenders). SOC teams must plan for reduced DNS visibility as DoH adoption increases.
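Even when query contents are hidden, the connection to the resolver itself remains visible as ordinary HTTPS traffic. The sketch below counts port-443 connections to the two public resolver addresses named in this lesson. It assumes flow records are available as dictionaries with Suricata-style `src_ip`, `dest_ip`, and `dest_port` keys; the function name and the short resolver list are placeholders to extend with your own data.

```python
from collections import Counter

# Resolver addresses named in this lesson; extend with your own list.
KNOWN_DOH_IPS = {"1.1.1.1", "8.8.8.8"}


def doh_connection_counts(flows, resolver_ips=KNOWN_DOH_IPS):
    """Count port-443 connections per source host to known DoH resolver IPs."""
    counts = Counter()
    for flow in flows:
        if flow.get("dest_port") == 443 and flow.get("dest_ip") in resolver_ips:
            counts[flow.get("src_ip")] += 1
    return dict(counts)


sample_flows = [
    {"src_ip": "10.0.0.7", "dest_ip": "1.1.1.1", "dest_port": 443},
    {"src_ip": "10.0.0.7", "dest_ip": "8.8.8.8", "dest_port": 443},
    {"src_ip": "10.0.0.9", "dest_ip": "8.8.8.8", "dest_port": 53},    # plain DNS
    {"src_ip": "10.0.0.9", "dest_ip": "93.184.216.34", "dest_port": 443},
]
print(doh_connection_counts(sample_flows))  # {'10.0.0.7': 2}
```

A hit identifies which hosts are probably using DoH, not what they resolved, so it works best as a trigger for endpoint-level follow-up.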

### Detection and Mitigation Strategies

Despite the challenges, defenders aren't completely helpless against DoH:

| Strategy | How It Works | Effectiveness |
|----------|-------------|---------------|
| **Block known DoH resolvers** | Firewall rules blocking IPs of 1.1.1.1, 8.8.8.8, and other known DoH providers | Medium: new resolvers appear frequently |
| **Force internal DNS** | Network policy requiring all DNS through corporate resolvers | High: if enforced at the network level |
| **Endpoint DNS logging** | Wazuh agent or EDR captures DNS queries at the OS level before they're encrypted | High: sees queries regardless of transport |
| **Monitor HTTPS to DoH IPs** | Alert on HTTPS connections to known DoH resolver IPs | Medium: identifies DoH usage but not query content |
| **Canary domains** | Internal DNS resolves canary domains; if a host doesn't query internal DNS for these, it's using external DNS | Creative: detects bypass without blocking |

The key insight: as DNS monitoring on the network becomes less reliable, **endpoint-level DNS logging** (via agents like Wazuh) becomes essential. The endpoint always sees the DNS query before it's encrypted, regardless of whether it goes over UDP/53 or HTTPS/443.

---

## DNS Analysis in EveBox

Suricata logs DNS events in eve.json with the `dns` event type, and EveBox makes this data searchable. Understanding what fields are available and what patterns to look for turns DNS logs into a powerful investigation tool.
### Key DNS Fields in EveBox

| Field | What It Contains | What to Look For |
|-------|-----------------|------------------|
| **Query name** (`dns.query.rrname`) | The domain being queried | Long subdomains (tunneling), random names (DGA), known-bad domains |
| **Query type** (`dns.query.rrtype`) | Record type: A, AAAA, TXT, MX, CNAME, NULL | TXT queries (tunneling), unusual types for the domain |
| **Response code** (`dns.rcode`) | NOERROR, NXDOMAIN, SERVFAIL | NXDOMAIN spikes (DGA), SERVFAIL (DNS infrastructure issues) |
| **Answer** (`dns.answer.rdata`) | The resolved IP or record data | Known-bad IPs, private IPs for public domains, short TTL values |

### Investigation Patterns

**Pattern 1: DNS Tunneling Hunt**

1. Filter: `event_type:dns` + sort by query name length
2. Look for: Queries with subdomain labels > 60 characters
3. Check: Do all long queries go to the same parent domain? (e.g., `*.exfil.badactor.com`)
4. Correlate: Volume. Hundreds of unique subdomains to one parent domain confirms tunneling

**Pattern 2: DGA Detection**

1. Filter: `event_type:dns` + response code NXDOMAIN
2. Look for: Single host generating many NXDOMAIN responses in a short window
3. Check: Do the queried domains look random? (high entropy, unusual TLDs)
4. Correlate: If one domain in the list eventually resolves, that's the active C2 domain

**Pattern 3: Suspicious Domain Investigation**

1. Start with an alert referencing a suspicious domain
2. Filter: `event_type:dns` + query name matching that domain
3. Check: When was it first queried? By how many hosts? What record type?
4. Correlate: Cross-reference with TLS logs (was there a TLS connection after resolution?) and flow logs (how long was the subsequent connection?)
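Pattern 2 can also be run offline against the raw eve.json file. The sketch below is one possible approach, not EveBox's own logic. It assumes each line is a JSON event with `event_type`, `timestamp`, and a `dns` object carrying `rcode`, and that the querying client is the destination of the answer packet (`dest_ip`). Field layout varies between Suricata versions and EVE DNS formats, so check a sample event first. The function names are hypothetical, and the default threshold reuses the 50-per-minute figure from the NXDOMAIN discussion.

```python
import json
from collections import Counter

NXDOMAIN_PER_MINUTE = 50  # threshold taken from the NXDOMAIN spike guidance


def nxdomain_per_minute(eve_lines, client_field="dest_ip"):
    """Count NXDOMAIN answers per (client, minute) from EVE JSON lines.

    Assumes answer events carry `dns.rcode` and that the client is the
    destination of the response packet; both depend on your Suricata setup.
    """
    counts = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        if not isinstance(event, dict) or event.get("event_type") != "dns":
            continue
        dns = event.get("dns")
        if not isinstance(dns, dict) or dns.get("rcode") != "NXDOMAIN":
            continue
        minute = str(event.get("timestamp", ""))[:16]  # e.g. '2024-01-01T12:00'
        counts[(event.get(client_field), minute)] += 1
    return counts


def dga_suspects(eve_lines, threshold=NXDOMAIN_PER_MINUTE):
    """Return the (client, minute) buckets at or above the threshold."""
    counts = nxdomain_per_minute(eve_lines)
    return {key: n for key, n in counts.items() if n >= threshold}
```

Passing an open file handle for eve.json to `dga_suspects` yields the hosts and minutes worth pivoting on in EveBox. From there, step 3 (do the names look random?) and step 4 (did one eventually resolve?) remain manual checks.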

**TXT record queries deserve special attention.** While TXT records have legitimate uses (SPF, DKIM, domain verification), they're also the preferred record type for DNS tunneling because they can carry the most data per response (up to 255 characters per string, multiple strings per record). A host making frequent TXT queries to an unusual domain is a strong tunneling indicator.

---

- DNS is involved in nearly every cyberattack (resolving C2 domains, exfiltrating data, generating throwaway domains), making it one of the richest data sources for security monitoring
- DNS tunneling encodes data in subdomain labels and sends it to attacker-controlled DNS servers; it is detected by query length (> 60 chars), high subdomain entropy, and high query volume to a single parent domain
- DGA malware generates random domain names algorithmically; it is detected by high Shannon entropy (> 3.5 bits/char), NXDOMAIN response spikes, and unusual TLDs (.tk, .pw, .top)
- Certain TLDs (.tk, .pw, .top, .buzz, .surf) are disproportionately used in attacks due to free or cheap registration with minimal verification
- DNS over HTTPS (DoH) encrypts DNS queries inside HTTPS, blinding NIDS completely; endpoint-level DNS logging becomes essential to maintain visibility
- Suricata logs DNS metadata (query name, type, response code, answer) as `dns` event types in eve.json, and EveBox makes this searchable for investigation
- TXT record queries, NXDOMAIN spikes, and first-seen domain timing are the three strongest DNS-based indicators of malicious activity

## What's Next

Time to hunt for DNS threats hands-on. In **Lab 6.3: Suspicious DNS**, you'll deep-dive into DNS-based threats using the Operation Wire Tap data. Investigate DNS tunneling, queries to known-malicious domains, and DNS-based C2 patterns, applying everything you just learned about DNS as an attack vector.