- Explain Velociraptor's server-client architecture and how agents communicate with the server
- Navigate the Velociraptor GUI: Dashboard, Search, Client Detail, Hunt Manager, and Notebooks
- Describe what an artifact is and how it differs from raw VQL
- Run artifact collections against Windows and Linux endpoints and review results
- Write basic VQL queries using core plugins (pslist, netstat, glob, read_file)
- Execute a structured first-investigation workflow from client search through process and network analysis

## What Is Velociraptor?

Velociraptor is an open-source endpoint detection, forensics, and response (EDR/DFIR) tool maintained by Rapid7. It gives SOC analysts and incident responders the ability to reach into any endpoint on the network (Windows, Linux, or macOS) and ask precise questions: *What processes are running right now? Which files changed in the last hour? What network connections are active? Is there a scheduled task that should not exist?* Where a SIEM like Wazuh collects logs that endpoints push to it, Velociraptor lets you *pull* data from endpoints on demand, at the exact moment you need it.

Velociraptor is written in Go and compiles to a single binary of roughly 20 MB. That binary serves as both the server and the client agent depending on its configuration. There are no heavy dependencies, no Java runtime, no database server to maintain. This simplicity is deliberate: incident responders need a tool they can deploy in minutes during an active breach, not one that requires a week of infrastructure planning.

![Velociraptor server-client architecture: agents on endpoints poll the server, analysts interact through the web GUI](https://cyberblue-academy-content.s3.us-east-2.amazonaws.com/courses/cyberbluesoc-academy/module-06/lesson-6-2/velociraptor-architecture.png)

The architecture follows a server-client model:

- **Server**: Runs the Velociraptor backend, stores collected data, hosts the web GUI. In the lab, the server runs as a container on port 8889.
- **Client agents**: Lightweight processes installed on each endpoint. Agents do not maintain a persistent connection; they poll the server at regular intervals (default: every 10 seconds) checking for new instructions. When the server has a collection request, the agent picks it up, executes the query locally, and sends the results back.
- **Web GUI**: The analyst's interface. You log in to the server through a browser and manage everything: search for clients, run collections, review results, write VQL queries, launch hunts across the fleet.

This polling model is important for understanding how Velociraptor operates. The agent is not streaming data continuously to the server. It sits quietly on the endpoint, consuming minimal resources, until the server gives it something to do. When you launch a collection, the agent picks up the task on its next poll, executes the query *locally on the endpoint*, and sends back only the results. Raw data never leaves the endpoint unless the query specifically requests it. This keeps bandwidth low and gives you surgical precision: you collect exactly what you need, nothing more.

Velociraptor supports all three major operating systems. On Windows, it can query the registry, WMI, ETW event tracing, NTFS metadata (MFT, USN journal), prefetch files, and Windows Event Logs. On Linux, it queries `/proc`, `/sys`, crontabs, systemd services, audit logs, and file system metadata. On macOS, it covers launchd, plist files, Unified Logs, and the TCC privacy database. The same VQL query language works across all platforms, though many artifacts are OS-specific because the underlying data sources differ.
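As a small illustration of the cross-platform point, the query below uses the `info()` plugin, which reports basic host details on Windows, Linux, and macOS alike. It is a sketch only: VQL syntax is covered later in this lesson, and the column selection here is just one reasonable choice.

```sql
-- Runs unchanged on any supported OS; the values differ, the query does not.
SELECT Hostname, OS, Architecture, Fqdn
FROM info()
```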

**Artifacts are just VQL queries wrapped in YAML metadata.** When you run the `Windows.System.Pslist` artifact, Velociraptor is executing a VQL query that reads the process list from the Windows API. The artifact format adds a name, description, parameters, and column definitions around the raw query, making it reusable, shareable, and self-documenting. You can read the VQL inside any artifact by clicking "View Artifact" in the GUI.
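To make the wrapping concrete, here is a minimal sketch of a custom artifact. The artifact name, description, and `PathRegex` parameter are hypothetical and chosen only for illustration; built-in artifacts are usually longer, but they follow the same overall shape.

```yaml
# Hypothetical artifact: the name and parameter are illustrative, not built-in.
name: Custom.Linux.ProcessesFromTmp
description: |
  List processes whose executable path matches a regular expression.
type: CLIENT
parameters:
  - name: PathRegex
    default: "^/tmp/"
    description: Regular expression applied to the executable path.
sources:
  - precondition: SELECT OS FROM info() WHERE OS = 'linux'
    query: |
      SELECT Pid, Ppid, Name, CommandLine, Exe
      FROM pslist()
      WHERE Exe =~ PathRegex
```

Everything outside the `query` block is metadata. The `query` block itself is ordinary VQL that refers to the parameter by name.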

## The Velociraptor GUI

The web GUI is where you will spend most of your time as an analyst. It is a single-page application that communicates with the server backend via API calls. Everything you can do in the GUI, you can also do via VQL or the API, but the GUI is the fastest way to get oriented.

![Velociraptor GUI overview: Dashboard, Search, Client Detail, Hunt Manager, Server Artifacts, and Notebooks](https://cyberblue-academy-content.s3.us-east-2.amazonaws.com/courses/cyberbluesoc-academy/module-06/lesson-6-2/velociraptor-gui-overview.png)

### Dashboard

The landing page after login. The dashboard shows a summary of connected clients: total enrolled agents, how many are currently online (last check-in within the polling interval), and recent hunt activity. In the lab environment, you will see one or two clients: the Linux endpoint and optionally a Windows endpoint, depending on the lab profile. In a production environment, this page might show thousands of enrolled agents across multiple operating systems.

### Search

The search bar at the top of the GUI lets you find clients by hostname, IP address, client ID, or label. Type a partial hostname and Velociraptor returns matching clients. In an investigation, this is your starting point: you get the hostname from your SIEM alert, search for it in Velociraptor, and click through to the client detail page. You can also apply labels to clients (e.g., "compromised", "needs-review", "critical-server") and search by label later.

### Client Detail

When you click a client, you enter the client detail view. This is a multi-tabbed interface with everything you need to investigate that specific endpoint:

- **Overview**: Hostname, OS version, IP addresses, agent version, last check-in time, labels. A quick snapshot that tells you what the machine is and whether the agent is responsive.
- **VQL Drilldown**: Pre-configured queries that show common system information such as installed software, OS details, and hardware specs. This runs automatically when you open the client.
- **Collected Artifacts**: A list of every artifact collection that has been run against this client. Each collection shows the artifact name, when it was launched, its status (running, completed, error), and a link to view the results. This is the audit trail: you can see what other analysts have already collected.
- **Events**: Event monitoring artifacts that run continuously on the client, watching for specific activities in real time (file system changes, process creation, network connections). Event artifacts are the "always-on" component. They run persistently, unlike standard collections, which are one-shot queries.

### Hunt Manager

Hunts let you run the same artifact across *all* enrolled clients (or a filtered subset) simultaneously. Instead of collecting process lists one client at a time, a hunt runs `Windows.System.Pslist` across every Windows machine in your environment and aggregates the results in one view.

This is how you go from "one alert on one machine" to "is this happening anywhere else?" You find a suspicious scheduled task on `WIN-DC-01`, then launch a hunt running `Windows.System.TaskScheduler` across all Windows clients. If the same task appears on five other machines, you have strong evidence that the compromise extends beyond the original host, whether through lateral movement or a shared delivery mechanism.

### Server Artifacts

Server-side artifacts run on the Velociraptor server itself, not on endpoints. These handle housekeeping: importing third-party artifact packs, configuring event forwarding to your SIEM, managing user accounts, and running server-level VQL queries that aggregate data across clients.

### Notebooks

Notebooks are the VQL playground. Each notebook is a collection of cells where you can write and execute VQL queries interactively, see results in real time, add markdown notes, and share the notebook with your team. Think of it as a Jupyter notebook for endpoint forensics.

**Use the Notebook for ad-hoc VQL queries.** When you need to test a VQL statement, filter collected data in a new way, or combine results from multiple collections, open a notebook instead of creating a formal artifact. Notebooks are disposable scratchpads — perfect for exploration and one-off investigations. Once you have refined your query, you can promote it into a reusable artifact.
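For example, a notebook cell can re-filter the results of a collection that has already finished, without touching the endpoint again. The sketch below assumes a completed `Linux.Sys.Pslist` collection whose results include an `Exe` column; the client ID and flow ID are placeholders to be replaced with the values shown on the Collected Artifacts tab.

```sql
-- Read rows from an existing collection instead of re-querying the endpoint.
-- 'C.1234567890abcdef' and 'F.ABCDEFGH' are placeholder IDs, not real values.
SELECT *
FROM source(client_id='C.1234567890abcdef',
            flow_id='F.ABCDEFGH',
            artifact='Linux.Sys.Pslist')
WHERE Exe =~ '^/(tmp|dev/shm)/'
```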

## Artifacts: Asking the Endpoint a Question

An artifact is the fundamental unit of work in Velociraptor. Every time you collect data from an endpoint, you are running an artifact. Every artifact is a VQL query packaged with YAML metadata: a name, description, author, parameter definitions, and column descriptions. The packaging makes the query reusable, self-documenting, and shareable across teams and organizations.

Velociraptor ships with hundreds of built-in artifacts organized by operating system and function. Here are the ones you will use most frequently as a SOC analyst:

### Windows Artifacts

| Artifact | Purpose |
|---|---|
| `Windows.System.Pslist` | List all running processes with PID, PPID, command line, user, memory usage |
| `Windows.System.TaskScheduler` | Enumerate all scheduled tasks, a top persistence mechanism |
| `Windows.Sys.StartupItems` | List startup programs from registry Run keys, Startup folders, and services |
| `Windows.Network.Netstat` | Active network connections with owning process; maps PIDs to connections |

### Linux Artifacts

| Artifact | Purpose |
|---|---|
| `Linux.Sys.Pslist` | List all running processes from `/proc`: PID, PPID, command line, user |
| `Linux.Sys.Crontab` | Enumerate all cron jobs (scheduled tasks on Linux), a common persistence vector |
| `Linux.Sys.Services` | List systemd services: enabled, disabled, running, failed |
| `Linux.Network.Netstat` | Active network connections parsed from `/proc/net/tcp` and `/proc/net/udp` |

### Generic Artifacts (Cross-Platform)

| Artifact | Purpose |
|---|---|
| `Generic.Client.Info` | Basic system info: OS, hostname, IPs, agent version. Works on all platforms |
| `Generic.Network.Netstat` | Cross-platform network connections; uses the OS-appropriate method under the hood |

### Running a Collection

The workflow for collecting data from a single client:

1. **Search** for the client by hostname or IP
2. **Click** the client to open the detail view
3. **Click "New Collection"** (the + button on the Collected Artifacts tab)
4. **Select the artifact**: type the name in the search box (e.g., "Pslist") and pick the one for the correct OS
5. **Configure parameters**: most artifacts have sensible defaults, but you can filter by PID, process name, path, or other criteria
6. **Launch** the collection
7. **Review results**: the collection appears in the Collected Artifacts list with a status indicator. Click it to see the output table. Results are displayed as structured rows and columns that you can sort, filter, and export

The collection executes on the endpoint: the agent picks up the request, runs the VQL query against local system data, and streams the results back to the server. For most artifacts, results appear within seconds. Heavier collections (full disk scans, memory analysis) can take minutes.
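The same collection can also be scheduled from VQL, for example in a notebook running on the server. This is a sketch that assumes the `collect_client()` function available in current releases; the client ID is a placeholder, and the GUI workflow above remains the simpler route while you are learning.

```sql
-- Schedule Generic.Client.Info on one client.
-- 'C.1234567890abcdef' is a placeholder client ID.
SELECT collect_client(client_id='C.1234567890abcdef',
                      artifacts=['Generic.Client.Info']) AS Collection
FROM scope()
```

The resulting collection should then appear in that client's Collected Artifacts list like any collection started from the GUI.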

**Large collections can impact endpoint performance.** Artifacts that scan the entire filesystem (`glob()` with broad patterns), parse large log files, or process NTFS metadata (MFT parsing) consume CPU and I/O on the endpoint. In production, always consider the performance impact before launching heavy collections against critical servers during business hours. Use parameters to narrow the scope — filter by path, time range, or file extension to reduce the workload.
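As a sketch of what narrowing the scope can look like, the query below restricts a file search by directory and extension in the glob pattern, by time in the `WHERE` clause, and by row count with `LIMIT`. The `/var/www` path and `*.php` extension are illustrative assumptions; substitute whatever location is under investigation.

```sql
-- Narrow by directory and extension in the glob, then by time in WHERE.
-- '/var/www' and '*.php' are illustrative values only.
SELECT FullPath, Size, Mtime
FROM glob(globs='/var/www/**/*.php')
WHERE Mtime > now() - 3600
LIMIT 500
```

The glob pattern does most of the work here, because files outside it are never visited. The `WHERE` clause only filters rows the plugin has already produced.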

## VQL: Velociraptor Query Language

VQL is Velociraptor's query language. If you know SQL, VQL will feel familiar: it uses a `SELECT ... FROM ... WHERE` structure. But where SQL queries database tables, VQL queries *plugins*, which are functions that pull data from the operating system, parse files, make network requests, or transform data.

### Basic Syntax

```sql
SELECT column1, column2
FROM plugin(arg1=value1, arg2=value2)
WHERE condition
ORDER BY column1
LIMIT 100
```

### Key Plugins

| Plugin | Purpose | Platform |
|---|---|---|
| `pslist()` | Running processes | Windows, Linux, macOS |
| `netstat()` | Network connections | Windows, Linux, macOS |
| `glob()` | File search by path pattern | All |
| `read_file()` | Read file contents | All |
| `parse_json()` | Parse JSON structured data | All |
| `stat()` | File metadata (size, timestamps, permissions) | All |
| `environ()` | Environment variables | All |

### Practical VQL Examples

**List all running processes (Linux or Windows):**

```sql
SELECT Pid, Ppid, Name, CommandLine, Username
FROM pslist()
```

**Find processes running from /tmp (Linux, a common malware staging location):**

```sql
SELECT Pid, Name, CommandLine, Exe
FROM pslist()
WHERE Exe =~ '/tmp/'
```

**Find processes running from unusual Windows paths:**

```sql
SELECT Pid, Name, CommandLine, Exe
FROM pslist()
WHERE Exe =~ '(Temp|AppData|Downloads|ProgramData)'
  AND NOT Exe =~ 'Microsoft'
```

**Find all listening TCP ports:**

`netstat()` reports the owning PID but not the process name, so this query returns the PID only. Match it against the process list to identify the program.

```sql
SELECT Pid, Laddr, Status
FROM netstat()
WHERE Status = 'LISTEN'
```

**Search for recently modified files in a directory (last 24 hours):**

```sql
SELECT FullPath, Size, Mtime
FROM glob(globs='/var/log/**')
WHERE Mtime > now() - 86400
ORDER BY Mtime DESC
```

**Find files matching a suspicious pattern on Windows:**

```sql
SELECT FullPath, Size, Mtime
FROM glob(globs='C:/Users/*/AppData/Local/Temp/*.exe')
```

VQL is the power behind every artifact. When the built-in artifacts do not answer your specific question, you write VQL directly in a notebook. Need to find all processes that have network connections to external IPs? Write a VQL query that joins `pslist()` with `netstat()`. Need to search every user's home directory for files created in the last hour? Write a `glob()` query with a time filter. VQL gives you the flexibility to ask any question the operating system can answer.

## Your First Investigation Workflow

Let us walk through a structured endpoint investigation. This is the same workflow you will follow in the lab and the same workflow you would use in production when a SIEM alert points you to a specific host.

**Step 1: Search for the client.** You receive a Wazuh alert about suspicious activity on a host. Open Velociraptor's GUI and type the hostname in the search bar. Click the matching client to enter the detail view.

**Step 2: Run Generic.Client.Info.** This is always your first collection. It confirms basic system details: operating system, hostname, IP addresses, agent version, uptime. This tells you what you are dealing with. Is this a Windows server? A Linux web host? An employee laptop? The answer determines which artifacts you will run next.

**Step 3: Review system details.** Check the OS version, installed patches (if visible), IP configuration, and agent health. Note the last check-in time: if the agent last checked in hours ago, the machine might be powered off or disconnected, or the agent might have been killed by an attacker.

**Step 4: Run the process listing artifact.** On Linux, run `Linux.Sys.Pslist`. On Windows, run `Windows.System.Pslist`. This gives you a complete snapshot of every process running at the moment of collection: PIDs, parent PIDs, command lines, users, executable paths, and memory usage.

**Step 5: Review the process list.** Scan for anomalies:

- Processes running from unusual paths (`/tmp`, `/dev/shm`, `C:\Users\*\AppData\Local\Temp`)
- Processes with obfuscated or encoded command lines (base64 strings, random-looking names)
- Processes running as root/SYSTEM that should not be (a web shell process spawned by the web server)
- Child processes of unexpected parents (cmd.exe spawned by a Word process = macro execution)
- Processes with high memory or CPU that do not match normal workload

**Step 6: Run network connections.** Use `Linux.Network.Netstat` or `Windows.Network.Netstat`. This shows every active TCP/UDP connection, including the owning PID.

**Step 7: Correlate processes with connections.** This is where the investigation starts to tell a story. Match the PIDs from the process list with the PIDs in the network connections. A process named `svchost.exe` with an outbound connection to an unexpected external IP address deserves a closer look. A process named `/tmp/.cache` with a LISTEN port is almost certainly malicious. The combination of *what the process is* and *what it is connected to* is the core of endpoint-based threat detection.

**Step 8: Deepen as needed.** Based on what you find, run additional artifacts:

- Suspicious scheduled task? Run `Windows.System.TaskScheduler` or `Linux.Sys.Crontab`
- Unknown binary in /tmp? Use `glob()` to check creation time, then `read_file()` or hash the file
- Persistence mechanism? Run `Windows.Sys.StartupItems` or check `Linux.Sys.Services`

Each artifact answers a question, and each answer generates the next question. That iterative loop of collect, analyze, pivot is what makes Velociraptor a forensic investigation tool rather than just a data collector.
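The correlation in Step 7 can also be done in a single notebook query rather than by eye. The following is a minimal sketch, assuming `pslist()` accepts a `pid` argument and that connection states appear as `LISTEN` or begin with `ESTAB`; state names can vary by platform, so adjust the regular expression to what your endpoint reports.

```sql
-- For each connection, look up the owning process by PID.
SELECT Pid, Laddr, Raddr, Status,
       { SELECT Name, Exe, CommandLine FROM pslist(pid=Pid) } AS Process
FROM netstat()
WHERE Status =~ 'LISTEN|ESTAB'
```

Each row pairs one connection with the details of the process that owns it, which makes entries such as a listener whose executable lives under `/tmp` stand out quickly.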

**Lab credentials: admin / cyberblue.** The Velociraptor GUI in the lab environment uses these credentials. This is a pre-configured lab instance — never use default or weak credentials in a production deployment. Production Velociraptor deployments should use certificate-based authentication with unique client certificates per endpoint and multi-factor authentication for GUI access.

- Velociraptor is an open-source EDR/DFIR tool (Rapid7): a single Go binary (~20 MB) that serves as both server and client agent across Windows, Linux, and macOS
- Architecture is server-client with polling: agents check the server every ~10 seconds for tasks, execute queries locally, and send back only the results, which keeps bandwidth low and gives analysts surgical precision
- The GUI has six key areas: Dashboard (fleet overview), Search (find clients), Client Detail (per-host deep dive), Hunt Manager (fleet-wide collections), Server Artifacts (server-side ops), and Notebooks (VQL playground)
- Artifacts are VQL queries wrapped in YAML metadata: reusable, self-documenting, and categorized by OS (Windows.System.Pslist, Linux.Sys.Crontab, Generic.Client.Info, etc.)
- VQL uses familiar `SELECT FROM WHERE` syntax with plugins instead of tables: `pslist()` for processes, `netstat()` for connections, `glob()` for file search, `read_file()` for file contents
- The investigation workflow is iterative: Search client → Generic.Client.Info → Process list → Network connections → Correlate PIDs → Deepen with targeted artifacts
- Hunts extend single-client collections to the entire fleet, which is essential for answering "is this happening on other machines?" during incident response

## What's Next

Time to use Velociraptor on a real investigation. In **Lab 8.1: Endpoint Collection**, you'll receive a Wazuh alert about a suspicious host, pivot to Velociraptor, and collect endpoint artifacts to uncover hidden threats that the SIEM couldn't see. This is where you put the interface, artifacts, and VQL skills you just learned into practice.