What is a SIEM and what does it do?

A SIEM (Security Information and Event Management) system collects logs from across your infrastructure, normalises them into a common format, stores them for compliance and forensic purposes, applies correlation rules to detect attack patterns, and generates alerts for the security operations team.

What is the difference between a SIEM and a log aggregation tool like the ELK stack?

A SIEM is purpose-built for security use cases with prebuilt detection rules, threat intelligence feeds, and compliance reporting. A log aggregation tool like ELK collects and searches logs but requires you to build all detection logic yourself. SIEM products trade flexibility for security-specific features and out-of-the-box detections.

How do you decide what to log and forward to a SIEM?

Log security-relevant events: authentication, authorisation, privileged actions, data access, and configuration changes. Avoid logging personal data that is not necessary for security investigation. Apply structured logging (JSON) so the SIEM can parse fields without brittle regex rules. Work backwards from the alerts you want to trigger.
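
As a minimal sketch of the structured-logging advice above, the following uses Python's standard logging module to emit one JSON object per line. The `JsonFormatter` class, the `security` attribute, and the field names (`user_id`, `source_ip`) are illustrative assumptions, not a schema that any particular SIEM requires.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"security": {...}} become record.security.
        payload.update(getattr(record, "security", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str = "auth") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr (hypothetical helper)."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as `build_logger().info("login_failed", extra={"security": {"user_id": "u-123", "source_ip": "203.0.113.7"}})` would then produce a line whose fields can be read directly, without regex parsing.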

SIEM and Log Management for Security

In 2026, attackers are masters of "Living off the Land." Rather than deploying noisy malware, they use your own administrative tools to move through your network. To catch them, you can't just look for "Bad Files"; you must look for Bad Behavior. This is where SIEM (Security Information and Event Management) becomes your most vital defensive tool.

This 1,500+ word guide explores how to architect a high-scale logging pipeline that transforms raw data into actionable security intelligence.

Hardware-Mirror: The Physics of the "Everything Log"

To a developer, a log is a console.log() statement. To the hardware, a log is a Sequential Disk Write followed by a Random Search Indexing operation.

The I/O Storm and SSD Wear

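The Physics: Every time a user logs in, your server writes a few hundred bytes to an SSD. At 10,000 logins per second that is only a few megabytes per second of raw data, but if every event is written and synced individually it becomes at least 10,000 tiny writes per second: an "I/O Storm."
SSD Burnout: Small, unbatched writes are amplified inside the drive, because flash is programmed in pages and erased in blocks that are much larger than a single log line. Sustained over time, this write amplification can wear out NAND cells far sooner than the raw data volume suggests.
The Solution: Use Log Buffering (like Redis or Kafka). Accumulate logs in memory first, then flush them to disk in large, sequential blocks.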
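
The buffering idea can be sketched in a few lines. This is an in-process illustration only, not how Redis or Kafka work internally; the `BatchedLogWriter` class and its `max_lines` parameter are hypothetical names. Note the trade-off: anything still held in memory is lost if the process crashes before a flush.

```python
from typing import List, TextIO


class BatchedLogWriter:
    """Buffer log lines in memory and write them out in large blocks."""

    def __init__(self, sink: TextIO, max_lines: int = 1000) -> None:
        self.sink = sink
        self.max_lines = max_lines
        self.pending: List[str] = []
        self.flush_count = 0  # number of block writes issued so far

    def write(self, line: str) -> None:
        """Queue one line; flush automatically once the batch is full."""
        self.pending.append(line.rstrip("\n"))
        if len(self.pending) >= self.max_lines:
            self.flush()

    def flush(self) -> None:
        """Write all pending lines to the sink as a single block."""
        if not self.pending:
            return
        self.sink.write("\n".join(self.pending) + "\n")
        self.sink.flush()
        self.pending.clear()
        self.flush_count += 1
```

With `max_lines=1000`, ten thousand events result in ten block writes instead of ten thousand individual ones. A production shipper would also flush on a timer so that a quiet system does not hold events indefinitely.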

CPU-bound Correlation

Real-time correlation (connecting Event A to Event B) is a CPU-intensive process.

The Math: The SIEM must maintain a "State Window" in RAM for every active user.
The Limit: If you have too many correlation rules, your SIEM's CPU will hit 100%, and "Event Latency" will rise. An alert that fires 1 hour late is useless during an active breach.
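
To make the per-user "State Window" concrete, here is a minimal sketch of a sliding-window counter for failed logins. The class name, the 60-second window, and the threshold of 5 are assumptions chosen for illustration, and the code assumes events arrive in timestamp order. Each additional rule of this kind adds memory and CPU work for every event, which is where the limit described above comes from.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class FailedLoginWindow:
    """Track recent failed logins per user inside a sliding time window."""

    def __init__(self, window_seconds: float = 60.0, threshold: int = 5) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.events: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, user: str, timestamp: float) -> bool:
        """Record a failed login; return True if the threshold is reached."""
        window = self.events[user]
        window.append(timestamp)
        # Drop timestamps that have aged out of the window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold

    def evict_idle(self, now: float) -> None:
        """Free state for users with no events inside the window."""
        idle = [
            user for user, window in self.events.items()
            if not window or now - window[-1] > self.window_seconds
        ]
        for user in idle:
            del self.events[user]
```

Calling `evict_idle` periodically keeps the state proportional to the number of recently active users rather than every user ever seen.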

1. The Log Hierarchy: What to Collect?

Not all logs are created equal. To avoid drowning in data, you must prioritize.

Critical Logs: Authentication (Login/Logout), IAM changes, Firewall blocks, Database access.
Application Logs: Error rates, suspicious URL parameters (potential XSS/SQLi).
System Logs: Kernel events, process execution (via auditd/eBPF).

2. What is a SIEM?

A SIEM is a centralized platform that performs three core functions:

Aggregation: Collecting logs from thousands of different sources: servers, network devices, cloud services, and applications.
Normalization: Turning different log formats (Linux syslog, AWS CloudTrail, Nginx) into a single common schema.
Correlation: Connecting the dots between seemingly unrelated events.
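
Normalization can be illustrated with two small mappers that turn an Nginx access-log line and a CloudTrail record into the same dictionary shape. The target field names (`src_ip`, `action`, `outcome`) are a hypothetical schema invented for this sketch, the regex handles only the default "combined" layout, and timestamps are passed through as strings where a real pipeline would convert them to one format.

```python
import re
from typing import Dict, Optional

# Simplified pattern for an nginx access-log line in the "combined" layout.
NGINX_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3})'
)


def normalize_nginx(line: str) -> Optional[Dict[str, str]]:
    """Map one nginx access-log line to the common schema, or None if unparseable."""
    match = NGINX_PATTERN.match(line)
    if match is None:
        return None
    return {
        "source": "nginx",
        "src_ip": match["ip"],
        "action": f'{match["method"]} {match["path"]}',
        "outcome": "success" if match["status"].startswith(("2", "3")) else "failure",
        "timestamp": match["time"],
    }


def normalize_cloudtrail(record: Dict[str, str]) -> Dict[str, str]:
    """Map one CloudTrail record (already parsed from JSON) to the common schema."""
    return {
        "source": "cloudtrail",
        "src_ip": record.get("sourceIPAddress", ""),
        "action": record.get("eventName", ""),
        # CloudTrail includes an errorCode field when the call failed.
        "outcome": "failure" if "errorCode" in record else "success",
        "timestamp": record.get("eventTime", ""),
    }
```

Once both sources share the same keys, a single correlation rule can be written against `src_ip` and `outcome` without caring where the event came from.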

The Magic of Correlation:

Event 1: Failed login on Server A.
Event 2: New user created on Server B.
Event 3: 5GB of data sent to a foreign IP from Server B.
SIEM Result: Triggers a Critical Priority 1 Alert because these events together indicate a successful lateral movement and data exfiltration.
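
The three-event pattern above can be expressed as a simple rule. This sketch works on a small batch of already-normalized events rather than a live stream, and the `Event` fields, the one-hour window, and the 5 GiB threshold are assumptions made for illustration; real SIEM rule languages express the same idea declaratively.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    ts: float           # seconds since epoch
    kind: str           # "login_failed", "user_created", or "outbound_transfer"
    host: str
    bytes_out: int = 0


def correlate(events: Iterable[Event], window: float = 3600.0,
              min_bytes: int = 5 * 1024**3) -> Optional[List[Event]]:
    """Return the first failed-login -> user-created -> large-transfer chain, or None."""
    ordered = sorted(events, key=lambda e: e.ts)
    for i, first in enumerate(ordered):
        if first.kind != "login_failed":
            continue
        for j in range(i + 1, len(ordered)):
            second = ordered[j]
            if second.ts - first.ts > window:
                break
            if second.kind != "user_created":
                continue
            for third in ordered[j + 1:]:
                if third.ts - first.ts > window:
                    break
                # The transfer must leave the same host where the user was created.
                if (third.kind == "outbound_transfer"
                        and third.host == second.host
                        and third.bytes_out >= min_bytes):
                    return [first, second, third]
    return None
```

None of the three events is alarming on its own; the rule fires only when they occur in order within the window.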

3. The Log Pipeline: From Node to Storage

Architecting a SIEM pipeline is a massive data engineering challenge.

The Shipper: A lightweight agent (like Filebeat or Fluent Bit) that reads log files on each host and sends them over the network.
The Buffer: A message queue (like Kafka) that protects your SIEM from being overwhelmed during a traffic spike or an attack.
The Processor: A service (like Logstash) that parses and enriches the logs (e.g., adding GeoIP data to an IP address).
The Search Engine: The database (like Elasticsearch or OpenSearch) that indexes the logs so that you can query billions of them in seconds rather than hours.
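
The first three stages can be imitated in-process to show how they fit together. In this sketch a bounded `queue.Queue` stands in for the buffer and a small dictionary stands in for a GeoIP database; both are assumptions made for illustration, as are the function names and the "ip action" line format. A real buffer such as Kafka persists messages instead of dropping them when full.

```python
import queue
from typing import Dict, List

# Hypothetical lookup table; a real processor would query a GeoIP database.
GEO_TABLE: Dict[str, str] = {"203.0.113.7": "NL", "198.51.100.4": "US"}


def ship(lines: List[str], buffer: "queue.Queue[str]") -> int:
    """Shipper: push raw lines into the buffer; return how many were dropped."""
    dropped = 0
    for line in lines:
        try:
            buffer.put_nowait(line)
        except queue.Full:
            dropped += 1  # a real shipper would retry or spool to disk
    return dropped


def process(buffer: "queue.Queue[str]") -> List[Dict[str, str]]:
    """Processor: drain the buffer, parse 'ip action' lines, and add a country."""
    enriched = []
    while True:
        try:
            line = buffer.get_nowait()
        except queue.Empty:
            break
        ip, _, action = line.partition(" ")
        enriched.append({
            "src_ip": ip,
            "action": action,
            "country": GEO_TABLE.get(ip, "unknown"),
        })
    return enriched
```

The bounded queue makes the role of the buffer visible: when producers outrun the processor, something has to absorb the difference, and the design question is whether that means backpressure, spooling, or loss.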

4. Searchable vs. Cold Storage: The Economics of Logging

Logging everything is expensive.

Hot Storage (Searchable): SSD-backed storage for the last 7-30 days of data. Fast and expensive.
Cold Storage (Archive): S3-backed storage for logs retained for compliance (e.g., the 6-year retention period HIPAA sets for required documentation). Slow and very cheap.

Architectural Tip: Use lifecycle policies to automatically move logs from Hot to Cold storage as they age.
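
The decision a lifecycle policy encodes is simple enough to write down directly. The sketch below is not the configuration syntax of any storage product; the function name and the two default periods are hypothetical and should be set from your own search needs and compliance requirements.

```python
from datetime import datetime, timedelta

HOT_DAYS = 30             # hypothetical: how long logs stay searchable
RETENTION_DAYS = 3 * 365  # hypothetical: total retention before deletion


def storage_tier(log_time: datetime, now: datetime,
                 hot_days: int = HOT_DAYS,
                 retention_days: int = RETENTION_DAYS) -> str:
    """Return 'hot', 'cold', or 'delete' based on the age of a log."""
    age = now - log_time
    if age <= timedelta(days=hot_days):
        return "hot"
    if age <= timedelta(days=retention_days):
        return "cold"
    return "delete"
```

Managed services apply the same age-based rule on your behalf, so the periods are usually the only part you need to decide.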

5. Log Forgery: Preventing the "Memory Reset"

One of the first things a sophisticated attacker does after gaining access is try to delete or alter the logs.

Log Forgery: Attackers can inject "Fake" logs into your stream to confuse the SIEM or hide their own footprints.
The Defense (Forwarding): Never store your primary security logs only on the local disk. Use a shipper (such as Fluent Bit) to forward the logs off the host to a "Write-Once" (WORM) storage bucket as soon as they are written.
Signing: Sign each log entry with a hardware-backed key. If an attacker modifies a log entry to hide their IP, signature verification will fail, which should trigger an immediate "Integrity Breach" alert.
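
One way to make tampering detectable is to chain the entries, so that each tag covers both the entry and the previous tag. The sketch below uses HMAC-SHA256 from the Python standard library with an in-memory key, which is an assumption made for illustration: with a symmetric key, anyone who can read the key can forge tags, so in practice it would have to live in an HSM or on a separate host. The function names are hypothetical.

```python
import hashlib
import hmac
from typing import List, Tuple

GENESIS = b"\x00" * 32  # fixed starting value for the chain


def sign_chain(key: bytes, entries: List[str]) -> List[Tuple[str, str]]:
    """Return (entry, tag) pairs where each tag covers the entry and the previous tag."""
    signed = []
    previous = GENESIS
    for entry in entries:
        tag = hmac.new(key, previous + entry.encode("utf-8"), hashlib.sha256).digest()
        signed.append((entry, tag.hex()))
        previous = tag
    return signed


def verify_chain(key: bytes, signed: List[Tuple[str, str]]) -> int:
    """Return the index of the first entry that fails verification, or -1 if intact."""
    previous = GENESIS
    for index, (entry, tag_hex) in enumerate(signed):
        expected = hmac.new(key, previous + entry.encode("utf-8"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected.hex(), tag_hex):
            return index
        previous = expected
    return -1
```

Because of the chaining, editing or removing an entry in the middle breaks verification from that point on. Truncating the end of the log is not detected by this sketch alone; that requires recording the latest tag or entry count somewhere the attacker cannot reach.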

6. Case Study: The 2017 Equifax Visibility Gap

The Equifax breach, which exposed the data of roughly 147 million people (initially reported as 143 million), was primarily a failure of Visibility and SIEM Maintenance.

The Root Cause: An unpatched Apache Struts vulnerability (CVE-2017-5638).
The Logging Failure: Equifax had security tools that could have detected the data exfiltration, but the SSL certificate for their traffic inspection tool had expired.
The Physics: For at least 10 months, the monitoring device could not decrypt, and therefore could not inspect, the encrypted data leaving the network.
The Lesson: A SIEM is only as good as its Connectivity. If your tooling can't decrypt and inspect the traffic, you have a blind spot that attackers will exploit.

7. Fighting Alert Fatigue

If your SIEM sends 1,000 alerts a day, your security team will ignore all of them. This is "Alert Fatigue."

The Goal: High-fidelity alerts. An alert should only fire if it is Actionable.
The Filter: Use "Noise Suppression" to ignore known-safe behavior (like a daily backup job) and only alert on "First Time" or "High Risk" events.
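
A minimal sketch of such a filter follows. The allowlist entries, the set of high-risk actions, and the `AlertFilter` class are all hypothetical examples; in a real deployment these lists would be maintained as reviewed configuration, and the "seen" set would need an expiry so that it does not grow without bound.

```python
from typing import Set, Tuple

# Hypothetical allowlist of (user, action) pairs known to be routine.
KNOWN_SAFE: Set[Tuple[str, str]] = {("svc-backup", "bulk_read")}
# Hypothetical actions that always deserve an alert.
HIGH_RISK_ACTIONS: Set[str] = {"user_created", "policy_changed"}


class AlertFilter:
    """Suppress known-safe noise; alert on high-risk or first-time events."""

    def __init__(self) -> None:
        self.seen: Set[Tuple[str, str, str]] = set()

    def should_alert(self, user: str, action: str, host: str) -> bool:
        if (user, action) in KNOWN_SAFE:
            return False
        if action in HIGH_RISK_ACTIONS:
            return True
        key = (user, action, host)
        first_time = key not in self.seen
        self.seen.add(key)
        return first_time
```

The daily backup job never alerts, a new user creation always does, and an ordinary action alerts only the first time a given user performs it on a given host.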

Summary: Designing for Observability

SIEM and Log Management are the "Eyes" of your security organization. By architecting a scalable pipeline and focusing on meaningful correlation, you move from "Passive Logging" to "Active Hunting."

You are no longer just an architect of systems; you are a Historian of the Reality of your Hardware.

Phase 16: SIEM Resilience Actions

Centralize your logs into a dedicated SIEM platform (Splunk, Elastic Security, or Sentinel).
Implement Write-Once-Read-Many (WORM) policies for your audit logs to prevent deletion or modification by an attacker.
Enable Log Compression (zstd/gzip) on your forwarders to reduce network bandwidth and storage costs.
Set up "Heartbeat" monitoring for your logging agents: If a server stops sending logs for 5 minutes, treat it as a critical security event.
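
The heartbeat check in the last action can be sketched as a single function. It assumes you already keep a mapping from host name to the time its most recent log arrived (epoch seconds); the function name and that mapping are hypothetical.

```python
from typing import Dict, List

HEARTBEAT_TIMEOUT = 5 * 60  # seconds, matching the 5-minute threshold above


def silent_agents(last_seen: Dict[str, float], now: float,
                  timeout: float = HEARTBEAT_TIMEOUT) -> List[str]:
    """Return the hosts whose most recent log is older than the timeout."""
    return sorted(host for host, ts in last_seen.items() if now - ts > timeout)
```

Running this on a schedule and raising an alert for every host it returns turns "no logs" from an invisible condition into an event of its own.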