Linux Survival Basics

Linux Survival Basics safe

Find the Files Eating Your Disk

The disk was full, but guessing at folders was the slow part.

find /var -type f -printf '%s %p\n' | sort -nr | head -20

Linux Survival Basics safe

Watch Logs Without Opening the Whole File

The app was failing now. Opening a giant log file was the wrong move.

tail -n 80 -f /var/log/nginx/error.log

Linux Survival Basics safe

Find Errors Before Reading Every Log Line

The error was in the log. The problem was finding it without reading noise.

grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -40

Linux Survival Basics safe

Find the Exact Log Line Before You Scroll

The error was there. The useful part was knowing exactly where it was.

grep -inE 'error|failed|denied|timeout' /var/log/nginx/error.log

Linux Survival Basics safe

Find Which Folder Is Filling the Disk

The disk was full. The fastest clue was the folder, not the file.

du -sh /var/* 2>/dev/null | sort -h

Linux Survival Basics safe

Show Only Recent Errors

The log had old failures too. I only cared about the newest ones.

grep -iE 'error|failed|denied|timeout' /var/log/nginx/error.log | tail -10

Linux Survival Basics safe

Check Owner and Mode in One Line

The file existed. The owner and mode explained why it still failed.

stat -c '%A %U:%G %n' /var/www/example/index.html

Linux Survival Basics safe

Find the Processes Using Memory

The server felt slow. Memory pressure was the first thing to rule out.

ps -eo pid,comm,%mem,%cpu --sort=-%mem | head

Linux Survival Basics safe

Show Big Files in Human Units

Byte counts are precise. Human units are faster under pressure.

find /var -type f -printf '%s\t%p\n' | sort -nr | head -10 | awk -F'\t' '{printf "%.1f MiB %s\n", $1/1024/1024, $2}'

Linux Survival Basics safe

List Contents of a Backup Tarball

You can inspect an archive without extracting it.

tar -tf archives/site-backup.tar | sort | head

Linux Survival Basics safe

Count Source Files by Extension

A quick extension count can show whether expected content made it into the source tree.

find source -type f -printf '%f\n' | sed -n 's/.*\.//p' | sort | uniq -c | sort -nr
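The sed step above silently drops files that have no dot in their name, so a tree full of Makefiles or scripts can look emptier than it is. A minimal variant, assuming GNU find; the function name and the temporary sample tree are hypothetical and exist only so the sketch runs anywhere:

```shell
# Count files by extension, reporting names without a dot as "(none)".
# Dotfiles such as .gitignore count as extension "gitignore", same as the sed version.
count_by_extension() {
  find "$1" -type f -printf '%f\n' |
    awk -F. '{print (NF > 1 ? $NF : "(none)")}' |
    sort | uniq -c | sort -nr
}

# Hypothetical sample tree.
src_dir=$(mktemp -d)
touch "$src_dir/a.md" "$src_dir/b.md" "$src_dir/c.txt" "$src_dir/Makefile"

count_by_extension "$src_dir"
```

On a real tree, call it as count_by_extension source.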

Linux Survival Basics safe

Fingerprint a Debian or Ubuntu Host

Before package triage, prove what OS family and release you are actually on.

. /etc/os-release && printf '%s %s %s\n' "$ID" "$VERSION_ID" "$VERSION_CODENAME"

Linux Survival Basics safe

Compare Kernel and Distro Versions

The distro version and kernel version answer different questions.

printf 'kernel=%s arch=%s distro=%s\n' "$(uname -r)" "$(uname -m)" "$(lsb_release -ds)"

Linux Survival Basics safe

List Installed Package Versions

A package inventory beats memory when a server is drifting.

dpkg-query -W -f='${Package}\t${Version}\t${Architecture}\n' | sort

Linux Survival Basics safe

See Which Packages Want Updates

Before you upgrade anything, list what would move.

apt list --upgradable

Linux Survival Basics safe

Check the Installed and Candidate Package Version

apt policy explains where the next version would come from.

apt policy nginx

Linux Survival Basics safe

Check One Installed Package Cleanly

For one package, dpkg-query gives a clean status line.

dpkg-query -W -f='${Status} ${Version}\n' openssl

Linux Survival Basics safe

Find Which Package Owns a File

That binary came from somewhere. dpkg can tell you where.

dpkg-query -S /usr/sbin/nginx

Linux Survival Basics safe

Find Broken or Leftover dpkg States

Not every package row is cleanly installed.

dpkg-query -W -f='${db:Status-Abbrev}\t${Package}\n' | awk '$1 !~ /^ii$/'

Linux Survival Basics safe

Find the Largest Installed Packages

Disk cleanup starts with evidence, not random package removal.

dpkg-query -W -f='${Installed-Size}\t${Package}\n' | sort -nr | head -20

Linux Survival Basics safe

Spot Foreign-Architecture Packages

One unexpected architecture can explain confusing dependency output.

dpkg-query -W -f='${Architecture}\t${Package}\n' | awk -v native="$(dpkg --print-architecture)" '$1 != native && $1 != "all"'

Linux Survival Basics safe

Find the Largest CI Logs

Huge logs often point to loops, noisy tests, or runaway debug output.

find logs/ -type f -printf '%s %p\n' | sort -nr | head -10

Linux Survival Basics safe

Count Failures by Test File

Turn noisy test logs into a ranked failure list.
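A minimal sketch of that ranking, assuming the test runner logs each failure as a line starting with FAIL followed by the test file path. That log format, the function name, and the sample log are assumptions; adjust the pattern and field number to match your runner's output.

```shell
# Assumption: each failure is logged as "FAIL <test-file>" at the start of a line.
# Prints a count per test file, most failures first.
count_failures_by_file() {
  awk '$1 == "FAIL" {print $2}' "$1" | sort | uniq -c | sort -nr
}

# Hypothetical sample log, written to a temp file so the sketch runs anywhere.
sample_log=$(mktemp)
printf '%s\n' \
  'PASS tests/test_auth.py' \
  'FAIL tests/test_api.py' \
  'FAIL tests/test_db.py' \
  'FAIL tests/test_api.py' > "$sample_log"

count_failures_by_file "$sample_log"
```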

Linux Survival Basics safe

Show Context Around the First App Error

The first error often explains more than the last one.

awk '{buf[NR%5]=$0} tolower($0) ~ /(error|exception|fatal)/ {for (i=NR-4;i<=NR;i++) if (i>0) print buf[i%5]; exit}' fixtures/incidents/app.log

Linux Survival Basics safe

Spot OOM Kills in the Kernel Journal

Exit code 137 often means the kernel has something to say.

journalctl -k --since "2 hours ago" --no-pager -o short-iso | grep -Ei 'out of memory|oom|killed process'
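Why 137: shells report a process killed by a signal as 128 plus the signal number, and 137 - 128 = 9, which is SIGKILL, the signal the OOM killer sends. A small helper to decode any such status; the function name is hypothetical:

```shell
# Exit statuses above 128 conventionally mean "killed by signal (status - 128)".
# Prints the signal name, or "none" for an ordinary exit status.
signal_from_exit() {
  code=$1
  if [ "$code" -gt 128 ]; then
    kill -l "$((code - 128))"
  else
    echo "none"
  fi
}

signal_from_exit 137   # prints KILL
```

SIGKILL can also come from an operator or a supervisor, so the journal check is still what confirms an OOM kill.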

Linux Survival Basics safe

Trace Every Parent Directory on a Permission Denial

The file mode can look fine while a parent directory blocks the whole path.

namei -l fixtures/perm-audit/current/app/config/prod.token
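Reaching a file requires execute (search) permission on every directory in its path, which is what namei -l makes visible. Where namei is not installed, a rough equivalent can be sketched with GNU stat. The function name and the temporary sample tree are hypothetical:

```shell
# Print mode and owner for a path and each parent directory, target first.
# Assumes GNU stat (-c). Stops at "/" for absolute paths, "." for relative ones.
walk_parents() {
  p=$1
  while :; do
    stat -c '%A %U:%G %n' "$p"
    [ "$p" = "/" ] && break
    [ "$p" = "." ] && break
    p=$(dirname "$p")
  done
}

# Hypothetical sample tree: the file is readable, but app/ is owner-only.
tmp_root=$(mktemp -d)
mkdir -p "$tmp_root/app/config"
touch "$tmp_root/app/config/prod.token"
chmod 644 "$tmp_root/app/config/prod.token"
chmod 700 "$tmp_root/app"

walk_parents "$tmp_root/app/config/prod.token"
```

In the output, look for a directory line whose mode lacks x for the user or group the failing process runs as.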

Linux Survival Basics safe

Check Memory Pressure with free

Linux memory numbers look scary until you know which column matters.

free -h
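The column that usually matters is available, the kernel's estimate of memory that new work can use without swapping; a low free value alone is normal because Linux fills spare memory with cache. A minimal sketch that turns that estimate into a percentage. The function name and sample values are hypothetical, and the input is assumed to follow the /proc/meminfo format:

```shell
# Print available memory as a whole-number percentage of total.
# Reads a meminfo-style file; defaults to /proc/meminfo on Linux.
mem_available_pct() {
  awk '/^MemTotal:/     {total = $2}
       /^MemAvailable:/ {avail = $2}
       END {if (total > 0) printf "%d\n", avail * 100 / total}' "${1:-/proc/meminfo}"
}

# Hypothetical sample values so the sketch runs anywhere.
sample_meminfo=$(mktemp)
printf '%s\n' \
  'MemTotal:        8000000 kB' \
  'MemFree:          500000 kB' \
  'MemAvailable:    2000000 kB' > "$sample_meminfo"

mem_available_pct "$sample_meminfo"
```

On a live Linux host, call mem_available_pct with no argument.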

Linux Survival Basics safe

Read Load Average Before You React

A high load number is a clue, not a diagnosis.

uptime
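The three numbers uptime prints are the 1, 5, and 15 minute load averages, and they only mean something relative to the number of CPU cores. A rough sketch of that comparison; the function name is hypothetical, and the "above 1.0 per core" reading is a rule of thumb, not a threshold the kernel defines:

```shell
# Divide a load average by the core count.
# Args: load (e.g. 3.00), cores (e.g. 4). Prints load per core to two decimals.
load_per_core() {
  awk -v load="$1" -v cores="$2" 'BEGIN {printf "%.2f\n", load / cores}'
}

# Fixed sample values so the sketch runs anywhere.
load_per_core 3.00 4

# On a Linux host (assumes /proc/loadavg and nproc are available):
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

On Linux the load figure also counts tasks blocked in uninterruptible I/O, so a high value with idle CPUs points at disk or network storage rather than compute.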

Linux Survival Basics safe

Show the Real User Cron Jobs

Cron problems often hide behind comments, blank lines, and copied folklore.

crontab -l | sed -n '/^[[:space:]]*#/d;/^[[:space:]]*$/d;p'

Linux Survival Basics safe

Turn Cron Into a Readable Table

Cron is easier to debug when the schedule and command stop blending together.

crontab -l | awk 'NF && $1 !~ /^#/ {printf "%-16s %s\n", $1" "$2" "$3" "$4" "$5, substr($0,index($0,$6))}'

Linux Survival Basics safe

Map systemd Timers to Services

A timer is only half the scheduled job. The service is the payload.

systemctl list-timers --all --no-pager --plain | awk 'NR==1 || /\.timer/ {print $(NF-1), "->", $NF}'

Linux Survival Basics safe

List Tables in a SQLite Database

Before querying a database file, see what tables are actually inside it.

sqlite3 app.db ".tables"

Linux Survival Basics safe

List URLs from a Sitemap

Before comparing sitemap coverage, print the URLs plainly.

grep -o '<loc>[^<]*</loc>' public/sitemap.xml | sed 's#<loc>##;s#</loc>##'

Linux Survival Basics safe

Show Failed systemd Units

One command tells you which services systemd already knows are broken.

systemctl --failed --no-pager

Linux Survival Basics safe

Inspect One Service Without Pager Traps

Make systemctl status safe for scripts, screenshots, and quick incident notes.

systemctl status nginx --no-pager --lines=30

Linux Survival Basics safe

Read Current-Boot Logs for One Service

Ignore stale logs and inspect only what happened since this boot.

journalctl -u nginx -b --no-pager -n 80

Linux Survival Basics safe

Check systemd Journal Disk Usage

Before deleting random logs, ask journald how much disk it owns.

journalctl --disk-usage

Linux Survival Basics safe

Find Slow Services During Boot

Find which units made your VPS boot slowly.

systemd-analyze blame | head -20

Linux Survival Basics safe

Check Whether a Service Starts at Boot

Running now does not mean it will survive the next reboot.

systemctl is-enabled nginx

Linux Survival Basics safe

Check If a Service Is Active

Get a clean yes-or-no service state without the full status page.

systemctl is-active nginx

Linux Survival Basics safe

Show Recent Server Reboots

Confirm whether the server actually rebooted and when.

last -x reboot | head -5

Linux Survival Basics safe

Check Memory Pressure Quickly

See whether memory is actually tight before restarting services.

free -h

Linux Survival Basics safe

List Upcoming systemd Timers

Cron is not the only scheduler on modern Linux servers.

systemctl list-timers --all --no-pager

Linux Survival Basics safe

Read the Failure Cause in systemctl Status

The status page often tells you the failed startup step before you open every log.

systemctl status app-worker --no-pager --lines=50

Linux Survival Basics safe

Print the Exact systemd Exit Fields

Turn a noisy service failure into four fields you can paste into an incident note.

systemctl show app-worker --property=Result,ExecMainCode,ExecMainStatus,NRestarts --no-pager
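systemctl show prints one Key=Value pair per line. For an incident note it can help to fold them into a single line. A minimal sketch; the function name and the sample values are hypothetical, not output from a real host:

```shell
# Join "Key=Value" lines from stdin into one space-separated line.
one_line_fields() {
  paste -sd' ' -
}

# Hypothetical sample in the Key=Value shape systemctl show prints.
printf '%s\n' \
  'Result=exit-code' \
  'ExecMainCode=1' \
  'ExecMainStatus=203' \
  'NRestarts=5' | one_line_fields
```

On a live host, pipe the systemctl show command above into one_line_fields.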

Linux Survival Basics safe

Inspect the Unit File and Drop-ins Together

The bug may be in an override file, not the main unit.

systemctl cat app-worker

Linux Survival Basics caution

Reset Failed State After Capturing Evidence

Clear the red failed state only after you have captured the evidence.

systemctl reset-failed app-worker

Linux Survival Basics safe

Compare Failure Output With the Effective Unit

Put the failed step next to the unit config that created it.

systemctl status app-worker --no-pager --lines=50 && systemctl cat app-worker
