Hosting Operations

Group Journal Errors by Unit

Recent journal errors come from several processes, and you need to see which unit or source is producing most of them.

Command

journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | awk '{split($3,a,"["); unit=a[1]; count[unit]++} END {for (u in count) print count[u], u}' | sort -nr

What changed

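Nothing changes. The command counts journal entries of priority err through alert from the last two hours, grouped by the source field: the third column of short-iso output, which holds the process identifier that precedes the bracketed PID.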
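The grouping step can be wrapped in a small function so it can be tried on saved lines before it is pointed at the live journal. This is a minimal sketch, not the command above: count_by_source is a hypothetical name, and the sketch assumes short-iso lines shaped like the simulated output in this card. It also strips the trailing colon from sources that log without a PID (kernel lines, for example) and skips journalctl marker lines that start with "--".

```shell
# Hypothetical helper (not part of the original command): reads
# journalctl -o short-iso lines on stdin and prints "count source" pairs.
count_by_source() {
  awk '
    /^--/ { next }                  # skip marker lines such as "-- No entries --"
    {
      src = $3                      # third field, e.g. "api[1842]:" or "kernel:"
      sub(/\[[0-9]+\]:$/, "", src)  # drop a trailing "[PID]:"
      sub(/:$/, "", src)            # drop a bare ":" when no PID is present
      count[src]++
    }
    END { for (s in count) print count[s], s }
  ' | sort -nr
}

# Read-only use against the journal:
# journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | count_by_source
```

Like the original pipeline, the function only reads its input, so it carries the same "safe" rating.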

Danger

safe

When to use it

Use it after a severity summary to decide which service's log deserves attention first.

When not to use it

Do not assume the noisiest unit caused the incident; it may only be reporting downstream failure.
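One way to keep this caveat visible is to print when each source first logged an error alongside its count, so a quiet source that failed early is not hidden behind a loud one. The sketch below is an illustration under assumptions: first_seen_by_source is a hypothetical name, input is short-iso output in chronological order, and all timestamps share one UTC offset so that a plain text sort matches time order.

```shell
# Hypothetical helper: prints "first_timestamp count source", earliest first.
first_seen_by_source() {
  awk '
    /^--/ { next }                           # skip journalctl marker lines
    {
      src = $3
      sub(/\[[0-9]+\]:$/, "", src)           # drop a trailing "[PID]:"
      sub(/:$/, "", src)                     # drop a bare ":" when no PID is present
      if (!(src in first)) first[src] = $1   # input is chronological, so keep the first
      count[src]++
    }
    END { for (s in first) print first[s], count[s], s }
  ' | sort
}

# Read-only use against the journal:
# journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | first_seen_by_source
```

An early first-seen time does not prove causation either; it only suggests which log to read first.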

Undo or recovery

No undo needed because the command is read-only.

Expected output

One line per source: a count followed by the unit or process name, sorted from the highest count to the lowest.
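The names in this output are process identifiers, which often match the unit name but are not guaranteed to. If real systemd unit names are wanted, one alternative is to count the _SYSTEMD_UNIT field from journalctl's export output. This is a sketch under assumptions, not the command documented here: count_by_unit is a hypothetical name, and the commented invocation assumes a systemd version that supports --output-fields. Entries with no _SYSTEMD_UNIT field, such as kernel messages, are not counted.

```shell
# Hypothetical helper: reads "FIELD=value" lines (journalctl -o export)
# on stdin and prints "count unit" pairs, highest count first.
count_by_unit() {
  awk -F= '
    # Keep everything after the first "=" as the unit name.
    $1 == "_SYSTEMD_UNIT" { count[substr($0, length($1) + 2)]++ }
    END { for (u in count) print count[u], u }
  ' | sort -nr
}

# Read-only use against the journal:
# journalctl -p err..alert --since "2 hours ago" --no-pager -o export --output-fields=_SYSTEMD_UNIT | count_by_unit
```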

demo script

Disposable terminal steps

journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso
journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | awk '{split($3,a,"["); unit=a[1]; count[unit]++} END {for (u in count) print count[u], u}' | sort -nr

Terminal transcript Demo JSON

simulated output

What it looks like

disposable vessel

::fixture-ready::
$ journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso
2026-06-25T14:03:08+00:00 vps api[1842]: err request_id=req-103 ERROR database timeout after 30000ms
2026-06-25T14:03:12+00:00 vps api[1842]: err request_id=req-103 ERROR retry failed upstream=db
2026-06-25T14:05:10+00:00 vps worker[2201]: crit FATAL job runner exited code=137
2026-06-25T14:06:33+00:00 vps api[1842]: err request_id=req-107 ERROR payment provider returned 500
::exit-code::0
$ journalctl -p err..alert --since "2 hours ago" --no-pager -o short-iso | awk '{split($3,a,"["); unit=a[1]; count[unit]++} END {for (u in count) print count[u], u}' | sort -nr
3 api
1 worker
::exit-code::0

YouTube Short

Find the noisiest unit.

Group severe journal lines by source. The counts quickly show whether the incident is centered on the app, worker, kernel, or supervisor.

experiments

A/B tests to run

Metric: short_click_through_rate

A: Find the noisiest service.

B: Group errors before reading details.