I recently decided to update the health monitoring solution for Squid proxy servers running on Linux.
I have covered this in other posts before, but thought it could do with a lick of paint and a bit of modernization. The new script, complete with full-featured monitoring, visual indicators, and intelligent health scoring, makes a handy update to my script arsenal.
The Initial Challenge
The requirement was straightforward: create a monitoring script for Squid proxy servers that would generate HTML email reports with system metrics and SVG charts, and that would be compatible across multiple Linux distributions.
Visual Results
First, let's start with the healthy version, which means everything is running well:
You then get a degraded state, as you can see here:
I began with a simple monitoring script that collected system metrics and checked if Squid was running:
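# Check if Squid is running (default to "not running" so both variables are always set)
SQUID_RUNNING=0
SQUID_STATUS="Not Running"
if systemctl is-active --quiet squid 2>/dev/null; then
    SQUID_RUNNING=1
    SQUID_STATUS="Running"
fi
# Get basic metrics
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
MEM_INFO=$(free -m | grep Mem)
# Columns of the "Mem:" row: $2 = total MB, $3 = used MB
MEM_TOTAL=$(echo "$MEM_INFO" | awk '{print $2}')
MEM_USED=$(echo "$MEM_INFO" | awk '{print $3}')
MEM_PERCENT=$(awk "BEGIN {printf \"%.0f\", ($MEM_USED/$MEM_TOTAL)*100}")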
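Parsing `top` output can be brittle across distributions, because the layout of the `Cpu(s)` line varies between versions and locales. As a hedged alternative (this is not what the script above does), CPU usage can be derived from two samples of the first line of `/proc/stat`. The function name `cpu_usage_between` and the one-second sampling interval in the usage comment are assumptions for illustration.

```shell
# Sketch: CPU usage from two samples of the aggregate "cpu" line in /proc/stat.
# Fields after the label: user nice system idle iowait irq softirq steal ...
cpu_usage_between() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            for (i = 2; i <= 9; i++) total += $i   # user through steal
            idle = $5 + $6                          # idle + iowait
        }
        NR == 1 { t1 = total; i1 = idle }
        NR == 2 {
            dt = total - t1
            di = idle - i1
            if (dt > 0) printf "%.1f", 100 * (1 - di / dt)
            else        printf "0.0"
        }'
}

# Usage on a live system (the 1-second interval is an arbitrary choice):
#   first=$(head -n 1 /proc/stat); sleep 1; second=$(head -n 1 /proc/stat)
#   CPU_USAGE=$(cpu_usage_between "$first" "$second")
```

Because the function only takes two strings, it can be checked against known samples without touching the live system.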
Focusing on What Matters: Proxy-Specific Metrics
For a proxy server, the critical metrics are fundamentally different from those in typical server monitoring. I focused on what actually impacts proxy performance:
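# File descriptors - THE most critical metric for proxies
# SQUID_PID must already hold the Squid process ID
FD_USED=$(ls /proc/$SQUID_PID/fd 2>/dev/null | wc -l)
# Field 4 is the soft limit (the one the process actually hits); field 5 is the hard limit
FD_TOTAL=$(grep "Max open files" /proc/$SQUID_PID/limits 2>/dev/null | awk '{print $4}')
# Skip the division when the limit is missing, zero, or "unlimited"
FD_PERCENT=""
[ "${FD_TOTAL:-0}" -gt 0 ] 2>/dev/null && FD_PERCENT=$(awk "BEGIN {printf \"%.1f\", ($FD_USED/$FD_TOTAL)*100}")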
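# Cache performance metrics
# Each line lists a 5-minute and a 60-minute value; head -1 keeps the 5-minute one
REQUEST_HIT_RATIO=$(echo "$SQUID_INFO" | grep "Request Hit Ratios:" | grep -oE '[0-9.]+%' | head -1)
BYTE_HIT_RATIO=$(echo "$SQUID_INFO" | grep "Byte Hit Ratios:" | grep -oE '[0-9.]+%' | head -1)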
# DNS performance (critical for proxy speed)
DNS_LOOKUPS=$(echo "$SQUID_INFO" | grep "Number of external DNS lookups" | awk '{print $6}')
FQDN_HIT_RATE=$(awk "BEGIN {printf \"%.1f\", ($FQDN_HITS/$FQDN_REQUESTS)*100}")
File descriptors are particularly critical because running out of them is one of the most common causes of proxy failure under load.
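To make the file descriptor calculation easier to reason about, here is a minimal sketch that splits it into small functions and refuses to divide by a missing or non-numeric limit. The names `percent`, `soft_fd_limit`, and `fd_usage` are hypothetical helpers, not part of the original script, and the sketch assumes the usual `/proc/<pid>/limits` layout in which the soft limit is the fourth field of the "Max open files" row.

```shell
# percent USED TOTAL: prints a one-decimal percentage,
# or nothing (exit status 1) when TOTAL is not a positive integer.
percent() {
    local used=$1 total=$2
    case "$total" in
        ''|*[!0-9]*) return 1 ;;   # empty, "unlimited", or otherwise non-numeric
    esac
    [ "$total" -gt 0 ] || return 1
    awk -v u="$used" -v t="$total" 'BEGIN { printf "%.1f", (u / t) * 100 }'
}

# soft_fd_limit: reads /proc/<pid>/limits text on stdin and prints the
# soft "Max open files" limit (4th whitespace-separated field of that row).
soft_fd_limit() {
    awk '/^Max open files/ { print $4 }'
}

# fd_usage PID: percentage of the soft limit currently in use by PID.
# Reading another user's /proc/<pid>/fd normally requires root.
fd_usage() {
    local pid=$1 used total
    [ -r "/proc/$pid/limits" ] || return 1
    used=$(ls "/proc/$pid/fd" 2>/dev/null | wc -l)
    total=$(soft_fd_limit < "/proc/$pid/limits")
    percent "$used" "$total"
}
```

A call such as `FD_PERCENT=$(fd_usage "$SQUID_PID")` then leaves `FD_PERCENT` empty when no usable limit is found, which downstream checks can test with `[ -n "$FD_PERCENT" ]`.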
Performance Optimization: Preventing Script Hangs
The script would hang while collecting metrics whenever a squidclient call stalled, so I wrapped every call in a timeout:
# Timeout wrapper function
run_with_timeout() {
    local seconds=$1
    shift
    # Use timeout(1) when it is installed; otherwise run the command unguarded
    if command -v timeout >/dev/null 2>&1; then
        timeout "$seconds" "$@" 2>/dev/null
    else
        "$@" 2>/dev/null
    fi
}
# Use with 2-second timeout for all squidclient calls
SQUID_INFO=$(run_with_timeout 2 squidclient -h localhost -p 3128 mgr:info)
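One thing the wrapper above does not reveal is whether a call timed out or failed for another reason. The sketch below shows one way to tell the two apart, assuming GNU coreutils `timeout`, which exits with status 124 when the limit is reached. The names `classify_exit`, `fetch_with_status`, `FETCH_OUTPUT`, and `FETCH_STATUS` are hypothetical and not part of the original script.

```shell
# classify_exit CODE: maps an exit status to "ok", "timeout", or "error".
# Assumes GNU coreutils timeout, which exits with 124 when the limit is hit.
classify_exit() {
    case "$1" in
        0)   echo "ok" ;;
        124) echo "timeout" ;;
        *)   echo "error" ;;
    esac
}

# fetch_with_status SECONDS CMD [ARGS...]
# Runs CMD under a time limit and sets two variables in the calling shell:
#   FETCH_OUTPUT - whatever CMD printed on stdout
#   FETCH_STATUS - "ok", "timeout", or "error"
fetch_with_status() {
    local limit=$1 rc
    shift
    # Capture the exit status of the command substitution itself
    FETCH_OUTPUT=$(timeout "$limit" "$@" 2>/dev/null) && rc=0 || rc=$?
    FETCH_STATUS=$(classify_exit "$rc")
}

# Usage:
#   fetch_with_status 2 squidclient -h localhost -p 3128 mgr:info
#   [ "$FETCH_STATUS" = "ok" ] && SQUID_INFO=$FETCH_OUTPUT
```

Keeping the status separate from the output makes it possible to report "Squid did not answer in time" differently from "squidclient is not installed".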
I also optimized log parsing to only process recent entries instead of entire files:
# Process only last 1000 lines for recent activity
RECENT_REQUESTS=$(tail -1000 "$SQUID_LOG" 2>/dev/null | awk -v cutoff=$ONE_MIN_AGO '$1 > cutoff' | wc -l)
# Quick hit rate from last 200 requests only
TCP_HITS=$(tail -200 "$SQUID_LOG" 2>/dev/null | grep -c "TCP_HIT\|TCP_MEM_HIT\|TCP_IMS_HIT")
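The snippet above relies on a cutoff timestamp and leaves the hit rate calculation implicit. As a rough sketch, both steps can be written as small filters that read log lines on stdin. This assumes Squid's native access.log format, where field 1 is an epoch timestamp and field 4 is `RESULT_CODE/HTTP_STATUS`; the function names `count_since` and `hit_rate`, and the way `ONE_MIN_AGO` is computed in the usage comment, are assumptions rather than the original script's code.

```shell
# Assumed log format (Squid native):
#   timestamp elapsed client result/status bytes method URL ...

# count_since CUTOFF: counts stdin lines whose epoch timestamp is newer than CUTOFF.
count_since() {
    awk -v cutoff="$1" '$1 > cutoff { n++ } END { print n + 0 }'
}

# hit_rate: percentage of stdin lines whose result code is
# TCP_HIT, TCP_MEM_HIT, or TCP_IMS_HIT (one decimal place).
hit_rate() {
    awk '
        { split($4, code, "/"); total++ }
        code[1] ~ /^TCP_(MEM_|IMS_)?HIT$/ { hits++ }
        END {
            if (total > 0) printf "%.1f", hits / total * 100
            else           printf "0.0"
        }'
}

# Usage:
#   ONE_MIN_AGO=$(( $(date +%s) - 60 ))
#   RECENT_REQUESTS=$(tail -n 1000 "$SQUID_LOG" 2>/dev/null | count_since "$ONE_MIN_AGO")
#   HIT_RATE=$(tail -n 200 "$SQUID_LOG" 2>/dev/null | hit_rate)
```

Matching on the fourth field rather than the whole line avoids counting a URL that happens to contain the text `TCP_HIT`.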
Creating Visual Health Indicators
To make the health status immediately apparent, I implemented a color-coded system with visual indicators:
# Function to create status dot in HTML
create_status_dot() {
local status=$1
local color=$(get_status_color "$status")
local text=$(get_status_text "$status")
echo "<div style='display: flex; align-items: center; gap: 8px;'>
<div style='width: 12px; height: 12px; border-radius: 50%;
background-color: ${color}; box-shadow: 0 0 0 2px ${color}33;'></div>
<span style='color: ${color}; font-weight: 500;'>${text}</span>
</div>"
}
This produces a clean table with instant visual feedback:
- 🟢 Green = Good
- 🟡 Amber = Warning
- 🔴 Red = Critical
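`create_status_dot` depends on two lookup helpers, `get_status_color` and `get_status_text`, whose bodies are not shown here. A minimal sketch of what they might look like follows. The hex colours, the display labels, and the `good` status name are assumptions chosen to match the green/amber/red scheme; only `warning` and `critical` appear elsewhere in this post.

```shell
# get_status_color STATUS: prints a CSS colour for the status.
# Colour values are illustrative assumptions, not taken from the original script.
get_status_color() {
    case "$1" in
        good)     echo "#28a745" ;;   # green
        warning)  echo "#ffc107" ;;   # amber
        critical) echo "#dc3545" ;;   # red
        *)        echo "#6c757d" ;;   # grey for unknown or missing status
    esac
}

# get_status_text STATUS: prints the human-readable label shown next to the dot.
get_status_text() {
    case "$1" in
        good)     echo "Good" ;;
        warning)  echo "Warning" ;;
        critical) echo "Critical" ;;
        *)        echo "Unknown" ;;
    esac
}
```

Falling back to a neutral grey for unrecognised input means a typo in a status name shows up in the report as "Unknown" instead of producing an empty `background-color`.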
Dynamic Content: Only Show What's Available
One key improvement was ensuring the HTML report only displays metrics that have actual data:
# Only show file descriptors if we have the data
# (FD_PERCENT is formatted with one decimal place, so an empty reading is "0.0", not "0")
if [ -n "$FD_PERCENT" ] && [ "$FD_PERCENT" != "0.0" ]; then
echo "<tr>
<td>File Descriptors</td>
<td>${FD_USED}/${FD_TOTAL} (${FD_PERCENT}% used)</td>
<td>$(create_status_dot "$FD_STATUS")</td>
</tr>" >> /tmp/squid_health_report.html
fi
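When several metrics need the same "only if present" treatment, the check can be folded into one helper. This is a sketch under assumptions: `html_row` is a hypothetical name, and it writes to stdout so the caller decides where the report goes.

```shell
# html_row LABEL VALUE STATUS_HTML
# Prints one table row, or nothing at all when VALUE is empty.
html_row() {
    local label=$1 value=$2 status=$3
    [ -n "$value" ] || return 0
    printf '<tr><td>%s</td><td>%s</td><td>%s</td></tr>\n' "$label" "$value" "$status"
}

# Usage:
#   html_row "File Descriptors" "${FD_PERCENT:+$FD_USED/$FD_TOTAL (${FD_PERCENT}% used)}" \
#            "$(create_status_dot "$FD_STATUS")" >> /tmp/squid_health_report.html
```

The `${FD_PERCENT:+...}` expansion in the usage line yields an empty string when `FD_PERCENT` is unset or empty, so the row is skipped without a separate `if`.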
The Health Scoring Algorithm
I developed a weighted scoring system that focuses on proxy-critical metrics:
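# File descriptor check (most critical for proxies)
# FD_PERCENT is a decimal such as "92.3"; [ -gt ] only accepts integers, so drop the fraction
FD_PERCENT_INT=${FD_PERCENT%.*}
if [ "${FD_PERCENT_INT:-0}" -gt 90 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 25)) # Heavy penalty
    FD_STATUS="critical"
elif [ "${FD_PERCENT_INT:-0}" -gt 75 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 10))
    FD_STATUS="warning"
fi
# Cache performance check
# Keep the first value only, then strip the percent sign and the fraction
HIT_RATE_NUM=$(echo "$REQUEST_HIT_RATIO" | head -1 | sed 's/%//; s/\..*//')
if [ -n "$HIT_RATE_NUM" ] && [ "$HIT_RATE_NUM" -lt 20 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 10))
    HIT_STATUS="warning"
fi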
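The thresholds above can also be expressed as a small function, which makes the penalty easy to check against sample values. This is a sketch: `to_int` and `fd_penalty` are hypothetical names, the 90/75 thresholds and 25/10 penalties come from the snippet above, and the starting score of 100 in the usage comment is an assumption.

```shell
# to_int VALUE: strips a trailing "%" and any fractional part.
# Prints the integer, or fails (status 1) when nothing numeric is left.
to_int() {
    local v=${1%\%}
    v=${v%%.*}
    case "$v" in
        ''|*[!0-9]*) return 1 ;;
    esac
    echo "$v"
}

# fd_penalty PERCENT: points to subtract from the health score.
# Missing or non-numeric input counts as no penalty.
fd_penalty() {
    local pct
    pct=$(to_int "$1") || { echo 0; return 0; }
    if [ "$pct" -gt 90 ]; then
        echo 25
    elif [ "$pct" -gt 75 ]; then
        echo 10
    else
        echo 0
    fi
}

# Usage (assumes the score starts at 100):
#   HEALTH_SCORE=100
#   HEALTH_SCORE=$(( HEALTH_SCORE - $(fd_penalty "$FD_PERCENT") ))
```

Note that truncation means 90.9% is treated as 90 and therefore lands in the warning band, not the critical one.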
Console Output for Quick Checks
The script provides a clean console summary with visual indicators:
═══════════════════════════════════════════════════════════════
SQUID PROXY HEALTH SUMMARY
═══════════════════════════════════════════════════════════════
▶ OVERALL HEALTH: GOOD (85%) ✓
QUICK STATUS CHECK
───────────────────────────────────────────────────────────────
System Health: CPU ✓ Memory ✓ Disk ✓
Squid Health: Service ✓ FDs ✓ Cache ✓
The final script executes in under 5 seconds, provides clear visual health indicators, and focuses on the metrics that actually matter for proxy server health monitoring.