Clean Lines, Clear Signals: A Minimalist Approach to Squid Proxy Monitoring

I recently decided to update the health monitoring solution for Squid proxy servers running on Linux.

I have covered this in other posts, but thought the script could do with a lick of paint and a bit of modernization. The new version, complete with full-featured monitoring, visual indicators, and intelligent health scoring, makes a handy update to my script arsenal.

The Initial Challenge

The requirement was straightforward: create a monitoring script for Squid proxy servers that generates HTML email reports with system metrics and SVG charts, and that works across multiple Linux distributions.

Visual Results

First, let's start with the healthy version, which means everything is running well:

Now let's look at a degraded status from the server:

The report then shows the degraded state, as you can see here:

Then finally we move on to critical:

You should get a breakdown of everything that has become degraded, flagged as a warning or critical alert:

Starting with the Basics

I began with a simple monitoring script that collected system metrics and checked if Squid was running:

# Check if Squid is running (set defaults first so both variables always have a value)
SQUID_RUNNING=0
SQUID_STATUS="Stopped"
if systemctl is-active --quiet squid 2>/dev/null; then
    SQUID_RUNNING=1
    SQUID_STATUS="Running"
fi

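# Get basic metrics
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | sed "s/.*, *\([0-9.]*\)%* id.*/\1/" | awk '{print 100 - $1}')
MEM_INFO=$(free -m | grep Mem)
# Columns 2 and 3 of the "Mem:" row are total and used memory in MB
MEM_TOTAL=$(echo "$MEM_INFO" | awk '{print $2}')
MEM_USED=$(echo "$MEM_INFO" | awk '{print $3}')
MEM_PERCENT=$(awk "BEGIN {printf \"%.0f\", ($MEM_USED/$MEM_TOTAL)*100}")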
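
Since the column layout of free has changed between releases, a distribution-neutral alternative is to read the kernel's own counters. The following is only a sketch, not part of the original script: mem_percent_from_meminfo is a hypothetical helper name, and it assumes a kernel that exposes MemAvailable (Linux 3.14 and later).

```shell
# Hypothetical helper: percentage of memory in use, read from a meminfo-style file.
# Assumes the file contains MemTotal and MemAvailable lines (values in kB).
mem_percent_from_meminfo() {
    awk '
        /^MemTotal:/     { total = $2 }
        /^MemAvailable:/ { avail = $2 }
        END {
            # Guard against a missing MemTotal so we never divide by zero
            if (total > 0) printf "%.0f\n", (total - avail) / total * 100
            else           print "0"
        }
    ' "${1:-/proc/meminfo}"
}
```

Called with no argument it reads /proc/meminfo, so MEM_PERCENT=$(mem_percent_from_meminfo) would slot in where the free-based calculation sits.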

Focusing on What Matters: Proxy-Specific Metrics

For a proxy server, the critical metrics are fundamentally different from those in typical server monitoring. I focused on what actually impacts proxy performance:

# File descriptors - THE most critical metric for proxies
FD_USED=$(ls /proc/$SQUID_PID/fd 2>/dev/null | wc -l)
# Column 4 is the soft limit (the one enforced at runtime); column 5 is the hard limit
FD_TOTAL=$(grep "Max open files" /proc/$SQUID_PID/limits | awk '{print $4}')
FD_PERCENT=$(awk "BEGIN {printf \"%.1f\", ($FD_USED/$FD_TOTAL)*100}")

# Cache performance metrics
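# Each line reports a 5min and a 60min value; head keeps the first (5min) match
REQUEST_HIT_RATIO=$(echo "$SQUID_INFO" | grep "Request Hit Ratios:" | grep -oE '[0-9.]+%' | head -n 1)
BYTE_HIT_RATIO=$(echo "$SQUID_INFO" | grep "Byte Hit Ratios:" | grep -oE '[0-9.]+%' | head -n 1)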

# DNS performance (critical for proxy speed)
DNS_LOOKUPS=$(echo "$SQUID_INFO" | grep "Number of external DNS lookups" | awk '{print $6}')
FQDN_HIT_RATE=$(awk "BEGIN {printf \"%.1f\", ($FQDN_HITS/$FQDN_REQUESTS)*100}")

File descriptors are particularly critical because running out of them is one of the most common causes of proxy failure under load.
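
To make the descriptor check safe when the process has gone away or the limit is not numeric, the calculation can be wrapped in a function that fails quietly. This is a sketch under assumptions: fd_usage is a hypothetical name, and how the real script obtains the Squid PID is not shown here, so the PID is simply passed in as an argument.

```shell
# Hypothetical helper: print "used total percent" for a PID, or return 1 if unavailable.
fd_usage() {
    fd_dir="/proc/$1/fd"
    fd_limits="/proc/$1/limits"
    # Both paths must be readable, otherwise there is nothing to report
    [ -d "$fd_dir" ] && [ -r "$fd_limits" ] || return 1
    fd_used=$(ls "$fd_dir" 2>/dev/null | wc -l)
    # Column 4 of the "Max open files" row is the soft limit
    fd_total=$(awk '/^Max open files/ {print $4}' "$fd_limits")
    case "$fd_total" in
        ''|*[!0-9]*) return 1 ;;   # missing or "unlimited": no meaningful percentage
    esac
    [ "$fd_total" -gt 0 ] || return 1
    awk -v u="$fd_used" -v t="$fd_total" \
        'BEGIN { printf "%d %d %.1f\n", u, t, (u / t) * 100 }'
}
```

A caller could split the result with something like set -- $(fd_usage "$SQUID_PID") and skip the metric entirely when the function returns non-zero.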

Performance Optimization: Preventing Script Hangs

The script would sometimes hang when collecting metrics because squidclient calls could block. I implemented a timeout wrapper that falls back to running the command directly when the timeout utility is not installed:

# Timeout wrapper function
run_with_timeout() {
    local timeout=$1
    shift
    if check_command timeout; then
        timeout "$timeout" "$@" 2>/dev/null
    else
        "$@" 2>/dev/null
    fi
}

# Use with 2-second timeout for all squidclient calls
SQUID_INFO=$(run_with_timeout 2 squidclient -h localhost -p 3128 mgr:info)
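
One thing the wrapper hides is why a call produced no output. As a sketch, assuming GNU coreutils timeout (which exits with status 124 when the limit is reached), the outcome can be classified so the report can tell "Squid was slow" apart from "squidclient failed". The function and variable names here (fetch_with_status, FETCH_OUTPUT, FETCH_STATE) are hypothetical.

```shell
# Hypothetical wrapper: run a command under a time limit and record how it ended.
# Assumes GNU coreutils `timeout`, which exits 124 when the limit is hit.
fetch_with_status() {
    limit=$1
    shift
    FETCH_OUTPUT=$(timeout "$limit" "$@" 2>/dev/null)
    rc=$?
    case "$rc" in
        0)   FETCH_STATE="ok" ;;
        124) FETCH_STATE="timeout" ;;
        *)   FETCH_STATE="error" ;;
    esac
    return "$rc"
}
```

For example, after fetch_with_status 2 squidclient -h localhost -p 3128 mgr:info, a FETCH_STATE of "timeout" could be surfaced in the report instead of a silently empty metric.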

I also optimized log parsing to only process recent entries instead of entire files:

# Process only last 1000 lines for recent activity
RECENT_REQUESTS=$(tail -n 1000 "$SQUID_LOG" 2>/dev/null | awk -v cutoff="$ONE_MIN_AGO" '$1 > cutoff' | wc -l)

# Quick hit rate from last 200 requests only
TCP_HITS=$(tail -n 200 "$SQUID_LOG" 2>/dev/null | grep -c "TCP_HIT\|TCP_MEM_HIT\|TCP_IMS_HIT")
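
The hit count still has to be turned into a percentage. A minimal sketch of doing both steps in one awk pass follows; it assumes Squid's native access.log format, where field 4 looks like TCP_HIT/200, and recent_hit_rate is a hypothetical helper name rather than something from the script.

```shell
# Hypothetical helper: hit percentage over the last N lines of an access log.
# $1 = log path, $2 = number of lines to sample (defaults to 200).
recent_hit_rate() {
    tail -n "${2:-200}" "$1" 2>/dev/null | awk '
        { total++ }
        # Count the same three result codes as the grep: TCP_HIT, TCP_MEM_HIT, TCP_IMS_HIT
        $4 ~ /^TCP_(MEM_|IMS_)?HIT/ { hits++ }
        END {
            if (total > 0) printf "%.1f\n", (hits / total) * 100
            else           print "0.0"
        }
    '
}
```

Matching on field 4 rather than the whole line avoids counting a URL that happens to contain the text TCP_HIT.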

Creating Visual Health Indicators

To make the health status immediately apparent, I implemented a color-coded system with visual indicators:

# Function to create status dot in HTML
create_status_dot() {
    local status=$1
    local color=$(get_status_color "$status")
    local text=$(get_status_text "$status")
    
    echo "<div style='display: flex; align-items: center; gap: 8px;'>
        <div style='width: 12px; height: 12px; border-radius: 50%; 
             background-color: ${color}; box-shadow: 0 0 0 2px ${color}33;'></div>
        <span style='color: ${color}; font-weight: 500;'>${text}</span>
    </div>"
}

This produces a clean table with instant visual feedback:

🟢 Green = Good
🟡 Amber = Warning
🔴 Red = Critical
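
The create_status_dot function relies on two lookup helpers, get_status_color and get_status_text, that are not shown above. One plausible shape for them is sketched below; the hex colour values are illustrative assumptions, not the ones from the actual report.

```shell
# Sketch of the two lookup helpers used by create_status_dot.
# The hex values are assumed for illustration only.
get_status_color() {
    case "$1" in
        critical) echo "#dc2626" ;;  # red
        warning)  echo "#f59e0b" ;;  # amber
        *)        echo "#16a34a" ;;  # green, also used for unknown values
    esac
}

get_status_text() {
    case "$1" in
        critical) echo "Critical" ;;
        warning)  echo "Warning" ;;
        *)        echo "Good" ;;
    esac
}
```

Defaulting unknown values to green keeps the report rendering when a status variable was never set, though defaulting to amber would be the more cautious choice.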

Dynamic Content: Only Show What's Available

One key improvement was ensuring the HTML report only displays metrics that have actual data:

# Only show file descriptors if we have the data (FD_PERCENT is formatted to one decimal, e.g. "0.0")
if [ -n "$FD_PERCENT" ] && [ "$FD_PERCENT" != "0.0" ]; then
    echo "<tr>
        <td>File Descriptors</td>
        <td>${FD_USED}/${FD_TOTAL} (${FD_PERCENT}% used)</td>
        <td>$(create_status_dot "$FD_STATUS")</td>
    </tr>" >> /tmp/squid_health_report.html
fi
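
With more than a handful of metrics, repeating that if block for each one gets noisy. A small helper can carry the "skip when empty" rule instead. This is a sketch only: report_row is a hypothetical name, and it writes to standard output so the caller decides where the row goes.

```shell
# Hypothetical helper: emit a table row only when a value was actually collected.
# $1 = label, $2 = value text, $3 = pre-rendered status HTML
report_row() {
    # An empty value means the metric was unavailable, so print nothing
    [ -n "$2" ] || return 0
    printf '<tr><td>%s</td><td>%s</td><td>%s</td></tr>\n' "$1" "$2" "$3"
}
```

A call such as report_row "File Descriptors" "$FD_SUMMARY" "$FD_DOT" >> /tmp/squid_health_report.html would then add the row or nothing at all, where FD_SUMMARY and FD_DOT are placeholder variables for the formatted value and status markup.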

The Health Scoring Algorithm

I developed a weighted scoring system that focuses on proxy-critical metrics:

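# File descriptor check (most critical for proxies)
# [ -gt ] only compares integers, so drop the decimal part of values like "91.4"
FD_PERCENT_INT=${FD_PERCENT%.*}
if [ "${FD_PERCENT_INT:-0}" -gt 90 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 25))  # Heavy penalty
    FD_STATUS="critical"
elif [ "${FD_PERCENT_INT:-0}" -gt 75 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 10))
    FD_STATUS="warning"
fi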

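# Cache performance check
# Keep the first value only, then strip the % sign and decimals for the integer test
HIT_RATE_NUM=$(echo "$REQUEST_HIT_RATIO" | head -n 1 | sed 's/%//; s/\..*//')
if [ -n "$HIT_RATE_NUM" ] && [ "$HIT_RATE_NUM" -lt 20 ]; then
    HEALTH_SCORE=$((HEALTH_SCORE - 10))
    HIT_STATUS="warning"
fi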
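
Once every check has applied its penalty, the remaining score has to be mapped to an overall label. The cut-offs in this sketch (80 and 60) are assumptions chosen for illustration, as is the function name overall_status; the real script may use different bands.

```shell
# Hypothetical mapping from the numeric score to an overall label.
# The 80/60 thresholds are assumed, not taken from the original script.
overall_status() {
    score=$1
    # Stacked penalties can push the score below zero, so clamp it
    [ "$score" -lt 0 ] && score=0
    if   [ "$score" -ge 80 ]; then label="GOOD"
    elif [ "$score" -ge 60 ]; then label="DEGRADED"
    else                           label="CRITICAL"
    fi
    echo "$label (${score}%)"
}
```

Starting from HEALTH_SCORE=100, a server with one file descriptor warning (minus 10) would land at 90 and still report as GOOD under these assumed bands.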

Console Output for Quick Checks

The script provides a clean console summary with visual indicators:

═══════════════════════════════════════════════════════════════
                 SQUID PROXY HEALTH SUMMARY
═══════════════════════════════════════════════════════════════

▶ OVERALL HEALTH: GOOD (85%) ✓

QUICK STATUS CHECK
───────────────────────────────────────────────────────────────
System Health:      CPU ✓ Memory ✓ Disk ✓
Squid Health:       Service ✓ FDs ✓ Cache ✓

The final script executes in under 5 seconds, provides clear visual health indicators, and focuses on the metrics that actually matter for proxy server health monitoring.
