Multi-Factor Authentication (MFA) is a critical security component that helps prevent unauthorized access to company resources. We rely on local, on-premises servers running Entrust IdentityGuard for OTP (One-Time Password) delivery, which serves as a crucial security layer for our internal and external systems. However, the effectiveness of this security measure depends entirely on its availability and proper functioning.
I recently developed a PowerShell monitoring solution for our Entrust IdentityGuard OTP implementation that has proven invaluable in ensuring system reliability. This blog post details my approach to monitoring MFA systems in a dual-server environment.
This is the monitoring side of this post here
Visual Results
This is a healthy test result, where you can see that both servers are processing login requests:
This is an example of a critical alert, where one server has been inactive for several days while the other server is healthy:
Finally, this shows the output of the script when it is run:
The Challenge
Our Entrust IdentityGuard implementation runs on two servers in an active-active configuration. This redundancy is designed to ensure high availability, but it introduces monitoring complexities:
- How do we verify that both servers are actively processing authentication requests?
- How can we detect when one server is doing all the work while the other remains idle?
- How do we know if the entire OTP system has stopped functioning?
When our MFA system fails, the consequences are severe - users can't access critical business applications, productivity plummets, and our service desk gets flooded with tickets. I needed a monitoring solution that would alert us to problems before they escalated to full outages.
The Solution: Audit Log Monitoring
After analyzing our Entrust deployment, I discovered that the key to monitoring the health of our MFA system lies in the identityguard_audit.log files. These logs contain detailed records of OTP creation and delivery events.
The solution I developed is a PowerShell script that:
- Remotely connects to both MFA servers
- Analyzes the audit logs for specific OTP events
- Determines the age of the most recent events
- Alerts the IT team based on predefined thresholds
Tracking the Right Events
The script specifically looks for two sequential log entries that indicate successful OTP operations:
First, an OTP creation event:
[2025-04-28 15:43:18,136] [IG Audit Writer] [INFO ] [IG.AUDIT] [AUD3003] [OTP/username]
One time password with index 468 created for user OTP/username.
Expiry Date: 2025-04-28 15:53:18
Followed by an OTP delivery event:
[2025-04-28 15:43:18,281] [IG Audit Writer] [INFO ] [IG.AUDIT] [AUD3008] [OTP/username]
One time password delivered to user OTP/username.
Contact info label: Mobile Phone, Contact info value: 075*****98,
Delivery configuration label: Mobile
These paired events confirm that the system is not only generating OTPs but also successfully delivering them to users.
Here's how I parse these events in PowerShell:
# Function to extract OTP events from audit log
function Get-OTPEvents {
    param (
        [string]$serverPath,
        [string]$serverName
    )
    $auditLogPath = Join-Path -Path $serverPath -ChildPath "identityguard_audit.log"
    try {
        if (Test-Path -Path $auditLogPath) {
            Write-Log ("Analyzing audit log on $serverName...")
            # Read the file content
            $logContent = Get-Content -Path $auditLogPath -ErrorAction Stop
            # Collect the last 20 OTP creation events (oldest first, newest last).
            # @() keeps a single match as an array so indexing returns a line, not a character.
            $createEvents = @($logContent |
                Where-Object { $_ -match '\[AUD3003\].*One time password with index \d+ created for user' } |
                Select-Object -Last 20)
            if ($createEvents.Count -gt 0) {
                # Take the most recent creation event
                $latestCreateEvent = $createEvents[-1]
                # Extract the timestamp and username from the creation event
                if ($latestCreateEvent -match '\[([\d\-]+ [\d:,]+)\].*\[AUD3003\] \[(.*?)\]') {
                    $eventTimeStr = $matches[1]
                    $username = $matches[2]
                    # Escape the username so regex metacharacters in it are matched literally
                    $userPattern = '\[AUD3008\] \[' + [regex]::Escape($username) + '\]'
                    # Find the most recent delivery event for the same user
                    $deliverEvent = $logContent |
                        Where-Object { $_ -match $userPattern -and
                            $_ -match "One time password delivered to user" } |
                        Select-Object -Last 1
                    # Parse with an explicit format so the result does not depend on the local culture
                    $eventTime = [DateTime]::ParseExact($eventTimeStr, 'yyyy-MM-dd HH:mm:ss,fff',
                        [System.Globalization.CultureInfo]::InvariantCulture)
                    # Return the event details
                    return @{
                        CreateEventLine  = $latestCreateEvent
                        DeliverEventLine = $deliverEvent
                        EventTime        = $eventTime
                        EventUser        = $username
                        Status           = "Success"
                    }
                }
            }
        }
    }
    catch {
        # $_ is the error that triggered this catch block
        $errorMsg = $_.Exception.Message
        Write-Log ("Error accessing audit log: " + $errorMsg)
    }
    # Return error status if we couldn't find events
    return @{
        Status       = "Error"
        ErrorMessage = "No valid OTP events found"
    }
}
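To sanity-check the timestamp and username extraction without touching a live server, the same pattern can be run against the sample AUD3003 entry shown earlier. This is only a minimal sketch: it assumes each audit entry sits on a single line in the real log file, and the $sampleLine and $pattern variables are introduced here purely for illustration.

```powershell
# Sample AUD3003 entry from this post, assumed to be a single line in the real log
$sampleLine = '[2025-04-28 15:43:18,136] [IG Audit Writer] [INFO ] [IG.AUDIT] [AUD3003] [OTP/username] One time password with index 468 created for user OTP/username. Expiry Date: 2025-04-28 15:53:18'

# Capture the leading timestamp and the bracketed username that follows the event code
$pattern = '^\[([\d\-]+ [\d:,]+)\].*\[AUD3003\] \[(.*?)\]'

if ($sampleLine -match $pattern) {
    # Parse with an explicit format so the result does not depend on the machine's culture
    $eventTime = [DateTime]::ParseExact(
        $Matches[1],
        'yyyy-MM-dd HH:mm:ss,fff',
        [System.Globalization.CultureInfo]::InvariantCulture)
    $username = $Matches[2]

    "Time: {0:yyyy-MM-dd HH:mm:ss.fff}  User: {1}" -f $eventTime, $username
}
```

If the pattern ever stops matching after an Entrust upgrade, a quick test like this is an easy way to see whether the log format has changed.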
Alert Thresholds and Notifications
I implemented a tiered alert system based on the age of the most recent OTP events:
- Green Status (Normal): OTP events less than 1 day old
- Warning Status: OTP events between 1 and 2 days old
- Critical Status: OTP events more than 2 days old or missing on either server
This approach gives us adequate time to investigate and resolve issues before they impact users.
The alert logic looks like this:
# Configure alert thresholds
$warningThresholdDays = 1 # 1 day old or more = warning
$criticalThresholdDays = 2 # 2 days old or more = critical
# Reference time for the age calculations (assumed to be set once per run)
$currentDate = Get-Date
# Determine overall status based on event age
if ($server1Events.Status -eq "Success" -and $server2Events.Status -eq "Success") {
    # Calculate event ages
    $server1EventAgeDays = ($currentDate - $server1Events.EventTime).TotalDays
    $server2EventAgeDays = ($currentDate - $server2Events.EventTime).TotalDays
    # Use the oldest event age to determine status
    $oldestEventAgeDays = [Math]::Max($server1EventAgeDays, $server2EventAgeDays)
    if ($oldestEventAgeDays -lt $warningThresholdDays) {
        $status = "Normal"
        $statusDescription = "OTP events are current (less than $warningThresholdDays day old) on both servers."
    }
    elseif ($oldestEventAgeDays -lt $criticalThresholdDays) {
        $status = "Warning"
        $statusDescription = "OTP events are between $warningThresholdDays and $criticalThresholdDays days old."
    }
    else {
        $status = "Critical"
        $statusDescription = "OTP events are $criticalThresholdDays days old or older."
    }
}
else {
    # Events are missing or unreadable on at least one server
    $status = "Critical"
    $statusDescription = "OTP events could not be found on one or both servers."
}
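The threshold comparison is easier to test when it is pulled out into a small function that takes only an age in days. The sketch below is one way to do that; Get-OtpStatusFromAge is a hypothetical helper name and is not part of the script shown in this post.

```powershell
# Hypothetical helper: map the age (in days) of the oldest OTP event to a status name
function Get-OtpStatusFromAge {
    param (
        [Parameter(Mandatory)] [double]$AgeDays,
        [double]$WarningThresholdDays = 1,
        [double]$CriticalThresholdDays = 2
    )
    # Same comparisons as the alert logic: below the warning threshold is Normal,
    # below the critical threshold is Warning, anything else is Critical
    if ($AgeDays -lt $WarningThresholdDays) { return 'Normal' }
    if ($AgeDays -lt $CriticalThresholdDays) { return 'Warning' }
    return 'Critical'
}

Get-OtpStatusFromAge -AgeDays 0.42   # Normal
Get-OtpStatusFromAge -AgeDays 1.5    # Warning
Get-OtpStatusFromAge -AgeDays 2      # Critical
```

Note that an event exactly at a threshold falls into the more severe band, which matches the "2 days old or older" wording of the critical message.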
The email alerts include:
- Current status with color-coded indicators
- Timestamps of the latest events on each server
- The age of the latest events
- Actual log entries for troubleshooting
- Username of the last user to receive an OTP
Here's an example of the information included in our alerts:
Server: otp.server.1
Latest OTP Event: 2025-04-28 15:43:18
Event Age: 0.42 days
OTP User: OTP/username
Latest OTP Events:
[2025-04-28 15:43:18,136] [IG Audit Writer] [INFO ] [IG.AUDIT] [AUD3003]
[OTP/username] One time password with index 468 created for user OTP/username.
Expiry Date: 2025-04-28 15:53:18
[2025-04-28 15:43:18,281] [IG Audit Writer] [INFO ] [IG.AUDIT] [AUD3008]
[OTP/username] One time password delivered to user OTP/username.
Contact info label: Mobile Phone, Contact info value: 075*****98,
Delivery configuration label: Mobile
I use HTML for formatting the emails to make them more readable and visually informative:
# CSS for email styling
$emailCSS = @"
<style type="text/css">
    body {
        font-family: Arial, sans-serif;
        color: #333333;
    }
    .status-card {
        border-radius: 5px;
        padding: 15px;
        margin-bottom: 20px;
    }
    .status-normal {
        background-color: #4CAF50;
        color: white;
    }
    .status-warning {
        background-color: #FF9800;
        color: white;
    }
    .status-critical {
        background-color: #F44336;
        color: white;
    }
    .event-log {
        padding: 15px;
        background-color: #f0f0f0;
        font-family: Consolas, monospace;
        font-size: 13px;
        white-space: pre-wrap;
    }
</style>
"@
# Build the email with HTML formatting
$emailBody = @"
<!DOCTYPE html>
<html>
<head>$emailCSS</head>
<body>
    <div class="status-card $statusCardClass">
        <div>$statusText</div>
    </div>
    <p>$statusDescription</p>
    <h3>$server1Name</h3>
    <div>Latest OTP Event: $server1EventDate</div>
    <div>Event Age: $server1EventAge</div>
    <div>OTP User: $server1EventUser</div>
    <div class="event-log">
$server1CreateEvent
$server1DeliverEvent
    </div>
    <!-- Server 2 information follows with the same format -->
</body>
</html>
"@
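The template above relies on $statusCardClass and $statusText, which are set elsewhere in the script. A minimal sketch of how they could be derived from the status is shown below; the $statusClassMap lookup and the wording of $statusText are assumptions for illustration. It also shows HTML-encoding raw text before embedding it, which is a sensible precaution if a log line could ever contain characters such as < or &.

```powershell
# Hypothetical lookup from status name to the CSS classes defined in $emailCSS
$statusClassMap = @{
    Normal   = 'status-normal'
    Warning  = 'status-warning'
    Critical = 'status-critical'
}

$status = 'Warning'
$statusCardClass = $statusClassMap[$status]
$statusText = "MFA OTP status: $status"   # wording is an assumption

# Encode raw text so it cannot be interpreted as markup in the email body
$rawText = 'example <tag> & text'
$safeText = [System.Net.WebUtility]::HtmlEncode($rawText)

$statusCardClass
$safeText
```

A hashtable lookup keeps the class names in one place, so adding a new status later only needs one extra entry plus a matching CSS rule.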
Addressing Server-Specific Issues
One of the key insights that prompted this monitoring solution was discovering that sometimes one MFA server would become inactive while the other handled all authentication traffic. While users didn't experience an outage, this situation was dangerous - if the active server failed, we'd face a complete MFA outage.
The script specifically checks for this condition by comparing event timestamps across both servers. If it detects a significant discrepancy (logs on one server being much older than on the other), it generates a warning so I can investigate and restore proper load distribution.
# Check if only one server is being used
if ($server1Events.Status -eq "Success" -and $server2Events.Status -eq "Success") {
    # Calculate the difference between server event timestamps in days
    $dateDiff = [Math]::Abs(($server1Events.EventTime - $server2Events.EventTime).TotalDays)
    if ($dateDiff -gt 5) { # More than 5 days difference
        # The server whose latest event is older is the one that has gone quiet
        $inactiveServer = if ($server1Events.EventTime -lt $server2Events.EventTime) { $server1Name } else { $server2Name }
        # Raise to Warning, but never downgrade a status that is already Critical
        if ($status -ne "Critical") { $status = "Warning" }
        $statusDescription = "Server $inactiveServer may not be participating in authentication. " +
            "The timestamp difference between servers is $([Math]::Round($dateDiff, 1)) days."
    }
}
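The same comparison can be wrapped in a function that works on two timestamps alone, which makes it possible to try out the 5-day rule with made-up dates. This is a sketch only; Get-InactiveServer and its parameter names are hypothetical.

```powershell
# Hypothetical helper: return the name of the server whose latest event is older
# when the gap between the two servers exceeds the limit, otherwise $null
function Get-InactiveServer {
    param (
        [datetime]$Server1Time,
        [datetime]$Server2Time,
        [string]$Server1Name,
        [string]$Server2Name,
        [double]$MaxGapDays = 5
    )
    $gapDays = [Math]::Abs(($Server1Time - $Server2Time).TotalDays)
    if ($gapDays -le $MaxGapDays) { return $null }
    if ($Server1Time -lt $Server2Time) { return $Server1Name } else { return $Server2Name }
}

$now = Get-Date
# Server 1 last issued an OTP two weeks ago, server 2 just now
Get-InactiveServer -Server1Time ($now.AddDays(-14)) -Server2Time $now `
    -Server1Name 'otp.server.1' -Server2Name 'otp.server.2'   # otp.server.1
```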
Technical Implementation
The PowerShell script runs as a scheduled task every hour on our monitoring server. It connects to both MFA servers using UNC paths to access the log files:
# Configuration parameters
$server1Path = "\\otpserver1\c$\Program Files\Entrust\IdentityGuard\identityguard130\logs"
$server2Path = "\\otpserver2\c$\Program Files\Entrust\IdentityGuard\identityguard130\logs"
$server1Name = "otp.server.1"
$server2Name = "otp.server.2"
$auditLogName = "identityguard_audit.log"
# Email configuration
$smtpServer = "smtprelay.bear.local"
$emailFrom = "mfa-monitor@bythepowerofgreyskull.com"
$emailTo = @("lee@bythepowerofgreyskull.com")
$emailSubject = "MFA OTP Events Alert - Entrust Identity Guard"
The script uses several functions to handle different aspects of the monitoring process:
# Main execution flow
Write-Log "MFA OTP event monitoring script started"
$currentDate = Get-Date
# Get OTP events from both servers
$server1Events = Get-OTPEvents -serverPath $server1Path -serverName $server1Name
$server2Events = Get-OTPEvents -serverPath $server2Path -serverName $server2Name
# Format event information (EventTime is only present when Status is "Success")
if ($server1Events.Status -eq "Success") {
    $server1EventDate = $server1Events.EventTime.ToString("yyyy-MM-dd HH:mm:ss")
    $server1EventAge = [Math]::Round(($currentDate - $server1Events.EventTime).TotalDays, 2)
}
# Determine status and prepare alert
$status = DetermineAlertStatus -server1Events $server1Events -server2Events $server2Events
# Send alert when needed
if ($status -ne "Normal" -or $forceAlert) {
    $emailBody = Get-EmailContent -status $status -server1Events $server1Events -server2Events $server2Events
    Send-AlertEmail -body $emailBody
}
Write-Log "Monitoring completed. Final status: $status"
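The snippets in this post call Write-Log and Send-AlertEmail without showing them. The sketch below is one plausible shape for those two helpers, not the exact versions from my script: the $logFile location is a hypothetical choice, and Send-AlertEmail assumes the $smtpServer, $emailFrom, $emailTo and $emailSubject variables from the configuration block are in scope. Send-MailMessage still ships with PowerShell, although Microsoft now flags it as obsolete.

```powershell
# Hypothetical log location; adjust to suit the monitoring server
$logFile = Join-Path -Path ([System.IO.Path]::GetTempPath()) -ChildPath 'mfa-otp-monitor.log'

function Write-Log {
    param ([string]$Message)
    # Prefix each entry with a timestamp, then write it to the log file and the console
    $line = "[{0}] {1}" -f (Get-Date -Format 'yyyy-MM-dd HH:mm:ss'), $Message
    Add-Content -Path $logFile -Value $line
    Write-Host $line
}

function Send-AlertEmail {
    param ([string]$Body)
    # Uses the SMTP settings from the configuration block; the body is sent as HTML
    Send-MailMessage -SmtpServer $smtpServer -From $emailFrom -To $emailTo `
        -Subject $emailSubject -Body $Body -BodyAsHtml
}

Write-Log "MFA OTP event monitoring script started"
```

Writing to both a file and the console means the same function is useful when the script runs interactively and when it runs unattended as a scheduled task.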
Results and Benefits
Since implementing this monitoring solution, I've been able to:
- Detect MFA configuration issues before they affect users
- Identify load balancing problems between servers
- Verify successful OTP delivery to end users
- Reduce MFA-related downtime by catching issues early
- Establish baseline activity patterns for normal operation
In one instance, the script alerted me to a situation where one server hadn't processed an OTP event for over two weeks. Investigation revealed a configuration issue preventing the server from participating in the authentication flow. Without this monitoring, we might not have discovered the problem until a critical failure occurred.