prod@blog:~$

Automating Cleanup of Folder-ZIP Pairs in Shared Directories


I recently needed to solve a problem with accumulating folders and ZIP files in our shared directory. Our process creates folders, compresses them into ZIP files, and then uploads those ZIPs elsewhere. The challenge was creating an automated cleanup solution that wouldn't interfere with files still being processed.

The Problem

As part of an internal process, we generate folders containing documents, compress them into ZIP files, and upload those ZIPs to OneDrive for long-term storage. After the upload completes, both the folder and the ZIP file remain on the file share, gradually consuming disk space. Over time this became a significant issue, as hundreds of these folder-ZIP pairs accumulated and ate up valuable storage on our file servers.

Manual cleanup wasn't practical because I couldn't easily tell which files were safe to delete. A ZIP file might look old but still be uploading, especially for large archives that can take 20-30 minutes to transfer. Deleting too early would cause upload failures and data loss. Waiting too long meant wasted disk space and potential issues when the volume filled up.

Understanding the Workflow

Before writing the script, I needed to understand our exact workflow timing. I monitored the system for several days and discovered that our process follows this pattern:

  1. A folder gets created with documents inside
  2. Within 1-2 minutes, the folder is compressed into a ZIP file
  3. The ZIP file begins uploading to OneDrive immediately after creation
  4. Upload times vary from 30 seconds for small files to 30 minutes for large archives
  5. After successful upload, both the folder and ZIP serve no further purpose

This timing information was crucial for setting appropriate safety thresholds in my cleanup script.
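If you want to gather the same kind of timing data for your own environment, you can compare folder and ZIP creation times directly. The sketch below is a rough helper, not part of the production script; it assumes the same naming convention used throughout this post (a ZIP named "FolderName.zip" sitting beside its folder), and the root path is a placeholder you'd point at your own share.

```powershell
# Sketch: measure folder-to-ZIP creation gaps to help choose a safe threshold.
# Assumes the "<FolderName>.zip next to <FolderName>" convention described above.
function Get-FolderZipGaps {
    param ([string]$RootPath)
    foreach ($folder in Get-ChildItem -Path $RootPath -Directory) {
        $zipPath = Join-Path $RootPath ($folder.Name + ".zip")
        if (Test-Path $zipPath) {
            $zip = Get-Item $zipPath
            # Minutes between folder creation and ZIP creation
            [math]::Abs(($zip.CreationTime - $folder.CreationTime).TotalMinutes)
        }
    }
}

# Example: summarize the gaps before settling on $MinFolderZipGapMinutes
# Get-FolderZipGaps -RootPath "\\smbserver1.local\ZipFiles" |
#     Measure-Object -Minimum -Maximum -Average
```

Running this over a few days of accumulated pairs gives you a distribution of real gaps, which is a much better basis for a threshold than a guess.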

The Solution

I wrote a PowerShell script that carefully validates multiple conditions before deleting anything. The script checks timing relationships between folders and their corresponding ZIP files, ensures files are old enough that any upload would have completed, and verifies that files aren't currently locked by another process.

Here's the complete script:

# ================= CONFIG =================
$RootPath = "\\smbserver1.local\ZipFiles"
$LogFile  = "$RootPath\Cleanup.log"

$MinFolderZipGapMinutes = 1
$MinZipAgeMinutes       = 35

# ================= LOG FUNCTION =================
function Write-Log {
    param ([string]$Message)
    $timestamp = Get-Date -Format "yyyy-MM-dd HH:mm:ss"
    "$timestamp - $Message" | Out-File -FilePath $LogFile -Append -Encoding UTF8
}

# ================= FILE LOCK CHECK =================
function Test-FileLocked {
    param ([string]$Path)
    try {
        $stream = [System.IO.File]::Open($Path, 'Open', 'ReadWrite', 'None')
        $stream.Close()
        return $false  # Not locked
    }
    catch {
        return $true   # Locked / in use
    }
}

Write-Log "----- Cleanup run started -----"

# ================= MAIN LOGIC =================
$Folders = Get-ChildItem -Path $RootPath -Directory -ErrorAction Stop

foreach ($Folder in $Folders) {
    $ZipPath = Join-Path $RootPath ($Folder.Name + ".zip")

    if (-not (Test-Path $ZipPath)) {
        Write-Log "ZIP missing for folder '$($Folder.Name)' — skipping"
        continue
    }

    $Zip = Get-Item $ZipPath
    $FolderTime = $Folder.CreationTime
    $ZipTime    = $Zip.CreationTime
    $Now        = Get-Date

    $FolderZipGapMinutes = [math]::Abs(($ZipTime - $FolderTime).TotalMinutes)
    $ZipAgeMinutes       = ($Now - $ZipTime).TotalMinutes

    Write-Log "Checking '$($Folder.Name)'"
    Write-Log " - Folder/ZIP gap: $([math]::Round($FolderZipGapMinutes,2)) mins (minimum $MinFolderZipGapMinutes mins)"
    Write-Log " - ZIP age: $([math]::Round($ZipAgeMinutes,2)) mins (minimum $MinZipAgeMinutes mins)"

    if ($FolderZipGapMinutes -lt $MinFolderZipGapMinutes) {
        Write-Log "SKIP: Folder/ZIP gap < $MinFolderZipGapMinutes minutes"
        continue
    }
    else {
        Write-Log "PASS: Folder/ZIP gap >= $MinFolderZipGapMinutes minutes"
    }

    if ($ZipAgeMinutes -lt $MinZipAgeMinutes) {
        Write-Log "SKIP: ZIP age < $MinZipAgeMinutes minutes"
        continue
    }
    else {
        Write-Log "PASS: ZIP age >= $MinZipAgeMinutes minutes"
    }

    if (Test-FileLocked $Zip.FullName) {
        Write-Log "SKIP: ZIP is currently in use / being uploaded"
        continue
    }
    else {
        Write-Log "PASS: ZIP is not locked"
    }

    try {
        Remove-Item -Path $Zip.FullName -Force -ErrorAction Stop
        Write-Log "DELETED ZIP: $($Zip.FullName)"

        Remove-Item -Path $Folder.FullName -Recurse -Force -ErrorAction Stop
        Write-Log "DELETED FOLDER: $($Folder.FullName)"
    }
    catch {
        Write-Log "ERROR deleting '$($Folder.Name)': $($_.Exception.Message)"
    }
}

Write-Log "----- Cleanup run completed -----"

How It Works in Detail

The script implements a three-layer safety approach that has proven extremely reliable in production. Let me walk through each component and explain why it's necessary.

Configuration Section

The configuration variables at the top make it easy to adapt the script to different environments. $RootPath points to the shared directory to clean, and $LogFile determines where the audit log is written. The two timing variables are critical for safety and must be tuned to your specific workflow.

Logging Function

The Write-Log function creates a detailed audit trail of every decision the script makes. This isn't just for troubleshooting; it's essential for compliance and for proving that files were deleted appropriately. Each log entry includes a timestamp and a clear description of what happened.

File Lock Detection

The Test-FileLocked function is perhaps the most important safety feature. It attempts to open each ZIP file with exclusive access before deletion. If any other process has the file open, such as our OneDrive upload process, the function returns true and the script skips that file. This check has prevented numerous potential data loss incidents during slow uploads.

The Main Process Loop

For each folder in the target directory, the script looks for a matching ZIP file with the same name. If found, it performs three sequential checks:

First, it calculates the time difference between when the folder was created and when the ZIP was created. In my environment, the ZIP normally appears one to two minutes after its folder, so I require at least a 1-minute gap to ensure compression has completed. This prevents the script from interfering with the compression process.

Second, it checks how old the ZIP file is. I set this to 35 minutes because our OneDrive uploads usually complete within 30 minutes, even for large files. This provides a comfortable safety margin for those occasional slow uploads during peak network usage.

Third, it attempts to acquire an exclusive lock on the ZIP file. Even if both timing checks pass, this final verification ensures no other process is actively using the file.

Only when all three checks pass does the script proceed with deletion, removing the ZIP file first and then the folder.
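To make the cleanup genuinely hands-off, the script needs to run on a schedule. One way to do that on Windows is a scheduled task; the sketch below is illustrative only. The script path (C:\Scripts\Cleanup.ps1), the task name, and the 15-minute interval are all placeholder assumptions, not part of my actual setup, and the account the task runs under must have delete permissions on the share.

```powershell
# Sketch: run the cleanup script every 15 minutes via Task Scheduler.
# Script path, task name, and interval are placeholders; adjust to your environment.
$action  = New-ScheduledTaskAction -Execute "powershell.exe" `
    -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Cleanup.ps1"
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Minutes 15)
Register-ScheduledTask -TaskName "FolderZipCleanup" -Action $action -Trigger $trigger
```

Whatever interval you pick, make sure it is longer than a single run takes to complete, so two instances never overlap on the same share.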

Real-World Example

Here's what the script looks like in action, taken from an actual log file during my testing phase:

2026-01-06 10:36:10 - ----- Cleanup run started -----
2026-01-06 10:36:10 - Checking 'REQ586697 - Bond' | Gap: 1.46 mins | Zip Age: 1368.1 mins
2026-01-06 10:36:10 - SKIP: Folder/ZIP gap < 2 minutes
2026-01-06 10:36:10 - ----- Cleanup run completed -----

In this case, I had temporarily increased the gap requirement to 2 minutes for testing. The folder "REQ586697 - Bond" and its ZIP file were created 1.46 minutes apart, which didn't meet the threshold. Even though the ZIP was almost 23 hours old (1,368 minutes), the script correctly skipped it based on the gap check.

After adjusting the gap requirement back to 1 minute, the next run successfully cleaned up the files:

2026-01-06 10:38:56 - ----- Cleanup run started -----
2026-01-06 10:38:56 - Checking 'REQ586697 - Bond' | Gap: 1.46 mins | Zip Age: 1370.88 mins
2026-01-06 10:38:56 - DELETED ZIP: \\smbserv1.bear.local\Uploads\REQ586697 - Bond.zip
2026-01-06 10:38:56 - DELETED FOLDER: \\smbserv1.bear.local\Uploads\REQ586697 - Bond
2026-01-06 10:38:56 - ----- Cleanup run completed -----

Notice how the script logs every check and decision. This level of detail has proven invaluable when users ask about specific files or when auditors need to verify our retention policies.

Why These Specific Checks Matter

Each safety check addresses a specific risk I identified during development:

The folder-ZIP gap check prevents the script from deleting files during the compression process. The compression itself completes in seconds for most files once it starts; the 1-minute gap mostly reflects the workflow's own delay between creating a folder and zipping it. However, I've seen antivirus scanning or disk I/O bottlenecks extend that time, so the buffer is necessary.

The 35-minute age requirement provides protection against deleting files during upload. I chose this conservative timeframe after monitoring our upload patterns for two weeks and finding that even our largest files (up to 2GB) complete within 30 minutes. The extra 5 minutes accounts for any network congestion or OneDrive service delays.

The file lock check is the final safety net. Even if the timing checks pass, this prevents deletion if any process still has the file open. This has saved me from deleting files during unexpectedly slow uploads on several occasions, particularly when OneDrive was experiencing service degradation.

Lessons Learned and Best Practices

Through developing and deploying this solution, I've learned several important lessons:

  1. Always log everything: The detailed logging has helped me troubleshoot edge cases and prove compliance with our data retention policies. When a user reported a missing file, I could show exactly when it was deleted and that all safety checks passed.
  2. Be conservative with timing: My initial testing used a 20-minute age requirement, but I increased it to 35 minutes after observing some uploads taking longer during network maintenance windows. It's better to use a bit more disk space temporarily than to lose data.
  3. Test with production data: I initially tested with small files that compressed and uploaded quickly. Only when I tested with real production data did I discover that some of our ZIP files can exceed 2GB and take much longer to process.
  4. Monitor continuously: I review the logs weekly to ensure the script is working as expected and to identify any patterns that might require adjustment to the timing parameters.
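The weekly log review can be partially automated as well. Here's a hedged sketch that tallies outcomes from the log format produced by Write-Log above; the Get-CleanupSummary function name is my own, and the patterns assume the exact "DELETED", "SKIP:", and "ERROR" prefixes the script writes.

```powershell
# Sketch: summarize a cleanup log by outcome (deleted / skipped / errors).
# Assumes the "yyyy-MM-dd HH:mm:ss - MESSAGE" lines written by Write-Log.
function Get-CleanupSummary {
    param ([string[]]$LogLines)
    $counts = [ordered]@{ Deleted = 0; Skipped = 0; Errors = 0 }
    foreach ($line in $LogLines) {
        switch -Regex ($line) {
            ' - DELETED ' { $counts.Deleted++ }
            ' - SKIP: '   { $counts.Skipped++ }
            ' - ERROR '   { $counts.Errors++ }
        }
    }
    [pscustomobject]$counts
}

# Example:
# Get-CleanupSummary -LogLines (Get-Content "\\smbserver1.local\ZipFiles\Cleanup.log")
```

A sudden spike in skips or errors is usually the first sign that upload times have drifted and the thresholds need another look.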

Conclusion

This automated cleanup solution has transformed what was once a manual, error-prone process into a reliable, hands-off system. The combination of timing checks and file lock verification provides multiple layers of protection against data loss while ensuring efficient disk space usage.

If you're facing similar challenges with temporary file cleanup, I encourage you to adapt this approach to your environment. Start by monitoring your workflow timing, be conservative with your safety thresholds, and always include file lock checking as your final safety net. With these precautions in place, you can automate cleanup with confidence, knowing that your data remains protected while your disk space is efficiently managed.