PowerShell : Searching CSV files for matching data


This was another requirement I came across, this was to search a large amount of CSV files in folder and get a script to match the data in the CSV file, so this is what the folder looks like:



The contents of the CSV will contain a header field which is formatted for this example like this:

Message Date,Text,FirstName,LastName


Then the data is formatted in this format separated by commas, hence the CSV file that looks like this:


All the files contain the same data, except some the files have this line removed, that line has been removed to test the script works:

2023-08-21 19:54:14,Honey for the world, lots of Honey,Harper,Quibblequack,

Now lets move on to the script:

# Define the date range criteria
$startDate = Get-Date "2023-08-21"
$endDate = Get-Date "2023-10-25"
$desiredKeywords = "Honey", "keyword2", "keyword3"  

# Specify the folder path containing the CSV files
$folderPath = "<path to folder>"

# Get a list of CSV files in the folder
$csvFiles = Get-ChildItem -Path $folderPath -Filter *.csv

# Loop through each CSV file
foreach ($csvFile in $csvFiles) {
    # Read the CSV data
    $csvData = Import-Csv -Path $csvFile.FullName
    # Extract date and time using regex pattern
    $datePattern = "\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
    $messageDateMatch = [regex]::Match($csvData."Message Date", $datePattern)
    if ($messageDateMatch.Success) {
        $messageDate = [datetime]::ParseExact($messageDateMatch.Value, "yyyy-MM-dd HH:mm:ss", $null)

# Check if "Message Date" is within the desired date range
        if ($messageDate -ge $startDate -and $messageDate -le $endDate) {

# Check if "Text" contains the desired keyword (case-insensitive)

            if ($csvData."Text" -match "(?i)$desiredKeywords") {
               Write-Host "File $($csvFile.Name) matches date range and contains the keyword."
            }
        }
    }
}


The one highlighted in blue have the keywords and valid dates contained within the data, therefore the ones not highlighted should not be included in the PowerShell results:


Now when we run the script we get this, which is exactly what we want, all the file with the data in them excluding the files that have the data missing:


Just for completeness, lets add the keyword of "Honey" to a missing file and then run the script again: 


Right, job done, now we can zip though large amounts of CSV files in seconds!


Previous Post Next Post

نموذج الاتصال