Tuesday, September 24, 2024

Save hash for all files in a directory and list duplicates with PowerShell

PowerShell scripts can be saved with the extension ".ps1". Alternatively, the code can be pasted directly into a PowerShell window and run by pressing Enter.
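As a rough sketch of the first option, a saved script can be started like this. It assumes the script was saved as "get_sha512_hashes.ps1" in the folder that holds the files, and that Windows PowerShell (powershell.exe) is being used; in PowerShell 7 the executable is called pwsh instead.

```powershell
# From a PowerShell prompt opened in the folder that holds the files:
.\get_sha512_hashes.ps1

# If the execution policy blocks scripts, one option is to bypass it
# for this single invocation only (assumes Windows PowerShell):
powershell -ExecutionPolicy Bypass -File .\get_sha512_hashes.ps1
```

Only one of the two commands is needed. The second leaves the system-wide execution policy unchanged.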

This PowerShell script, "get_sha512_hashes.ps1", computes the SHA-512 hash of every file in the current directory and saves each hash together with its file name to "hashes.txt":

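# Collect the files first so that progress can be shown as "n/total".
# hashes.txt is skipped so that output from an earlier run is not hashed as well.
$files = @(Get-ChildItem -File | Where-Object { $_.Name -ne "hashes.txt" })
$totalFiles = $files.Count
$currentFile = 0
$files | ForEach-Object {
    # Inside catch, $_ is the error record, so keep the file in its own variable.
    $file = $_
    try {
        $currentFile++
        Write-Host "Hashing ${currentFile}/${totalFiles}: $($file.Name)"
        # -ErrorAction Stop turns read errors into exceptions that catch can handle.
        $hash = Get-FileHash -Algorithm SHA512 -LiteralPath $file.FullName -ErrorAction Stop
        if ($hash.Hash) {
            "$($hash.Hash) $($file.Name)"
        } else {
            "$($file.Name) - Hash generation failed!"
        }
    } catch {
        "$($file.Name) - Error generating hash: $_"
    }
} | Out-File "hashes.txt"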

This second script reads "hashes.txt" and lists only the duplicates, that is, files with different names but the same hash:

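$hashLines = Get-Content "hashes.txt"
# Maps each hash to the list of lines (hash plus file name) that share it.
$hashTable = @{}
foreach ($line in $hashLines) {
    # Skip blank lines and failure messages: a valid line starts with 128 hex digits (SHA-512).
    if ($line -notmatch '^[0-9A-Fa-f]{128} ') { continue }
    # Split at the first space only, so file names containing spaces stay intact.
    $hash = ($line -split " ", 2)[0]
    if ($hashTable.ContainsKey($hash)) {
        $hashTable[$hash] += $line
    } else {
        $hashTable[$hash] = @($line)
    }
}
foreach ($key in $hashTable.Keys) {
    # More than one line under the same hash means the files have identical content.
    if ($hashTable[$key].Count -gt 1) {
        Write-Host "Duplicate hash found: $key"
        $hashTable[$key] | ForEach-Object { Write-Host $_ }
        Write-Host "----------------------"
    }
}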
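If the intermediate "hashes.txt" is not needed, the two steps above can probably be combined into a single pipeline with Group-Object. The following is only a sketch, not part of the original scripts: the function name Get-DuplicateFileGroup and its -Path parameter are made up for illustration, and it assumes PowerShell 4.0 or later, where Get-FileHash is available.

```powershell
# Hypothetical helper (name and parameter are assumptions, not from the scripts above):
# hashes every file in $Path and returns only the groups of files that share a hash.
function Get-DuplicateFileGroup {
    param([string]$Path = ".")
    Get-ChildItem -LiteralPath $Path -File |
        Get-FileHash -Algorithm SHA512 |
        Group-Object -Property Hash |
        Where-Object { $_.Count -gt 1 }
}

# List the duplicates found in the current directory.
$groups = @(Get-DuplicateFileGroup -Path ".")
foreach ($group in $groups) {
    # For groups created by Group-Object, Name holds the shared hash value.
    Write-Host "Duplicate hash found: $($group.Name)"
    # Each member is a Get-FileHash result; Path is the full path of the file.
    $group.Group | ForEach-Object { Write-Host $_.Path }
    Write-Host "----------------------"
}
```

The trade-off is that no hash list is kept on disk, so every file is hashed again on each run. For large directories, the two-step approach with "hashes.txt" may be preferable.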

