https://github.com/theohbrothers/get-duplicateitem
Gets duplicate or non-duplicate files.
duplicate duplicate-files hash powershell pwsh search unique unique-files
- Host: GitHub
- URL: https://github.com/theohbrothers/get-duplicateitem
- Owner: theohbrothers
- License: apache-2.0
- Created: 2018-06-14T05:33:38.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2025-02-13T11:22:38.000Z (8 months ago)
- Last Synced: 2025-03-26T07:22:18.446Z (6 months ago)
- Topics: duplicate, duplicate-files, hash, powershell, pwsh, search, unique, unique-files
- Language: PowerShell
- Homepage:
- Size: 91.8 KB
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Get-DuplicateItem
[CI](https://github.com/theohbrothers/Get-DuplicateItem/actions/workflows/ci-master-pr.yml)
[Releases](https://github.com/theohbrothers/Get-DuplicateItem/releases/)
[PowerShell Gallery](https://www.powershellgallery.com/packages/Get-DuplicateItem/)

Gets duplicate or non-duplicate files.

## Install
Open [`powershell`](https://docs.microsoft.com/en-us/powershell/scripting/windows-powershell/install/installing-windows-powershell?view=powershell-5.1) or [`pwsh`](https://github.com/powershell/powershell#-powershell) and type:
```powershell
Install-Module -Name Get-DuplicateItem -Repository PSGallery -Scope CurrentUser -Verbose
```

## Usage

The cmdlet supports the same parameters as `Get-ChildItem`: `-Path`, `-LiteralPath`, `-Include`, `-Exclude`, and `-Recurse`.
Use the `-AsHashtable` switch to get a hashtable containing `[string]$md5 = [System.Collections.ArrayList]$files`.
```powershell
# Get duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1'

# Alternatively, you may pipe folder paths
'C:/folder1' | Get-DuplicateItem

# Or DirectoryInfo objects
Get-Item 'C:/folder1' | Get-DuplicateItem

# Get duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Recurse

# Get duplicate files in 'C:/folder1' and its descendants in the form: hash => FileInfo[]
Get-DuplicateItem -Path 'C:/folder1' -Recurse -AsHashtable

# Remove all duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' | Remove-Item

# Remove all duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Recurse | Remove-Item
```

Use the `-Inverse` switch to get non-duplicates.

```powershell
# Get non-duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' -Inverse

# Get non-duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse

# Get non-duplicate files in 'C:/folder1' and its descendants in the form: hash => FileInfo[]
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse -AsHashtable

# Remove all non-duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' -Inverse | Remove-Item

# Remove all non-duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse | Remove-Item
```

## Notes

The cmdlet calculates the MD5 hash of each descendant file in order to identify duplicates and non-duplicates. If there are many large descendant files, it is therefore normal for the cmdlet to take anywhere from several seconds to several minutes to complete.
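
To illustrate the idea of grouping files by hash, here is a minimal sketch built only from the built-in `Get-ChildItem` and `Get-FileHash` cmdlets. It is not the module's actual implementation. The function name `Get-FileGroupByMd5` is hypothetical, and the result shape (MD5 string mapped to an `ArrayList` of files) is assumed from the `-AsHashtable` description above.

```powershell
# Hypothetical helper for illustration only; not part of Get-DuplicateItem.
# Returns a hashtable of the form: [string]$md5 = [System.Collections.ArrayList]$files
function Get-FileGroupByMd5 {
    param(
        [Parameter(Mandatory)]
        [string]$Path,

        [switch]$Recurse
    )

    $groups = @{}

    # Every file must be read in full to compute its hash,
    # which is why large files make the scan slow.
    Get-ChildItem -LiteralPath $Path -File -Recurse:$Recurse | ForEach-Object {
        $md5 = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash

        if (-not $groups.ContainsKey($md5)) {
            $groups[$md5] = [System.Collections.ArrayList]::new()
        }
        [void]$groups[$md5].Add($_)
    }

    $groups
}
```

Under these assumptions, a group with more than one file holds duplicates and a group with exactly one file holds a non-duplicate. For example, `(Get-FileGroupByMd5 -Path 'C:/folder1' -Recurse).GetEnumerator() | Where-Object { $_.Value.Count -gt 1 }` would list only the groups of duplicates.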