https://github.com/theohbrothers/get-duplicateitem
Gets duplicate or non-duplicate files.
duplicate duplicate-files hash powershell pwsh search unique unique-files
- Host: GitHub
- URL: https://github.com/theohbrothers/get-duplicateitem
- Owner: theohbrothers
- License: apache-2.0
- Created: 2018-06-14T05:33:38.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2025-02-13T11:22:38.000Z (over 1 year ago)
- Last Synced: 2025-03-26T07:22:18.446Z (over 1 year ago)
- Topics: duplicate, duplicate-files, hash, powershell, pwsh, search, unique, unique-files
- Language: PowerShell
- Homepage:
- Size: 91.8 KB
- Stars: 2
- Watchers: 1
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
# Get-DuplicateItem
[CI](https://github.com/theohbrothers/Get-DuplicateItem/actions/workflows/ci-master-pr.yml)
[Releases](https://github.com/theohbrothers/Get-DuplicateItem/releases/)
[PowerShell Gallery](https://www.powershellgallery.com/packages/Get-DuplicateItem/)
Gets duplicate or non-duplicate files.
## Install
Open [`powershell`](https://docs.microsoft.com/en-us/powershell/scripting/windows-powershell/install/installing-windows-powershell?view=powershell-5.1) or [`pwsh`](https://github.com/powershell/powershell#-powershell) and type:
```powershell
Install-Module -Name Get-DuplicateItem -Repository PSGallery -Scope CurrentUser -Verbose
```
## Usage
The cmdlet supports the same parameters as `Get-ChildItem`: `-Path`, `-LiteralPath`, `-Include`, `-Exclude`, and `-Recurse`.
Use the `-AsHashtable` switch to get a hashtable that maps each MD5 hash (`[string]$md5`) to the list of files with that hash (`[System.Collections.ArrayList]$files`).
```powershell
# Get duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1'
# Alternatively, you may pipe folder paths
'C:/folder1' | Get-DuplicateItem
# Or DirectoryInfo objects
Get-Item 'C:/folder1' | Get-DuplicateItem
# Get duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Recurse
# Get duplicate files in 'C:/folder1' and its descendants in the form: hash => FileInfo[]
Get-DuplicateItem -Path 'C:/folder1' -Recurse -AsHashtable
# Remove all duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' | Remove-Item
# Remove all duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Recurse | Remove-Item
```
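If the goal is to keep one copy of each file rather than remove every file the cmdlet returns, the `-AsHashtable` output can be post-processed. The following is a minimal sketch, assuming each hashtable value holds every file that shares that hash, as the `hash => FileInfo[]` description suggests. `Select-RedundantCopy` is a hypothetical helper written for this example and is not part of the module.

```powershell
# Hypothetical helper (not part of Get-DuplicateItem): given a hashtable of
# hash => files, emit every file except the first one in each group.
function Select-RedundantCopy {
    param(
        [Parameter(Mandatory)]
        [System.Collections.IDictionary]$Groups
    )

    foreach ($entry in $Groups.GetEnumerator()) {
        # Sort by full path so the file that is kept is deterministic,
        # then skip that first file and emit the remaining copies.
        $entry.Value |
            Sort-Object -Property FullName |
            Select-Object -Skip 1
    }
}
```

With the module installed, `Select-RedundantCopy -Groups (Get-DuplicateItem -Path 'C:/folder1' -Recurse -AsHashtable) | Remove-Item -WhatIf` should preview which files would be deleted without deleting anything. Removing `-WhatIf` performs the deletion.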
Use the `-Inverse` switch to get non-duplicates.
```powershell
# Get non-duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' -Inverse
# Get non-duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse
# Get non-duplicate files in 'C:/folder1' and its descendants in the form: hash => FileInfo[]
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse -AsHashtable
# Remove all non-duplicate files in 'C:/folder1' only
Get-DuplicateItem -Path 'C:/folder1' -Inverse | Remove-Item
# Remove all non-duplicate files in 'C:/folder1' and its descendants
Get-DuplicateItem -Path 'C:/folder1' -Inverse -Recurse | Remove-Item
```
## Notes
The cmdlet calculates the MD5 hash of each descendant file in order to identify duplicates and non-duplicates. If there are many large descendant files, it is therefore normal for the cmdlet to take anywhere from several seconds to several minutes to complete.
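
To illustrate the idea of hash-based duplicate detection, here is a rough sketch built only from the built-in `Get-ChildItem`, `Get-FileHash`, and `Group-Object` cmdlets. It is not the module's actual implementation. `Get-FileGroupByMd5` is a hypothetical name, and the demo folder and file names are invented for the example.

```powershell
# Hypothetical helper, not the module's code: group files under a folder by MD5 hash.
function Get-FileGroupByMd5 {
    param(
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Every file is read in full to compute its hash, which is why
    # many large files make this kind of scan slow.
    Get-ChildItem -LiteralPath $Path -File -Recurse |
        Get-FileHash -Algorithm MD5 |
        Group-Object -Property Hash
}

# Demo on a throwaway folder: two identical files and one unique file.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $demo | Out-Null
Set-Content -LiteralPath (Join-Path $demo 'a.txt') -Value 'same content'
Set-Content -LiteralPath (Join-Path $demo 'b.txt') -Value 'same content'
Set-Content -LiteralPath (Join-Path $demo 'c.txt') -Value 'different content'

$groups     = Get-FileGroupByMd5 -Path $demo
$duplicates = $groups | Where-Object { $_.Count -gt 1 }   # hashes shared by two or more files
$uniques    = $groups | Where-Object { $_.Count -eq 1 }   # hashes seen only once

$duplicates.Group.Path   # paths of a.txt and b.txt
$uniques.Group.Path      # path of c.txt

# Clean up the throwaway folder.
Remove-Item -LiteralPath $demo -Recurse
```

Files whose hash appears more than once correspond to what the cmdlet reports by default, and hashes that appear only once correspond to the `-Inverse` case.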