dedupe
dedupe is a URL deduplication tool that normalizes URLs and removes duplicates based on hostname and query parameter names. It can also keep only URLs that have query strings and exclude URLs with specific file extensions.
curl -sSL https://raw.githubusercontent.com/0xPugal/dedupe/master/dedupe -o dedupe && chmod +x dedupe && sudo mv dedupe /usr/bin/
- Normalize and Deduplicate URLs: Based on hostname and sorted query parameter names.
- Filter by Query Strings: Optionally include only URLs with query strings.
- Exclude Specific Extensions: Remove URLs with specified file extensions.
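The normalization described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function names `dedupe_key` and `dedupe_urls` are hypothetical, and including the path in the key is an assumption inferred from the sample output in this README, where URLs on the same host with different paths are all kept.

```python
from urllib.parse import urlsplit, parse_qsl


def dedupe_key(url):
    """Hypothetical key: hostname, path, and sorted query parameter names."""
    parts = urlsplit(url)
    # Only parameter names matter here; values are ignored on purpose.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    # Assumption: the path is part of the key (not confirmed by the source).
    return (parts.hostname or "", parts.path, tuple(names))


def dedupe_urls(urls):
    """Keep the first URL seen for each key, preserving input order."""
    seen = set()
    unique = []
    for url in urls:
        key = dedupe_key(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

Under these assumptions, `https://google.com/home?qs=fuzz` and `https://google.com/home?qs=other` share a key and only the first is kept, while `https://google.com/home?qs=new&second=old` has a different set of parameter names and survives.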
Usage: dedupe [options] [<input_file>]
Options:
  -h, --help                      Usage/help info for dedupe
  -u, --urls <filename>           Filename containing URLs (use this if you don't pipe URLs via stdin)
  -V, --version                   Get current version for dedupe
  -qs, --query-strings-only       Only include URLs if they have query strings
  -ne, --no-extensions <ext>      Do not include URLs with specific extensions
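As a rough illustration of what the `-qs` and `-ne` options do, the filtering can be sketched as below. This is an assumption-laden sketch rather than the tool's implementation: `filter_urls` and its helpers are hypothetical names, and case-insensitive extension matching is a guess that the source does not confirm.

```python
import posixpath
from urllib.parse import urlsplit


def has_query_string(url):
    """True if the URL carries a non-empty query string."""
    return bool(urlsplit(url).query)


def has_excluded_extension(url, excluded):
    """True if the URL path ends in one of the excluded extensions."""
    ext = posixpath.splitext(urlsplit(url).path)[1].lstrip(".").lower()
    return ext in excluded


def filter_urls(urls, query_strings_only=False, no_extensions=""):
    """Mimic -qs and -ne; no_extensions is a comma-separated string like 'css,png'."""
    excluded = {e.strip().lower() for e in no_extensions.split(",") if e.strip()}
    kept = []
    for url in urls:
        if query_strings_only and not has_query_string(url):
            continue
        if excluded and has_excluded_extension(url, excluded):
            continue
        kept.append(url)
    return kept
```

For example, `filter_urls(urls, no_extensions="jpg,png,gif")` drops `https://demo.com/photo.png` but keeps `https://bing.com/test.php?x=y&y=z`, since `php` is not in the excluded set.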
- To get only parameterized URLs:
cat urls.txt | dedupe -qs
or
dedupe -u urls.txt -qs
- To remove URLs with these extensions:
cat urls.txt | dedupe -ne css,png,js
or
dedupe -u urls.txt -ne css,png,js
- Chain with other tools:
echo example.com | gau | dedupe -qs -ne css,png,jpg,gif | anew output.txt
Before:
$ cat test.txt
https://test.com/api/users/123
https://test.com/api/users/222
https://test.com/api/users/412/profile
https://test.com/users/photos/photo.jpg
https://test.com/users/photos/myPhoto.jpg
https://demo.com/photo.png
https://google.com/home?qs=fuzz
https://google.com/home?qs=new&second=old
https://google.com/home?qs=asd&xyz=das
https://bing.com/test
https://bing.com/test.php?x=y&y=z
Only URLs with query strings:
$ ./dedupe -u test.txt -qs
https://google.com/home?qs=fuzz
https://google.com/home?qs=new&second=old
https://google.com/home?qs=asd&xyz=das
https://bing.com/test.php?x=y&y=z
Remove URLs with certain extensions:
$ ./dedupe -u test.txt -ne jpg,png,gif
https://test.com/api/users/123
https://test.com/api/users/222
https://test.com/api/users/412/profile
https://google.com/home?qs=fuzz
https://google.com/home?qs=new&second=old
https://google.com/home?qs=asd&xyz=das
https://bing.com/test
https://bing.com/test.php?x=y&y=z