Comments (9)
Hi there,
I assume you are asking about the website crawl scan. Purple HATS automatically excludes subdomains (or any domain that does not match the starting URL's). You can verify the list of URLs scanned by Purple HATS by examining details.json, located in your results folder.
For example, in ./purple-hats/results/PHScan_<domain name>_.../details.json you will see the list of URLs crawled and the URLs that were out of domain:
...
"urlsCrawled": {
"toScan": [],
"scanned": [
...
],
"invalid": [],
"outOfDomain": [
...
]
}
...
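As a rough illustration of checking that file, here is a minimal Python sketch that summarizes the `urlsCrawled` section. It assumes only the keys shown in the excerpt above (`scanned` and `outOfDomain`); the function names `load_details` and `summarize_crawl` are hypothetical helpers, not part of Purple HATS, and the entries in each list are treated as opaque values because their exact shape is not shown here.

```python
import json
from pathlib import Path


def load_details(path):
    """Read a details.json file and return its parsed content."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def summarize_crawl(details):
    """Collect the scanned and out-of-domain entries from parsed details.json.

    Only the "urlsCrawled" keys shown in the excerpt are assumed to exist;
    missing keys are treated as empty lists.
    """
    crawled = details.get("urlsCrawled", {})
    scanned = list(crawled.get("scanned", []))
    out_of_domain = list(crawled.get("outOfDomain", []))
    return {
        "scanned": scanned,
        "outOfDomain": out_of_domain,
        "scanned_count": len(scanned),
        "outOfDomain_count": len(out_of_domain),
    }
```

To use it, pass the path of your own details.json to `load_details`, hand the result to `summarize_crawl`, and look for subdomain addresses in the `scanned` list.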
from purple-a11y.
Do I understand correctly that the list of addresses given in the array "outOfDomain": is the list of addresses found, but excluded from the survey?
PS My congratulations and huge thanks! This is a wonderfully prepared automated testing tool.
from purple-a11y.
In the scan settings for the main domain address (e.g. https://lepszyweb.pl), I put the patterns of the subdomain addresses I want to exclude in the exclusions.txt file. Unfortunately, sites in subdomains are also scanned.
My exclusions.txt file
\.*wcag.lepszyweb.pl\.*
\.*wcag21.lepszyweb.pl\.*
\.*tad.lepszyweb.pl\.*
\.*deklaracja.lepszyweb.pl\.*
\.*przedipo.lepszyweb.pl\.*
\.*raport.lepszyweb.pl\.*
\.*walidator.lepszyweb.pl\.*
\.*kontrast.lepszyweb.pl\.*
\.*testy.lepszyweb.pl\.*
I don't know how to make it scan only the main domain's addresses. Of course, I can use a sitemap, but I would like to use the crawl option.
from purple-a11y.
Do I understand correctly that the list of addresses given in the array "outOfDomain": is the list of addresses found, but excluded from the survey?
Yes that is correct.
PS My congratulations and huge thanks! This is a wonderfully prepared automated testing tool.
Thank you, it would be great if you can share more about what we have done well, how you are using Purple-hats and ways we can improve. 😃
from purple-a11y.
In the scan settings for the main domain address (e.g. https://lepszyweb.pl), I put the patterns of the subdomain addresses I want to exclude in the exclusions.txt file. Unfortunately, sites in subdomains are also scanned. My exclusions.txt file
\.*wcag.lepszyweb.pl\.*
\.*wcag21.lepszyweb.pl\.*
\.*tad.lepszyweb.pl\.*
\.*deklaracja.lepszyweb.pl\.*
\.*przedipo.lepszyweb.pl\.*
\.*raport.lepszyweb.pl\.*
\.*walidator.lepszyweb.pl\.*
\.*kontrast.lepszyweb.pl\.*
\.*testy.lepszyweb.pl\.*
I don't know how to make it scan only the main domain's addresses. Of course, I can use a sitemap, but I would like to use the crawl option.
The exclusions.txt file is only needed for the Custom flow scan. In the website crawl and sitemap scan modes, sub-domains and other domains that do not match the website URL you want to scan are automatically excluded.
Hope it helps.
from purple-a11y.
Unfortunately, this is not the case. You can find out by doing a scan of the page I provided. Scanning the site https://lepszyweb.pl yields results from both the main domain and subdomains.
I am working on Windows 11.
I tested both the portable Purple HATS v. 0.0.15 and purple-hats-master, downloaded on 2023/05/26.
from purple-a11y.
Hi @zwiastunsw,
Thanks for sharing your experience. I have implemented an advanced scan option, -s "same-hostname", that makes the crawler match only the hostname of the URL provided for the scan.
Scans run without the -s "same-hostname" option keep the existing default, "same-domain", where the crawler also follows links to sub-domains.
This is available in the new release https://github.com/GovTechSG/purple-hats/releases/tag/0.9.0
-s, --strategy  Strategy to choose which links to crawl in a website scan. Defaults to "same-domain".
                [choices: "same-domain", "same-hostname"]
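To make the difference between the two strategies concrete, here is a minimal Python sketch of how a link might be accepted or rejected under each one. It is an illustration only, not the crawler's actual implementation: the function name `matches_strategy` is hypothetical, and "same-domain" is approximated as "the same host or any sub-domain of the start URL's host", which assumes the start URL is given on the bare domain (as with https://lepszyweb.pl).

```python
from urllib.parse import urlparse


def matches_strategy(start_url, candidate_url, strategy="same-domain"):
    """Return True if candidate_url would be followed under the given strategy.

    Assumption: "same-domain" is approximated by a suffix check against the
    start URL's hostname, so the start URL should be on the bare domain.
    """
    start_host = (urlparse(start_url).hostname or "").lower()
    candidate_host = (urlparse(candidate_url).hostname or "").lower()
    if not start_host or not candidate_host:
        return False
    if strategy == "same-hostname":
        # Only the exact host of the start URL is accepted.
        return candidate_host == start_host
    if strategy == "same-domain":
        # The start host and any of its sub-domains are accepted.
        return candidate_host == start_host or candidate_host.endswith("." + start_host)
    raise ValueError(f"unknown strategy: {strategy}")
```

Under this sketch, a link to https://wcag.lepszyweb.pl found while scanning https://lepszyweb.pl is followed with "same-domain" and skipped with "same-hostname".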
Let me know if this meets your usage scenario?
from purple-a11y.
Thank you, this solves my problem perfectly.
from purple-a11y.
Glad I am able to assist and provide the new option to exclude sub-domains in scanning. 😄
I will close the issue as completed.
from purple-a11y.