Comments (9)
from cewl.
I had a couple of thoughts about this that I'd like to throw out there.....
- I use an application called "Screaming Frog SEO Spider", and it has a progress bar that shows how much it has crawled out of how much the spider has found so far. So, even in a single thread, the progress bar bounces around a bit as new pages are discovered. This is especially nice because you can judge (roughly) where things are going. In other words, if the spider finds 10,000 pages in 2 seconds and then 30,000 in 6 seconds, you know it's going to take a really long time, whereas if it finds 6 pages in 2 seconds and then 12 pages in 6 seconds, you know it's probably not going to take very long. Hopefully that makes sense, but here is a little screenshot of what their progress bar looks like just a few seconds after starting a crawl, so you can sort of guess it's probably going to take 'a long time'.
- The second thought would be to add an optional parameter that has the spider dump its results to a file instead of indexing/crawling them. That would give the user the option to crawl them individually (likely running cewl multiple times) on their own. If you really wanted to go the extra mile, you could also add an option for cewl to crawl the results from the created file at a later time.
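To make the second idea concrete, here is a minimal Python sketch of the "spider now, crawl later" split. It is only an illustration under stated assumptions: `save_urls` and `load_urls` are hypothetical helper names, the one-URL-per-line file format is made up for the example, and none of this reflects an existing cewl option.

```python
from pathlib import Path


def save_urls(urls, path):
    """Write discovered URLs to a file, one per line (hypothetical format).

    Duplicates are dropped while the discovery order is preserved.
    Returns the number of unique URLs written.
    """
    unique = list(dict.fromkeys(urls))
    Path(path).write_text("\n".join(unique) + "\n", encoding="utf-8")
    return len(unique)


def load_urls(path):
    """Read the URL list back, skipping blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

With something like this, a first run could stop after the spidering step and write the list, and later runs could pick individual URLs (or the whole file) to crawl.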
Overall (IMO), any option that provides some sort of arbitrary progress, along the lines of 'I'm still running, this is how much I've done so far, and this is how much I think I still need to do', would be helpful. Right now, using -v or --debug is the only way to verify it's still crawling and not hung up somewhere.
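As a rough sketch of the kind of "crawled out of found so far" indicator described above, something like the following Python function would do. The name `format_progress` and the bar layout are assumptions for illustration, not part of cewl.

```python
def format_progress(crawled, discovered, width=20):
    """Render a text progress bar of pages crawled vs. pages discovered so far.

    The denominator is only what the spider has found up to now, so the bar
    can move backwards when a batch of new links is discovered.
    """
    if discovered <= 0:
        return "[" + "-" * width + "] 0/0 (0%)"
    crawled = min(crawled, discovered)
    filled = int(width * crawled / discovered)
    percent = int(100 * crawled / discovered)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {crawled}/{discovered} ({percent}%)"


# Example: the same 6 crawled pages look very different as discovery grows.
# format_progress(6, 12)    -> "[##########----------] 6/12 (50%)"
# format_progress(6, 10000) -> "[--------------------] 6/10000 (0%)"
```

Printing a line like this every few requests would be enough to show that the tool is still running and roughly how far along it is.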
from cewl.
Well, any options you're willing to do would be greatly appreciated. I love the app and use it a lot.
from cewl.
Hello @digininja ,
I am trying to scrape the Ironman website to solve the last challenge of Cracking JWT keys (Obscure).
But the Cewl tool is really slow. In fact, it is just sitting there without making any requests, then after an hour or so it continues for a bit and then goes idle repeatedly.
I had to hibernate my PC twice instead of shutting it down to keep the tool working.
I am using the latest version, CeWL 6.1 (Max Length), on Windows 11.
I had to use a proxy to monitor the work, as there is no indication of work at all in the tool itself (it would be Cewl to show any kind of progress).
Command:
Task Manager:
Proxy:
Thanks for the Awesome Auth lab challenges.
from cewl.
The same thing happens in WSL 2.0 Ubuntu.
It's extremely slow. I think it starts with 1 or 2 requests per second, then the more requests it gathers, the slower it becomes.
It's now doing ~1 request every 30 minutes or something like that.
Maybe it's doing some comparison of the new words with the old ones to handle duplicates?
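To illustrate why that guess could produce this kind of slowdown, here is a small Python sketch comparing two ways of counting words. Whether cewl actually does anything like the first version is not established in this thread; both functions are hypothetical and only show the general pattern.

```python
from collections import Counter


def count_words_list(words):
    """Slow pattern: every lookup scans the whole list of words seen so far."""
    seen = []
    counts = []
    for word in words:
        if word in seen:  # linear scan, gets slower as 'seen' grows
            counts[seen.index(word)] += 1
        else:
            seen.append(word)
            counts.append(1)
    return dict(zip(seen, counts))


def count_words_hashed(words):
    """Fast pattern: hash-based lookup, roughly constant time per word."""
    return dict(Counter(words))
```

Both return the same counts, but the list version does more work per word as the word list grows, which would match a crawl that gets slower the more it gathers.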
I've never used it on Windows so I don't know the base performance levels, but it shouldn't be that slow. I'll see if I can give it a run against the site later and see what speed I get.
…
from cewl.
It might be an issue with my system, even though I have a reasonably good one.
I've conducted some additional tests on https://www.ironman.com.
Initially, it maintains a rate of 1 request per second, but after around 110 requests, it begins to slow down.
By the time it reaches approximately 200 requests, the rate drops to about half a request per second.
I also tested it on your site, https://digi.ninja/.
Initially, it starts with a rate of 3 requests per second for the first 10 requests. After that, there's a pause where it doesn't make any requests for a few seconds and instead prints "Offsite link, not following:...".
During this "Offsite link, not following:..." phase, I attempted to stop the process using CTRL + C. It took a few seconds to stop, even though it wasn't actively making requests, just printing the message. This happened after only 10 requests, so it shouldn't have accumulated a significant amount of data (only 2500 lines).
So I think the bottleneck is somewhere in the checking phase.
PS: I'm struggling with the JWT cracking Obscure level. Can you provide any hints?
from cewl.