Since this issue hasn't been discussed in a while - if anyone is interested, I implemented that possible solution in this commit of my personal fork:
This implementation will allow the usage of big wordlists without the mentioned memory bottleneck. I'm not sure it's the best solution, but it works :)
from o365spray.
I like this approach and, if you wanted to open a PR, I would be happy to test/review and merge.

The only thing I noticed that would be blocking for a merge is that you are using `loop.create_task` to execute each enumerator call. In the original code, we use `loop.run_in_executor` specifically to leverage the `ThreadPoolExecutor`, which honors the number of threads the user specified via the `--rate` flag. To my understanding, the updated implementation would not honor the thread limit, so we would need to either move back to the original `run_in_executor` call or implement a `Semaphore` to execute the tasks through.
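For context, here is a minimal sketch of the pattern the original code relies on (the `check_user` helper and `enumerate_users` name are placeholders for illustration, not the actual o365spray code): `run_in_executor` against a bounded `ThreadPoolExecutor` is what caps concurrency at the `--rate` value, whereas a bare `create_task` per user has no such cap.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def check_user(user: str) -> str:
    # Placeholder for the blocking enumeration call
    return f"checked:{user}"

async def enumerate_users(users, rate: int = 4):
    loop = asyncio.get_running_loop()
    # The executor's pool size is what actually enforces --rate:
    # at most `rate` blocking checks run at any moment
    with ThreadPoolExecutor(max_workers=rate) as executor:
        tasks = [loop.run_in_executor(executor, check_user, u) for u in users]
        return await asyncio.gather(*tasks)

print(asyncio.run(enumerate_users(["alice", "bob", "carol"])))
# -> ['checked:alice', 'checked:bob', 'checked:carol']
```

Scheduling each call with `create_task` instead would start every coroutine immediately, which is why a `Semaphore` (or the executor) is needed to bound concurrency.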
@Macmod - Thanks for looking into this further and testing it out a bit. I ran some local tests with the original code base, your updated code base, and another local alternative I wrote that just limited the number of blocking tasks at any given time. While I was able to match your memory management, the overall performance was far superior using your method.
If you want to open up a PR for this change, I would be happy to run some final tests and merge it into the main branch.
Before merging though, I wanted to mention a few things regarding the overall code formatting and style to ensure consistency with the rest of the code base:
- If you could document the `_consume_threads` function and its parameters (see the `run()` function for docstring formatting).
- The `threads` parameter in the `_consume_threads` function is typed as a `list`, but it is actually a dictionary. Is there a reason for using a dictionary, or is it just for easier access to the given Future object when deleting it on completion? Whichever is the cleaner solution, it would be great if you could update the type reference.
- Lastly, while not a style or formatting suggestion, the newly added `--conlimit` parameter mentions "Concurrency limit", but the `--rate` parameter aligns more with the actual concurrency executed within the executor, whereas the new flag value relates more to the pool size of the executor. I would suggest maybe using `--poolsize` to reflect this?
- Per the Stack Overflow link you mentioned, there is a point that the `threads` object name is not necessarily accurate and a potential alternative would be to call the object `futures`.
The above are just suggestions to align more closely with the existing code base, but please open a PR and I am happy to merge. Thanks again for your efforts on this - awesome work!
Just a quick note - the same issue probably exists in the sprayer module, but I can't test it right now. In any case, it's more logical that people will only use large lists with the enumerator module and then perform spraying on top of the valid emails found. If you're interested I can check it another time and send a new PR so we can keep things consistent 😃
This has now been merged into the main branch. Awesome work @Macmod!
Regarding spraying -- I think for now enumeration is enough as you mentioned that most scenarios won't require massive lists for spraying, only enumeration. If there is a need to update, we can revisit.
Hey @0xZDH, thanks for the reply =)
At the risk of being too naive, I can't help but wonder whether a ThreadPoolExecutor is really needed in this particular use case. Maybe we'd get the same performance by just using coroutines and replacing `conlimit` from my commit with your original `rate`?
I don't have a clear answer right now, but it's something I've been thinking about.
I wrote this with 0 validation or testing, but the idea is to leverage both your concurrency tasks limit and a Semaphore for async concurrency execution limits:
```python
semaphore = asyncio.Semaphore(args.rate)

async def async_enumerate(domain: str, user: str, password: str = "Password1"):
    """Async call of enumerate"""
    return self._enumerate(domain=domain, user=user, password=password)

async def safe_enumerate(domain: str, user: str, password: str = "Password1"):
    """Safe call to enumerate to keep within the bounds of concurrency limits"""
    async with semaphore:
        return await async_enumerate(domain=domain, user=user, password=password)

blocking_tasks = set()
for user in userlist:
    # Check the concurrency task limit and wait if we reach the upper bound
    # Default: 1,000 concurrent task limit
    if len(blocking_tasks) >= 1000:
        _, blocking_tasks = await asyncio.wait(
            blocking_tasks,
            return_when=asyncio.FIRST_COMPLETED,
        )
    # Add new tasks as the task pool frees up
    blocking_tasks.add(
        self.loop.create_task(
            safe_enumerate(
                domain=domain,
                user=user,
                password=password,
            )
        )
    )
```
> Hey @0xZDH, thanks for the reply =)
>
> At the risk of being too naive, I can't help but wonder whether a ThreadPoolExecutor is really needed in this particular use case. Maybe we'd get the same performance by just using coroutines and replacing `conlimit` from my commit with your original `rate`? I don't have a clear answer right now, but it's something I've been thinking about.
After some testing I don't think this is the case anymore. Ditching ThreadPoolExecutor in favor of coroutines in this case seems to hurt performance, in either my solution or yours... But take a look at this test using this idea:
```python
import asyncio
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

import matplotlib.pyplot as plt
import requests

loop = asyncio.get_event_loop()

N = 7000000
CONLIMIT = 1000
RATE = 10

def download(code, completion_times):
    requests.get('http://www.github.com/', headers={'Cache-Control': 'no-cache'})
    completion_times.append((code, loop.time()))
    print(f'Downloaded {code}', completion_times[-1])

def consume(threads: dict, max_n: int = CONLIMIT):
    # Block until the number of pending futures drops to max_n
    while len(threads) > max_n:
        done, _ = wait(threads, return_when=FIRST_COMPLETED)
        for t in done:
            t.result()
            del threads[t]

async def main(executor):
    completion_times = []
    threads = {}
    for x in range(N):
        future = executor.submit(download, x, completion_times)
        threads[future] = x
        consume(threads)
    # Drain any remaining futures
    consume(threads, 0)

    codes, times = zip(*completion_times)
    plt.scatter(times, codes)
    plt.xlabel('Time (s)')
    plt.ylabel('Task Code')
    plt.title('Task Completion Times')
    plt.savefig('test.png')

executor = ThreadPoolExecutor(RATE)
try:
    loop.run_until_complete(main(executor))
finally:
    loop.close()
```
It runs really fast and seems to prevent the issue. I'm going to experiment a little bit more with that idea on o365spray's code and come back later with a pull request if it works.
I think it works, can you test it so we can maybe proceed to a pull request?
Hey @0xZDH, you're right in your suggestions, I have included them in the PR.
Thanks again for the collaboration! 👍