Giter Club home page Giter Club logo

Comments (12)

adiroiban avatar adiroiban commented on August 18, 2024

Hi @DragosFlorea , Thanks for reporting the issue.

I think that this warning should be updated to also log the stack trace like log.warning('There be dragons.', exc_info=True)

This should provide more info about the place where this is raised.

Are you able to patch your code and run it again to get a better stacktrace?


Most probably this is due to some protocol features that are not supported (yet) by smbprotocol

Which SMB version do you use ?

Can you add more info about how to reproduce this error?

Cheers!

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

Hi @adiroiban,
I can patch the code locally, with that property and come back with more log info.
I don't set the smb version, the code choose it automatically and I remember that it picks the last 3.1.1.

Regarding the reproducing part on your side, kind of difficult.
Multiple threads that connect to the same share, trying to download files locally, important is to have large files which keep the socket connection open for a long period of time.

Thanks, I will come back with a more descriptive log.

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

hi,
Until the warning occurs the threads looks like this (also i confirm that the smb version used in connection is SMB_3_1_1):
image
Green are the threads created by the grpc server, which means there are aprox 10 files in processing
Red are the threads created by the smbprotocol library

And the warning is this:
image

As you can see there are multiple files in parallel and after the warning, the processing stops and the threads created by smbprotocol are closed

from smbprotocol.

jborean93 avatar jborean93 commented on August 18, 2024

Sorry I've been meaning to reply for a few days now. The whole logic around handling the underlying connection is quite complex and I'm not 100% happy with it. It should work for a basic example but as soon as you start putting in multiple threads with concurrent read/writes then I'm not surprised it is falling over.

In this case it's trying to read 4 bytes from the socket which is the length of the SMB packet after that. This part is failing as there wasn't 4 bytes to read for some reason. Because all this runs in a background thread it logs the error and closes the thread because it was unable to get the information about the incoming packet. Why this this I'm not sure, but if it cannot get the expected length it cannot get the actual packet. What you could try is to change the first few lines to

b_packet_size = b""
while len(b_packet_size) != 4:
    b_packet_size += self._sock.recv(4)
    if b_packet_size == b"":
        return

This will ensure that it always tries to read up to 4 bytes before it goes to get the length but the fact that it doesn't have 4 bytes anyway may mean it's broken in some other way.

As for the proper fix I'm not sure if I will be able to find it right now. I have a few other things on my plate that I'm focusing on and this will most likely take some time to figure out and solve properly. What I would recommend if you are trying to do this all in parallel is to create a new connection for each thread instead of trying to share it all. This will be a bit less efficient but it still means you can copy multiple files in parallel, there's just a few steps to set up the socket for each thread. To ensure each copy uses it's own connection and thus it's own socket you need to specify your own connection cache in the copy call like so

import smbclient

from smbclient.shutil import (
    copyfile,
)

def copyfile_with_new_connection(src, dest):
    # By specifying `connection_cache` in the kwargs we are bypassing the builtin pool and creating a connection in our own cache
    # each time. We just need to make sure we reset the cache when we are finished with it. Most calls should support the
    # connection_cache kwarg in smbclient.
    connection_cache = {}
    try:
        copyfile(src, dest, connection_cache=connection_cache)
    finally:
        smbclient.reset_connection_cache(connection_cache=connection_cache)

# call copyfile_with_new_connection in as many threads as you wish

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

Hi,
Thanks for your answer. After a few tests the solution with connection_cache seems to work for me.
I just want to verify if I'm using this correctly.
My flow is something like this:

register_session(request.share.Server, username=request.share.User, password=request.share.Password)
stat(filepath)
with open_file(filePath, mode = self.READ) as fileRead:
   stream file

From what I observed I need to pass connection_cache=connection_cache at every smb operation, so the code looks like this:

connection_cache = {}
try:
  register_session(request.share.Server, username=request.share.User, password=request.share.Password, 
  connection_cache=connection_cache)
  stat(filepath,connection_cache=connection_cache)
  with open_file(filePath, mode = self.READ,connection_cache=connection_cache) as fileRead:
      streamfile()
finally:
      reset_connection_cache(connection_cache=connection_cache)

I can do something about the connection_cache to not walk it around through every function or it's mandatory to pass it?

Thanks

from smbprotocol.

jborean93 avatar jborean93 commented on August 18, 2024

Unfortunately no, if you omit the connection cache it will fallback to the default global one which as you’ve noticed doesn’t handle multiple requests across multiple threads too well. Technically you can omit the register_session call and just pass your credentials as kwargs to the first call, this case being stat. That looks something like:

connection_cache = {}
try:
  stat(filepath, username=request.share.User, password=request.share.Password, connection_cache=connection_cache)
  with open_file(filePath, mode = self.READ,connection_cache=connection_cache) as fileRead:
      streamfile()
finally:
  reset_connection_cache(connection_cache=connection_cache)

If you are on the latest release and you want to always use the same username and password you could even set you credentials as the default globally like so:

smbclient.ClientConfig(username=request.share.User, password=request.share.Password)

connection_cache = {}
try:
  stat(filePath, connection_cache=connection_cache)
  with open_file(filePath, mode = self.READ,connection_cache=connection_cache) as fileRead:
      streamfile()
finally:
  reset_connection_cache(connection_cache=connection_cache)

The ClientConfig is global and not safe to mutate across threads. Basically this route will only work for you if the credentials to connect are always the same. You just set the config at the start of your script and before you start running your threads.

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

Hi @jborean93

I've started to implement this workaround and it seams ok until I've started to stress it a bit
My issue is the following:

image

In this case multiple files are downloaded in parallel and work well until I interrupt the internet connection.
After the internet is interrupted the application hangs for 600 seconds and after that it throws 2 errors(check the above image)
I've tried to investigate a little bit and I've found that all these errors happen here

image

The Empty exception is raised in msg_queue.get and afterwards self.send(SMB2Echo(), sid=sid) does not run as expected

The issue with these, is that the recv-... threads remain opened. I've tried to catch ConnectionResetError but I've saw that these runs in separate threads and I cannot handle them. And also I don't have any feedback on a higher level regarding the errors in threads...

Any suggestions on what I can do about these?

Thanks

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

Hi @jborean93
Any feedback on the issue described earlier?
Thanks.

from smbprotocol.

jborean93 avatar jborean93 commented on August 18, 2024

Sorry I haven't had much time to investigate this to the extent I'm wanting. There's an outstanding issue #78 that talks about how connection related issues cause a hang in the code. I can see how that is happening and think there's a way to make it better but I just don't have the time right at the moment to put it in place. I'm not sure if the proposed fix will help your use case but essentially if the _process_message_thread function detects the TCP socket being closed it will notify any outstanding requests that has happened so they pass along that connection error.

I'm not 100% sure that's the route that should be taken but so far it's the only thing I can think off at the moment.

from smbprotocol.

jborean93 avatar jborean93 commented on August 18, 2024

@DragosFlorea I've just opened a draft PR which I think will solve this issue #80. It is hard to tell as I've never been able to fully replicate your Unpack issue but it should fix up the problem when the code hangs on a connection failure. Any stress tests you could do to test out the changes there would be greatly appreciated.

from smbprotocol.

jborean93 avatar jborean93 commented on August 18, 2024

I can't guarantee it's fixed your problem here but I hope #80 has. If you do get a chance to try it out and it's still present then I'm happy to reopen this issue.

from smbprotocol.

DragosFlorea avatar DragosFlorea commented on August 18, 2024

I wasn't able to test this, I hope is fixed. Thanks again

from smbprotocol.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.