
sshfs's People

Contributors

aguschin, efiop, ianthomas23, isidentical, notspecial, pmrowla, skshetry, uunal

sshfs's Issues

Checksum command fails if remote doesn't support uname operation

When I call fs.checksum(path) on a server that does not permit the uname command, I get a generic Channel Open Error: Session request failed. Looking through the debug logs, the issue seems to be caused by the logic in _get_system:

https://github.com/fsspec/sshfs/blob/3c10c1bfff44f111926d763a54343726832e2d42/sshfs/spec.py#L295:L300

The server I am working with does support the md5sum and sha1sum commands so the _checksum method as written should work but the actual command never triggers because it errors before that.

Potential Solutions:

  • Provide the ability to pass checksum commands to the _checksum() method. This would allow me to specify a known command so that the _get_system() check can be bypassed. Ideally, it would be possible to provide both the remote and the local checksum commands, which would help when a Darwin system talks to a Linux server.
  • Include error handling logic for _get_system() to provide a more detailed error message.

If I can get some guidance on the preferred approach, and if contributions are welcome, I'm happy to submit a PR.

Thank you!
Pratheek
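The two proposals above can be combined in a small sketch. It is not the sshfs implementation: `resolve_checksum_command`, `SystemDetectionError`, and `DEFAULT_COMMANDS` are hypothetical names, and the Linux/Darwin command mapping is an assumption made for illustration. An explicit command skips the `uname` probe entirely, and a failed probe is re-raised with a more descriptive message.

```python
from typing import Awaitable, Callable, Optional

# Assumed mapping from `uname` output to an MD5 command (illustrative only).
DEFAULT_COMMANDS = {"Linux": "md5sum", "Darwin": "md5"}


class SystemDetectionError(RuntimeError):
    """Raised when the remote system cannot be detected (hypothetical)."""


async def resolve_checksum_command(
    detect_system: Callable[[], Awaitable[str]],
    command: Optional[str] = None,
) -> str:
    # An explicit command bypasses system detection altogether.
    if command is not None:
        return command
    try:
        system = await detect_system()
    except Exception as exc:
        # Replace the generic channel error with an actionable message.
        raise SystemDetectionError(
            "could not detect the remote system via `uname`; "
            "pass an explicit checksum command instead"
        ) from exc
    if system not in DEFAULT_COMMANDS:
        raise ValueError(f"{system!r} is not supported")
    return DEFAULT_COMMANDS[system]
```

Here `detect_system` stands in for whatever coroutine runs `uname` on the remote host; a caller on a restricted server would pass `command="md5sum"` and never trigger it.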

Registering with fsspec, and speed relative to SFTP

Hi - I'm interested in using sshfs as a faster alternative to the builtin sftp filesystem in fsspec (I also need server-side copy) in Runhouse, a compute and data sharing layer for ML. It appears that sshfs is still not built into fsspec, so I need to register it as suggested here to use it with APIs like fsspec.open(). A few questions I couldn't figure out:

  1. Why has sshfs not yet been made a builtin implementation of fsspec, and why doesn't it register itself upon installation like other non-builtins? Is it due to some stability or hardening bar it hasn't yet reached?
  2. Will it indeed be faster than the builtin SFTPFileSystem? I see that SSHFileSystem is faster than Paramiko, but can't tell if there's any reason it'd be faster than SFTP.

synchronous `rmdir()` fails silently

Hello.

Trying to remove directories using SSHFileSystem via rmdir fails silently.

SSHFileSystem appears to be missing the synchronous wrapper for _rmdir, i.e. the equivalent of mkdir = sync_wrapper(_mkdir). The call therefore falls all the way through to AbstractFileSystem.rmdir, which is implemented as pass # not necessary to implement, may not have directories.

A local test of adding the sync_wrapper works ok so far.
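The failure mode can be reproduced without fsspec. The sketch below uses simplified stand-ins: `BaseFileSystem`, `BrokenFS`, `FixedFS`, and this `sync_wrapper` are hypothetical and only mimic the pattern described above, where an async `_rmdir` exists but the synchronous name is never bound to it.

```python
import asyncio
import functools


class BaseFileSystem:
    """Stand-in for a base class whose rmdir is a no-op by default."""

    def rmdir(self, path):
        pass  # not necessary to implement, may not have directories


def sync_wrapper(func):
    """Simplified stand-in: run an async method to completion synchronously."""

    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        return asyncio.run(func(self, *args, **kwargs))

    return wrapper


class BrokenFS(BaseFileSystem):
    def __init__(self):
        self.dirs = {"/data"}

    async def _rmdir(self, path):
        self.dirs.remove(path)

    # No `rmdir = sync_wrapper(_rmdir)` here, so rmdir() resolves to the
    # inherited no-op and nothing is removed.


class FixedFS(BrokenFS):
    # Binding the wrapper makes the synchronous call reach _rmdir.
    rmdir = sync_wrapper(BrokenFS._rmdir)
```

Calling `BrokenFS().rmdir("/data")` returns without error and leaves the directory in place, while `FixedFS().rmdir("/data")` removes it.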

Question regarding call of stat() for parent dir

Hello,
while using sshfs in fsspec.open_files(), I discovered that stat() is called for the parent directory of the wanted files, even if it is already clear that this must be a directory. While this is most certainly not an issue in most cases, the sftp server I have to use behaves somewhat strangely here: I get a permission error when trying to call stat() on these directories.

When using the default sftp implementation from fsspec there is no issue at all, so it seems to me that this should be possible without a call to stat(). Is there any way to achieve this with this library as well? I would really like to use it for its better performance compared to sftp. Thank you!

_cat_file implementation

Hi, I have a feature request. Could the sshfs.SSHFileSystem get an implementation for _cat_file?

I'm trying to use sshfs with zarr, but hit a NotImplementedError when I try to construct a group.

Roughly what I've run:

import sshfs, zarr

fs = sshfs.SSHFileSystem(host)
store = zarr.storage.FSStore("/path/to/data.zarr", fs=fs, mode="r")

g = zarr.open(store)
File /usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/tasks.py:442, in wait_for(fut, timeout, loop)
    437     warnings.warn("The loop argument is deprecated since Python 3.8, "
    438                   "and scheduled for removal in Python 3.10.",
    439                   DeprecationWarning, stacklevel=2)
    441 if timeout is None:
--> 442     return await fut
    444 if timeout <= 0:
    445     fut = ensure_future(fut, loop=loop)

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:395, in AsyncFileSystem._cat_file(self, path, start, end, **kwargs)
    394 async def _cat_file(self, path, start=None, end=None, **kwargs):
--> 395     raise NotImplementedError
Full Traceback
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[69], line 1
----> 1 g = zarr.open(store, mode="r")

File /usr/local/lib/python3.9/site-packages/zarr/convenience.py:120, in open(store, mode, zarr_version, path, **kwargs)
    118     return open_array(_store, mode=mode, **kwargs)
    119 elif contains_group(_store, path):
--> 120     return open_group(_store, mode=mode, **kwargs)
    121 else:
    122     raise PathNotFoundError(path)

File /usr/local/lib/python3.9/site-packages/zarr/hierarchy.py:1465, in open_group(store, mode, cache_attrs, synchronizer, path, chunk_store, storage_options, zarr_version, meta_array)
   1462 # determine read only status
   1463 read_only = mode == 'r'
-> 1465 return Group(store, read_only=read_only, cache_attrs=cache_attrs,
   1466              synchronizer=synchronizer, path=path, chunk_store=chunk_store,
   1467              zarr_version=zarr_version, meta_array=meta_array)

File /usr/local/lib/python3.9/site-packages/zarr/hierarchy.py:164, in Group.__init__(self, store, path, read_only, chunk_store, cache_attrs, synchronizer, zarr_version, meta_array)
    162     mkey = _prefix_to_group_key(self._store, self._key_prefix)
    163     assert not mkey.endswith("root/.group")
--> 164     meta_bytes = store[mkey]
    165 except KeyError:
    166     if self._version == 2:

File /usr/local/lib/python3.9/site-packages/zarr/storage.py:1393, in FSStore.__getitem__(self, key)
   1391 key = self._normalize_key(key)
   1392 try:
-> 1393     return self.map[key]
   1394 except self.exceptions as e:
   1395     raise KeyError(key) from e

File /usr/local/lib/python3.9/site-packages/fsspec/mapping.py:143, in FSMap.__getitem__(self, key, default)
    141 k = self._key_to_str(key)
    142 try:
--> 143     result = self.fs.cat(k)
    144 except self.missing_exceptions:
    145     if default is not None:

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:114, in sync_wrapper.<locals>.wrapper(*args, **kwargs)
    111 @functools.wraps(func)
    112 def wrapper(*args, **kwargs):
    113     self = obj or args[0]
--> 114     return sync(self.loop, func, *args, **kwargs)

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:99, in sync(loop, func, timeout, *args, **kwargs)
     97     raise FSTimeoutError from return_result
     98 elif isinstance(return_result, BaseException):
---> 99     raise return_result
    100 else:
    101     return return_result

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:54, in _runner(event, coro, result, timeout)
     52     coro = asyncio.wait_for(coro, timeout=timeout)
     53 try:
---> 54     result[0] = await coro
     55 except Exception as ex:
     56     result[0] = ex

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:409, in AsyncFileSystem._cat(self, path, recursive, on_error, batch_size, **kwargs)
    407     ex = next(filter(is_exception, out), False)
    408     if ex:
--> 409         raise ex
    410 if (
    411     len(paths) > 1
    412     or isinstance(path, list)
    413     or paths[0] != self._strip_protocol(path)
    414 ):
    415     return {
    416         k: v
    417         for k, v in zip(paths, out)
    418         if on_error != "omit" or not is_exception(v)
    419     }

File /usr/local/Cellar/python@3.9/3.9.12/Frameworks/Python.framework/Versions/3.9/lib/python3.9/asyncio/tasks.py:442, in wait_for(fut, timeout, loop)
    437     warnings.warn("The loop argument is deprecated since Python 3.8, "
    438                   "and scheduled for removal in Python 3.10.",
    439                   DeprecationWarning, stacklevel=2)
    441 if timeout is None:
--> 442     return await fut
    444 if timeout <= 0:
    445     fut = ensure_future(fut, loop=loop)

File /usr/local/lib/python3.9/site-packages/fsspec/asyn.py:395, in AsyncFileSystem._cat_file(self, path, start, end, **kwargs)
    394 async def _cat_file(self, path, start=None, end=None, **kwargs):
--> 395     raise NotImplementedError

NotImplementedError: 
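For reference, the traceback shows the signature that would have to be overridden: `_cat_file(self, path, start=None, end=None, **kwargs)`. The sketch below only illustrates the byte-range handling such a method needs. It uses an in-memory dict as a stand-in for the remote server, `resolve_range` and `cat_file` are hypothetical helpers, and the treatment of negative offsets (counted from the end of the file) is an assumption.

```python
import asyncio
import io
from typing import Dict, Optional, Tuple


def resolve_range(
    size: int, start: Optional[int] = None, end: Optional[int] = None
) -> Tuple[int, int]:
    """Turn optional start/end offsets into (offset, length)."""
    if start is None:
        start = 0
    elif start < 0:
        start = max(size + start, 0)  # negative: count from the end
    if end is None:
        end = size
    elif end < 0:
        end = max(size + end, 0)
    end = min(end, size)
    return start, max(end - start, 0)


async def cat_file(
    files: Dict[str, bytes],
    path: str,
    start: Optional[int] = None,
    end: Optional[int] = None,
) -> bytes:
    """Read a byte range from `files`, an in-memory stand-in for a remote."""
    data = files[path]
    offset, length = resolve_range(len(data), start, end)
    buf = io.BytesIO(data)
    buf.seek(offset)
    return buf.read(length)
```

A real implementation would open the remote file over SFTP, seek to the offset, and read the computed length instead of going through `io.BytesIO`.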

put_file does not create parent directories

Test to confirm this:

@pytest.mark.parametrize("cloud", [pytest.lazy_fixture("ssh")])
def test_put_file_ssh(tmp_dir, cloud):
    tmp_dir.gen("foo", "foo")
    cls, config, _ = get_cloud_fs(None, **cloud.config)
    fs = cls(**config)

    fs.fs.put_file("foo", "dir/foo")

move to fsspec org

There was talk of this and other fsspec-compatible implementations being transferred to github.com/fsspec. No rush, merely recording what was previously suggested.

`get_file` behaves differently using `SSHFileSystem` vs `LocalFileSystem`

I use the AbstractFileSystem interface in my code to direct my application either to the local file system or to a file system over SSH. However, I noticed that the behavior of get_file differs between the two. I wrote this little script to test and demonstrate it.

from fsspec.implementations.local import LocalFileSystem
from sshfs import SSHFileSystem

ssh_fs = SSHFileSystem(
    "localhost",
    username="foobar",
    password="foobar",
)
ssh_fs.get_file("/tmp/foobar", ".")

local_fs = LocalFileSystem()
local_fs.get_file("/tmp/foobar2", ".")

I would expect that calling get_file on either with similar parameters would result in the requested file being copied to the local folder. However, the LocalFileSystem implementation raises an error:

IsADirectoryError: [Errno 21] Is a directory: '/home/west/Research/sshfs/.'

The LocalFileSystem implementation requires a full file path:

local_fs.get_file("/tmp/foobar2", "./foobar2")

It seems to me that the implementation of get_file in SSHFileSystem does not follow the fsspec API. Or am I missing something?
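Until the two implementations agree, application code can normalize the destination itself. This is a minimal sketch, not part of either library: `resolve_local_target` is a hypothetical helper that appends the remote file name whenever the local path is an existing directory, so both filesystems receive a full file path.

```python
import os
import posixpath


def resolve_local_target(rpath: str, lpath: str) -> str:
    """Return a full local file path for a download.

    If `lpath` is an existing directory, append the base name of the
    remote path; otherwise return `lpath` unchanged.
    """
    if os.path.isdir(lpath):
        return os.path.join(lpath, posixpath.basename(rpath.rstrip("/")))
    return lpath
```

With this helper, `fs.get_file(rpath, resolve_local_target(rpath, "."))` passes `./foobar2` for the remote path `/tmp/foobar2`, which is the form the LocalFileSystem call above accepts.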

Corrupted files when using `get()`

I'm not able to debug this issue further, I can only share it.

Some files (.zip archives) are corrupted when using:

ssh.get(ssh_file["name"], "/tmp/")

Importantly, the same files are corrupted in the same way every time across multiple tests; it is not random at all. Likewise, files that are not corrupted are never corrupted.

The fix was in this case to switch to https://github.com/althonos/fs.sshfs (completely solved the issue)
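Because the corruption is deterministic, comparing a corrupted download against a known-good copy of the same archive may help narrow it down. The sketch below is only a debugging aid under that assumption; `file_digest` and `first_difference` are hypothetical helpers that use the standard library alone.

```python
import hashlib
from typing import Optional


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a local file in chunks so large archives fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(path_a: str, path_b: str, chunk_size: int = 1 << 20) -> Optional[int]:
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None
            offset += len(a)
```

An offset that falls on a consistent boundary (for example a multiple of some block size) would point toward how the transfer is being chunked, while a truncated file would show up as a difference at the shorter file's length.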

No permissions on root level causes `SFTPPermissionDenied`

In my project I'm connected to an SFTP server I don't own. I only have rights to a few folders. Using put_file was throwing the following error:

  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/fsspec/asyn.py", line 85, in wrapper
    return sync(self.loop, func, *args, **kwargs)
           │    │    │     │      │       └ {}
           │    │    │     │      └ ('/home/west/Projects/Abel/invoice-processor/files/odoo_downloads/742/8713783500248_F-2022-00082.xml', '/out/invoice/87137835...
           │    │    │     └ <bound method SSHFileSystem._put_file of <sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>>
           │    │    └ <property object at 0x7f45252ec770>
           │    └ <sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>
           └ <function sync at 0x7f45252ee5e0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/fsspec/asyn.py", line 65, in sync
    raise return_result
          └ SFTPPermissionDenied('')
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/fsspec/asyn.py", line 25, in _runner
    result[0] = await coro
    │                 └ <coroutine object SSHFileSystem._put_file at 0x7f452473cf40>
    └ [SFTPPermissionDenied('')]
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/sshfs/utils.py", line 27, in wrapper
    return await func(*args, **kwargs)
                 │     │       └ {}
                 │     └ (<sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>, '/home/west/Projects/Abel/invoice-processor/files/odoo_downloads/742/87...
                 └ <function SSHFileSystem._put_file at 0x7f4525357e50>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/sshfs/spec.py", line 169, in _put_file
    await self._makedirs(self._parent(rpath), exist_ok=True)
          │    │         │    │       └ '/out/invoice/8713783500248_F-2022-00082.xml'
          │    │         │    └ <classmethod object at 0x7f4525544070>
          │    │         └ <sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>
          │    └ <function SSHFileSystem._makedirs at 0x7f452535a4c0>
          └ <sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/sshfs/utils.py", line 27, in wrapper
    return await func(*args, **kwargs)
                 │     │       └ {'exist_ok': True}
                 │     └ (<sshfs.spec.SSHFileSystem object at 0x7f4523e48a30>, '/out/invoice')
                 └ <function SSHFileSystem._makedirs at 0x7f452535a430>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/sshfs/spec.py", line 231, in _makedirs
    await channel.makedirs(path, exist_ok=exist_ok, attrs=attrs)
          │       │        │              │               └ SFTPAttrs(type=5, size=None, alloc_size=None, uid=None, gid=None, owner=None, group=None, permissions=511, atime=None, atime_...
          │       │        │              └ True
          │       │        └ '/out/invoice'
          │       └ <function SFTPClient.makedirs at 0x7f452571dc10>
          └ <asyncssh.sftp.SFTPClient object at 0x7f4523df9ac0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/asyncssh/sftp.py", line 4045, in makedirs
    await self.mkdir(curpath, attrs)
          │    │     │        └ SFTPAttrs(type=5, size=None, alloc_size=None, uid=None, gid=None, owner=None, group=None, permissions=511, atime=None, atime_...
          │    │     └ b'/'
          │    └ <function SFTPClient.mkdir at 0x7f4525722ee0>
          └ <asyncssh.sftp.SFTPClient object at 0x7f4523df9ac0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/asyncssh/sftp.py", line 4989, in mkdir
    await self._handler.mkdir(path, attrs)
          │    │        │     │     └ SFTPAttrs(type=5, size=None, alloc_size=None, uid=None, gid=None, owner=None, group=None, permissions=511, atime=None, atime_...
          │    │        │     └ b'/'
          │    │        └ <function SFTPClientHandler.mkdir at 0x7f452571a040>
          │    └ <asyncssh.sftp.SFTPClientHandler object at 0x7f4523df9af0>
          └ <asyncssh.sftp.SFTPClient object at 0x7f4523df9ac0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/asyncssh/sftp.py", line 2769, in mkdir
    await self._make_request(FXP_MKDIR, String(path),
          │    │             │          │      └ b'/'
          │    │             │          └ <function String at 0x7f4526581310>
          │    │             └ 14
          │    └ <function SFTPClientHandler._make_request at 0x7f45257181f0>
          └ <asyncssh.sftp.SFTPClientHandler object at 0x7f4523df9af0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/asyncssh/sftp.py", line 2370, in _make_request
    result = self._packet_handlers[resptype](self, resp)
             │    │                │         │     └ <asyncssh.packet.SSHPacket object at 0x7f4523f13190>
             │    │                │         └ <asyncssh.sftp.SFTPClientHandler object at 0x7f4523df9af0>
             │    │                └ 101
             │    └ {101: <function SFTPClientHandler._process_status at 0x7f4525718280>, 102: <function SFTPClientHandler._process_handle at 0x7...
             └ <asyncssh.sftp.SFTPClientHandler object at 0x7f4523df9af0>
  File "/home/west/venvs/invoice-processor/lib/python3.8/site-packages/asyncssh/sftp.py", line 2386, in _process_status
    raise exc
          └ SFTPPermissionDenied('')

It turns out it is trying to create the folder /. Since I don't have permissions at this level, it returns SFTPPermissionDenied. I've commented out the following line:

await self._makedirs(self._parent(rpath), exist_ok=True)

This fixes it for now. Any ideas on how to go about this properly?
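One possible direction, sketched here with an in-memory stand-in rather than asyncssh, is to walk upward from the target directory until an existing ancestor is found and create only the missing levels, so `mkdir` is never issued for `/`. `FakeSFTP` and `makedirs_missing_only` are hypothetical, and the sketch assumes the server allows an existence check on the directories the user can access.

```python
import posixpath
from typing import Iterable, List


class FakeSFTP:
    """In-memory stand-in for a server that denies mkdir on some paths."""

    def __init__(self, dirs: Iterable[str], protected: Iterable[str]):
        self.dirs = set(dirs)
        self.protected = set(protected)

    def isdir(self, path: str) -> bool:
        return path in self.dirs

    def mkdir(self, path: str) -> None:
        if path in self.protected:
            raise PermissionError(path)
        self.dirs.add(path)


def makedirs_missing_only(fs: FakeSFTP, path: str) -> List[str]:
    """Create only the missing ancestors of `path`; return what was created."""
    missing = []
    current = posixpath.normpath(path)
    while not fs.isdir(current):
        parent = posixpath.dirname(current)
        if parent == current:
            break  # reached the top without finding an existing directory
        missing.append(current)
        current = parent
    created = list(reversed(missing))
    for directory in created:
        fs.mkdir(directory)
    return created
```

For the path in the traceback, a server where `/out` already exists would only receive a `mkdir` for `/out/invoice`; if that directory also exists, no `mkdir` request is sent at all.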
