Comments (19)

NobodyXu commented on May 26, 2024

@Xuanwo @silver-ymz Since you are using this crate in opendal, a crate designed for high-performance unified data access that is also used in sccache, I wonder whether you have encountered any performance problem like the one described in this issue?

If you did encounter that, feel free to send me more data/info so that I can fix it.

I also welcome any patch that improves the performance or ergonomics of this crate.

from openssh-sftp-client.

jeinwag commented on May 26, 2024

I just noticed that there's tokio::io::copy, but using it makes matters worse.

NobodyXu commented on May 26, 2024

Can you try creating the Sftp like this:

    let sftp_client = Sftp::new(
        child.stdin().take().unwrap(),
        child.stdout().take().unwrap(),
        SftpOptions::new().max_pending_requests(NonZeroU16::new(1).unwrap()),
    )
    .await?;

By default, Sftp tries to group requests and send them in batches to improve throughput, but when you only have one active request anyway, setting it to flush immediately makes more sense.

jeinwag commented on May 26, 2024

That doesn't help, unfortunately.

To put things in perspective: When using the sftp command line client, I get transfer speeds of about 35-40 MB/s.
The Rust program's throughput is probably less than 5 MB/s.

NobodyXu commented on May 26, 2024

Can you try replacing the I/O loop with this, in addition to the SftpOptions configuration:

    let mut tmp_file = File::create("myfile").unwrap();
    let mut buf = BytesMut::with_capacity(100 * 1024 * 1024);

    while let Some(data) = zip_file
        .read(buf.capacity(), buf)
        .await
        .transpose() // Return Result<Option<_>, _>
        .unwrap()     // Return Option<_>
    {
        buf = tokio::task::spawn_blocking(move || {
            tmp_file.write_all(&data).unwrap();
            data
        }).await.unwrap();
        buf.clear();
    }

The problems I see with the original loop:

  • BytesMut isn't reused; instead, a new one is allocated from the heap every time
  • Writing to std::fs::File could block, so you need to use tokio::task::spawn_blocking
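
As a rough, std-only illustration of the first point (not the crate's API and not the code from this thread), the sketch below copies data with a single buffer that is allocated once and reused for every chunk. The function name `copy_with_reused_buffer` and the `chunk_size` parameter are made up for the example. The loop is blocking, so inside async code it would have to run on a blocking thread, for example via tokio::task::spawn_blocking.

```rust
use std::io::{self, ErrorKind, Read, Write};

/// Copies everything from `src` to `dst` using one buffer that is
/// allocated up front and reused for every chunk.
/// Hypothetical helper for illustration only.
fn copy_with_reused_buffer<R: Read, W: Write>(
    mut src: R,
    mut dst: W,
    chunk_size: usize,
) -> io::Result<u64> {
    // Single allocation; the loop below never allocates again.
    let mut buf = vec![0u8; chunk_size];
    let mut total = 0u64;

    loop {
        let n = match src.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            // Retry if the read was interrupted by a signal.
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Only the first `n` bytes of the buffer hold fresh data.
        dst.write_all(&buf[..n])?;
        total += n as u64;
    }

    dst.flush()?;
    Ok(total)
}
```

Any `Read` source and `Write` sink work here, for example a byte slice and a `Vec<u8>` when experimenting.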

jeinwag commented on May 26, 2024

I'm sorry, I can't get your suggested code to compile.

Please note I also tried using tokio::io::copy instead of my own loop, but it seems even slower.

NobodyXu commented on May 26, 2024

> Please note I also tried using tokio::io::copy instead of my own loop, but it seems even slower.

Not sure that will even work considering you are using std::fs::File.

> I'm sorry, I can't get your suggested code to compile.

Thanks, I've fixed the errors:

use std::{error::Error, fs::File, io::Write, num::NonZeroU16};

use bytes::{Bytes, BytesMut};
use openssh::{KnownHosts, Session, Stdio};
use openssh_sftp_client::{Sftp, SftpOptions};
use tokio::sync::mpsc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let session = Session::connect_mux("ssh://user@host", KnownHosts::Accept).await?;
    let mut child = session
        .subsystem("sftp")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .await?;

    let sftp_client = Sftp::new(
        child.stdin().take().unwrap(),
        child.stdout().take().unwrap(),
        SftpOptions::new().max_pending_requests(NonZeroU16::new(1).unwrap()),
    )
    .await?;

    let mut zip_file = sftp_client.options().read(true).open("myfile").await?;

    let (tx, mut rx) = mpsc::unbounded_channel::<Bytes>();
    let task = tokio::task::spawn_blocking(move || {
        let mut tmp_file = File::create("myfile")?;
        if let Some(data) = rx.blocking_recv() {
            tmp_file.write_all(&data)?;
        }

        tmp_file.flush()
    });

    let mut buf = BytesMut::with_capacity(8192);

    while let Some(data) = zip_file
        .read(1024, buf)
        .await
        .unwrap()
    {
        buf = data;
        // Reserve before splitting to increase the possibility of
        // existing buffer getting reused.
        buf.reserve(4096);

        if tx.send(buf.split().freeze()).is_err() {
            // The write task failed, break the loop
            // and retrieve the result of the task.
            break;
        }
    }

    task.await??;

    Ok(())
}

Edit:

I've made a few changes to the I/O loop so that the file I/O and the sftp reads run in parallel, which should be faster.

I also adjusted the number of bytes to read in the I/O loop. Each tcp packet can be at most 64K large (jumbo packet), but in practice it is likely to be about 1.5K, so setting the read size to something so large doesn't make any sense and would only make things slower: sftp would have to wait for several tcp packets and group them in memory instead of writing them out to the file immediately.
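
A complementary, back-of-the-envelope way to look at read size: in any request/response protocol, if a fixed number of requests is in flight at a time, throughput cannot exceed (bytes per request × requests in flight) / round-trip time. The sketch below only does that arithmetic. The function name and the round-trip values are hypothetical, not measurements from this issue, and it says nothing about how the crate actually schedules its requests.

```rust
/// Upper bound on throughput, in bytes per second, for a request/response
/// protocol with a fixed number of requests in flight.
/// Hypothetical helper for illustration; `rtt_micros` must be non-zero.
fn max_throughput_bytes_per_sec(
    bytes_per_request: u64,
    requests_in_flight: u64,
    rtt_micros: u64,
) -> u64 {
    // bytes delivered per round trip, scaled to one second
    bytes_per_request * requests_in_flight * 1_000_000 / rtt_micros
}
```

With an assumed round trip of 1 ms, a single 1024-byte request in flight caps out at 1,024,000 bytes per second, while 16 such requests in flight raise the cap to 16,384,000 bytes per second.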

jeinwag commented on May 26, 2024

> Not sure that will even work considering you are using std::fs::File.

I did use openssh_sftp_client::file::TokioCompatFile when using tokio::io::copy.

Thank you for the hints.
I tried your latest example, and unfortunately, it does not work for me. It only transfers 1 KB and then exits - looks like tx.send(buf.split().freeze()).is_err() returns true and therefore breaks the loop.

NobodyXu commented on May 26, 2024

> I did use openssh_sftp_client::file::TokioCompatFile when using tokio::io::copy

I'm not referring to the source, but to the destination, which is std::fs::File.
I guess you probably use tokio::fs::File for this.

> I tried your latest example, and unfortunately, it does not work for me. It only transfers 1 KB and then exits - looks like tx.send(buf.split().freeze()).is_err() returns true and therefore breaks the loop.

Hmmm if that is true, then it's possible writing to the file myfile somehow fails.

jeinwag commented on May 26, 2024

> I guess you probably use tokio::fs::File for this.

Yes, I had to change it to use tokio::fs::File.

> Hmmm if that is true, then it's possible writing to the file myfile somehow fails.

I modified the program to display the error message and all I'm getting is "channel closed", but I'm not sure why.

jeinwag commented on May 26, 2024

Ok, got it: if let Some(data) = rx.blocking_recv() { needs to be replaced with while let Some(data) = rx.blocking_recv() {

It does work now, but it still isn't faster.

NobodyXu commented on May 26, 2024

> Ok, got it: if let Some(data) = rx.blocking_recv() { needs to be replaced with while let Some(data) = rx.blocking_recv() {

Aha sorry about this.

> It does work now, but it still isn't faster.

There's no speedup at all?
I will have to look at this more closely tomorrow, but did you compile the program with --release?
You could also set lto = true to improve performance.

If none of this helps, then it could be that the sftp implementation itself isn't very efficient; that's the only thing I can think of.
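
For reference, a minimal sketch of the two build settings mentioned above, assuming a standard Cargo project: the profile section goes in the project's Cargo.toml, and the optimized binary is then built with cargo build --release.

```toml
# Cargo.toml
[profile.release]
lto = true   # enable link-time optimization for release builds
```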

jeinwag commented on May 26, 2024

Still no performance improvement, unfortunately. I'm also thinking that it's some issue with the sftp implementation, but I know nothing about sftp protocol internals.

NobodyXu commented on May 26, 2024

> Still no performance improvement, unfortunately. I'm also thinking that it's some issue with the sftp implementation, but I know nothing about sftp protocol internals.

Maybe you could profile the program and then use a tool like cargo-flamegraph to generate a flamegraph from the profile?
That would help me find the bottleneck.

jeinwag commented on May 26, 2024

Sure thing, here's a flamegraph generated during the download of a small 10 MB file. Please note that I had to add a drop(tx) to the code after the reading loop for the program to actually terminate.

[Attached image: flamegraph]
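
The two fixes found so far (draining the channel in a loop instead of receiving once, and dropping the sender after the read loop) can be sketched with the standard library alone. This is an analogue using std::sync::mpsc and a plain thread, not the tokio code from this thread; the function name `pipelined_write` is made up for the example.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Sends `chunks` to a writer thread over a channel and returns the sink
/// once every chunk has been written. Hypothetical helper for illustration.
fn pipelined_write<W>(chunks: Vec<Vec<u8>>, mut dst: W) -> io::Result<W>
where
    W: Write + Send + 'static,
{
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    let writer = thread::spawn(move || -> io::Result<W> {
        // `while let`, not `if let`: keep receiving until the channel is
        // closed, i.e. until every sender has been dropped.
        while let Ok(chunk) = rx.recv() {
            dst.write_all(&chunk)?;
        }
        dst.flush()?;
        Ok(dst)
    });

    for chunk in chunks {
        // A send error means the writer thread has already exited.
        if tx.send(chunk).is_err() {
            break;
        }
    }

    // Without this, `recv` above never reports a closed channel and the
    // join below would wait forever.
    drop(tx);

    writer.join().expect("writer thread panicked")
}
```

Passing a `Vec<u8>` as the sink makes it easy to check that all chunks arrive in order.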

NobodyXu commented on May 26, 2024

@jeinwag Thanks, I will have a look later.

P.S. It would be great if I could zoom in/out.

jeinwag commented on May 26, 2024

@NobodyXu apparently the required JavaScript execution is being blocked, but zooming should work if you download the SVG and open it.

NobodyXu commented on May 26, 2024

Thanks, will download it and have a look.

NobodyXu commented on May 26, 2024

@jeinwag Sorry I've been busy with other projects.

I took a look at the flamegraph and it seems that locking/syscalls actually take quite some time?

Maybe switching to a different kernel/fs would help?

I see that you are using ext4; maybe try using tmpfs to see if it is faster?

Also try updating to the latest ssh and Linux kernel; that might help.
