Comments (19)
@Xuanwo @silver-ymz Since you are using this crate in opendal, a crate designed for high-performance unified data access that is also used in sccache, I wonder whether you have encountered any performance problems like the one shown in this issue?
If you have, feel free to send me more data/info so that I can fix it.
I would also welcome any patch that improves the performance or ergonomics of this crate.
from openssh-sftp-client.
I just noticed that there's tokio::io::copy, but using it makes matters worse.
from openssh-sftp-client.
Can you try creating the Sftp like this:
let sftp_client = Sftp::new(
    child.stdin().take().unwrap(),
    child.stdout().take().unwrap(),
    SftpOptions::new().max_pending_requests(NonZeroU16::new(1).unwrap()),
)
.await?;
By default, Sftp tries to group requests and send them in batches to improve throughput, but when you only have one active request anyway, setting it to flush immediately makes more sense.
from openssh-sftp-client.
That doesn't help, unfortunately.
To put things in perspective: When using the sftp command line client, I get transfer speeds of about 35-40 MB/s.
The Rust program's throughput is probably less than 5 MB/s.
from openssh-sftp-client.
Can you try replacing the I/O loop with this, in addition to the SftpOptions configuration:
let mut tmp_file = File::create("myfile").unwrap();
let mut buf = BytesMut::with_capacity(100 * 1024 * 1024);
while let Some(data) = zip_file
    .read(buf.capacity(), buf)
    .await
    .transpose() // Return Result<Option<_>, _>
    .unwrap() // Return Option<_>
{
    buf = tokio::task::spawn_blocking(move || {
        tmp_file.write_all(&data).unwrap();
        data
    }).await.unwrap();
    buf.clear();
}
The problems I've seen with your original loop:
- BytesMut isn't reused; instead, a new buffer is allocated from the heap every time
- Writing to std::fs::File could block, so you need to use tokio::task::spawn_blocking
from openssh-sftp-client.
I'm sorry, I can't get your suggested code to compile.
Please note I also tried using tokio::io::copy instead of my own loop, but it seems even slower.
from openssh-sftp-client.
Please note I also tried using tokio::io::copy instead of my own loop, but it seems even slower.
Not sure that will even work considering you are using std::fs::File.
I'm sorry, I can't get your suggested code to compile.
Thanks, I've fixed the errors:
use std::{error::Error, fs::File, io::Write, num::NonZeroU16};

use bytes::{Bytes, BytesMut};
use openssh::{KnownHosts, Session, Stdio};
use openssh_sftp_client::{Sftp, SftpOptions};
use tokio::sync::mpsc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let session = Session::connect_mux("ssh://user@host", KnownHosts::Accept).await?;

    let mut child = session
        .subsystem("sftp")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()
        .await?;

    let sftp_client = Sftp::new(
        child.stdin().take().unwrap(),
        child.stdout().take().unwrap(),
        SftpOptions::new().max_pending_requests(NonZeroU16::new(1).unwrap()),
    )
    .await?;

    let mut zip_file = sftp_client.options().read(true).open("myfile").await?;

    let (tx, mut rx) = mpsc::unbounded_channel::<Bytes>();

    let task = tokio::task::spawn_blocking(move || {
        let mut tmp_file = File::create("myfile")?;

        if let Some(data) = rx.blocking_recv() {
            tmp_file.write_all(&data)?;
        }

        tmp_file.flush()
    });

    let mut buf = BytesMut::with_capacity(8192);
    while let Some(data) = zip_file
        .read(1024, buf)
        .await
        .unwrap()
    {
        buf = data;
        // Reserve before splitting to increase the possibility of
        // existing buffer getting reused.
        buf.reserve(4096);
        if tx.send(buf.split().freeze()).is_err() {
            // The write task failed, break the loop
            // and retrieve the result of the task.
            break;
        }
    }

    task.await??;

    Ok(())
}
Edit:
I've made a few changes to the I/O loop so that it runs the file I/O and the sftp transfer in parallel, which should be faster.
I also adjusted the number of bytes to read in the I/O loop: each TCP packet can be at most 64K (jumbo packet), but in practice it is likely to be about 1.5K, so setting the read size to something very large doesn't make any sense. It would only make things slower, since sftp would then have to wait for several TCP packets and group them in memory instead of writing them out to the file immediately.
from openssh-sftp-client.
Not sure that will even work considering you are using std::fs::File.
I did use openssh_sftp_client::file::TokioCompatFile when using tokio::io::copy.
Thank you for the hints.
I tried your latest example, and unfortunately, it does not work for me. It only transfers 1 KB and then exits - looks like tx.send(buf.split().freeze()).is_err() returns true and therefore breaks the loop.
from openssh-sftp-client.
I did use openssh_sftp_client::file::TokioCompatFile when using tokio::io::copy.
I'm not referring to the source, but to the destination, which is std::fs::File.
I guess you probably use tokio::fs::File for this.
I tried your latest example, and unfortunately, it does not work for me. It only transfers 1 KB and then exits - looks like tx.send(buf.split().freeze()).is_err() returns true and therefore breaks the loop.
Hmmm, if that is true, then it's possible that writing to the file myfile somehow fails.
from openssh-sftp-client.
I guess you probably use tokio::fs::File for this.
Yes, I had to change it to use tokio::fs::File.
Hmmm, if that is true, then it's possible that writing to the file myfile somehow fails.
I modified the program to display the error message and all I'm getting is "channel closed", but I'm not sure why.
from openssh-sftp-client.
Ok, got it: if let Some(data) = rx.blocking_recv() { needs to be replaced with while let Some(data) = rx.blocking_recv() {
It does work now, but it still isn't faster.
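To illustrate why the if let version stopped after the first 1 KB chunk, here is a minimal sketch. It assumes the standard library's std::sync::mpsc channel instead of tokio's, purely so that it has no dependencies; there, recv() returns a Result rather than the Option returned by blocking_recv(). The names consume_once, consume_all and run are hypothetical and are not part of the program above.

```rust
use std::sync::mpsc;
use std::thread;

/// Buggy consumer: `if let` handles at most one chunk, then returns.
/// Returning drops `rx`, so any later `send` on the other side fails.
fn consume_once(rx: mpsc::Receiver<Vec<u8>>) -> usize {
    let mut written = 0;
    if let Ok(chunk) = rx.recv() {
        written += chunk.len();
    }
    written
}

/// Fixed consumer: `while let` keeps receiving until every sender is dropped.
fn consume_all(rx: mpsc::Receiver<Vec<u8>>) -> usize {
    let mut written = 0;
    while let Ok(chunk) = rx.recv() {
        written += chunk.len();
    }
    written
}

/// Sends `chunks` chunks of 1024 bytes to `consumer` running on another
/// thread and returns how many bytes the consumer actually handled.
fn run(consumer: fn(mpsc::Receiver<Vec<u8>>) -> usize, chunks: usize) -> usize {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || consumer(rx));

    for _ in 0..chunks {
        if tx.send(vec![0u8; 1024]).is_err() {
            // The receiver is gone, same as the `break` in the reading loop.
            break;
        }
    }

    // Close the channel so that `consume_all` can leave its loop.
    drop(tx);
    handle.join().unwrap()
}
```

With four chunks, run(consume_once, 4) reports 1024 bytes, while run(consume_all, 4) reports all 4096.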
from openssh-sftp-client.
Ok, got it: if let Some(data) = rx.blocking_recv() { needs to be replaced with while let Some(data) = rx.blocking_recv() {
Aha, sorry about this.
It does work now, but it still isn't faster.
There's no speedup at all?
I will have to look at this more closely tomorrow, but did you compile the program with --release?
You could also turn on lto = true to improve performance.
If none of this helps, then it could be that the sftp implementation itself isn't very efficient; that's the only thing I can think of.
from openssh-sftp-client.
Still no performance improvement, unfortunately. I'm also thinking that it's some issue with the sftp implementation, but I know nothing about sftp protocol internals.
from openssh-sftp-client.
Still no performance improvement, unfortunately. I'm also thinking that it's some issue with the sftp implementation, but I know nothing about sftp protocol internals.
Maybe you can profile the program and then use a tool like cargo-flamegraph to produce a flamegraph of the profile?
That will help me find the bottleneck.
from openssh-sftp-client.
Sure thing, here's a flamegraph generated during the download of a small 10 MB file. Please note that I had to add a drop(tx) to the code after the reading loop for the program to actually terminate.
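The drop(tx) is needed because a receive loop only ends once every sender is gone; while tx is still alive, the writer task keeps waiting for more data and awaiting the task never returns. Here is a minimal sketch of that behaviour, again assuming std::sync::mpsc rather than tokio's channel and using a hypothetical helper named drain_state:

```rust
use std::sync::mpsc::{self, TryRecvError};

/// Sends one chunk, optionally drops the sender, then drains the channel
/// and reports why the receiving side stopped getting data.
fn drain_state(drop_sender: bool) -> TryRecvError {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    tx.send(vec![0u8; 1024]).unwrap();

    if drop_sender {
        // Mirrors the `drop(tx)` added after the reading loop.
        drop(tx);
    }

    loop {
        match rx.try_recv() {
            // Buffered chunks are still delivered after the sender is gone.
            Ok(_chunk) => continue,
            // Empty: a sender is still alive, so a blocking receive would
            // keep waiting. Disconnected: every sender is gone, so a
            // receive loop can end.
            Err(state) => return state,
        }
    }
}
```

Without the drop, the channel reports Empty, which is the state in which a blocking receive would wait forever; with the drop, it reports Disconnected and the loop can finish.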
from openssh-sftp-client.
@jeinwag Thanks, I will have a look later.
P.S. It would be great if I could zoom in/out.
from openssh-sftp-client.
@NobodyXu apparently the required JavaScript execution is being blocked, but zooming should work if you download the SVG and open it.
from openssh-sftp-client.
Thanks, will download it and have a look.
from openssh-sftp-client.
@jeinwag Sorry I've been busy with other projects.
I took a look at the flamegraph and it seems that locking/syscalls actually take quite some time.
Maybe switching to a different kernel/fs would help?
I see that you are using ext4; maybe try using tmpfs to see if it is faster?
Also try updating to the latest ssh and Linux kernel; that might help.
from openssh-sftp-client.